Your AI Agent's Plugins Could Be Malware. Nobody's Checking.

341 malicious skills found in a single AI agent marketplace. We've seen this movie before with npm and Docker Hub. The sequel is worse.

Mar 01, 2026

Koi Security just audited 2,857 skills on ClawHub, one of the growing marketplaces where AI agents pick up their tools. Out of 2,857 skills scanned, 341 were malicious. That’s nearly 12 percent.

Multiple campaigns. Different bad actors. All hiding in the tools your AI agent trusts by default.

If you’re deploying AI agents in any capacity, this is the story you need to understand.

What Are AI Agent Skills?

Think of AI agent skills like browser extensions, but for your AI. They’re plugins that give an AI agent new capabilities. Search a database. Pull a file. Execute code. Call an API. Post to Slack. Read your email.

The AI model itself is the brain. Skills are the hands. And just like you wouldn’t let a stranger’s hands rummage through your filing cabinet, you shouldn’t let unvetted skills connect to your AI agent’s toolchain.

But that’s exactly what’s happening.

Most AI agent frameworks, including those built on the Model Context Protocol (MCP), allow users and developers to install skills from community marketplaces. The model calls the skill when it needs a specific capability. The skill executes and returns results. The model trusts what comes back.

There’s no sandboxing. No code review. No permission scoping in most implementations. The agent calls the tool, the tool runs, and whatever it returns shapes the agent’s next action.

We’ve Seen This Movie Before

If this sounds familiar, it should.

Remember when npm had a malicious package problem? Developers would install a package called something like “cross-env” and get “crossenv” instead, a typosquatted package that exfiltrated environment variables. Same thing happened with PyPI. Same thing with Docker Hub, where researchers found hundreds of trojanized container images sitting in public registries.

Every time we build a marketplace based on trust and convenience, the same pattern emerges. Bad actors show up, name their malicious package something plausible, and wait for people to install it without checking.

The AI agent skill marketplace is the same story with higher stakes.

Why the Sequel Is Worse

Here’s where it diverges from traditional supply chain attacks.

When you install a malicious npm package, it runs once during installation or build. It does its damage and you move on, patch it out, rotate your secrets.

A malicious AI agent skill is different. The agent calls it repeatedly throughout its operation. Every time the agent needs that capability, it invokes the skill, passes it data, and acts on whatever comes back. A malicious skill doesn’t just compromise a single build pipeline. It compromises every decision the agent makes going forward.

Imagine an AI agent that helps your team manage customer support tickets. It has a skill for searching your knowledge base. If that skill is malicious, it can exfiltrate every query your team runs. Or it can return subtly wrong answers that cause your team to give bad information to customers. Or it can inject instructions into the agent’s context that redirect its behavior entirely.

That last one is the convergence of supply chain poisoning and prompt injection. The skill doesn’t just steal data. It hijacks the agent’s reasoning by injecting malicious content into the context the agent uses to make decisions.

The Trust Problem

The core issue is that AI agents operate on implicit trust. When the model calls a skill, it assumes the skill is doing what it says it does. There’s no verification layer. No integrity check. No behavioral monitoring in most frameworks.

This is the same mistake we made with browser extensions in 2012, with npm packages in 2018, and with container images in 2020. We keep building trust-by-default ecosystems and then acting surprised when they get exploited.

The difference now is that AI agents are being deployed in increasingly sensitive contexts. Customer data. Financial systems. Internal communications. Code repositories. Each skill connection is a potential exfiltration path that the agent itself will happily use because it doesn’t know any better.

What Needs to Happen

If you’re deploying AI agents in production, you need a skill vetting process. Not eventually. Now.

Audit every skill your agents use. Know what code they’re running, who wrote it, and what data they have access to. Don’t install community skills without reviewing the source.

Implement least-privilege for skills. A skill that searches a knowledge base shouldn’t have write access. A skill that formats text shouldn’t have network access. Scope permissions to the minimum required functionality.

Monitor skill behavior in production. Log what data skills receive and what they return. Look for anomalies. A skill that suddenly starts returning different response patterns or making unexpected network calls is a red flag.

Push your AI framework vendors for better security primitives. Sandboxing. Code signing. Behavioral attestation. The tools need to catch up to the threat.

We spent years learning not to run code from strangers. Now we’re handing AI agents a basket of unvetted tools and telling them to go nuts.

Same lesson. New wrapper. The stakes are just higher this time.

Security Signal

Discussion about this post

Ready for more?