What You're Installing When You Add an MCP Server
We indexed over 25,000 MCP servers from public registries and built a free API for querying dependency risk by server. Phase 1 and 2 findings from the mistaike.ai research pipeline.
There's a simple question most MCP users can't answer before installing a server:
What am I actually installing?
When you add an MCP server to your agent, you're not just adding a tool. You're inheriting its code, its dependencies, and its behaviour. In many cases that includes a large and often opaque dependency tree, along with whatever known vulnerabilities exist within it.
To better understand this, we ran a large-scale analysis of MCP servers drawn from public registries. This post covers the first two phases of that work: inventory and dependency risk.
We're also publishing the results as a public API so anyone can query the data directly.
Public API: mistaike.ai/cve-registry — no API key required.
Phase 1 — Inventory
We began by collecting MCP servers from public registry sources and normalising them into a single dataset.
Across sources, this produced a working indexed dataset of over 25,000 distinct MCP implementations drawn from two registries. The goal of Phase 1 was coverage, not judgement: what exists in the ecosystem?
Phase 2 — Dependency and CVE Scanning
We then analysed repositories and dependency graphs to identify known vulnerability exposure.
For each server, we:
- enumerated dependencies
- mapped them to known CVEs and advisories
- tracked counts and worst-case severity
- recorded package footprint where available
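The per-server aggregation described above can be sketched in a few lines. This is a minimal illustration, not the actual pipeline: the advisory map, severity names, and data shapes are hypothetical stand-ins for whatever vulnerability feed is used.

```python
# Rank order for computing worst-case severity (names are illustrative).
SEVERITY_RANK = {"none": 0, "low": 1, "medium": 2, "high": 3, "critical": 4}

def summarise_server(dependencies, advisories):
    """Aggregate known-CVE exposure for one server.

    `dependencies` is the enumerated package list; `advisories` maps
    package name -> list of (cve_id, severity) pairs (hypothetical shape).
    Returns the server-level counts this post describes: total CVEs,
    worst-case severity, and package footprint.
    """
    cve_count = 0
    worst = "none"
    for package in dependencies:
        for _cve_id, severity in advisories.get(package, []):
            cve_count += 1
            if SEVERITY_RANK[severity] > SEVERITY_RANK[worst]:
                worst = severity
    return {
        "cve_count": cve_count,
        "worst_severity": worst,
        "package_footprint": len(dependencies),
    }
```

A server pulling in two packages, one with a high and a critical advisory, would summarise to a count of 2 with worst severity critical.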
The output is a server-level view of dependency risk — something that doesn't exist in standard vulnerability databases, which index packages, not deployable tools. The live index currently covers over 6,000 servers.
What the Index Shows
The CVE registry returns entries sorted by CVE count by default. Some examples from the upper end of the distribution, described by category:
- A multi-agent framework for orchestrating AI pipelines — 103 known vulnerabilities in its dependency tree, 4 rated critical. The high count is largely attributable to transitive dependencies from pulling in a broad AI ecosystem. This server is well-maintained; the vulnerabilities are in its supply chain, not its own code.
- A console automation server that exposes shell command execution to agents — 65 known vulnerabilities, worst severity critical.
- A developer CLI management server — 47 known vulnerabilities, worst severity critical.
- An infrastructure configuration server used for network operations — 46 known vulnerabilities, worst severity critical.
These examples come from the first page of results. The full index is queryable at mistaike.ai/cve-registry.
What Stood Out
Several patterns emerged from the data.
Dependency risk is widespread. A meaningful portion of MCP servers carry known vulnerabilities through their dependency trees. In some cases, individual servers accumulate dozens or more CVEs — often through transitive dependencies rather than direct code. The multi-agent framework example above is a good illustration: the author's code isn't the problem; the problem is what it depends on, and what those dependencies depend on.
Severity alone isn't a sufficient signal. Some servers with very high CVE counts have only low-severity issues. Others with fewer total CVEs include critical-severity packages. Both dimensions matter when assessing risk.
Dependency sprawl is common. Many MCP servers pull in large numbers of packages, increasing both attack surface and maintenance burden. Combined with unpinned dependencies — which resolve to the latest version at install time — this creates non-deterministic builds and makes remediation harder.
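The unpinned-dependency problem is mechanical to check for. A minimal sketch, assuming pip-style requirement strings; real resolution rules (extras, markers, lockfiles) are more involved than this regex captures:

```python
import re

def is_pinned(requirement: str) -> bool:
    """True if a pip-style requirement pins an exact version
    (e.g. "requests==2.31.0"). Range specifiers and bare names
    resolve to whatever is latest at install time, which is what
    makes builds non-deterministic."""
    return bool(re.search(r"==\s*[0-9][\w.\-+]*$", requirement.strip()))

def unpinned(requirements):
    """Return the requirements that will float at install time."""
    return [r for r in requirements if not is_pinned(r)]
```

For example, `unpinned(["requests==2.31.0", "httpx>=0.27", "rich"])` flags the last two entries.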
None of these patterns are unique to MCP. They reflect broader software supply chain issues. What makes MCP different is where these servers run: often locally, often with access to files, tokens, APIs, and developer workflows.
Why Publish This as an API
Public CVE databases exist, but they don't map cleanly to deployable units like MCP servers.
If you're deciding whether to install an MCP server, the question isn't "which CVEs exist in the ecosystem?" It's "what known vulnerability exposure am I inheriting if I run this server?"
The public CVE registry is designed to answer that question directly.
How to Use It
The CVE registry supports:
- search by name or repository (?search=)
- filter by severity (?severity=critical|high|medium|none)
- sort by CVE count, severity, or recency (?sort=)
- pagination (?page=, ?page_size=)
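The parameters above compose into a single query URL. A small helper makes that concrete; the parameter names come from this post, but the sort value shown (`cve_count`) and the response shape are assumptions, so check the live endpoint before relying on them:

```python
from urllib.parse import urlencode

# Public endpoint from this post; no API key required.
BASE = "https://mistaike.ai/cve-registry"

def registry_url(search=None, severity=None, sort=None, page=1, page_size=20):
    """Build a registry query URL from the documented parameters
    (?search=, ?severity=, ?sort=, ?page=, ?page_size=)."""
    params = {"page": page, "page_size": page_size}
    if search:
        params["search"] = search
    if severity:
        params["severity"] = severity
    if sort:
        params["sort"] = sort
    return f"{BASE}?{urlencode(params)}"
```

For example, `registry_url(severity="critical", sort="cve_count")` yields a first-page query for critical-severity servers ordered by CVE count (sort value hypothetical).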
Example uses:
- check a server before adding it to your agent configuration
- integrate into CI/CD to flag high-risk servers before deployment
- build dashboards tracking ecosystem risk over time
- prioritise manual review of servers with critical-severity exposure
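The CI/CD use case above amounts to a policy check over a registry entry. A sketch of such a gate, assuming field names that mirror the stats this post reports (`cve_count`, `worst_severity`); the API's actual response shape may differ:

```python
def passes_gate(entry, max_cves=50, blocked_severities=("critical",)):
    """Return True if a registry entry clears a deployment policy.

    `entry` is a dict with the assumed fields `cve_count` and
    `worst_severity`. Thresholds are policy choices, not recommendations.
    """
    if entry.get("worst_severity") in blocked_severities:
        return False
    if entry.get("cve_count", 0) > max_cves:
        return False
    return True
```

A pipeline would fetch the entry for each configured server and fail the build when `passes_gate` returns False, forcing a manual review.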
How This Fits the Broader Research
The CVE index is Phase 2 of a larger analysis pipeline.
Later phases focus on runtime behaviour: what MCP servers actually do when executed, what network connections they make, and whether that behaviour aligns with user expectations and documentation.
Initial work on a deeper behavioural analysis phase examined a subset of servers at runtime. The results were largely reassuring: 86% of servers examined showed no concerning behaviour beyond their documented purpose. A small number showed behaviours worth investigating further. The five most significant:
Undisclosed telemetry on a local execution tool. A server designed for local desktop automation — file operations, terminal access — silently calls Google Analytics and a first-party telemetry endpoint on every tool invocation. The server runs locally; the tracking does not.
User queries sent over plain HTTP to a bare IP. A domain research tool routes user input — including project descriptions, keywords, and repository context — over unencrypted HTTP to a server identified only by a raw IP address running an LLM. No transport security. Not mentioned in documentation.
Steganographic Unicode watermarking. A server embeds invisible Unicode characters into every response it produces. The characters encode a persistent machine identifier that travels with the output wherever it goes — into the AI's context, into logs, into any downstream system. Undisclosed, not opt-in, not visible.
Query logging and AI platform profiling. A search-oriented server stores every query verbatim in a server-side database against the user's API key, building a 90-day history. It also inspects environment variables at startup to identify which AI client is in use and embeds this in outbound requests. The README describes two tools; the server exposes nineteen.
Unredacted user inputs forwarded to third-party analytics. A blockchain data server sends the full contents of each tool call — including user-supplied arguments — to a third-party analytics platform on every invocation, along with the calling client's name and version.
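Of the five findings, the invisible-Unicode watermarking is the easiest for users to check for themselves. A minimal detection sketch; the character set below covers common zero-width codepoints and is illustrative, since the exact characters a given server embeds are not known:

```python
# Common invisible/zero-width codepoints (illustrative, not exhaustive):
# zero-width space, non-joiner, joiner, word joiner, BOM.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}

def find_invisible(text):
    """Return (index, codepoint) pairs for invisible characters in a response."""
    return [(i, f"U+{ord(c):04X}") for i, c in enumerate(text) if c in ZERO_WIDTH]

def strip_invisible(text):
    """Remove the invisible characters, defeating this style of watermark."""
    return "".join(c for c in text if c not in ZERO_WIDTH)
```

Running `find_invisible` over a server's tool output is a quick way to spot this behaviour before it propagates into logs and downstream systems.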
These findings are being manually validated and shared with the relevant maintainers before full publication. All of the servers involved are listed across the major MCP registries — the official MCP directory, Glama, and Smithery. These are not obscure or fringe listings — they are servers a developer would encounter through normal discovery.
A full behavioural analysis post will follow that process.
Important Caveats
This dataset should be treated as a signal, not a verdict.
A CVE doesn't necessarily mean a vulnerability is exploitable in your environment. Some issues exist in unused code paths, or may already be mitigated in practice. Dependency graphs can both overstate and understate real-world risk.
At the same time, known vulnerabilities are relevant. They indicate maintenance posture, upgrade cadence, and potential exposure that's worth understanding before deployment.
What's Missing Today
One structural gap became clear during this work: there is currently no standard way for MCP servers to declare external network dependencies, telemetry behaviour, data categories transmitted, or whether such behaviour is optional.
As a result, users often rely on documentation, source code review, or trust alone to understand what a server does. That doesn't scale as the ecosystem grows.
Conclusion
The MCP ecosystem is growing quickly. That growth brings a need for better visibility into what is being installed and executed.
This work focuses on the first layer of that visibility: dependency risk. By publishing a public CVE index mapped directly to MCP servers, the aim is to make it easier for developers and organisations to understand what they're adopting before they run it.
This is not a claim that the ecosystem is unsafe. Many servers show minimal or manageable exposure. But every MCP server brings its own supply chain. Understanding that supply chain is a necessary first step.
About This Data
This API includes publicly known vulnerability data only. It does not include behavioural analysis, telemetry findings, or unverified security concerns, which are handled separately and may be subject to validation and responsible disclosure processes.
This data is intended to support risk awareness and prioritisation, not to label projects as insecure or malicious. Users should review context, validate findings, and consider their own threat model before making decisions.
CVE Registry: mistaike.ai/cve-registry — no API key required.
Feedback and questions welcome.