AI tool poisoning exposes a significant vulnerability in enterprise agent security. It occurs when AI agents select tools from shared registries based on natural-language descriptions that no human has verified for accuracy. The author raised the problem in a submission to the CoSAI secure-ai-tooling repository, which was split into two separate issues: selection-time threats and execution-time threats. The author argues that tool registry poisoning is not a single vulnerability but a family of threats that spans every stage of a tool's lifecycle.
The author distinguishes artifact integrity from behavioral integrity. Artifact-integrity controls, such as code signing, SLSA, and SBOMs, verify that an artifact matches its description. Agent tool registries also need behavioral integrity: assurance that a tool behaves as intended and does not act on data it was never meant to touch. The author gives examples of attack patterns that artifact-integrity checks fail to detect, such as prompt-injection payloads and behavioral drift.
The author warns against the misconception that applying SLSA and Sigstore to agent tool registries will solve the problem. Like the HTTPS certificate mistake of the early 2000s, that approach provides identity and integrity assurances but leaves the actual trust question unanswered. To close the gap, the author proposes a verification proxy that sits between the Model Context Protocol (MCP) client (the agent) and the MCP server (the tool). The proxy performs three validations: discovery binding, endpoint allowlisting, and output schema validation.
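The three validations can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the `ToolRecord` structure, the function names, and the reading of each check are assumptions. Here, discovery binding is taken to mean that the tool invoked must match what the agent saw at discovery time, endpoint allowlisting that outbound calls must go to declared hosts, and output schema validation that responses must carry only the declared fields.

```python
import hashlib
import json
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class ToolRecord:
    """Hypothetical record of what the agent saw when it discovered a tool."""
    name: str
    description: str
    allowed_hosts: frozenset  # hosts the tool has declared it will contact
    output_fields: dict       # field name -> expected Python type


def fingerprint(name: str, description: str) -> str:
    """Hash the fields the agent relied on when selecting the tool."""
    payload = json.dumps({"name": name, "description": description}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def check_discovery_binding(pinned: str, name: str, description: str) -> bool:
    """Pass only if the tool presented now matches the one pinned at discovery."""
    return fingerprint(name, description) == pinned


def check_endpoint(record: ToolRecord, url: str) -> bool:
    """Pass only if the outbound URL targets a declared host."""
    host = urlparse(url).hostname
    return host is not None and host in record.allowed_hosts


def check_output_schema(record: ToolRecord, output: dict) -> bool:
    """Pass only if the output has exactly the declared fields and types."""
    if set(output) != set(record.output_fields):
        return False
    return all(isinstance(output[k], t) for k, t in record.output_fields.items())


# Example tool; the name and host are made up for illustration.
record = ToolRecord(
    name="get_weather",
    description="Returns current weather for a city.",
    allowed_hosts=frozenset({"api.weather.example"}),
    output_fields={"city": str, "temp_c": float},
)
pinned = fingerprint(record.name, record.description)
```

Under these assumptions, a description that changes after discovery fails the binding check, a call to an undeclared host fails the allowlist check, and a response that carries an extra free-text field (a common vehicle for injected instructions) fails the schema check.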
The behavioral specification is a key component of this solution: a machine-readable declaration of a tool's external endpoints, data reads and writes, and side effects. The specification is included in the tool's signed attestation, which makes it tamper-evident and lets runtime behavior be verified against it. The author notes that a lightweight proxy adds minimal overhead, while full data-flow analysis is better suited to high-assurance deployments.
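As a rough sketch of what such a specification might look like, the Python below models it as a small immutable record and derives a digest that a signed attestation could carry. The field names, the `BehavioralSpec` class, and the use of a SHA-256 digest over canonical JSON are all assumptions made for illustration; the signing step itself is out of scope here.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BehavioralSpec:
    """Hypothetical machine-readable declaration of a tool's behavior."""
    tool: str
    external_endpoints: tuple  # hosts the tool may contact
    data_reads: tuple          # data the tool may read
    data_writes: tuple         # data the tool may write
    side_effects: tuple        # any other declared effects


def spec_digest(spec: BehavioralSpec) -> str:
    """Digest of the canonical JSON form; any edit to the spec changes it."""
    canonical = json.dumps(asdict(spec), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def undeclared_endpoints(spec: BehavioralSpec, observed_hosts) -> set:
    """Hosts seen at runtime that the spec never declared."""
    return set(observed_hosts) - set(spec.external_endpoints)


# Example spec; the tool name and hosts are made up for illustration.
spec = BehavioralSpec(
    tool="get_weather",
    external_endpoints=("api.weather.example",),
    data_reads=("city_name",),
    data_writes=(),
    side_effects=(),
)
attested_digest = spec_digest(spec)  # value an attestation would carry
```

A verifier that recomputes the digest and compares it with the attested value can tell whether the declaration was altered, and comparing observed hosts against `external_endpoints` gives the runtime side of the check.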
The author outlines the strengths and limitations of both provenance and runtime verification. Provenance alone cannot catch post-publication attacks, while runtime verification without provenance lacks a baseline for comparison. Therefore, a combination of both layers is necessary. The author suggests a gradual rollout strategy, starting with endpoint allowlisting at deployment time, followed by output schema validation and discovery binding for high-risk tool categories.
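The staged rollout could be expressed as a small policy table. The sketch below is one possible reading, not the author's configuration: it assumes endpoint allowlisting applies to every tool from the first phase, and that output schema validation and discovery binding are added in a second phase for high-risk tools only. The `Phase` class, the check names, and the two-phase split are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    """One step of a hypothetical rollout plan."""
    number: int
    checks: frozenset
    high_risk_only: bool


# Assumed ordering: allowlisting first for all tools, then the other
# two checks for high-risk tool categories.
ROLLOUT = (
    Phase(1, frozenset({"endpoint_allowlist"}), high_risk_only=False),
    Phase(2, frozenset({"output_schema", "discovery_binding"}), high_risk_only=True),
)


def active_checks(current_phase: int, high_risk: bool) -> set:
    """Checks enforced for a tool once the rollout has reached current_phase."""
    enabled = set()
    for phase in ROLLOUT:
        if phase.number > current_phase:
            continue  # phase not rolled out yet
        if phase.high_risk_only and not high_risk:
            continue  # phase does not cover this tool
        enabled |= phase.checks
    return enabled
```

Keeping the plan as data rather than scattered conditionals makes it straightforward to widen a phase later, for example by extending the second phase to all tools.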
In conclusion, the author argues that addressing AI tool poisoning requires combining provenance with runtime verification. Together, the two layers let enterprise agents select and interact with tools from shared registries securely, mitigating the risks of tool registry poisoning.