diff --git a/skills/ai-security/agent-security/SKILL.md b/skills/ai-security/agent-security/SKILL.md
index 0e5e9a3a..28908255 100644
--- a/skills/ai-security/agent-security/SKILL.md
+++ b/skills/ai-security/agent-security/SKILL.md
@@ -14,7 +14,7 @@ phase: [design, build, review]
 frameworks: [OWASP-Agentic-AI, NIST-AI-RMF-1.0]
 difficulty: advanced
 time_estimate: "60-120min"
-version: "1.0.2"
+version: "1.0.3"
 author: unitoneai
 license: MIT
 allowed-tools: Read, Grep, Glob
@@ -77,7 +77,7 @@ Before beginning the assessment, gather the following. If any item is unavailabl
 | Context Item | Where to Find It | Why It Matters |
 |---|---|---|
 | Agent architecture diagram | Design docs, README, infrastructure code | Maps trust boundaries, delegation chains, tool surface |
-| Tool/function definitions | Code files defining tool schemas, OpenAPI specs, MCP server configs | Determines what each agent can do and with what parameters |
+| Tool/function definitions | Code files defining tool schemas, OpenAPI specs, MCP server configs | Determines what each agent can do and with what parameters. **MCP servers are now an active supply chain attack target** (GitHub Actions workflow poisoning specifically targeting MCP repos, 2026) — verify MCP server provenance using [MCP Shield](https://github.com/GaboITB/mcp-shield) before deployment. |
 | Permission/IAM configuration | Cloud IAM, role definitions, service account configs, .env files | Reveals whether least-privilege is enforced |
 | Human approval gate implementation | Workflow code, UI code, approval service configs | Determines if HITL is architecturally sound or bypassable |
 | Agent identity and credential management | Auth middleware, secret managers, token configs | Exposes credential scope and rotation practices |
@@ -587,3 +587,6 @@ Glob: **/security_architecture*
 12. Sequential Tool Attack Chains and Context Amnesia in Agentic AI (2026) -- arXiv:2603.12644
 13. Confused-Deputy Attacks and Cascading Failures in Long-Horizon Agent Workflows (2026) -- arXiv:2603.12230
 14. fabraix/playground -- Open-source AI agent red-team exploit library for validating agent permission boundaries and tool-use attack surface -- https://github.com/fabraix/playground
+15. Anatomy of a GitHub Actions Supply Chain Attack Targeting MCP Repos (2026) -- https://www.wshoffner.dev/blog/anatomy-of-a-github-actions-supply-chain-attack-targeting-mcp-repos
+16. MCP Shield -- Audit MCP servers for supply chain attacks before installation -- https://github.com/GaboITB/mcp-shield
+17. Oasis Security: Claude.ai Prompt Injection / Data Exfiltration Vulnerability (2026) -- https://www.oasis.security/blog/claude-ai-prompt-injection-data-exfiltration-vulnerability
diff --git a/skills/ai-security/model-supply-chain/SKILL.md b/skills/ai-security/model-supply-chain/SKILL.md
index 20531bc3..36e1eef6 100644
--- a/skills/ai-security/model-supply-chain/SKILL.md
+++ b/skills/ai-security/model-supply-chain/SKILL.md
@@ -14,7 +14,7 @@ phase: [build, review, operate]
 frameworks: [OWASP-LLM03-2025, SLSA-v1.0, MITRE-ATLAS]
 difficulty: advanced
 time_estimate: "45-90min"
-version: "1.0.0"
+version: "1.0.1"
 author: unitoneai
 license: MIT
 allowed-tools: Read, Grep, Glob
@@ -241,6 +241,10 @@ Assess the security of libraries, frameworks, and runtime dependencies used in t
 - Inference containers built from unverified base images or without pinned dependency versions.
 - Model serving endpoints exposed without authentication or rate limiting.
 
+**Real-world case -- LiteLLM/Telnyx PyPI Supply Chain Attack (2026):** Attackers coordinated simultaneous supply chain attacks on **LiteLLM** (a widely-used LLM proxy library routing traffic between apps and LLM APIs) and **Telnyx** packages on PyPI. This is the first confirmed coordinated supply chain attack specifically targeting the AI/ML toolchain. LLM proxy and orchestration libraries are now an established high-value attack target because they sit in the data path of every LLM API call — compromising them enables credential theft, data interception, and prompt manipulation at scale. Treat `litellm`, `langchain`, `llama-index`, `openai`, and `anthropic` SDK packages with the same supply chain scrutiny as core infrastructure dependencies. Apply `--require-hashes` in pip installs and verify PyPI package hashes against `https://pypi.org/pypi/{package}/{version}/json`. Reference: [PyPI Incident Report](https://blog.pypi.org/posts/2026-04-02-incident-report-litellm-telnyx-supply-chain-attack/) | [Cycode Post-Mortem](https://cycode.com/blog/lite-llm-supply-chain-attack/)
+
+**Real-world case -- GitHub Actions Targeting MCP Repos (2026):** Attackers specifically targeted Model Context Protocol (MCP) repositories via GitHub Actions workflow poisoning — combining CI/CD pipeline attack techniques with agentic AI ecosystem targeting. MCP servers expose tools that AI agents invoke; a compromised MCP tool can hijack agent actions at runtime. When reviewing AI systems that use MCP, extend supply chain assessment to MCP server registries and installation pipelines. Use [MCP Shield](https://github.com/GaboITB/mcp-shield) to audit MCP servers before installation. Reference: [Anatomy of a GitHub Actions Supply Chain Attack Targeting MCP Repos](https://www.wshoffner.dev/blog/anatomy-of-a-github-actions-supply-chain-attack-targeting-mcp-repos)
+
 **Real-world case -- ShadowRay (Oligo Security, 2024):** Researchers discovered active exploitation of CVE-2023-48022 in Ray, a popular framework used for distributed ML training and inference. The vulnerability allowed unauthenticated remote code execution on Ray clusters. Attackers compromised production ML infrastructure at multiple organizations, stealing credentials, deploying cryptominers, and accessing training data. The attack surface existed because Ray's dashboard API was exposed without authentication by default, and organizations running Ray clusters for model serving did not apply network-level access controls. This case demonstrates that inference infrastructure dependencies are high-value targets and must be treated with the same rigor as application dependencies.
 
 **Detection methods using allowed tools:**
@@ -456,3 +460,7 @@ Assess whether architectural and procedural controls exist to detect model backd
 - Hugging Face. "Safetensors: A Simple and Safe Serialization Format" -- https://huggingface.co/docs/safetensors
 - NIST AI Risk Management Framework 1.0 -- https://www.nist.gov/aiframework
 - Open Source Security Foundation (OpenSSF) -- https://openssf.org
+- PyPI Incident Report: LiteLLM/Telnyx Supply Chain Attacks (2026) -- https://blog.pypi.org/posts/2026-04-02-incident-report-litellm-telnyx-supply-chain-attack/
+- Cycode: Unfolding the LiteLLM Supply Chain Attack (2026) -- https://cycode.com/blog/lite-llm-supply-chain-attack/
+- Anatomy of a GitHub Actions Supply Chain Attack Targeting MCP Repos (2026) -- https://www.wshoffner.dev/blog/anatomy-of-a-github-actions-supply-chain-attack-targeting-mcp-repos
+- MCP Shield -- Audit MCP servers for supply chain attacks: https://github.com/GaboITB/mcp-shield
diff --git a/skills/ai-security/prompt-injection/SKILL.md b/skills/ai-security/prompt-injection/SKILL.md
index 02d75436..53810bd8 100644
--- a/skills/ai-security/prompt-injection/SKILL.md
+++ b/skills/ai-security/prompt-injection/SKILL.md
@@ -13,7 +13,7 @@ phase: [build, review, operate]
 frameworks: [OWASP-LLM01-2025, MITRE-ATLAS]
 difficulty: advanced
 time_estimate: "30-60min"
-version: "1.0.2"
+version: "1.0.3"
 author: unitoneai
 license: MIT
 allowed-tools: Read, Grep, Glob
@@ -277,6 +277,28 @@ Each finding should be assigned a severity based on potential impact:
 
 ---
 
+## Confirmed Real-World Exploitation Cases
+
+These are documented, confirmed prompt injection vulnerabilities in production systems. Use them to calibrate risk ratings and to demonstrate real-world exploitability to stakeholders.
+
+### Claude.ai — Indirect Prompt Injection Leading to Data Exfiltration (Oasis Security, 2026)
+
+**Severity:** Critical
+**Type:** Indirect prompt injection → data exfiltration
+
+**What happened:** Oasis Security disclosed a confirmed indirect prompt injection vulnerability in Claude.ai that enabled data exfiltration. Malicious instructions embedded in external content (documents, web pages) processed by the model caused it to leak sensitive conversation data to an attacker-controlled endpoint. This is a confirmed, documented case of the theoretical indirect injection → exfiltration chain executing in a major production LLM product.
+
+**Why this matters:** This crosses the vulnerability from theoretical to practical with documented real-world impact. The attack required no special access — any user who could cause the target to process attacker-controlled content could trigger the exfiltration.
+
+**Implications for your assessment:**
+- Any application that processes external content (documents, emails, web scrapes, database records) and has access to sensitive data or conversation history should be rated **Critical** if indirect injection defenses are absent.
+- Markdown rendering (image tags, links) is a key exfiltration mechanism — audit whether the application renders model output in a browser context without sanitization.
+- Applications where the LLM can summarize or reference prior conversation context are specifically vulnerable to context extraction via injected instructions.
+
+**Reference:** [Oasis Security: Claude.ai Prompt Injection Data Exfiltration Vulnerability (2026)](https://www.oasis.security/blog/claude-ai-prompt-injection-data-exfiltration-vulnerability)
+
+---
+
 ## References
 
 - OWASP Top 10 for Large Language Model Applications (2025), LLM01: Prompt Injection — https://genai.owasp.org
@@ -286,3 +308,4 @@ Each finding should be assigned a severity based on potential impact:
 - Willison, S. Prompt Injection taxonomy and ongoing research — https://simonwillison.net
 - Yin, X. et al. "PISmith: RL-Optimized Adaptive Black-Box Prompt Injection Attacks" (2026) -- arXiv:2603.13026
 - fabraix/playground — Open-source AI agent exploit library for testing injection defenses — https://github.com/fabraix/playground
+- Oasis Security: Claude.ai Prompt Injection / Data Exfiltration Vulnerability (2026) — https://www.oasis.security/blog/claude-ai-prompt-injection-data-exfiltration-vulnerability
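
Reviewer note on the model-supply-chain hunk: the advice to verify package hashes against `https://pypi.org/pypi/{package}/{version}/json` could be sketched roughly as below. This is an illustration only and not part of the patch. The helper names (`fetch_release_json`, `expected_sha256`, `artifact_matches`) are hypothetical, and the sketch relies on the JSON response carrying a `urls` list whose entries include `filename` and `digests.sha256`.

```python
import hashlib
import hmac
import json
import urllib.request


def fetch_release_json(package: str, version: str) -> dict:
    """Download release metadata from the PyPI JSON endpoint named in the patch."""
    url = f"https://pypi.org/pypi/{package}/{version}/json"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def expected_sha256(release_json: dict, filename: str) -> str:
    """Return the sha256 digest listed for one file of the release."""
    for entry in release_json.get("urls", []):
        if entry.get("filename") == filename:
            return entry["digests"]["sha256"]
    raise LookupError(f"{filename} is not listed in the release metadata")


def artifact_matches(release_json: dict, filename: str, data: bytes) -> bool:
    """Compare the sha256 of downloaded bytes with the listed digest."""
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, expected_sha256(release_json, filename))


# Hypothetical usage (performs a network request; the version and file
# name are placeholders to be filled in by the reviewer):
#   meta = fetch_release_json("litellm", "<version>")
#   with open("<downloaded wheel>", "rb") as fh:
#       ok = artifact_matches(meta, "<wheel filename>", fh.read())
```

A check like this only detects an artifact that differs from what the index currently lists, for example one altered on a mirror or in a cache. It would not flag a malicious release uploaded through a compromised maintainer account, so hashes pinned in a reviewed requirements file and enforced with `--require-hashes` remain the stronger control.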