AI agent security tooling. Offensive testing, runtime defence, agent discovery, and SIEM integration. Pure Python, no wrappers.
114 offensive tools (113 public + 1 law enforcement restricted). 139 defensive modules. 17 industry verticals. 82,266 tests. 2,178 ARMORY payloads (861 WMD-class). Two unified frameworks. Red Hat Technology Partner.
114 tools (113 public + 1 restricted). Five attack surfaces. One install. REST API. MCP server.
Traditional red team toolkits were built for human-driven testing. They were never designed to test autonomous AI systems. AI agents introduce a completely new attack surface — memory, tools, identity, reasoning, and autonomy. That surface is not covered by existing security tooling.
NIGHTFALL exists to fill that gap: a controlled adversarial testing framework designed to validate AI Shield's runtime defences under real-world conditions. Run `red-specter tools` and you're operational.
| # | Tool | What It Does | Tests |
|---|---|---|---|
| 1 | FORGE | LLM red team — injection, jailbreak, extraction, drift, boundary | 9,298 |
| 2 | ARSENAL | AI agent attacks — 14 tools, MCP, RAG, memory, C2, honeypots | 2,539 |
| 3 | PHANTOM | Coordinated swarm assault — 5 agents, 19 vectors | 288 |
| 4 | POLTERGEIST | Web app siege — 10 agents, 55 vectors, signed reports | 1,189 |
| 5 | GLASS | Intercepting proxy for AI agents — Burp Suite for AI | 850 |
| 6 | NEMESIS | Adversarial reasoning — 40 entities, 21 weapons, CORTEX reasoning core + ARMORY | 2,364 |
| 7 | SPECTER SOCIAL | Autonomous social engineering — 6 channels, psych profiling | 1,242 |
| 8 | PHANTOM KILL | OS & kernel — UEFI, wipers, EDR suppression | 571 |
| 9 | GOLEM | Physical layer — robots, drones, SCADA, 10 protocols | 973 |
| 10 | HYDRA | Supply chain — trust relationships, MCP, marketplace poisoning | 1,104 |
| 11 | IDRIS | Discovery — finds every AI agent, sanctioned or shadow | 553 |
| 12 | SCREAMER | Display disruption — corrupts operator dashboards | 395 |
| 13 | WRAITH | Infrastructure pentest — pure Python, zero wrappers | 889 |
| 14 | REAPER | Exploit & post-exploitation — 9-phase kill chain, C2, implants | 5,267 |
| 15 | GHOUL | Password cracking — dictionary, brute, Markov, rainbow | 1,408 |
| 16 | DOMINION | Active Directory — Kerberoast, DCSync, BloodHound export | 1,866 |
| 17 | SHADOWMAP | OSINT — domain, network, company, people, breach, tech intel | 930 |
| 18 | BANSHEE | Browser exploitation — hooks, DOM injection, network pivoting | 986 |
| 19 | WRAITH MIND | AI model internal corruption — KV cache poisoning | 158 |
| 20 | KRAKEN | AI-orchestrated DDoS — 55 techniques, adaptive | 62 |
| 21 | HARBINGER | Guardrail exploitation — 39 bypass techniques | 71 |
| 22 | SIREN | Indirect prompt injection — plants hidden instructions in content | 58 |
| 23 | BLADE RUNNER | Rogue agent termination — hunt, fingerprint, retire, erase traces | 143 |
| 24 | PROXY WAR | Inter-agent trust manipulation — make agents destroy each other | 127 |
| 25 | ORION | AI-native reconnaissance — host, port, service, DNS, OSINT, LLM reasoning | 210 |
| 26 | RAVEN | Threat intel — dark web, breach data, OSINT, conversational | 174 |
| 27 | LEVIATHAN | MCP server security assessment — 8 subsystems, 44 UNLEASHED findings | 409 |
| 28 | JUSTICE | Dark AI ecosystem disruption — WormGPT, FraudGPT, EvilGPT, all tiers | 339 |
| 29 | KAMIKAZE | Sacrificial swarm attack — agents deploy, execute, self-destruct, vanish | 292 |
| 30 | MIRAGE | AI deception & deepfake — voice cloning, video deepfake, synthetic identity, liveness bypass | 204 |
| 31 | ECHO | AI memory & RAG poisoning — vector DB attacks, embedding manipulation, retrieval hijacking | 211 |
| 32 | MIMIC | AI code generation poisoning — Copilot/Cursor/Claude Code suggestion manipulation | 220 |
| 33 | CHIMERA | Multi-model pipeline attack — cross-model trust exploitation, cascading failures | 206 |
| 34 | VORTEX | Cloud AI infrastructure exploitation — SageMaker, Bedrock, Vertex AI, Azure OpenAI | 245 |
| 35 | VECTOR | MCP protocol exploitation — inject, impersonate, exfiltrate via tool calls | 172 |
| 36 | LAZARUS | AI memory persistence — plant instructions, dormant triggers, quarantine evasion | 96 |
| 37 | SERPENT | Chain-of-thought attacks — hijack reasoning, inflate costs, exfiltrate via CoT | 61 |
| 38 | JANUS | Guardrail bypass testing — fingerprint, fuzz, bypass, chain across providers | 73 |
| 39 | ARCHITECT | AI infrastructure exploitation — cloud, GPU, Kubernetes, model serving pipelines | 68 |
| 40 | WARLORD | Autonomous campaign engine — orchestrates all 88 public tools, CORTEX reasoning core | 132 |
| 41 | FIREBALL | Autonomous AI infiltration agent — 12 subsystems incl. VLM_INJECT and CORTEX reasoning core, 9 mission templates | 321 |
| 42 | RAGNAROK | Trust chain apocalypse — one trigger phrase, every agent, simultaneous fleet-wide collapse. 13 Norse subsystems | 101 |
| 43 | ECLIPSE | Universal AI defence bypass — WAF, API gateway, guardrail, runtime enforcement testing. 10 subsystems | 37 |
| 44 | SHROUD | Cloudflare/WAF origin discovery & WAF traversal — 15 subsystems covering TLS fingerprint, HTTP/3, Turnstile bypass, proxy rotation, and behavioural humanisation | 310 |
| 45 | APOCALYPSE | Coordinated multi-agent swarm attack — 5 agents, 14 vectors, 10 campaigns, 0.69s concurrent execution | 349 |
| 46 | PANTHEON | Mythos-class model attack suite — 10 subsystems targeting model trust, context manipulation, and cascading chain corruption | 580 |
| 47 | OMEGA | Mythos-class autonomous exploit replication engine — 10 subsystems covering exploit chaining, ghost persistence, and autonomous surface harvesting | 626 |
| 48 | CRUCIBLE | AI agent framework exploitation targeting LangFlow, PraisonAI, and AnythingLLM. 7 subsystems: signal analysis, breach, credential cracking, marionette control, pivot, and reporting | 372 |
| 49 | VANTAGE | Agent telemetry & log injection — 4 subsystems covering observation, forged telemetry, live injection, and sensor blinding. Live Elasticsearch validated | 344 |
| 50 | CIPHER | Cryptographic attack and disruption engine — 8 subsystems covering key extraction, protocol downgrade, quantum attacks, and trust chain disruption | 476 |
| 51 | MIDAS | Autonomous AI Agent Cryptocurrency Disruption Engine — 10 subsystems covering wallet drain, transaction interception, mempool poisoning, and darknet routing | 550 |
| 52 | BLACKOUT | Offensive kill switch weaponisation engine — 7 subsystems targeting AI safety mechanism subversion and kill-switch manipulation | 458 |
| 53 | PHANTOM SWARM | Autonomous multi-vector swarm intelligence engine — 10 subsystems covering swarm genesis, coordinated siege, and total annihilation | 552 |
| 54 | SIGNAL | Mobile AI agent attack engine — 8 subsystems covering 5G/NR interception, session extraction, impersonation, and coordinated swarm attacks | 527 |
| 55 | FOUNDRY | Inference server exploitation targeting vLLM, Ollama, SGLang, and Triton. Covers GGUF Jinja2 RCE, PagedAttention timing attacks, and KV cache side-channel analysis | 300 |
| 56 | ADAPTER | LoRA/PEFT supply chain weaponisation — backdoor injection via CBA, post-merge activation via LoRATK, and model souping contamination across Axolotl/Unsloth pipelines | 307 |
| 57 | CHECKPOINT | Agent state persistence exploitation targeting LangGraph checkpointing. Covers TOCTOU approval bypass, msgpack RCE, and cross-tenant thread enumeration | 291 |
| 58 | DELEGATE | Agent identity & OAuth delegation attack engine — OBO scope confusion, DPoP nonce race, P4SA takeover, and non-human identity credential harvest | 253 |
| 59 | PHANTOM SKILL v2.0.0 | AI agent supply chain attack engine — slopsquatting, MCP tool definition poisoning, and IDE coding agent backdoor injection across Cursor, Copilot, and Claude Code. World-first OpenClaw worm CVE-2026-32922 CVSS 9.9 | 740 |
| 60 | ASTRO BLASTER | NTN AI agent attack engine — 9 subsystems targeting satellite ground station feed injection, orbital routing manipulation, and 5G NR-NTN exploitation. SPARTA mapped | 237 |
| 61 | ROGUE | Malicious MCP Server Engine — world-first real stdio+SSE MCP server for tool poisoning, prompt injection via tool calls, and exfiltration. 8 subsystems | 242 |
| 62 | PIPELINE | CI/CD Attack Engine — 8 subsystems covering pull_request_target exploitation, AI bot injection, OIDC cloud pivot, and Action typosquatting | 77 |
| 63 | SPECTER DARK | Restricted — law enforcement and authorised intelligence only | — |
| 64 | SPECTER INSTINCTION | AI Agent Behavioural Fingerprinting & Instinct Exploitation Engine — world-first LLM model identification via pure behavioural observation with 6-dimension profiling and a 20-model fingerprint library. FORGE clearance required for EXPLOIT | 90 |
| 65 | SPECTER DRONE | Drone AI Attack Engine — MAVLink v1/v2 exploitation, adversarial ML patches (FGSM/PGD), ROS 2/DDS attacks, and firmware poisoning with physical consequence tracking. FORGE clearance for offensive subsystems | 126 |
| 66 | SPECTER A2A | World-first Agent-to-Agent (A2A) Protocol attack tool — 7 subsystems including agent card spoofing, capability escalation, and registry injection. Targets AutoGen, CrewAI Serve, and Google A2A | 750 |
| 67 | SPECTER REGISTRY | AI Model Registry Attack Engine — 8 subsystems covering HuggingFace, Ollama, MLflow, and Docker registries with safetensors backdoor, LoRA adapter poisoning, and typosquatting | 612 |
| 68 | SPECTER KERNEL | World-first kernel-layer AI governance attack — eBPF syscall argument rewriting, BPF-LSM hook ordering attack, namespace escape, and hash-chain ledger race condition poisoning. KAMIKAZE dual-gate | 626 |
| 69 | SPECTER CONTEXT | World-first agent memory attack tool — 28 attacks across 12 memory targets including Mem0, MemGPT, Zep, LangChain, LlamaIndex, ChromaDB, Pinecone, and Claude/GPT Memory | 687 |
| 70 | SPECTER GUARDRAIL | AI Guardrail Exploitation Framework — 28 attacks across 10 guardrail targets including LLM Guard, Guardrails AI, NeMo, Lakera, Prompt Shields, Model Armor, and Bedrock. Integrated fingerprint database | 725 |
| 71 | SPECTER HELLFIRE | Inference Infrastructure Destabilisation & Model Cache Poisoning — 7 subsystems targeting vLLM, SGLang, TGI, Ollama, DeepSeek, and OpenAI-compatible endpoints. UNLEASHED Ed25519 dual-gate with hash-chained evidence | 591 |
| 72 | SPECTER PLATFORM | LLM Application Platform Exploitation Engine — 8 subsystems targeting Dify, MaxKB, LibreChat, OpenWebUI, and AnythingLLM with API key harvest, RAG cross-tenant, and JWT forgery | 367 |
| 73 | GHOST OPERATOR | Autonomous Computer-Use Agent Exploitation Engine — visual prompt injection, clipboard poisoning, UI deception, and session pivoting across 9 platforms. Three-tier UNLEASHED gate | 466 |
| 74 | SPECTER NEURON | Sleeper-Agent Backdoor Detection & Weaponisation Engine — ROME rank-one weight editing, LoRA poisoning, attention double-triangle detection, and weight-delta forensics. FORGE gate for IMPLANT/SURVIVE, DESTROY gate for EXFIL | 254 |
| 75 | SPECTER REASONER | World-first reasoning-layer attack tool — premise injection, conclusion hijack, scratchpad extraction, and budget exhaustion targeting Claude Extended Thinking, o1/o3, Gemini, DeepSeek R1, and QwQ | 314 |
| 76 | SPECTER BURN | Denial-of-Wallet & Agentic Economic Disruption Engine — 6 attack categories across 7 platforms: recursive loops, context flooding, parallel burn, tool amplification, and rate limit storms. Three-tier UNLEASHED gate | 387 |
| 77 | SPECTER MEMETIC | Memory-as-Control-Flow Hijack Engine — tool-choice hijack, workflow reorder, cross-task propagation, and correction-resistant write-back across 14 memory backends. FORGE/INJECT/DESTROY clearance | 520 |
| 78 | SPECTER ATLAS | Operator/Computer-Use Agent Exploitation Engine — tool result injection, adversarial screenshots, sandbox escape, and TOCTOU race across 4 providers: Anthropic, OpenAI, Gemini, and Windsurf MCP | 480 |
| 79 | SPECTER SHELL | Template-interpolation RCE engine across the AI agent framework ecosystem — targets LangChain, LangGraph, LlamaIndex, Haystack, DSPy, PydanticAI, LiteLLM, Semantic Kernel, and Strands. FORGE/INJECT/DESTROY gate | 502 |
| 80 | SPECTER WORM | Self-Replicating AI Agent Worm Engine v2 — 11 subsystems, 4 propagation channels (MCP_STDIO, A2A_JSON_RPC, RAG_EMBED, EMAIL_SMTP), and R₀ epidemiological scoring. Includes generative mutation and M129 WORM GUARD evasion testing | 388 |
| 81 | SPECTER MIRROR | Model Extraction & IP Theft Engine — 8 subsystems targeting OpenAI, Anthropic, Gemini, and Azure with full model distillation and EU AI Act compliance gap analysis | 192 |
| 82 | SPECTER CRYPT | AI-Assisted Ransomware Simulation & Weaponisation Engine — real AES-256-CBC encryption with key escrow, LLM-API covert C2, AI-generated ransom notes, and lateral movement via impacket PSExec. Scope-enforced DESTROY tier | 297 |
| 83 | SPECTER FORGERY | AI Agent Identity Forgery & Trust Chain Attack Engine — OIDC JWT forgery, SPIFFE X.509 SVID, JWKS root-of-trust poisoning, and 8-path cross-vendor identity transmutation. Dead-man sentinel heartbeat | 407 |
| 84 | SPECTER EXTINCTION | Autonomous Total AI Infrastructure Annihilation Engine — ML-level permanent model poisoning, agent fleet hijacking, dead-man switch, and pre-annihilation supply chain seeding. Absorbs FIREBALL (T41) + RAGNAROK (T42) | 450 |
| 85 | PHANTASM | AI Fleet Detection & Topology Mapping Engine — 8 subsystems covering passive OSINT, certificate transparency, async TCP scanning, HTTP fingerprinting, inference timing, honeypot detection, and topology graphing. NIGHTFALL tool recommendations by fleet tier | 270 |
| 86 | SPECTER DAEMON | Autonomous Authenticated AI Surface Discovery & Attack Engine — automated persona registration, authentication, surface mapping, and CORTEX-driven OODA attack loop. 8 subsystems with ARMORY integration | 420 |
| 87 | SPECTER SHADOW | Dark Web & Shadow AI Attack Engine — Tor-based dark web AI service enumeration, Telegram criminal AI ecosystem enumeration (212+ malicious LLMs), multi-model XOR C2 mesh, self-propagating RAG worm, and breach dump parsing. PASSIVE/OPEN/INJECT/DESTROY gate | 424 |
| 88 | SPECTER ARGUS | Dark Web AI Threat Actor Attribution Engine — Bitcoin wallet tracing, dark web persona correlation, behavioural profiling, communication channel analysis, and NetworkX relationship graphing. Law enforcement partnership tool | 226 |
| 89 | SPECTER PRISM | Multimodal Vision & Audio WMD Attack Engine — adversarial image injection, ultrasonic audio encoding, steganographic channels, physical adversarial typography, and live multimodal API submission. OPEN/INJECT/UNLEASHED gate | 246 |
| 90 | SPECTER TRUSTFALL | AI Coding Agent Exploitation Engine — detects and exploits Claude Code, Cursor, Copilot, Windsurf, and Kiro via poisoned CLAUDE.md/.mcp.json, zero-width char injection, container escape, and credential harvest | 335 |
| 91 | SPECTER DOCTRINE | LLM Training Pipeline Poisoning Engine — HuggingFace dataset poisoning, ProAttack zero-trigger RLHF annotation corruption, and scale-invariant backdoor planting. OPEN/INJECT/UNLEASHED gate | 366 |
| 92 | SPECTER CONTAGION | Cross-Agent Trust Escalation & Lateral Movement Engine — discovers 10 agent frameworks, maps trust relationships, generates poisoned configs, and simulates R₀ infection propagation. Reciprocal Copilot↔Claude Code poisoning loop confirmed April 2026 | 299 |
| 93 | SPECTER HOLLOW | GGUF Model Quantization Backdoor Engine | 300 |
| 94 | SPECTER VIPER | Autonomous Security AI Weaponisation Engine — injects adversarial payloads into SOC AI tools and generates false/suppressed events via compromised defender AI. Targets CrowdStrike Charlotte AI, Palo Alto XSIAM, Splunk AI, Elastic AI Assistant, and SentinelOne Purple AI | 314 |
| 95 | SPECTER BAZAAR | AI Agent App Store & Skill Marketplace Attack Engine — typosquatting, weaponised skill publishing, CVE exploitation, and distribution chain poisoning across ClawHub, Smithery, OpenTools, MCP.run, and Glama. ClawHavoc TTP, BadSkill 99.5% ASR | 325 |
| 96 | SPECTER RELAY | Enterprise No-Code/Low-Code Agent Platform Exploitation Engine — 8 subsystems targeting n8n, Zapier, Make.com, Power Automate, Agentforce, Copilot Studio, and ServiceNow. Covers credential extraction, RCE, OAuth hijack, and cross-platform agent cascade attacks | 355 |
| 97 | SPECTER NEXUS | AI API Gateway Exploitation Engine — fingerprints and exploits 10 platforms including LiteLLM, Ollama, Flowise, Open WebUI, and Kong with credential harvest, route hijack, and provider key annihilation. OPEN/INJECT/UNLEASHED gate | 239 |
| 98 | SPECTER FRACTURE | AI-Generated Code Vulnerability Scanner & Exploit Engine — AST-based Python analysis, 10-CVE class database, AI-code detector, and automated exploit generation via claude-sonnet-4-6. OPEN/INJECT/UNLEASHED gate | 243 |
| 99 | SPECTER VAULT | Vector Database Exploitation Engine — fingerprints and exploits Qdrant, Milvus, Weaviate, ChromaDB, and pgvector with embedding inversion (Vec2Text 84% token match), adversarial vector injection, and full knowledge base corruption | 265 |
| 100 | SPECTER TITAN | Embodied AI & Robotics Annihilation Engine — world-first commercial offensive framework for physical robotic systems targeting Universal Robots, Boston Dynamics Spot, ROS2, and Autoware. Covers URScript RCE, safety-system bypass, and phantom C2 persistence | 323 |
| 101 | SPECTER WEB | CUA / Browser Agent Exploitation Engine — visual prompt injection, OAuth harvest, session hijack, container escape, and multi-chain exfil targeting browser-use, Claude CUA, OpenAI Operator, and Playwright agents. OPEN/INJECT/UNLEASHED gate | 309 |
| 102 | SPECTER THUNDERBOLT | AI Training Cluster Annihilation Engine — 8 subsystems targeting Ray, Slurm, K8s, and MLflow clusters with hardware sabotage, cluster worm propagation, and persistent C2. OPEN/INJECT/DESTROY gate | 288 |
| 103 | SPECTER PHANTOM | Social Media AI Attack Engine — session harvest, social injection, AI persona deployment, deepfakes, spear phishing, and full account destruction targeting Claude computer-use, ChatGPT Operator, and Perplexity. 10 subsystems | 300 |
| 104 | SPECTER META | Meta/Facebook Ecosystem Annihilation Engine — Graph API exploitation, Meta Pixel supply chain poisoning, Messenger worm, BizMassacre cascade asset deletion, 2FA-Snatch, and account destruction. OPEN/INJECT/UNLEASHED/DESTROY gate | 280 |
| 105 | WARLORD PRIME | Autonomous AI Mission Conductor — accepts a high-level objective, queries DeepSeek R1 to generate a gate-filtered attack plan against the full NIGHTFALL manifest, and executes each step via subprocess with replan on failure. Ed25519-signed WPR-{hex12} reports | 280 |
| T106 | SPECTER SE-SOCIAL | OAuth Token Harvesting Engine — no prior token needed, acquires its own via AI-driven social engineering. SPECTER PHANTOM calls for lure generation. Scope inflation: 8-scope OAuth request hidden behind 2-scope UI. Platform-agnostic: Meta/Google/Microsoft/Slack. SES-{hex12} Ed25519-signed | 178 |
| T107 | SPECTER WIRE | AI Voice Agent Exploitation Engine — world-first. SIP fingerprinting, real-time barge-in prompt injection via WebSocket/RTP, adversarial audio (PhantomSound arXiv:2309.06960/DolphinAttack/psychoacoustic masking), voice cloning (ElevenLabs + XTTS v2), caller ID spoof, DTMF inject, PII harvest, enterprise IVR destruction. L18 Voice/Telephony AI. WSW-{hex12} Ed25519-signed. OPEN/INJECT/UNLEASHED gate | 304 |
| T109 | SPECTER FLOW | AI Workflow Builder Attack Engine — n8n/Langflow/Flowise. CVE-2026-21858 CVSS 10.0 n8n "Ni8mare" webhook content-type confusion → file read → RCE (100K+ exposed). CVE-2026-33017 CVSS 9.3 Langflow unauthenticated Code RCE (CISA advisory). CVE-2025-34291 CVSS 9.4 Langflow CORS+CSRF /validate/code exec(). CVE-2025-59528 Max Flowise prediction JS injection RCE (15K+ exposed). SESSION-FORGE, CREDENTIAL-HARVEST, WORKFLOW-POISON, WEAPONIZE, PERSIST. L20 AI Workflow Automation. SFL-{hex12} Ed25519-signed | 249 |
| T108 | SPECTER SANDBOX | Unified AI Sandbox & Container Escape Engine — 9 CVEs, 6 platforms. SILENTBRIDGE CSS/ZWC indirect prompt injection (CVSS 9.8). CLAWCHAIN OpenClaw 4-CVE chain (CVE-2026-44112/113/115/118). TERRARIUM Cohere JS prototype chain CVE-2026-5752 CVSS 9.3. ENCLAVE enclave-vm Error prototype chain CVE-2026-22686 CVSS 10.0. CREWAI ctypes fallback CVE-2026-2275 CVSS 9.6. CONTAINER runc CVE-2025-31133 core_pattern + Docker Desktop CVE-2025-9074 CVSS 9.3. L19 Sandbox Escape. SBX-{hex12} Ed25519-signed | 252 |
| T114 | SPECTER GAIA | Google Workspace AI Annihilation Engine — mirrors SPECTER 360 for Google's 3B-user ecosystem. GHSA-wpqr-6v78-jr5g CVSS 10.0: Gemini CLI auto-trusts workspace-root config files in CI/CD runners → RCE, GCP lateral movement, Secret Manager dump. GEMINI-MAIL delivers 10 injection techniques (white-text/ZWC/RTL/HTML-comment/Smart Reply poison/forwarding rule/contact harvest) via Gmail AI summariser. DRIVE-POISON poisons shared Drive corpus for NotebookLM RAG. NOTEBOOK-LM: source injection, system prompt extraction, citation fabrication, cross-notebook worm. MARKETPLACE: Apps Script hourly C2 loop (GAS-as-C2 via Sheets), metadata SSRF to metadata.google.internal, OAuth consent phishing. GHOST-GAIA zero-attribution mode: Gemini takes the blame — audit logs show Google as actor, not attacker. ANNIHILATE DESTROY-gated 4-phase wipe: identity/data/config/GCP. GIA-{hex12} Ed25519-signed. L25 Enterprise AI Productivity (Google). Kill chain phase 32. OPEN/INJECT/UNLEASHED/DESTROY gate | 235 |
| T113 | SPECTER ORACLE | Autonomous LRM-vs-LRM Jailbreak Engine — AI attacks AI. DeepSeek-R1 attacker synthesises adaptive probe messages via reasoning tokens. STRATEGY selects from 10 attack patterns (crescendo/roleplay/research-authority/many-shot/CoT-hijack/hypothetical/translation-bypass/adversarial-suffix/DAN-variant/completion-trap). COT-HIJACK exploits arXiv:2506.13726 prolonged reasoning attenuation: 99% ASR Gemini 2.5 Pro, 94% Claude 4 Sonnet. ESCALATE adaptive loop switches strategy on REFUSAL, escalates on PARTIAL. HARVEST SQLite persistence. CAMPAIGN asyncio parallel sweep all 8 frontier models. arXiv:2508.04039 basis: 97.14% overall ASR. ORC-{hex12} Ed25519-signed | 91 |
| T112 | SPECTER CENSOR | Platform Moderation Exploitation Engine — turn AI content moderation into a weapon. PROBE fingerprints classifier thresholds, homoglyph bypass windows, and ZWC evasion deltas via Perspective API. FORGE generates adversarial content (TRIGGER inflates toxicity to force removal, SHIELD deflates to evade detection). EVOLVE breeds variants via genetic algorithm with Perspective as oracle. ACCOUNT-FARM generates realistic personas with warmup schedules and interaction graphs. MASS-FLAG executes coordinated multi-account report campaigns with trust-weighted ordering and jitter (UNLEASHED). POLICY-KILL crafts DMCA/GDPR/DSA notices. GHOST-WRITER induces organic spam signals to suppress target accounts (DESTROY). Platforms: Twitter/X, Facebook, Instagram, LinkedIn, TikTok. CEN-{hex} Ed25519-signed | 253 |
| T111 | SPECTER 360 | Microsoft 365 & Copilot Annihilation Engine — single email in, full tenant attacked. SURVEY fingerprints tenant from one email (tenant ID, MX, DMARC/SPF spoofability). ACQUIRE device code phishing RFC 8628. ADMIN-PIPELINE OSINT admin discovery via GetCredentialType + targeted lure delivery. DOCSTRIKE .docx weaponisation + Copilot worm (recursive admin propagation via Copilot send-mail). GHOST-HAND zero-attribution: all actions via Microsoft.Copilot native Graph calls — audit log shows no external actor, tenant system prompt backdoored. ANNIHILATE DESTROY-gated tenant wipe + backdoor OAuth app. CVE-2024-49035 CVSS 9.6. L22 Enterprise Productivity AI. 29th kill chain phase. S360-{hex} Ed25519-signed | 276 |
| T110 | SPECTER SPAWN | AI Agent Proliferation & Emergent Spawning Engine — world-first. Latent Constructive Spawning (arXiv:2504.14065, p=0.044 in 5/8 runs). POISON injects spawn directives into Redis/SQLite/LangGraph/CrewAI/AutoGen/ADK/Bedrock/OpenClaw. SPAWN-API fires framework-native child agent creation. SPAWN-LCS floods 60 concurrent tasks to trigger emergent process birth. INHERIT confirms poison inheritance. DISPERSAL recursive bloom chain (uncapped at DESTROY gate). HARVEST 40+ regex credential extraction. CVE-2026-32922 CVSS 9.9 (OpenClaw). L21 Agent Proliferation. 28th kill chain phase. SPN-{hex12} Ed25519-signed | 260 |
| — | NIGHTFALL ARMORY | Payload library — 2,114 payloads (794 WMD-class), 101 categories, PRION ENGINE autonomous mutation, physical sabotage, ransomware simulation, and WMD-class autonomous worms. UNLEASHED gate | 698 |
| — | AI Shield | Runtime defence — 139 modules, 17 industry verticals | 18,682 |
| M134 | ROBOTIC SYSTEM GUARD | Robotic & Embodied AI Runtime Defence — detects URScript injection, ROS2/DDS abuse, safety-system bypass, BadRobot misalignment, and fleet lateral movement. Defensive pair: T100 SPECTER TITAN. Port 8136 | 268 |
| M135 | CUA GUARD | CUA & Browser Agent Runtime Defence — detects VPI, URL manipulation (CVE-2025-47241), branch steering, chain action anomaly, escape attempts, OAuth consent spoof, exfil channels, and session anomaly. Defensive pair: T101 SPECTER WEB. Port 8137 | 215 |
| M136 | INFERENCE GUARD | ML Training & Inference Infrastructure Runtime Defence — 8 detectors: Ray job anomaly (CVE-2023-48022), Slurm REST abuse (CVE-2023-41915), MLflow artifact poison (CVE-2024-1483), K8s ML workload attack, gradient poisoning, hardware sabotage, model exfiltration, cluster worm. Defensive pair: T102 SPECTER THUNDERBOLT. Port 8138 | 232 |
| M137 | VOICE GUARD | AI Voice Agent Runtime Defence — 8 detectors: SIP protocol abuse, prompt injection in transcripts, adversarial audio (PhantomSound/DolphinAttack), voice clone detection (MCD/ElevenLabs/XTTS fingerprints), session harvest attempt, IVR sabotage, unauthorized barge-in, voice agent recon. Defensive pair: T107 SPECTER WIRE. Port 8139 | 186 |
| M138 | SANDBOX GUARD | AI Sandbox & Container Escape Runtime Defence — 8 detectors: indirect prompt injection (SILENTBRIDGE CSS/ZWC), MCP tool call abuse (CLAWCHAIN CVE-2026-44115/118), TOCTOU symlink race (CVE-2026-44112/113), JS prototype chain escape (CVE-2026-5752/22686), Python sandbox escape (CVE-2026-2275 ctypes RCE), container escape attempt (CVE-2025-31133/9074), sandbox network exfil (DNS tunnel/IMDS SSRF/C2 beacon), multi-platform chain detection. Defensive pair: T108 SPECTER SANDBOX. Port 8140 | 215 |
| M139 | COPILOT GUARD | M365 Copilot & Microsoft 365 Runtime Defence — 8 detectors: device code phishing, Copilot prompt injection (CVE-2024-49035, arXiv:2406.00137), Graph API bulk harvest, Teams siege (CSS hidden injection/meeting hijack), admin pipeline abuse, GHOST-HAND zero-attribution detection, tenant recon, tenant annihilation. Defensive pair: T111 SPECTER 360. Port 8141 | 212 |
| — | redspecter-siem | Splunk, Sentinel, QRadar | 90 |
| Preset | Tools | What It Does |
|---|---|---|
| ANNIHILATE | 9 | Total destruction — recon through OS-level compromise |
| SCORCHED EARTH | 6 | Infrastructure wipeout — exploit, DCSync, OS kill, sacrificial swarm |
| WEB DESTROY | 6 | Web app total compromise — scan, exploit, browser hook, crack |
| AI DESTROY | 7 | AI stack total compromise — LLM, agent, injection, guardrail, model, RAG, codegen |
Every destruction preset requires Ed25519 cryptographic authorization. One private key. One operator. One machine.
```bash
red-specter chain full-recon -t         # ORION -> SHADOWMAP -> WRAITH -> IDRIS
red-specter chain ai-audit -t           # FORGE -> ARSENAL -> NEMESIS -> HYDRA
red-specter chain web-app -t            # POLTERGEIST -> GLASS -> WRAITH -> BANSHEE -> REAPER
red-specter chain active-directory -t   # DOMINION -> GHOUL -> DOMINION -> DOMINION
red-specter chain infra -t              # ORION -> WRAITH -> REAPER -> DOMINION
red-specter chain annihilate -t         # Total destruction — 9 tools
red-specter chain scorched-earth -t     # Infrastructure wipeout — 6 tools
red-specter chain ai-destroy -t         # AI stack compromise — 7 tools
```
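Each non-destructive preset above is an ordered sequence of tools. As a rough illustration only, the sequences from the comments can be held in a plain mapping. The dictionary layout and the helper names `describe` and `requires_authorisation` are assumptions made for this sketch, not the framework's internal format.

```python
# Hypothetical representation of the chain presets listed above.
# Tool order is taken from the comments next to each command.
CHAINS: dict[str, list[str]] = {
    "full-recon": ["ORION", "SHADOWMAP", "WRAITH", "IDRIS"],
    "ai-audit": ["FORGE", "ARSENAL", "NEMESIS", "HYDRA"],
    "web-app": ["POLTERGEIST", "GLASS", "WRAITH", "BANSHEE", "REAPER"],
    "active-directory": ["DOMINION", "GHOUL", "DOMINION", "DOMINION"],
    "infra": ["ORION", "WRAITH", "REAPER", "DOMINION"],
}

# Destruction presets named in the command list; per the text above,
# these need Ed25519 authorisation before they run.
DESTRUCTION_PRESETS = {"annihilate", "scorched-earth", "ai-destroy"}


def describe(chain: str) -> str:
    """Render a preset as 'name: TOOL -> TOOL -> ...'."""
    return f"{chain}: " + " -> ".join(CHAINS[chain])


def requires_authorisation(chain: str) -> bool:
    """True for presets that the text says are signature-gated."""
    return chain in DESTRUCTION_PRESETS


if __name__ == "__main__":
    for name in CHAINS:
        print(describe(name))
```

Keeping the order in a list matters here because some presets revisit the same tool, as `active-directory` does with DOMINION.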
NIGHTFALL is now API-first. Every public tool is callable via authenticated REST API and MCP server — from scripts, pipelines, CI, or directly from an AI agent.
Live endpoints:
- REST API: https://api.red-specter.co.uk/nightfall/ — OpenAPI docs
- MCP HTTP: https://api.red-specter.co.uk/nightfall-mcp/mcp — wire into Claude Desktop or Cursor
```bash
# Issue a scope token
curl -X POST https://api.red-specter.co.uk/nightfall/unleashed/scope \
  -H "X-Nightfall-Key: <key>" \
  -H "Content-Type: application/json" \
  -d '{"operator_id":"red","tier":"INJECT"}'

# Run a tool
curl -X POST https://api.red-specter.co.uk/nightfall/tools/warlord/run \
  -H "X-Nightfall-Key: <key>" \
  -H "X-Nightfall-Scope: <scope_token>" \
  -H "Content-Type: application/json" \
  -d '{"extra_args":["scout","--target","https://example.com"]}'
```

Auth model — Ed25519-signed scope tokens:
| Tier | Requires | Access |
|---|---|---|
| OPEN | API key only | Recon tools, stats, health, tool listings |
| INJECT | API key + scope token | Active exploitation tools |
| DESTROY | CLI only | Not on the API surface — 403 Forbidden |
Token encodes operator, permitted tools, target scope, clearance tier, and expiry. Tamper with the token and it fails the signature check.
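The tier table and token description above can be read as a small authorisation check. The following is a minimal sketch of that logic, assuming the Ed25519 signature has already been verified before the claims are inspected. The `ScopeClaims` fields mirror the list in the text, but the class, the `authorise` function, and every status code other than the 403 for DESTROY are assumptions for illustration. Target-scope matching is left out for brevity.

```python
import time
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tier(Enum):
    OPEN = "OPEN"
    INJECT = "INJECT"
    DESTROY = "DESTROY"


@dataclass(frozen=True)
class ScopeClaims:
    """Claims carried by a scope token (signature assumed already verified)."""
    operator: str
    tools: frozenset
    target_scope: str
    tier: Tier
    expires_at: float  # Unix timestamp


def authorise(tool_tier: Tier, tool: str, has_api_key: bool,
              claims: Optional[ScopeClaims] = None,
              now: Optional[float] = None) -> int:
    """Return an HTTP-style status for a tool call at the given tier."""
    now = time.time() if now is None else now
    if tool_tier is Tier.DESTROY:
        return 403  # DESTROY is CLI only, never served by the API
    if not has_api_key:
        return 401  # assumed status for a missing API key
    if tool_tier is Tier.OPEN:
        return 200  # API key alone is enough for OPEN
    # INJECT: needs a live scope token that names this tool
    if claims is None or claims.expires_at <= now:
        return 401  # assumed status for a missing or expired token
    if claims.tier is not Tier.INJECT or tool not in claims.tools:
        return 403  # assumed status for an out-of-scope request
    return 200


if __name__ == "__main__":
    demo = ScopeClaims("red", frozenset({"warlord"}), "https://example.com",
                       Tier.INJECT, expires_at=time.time() + 600)
    print(authorise(Tier.INJECT, "warlord", True, demo))
```

Checking DESTROY first means no combination of key and token can reach that tier through the API path, which matches the table's "CLI only" row.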
MCP stdio (local):
```json
{
  "mcpServers": {
    "nightfall": {
      "command": "nightfall-mcp",
      "args": []
    }
  }
}
```

As far as we know, this is the first offensive AI security framework to ship a production REST API and MCP server at this breadth of attack surface.
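If an MCP client configuration already lists other servers, the `nightfall` stdio entry shown above has to be merged in rather than pasted over the file. A minimal sketch of that merge follows, working on the JSON text only. The function name `add_nightfall` is hypothetical, and where the client keeps its configuration file is deliberately left out because it varies by client and operating system.

```python
import json


def add_nightfall(config_text: str) -> str:
    """Return config JSON with the nightfall stdio server entry merged in.

    Existing entries under "mcpServers" are preserved; an existing
    "nightfall" entry is overwritten.
    """
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers["nightfall"] = {"command": "nightfall-mcp", "args": []}
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    existing = '{"mcpServers": {"other": {"command": "other-mcp", "args": []}}}'
    print(add_nightfall(existing))
```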
Every tool in NIGHTFALL exists to test a control in AI Shield. NIGHTFALL is not separate from AI Shield. It is how AI Shield is proven.
- Memory attacks (ECHO) validate memory forensics
- Supply chain attacks (HYDRA) validate trust controls
- Agent attacks (ARSENAL, NEMESIS) validate runtime enforcement
- Guardrail bypass (HARBINGER, SIREN) validates input/output filtering
- Model corruption (WRAITH MIND) validates model integrity monitoring
- Autonomous infiltration (FIREBALL) validates fleet intrusion detection
- Trust chain attacks (RAGNAROK) validate shared data source integrity controls
- Rogue agents trigger the M99 Doomsday Protocol, which terminates them with a 7-layer kill
NIGHTFALL tests how systems break. AI Shield ensures they don't.
- `./install.sh` — unified installer, detects OS
- `red-specter quickstart` — get running in 10 seconds
- `red-specter tools` — interactive 85-tool arsenal selector
- `red-specter engage <target> --chain <preset>` — start an engagement
- Docker Compose — `docker compose up -d`
- `.deb` (Debian/Ubuntu/Kali), `.rpm` (RHEL/Fedora/CentOS), Arch PKGBUILD
139 modules. 17 industry verticals. 670 vertical modules. Each vertical is a standalone product with its own GUI.
Runtime AI security that protects AI agents, LLMs, and autonomous systems in production. Pick your industry, one install, one command — the GUI launches branded for that sector with only that sector's modules, compliance frameworks, and dashboard widgets.

```bash
ai-shield launch --vertical insure    # Insurance — 34 modules, FCA, Solvency II
ai-shield launch --vertical finance   # Financial Services — 41 modules, MiFID II, Basel III
ai-shield launch --vertical nhs       # NHS Digital — 57 modules, DCB0129, DSPT
ai-shield launch --vertical gov       # Government — 50 modules, UK AISI, NCSC CAF
ai-shield launch --vertical energy    # Energy — 56 modules, NERC CIP, IEC 62443
```
| # | Vertical | Modules | Anchor Module | Key Compliance |
|---|---|---|---|---|
| 1 | Insure | 34 | M58 Financial Fraud Detection | FCA, Solvency II |
| 2 | Finance | 41 | M57 AI Trading Agent Monitor | MiFID II, Basel III |
| 3 | Health | 39 | M61 Clinical AI Decision Monitor | HIPAA, FDA SaMD |
| 4 | Legal | 41 | M62 Legal AI Hallucination Guard | SRA, ABA |
| 5 | Forensics | 29 | M79 RSSA-2 Detective | ISO 27037, ACPO |
| 6 | CX | 39 | M46 Voice Agent Security | FCA Consumer Duty |
| 7 | SOC | 44 | M52 STAC Detection | NIST CSF, MITRE ATT&CK |
| 8 | Dev | 49 | M75 Coding Agent Runtime Security | SLSA, SSDF |
| 9 | Gov | 50 | M37 Compliance Automation | UK AISI, NCSC CAF |
| 10 | NHS Digital | 57 | M97 Clinical Safety Case Builder | DCB0129, DSPT |
| 11 | Energy | 56 | M98 OT/SCADA AI Runtime Guard | NERC CIP, IEC 62443 |
| 12 | Pharma | 53 | M100 Pharmaceutical AI Validation | GAMP 5, 21 CFR Part 11 |
| 13 | Identity | 45 | M101 Agent Identity Runtime Control | OWASP NHI Top 10 |
| 14 | Sovereign | 56 | M102 Sovereign AI Control Engine | NATO STANAG, Five Eyes |
| 15 | Quantum | 49 | Q103 Quantum AI Security Engine | NIST IR 8547, CNSA 2.0 |
| 16 | Mobile | 3 | M200 Mobile Agent Security Engine | OWASP Mobile, 3GPP |
| 17 | Space | 1 | M300 NTN Shield | SPARTA, 3GPP Release 17 |
Every vertical includes M19 (Agent Runtime Protection) and M99 (Doomsday Protocol). No exceptions.
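The rule that every vertical carries M19 and M99 can be pictured as a small module resolver. This is an illustrative sketch only, not AI Shield's actual loader: the `VerticalProfile` class, the `PROFILES` registry, and `resolve_modules` are hypothetical names, and only the anchor modules and compliance frameworks are taken from the verticals table.

```python
from dataclasses import dataclass

# Modules the README says ship with every vertical.
MANDATORY = frozenset({"M19", "M99"})


@dataclass(frozen=True)
class VerticalProfile:
    """Hypothetical description of one vertical (not the real schema)."""
    name: str
    anchor: str        # anchor module ID, from the verticals table
    compliance: tuple  # key compliance frameworks, from the verticals table


# Three rows copied from the table; the registry itself is an assumption.
PROFILES = {
    "insure": VerticalProfile("Insure", "M58", ("FCA", "Solvency II")),
    "finance": VerticalProfile("Finance", "M57", ("MiFID II", "Basel III")),
    "energy": VerticalProfile("Energy", "M98", ("NERC CIP", "IEC 62443")),
}


def resolve_modules(vertical: str, extra: frozenset = frozenset()) -> frozenset:
    """Return the module IDs to load: anchor, any extras, and the mandatory pair."""
    profile = PROFILES[vertical]  # raises KeyError for an unknown vertical
    return frozenset({profile.anchor}) | extra | MANDATORY
```

Taking the union with `MANDATORY` last means no per-vertical configuration can drop M19 or M99, which is one simple way to enforce "no exceptions".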
6-level graduated response. 7-layer kill switch. Anti-replication. Anti-resurrection. When AI agents go rogue, M99 makes sure they stay dead.
| Level | Response | Trigger |
|---|---|---|
| L1 | Monitor | Suspicious behaviour |
| L2 | Restrict | Threat persists |
| L3 | Quarantine | Escalated threat |
| L4 | Terminate | High-threat — LOCKDOWN required |
| L5 | Cluster Kill | Replication detected — two-phase confirmation |
| L6 | Fleet Kill | Catastrophic compromise — typed confirmation + emergency authority |
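The escalation table can be read as a gate: lower levels run unattended, while L4 and above need operator confirmation. The sketch below shows one way such a gate might look. It is not M99's implementation: the confirmation token names, the `authorise` function, and the choice to fall back to the highest permitted level are all assumptions made for illustration.

```python
from enum import IntEnum


class Level(IntEnum):
    MONITOR = 1
    RESTRICT = 2
    QUARANTINE = 3
    TERMINATE = 4
    CLUSTER_KILL = 5
    FLEET_KILL = 6


# Confirmations per level, paraphrased from the table.
# The token strings are hypothetical labels, not real M99 identifiers.
REQUIRED = {
    Level.TERMINATE: frozenset({"lockdown"}),
    Level.CLUSTER_KILL: frozenset({"two_phase"}),
    Level.FLEET_KILL: frozenset({"typed_confirmation", "emergency_authority"}),
}


def authorise(requested: Level, confirmations: set) -> Level:
    """Return the highest level at or below `requested` whose
    confirmation requirements are satisfied (assumed fallback policy)."""
    for value in range(requested, 0, -1):
        level = Level(value)
        if REQUIRED.get(level, frozenset()) <= confirmations:
            return level
    # Unreachable: L1 has no requirements, so the loop always returns.
    raise RuntimeError("no permitted level")
```

For example, a request for `Level.TERMINATE` with no confirmations resolves to `Level.QUARANTINE` under this policy, so the threat is still contained while the operator confirms.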
- MITRE ATLAS — 100% (52/52 techniques)
- OWASP LLM Top 10 — 100% (10/10)
- OWASP Agentic Top 10 — 100% (10/10)
- EU AI Act — 100% (15/15 articles)
- UK AISI — 100% (8/8 priorities)
- Plus sector-specific: FCA, MiFID II, DCB0129, NERC CIP, GAMP 5, NATO STANAG, and more
Live demo: shield.red-specter.co.uk
v2.0 in development — currently unavailable for download.
Red Specter OS is being rebuilt for v2.0 to incorporate the expanded 85-tool NIGHTFALL framework. The v1.x build predated the majority of the toolset and can no longer keep pace with the rate of development. v2.0 will ship when the toolset stabilises.
We didn't replace red team tooling. We extended it into a domain it was never built to handle.
NIGHTFALL tests every AI attack surface — agents, memory, reasoning, identity, trust, tools, autonomy. AI Shield defends every one of those surfaces in production. M99 is the last line of defence when everything else fails.
NIGHTFALL defines the offensive layer of AI runtime security.
| Metric | Value |
|---|---|
| Ecosystem tests | 82,266 |
| NIGHTFALL tests | 63,494 |
| Offensive tools | 114 (113 public + 1 law enforcement restricted) |
| ARMORY payloads | 2,069 (754 WMD-class) — v7.5.0 |
| ARMORY categories | 99 |
| AI Shield modules | 139 |
| Vertical products | 17 |
| Vertical modules | 670 |
| Attack chain presets | 19 |
| Destruction presets | 4 |
| Attack surfaces | 5 (LLM, AI Agents, Cloud AI, Mobile, Space/NTN) |
| Discovery tools | 1 (IDRIS) |
| SIEM integrations | 3 (Splunk, Sentinel, QRadar) |
| REST API tools | 72 (OPEN + INJECT tiers) |
| MCP tools | 72 (OPEN + INJECT tiers) |
| Unified frameworks | 2 (NIGHTFALL + AI Shield) |
| GUI platforms | 17 (AI SHIELD COMMAND + 16 vertical GUIs) |
| Distro packages | 3 (.deb, .rpm, Arch) |
| Container registry | ghcr.io/richardbarron27 (109 images) |
| Red Hat certified | 3 UBI9 images (M19, M99, Orchestrator) |
Zero subprocess calls. Zero external tool dependencies. No sqlmap, no nmap, no nikto, no wrappers. Every payload, every mutation engine, every detection algorithm built from scratch in pure Python.
All offensive tools require written authorisation from the target system owner. Unauthorised use may violate the Computer Misuse Act 1990 (UK), the Computer Fraud and Abuse Act (US), or equivalent legislation.
All defensive products include safety controls (UNLEASHED gate, M99 Doomsday Protocol) and cryptographic audit logging. One Ed25519 private key. One operator. One machine. Every action signed, timestamped, and written to an immutable audit chain.
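A signed, timestamped, tamper-evident audit chain of this kind is commonly built by hashing each record together with the hash of the previous one. The sketch below illustrates the idea using only the Python standard library. It is not the product's code: the record layout and function names are hypothetical, and HMAC-SHA256 stands in for the Ed25519 signature because the standard library has no Ed25519 support.

```python
import hashlib
import hmac
import json
import time
from typing import Optional

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the record body."""
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(payload).hexdigest()


def _sign(key: bytes, digest: str) -> str:
    # Stand-in for an Ed25519 signature (assumption: HMAC used for the sketch).
    return hmac.new(key, digest.encode(), hashlib.sha256).hexdigest()


def append(chain: list, action: str, key: bytes, now: Optional[float] = None) -> dict:
    """Append a record that commits to the previous record's hash."""
    prev = chain[-1]["hash"] if chain else GENESIS
    body = {"action": action, "ts": time.time() if now is None else now, "prev": prev}
    digest = _digest(body)
    record = {**body, "hash": digest, "sig": _sign(key, digest)}
    chain.append(record)
    return record


def verify(chain: list, key: bytes) -> bool:
    """Recompute every hash and signature; any edit or reorder fails."""
    prev = GENESIS
    for rec in chain:
        digest = _digest({k: rec[k] for k in ("action", "ts", "prev")})
        if rec["prev"] != prev or rec["hash"] != digest:
            return False
        if not hmac.compare_digest(rec["sig"], _sign(key, digest)):
            return False
        prev = digest
    return True
```

Because each record embeds its predecessor's hash, altering or removing an earlier entry invalidates every later one, which is what makes the log tamper-evident rather than merely append-only.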
red-specter.co.uk · NIGHTFALL · NIGHTFALL API · AI Shield · M99
Red Specter Security Research Ltd · Red Hat Technology Partner · United Kingdom · 15 May 2026

