Security: LegionForge/legionforge.github.io

---
layout: default
title: Security
description: "LegionForge's security posture, philosophy, and differentiators. What we defend against, what we don't, and how to disclose vulnerabilities."
permalink: /security/
---
SECURITY POSTURE

Security is the foundation

Not a feature. Not a layer. The constraint every other decision is shaped around.

Philosophy

The thesis

The LLM is not trustworthy. Everything that crosses a trust boundary is validated by deterministic code before the LLM ever sees it, and everything the LLM produces is validated again before it has any effect on the outside world.

The reflex when designing a security layer in 2026 is to put an LLM in it — "have the model judge whether this call is safe." That's wrong. LLM-based checks can be prompt-injected by the very payloads they're inspecting, they're slow, and they're expensive. LegionForge sticks to regex, hash compare, signature verify, and capability lookups. They're crude. They're also predictable and auditable.
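As a rough illustration of the hash-compare style of check named above, the sketch below digests a tool's source and compares it with a digest recorded at registration. The function names `code_digest` and `verify_tool` are hypothetical, chosen for this example, and are not LegionForge's API.

```python
import hashlib
import hmac


def code_digest(source: bytes) -> str:
    """Return the SHA-256 hex digest of a tool's source bytes."""
    return hashlib.sha256(source).hexdigest()


def verify_tool(source: bytes, registered_digest: str) -> bool:
    """Deterministic check: does the live code match the registered digest?"""
    # compare_digest avoids leaking where the two digests first differ.
    return hmac.compare_digest(code_digest(source), registered_digest)


original = b"def add(a, b):\n    return a + b\n"
registered = code_digest(original)

unchanged_ok = verify_tool(original, registered)
tampered_ok = verify_tool(b"def add(a, b):\n    return a - b\n", registered)
```

The same input always yields the same verdict, which is what makes a check like this auditable in a way a model-based judgment is not.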

Five non-negotiables

Principles that shape every component

Fail-safe tiering

Halt → sandbox/retry → degrade. Never silently succeed. Errors propagate with intent.
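One possible reading of the halt → sandbox/retry → degrade ordering, sketched under stated assumptions: security violations halt immediately, transient errors get a bounded number of retries, and everything else returns an explicitly labelled degraded result. The exception classes, `classify`, and `run_step` are hypothetical names for this example only.

```python
from enum import Enum


class Tier(Enum):
    HALT = "halt"
    RETRY = "sandbox/retry"
    DEGRADE = "degrade"


class SecurityViolation(Exception):
    """Hypothetical error type that must stop the task."""


class TransientError(Exception):
    """Hypothetical error type that is worth retrying."""


def classify(exc: Exception) -> Tier:
    if isinstance(exc, SecurityViolation):
        return Tier.HALT
    if isinstance(exc, TransientError):
        return Tier.RETRY
    return Tier.DEGRADE


def run_step(step, max_retries: int = 2):
    """Run step(); the returned status never says "ok" after a failure."""
    attempts = 0
    while True:
        try:
            return ("ok", step())
        except Exception as exc:
            tier = classify(exc)
            if tier is Tier.HALT:
                raise  # propagate: nothing downstream should continue
            if tier is Tier.RETRY and attempts < max_retries:
                attempts += 1
                continue
            # Retries exhausted or non-retryable: report degradation explicitly.
            return ("degraded", repr(exc))
```

The caller always receives either a re-raised exception or a tuple whose first element states what happened, so a failure cannot be mistaken for success.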

Human gates on mutations

Destructive actions cross a human-in-the-loop boundary by default.
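A minimal sketch of a default-deny approval gate, assuming a fixed set of tool names counts as destructive. `DESTRUCTIVE_TOOLS`, `gated_call`, and the `approver` callback are hypothetical and stand in for whatever mechanism actually collects the human decision.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical set of tools treated as destructive.
DESTRUCTIVE_TOOLS = {"delete_file", "drop_table"}

Approver = Callable[[str, Dict[str, Any]], bool]


class ApprovalRequired(PermissionError):
    """Raised when a destructive call has no human approval."""


def gated_call(
    name: str,
    fn: Callable[..., Any],
    args: Dict[str, Any],
    approver: Optional[Approver] = None,
) -> Any:
    if name in DESTRUCTIVE_TOOLS:
        # Default is deny: a missing approver or a "no" both block the call.
        if approver is None or not approver(name, args):
            raise ApprovalRequired(f"{name} needs human approval")
    return fn(**args)
```

Non-destructive tools pass straight through, while a destructive tool runs only when the approver returns `True` for that specific call and its arguments.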

Replace AI with determinism

The LLM is the last resort, not the first. Rules, tables, and pattern matchers run ahead.

Validate at trust boundaries

Sanitize once, at the edge, not at every node. Internal code trusts internal data.
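To make the edge-validation idea concrete, here is a small sketch in which untrusted text is checked once and wrapped in a `Validated` type that internal functions accept without re-checking. The two regex patterns are illustrative placeholders, not the project's actual pattern list, and `validate_at_edge` is a hypothetical name.

```python
import logging
import re
from dataclasses import dataclass

log = logging.getLogger("boundary")

# Placeholder patterns for illustration; the real pattern list is not shown here.
TIER1_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.IGNORECASE),
    re.compile(r"reveal\s+(your\s+)?system\s+prompt", re.IGNORECASE),
]


class InjectionDetected(ValueError):
    """Raised when input matches a high-confidence injection pattern."""


@dataclass(frozen=True)
class Validated:
    """Marker type: text that has already passed the edge check."""
    text: str


def validate_at_edge(raw: str) -> Validated:
    for pattern in TIER1_PATTERNS:
        if pattern.search(raw):
            log.warning("INJECTION_DETECTED pattern=%s", pattern.pattern)
            raise InjectionDetected("input rejected at trust boundary")
    return Validated(raw)


def summarize(doc: Validated) -> str:
    # Internal code takes Validated, so it does not re-check the text.
    return doc.text[:40]
```

Carrying the validation result in the type is one way to keep internal code from either skipping the check or repeating it at every node.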

Privilege tied to tasks

Capability is scoped to the active task and expires when the task ends. No persistent agent privilege.
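A minimal sketch of task-scoped privilege, assuming capabilities are plain strings fixed when the task is created. `TaskScope` and the capability names are hypothetical. The scope is a context manager, so every check fails once the task block exits.

```python
class TaskScope:
    """Capabilities granted to one task; fixed at creation, void at exit."""

    def __init__(self, task_id: str, capabilities):
        self.task_id = task_id
        # frozenset: the scope cannot be widened after creation.
        self._capabilities = frozenset(capabilities)
        self._active = True

    def allows(self, capability: str) -> bool:
        return self._active and capability in self._capabilities

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # The task is over, however it ended: revoke everything.
        self._active = False
        return False  # do not swallow exceptions


with TaskScope("task-1", {"fs.read"}) as scope:
    read_during = scope.allows("fs.read")
    write_during = scope.allows("fs.write")

read_after = scope.allows("fs.read")
```

Because revocation happens in `__exit__`, it also runs when the task ends with an exception, so no privilege outlives the task.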

Differentiators

How LegionForge differs from the others

<table>
  <thead>
    <tr>
      <th></th>
      <th>LegionForge</th>
      <th>Cloud agent platforms<br><small style="color: var(--dim); font-weight: 400;">(OpenAI Operator, Anthropic Computer Use, Google Mariner)</small></th>
      <th>OSS agent frameworks<br><small style="color: var(--dim); font-weight: 400;">(LangChain, AutoGen, CrewAI)</small></th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Where it runs</strong></td>
      <td>Your hardware</td>
      <td>Their hardware</td>
      <td>Your hardware</td>
    </tr>
    <tr>
      <td><strong>Where your data sits</strong></td>
      <td>Your PostgreSQL</td>
      <td>Their database (opaque)</td>
      <td>Wherever you wire it</td>
    </tr>
    <tr>
      <td><strong>Tool-call security</strong></td>
      <td>7-check deterministic pipeline on every call (enforced)</td>
      <td>Their internal checks (you don't see them)</td>
      <td>Whatever you wire (often nothing)</td>
    </tr>
    <tr>
      <td><strong>Prompt-injection detection</strong></td>
      <td>29 patterns, two tiers, at trust boundary</td>
      <td>Vendor-defined</td>
      <td>Not bundled</td>
    </tr>
    <tr>
      <td><strong>Audit trail</strong></td>
      <td>SHA-256 hash-chained <code>audit_log</code></td>
      <td>Their logs (you don't get them)</td>
      <td>Not bundled</td>
    </tr>
    <tr>
      <td><strong>HITL on destructive actions</strong></td>
      <td>Enforced via approval gate</td>
      <td>Sometimes</td>
      <td>You wire it</td>
    </tr>
    <tr>
      <td><strong>Tool signing</strong></td>
      <td>Ed25519 on every registered tool</td>
      <td>Internal</td>
      <td>Not bundled</td>
    </tr>
    <tr>
      <td><strong>License</strong></td>
      <td>AGPL-3.0 (commercial available) · Guardian MIT</td>
      <td>Proprietary</td>
      <td>MIT / Apache 2.0</td>
    </tr>
  </tbody>
</table>
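The audit-trail row mentions a SHA-256 hash-chained `audit_log`. The sketch below shows the general technique under simple assumptions: each entry stores the previous entry's hash, so editing any earlier row invalidates every hash after it. The row layout and function names are hypothetical and do not describe the actual `audit_log` schema.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for an empty chain


def entry_hash(prev_hash: str, event: dict) -> str:
    # Canonical JSON so the same event always hashes the same way.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(chain: list, event: dict) -> None:
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev_hash": prev, "hash": entry_hash(prev, event)})


def verify(chain: list) -> bool:
    prev = GENESIS
    for row in chain:
        if row["prev_hash"] != prev or row["hash"] != entry_hash(prev, row["event"]):
            return False
        prev = row["hash"]
    return True


chain = []
append(chain, {"action": "tool_call", "tool": "read_file"})
append(chain, {"action": "approval", "granted": True})
intact = verify(chain)

chain[0]["event"]["tool"] = "delete_file"  # simulate tampering with an old row
still_intact = verify(chain)
```

A chain like this detects edits but does not prevent them; an attacker who can rewrite every row can also recompute every hash, so the latest hash would need to be anchored somewhere the attacker cannot reach.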

What we defend against

Threat model

  • Prompt injection (Tier 1): High-confidence patterns. Match → reject immediately, log INJECTION_DETECTED.
  • Tool poisoning: Live tool code hashed at invocation and compared to registered hash. Catches dependency-replacement attacks.
  • Capability creep: Tool's required capability must be in the task's scope. Scope is set at task creation and never widens.
  • Destructive tool arguments: Regex pipeline catches rm -rf /, DROP TABLE, fork bombs, pipe-to-shell, metadata endpoints.
  • Runaway behavior: Three independent loop-protection layers: step counter, action-history hash, token budget.
  • Tool result injection: Tool output containing injection payloads aimed back at the model (e.g., a fetched web page).
  • PII leakage in logs: Output sanitization strips PII before logging to LangSmith or returning to the user.
  • Unauthorized destructive ops: HITL approval gate. Destructive tool calls cross a human-in-the-loop boundary.
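The three loop-protection layers listed under runaway behavior can be sketched as independent limits checked on every action. This is an illustrative reading, not the project's code: `LoopGuard`, its thresholds, and the choice to hash the tool name plus arguments are all assumptions.

```python
import hashlib
import json


class LoopGuardTripped(RuntimeError):
    """Raised when any one of the three limits is exceeded."""


class LoopGuard:
    def __init__(self, max_steps: int = 20, max_repeats: int = 3, token_budget: int = 10_000):
        self.max_steps = max_steps
        self.max_repeats = max_repeats
        self.token_budget = token_budget
        self.steps = 0
        self.tokens = 0
        self.seen = {}  # action hash -> number of times seen

    def check(self, tool: str, args: dict, tokens_used: int) -> None:
        # Layer 1: step counter.
        self.steps += 1
        if self.steps > self.max_steps:
            raise LoopGuardTripped("step limit")

        # Layer 2: action-history hash catches the same call repeating.
        key = hashlib.sha256(
            json.dumps([tool, args], sort_keys=True).encode("utf-8")
        ).hexdigest()
        self.seen[key] = self.seen.get(key, 0) + 1
        if self.seen[key] > self.max_repeats:
            raise LoopGuardTripped("repeated action")

        # Layer 3: token budget.
        self.tokens += tokens_used
        if self.tokens > self.token_budget:
            raise LoopGuardTripped("token budget")
```

Each layer catches a failure the others miss: a loop of distinct actions trips the step counter, a cheap repeated call trips the history hash, and a few very large calls trip the token budget.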

What we don't claim to catch

Honest limits

  • A malicious human operator with gateway credentials. Bearer auth gates entry; access control inside the gateway assumes the operator is authorized.
  • Side-channel attacks on local LLM weights. Model integrity is checked at load, but not at every inference.
  • Physical access to the machine.
  • Threats specific to platforms we don't run on (we run local-first; cloud-specific threats are outside our threat model).

Listing the limits matters as much as listing the wins. A threat model that claims to defend against everything is a threat model nobody has actually walked.

Reporting vulnerabilities

Coordinated disclosure

Do not open a public issue for security vulnerabilities. Email [email protected].

We respond within 5 business days. After a fix is in place and users have had a chance to update, we publish a security advisory in the affected repo, including the CVE identifier if one was assigned.

No security advisories have been published to date.