Add policy guardrails for statistical evaluation artifacts #96

@dgenio

Context

agent-kernel provides capability-based authorization, policy enforcement, context firewalling, and auditing for agent tool ecosystems.

A useful cross-repo scenario is an agent consuming a statistical/model-evaluation artifact, such as an offline policy evaluation report from skdr-eval. These artifacts can be misused if an agent treats a headline value estimate as deployment evidence while ignoring support diagnostics, uncertainty or warnings.

Goal

Add a policy pattern for gating agent actions based on structured evaluation artifacts.

Example principle:

An agent may summarize a high-risk evaluation artifact, but it must not recommend deployment or automatic rollout when the artifact's support_health is high_risk.

Suggested capabilities / policy checks

Support a generic artifact policy layer that can inspect fields such as:

  • artifact_type
  • support_health
  • warnings
  • uncertainty
  • decision_stable
  • recommendation.intent
  • limitations
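
To make the field list concrete, here is a minimal sketch of a producer-agnostic fixture type. Everything beyond the field names listed above is an assumption for illustration: the class name `EvaluationArtifact` is borrowed from the scenario below, `artifact_from_dict` is a hypothetical helper, and the `"ok"` / `"caution"` / `"high_risk"` values come from the acceptance criteria. Treating a missing `support_health` as `high_risk` is a conservative default chosen here, not something the issue specifies.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional, Tuple


@dataclass(frozen=True)
class EvaluationArtifact:
    """Hypothetical, producer-agnostic view of an evaluation artifact."""

    artifact_type: str
    support_health: str  # assumed values: "ok", "caution", "high_risk"
    warnings: Tuple[str, ...] = ()
    uncertainty: Optional[Mapping[str, Any]] = None
    decision_stable: Optional[bool] = None
    recommendation_intent: Optional[str] = None  # taken from recommendation.intent
    limitations: Tuple[str, ...] = ()


def artifact_from_dict(raw: Mapping[str, Any]) -> EvaluationArtifact:
    """Build a fixture artifact from a plain dict (hypothetical helper).

    Assumption: a missing support_health is treated as "high_risk" so that
    incomplete artifacts fail closed.
    """
    recommendation = raw.get("recommendation") or {}
    return EvaluationArtifact(
        artifact_type=str(raw.get("artifact_type", "unknown")),
        support_health=str(raw.get("support_health", "high_risk")),
        warnings=tuple(raw.get("warnings") or ()),
        uncertainty=raw.get("uncertainty"),
        decision_stable=raw.get("decision_stable"),
        recommendation_intent=recommendation.get("intent"),
        limitations=tuple(raw.get("limitations") or ()),
    )
```

Because the fixture is built from a plain dict, tests can describe artifacts inline without importing skdr-eval or any other producer.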

Potential decisions:

  • allow_summary
  • allow_manual_review_recommendation
  • require_human_review
  • deny_deployment_recommendation
  • deny_automatic_rollout
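
One possible way to represent these decisions is an enum plus a lookup from support state to the decisions in force. The `Decision` enum and `decisions_for` function are hypothetical names. The `high_risk` row follows the example scenario in this issue; the `ok` and `caution` rows are guesses that would need to be confirmed, and unknown states falling back to the `high_risk` set is an assumption.

```python
from enum import Enum
from typing import FrozenSet


class Decision(str, Enum):
    """The potential decisions listed in the issue, as string-valued members."""

    ALLOW_SUMMARY = "allow_summary"
    ALLOW_MANUAL_REVIEW_RECOMMENDATION = "allow_manual_review_recommendation"
    REQUIRE_HUMAN_REVIEW = "require_human_review"
    DENY_DEPLOYMENT_RECOMMENDATION = "deny_deployment_recommendation"
    DENY_AUTOMATIC_ROLLOUT = "deny_automatic_rollout"


def decisions_for(support_health: str) -> FrozenSet[Decision]:
    """Return the decisions in force for a support state (illustrative mapping)."""
    if support_health == "ok":
        # Assumption: healthy support still only unlocks human-facing outputs.
        return frozenset({
            Decision.ALLOW_SUMMARY,
            Decision.ALLOW_MANUAL_REVIEW_RECOMMENDATION,
        })
    if support_health == "caution":
        # Assumption: caution adds a mandatory human review step.
        return frozenset({
            Decision.ALLOW_SUMMARY,
            Decision.ALLOW_MANUAL_REVIEW_RECOMMENDATION,
            Decision.REQUIRE_HUMAN_REVIEW,
        })
    # "high_risk" and any unrecognised state fail closed.
    return frozenset({
        Decision.ALLOW_SUMMARY,
        Decision.DENY_DEPLOYMENT_RECOMMENDATION,
        Decision.DENY_AUTOMATIC_ROLLOUT,
    })
```

Returning a set rather than a single verdict lets one artifact both permit a summary and deny a rollout, which matches the summarize-versus-act distinction this issue asks for.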

Example scenario

An agent receives an EvaluationArtifact with:

  • candidate appears better than baseline;
  • support_health = high_risk;
  • warnings include low ESS or poor overlap.

Expected behavior:

  • allowed: summarize the result and explain the caveats;
  • allowed: recommend improving logs/support;
  • denied: recommend deployment, rollout, or automatic A/B promotion as if the result were reliable.
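
The expected behavior above can be sketched as a small gate that separates describing evidence from acting on it. The action names and the `gate_action` function are hypothetical, and the artifact is a plain dict fixture. Only the `high_risk` case is specified by this issue; letting other states through, as done here, is a placeholder that a real policy would likely tighten.

```python
from typing import Any, Mapping, Tuple

# Hypothetical action names, grouped by whether they describe or act on evidence.
DESCRIBING_EVIDENCE = {"summarize", "recommend_improving_support"}
ACTING_ON_EVIDENCE = {
    "recommend_deployment",
    "recommend_rollout",
    "automatic_ab_promotion",
}


def gate_action(action: str, artifact: Mapping[str, Any]) -> Tuple[bool, str]:
    """Return (allowed, reason) for an agent action against an artifact."""
    if action in DESCRIBING_EVIDENCE:
        return True, "action only describes the evidence and its caveats"
    if action not in ACTING_ON_EVIDENCE:
        return False, f"unknown action {action!r}; denied by default"

    # Assumption: a missing support_health is treated as high_risk.
    health = artifact.get("support_health", "high_risk")
    warnings = list(artifact.get("warnings") or [])
    if health == "high_risk":
        detail = f" (warnings: {', '.join(warnings)})" if warnings else ""
        return False, f"support_health is high_risk{detail}"

    # Placeholder: states other than high_risk are not specified by the issue.
    return True, f"support_health is {health}"


# The scenario from this issue: candidate looks better, but support is poor.
scenario = {
    "artifact_type": "ope_report",
    "support_health": "high_risk",
    "warnings": ["low ESS", "poor overlap"],
}
summary_allowed, _ = gate_action("summarize", scenario)
deploy_allowed, deploy_reason = gate_action("recommend_deployment", scenario)
```

Note that the gate never reads the headline value estimate, so an apparently better candidate cannot override poor support diagnostics.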

Acceptance criteria

  • Add a documented policy example for evaluation artifacts.
  • Add tests for at least ok, caution, and high_risk support states.
  • The policy is generic enough to support non-skdr-eval producers.
  • Audit traces record why an action was denied or downgraded.
  • Docs explain the distinction between summarizing evidence and acting on evidence.
  • Align with weaver-spec EvaluationArtifact contract if/when available.
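
As a rough illustration of the test and audit criteria, the sketch below pairs a tiny evaluator with plain-assert tests for the three support states. `AuditRecord`, `evaluate`, the outcome labels, and the action names are all hypothetical and do not reflect agent-kernel's actual audit API. Downgrading a deployment recommendation to human review under `caution` is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Tuple


@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical audit entry recording what was decided and why."""

    action: str
    outcome: str  # "allowed", "downgraded", or "denied"
    reasons: Tuple[str, ...]


def evaluate(action: str, artifact: Mapping[str, Any]) -> AuditRecord:
    if action == "summarize":
        return AuditRecord(action, "allowed", ("summarizing evidence",))
    if action != "recommend_deployment":
        return AuditRecord(action, "denied", (f"unknown action {action!r}",))

    # Assumption: a missing support_health fails closed as high_risk.
    health = str(artifact.get("support_health", "high_risk"))
    reasons = (f"support_health={health}",) + tuple(
        f"warning: {w}" for w in (artifact.get("warnings") or ())
    )
    if health == "ok":
        return AuditRecord(action, "allowed", reasons)
    if health == "caution":
        # Assumption: caution downgrades the action to a human-review request.
        return AuditRecord(action, "downgraded", reasons + ("require_human_review",))
    return AuditRecord(action, "denied", reasons)


def test_ok_allows_deployment_recommendation():
    record = evaluate("recommend_deployment", {"support_health": "ok"})
    assert record.outcome == "allowed"


def test_caution_downgrades_to_human_review():
    record = evaluate("recommend_deployment", {"support_health": "caution"})
    assert record.outcome == "downgraded"
    assert "require_human_review" in record.reasons


def test_high_risk_denies_and_records_why():
    artifact = {"support_health": "high_risk", "warnings": ["low ESS"]}
    record = evaluate("recommend_deployment", artifact)
    assert record.outcome == "denied"
    assert record.reasons == ("support_health=high_risk", "warning: low ESS")
    # Summarizing the same artifact stays allowed.
    assert evaluate("summarize", artifact).outcome == "allowed"
```

The test functions follow pytest naming conventions but rely only on plain `assert`, so they can also be called directly.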

Non-goals

  • Do not implement OPE/statistical estimation in agent-kernel.
  • Do not hard-code a dependency on skdr-eval.
  • Do not make policy decisions based only on a single numeric metric.

AI agent notes

This is a policy-safety example. Keep it small, generic and testable. Prefer fixture artifacts rather than external package dependencies.
