feat: Phase 1+2 — 6 umbrella skills, curated anchors, evals, gap fixes by george-bafaloukas-forgerock · Pull Request #1 · pingidentity/agent-plugins

george-bafaloukas-forgerock · 2026-06-03T13:25:46Z

Summary

- 6 umbrella skills built and validated: ping-quickstart, ping-foundation, ping-orchestration, ping-universal-services, ping-app-integration, ping-identity-for-ai
- 3-tier progressive disclosure architecture: metadata → SKILL.md (≤120 lines) → references/curated/ → references/generated/ → Docs MCP
- 58 curated anchors across all 6 skills and 4 platform branches (PingOne MT, PingOne ST, Ping Software, cross-platform)
- 14 generated reference manifests built by scripts/build_reference_manifests.py; CI workflow in .github/workflows/build-reference-manifests.yml
- Strategy doc § 9 use-case evaluation: all 9 Getting Started 0–30 prompts evaluated; 4 gaps closed:
  - passkeys-and-passwordless.md: friction tiers, FIDO2/WebAuthn patterns across all 3 platforms
  - pingone-mt/themes-and-branding.md: branding, custom domain, email/SMS templates, DaVinci UI Studio
  - end-to-end-validation.md: cross-cutting validation matrix, test-user patterns, reporting format
  - server-side-integration-basics.md: backend OIDC, M2M, token exchange, CIBA, retry/429 resilience
- Layer 1 eval (live, Bedrock claude-sonnet-4-6): all 6 skills PASS (≥90% trigger / 100% non-trigger / 100% ambiguous)
- Eval prompt sets: 9 strategy use cases added across 5 skill YAMLs (T-50–T-55); all validated against schema

Test plan

- scripts/build_reference_manifests.py --root . runs clean; 14 manifests regenerated
- python3 -m evals.harness.validate_prompts: all prompt YAML files OK
- python3 -m evals.harness.run_eval --adapter mock --layer 1: 6/6 PASS
- python3 -m evals.harness.run_eval --adapter mock --layer 2: 6/6 PASS
- python3 -m evals.harness.run_eval --adapter claude --layer 1: 6/6 PASS (live Bedrock)
- All SKILL.md files ≤120 lines
- All new curated anchors have complete frontmatter
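
The SKILL.md length limit in the test plan could be checked with a small script. The following is a minimal sketch, not the repository's actual check: the function name `oversized_skill_files` is hypothetical, and it assumes every skill file is literally named SKILL.md somewhere under the given root.

```python
from pathlib import Path

MAX_LINES = 120  # limit stated in the PR summary


def oversized_skill_files(root: str, max_lines: int = MAX_LINES) -> list:
    """Return (path, line_count) for each SKILL.md under root that exceeds max_lines."""
    offenders = []
    for path in sorted(Path(root).rglob("SKILL.md")):
        count = len(path.read_text(encoding="utf-8").splitlines())
        if count > max_lines:
            offenders.append((str(path), count))
    return offenders


if __name__ == "__main__":
    # Hypothetical usage: exit non-zero when any file is over the limit.
    bad = oversized_skill_files(".")
    for path, count in bad:
        print(f"{path}: {count} lines (limit {MAX_LINES})")
    raise SystemExit(1 if bad else 0)
```

Run from the repository root, this prints nothing and exits 0 when every SKILL.md is within the limit.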

🤖 Generated with Claude Code

Add commands/, rules/, evals/ top-level dirs and the multi-IDE manifests required by strategy doc § 5: .claude-plugin/plugin.json, .cursor-plugin/marketplace.json + plugin.json, plugins/ping-identity/.claude-plugin/plugin.json, and a .well-known/agent-skills/index.json stub for the discovery RFC. Refs: PLAN.md Phase 0 step 1; strategy doc § 5

Move ping-quickstart's flat references into references/curated/, add the runtime/ tier with a docs-mcp-routing.md stub to all three live skills (ping-quickstart, ping-foundation, ping-orchestration), and update ping-quickstart's SKILL.md paths to match. Refs: PLAN.md Phase 0 step 3; strategy doc § 6

Add docs-mcp-routing.md stub to all three live skills' runtime/ tier. Update ping-quickstart SKILL.md reference paths to curated/ subdirectory. Refs: PLAN.md Phase 0 step 3; strategy doc § 6

…entity-for-ai
Three Phase 0 scaffolds (≤120 lines each, agentskills.io compliant) matching strategy doc § 4 'In Practice'. Each ships SKILL.md, a ping-marketplace.json metadata file, and the canonical references/{curated,generated,runtime}/ tier set per strategy § 6. Bodies are stubs marked status=Phase 0 scaffold; Phase 1 fills in routing tables and curated anchors per strategy § 7. Refs: PLAN.md Phase 0 step 4; strategy doc § 4 and § 7

Canonical decision rule for sandbox (docs) vs production (helix) runtime plus the strategy-doc § 0 tier-discipline rule (curated → generated → docs MCP). Referenced from every skill's references/runtime/docs-mcp-routing.md and scored by Layer 3 of the eval. Refs: PLAN.md Phase 0 step 5; strategy doc § 0 and § 4
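
The tier-discipline order (curated → generated → docs MCP) can be pictured as a simple fallback lookup. This is an illustrative sketch only: `resolve_reference`, the dictionary inputs, and the topic keys are all hypothetical, and the real rule is written as guidance for an agent rather than as executable code.

```python
from typing import Dict, Optional, Tuple


def resolve_reference(
    topic: str,
    curated: Dict[str, str],
    generated: Dict[str, str],
) -> Tuple[str, Optional[str]]:
    """Pick the first tier that can answer a topic, in tier-discipline order.

    Returns (tier_name, path). The path is None when the lookup falls
    through to the Docs MCP tier, which is queried at runtime.
    """
    if topic in curated:          # tier 1: hand-written anchors
        return "curated", curated[topic]
    if topic in generated:        # tier 2: generated shortlists
        return "generated", generated[topic]
    return "docs-mcp", None       # tier 3: live documentation lookup


# Hypothetical example data; the paths are illustrative, not taken from the repository.
curated = {"passkeys": "references/curated/passkeys-and-passwordless.md"}
generated = {"branding": "references/generated/pingone-mt/top-25.json"}
print(resolve_reference("passkeys", curated, generated))
print(resolve_reference("something-else", curated, generated))
```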

….mdc
Copy shared/templates/AUTHORING-RULES.md and shared/taxonomies/routing-rules.md into rules/ so authors have a single source of truth at the new repo root. Add rules/ping-identity.mdc (Cursor-style rule) for parity with Cloudflare's rules/workers.mdc. Refs: PLAN.md Phase 0 step 5; Cloudflare repo structure

Update plugin description and keywords to reflect the strategy-doc § 4 umbrella set. Refs: PLAN.md Phase 0 step 1

Port shared/evals/routing-eval.md to evals/scorecards/ with a Layer 1 harness callout prepended. Add Layer 2 (anchor-selection-eval.md — precision/recall against expected_anchors) and Layer 3 (plan-quality-eval.md — LLM-as-judge across correctness, completeness, concreteness, tier discipline). Refs: PLAN.md § Evaluation; strategy doc § 0
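
Layer 2 scores anchor selection by precision and recall against `expected_anchors`. A minimal sketch of that calculation follows. The function name and the handling of empty sets are assumptions, not the harness's actual behavior.

```python
from typing import Iterable, Tuple


def precision_recall(selected: Iterable[str], expected: Iterable[str]) -> Tuple[float, float]:
    """Compute precision and recall of selected anchors against expected anchors.

    Assumed conventions: precision is 0.0 when nothing was selected, and
    recall is 1.0 when nothing was expected.
    """
    selected_set = set(selected)
    expected_set = set(expected)
    hits = len(selected_set & expected_set)
    precision = hits / len(selected_set) if selected_set else 0.0
    recall = hits / len(expected_set) if expected_set else 1.0
    return precision, recall


# One of two selected anchors is expected, and one of two expected anchors was found.
print(precision_recall(
    ["davinci-overview.md", "sign-on-policies.md"],
    ["davinci-overview.md", "davinci-flow-patterns.md"],
))  # (0.5, 0.5)
```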

JSON Schema for evals/prompts/*.yaml + Python validator with 5 pytest cases covering: minimal valid set, skill/filename match, required top-level fields, required item fields, tier enum.
Run: python3 -m evals.harness.validate_prompts
Refs: PLAN.md § Evaluation Layer 1
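
The checks listed above (skill/filename match, required fields, tier enum) can be sketched in plain Python on already-parsed data. The field names `skill`, `prompts`, `id`, `prompt`, and `tier`, and the three tier values, are assumptions for illustration. The real schema lives in evals/schemas/prompt-set.schema.json and may differ.

```python
from pathlib import Path
from typing import List

# Assumed field names and enum values; not taken from the real schema.
REQUIRED_TOP = ("skill", "prompts")
REQUIRED_ITEM = ("id", "prompt", "tier")
TIERS = {"trigger", "non-trigger", "ambiguous"}


def validate_prompt_set(data: dict, filename: str) -> List[str]:
    """Return a list of human-readable errors; an empty list means valid."""
    errors = []
    for field in REQUIRED_TOP:
        if field not in data:
            errors.append(f"missing top-level field: {field}")
    # The skill named inside the file should match the file's base name.
    stem = Path(filename).stem
    if "skill" in data and data["skill"] != stem:
        errors.append(f"skill {data['skill']!r} does not match filename {stem!r}")
    for index, item in enumerate(data.get("prompts", [])):
        for field in REQUIRED_ITEM:
            if field not in item:
                errors.append(f"prompts[{index}] missing field: {field}")
        if "tier" in item and item["tier"] not in TIERS:
            errors.append(f"prompts[{index}] has unknown tier: {item['tier']!r}")
    return errors
```

Returning a list of errors rather than raising on the first one lets a single run report every problem in a file.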

Live skills (ping-quickstart, ping-foundation, ping-orchestration) ship at v1 minimums (10/5/3) since their bodies and curated content already exist. The three new skills ship Phase-0 minimums (3/2/1) and expand in Phase 1 once curated anchors land per strategy doc § 7. All prompt sets validate against evals/schemas/prompt-set.schema.json. Refs: PLAN.md § Evaluation, Phase 0 step 4

base.py defines the RunResult dataclass and LLMAdapter Protocol every driver must satisfy. mock.py provides a rule-based adapter so harness unit tests can assert scoring without hitting a real LLM API. Refs: PLAN.md § Evaluation Layer 4 (cross-LLM)
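
A rough picture of the adapter contract described above, using a dataclass, a `typing.Protocol`, and a rule-based mock. The field names on `RunResult`, the `run` method signature, and the `KeywordMockAdapter` class are assumptions. Only the names `RunResult` and `LLMAdapter` come from the commit message.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Protocol


@dataclass
class RunResult:
    """Outcome of routing one prompt (fields are illustrative assumptions)."""
    prompt_id: str
    selected_skill: Optional[str]
    selected_anchors: List[str] = field(default_factory=list)


class LLMAdapter(Protocol):
    """Structural interface: any object with a matching run() satisfies it."""
    def run(self, prompt_id: str, prompt: str) -> RunResult: ...


class KeywordMockAdapter:
    """Rule-based stand-in for a real LLM: the first matching keyword wins."""

    def __init__(self, rules: Dict[str, str]) -> None:
        self.rules = rules  # keyword -> skill name

    def run(self, prompt_id: str, prompt: str) -> RunResult:
        text = prompt.lower()
        for keyword, skill in self.rules.items():
            if keyword in text:
                return RunResult(prompt_id, skill)
        return RunResult(prompt_id, None)


def route_all(adapter: LLMAdapter, prompts: Dict[str, str]) -> List[RunResult]:
    """Drive any adapter over a mapping of prompt_id -> prompt text."""
    return [adapter.run(pid, text) for pid, text in prompts.items()]
```

Because `LLMAdapter` is a Protocol, the mock does not need to inherit from it, so harness tests can swap adapters without touching a class hierarchy.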

run_eval.py loads evals/prompts/*.yaml, drives an LLMAdapter (mock or claude), and scores routing accuracy (Layer 1) or anchor selection (Layer 2) against the bars in evals/scorecards/. Three pytest cases cover trigger pass, trigger fail, and Layer 2 recall/precision. All 11 harness tests pass. Mock end-to-end: PASS for all 6 skills.
Run:
- python3 -m evals.harness.run_eval --adapter mock --layer 1
- python3 -m evals.harness.run_eval --adapter mock --layer 2
Refs: PLAN.md § Evaluation Layers 1 and 2
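
Scoring Layer 1 against the bars might look like the sketch below. The 90% / 90% / 80% bars are the ones quoted elsewhere in this PR. The function name, the input shape, and the rule that tiers with no prompts are skipped are assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Bars in whole percent, as quoted in this PR (trigger / non-trigger / ambiguous).
BARS_PERCENT = {"trigger": 90, "non-trigger": 90, "ambiguous": 80}


def score_layer1(outcomes: Iterable[Tuple[str, bool]]) -> Tuple[bool, Dict[str, dict]]:
    """Score (tier, routed_correctly) pairs against the per-tier bars.

    Integer arithmetic avoids float rounding at exactly the bar.
    Tiers with no prompts are skipped (an assumption of this sketch).
    """
    totals = Counter()
    correct = Counter()
    for tier, is_correct in outcomes:
        totals[tier] += 1
        if is_correct:
            correct[tier] += 1
    report = {}
    for tier, bar in BARS_PERCENT.items():
        if totals[tier] == 0:
            continue
        report[tier] = {
            "correct": correct[tier],
            "total": totals[tier],
            "passed": correct[tier] * 100 >= bar * totals[tier],
        }
    overall = bool(report) and all(entry["passed"] for entry in report.values())
    return overall, report
```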

Both runnable, both exit 0 with [stub] messages so Phase 1 can wire in real LLM calls without churning file layout. ClaudeAdapter raises a clear SystemExit until Phase 1 implements it. Refs: PLAN.md § Evaluation Layers 3 and 4

Documents the 5-layer eval framework, how to run each layer locally, and the per-PR authoring checklist that becomes a CI gate in Phase 3.
Phase 0 smoke check:
- validate_prompts: all 6 prompt YAMLs valid
- run_eval --layer 1 --adapter mock: PASS (6/6 skills)
- run_eval --layer 2 --adapter mock: PASS (6/6 skills)
- pytest evals/harness/tests/: 11 passed
Phase 0 exit criteria from PLAN.md: all green.
Refs: PLAN.md Phase 0 exit criterion

Author 12 curated anchors across ping-universal-services, ping-app-integration, and ping-identity-for-ai (4 per skill). Expand all eval prompt YAMLs to spec (≥10/5/3). Cross-link all new skills from ping-quickstart. Update plugin-map.md and references/index.json.
Full repo audit (119 issues fixed across 43 files): frontmatter gaps, broken index.json paths, routing placeholders, UI navigation language, missing Scope sections, scaffold version markers, passive skill descriptions.
Implement ClaudeAdapter via Bedrock (eu.anthropic.claude-sonnet-4-6). Fix composition.yaml schema handling in harness. All 6 skills pass Layer 1 eval at ≥90/90/80%; results in evals/results/2026-06-01/claude.layer1.json.
Add PHASE-1-EXECUTION-PLAN.md, PLAN.md, and rewrite README with install instructions, skills overview, eval status table, and delivery status.
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

…n roles, and doc-sourced updates
- ping-quickstart: 3 new anchors (licensing, ForgeRock migration, inherited deployment orientation) and fixes to 3 existing anchors (console URLs, AIC URL structure, session timeout facts)
- ping-foundation: 4 new MT anchors (app-registration, sign-on-policies, directory-and-populations, admin-roles-and-access), 1 new ping-software anchor (pingaccess-basics), fixes to all 12 existing anchors (strip /latest/, correct console URLs, update last_updated)
- admin-roles-and-access.md: full coverage of built-in/custom roles, 3 admin onboarding methods, Administrators environment best practice; sourced from PingOne MT getting-started doc
- directory-and-populations.md: added groups section (static/dynamic, internal/external, group roles)
- sign-on-policies.md: added registration policy and passwordless SMS variants
- app-registration.md: added group-based access and SSO URL location
- index.json: added all 9 missing ping-foundation MT and ping-software anchor paths
- evals/results/2026-06-02: Layer 1 eval results, all 6 skills PASS (90%+ on all dimensions)
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

…gistration+MFA use case
- davinci-overview.md: rebuilt from scratch, covering logical operators table, node visual types, connector instance model, variable scopes, versioning (Save/Deploy/Revert), all 3 invocation methods (redirect/widget/API), redirect integration steps; URL fixed to /davinci/davinci_introduction.html
- davinci-flow-patterns.md: URL fixed to /davinci/flows/davinci_flows.html; sources updated
- davinci-registration-and-mfa.md (NEW): complete registration+email-verification flow, inline vs deferred MFA enrollment, risk step-up with PingOne Protect init/evaluate/update, error handling patterns, subflow extraction recommendations
- SKILL.md (ping-orchestration): new DaVinci routing row for registration+MFA use case
- All 17 PingOne ST anchors: last_updated bumped to 2026-06-02
- index.json: added davinci-registration-and-mfa.md path
- evals/results: Layer 1 all 6 PASS (same known near-misses T-07, T-09)
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

…onfig, URL fixes
- protect-configuration.md (NEW): 14 named predictors with license tier (Risk vs Protect), risk policy types (Standard/Targeted), Signals SDK integration by platform, bot/AI agent detection note, configuration checklist, PingID Device Trust predictor
- verify-configuration.md (NEW): all verification types (GovernmentID, Facial Comparison, Liveness, Voice, Data-Based, DigiLocker), full policy field table, transaction lifecycle with all statuses including REQUIRES_REVIEW handling, IDA claims storage, timeout planning
- All 4 existing anchors: last_updated → 2026-06-03; source URLs fixed (were all 404)
- SKILL.md (ping-universal-services): routing rows for protect-configuration and verify-configuration
- index.json: added protect-configuration and verify-configuration paths
- evals/results: Layer 1 all 6 PASS
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

… Swift 6 notes, migration workflow
- SKILL.md: added companion SDK skills table routing to pingidentity/ping-sdk-agent-skills (7 specialist skills: android, ios, reactjs-journey, reactjs-davinci, javascript, sdk-router, migration)
- mobile-integration-basics.md: added full DaVinci collector type table (TextCollector, PasswordCollector, SubmitCollector, FlowCollector, SelectCollector, SsoCollector, QrCodeCollector, PhoneCollector; auto-advancing: ProtectCollector, DeviceAuthenticatorCollector); Swift 6 actor isolation notes
- web-integration-basics.md: extended Journey callback table to 21 types (added PollingWaitCallback, MetadataCallback, SuspendedTextOutputCallback, SelectIdPCallback, IdPCallback, KbaCreateCallback, ReCaptchaCallback, WebAuthn* callbacks, attribute input callbacks); added DaVinci collector section
- integration-troubleshooting-basics.md: added 9-phase automated migration workflow from forgerock-to-ping-journey-migration skill (comment markers, pre/post-flight build checks, never-delete-silently principle); added SDK skills reference
- All 4 anchors: last_updated → 2026-06-03; all dead /r/en-us/ URLs fixed
- evals/results: Layer 1 all 6 PASS
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

…r, AIC AI Agents, CIBA, bot detection
- identity-for-ai-overview.md: expanded from 3 buckets to 5 pillars (Agent Identity, Agent Security, Agent Gateway, Agent Detection, Verified Trust + AI App Auth); new routing table with MCP/gateway entries; products-by-pillar section with AIC Rapid channel note; slug + sources fixed
- agent-security-patterns.md:
  - Pattern 0: agent registration (PingOne AI Agents feature, AIC /aiagent/register DCR endpoint, Rapid-channel availability note)
  - Pattern 6: CIBA human-in-the-loop approvals with CIBA flow + PingFederate/AIC support
  - Pattern 7: bot/agentic AI detection via Protect predictor
  - sources fixed
- agent-gateway-mcp.md (NEW): PingGateway Agent Gateway module; McpAuditFilter/McpProtectionFilter/McpValidationFilter; RFC 8707 resource indicator requirement; version matrix (2026.3.0 current, 2025.11.2 maintenance); OAuth AS choices (AIC/PingOne/PingFederate); architecture diagram; Cloudflare + AWS Bedrock variants; Evolving stability warning
- verified-trust-overview.md: slug + DaVinci applications URL fixed; last_updated bumped
- workforce-helpdesk-ai.md: slug + source URLs fixed; last_updated bumped
- SKILL.md: added MCP/gateway, CIBA, bot detection routing rows
- index.json: added agent-gateway-mcp.md path
- evals/results: Layer 1 all 6 PASS (ping-foundation now 100%; T-09 only remaining near-miss)
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

- scripts/build_reference_manifests.py: scans curated anchors, scores by canonical/doc_type/recency/slug/status, generates top-N.json per branch; supports --dry-run; validates against reference-manifest-schema.json
- .github/workflows/build-reference-manifests.yml: triggers on curated anchor changes or builder script changes; validates all manifests post-build; auto-commits on main pushes
Generated manifests (14 total across 6 skills):
- ping-quickstart:
  - cross-platform/top-15.json (6 docs)
- ping-foundation:
  - pingone-mt/top-25.json (5 docs)
  - pingone-st/top-25.json (5 docs)
  - ping-software/top-25.json (3 docs)
  - cross-platform/top-10.json (4 docs)
- ping-orchestration:
  - pingone-mt/top-25.json (3 docs)
  - pingone-st/top-25.json (17 docs)
  - ping-software/top-25.json (0, no curated anchors)
  - cross-platform/top-10.json (0, no curated anchors)
- ping-universal-services:
  - cross-platform/top-20.json (6 docs)
- ping-app-integration:
  - cross-platform/top-20.json (4 docs)
- ping-identity-for-ai:
  - ai-identity/top-20.json (5 docs)
PLAN.md: Phase 2 marked complete; Phase 3 marked next
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
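
The manifest builder is described as scoring anchors by canonical/doc_type/recency/slug/status and emitting a top-N list per branch. The sketch below illustrates that idea. All weights, the preferred `doc_type` values, the `"current"` status value, and both function names are hypothetical and are not taken from scripts/build_reference_manifests.py.

```python
from datetime import date
from typing import Dict, List

# Hypothetical preferred doc types; the real builder's values are not shown in this PR.
PREFERRED_DOC_TYPES = {"guide", "reference"}


def score_anchor(meta: Dict, today: date) -> float:
    """Score one anchor's frontmatter; higher is better (weights are made up)."""
    score = 0.0
    if meta.get("canonical"):
        score += 3.0
    if meta.get("doc_type") in PREFERRED_DOC_TYPES:
        score += 1.0
    if meta.get("status") == "current":
        score += 1.0
    if meta.get("slug"):
        score += 0.5
    last_updated = meta.get("last_updated")  # ISO date string, e.g. "2026-06-03"
    if last_updated:
        age_days = (today - date.fromisoformat(last_updated)).days
        score += max(0.0, 2.0 - age_days / 365)  # recency bonus decays over two years
    return score


def build_top_n(anchors: List[Dict], n: int, today: date) -> List[str]:
    """Return the paths of the n best-scoring anchors, ties broken by path."""
    ranked = sorted(anchors, key=lambda m: (-score_anchor(m, today), m["path"]))
    return [m["path"] for m in ranked[:n]]
```

Breaking ties by path keeps the output deterministic, which matters if a CI job regenerates the manifests and commits any difference.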

Adds project CLAUDE.md, Helix setup guide and DaVinci workflow doc, superpowers Phase 0 restructure plan, .gitignore (excludes Playwright session logs), and updated .claude/settings.local.json with sandbox config and new permission entries. Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>

Re-evaluated all 6 umbrella skills against the 9 Getting Started 0–30 use cases in the Ping Agent Skill Strategy doc. Closed 4 gaps:
- ping-orchestration: passkeys-and-passwordless.md (cross-platform)
  friction tiers (low / balanced / higher-assurance), registration patterns A/B/C, auth patterns 1–4, recovery, AIC + DaVinci + PingFederate node/connector mapping.
- ping-foundation: pingone-mt/themes-and-branding.md
  branding, custom domain, email/SMS templates, senders, DaVinci UI Studio, CSP, multi-brand patterns, pre-go-live checklist.
- ping-quickstart: end-to-end-validation.md (cross-cutting)
  validation matrix per layer, automatable vs manual today, test-user patterns, validation sequence, repeatable reporting format, sandbox/PII guardrails. Routing trigger added to SKILL.md.
- ping-app-integration: server-side-integration-basics.md
  confidential clients, JWT/introspection, refresh rotation, M2M client_credentials, CIBA, Token Exchange, retry/429/circuit-breaker, multi-environment.
SKILL.md routing tables updated for all four. Manifest builder rebuilt all 14 generated shortlists; index.json updated. All SKILL.md files at ≤120 lines.
Co-Authored-By: Claude Opus 4.7 <[email protected]>

Layer 1 live run (Bedrock, claude-sonnet-4-6 [1m]) PASS for all 6 skills with new prompts included:
| Skill                   | Trigger | Non-trig | Ambig |
|-------------------------|---------|----------|-------|
| ping-app-integration    | 100%    | 100%     | 100%  |
| ping-foundation         | 100%    | 100%     | 100%  |
| ping-identity-for-ai    | 100%    | 100%     | 100%  |
| ping-orchestration      | 93%     | 100%     | 100%  |
| ping-quickstart         | 92%     | 100%     | 100%  |
| ping-universal-services | 100%    | 100%     | 100%  |
Bar: 90% / 90% / 80%. All above bar.
Added T-50–T-55 strategy use case prompts across:
- ping-quickstart (uc-1, uc-9 ×2)
- ping-foundation (uc-2 ×2, uc-3 ×2, uc-8 ×2)
- ping-orchestration (uc-4 ×2, uc-5 ×2, uc-7)
- ping-app-integration (uc-6 ×4: web / mobile / server-side / M2M)
- ping-universal-services (uc-5: Protect, Verify)
Bug fix: run_eval._build_adapter() now skips composition.yaml when loading mock rules; composition.yaml uses a different schema and crashed the mock adapter builder.
Co-Authored-By: Claude Opus 4.7 <[email protected]>
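
The composition.yaml fix amounts to filtering one file out of the prompt-set glob before building mock rules. A minimal sketch of that filter follows. The helper name `prompt_set_files` and the skip-set constant are hypothetical, and the real `run_eval._build_adapter()` is not reproduced here.

```python
from pathlib import Path
from typing import List

# Files in the prompts directory that do not follow the prompt-set schema.
SKIP_FILES = {"composition.yaml"}


def prompt_set_files(prompts_dir: str) -> List[Path]:
    """List prompt-set YAML files, leaving out files with a different schema."""
    return sorted(
        path for path in Path(prompts_dir).glob("*.yaml")
        if path.name not in SKIP_FILES
    )
```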

Phase 1 + 2 complete — 6 umbrella skills, 58 curated anchors, 14 generated manifests, CI builder, eval framework with live Bedrock Layer 1 PASS (6/6 skills above 90% bar). Resolves LICENSE conflict: keep Ping Identity copyright from main. Co-Authored-By: Claude Opus 4.7 <[email protected]>

brando-dill and others added 28 commits May 14, 2026 11:44

Initial commit

5cf8b82

Initial Structure

71c746e

refactor: add runtime/ tier stubs and fix SKILL.md paths

eae0f00


chore: refresh marketplace.json for the 6-umbrella v1 skill set

1b788f7


fix: add missing ping-quickstart references/generated/.gitkeep

081f415

george-bafaloukas-forgerock merged commit 69db1de into main Jun 3, 2026

george-bafaloukas-forgerock deleted the pr/skills-refactoring branch June 3, 2026 13:26

george-bafaloukas-forgerock merged 28 commits into main from pr/skills-refactoring
