Skip to content

Add get-api-key skill — generic SaaS API-key extraction#112

Open
shubh24 wants to merge 2 commits into
mainfrom
shubh24/get-api-key-skill
Open

Add get-api-key skill — generic SaaS API-key extraction#112
shubh24 wants to merge 2 commits into
mainfrom
shubh24/get-api-key-skill

Conversation

@shubh24
Copy link
Copy Markdown
Contributor

@shubh24 shubh24 commented May 19, 2026

Summary

  • Adds a new get-api-key skill that drives any authenticated SaaS dashboard via the browse CLI to create or reveal an API key
  • Phase 0 reads the vendor's own docs (canonical URL guess first, Google fallback) to learn the flow before navigating — avoids hardcoded site-specific selectors
  • Documents both local (auto-connect Chrome) and remote (Browserbase cloud + cookie-sync persistent context) auth paths, including the 5-minute idle-session caveat and Chrome 136+ debug-port mitigation

What's inside

  • skills/get-api-key/SKILL.md — the skill prompt, structured as Phase 0 (docs) → Phase 1 (verify auth) → Phase 2 (find keys page) → Phase 3 (reveal/create) → Phase 4 (capture) → Phase 5 (return JSON)
  • skills/get-api-key/LICENSE.txt — MIT, matching repo convention

How it's generic

  • No hardcoded site URLs, selectors, or button labels in the skill body
  • Cross-SaaS UX patterns are documented as principles (custom comboboxes vs native selects, varied confirm labels, secret-in-snapshot extraction) rather than per-site recipes
  • Phase 0 doc-reading replaces the need for site-specific knowledge in the skill itself

Validation

Built via autobrowse iteration across five sites:

  • Browserbase (reveal flow, bb_live_)
  • OpenAI (create flow, sk-proj-)
  • Anthropic Console (create flow, sk-ant-api03-)
  • Vercel (create flow with combobox scope picker, vcp_)
  • GitHub fine-grained PAT (Phase 0 doc-reading validated — agent went straight to canonical docs URL, extracted steps, navigated to /settings/personal-access-tokens/new and filled the form correctly in 13 turns)

Test plan

  • Pick a SaaS the agent has never seen (e.g. Resend, Linear, Stripe test mode) and run the skill end-to-end
  • Verify Phase 0 fires (doc_url_used populated in the returned JSON)
  • Verify the agent falls back gracefully when docs are unavailable (Google blocked, vendor has no public docs, etc.)
  • Confirm the remote-mode flow recovers from a mid-task BB session expiry by reusing the persistent context

🤖 Generated with Claude Code


Note

Medium Risk
New agent skill that automates creation/reveal of live API secrets and documents Browserbase/cookie flows; misuse or trace leakage could expose credentials, but it does not change runtime application code.

Overview
Adds a new get-api-key skill package under skills/get-api-key/: an MIT LICENSE.txt and a SKILL.md agent playbook for pulling API keys from authenticated SaaS dashboards via the browse CLI.

The skill documents local Chrome (--auto-connect, debug port / Chrome 136+ caveats) and remote Browserbase setup ( cookie-sync, CDP attach, manual login via debugger URL, ~5‑minute idle expiry and recovery). It defines a phased workflow—vendor docs first (canonical URL or Google via browse), auth check, keys page discovery (URL patterns or nav), reveal or create, one-time secret capture (screenshot + snapshot), and structured JSON output—plus browse 0.7.1 command guidance, cross-SaaS UX heuristics, failure handling, and explicit limits (no login, no sudo passwords, no fabricated secrets).

Reviewed by Cursor Bugbot for commit 9a90843. Bugbot is set up for automated code reviews on this repo. Configure here.

Drives any authenticated SaaS dashboard via the browse CLI to create or
reveal an API key. Reads the vendor's own docs (Phase 0) to learn the
flow, then executes against the UI — avoids hardcoding site-specific
selectors so the skill generalizes to dashboards it's never seen.

Includes:
- Phase 0 doc-reading via browse-driven Google search + canonical URL
  fallback (validated on GitHub fine-grained PATs).
- Local (auto-connect to Chrome) and remote (Browserbase cloud +
  cookie-sync persistent context) auth-setup paths.
- Remote-mode caveat documenting the 5-minute idle session expiry,
  with API recovery snippet.
- Generic cross-SaaS UX patterns (custom comboboxes vs native selects,
  varied confirm-button labels, secret-in-snapshot extraction).
- Failure-recovery playbook for billing prompts, sudo re-auth, stale
  refs, and session loss.

Validated on Browserbase, OpenAI, Anthropic, Vercel, and GitHub
(fine-grained PAT) via autobrowse iteration.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
@shubh24 shubh24 requested a review from shrey150 May 19, 2026 00:10
Comment thread skills/get-api-key/SKILL.md
Phase 0 leaves the browser on a docs/Google page; Phase 1 then checked
the current URL against the dashboard host and would falsely return
"not authenticated". Add an explicit browse open <site-root-url> before
the auth check.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 3 potential issues.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 9a90843. Configure here.

# 2. Stop any existing browse daemon, attach to the cloud session via CDP
browse stop
WS_URL="wss://connect.browserbase.com?apiKey=${BROWSERBASE_API_KEY}&sessionId=<sid>"
browse open <site-root-url> --cdp "$WS_URL"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Option B cookie-sync steps wrong

High Severity

Remote setup tells agents to run cookie-sync with a nonexistent --persist flag, expect a Session ID from that step, and attach with a hand-built WebSocket URL. The real cookie-sync script only prints a Context ID and documents creating a cloud session via browse cloud sessions create before browse open --cdp.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 9a90843. Configure here.

- `browse open <url>` — navigate (no flags needed; daemon stays attached)
- `browse snapshot` — accessibility tree; each element gets a `[X-Y]` ref. PRIMARY perception tool.
- `browse click [X-Y]` — click by ref from latest snapshot (include brackets)
- `browse fill <selector> <value>` — fill input AND press Enter (clears existing text — PREFERRED over `type`)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

browse fill incorrectly presses Enter

Medium Severity

The browse reference says browse fill clears the field and does not submit unless --press-enter is passed. This skill states fill always presses Enter, which can prematurely submit multi-field API key forms before scopes or expiration are set.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 9a90843. Configure here.

browse open <site-root-url> # return to the target dashboard (skip only if Phase 0 was skipped and you never left it)
browse get url
```
- URL contains the dashboard host AND NOT `/sign-in`, `/login`, `/auth` → proceed.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Auth check rejects /auth paths

Medium Severity

Phase 1 treats any URL containing /auth as unauthenticated. Legitimate logged-in settings routes (for example paths under /settings/auth or /authentication) can be rejected with not authenticated even after a successful dashboard return.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 9a90843. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant