autobrowse: optional vendor-neutral inbox-provider hook #119
Lets an autobrowse loop provision a throwaway inbox so the inner agent can register accounts and complete email verification. A new scripts/inbox.mjs CLI (create / wait-otp / wait-link / latest / release) talks to the browse.sh inbox endpoint, which owns the AgentMail key — the agent only ever sees the address. evaluate.mjs gains --inbox-email, injects the inbox into the system prompt, and allows the agent to shell out to inbox.mjs. SKILL.md documents the opt-in provision/release steps, graduation note (inbox is loop-only), and the 3-concurrent-loop free-tier cap. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Consolidates all inbox-provisioning logic into the autobrowse skill so the feature is self-contained with nothing browse.sh-specific. inbox.mjs now calls api.agentmail.to directly using AGENTMAIL_API_KEY from the env (sweep-on-create and the ab- prefix guard move into the CLI). Browserbase deployments inject a pooled key; regular users provide their own (free at agentmail.to) and get a clear setup error if it's unset. The inner agent still only ever sees the inbox address — the key is read by inbox.mjs and never printed. Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
Hardening found by a live Substack magic-link signup run end-to-end:
- wait-link returned an open-tracking pixel (.gif) because it grabbed the first URL anywhere in the body. Now extract <a href> anchors with a reject-list (unsubscribe/mailto/tel/preferences/.gif), which skips img-src pixels; --match matches the href OR the visible link text so "confirm"/"sign in" finds the CTA even when the href is a tracking redirect (browse open follows it).
- latest only showed list-summary metadata (the list endpoint omits the body). It now fetches the full single message by id so text/html/links are visible.
- partsOf prefers AgentMail's cleaned extracted_text/extracted_html.
- evaluate.mjs killed wait-otp/wait-link at the fixed 30s exec cap (ETIMEDOUT on --within 60/90). exec timeout for inbox wait commands is now --within + 15s.
Verified end-to-end: signup → wait-link returns the real "Confirm your email" CTA → browse open → signed-in Substack home. Sweep still proven to never touch non-ab- inboxes.
Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
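For readers following the wait-link fix, here is a minimal sketch of anchor-based link extraction with a reject-list. It is not the code from this PR: `REJECT`, `extractAnchors`, and `pickLink` are hypothetical names, and the regex-based parsing is an assumption made to keep the example dependency-free (a real provider might use an HTML parser).

```javascript
// Hypothetical reject-list built from the terms named in the commit message.
const REJECT = /unsubscribe|^mailto:|^tel:|preferences|\.gif(\?|$)/i;

// Collect { href, text } for every <a href="..."> anchor in an HTML body.
// <img src> tracking pixels are never anchors, so they are skipped for free.
function extractAnchors(html) {
  const anchors = [];
  const re = /<a\b[^>]*\bhref\s*=\s*["']([^"']+)["'][^>]*>([\s\S]*?)<\/a>/gi;
  let m;
  while ((m = re.exec(html)) !== null) {
    const href = m[1].trim();
    // Strip nested tags and collapse whitespace to get the visible link text.
    const text = m[2].replace(/<[^>]+>/g, " ").replace(/\s+/g, " ").trim();
    anchors.push({ href, text });
  }
  return anchors;
}

// Return the first acceptable link, or null.
// `match` (optional) is tested against the href OR the visible text.
function pickLink(html, match) {
  const needle = match ? match.toLowerCase() : null;
  for (const { href, text } of extractAnchors(html)) {
    if (REJECT.test(href)) continue;
    if (
      needle &&
      !href.toLowerCase().includes(needle) &&
      !text.toLowerCase().includes(needle)
    ) {
      continue;
    }
    return href;
  }
  return null;
}
```

Matching on the visible text is what lets a filter such as "confirm" find the CTA even when the href is an opaque tracking redirect.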
… truth)
- create now releases the inbox the task already tracks before minting a new one; a re-create within the 1h sweep window otherwise orphaned a live inbox (leaked AND unreachable by release). (#2)
- evaluate.mjs resolves the inbox address from .inbox.json (what wait-otp/wait-link actually poll); --inbox-email is a fallback and a mismatch now warns instead of silently polling a different inbox. (#4)
- {{inbox_email}} in task.md is now substituted with the resolved address. (#3)
- executeCommand pins inbox.mjs to the run's own --workspace/--task, so a sub-agent can't read or release a sibling task's inbox (parallel runs share a workspace, isolated only by --task). (#5)
The 30s exec-timeout issue (#1) was already fixed by execTimeoutFor in 2d091fc.
Verified: re-create deletes the prior inbox (no orphan); a divergent --inbox-email warns and the resolved address wins; {{inbox_email}} is replaced; an agent passing a foreign --task is overridden back to its own.
Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
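The resolution order and the placeholder substitution described in this commit could look roughly like the sketch below. `resolveInboxEmail` and `substituteInbox` are hypothetical names, and the shape of `state` (the parsed `.inbox.json`, or `null` when absent) is an assumption based on the `{email, inbox_id}` schema mentioned elsewhere in this thread.

```javascript
// state: parsed .inbox.json contents, or null if the file is missing.
// flagEmail: the value of --inbox-email, or undefined.
// warn: injectable so the behaviour can be tested without touching stderr.
function resolveInboxEmail(state, flagEmail, warn = console.warn) {
  const tracked = state && state.email ? state.email : null;
  if (tracked && flagEmail && tracked !== flagEmail) {
    // The tracked inbox is what the wait commands poll, so it wins.
    warn(
      `--inbox-email (${flagEmail}) differs from .inbox.json (${tracked}); using .inbox.json`
    );
  }
  return tracked || flagEmail || null;
}

// Replace every {{inbox_email}} placeholder in the task text.
// split/join avoids the special "$" patterns of String.prototype.replace.
function substituteInbox(taskText, email) {
  if (!email) return taskText;
  return taskText.split("{{inbox_email}}").join(email);
}
```

Preferring the tracked state over the flag keeps the address shown to the agent and the inbox being polled in agreement.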
Addressed the Bugbot findings in
Verified each end-to-end against a live AgentMail org: re-create deletes the prior inbox (no orphan); a divergent
Validation summary: ready for review
Tested at HEAD
No leaked inboxes after any run. Ready to merge.
Reviewed this. The core idea is solid and the security spine is genuinely well done: the AgentMail key never reaches the inner agent, only the throwaway address does. Nice. Three things worth tightening before merge. Framing them simply:
1. Leftover inboxes pile up (the big one).
2. It can grab the wrong verification code.
3. It'll open links sent by strangers.
Everything else I found is low/nit (a possibly-dead
Removes AgentMail from the public skill entirely and replaces the bundled
inbox.mjs with a generic, off-by-default provider contract. autobrowse no longer
ships an email provider or names any vendor; it only knows how to *call* one.
- evaluate.mjs: `--inbox-cmd <path>` / AUTOBROWSE_INBOX_CMD configures an optional
inbox-provider command. Allowlist, exec-timeout, force-scope, and the (now
vendor-neutral) Agent Inbox prompt key off it; all are inert when unset.
Documents the provider contract (create/wait-otp/wait-link/latest/release +
the .inbox.json {email,inbox_id} schema) as the explicit boundary.
- Deleted scripts/inbox.mjs (AgentMail-specific — moves to the internal caller).
- Scrubbed AGENTMAIL_API_KEY/agentmail.to from .env.example, SKILL.md (silent on
the feature), and example-task.md.
Kept generic mechanics: .inbox.json single-source-of-truth, {{inbox_email}}
substitution, --workspace/--task force-scoping, wait-command exec timeout.
Verified: with a throwaway stub provider the hook injects the section,
substitutes the address, and forces scope; with no --inbox-cmd there is no inbox
section and the allowlist is browse-only. `git grep -i agentmail` → no matches.
Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
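Two of the generic mechanics kept by this commit, force-scoping and the wait-command exec timeout, can be sketched as below. This is an illustration under stated assumptions rather than the code in evaluate.mjs: `execTimeoutFor` is named earlier in the thread, but the body here is a guess, and `forceScope` plus both constants are hypothetical.

```javascript
const DEFAULT_EXEC_TIMEOUT_MS = 30_000; // the fixed 30s exec cap
const WAIT_GRACE_MS = 15_000;           // "--within + 15s"

// Drop any agent-supplied --workspace/--task (both "--flag value" and
// "--flag=value" forms), then append the run's own values.
function forceScope(args, workspace, task) {
  const kept = [];
  for (let i = 0; i < args.length; i++) {
    const a = args[i];
    if (a === "--workspace" || a === "--task") {
      i++; // also skip the value that follows the flag
      continue;
    }
    if (a.startsWith("--workspace=") || a.startsWith("--task=")) continue;
    kept.push(a);
  }
  return [...kept, "--workspace", workspace, "--task", task];
}

// wait-otp / wait-link get "--within" seconds plus a grace period;
// every other subcommand keeps the default cap.
function execTimeoutFor(args) {
  const sub = args[0];
  if (sub !== "wait-otp" && sub !== "wait-link") return DEFAULT_EXEC_TIMEOUT_MS;
  const i = args.indexOf("--within");
  const seconds = i >= 0 ? Number(args[i + 1]) : NaN;
  if (!Number.isFinite(seconds) || seconds <= 0) return DEFAULT_EXEC_TIMEOUT_MS;
  return seconds * 1000 + WAIT_GRACE_MS;
}
```

Appending the scope flags after stripping means the run's own values are the only ones the provider ever sees, whatever the agent passed.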
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 0593523. Configure here.
function readInboxState(taskDir) {
  try {
    const { inbox_id } = JSON.parse(fs.readFileSync(path.join(taskDir, ".inbox.json"), "utf-8"));
    return inbox_id || null;
readInboxState reads inbox_id instead of email field
High Severity
readInboxState destructures inbox_id from .inbox.json but the return value is used as an email address (inboxEmail). The provider contract on lines 178-179 documents the schema as { "email": "...", "inbox_id": "..." }, two distinct fields. The email field contains the actual address (e.g. [email protected]) while inbox_id is the API identifier. The agent receives this ID instead of a valid email address, so it types the wrong value into signup/login forms, breaking the entire inbox feature.
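One possible shape for the fix is sketched below. It covers only the parsing step and assumes the `{ "email", "inbox_id" }` schema quoted in the finding. `emailFromInboxState` is a hypothetical helper, and the file read from the reviewed snippet is left out so the example stays self-contained.

```javascript
// Parse the raw contents of .inbox.json and return the address the agent
// should type into forms. inbox_id is an API identifier, not an address,
// so it is deliberately ignored here.
function emailFromInboxState(raw) {
  try {
    const { email } = JSON.parse(raw);
    return typeof email === "string" && email ? email : null;
  } catch {
    // Unreadable or malformed state: treat it as "no inbox".
    return null;
  }
}
```

A caller would pass the file contents in, for example the string the reviewed snippet reads from `.inbox.json` in the task directory.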
Reworked per team feedback: AgentMail is now fully out of this public repo
Keeping AgentMail browse.sh-internal (skills is public), but without forking autobrowse. The inbox capability is now a generic, off-by-default provider hook; the AgentMail implementation + secrets live only in the internal browse.sh repo and are injected into the sandbox at runtime. This PR (public) now:
Verified: with a throwaway stub provider the hook injects the section + substitutes the address + forces scope; with no Pairs with the internal browse.sh PR (provider injection). Divergence stays minimal: one shared autobrowse core; browse.sh owns only a swappable provider script + a few prompt lines.
✅ Re-validated end-to-end on the reworked architecture (full browse.sh sandbox pipeline, local):
Reviewed this alongside the AgentMail provider in browserbase/browse.sh#159. The vendor-neutral split is great: this PR ships zero email-vendor code, just a clean
🟡 Medium
The allowlist lets the inner agent run more than it should.
⚪ Low
✅ What's done well
The contract design is clean, and the isolation instinct — (Review was AI-assisted.)


Summary
Lets an autobrowse loop provision a throwaway inbox so the inner agent can register accounts, log in, and complete email verification — without the user supplying their own email to the agent. Fully self-contained in the autobrowse skill (no browse.sh dependency).
- scripts/inbox.mjs CLI: create / wait-otp / wait-link / latest / release. It talks directly to AgentMail (api.agentmail.to) using AGENTMAIL_API_KEY from the env; the inner agent only ever sees the inbox address (the key is read by inbox.mjs and never printed, and the execute allowlist permits only browse + inbox.mjs).
- create sweeps stale ab--prefixed inboxes (>1h) before minting; this self-heals crashed loops without ever touching a non-ab- inbox.
- evaluate.mjs gains --inbox-email, injects an "Agent Inbox" section into the system prompt, and allows the agent to shell out to inbox.mjs.
- SKILL.md documents the opt-in provision step, mandatory release/cleanup, the graduation note (inbox is loop-only; graduated skills expect the end user's own credentials), and the 3-inbox free-tier concurrency cap.
Key source
AGENTMAIL_API_KEY (claimed org) into the skill-runner env; browse.sh uses the skill with zero browse.sh-specific code.
Verification coverage
- wait-otp (default 4–8 digits)
- wait-otp --regex
- wait-link [--match] → browse open
- latest
Test plan
- AGENTMAIL_API_KEY → clear setup error; missing state / unknown command error cleanly
- 771209) → wait-link (extracted href URL) → latest → release; org returned to baseline inbox count, no leaks
- create left a fresh ab- inbox and the non-ab- primary untouched
- include_spam=true on polling; verification emails to a fresh inbox often get spam-flagged
Supersedes the earlier two-repo approach (browse.sh#151 closed).
🤖 Generated with Claude Code
Note
Medium Risk
Expands the agent execute allowlist and runs external inbox CLIs with altered timeouts; misconfiguration or weak provider isolation could affect parallel tasks, though workspace/task pinning mitigates cross-task access.
Overview
evaluate.mjs gains optional throwaway-inbox support for signup/login/MFA flows via a pluggable provider (--inbox-cmd/AUTOBROWSE_INBOX_CMD), documented inline with a create/wait-otp/wait-link/latest/release contract and tasks/<task>/.inbox.json.
When configured, the inner agent may run only browse plus node <resolved-inbox-cmd>; provider calls are scoped to the current run's workspace/task (agent-supplied --workspace/--task stripped), and wait-otp/wait-link get exec timeouts extended past the default 30s cap. Runs resolve the inbox from .inbox.json (with --inbox-email fallback), warn on mismatch, substitute {{inbox_email}} in task.md, and inject an Agent Inbox system-prompt section with provider command examples.
.gitignore now ignores .inbox.json so per-task inbox state is not committed.
Reviewed by Cursor Bugbot for commit 0593523.
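As a rough illustration of the allowlist rule summarized above (only `browse`, plus `node <resolved-inbox-cmd>` when a provider is configured), a check might look like the sketch below. `isAllowed` is a hypothetical name, and the sketch assumes the command has already been tokenized into an argv array and is executed without a shell. The actual check in evaluate.mjs may differ.

```javascript
// argv: the tokenized command the inner agent wants to run.
// inboxCmd: the resolved provider path, or null when no provider is set.
function isAllowed(argv, inboxCmd) {
  if (!Array.isArray(argv) || argv.length === 0) return false;
  if (argv[0] === "browse") return true;
  // "node" is permitted only for the exact resolved provider script.
  return Boolean(inboxCmd) && argv[0] === "node" && argv[1] === inboxCmd;
}
```

Comparing against the exact resolved path, rather than a prefix or basename, keeps `node` from becoming a general-purpose escape hatch.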