Summary
Customers using Cursor Automations with the Sentry MCP server are hitting rate limit errors shortly after authenticating. Reports include: all API calls being rate-limited within a few minutes of auth, and automation runs being blocked with "The automation was rate-limited due to too many concurrent runs."
Symptoms
- Rate limit errors triggered within minutes of initial authentication
- Affects all API calls (not isolated to a specific endpoint)
- Observed in concurrent/parallel automation run scenarios (e.g. Cursor Automations)
- Error: "The automation was rate-limited due to too many concurrent runs. Retry after a short delay or reduce the number of parallel automation runs."
How Rate Limiting Is Implemented
Rate limiting runs in the Cloudflare Worker (`packages/mcp-cloudflare`) via Cloudflare's native `RateLimit` binding. There are two independent layers:
Layer 1: IP-based (pre-auth)
- Applied to all `/mcp` and `/oauth` routes before OAuth processing
- Binding: `MCP_IP_RATE_LIMITER` (fallback: `MCP_RATE_LIMITER`)
- Key: `mcp:ip:<sha256-of-ip>[0:16]`
- Threshold: 300 requests / 60 seconds per IP
- Source: `src/server/index.ts`
Layer 2: Per-user (post-auth)
- Applied after OAuth token validation, inside the MCP handler
- Binding: `MCP_USER_RATE_LIMITER` (fallback: `MCP_RATE_LIMITER`)
- Key: `mcp:user:<sha256-of-userId>[0:16]`
- User ID = Sentry user ID (from the OAuth token's `payload.user.id`), shared across all clients/tokens for the same Sentry account
- Threshold: 60 requests / 60 seconds per user
- Source: `src/server/lib/mcp-handler.ts`
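The key scheme described above can be sketched as follows. This is a minimal illustration, assuming a plain SHA-256 over the raw identifier; `rateLimitKey` is a hypothetical helper name, not the codebase's actual function:

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper mirroring the described key scheme: a layer
// prefix plus the first 16 hex chars of SHA-256 over the identifier.
function rateLimitKey(layer: "ip" | "user", identifier: string): string {
  const digest = createHash("sha256").update(identifier).digest("hex");
  return `mcp:${layer}:${digest.slice(0, 16)}`;
}

// e.g. rateLimitKey("user", "12345") -> "mcp:user:" + 16 hex chars
```

Because the user key is derived only from the Sentry user ID, every client and token for the same account maps to the same bucket, which is central to the failure mode below.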
Key implementation details
- Rate limit keys use the first 16 hex chars of the SHA-256 of the identifier (privacy-preserving; the ~1/10^19 collision risk is negligible)
- Both bindings ARE defined in `wrangler.jsonc` for prod and canary, so they are wired. However, the namespace IDs in the file (`1001`–`1004`, `2001`–`2004`) are local dev mock IDs; production bindings must be configured via the Cloudflare dashboard and should override these.
- If the rate limiter binding is unavailable (e.g. local dev), requests are allowed by default (fail-open)
- On rate limiter errors, requests are also allowed (fail-open)
- The legacy `MCP_RATE_LIMITER` binding is still used as a fallback in code but is not present in `wrangler.jsonc`; if it is not deployed in prod, both layers transparently fall back to allow-all
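The fail-open behavior can be sketched as below. Assumptions: Cloudflare's `RateLimit` binding exposes `limit({ key })` returning `{ success }`; `isAllowed` is an illustrative wrapper, not the actual code in the repo:

```typescript
// Minimal shape of Cloudflare's RateLimit binding.
interface RateLimiter {
  limit(options: { key: string }): Promise<{ success: boolean }>;
}

// Illustrative fail-open wrapper: a missing binding or a limiter
// error both result in the request being allowed.
async function isAllowed(
  limiter: RateLimiter | undefined,
  key: string,
): Promise<boolean> {
  if (!limiter) return true; // binding unavailable (e.g. local dev)
  try {
    const { success } = await limiter.limit({ key });
    return success;
  } catch {
    return true; // rate limiter error: fail open
  }
}
```

Fail-open is the safer default for availability, but it also means a misconfigured prod binding silently disables the limits rather than failing loudly.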
Rate Limit Config (`wrangler.jsonc`; prod and canary use separate namespace IDs but identical limits)

| Binding | Limit | Period | Scope |
|---|---|---|---|
| `MCP_IP_RATE_LIMITER` | 300 req | 60s | per IP |
| `MCP_USER_RATE_LIMITER` | 60 req | 60s | per Sentry user ID |
| `CHAT_RATE_LIMITER` | 10 req | 60s | chat routes |
| `SEARCH_RATE_LIMITER` | 20 req | 60s | search routes |
Why Cursor Automations Are Likely Hitting This
Cursor Automations can spawn multiple concurrent runs, each of which fires multiple MCP tool calls. All runs from the same authenticated Sentry user share the same `MCP_USER_RATE_LIMITER` bucket (keyed by Sentry user ID). At 60 req/60s, a user running 3–4 parallel automation runs with moderate tool use can saturate the limit within seconds. Cloudflare's rate limits are fixed-window, not sliding, so burst behavior at window boundaries can make the limit feel more aggressive than the numbers suggest.
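A back-of-envelope check makes the saturation concrete (the per-run call rate here is an assumed workload, not a measured figure):

```typescript
// Assumed workload: 4 parallel automation runs, each issuing ~20 MCP
// tool calls within one minute, all sharing one per-user bucket.
const parallelRuns = 4;
const callsPerRunPerMinute = 20;
const userLimitPerWindow = 60; // MCP_USER_RATE_LIMITER: 60 req / 60s

const totalCallsPerMinute = parallelRuns * callsPerRunPerMinute; // 80
const saturated = totalCallsPerMinute > userLimitPerWindow;      // true
console.log({ totalCallsPerMinute, saturated });
```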
The IP-based limit (300 req/60s) is less likely to be the culprit for single-user scenarios but could be a factor for shared corporate egress (NAT/proxy).
Sentry Instrumentation Gap
There are currently no Sentry events or metrics emitted when a rate limit is hit. The 429 response is returned directly with no `captureException`, `captureEvent`, or custom metric. The `sentryBeforeSend` hook only handles scrubbing and fingerprinting; it does not add rate limit visibility. Cloudflare's built-in observability (`observability.enabled: true`) may capture request-level data, but we have no Sentry-side signal to alert on, trend, or correlate with user reports.
We should add Sentry instrumentation at both rate limit check points to track:
- Which limit was hit (IP vs user)
- The rate-limited identifier (hashed, already done in the key)
- Frequency / volume of 429s over time
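One possible shape for that instrumentation is sketched below. `reportRateLimited` and its tag names are hypothetical, and the capture function is injected so the snippet stays self-contained rather than depending on the Sentry SDK; in the Worker, `capture` would be backed by something like `Sentry.captureMessage` with tags, or a custom metric:

```typescript
type CaptureFn = (message: string, tags: Record<string, string>) => void;

// Hypothetical hook called at each rate-limit check point just
// before the 429 is returned.
function reportRateLimited(
  capture: CaptureFn,
  layer: "ip" | "user",
  hashedKey: string, // already hashed, so no raw PII is emitted
): void {
  capture("mcp.rate_limited", { layer, key: hashedKey });
}
```

Tagging by layer lets us distinguish IP-based 429s (shared egress) from per-user 429s (parallel automation runs) when trending the events.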
Investigation Still Needed
- Confirm whether `MCP_USER_RATE_LIMITER` and `MCP_IP_RATE_LIMITER` are actually deployed in the Cloudflare dashboard for prod (if not, the fallback to `MCP_RATE_LIMITER` would make the limits undefined/allow-all)
- Rate limit hits should at minimum log a `console.warn` so it's captured via `consoleLoggingIntegration`, ideally a custom metric or breadcrumb
- Whether the per-user key should incorporate `clientId + userId`, or the limit be elevated for `?agent=1` requests (the query param already exists)
Expected Behavior
Users should not hit rate limits during normal Cursor Automation usage. If limits are intentional, they should be clearly documented and provide actionable guidance in the error response.
Workaround
Reduce number of parallel automation runs; retry after a short delay.
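On the client side, the retry half of this workaround can be automated with a simple backoff loop. This is illustrative only: the delay values are assumptions, and `callTool` stands in for whatever issues the MCP request:

```typescript
// Illustrative retry helper: retries a failing (e.g. 429-throwing)
// call with exponential backoff before giving up.
async function withBackoff<T>(
  callTool: () => Promise<T>,
  maxAttempts = 4,
  baseDelayMs = 1_000,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await callTool();
    } catch (err) {
      if (attempt + 1 >= maxAttempts) throw err;
      // Exponential backoff: 1s, 2s, 4s, ... (with baseDelayMs = 1s)
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

A production version would retry only on 429s (honoring any `Retry-After` header) rather than on every error, but the structure is the same.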
Action taken on behalf of David Cramer.