feat(images): reference-based image rendering for tool results #376

Merged
dimakis merged 7 commits into main from feat/image-rendering-v2
Jun 20, 2026
@dimakis dimakis commented Jun 9, 2026

Summary

  • Adds image rendering for tool results (screenshots, photos, matplotlib plots, etc.)
  • Reference-based architecture: images are stored server-side in memory, WS carries lightweight {id, mediaType} refs, frontend fetches via GET /api/images/:id with browser caching
  • Supports JPEG, PNG, GIF, and WebP via MIME allowlist validation
  • Extracts type: 'image' content blocks from Claude SDK tool results (previously discarded by extractToolResultText)

Changes

  • server/image-store.ts — in-memory image store with session-scoped cleanup
  • server/app.ts — REST endpoint GET /api/images/:imageId
  • server/query-loop.ts — extract images from tool results, store server-side, emit refs
  • packages/protocol/ (ToolResultImage type, extractToolResultImages() function, updated block types)
  • packages/client/ — protocol parser + messages slice handle image refs
  • frontend/src/components/ToolPill.tsx — renders images via REST URL
  • 20 tests (6 protocol + 5 image-store + 9 existing)

Test plan

  • Unit tests for extractToolResultImages (6 cases)
  • Unit tests for image-store (5 cases: store/retrieve, MIME validation, session cleanup)
  • E2E: ask Claude to read a screenshot → image renders in chat
  • E2E: ask Claude to generate a matplotlib plot → chart renders
  • Verify no regression on text-only tool results

🤖 Generated with Claude Code

@dimakis dimakis left a comment

Centaur Review

Found 7 issue(s) (2 critical) (2 warning).

packages/protocol/src/content-blocks.ts

Critical type mismatch: extractToolResultImages returns objects shaped { data, mediaType } but is typed as ToolResultImage[] which expects { id, mediaType } — this won't pass tsc. Additionally, clearSessionImages is never called, creating an unbounded memory leak.

  • 🔴 bugs (L55): extractToolResultImages declares return type ToolResultImage[] (which has { id: string; mediaType: string }), but pushes { data: string, mediaType: string }. The id field is missing and data doesn't exist on the type. This is a compile error. A separate intermediate type (e.g. { data: string; mediaType: string }) is needed for extracted raw images, distinct from the reference-based ToolResultImage. [fixable]
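
The type split suggested above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `RawToolResultImage` is the name later used in the fix commit, while `ContentBlock` and `toRef` are hypothetical names introduced here, and the real SDK block type may differ.

```typescript
// Raw image as extracted from an SDK tool result (base64 payload, server-side only).
interface RawToolResultImage {
  data: string;
  mediaType: string;
}

// Lightweight reference carried over the WebSocket.
interface ToolResultImage {
  id: string;
  mediaType: string;
}

// Assumed shape of a tool-result content block (hypothetical, for illustration).
interface ContentBlock {
  type: string;
  text?: string;
  source?: { type?: string; data?: string; media_type?: string };
}

// Returns raw base64 images; no id exists yet at this stage.
function extractToolResultImages(content: unknown): RawToolResultImage[] {
  if (!Array.isArray(content)) return [];
  const images: RawToolResultImage[] = [];
  for (const c of content as ContentBlock[]) {
    if (c?.type === "image" && c.source?.type === "base64" && c.source.data && c.source.media_type) {
      images.push({ data: c.source.data, mediaType: c.source.media_type });
    }
  }
  return images;
}

// Hypothetical helper: builds the wire-format reference once the store has assigned an id.
function toRef(id: string, img: RawToolResultImage): ToolResultImage {
  return { id, mediaType: img.mediaType };
}

const raw = extractToolResultImages([
  { type: "text", text: "done" },
  { type: "image", source: { type: "base64", data: "aGVsbG8=", media_type: "image/png" } },
]);
const ref = toRef("example-id", raw[0]);
```

With two distinct types, accessing `img.data` on a reference (the error flagged in server/query-loop.ts) no longer type-checks by accident.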

server/query-loop.ts

  • 🔴 bugs (L1271): img.data is accessed on a ToolResultImage which has no data property (only id and mediaType). This works at runtime because the actual object from extractToolResultImages has data, but it's a type error and will fail tsc. Same root cause as the type mismatch in content-blocks.ts. [fixable]

server/image-store.ts

  • 🟡 unsafe_assumptions: clearSessionImages is defined but never called anywhere in the codebase. Images accumulate in the in-memory Map indefinitely with no TTL, no max-size cap, and no cleanup on session close/hide. For a long-running server this is an unbounded memory leak. Wire clearSessionImages into closeSessionByUser (chat.ts:1342) and/or session abort/cleanup paths. [fixable]
  • 🟡 unsafe_assumptions (L29): storeImage accepts arbitrarily large base64 payloads with no size limit. A single large image (e.g. a high-res screenshot returned by a tool) could decode to tens of megabytes of Buffer. Consider a per-image size cap or at minimum a total store size budget. [fixable]
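
A session-scoped store with a MIME allowlist and a per-image cap might look roughly like the sketch below. It assumes a Node.js runtime. `MAX_IMAGE_BYTES`, `StoredImage`, and `getImage` are illustrative names, and the 10 MiB value only approximates the 10 MB cap added later in this PR.

```typescript
import { Buffer } from "node:buffer";
import { randomUUID } from "node:crypto";

const ALLOWED_MEDIA_TYPES = new Set(["image/jpeg", "image/png", "image/gif", "image/webp"]);
const MAX_IMAGE_BYTES = 10 * 1024 * 1024; // assumed cap; the exact value is a guess

interface StoredImage {
  sessionId: string;
  mediaType: string;
  data: Buffer;
}

const images = new Map<string, StoredImage>();

// Returns the new image id, or null when the image is rejected.
function storeImage(sessionId: string, base64Data: string, mediaType: string): string | null {
  if (!ALLOWED_MEDIA_TYPES.has(mediaType)) return null;
  const data = Buffer.from(base64Data, "base64");
  if (data.length === 0 || data.length > MAX_IMAGE_BYTES) return null;
  const id = randomUUID();
  images.set(id, { sessionId, mediaType, data });
  return id;
}

function getImage(id: string): StoredImage | undefined {
  return images.get(id);
}

// Removes every image owned by the session and returns how many were freed.
function clearSessionImages(sessionId: string): number {
  let removed = 0;
  for (const [id, img] of images) {
    if (img.sessionId === sessionId) {
      images.delete(id);
      removed++;
    }
  }
  return removed;
}
```

The cap is checked on the decoded length, so the oversized buffer is still allocated briefly; checking the base64 string length first would avoid that.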

server/__tests__/image-store.test.ts

  • 🔵 missing_tests (L1): beforeEach is imported from vitest but never used. The module-level images Map leaks state across tests. Add a reset hook or export a _resetForTest helper to ensure test isolation. Currently the clear-by-session test (line 41) relies on no prior test having used 'session-1'. [fixable]

server/app.ts

  • 🔵 missing_tests (L1343): No integration test for the GET /api/images/:imageId endpoint — covering the 404 case, successful retrieval with correct Content-Type, and auth rejection when unauthenticated. [fixable]
  • 🔵 unsafe_assumptions (L1343): The image endpoint serves any image to any authenticated user regardless of session ownership. If multi-user access is ever added, an attacker who guesses a UUID can read another user's tool result images. Low risk today (UUIDs are unguessable and single-user), but worth noting. [fixable]

const images: ToolResultImage[] = [];
for (const c of content) {
if (c.type === 'image' && c.source?.type === 'base64' && c.source.data && c.source.media_type) {
images.push({ data: c.source.data, mediaType: c.source.media_type });

🔴 bugs: extractToolResultImages declares return type ToolResultImage[] (which has { id: string; mediaType: string }), but pushes { data: string, mediaType: string }. The id field is missing and data doesn't exist on the type. This is a compile error. A separate intermediate type (e.g. { data: string; mediaType: string }) is needed for extracted raw images, distinct from the reference-based ToolResultImage. [fixable]

Comment thread server/query-loop.ts
if (resultImages.length > 0 && sid) {
imageRefs = [];
for (const img of resultImages) {
const imgId = storeImage(sid, img.data, img.mediaType);

🔴 bugs: img.data is accessed on a ToolResultImage which has no data property (only id and mediaType). This works at runtime because the actual object from extractToolResultImages has data, but it's a type error and will fail tsc. Same root cause as the type mismatch in content-blocks.ts. [fixable]

Comment thread server/image-store.ts
): string | null {
if (!ALLOWED_MEDIA_TYPES.has(mediaType)) return null;
const id = randomUUID();
images.set(id, {

🟡 unsafe_assumptions: storeImage accepts arbitrarily large base64 payloads with no size limit. A single large image (e.g. a high-res screenshot returned by a tool) could decode to tens of megabytes of Buffer. Consider a per-image size cap or at minimum a total store size budget. [fixable]

@@ -0,0 +1,46 @@
import { describe, it, expect, beforeEach } from 'vitest';

🔵 missing_tests: beforeEach is imported from vitest but never used. The module-level images Map leaks state across tests. Add a reset hook or export a _resetForTest helper to ensure test isolation. Currently the clear-by-session test (line 41) relies on no prior test having used 'session-1'. [fixable]

Comment thread server/app.ts
}
});

// ── Tool result images (served from in-memory store) ──────────────────────

🔵 missing_tests: No integration test for the GET /api/images/:imageId endpoint — covering the 404 case, successful retrieval with correct Content-Type, and auth rejection when unauthenticated. [fixable]

Comment thread server/app.ts
}
});

// ── Tool result images (served from in-memory store) ──────────────────────

🔵 unsafe_assumptions: The image endpoint serves any image to any authenticated user regardless of session ownership. If multi-user access is ever added, an attacker who guesses a UUID can read another user's tool result images. Low risk today (UUIDs are unguessable and single-user), but worth noting. [fixable]

dimakis commented Jun 9, 2026

Centaur Review

Found 7 issue(s) (3 warning).

server/chat.ts

Solid architecture (server-side storage, reference-based WS events, auth-protected endpoint), but images leak in 2 of 4 session cleanup paths — the abort handlers in closeSessionByUser and _closeoutSessionInner don't call clearSessionImages. Also missing per-session image count limits and reducer/component test coverage.

  • 🟡 bugs (L1397): Memory leak: closeSessionByUser with an active agent registers an abort listener that calls finalizeCloseout but never calls clearSessionImages. The 2-minute timeout fires registry.abort, which triggers the abort listener — images stay in memory forever. Same issue exists in _closeoutSessionInner (line 1323) for auto-closeout. Both abort handlers should include clearSessionImages(session.sessionId) alongside finalizeCloseout. [fixable]
  • 🟡 bugs (L1057): Missing clearSessionImages in the startChat error/catch path. If a session stores images (unlikely at startup, but the code path exists after the session is registered) and then fails, images won't be cleaned up. [fixable]
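
The abort-path fix described above could be sketched as follows. The store and `finalizeCloseout` here are stand-ins, and `registerCloseoutOnAbort` is a hypothetical name; the real handlers in chat.ts are not shown in this thread.

```typescript
// Stand-in for the image store: imageId -> owning sessionId.
const imageOwners = new Map<string, string>();

function clearSessionImages(sessionId: string): void {
  for (const [id, owner] of imageOwners) {
    if (owner === sessionId) imageOwners.delete(id);
  }
}

// Stand-in for the real closeout step; records which sessions were finalized.
const finalized: string[] = [];
function finalizeCloseout(sessionId: string): void {
  finalized.push(sessionId);
}

// Registers a one-shot abort listener that frees images alongside the closeout.
function registerCloseoutOnAbort(signal: AbortSignal, sessionId: string): void {
  const onAbort = (): void => {
    clearSessionImages(sessionId);
    finalizeCloseout(sessionId);
  };
  if (signal.aborted) {
    onAbort(); // already aborted: the event will not fire again
    return;
  }
  signal.addEventListener("abort", onAbort, { once: true });
}

imageOwners.set("img-1", "session-a");
imageOwners.set("img-2", "session-a");
imageOwners.set("img-3", "session-b");

const controller = new AbortController();
registerCloseoutOnAbort(controller.signal, "session-a");
controller.abort(); // e.g. the timeout path firing registry.abort
```

Keeping both calls in the same listener means no abort path can finalize a session while leaving its images behind.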

server/image-store.ts

  • 🟡 unsafe_assumptions: No per-session image count limit. A tool that returns many images (e.g., a loop generating screenshots) could accumulate unbounded in-memory data. The 10 MB per-image cap is good, but there's no cap on the total number of images per session or globally. Consider adding a per-session cap (e.g., 50 images) or a global memory budget. [fixable]

packages/client/src/slices/messages.ts

  • 🔵 missing_tests: No tests for patchToolResult with images, finishCurrent preserving toolResultImages, or the SUBAGENT_TOOL_RESULT image patching. The messages-slice test file exists (packages/client/__tests__/messages-slice.test.ts) but wasn't updated. The image-store.test.ts and content-blocks.test.ts tests are solid — the gap is in the reducer/state management layer. [fixable]

frontend/src/components/ToolPill.tsx

  • 🔵 missing_tests: No tests for the updated ToolResult component (image rendering, images-only case, mixed images+text case) or the updated done logic. The existing ToolPill.test.tsx wasn't updated. [fixable]

server/query-loop.ts

  • 🔵 unsafe_assumptions (L1266): If resolvedSessionId is undefined and registry.get(clientId) returns a session without a sessionId, then sid is undefined and images are silently dropped. This is a defensive edge case — images from the SDK are extracted but never stored, and no warning is logged. Consider logging a warning when sid is falsy but resultImages.length > 0. [fixable]

server/app.ts

  • 🔵 style (L1344): The Content-Type response header is set directly from img.mediaType, which was validated on store. This is fine, but the endpoint has no rate limiting or abuse protection. Since image IDs are UUIDs (unguessable) and the endpoint is behind auth middleware, this is low risk, but worth noting for future hardening.

dimakis and others added 6 commits June 20, 2026 13:18
Messages queued in connection.pendingSends during WS downtime (iOS
background) were flushed to the wrong session after reconnect. Add
clearPendingSends() and call it from newSession() and switchSession().

Co-Authored-By: Claude Opus 4.6 <[email protected]>

When Claude reads an image file (JPEG, PNG, GIF, WebP) or reads a
generated plot, the SDK returns image content blocks that were previously
discarded. This adds end-to-end support for extracting and rendering
those images inline in the ToolPill component.

Data flow: SDK tool_result → extractToolResultImages() → WS event →
protocol parser → messages reducer → ToolPill <img> render.

Also handles subagent tool results with images.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

Store tool result images server-side and serve via REST endpoint
instead of sending base64 data over WebSocket. The WS protocol
carries lightweight {id, mediaType} refs; frontend fetches images
via GET /api/images/:id with browser caching.

Supports JPEG, PNG, GIF, and WebP; covers screenshots, photos,
and generated plots (matplotlib etc) read by the Read tool.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

…lation

- Split ToolResultImage into RawToolResultImage (base64 extraction) and
  ToolResultImage (wire-format reference) to fix tsc errors
- Add 10 MB per-image size cap in storeImage
- Wire clearSessionImages into stopChat and closeSessionByUser
- Add _resetForTest helper and beforeEach in tests for isolation
- Add test for oversized image rejection

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@dimakis dimakis force-pushed the feat/image-rendering-v2 branch from 840a329 to 841702d Compare June 20, 2026 12:19

@dimakis dimakis left a comment

Centaur Review

Found 6 issue(s) (3 warning).

server/chat.ts

Solid reference-based image architecture that avoids base64 over WebSocket. Main concern is incomplete session cleanup — _closeoutSessionInner and abort listeners in closeSessionByUser don't call clearSessionImages, causing memory leaks for abandoned/timed-out sessions. Also needs a global store size cap and more test coverage for the image data path through the client reducer and protocol parser.

  • 🟡 bugs: Memory leak: _closeoutSessionInner (abandoned/detached sessions) never calls clearSessionImages. Both the early-return path (no inputQueue, line 1345–1359) and the abort listener (line 1381–1398) skip image cleanup. Abandoned sessions will leak stored images in the in-memory Map until server restart. The same gap exists in closeSessionByUser's abort listener (line 1460–1471) — when the agent is active and eventually timed out, images are not freed. [fixable]

server/image-store.ts

  • 🟡 unsafe_assumptions: No global memory cap on the image store. Individual images are capped at 10 MB, but there is no limit on total image count or total memory usage. A session producing many tool result images (e.g., repeated screenshot tools) could grow the Map unboundedly. Consider adding a max-entry or max-total-bytes limit with LRU eviction, or at minimum a per-session cap. [fixable]
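
One way to bound total memory, sketched under assumptions: a byte budget with least-recently-used eviction, relying on the insertion order of a `Map`. `BoundedImageIndex` is a hypothetical name, and it tracks only metadata; a real store would drop the image bytes at the same point it drops the entry.

```typescript
interface Entry {
  sessionId: string;
  bytes: number;
}

class BoundedImageIndex {
  private entries = new Map<string, Entry>(); // iteration order = least recently used first
  private totalBytes = 0;
  private readonly maxTotalBytes: number;

  constructor(maxTotalBytes: number) {
    this.maxTotalBytes = maxTotalBytes;
  }

  // Adds an entry, evicting the least recently used ones until it fits.
  add(id: string, sessionId: string, bytes: number): boolean {
    if (bytes > this.maxTotalBytes) return false; // can never fit
    if (this.entries.has(id)) this.remove(id);
    while (this.totalBytes + bytes > this.maxTotalBytes) {
      const oldest = this.entries.keys().next();
      if (oldest.done) break;
      this.remove(oldest.value);
    }
    this.entries.set(id, { sessionId, bytes });
    this.totalBytes += bytes;
    return true;
  }

  // Marks an entry as recently used by moving it to the end of the Map.
  touch(id: string): boolean {
    const entry = this.entries.get(id);
    if (!entry) return false;
    this.entries.delete(id);
    this.entries.set(id, entry);
    return true;
  }

  remove(id: string): void {
    const entry = this.entries.get(id);
    if (!entry) return;
    this.entries.delete(id);
    this.totalBytes -= entry.bytes;
  }

  has(id: string): boolean {
    return this.entries.has(id);
  }

  get bytesUsed(): number {
    return this.totalBytes;
  }
}
```

Eviction means a still-open session can lose older images, so the frontend would need to tolerate a 404 for an evicted id.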

packages/client/__tests__/messages-slice.test.ts

  • 🟡 missing_tests: No test coverage for the images field flowing through patchToolResult or the TOOL_RESULT reducer action. The existing TOOL_RESULT tests don't pass images and don't assert toolResultImages on the patched block. At minimum, add a test that dispatches TOOL_RESULT with images and verifies toolResultImages is set on both current blocks and finalized messages. [fixable]
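
To make the requested test concrete, here is a simplified sketch of image patching across current blocks and finalized messages. The state shape (`current`, `messages`, `toolUseId`) is assumed for illustration and is likely simpler than the real messages slice.

```typescript
interface ToolResultImage {
  id: string;
  mediaType: string;
}

interface ToolBlock {
  toolUseId: string;
  toolResult?: string;
  isError?: boolean;
  toolResultImages?: ToolResultImage[];
}

interface Message {
  blocks: ToolBlock[];
}

interface State {
  current: ToolBlock[]; // blocks of the in-flight message
  messages: Message[]; // finalized messages
}

function patchBlocks(
  blocks: ToolBlock[],
  toolUseId: string,
  result: string,
  isError: boolean,
  images?: ToolResultImage[],
): ToolBlock[] {
  // Only add the field when there is at least one image.
  const imgPatch = images && images.length > 0 ? { toolResultImages: images } : {};
  return blocks.map((b) =>
    b.toolUseId === toolUseId ? { ...b, toolResult: result, isError, ...imgPatch } : b,
  );
}

// Patches the matching block wherever it lives, without mutating the input state.
function patchToolResult(
  state: State,
  toolUseId: string,
  result: string,
  isError: boolean,
  images?: ToolResultImage[],
): State {
  return {
    current: patchBlocks(state.current, toolUseId, result, isError, images),
    messages: state.messages.map((m) => ({
      ...m,
      blocks: patchBlocks(m.blocks, toolUseId, result, isError, images),
    })),
  };
}
```

A reducer test along these lines would dispatch the patch once for a block in `current` and once for a block in `messages`, asserting `toolResultImages` in both places.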

server/app.ts

  • 🔵 missing_tests: No integration test for the GET /api/images/:imageId endpoint. The image-store unit tests cover the store itself, but there's no test verifying the HTTP endpoint returns correct Content-Type, Content-Length, 404 for missing images, or that auth middleware protects it. [fixable]

packages/client/__tests__/protocol-parser.test.ts

  • 🔵 missing_tests: The protocol parser test for tool_result (line 256) does not cover the images field. Add a test that passes images: [{id: '...', mediaType: 'image/png'}] in the WS message and verifies it appears in the dispatched TOOL_RESULT action. [fixable]

frontend/src/components/ToolPill.tsx

  • 🔵 style: Minor behavior change: previously, an empty-string toolResult ("") would render a CodeBlock; now hasText requires .length > 0, so empty results are hidden. This is likely intentional (no point showing an empty code block), but worth noting as a subtle contract change.

- Add clearSessionImages to all abort/closeout paths (memory leak fix)
- Add per-session image count limit (50 images)
- Log warning when tool result images are dropped due to missing sessionId
- Add test for per-session image cap

Co-Authored-By: Claude Opus 4.6 <[email protected]>

dimakis commented Jun 20, 2026

Centaur Review

Found 8 issue(s) (1 critical) (4 warning).

server/query-loop.ts

Sound architecture (reference-based images over WS is the right call), but clearSessionImages is missing from the primary session exit path in query-loop.ts finally block, causing a memory leak for every session that produces images. Secondary cleanup gaps exist in the closeout/abort paths. Image references persisted in EventStore will break on session restore since the in-memory store is ephemeral.

  • 🔴 bugs (L1376): Missing clearSessionImages() in the query-loop finally block. This is the primary session exit path (natural completion, error, or abort). Without cleanup here, every session that ends normally leaks its images in the in-memory store indefinitely. registry.remove(clientId) is called at line 1376 but the session's images are never freed. Add clearSessionImages(resolvedSessionId) before the registry.remove() call. [fixable]
  • 🟡 regressions (L1324): The images field in tool_result events is persisted to the EventStore (via sendOrBuffer → store.append). On session restore/reattach, the client receives these events with image IDs, but the in-memory image store is either already cleared (session ended) or lost (server restart). The frontend will render <img src='/api/images/{id}'> tags that return 404. Images don't survive session restore; consider either not persisting image refs in the event store, or noting this as a known limitation. [fixable]
  • 🟡 unsafe_assumptions (L1266): The session ID fallback resolvedSessionId || registry.get(clientId)?.sessionId may be null/undefined when neither is available (e.g., early in the query loop before the first SDK event resolves the session ID). If sid is undefined, images are silently dropped — storeImage is never called. This is safe but may cause confusion when images appear in some tool results but not others depending on timing.

server/chat.ts

  • 🟡 bugs (L1355): In _closeoutSessionInner, when the session has an active agent (line 1335+), the abort listener (lines 1355-1372) only calls finalizeCloseout — it never calls clearSessionImages. Same issue in closeSessionByUser at lines 1432-1444. Sessions that go through the closeout-then-abort path leak images. The stopChat and no-agent branches do clean up correctly, but the closeout path doesn't. [fixable]

server/image-store.ts

  • 🟡 bugs (L22): No upper bound on total store size or per-session image count. A session producing many tool-result images (e.g., repeated screenshot loops) could accumulate hundreds of 10 MB images in memory before session cleanup runs. Consider a total store size cap or per-session image count limit as a safety valve. [fixable]

packages/client/src/slices/messages.ts

  • 🔵 missing_tests (L235): The patchToolResult function now accepts an images parameter and builds imgPatch, but there are no tests for this new behavior. The messages reducer handles TOOL_RESULT and SUBAGENT_TOOL_RESULT actions with images — these paths should have unit tests verifying that toolResultImages is correctly set on blocks. [fixable]

server/app.ts

  • 🔵 missing_tests (L1364): The new GET /api/images/:imageId endpoint has no integration test. Other endpoints in this file (e.g., /api/files/read, /api/files/download) have route tests in server/__tests__/routes.test.ts. Should test: 404 for missing ID, successful retrieval with correct Content-Type, and that auth middleware applies. [fixable]

frontend/src/components/ToolPill.tsx

  • 🔵 style (L135): The done expression uses inconsistent truthiness checks: block.toolResult !== undefined (strict check) vs block.toolResultImages && block.toolResultImages.length > 0 (truthy check). For consistency and clarity, consider (block.toolResultImages !== undefined && block.toolResultImages.length > 0) or extract both checks into a named variable like hasResult. [fixable]
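
The named-variable suggestion might look like this sketch. The helper names are hypothetical and the block type is reduced to the two fields involved.

```typescript
interface ToolResultImage {
  id: string;
  mediaType: string;
}

interface ToolBlock {
  toolResult?: string;
  toolResultImages?: ToolResultImage[];
}

// True when there is non-empty text to show in a code block.
function hasText(block: ToolBlock): boolean {
  return block.toolResult !== undefined && block.toolResult.length > 0;
}

// True when at least one image reference is attached.
function hasImages(block: ToolBlock): boolean {
  return block.toolResultImages !== undefined && block.toolResultImages.length > 0;
}

// A tool call is done once any result has arrived; an empty-string result still counts.
function isDone(block: ToolBlock): boolean {
  return block.toolResult !== undefined || hasImages(block);
}
```

Separating `isDone` from `hasText` keeps the empty-string case explicit: the call is finished, but there is nothing textual to render.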

dimakis commented Jun 20, 2026

Centaur Review

Found 7 issue(s) (3 warning).

server/chat.ts

Solid architecture — reference-based approach avoids base64 over WS. Main gap: session restore (replayEventsToMessages) and snapshot recovery (SnapshotBlock) both drop image references, so images are lost on reconnect/refresh. Also needs a global memory cap on the in-memory image store.

  • 🟡 bugs (L1889): replayEventsToMessages() extracts only result and isError from tool_result events, discarding the images field. The event store DOES persist images refs (emit → sendOrBuffer → store.append), but replay ignores them, and RestoredMessage blocks lack a toolResultImages field. Sessions restored via REST (browser refresh, reattach) will lose all tool result images. [fixable]
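
A replay step that keeps the images field could be sketched as below. The event and block shapes are assumptions made for illustration; the real `replayEventsToMessages` and `RestoredMessage` types are not shown in this thread.

```typescript
interface ToolResultImage {
  id: string;
  mediaType: string;
}

interface ToolUseEvent {
  type: "tool_use";
  toolUseId: string;
  name: string;
}

interface ToolResultEvent {
  type: "tool_result";
  toolUseId: string;
  result: string;
  isError?: boolean;
  images?: ToolResultImage[];
}

type StoredEvent = ToolUseEvent | ToolResultEvent;

interface RestoredBlock {
  toolUseId: string;
  name: string;
  toolResult?: string;
  isError?: boolean;
  toolResultImages?: ToolResultImage[];
}

// Rebuilds tool blocks from stored events, carrying image refs through.
function replayToolBlocks(events: StoredEvent[]): RestoredBlock[] {
  const byId = new Map<string, RestoredBlock>();
  for (const ev of events) {
    if (ev.type === "tool_use") {
      byId.set(ev.toolUseId, { toolUseId: ev.toolUseId, name: ev.name });
    } else {
      const block = byId.get(ev.toolUseId);
      if (!block) continue; // result without a matching tool_use: skip
      block.toolResult = ev.result;
      block.isError = ev.isError ?? false;
      if (ev.images && ev.images.length > 0) block.toolResultImages = ev.images;
    }
  }
  return Array.from(byId.values());
}
```

Restored refs only help while the in-memory store still holds the bytes; after a server restart or session cleanup the same ids return 404, as an earlier review noted.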

server/image-store.ts

  • 🟡 unsafe_assumptions (L25): No global cap on the in-memory image store. Per-session limit is 50 images × 10 MB = up to 500 MB per session, and with many concurrent sessions the process can OOM. A global cap (e.g. max total entries or total bytes) would prevent unbounded memory growth if session cleanup is delayed or sessions leak. [fixable]
  • 🔵 unsafe_assumptions (L37): Per-session count check iterates all images in the store (O(n) on total images across all sessions). Fine at current scale, but a sessionId → count secondary index would be O(1). Not urgent — just noting for when the store grows. [fixable]
  • 🔵 unsafe_assumptions (L42): Buffer.from(base64Data, 'base64') silently ignores non-base64 characters and can produce shorter-than-expected buffers from malformed input. This is fine for data from the SDK (trusted source), but worth noting — no validation that the decoded buffer actually represents a valid image.
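
If validation of decoded bytes were ever wanted, one lightweight option is to compare the leading bytes against the well-known file signatures of the four allowed formats. The sketch below is an illustration only; the function names are hypothetical and the PR does not do this.

```typescript
// True when `bytes` contains `sig` starting at `offset`.
function startsWith(bytes: Uint8Array, sig: number[], offset = 0): boolean {
  if (bytes.length < offset + sig.length) return false;
  return sig.every((b, i) => bytes[offset + i] === b);
}

// Returns the MIME type implied by the file signature, or null if unrecognized.
function sniffImageType(bytes: Uint8Array): string | null {
  // PNG: 89 50 4E 47 0D 0A 1A 0A
  if (startsWith(bytes, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "image/png";
  // JPEG: FF D8 FF
  if (startsWith(bytes, [0xff, 0xd8, 0xff])) return "image/jpeg";
  // GIF: "GIF87a" or "GIF89a"
  if (
    startsWith(bytes, [0x47, 0x49, 0x46, 0x38]) &&
    (bytes[4] === 0x37 || bytes[4] === 0x39) &&
    bytes[5] === 0x61
  ) {
    return "image/gif";
  }
  // WebP: "RIFF" <4-byte size> "WEBP"
  if (startsWith(bytes, [0x52, 0x49, 0x46, 0x46]) && startsWith(bytes, [0x57, 0x45, 0x42, 0x50], 8)) {
    return "image/webp";
  }
  return null;
}

// True when the decoded bytes agree with the declared media type.
function matchesDeclaredType(bytes: Uint8Array, mediaType: string): boolean {
  return sniffImageType(bytes) === mediaType;
}
```

A signature check only confirms the header, not that the rest of the file decodes, so it is a sanity check rather than full validation.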

packages/protocol/src/types.ts

  • 🟡 bugs (L36): SnapshotBlock (used for iOS reattach recovery via MESSAGE_SNAPSHOT) does not include toolResultImages. When forceFlushPendingMessage builds snapshot-based recovery payloads, image references are dropped. Clients reattaching mid-stream won't see tool result images. [fixable]

packages/client/src/slices/messages.ts

  • 🔵 missing_tests (L230): patchToolResult now accepts an optional images parameter and spreads it into blocks, but there are no unit tests for the image-patching path in the messages reducer — neither for top-level TOOL_RESULT nor for SUBAGENT_TOOL_RESULT with images. [fixable]

server/query-loop.ts

  • 🔵 style (L1268): The two sequential if (resultImages.length > 0 ...) blocks (lines 1268–1278) can be combined: if (resultImages.length > 0) { if (!sid) { log.warn(...) } else { ... } }. Current form makes the warn branch non-exclusive — both conditions fire when !sid, running the warn log and then silently skipping the store loop. [fixable]
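
The combined branch suggested above might be written like this. `toImageRefs` is a hypothetical wrapper, with the store and logger passed in as parameters so the sketch stands alone; the real code in query-loop.ts calls `storeImage` and `log.warn` directly.

```typescript
interface RawToolResultImage {
  data: string;
  mediaType: string;
}

interface ToolResultImage {
  id: string;
  mediaType: string;
}

type StoreFn = (sessionId: string, data: string, mediaType: string) => string | null;

// Exactly one branch runs when images exist: warn (no session id) or store.
function toImageRefs(
  resultImages: RawToolResultImage[],
  sid: string | undefined,
  store: StoreFn,
  warn: (msg: string) => void,
): ToolResultImage[] | undefined {
  let imageRefs: ToolResultImage[] | undefined;
  if (resultImages.length > 0) {
    if (!sid) {
      warn(`dropping ${resultImages.length} tool result image(s): no sessionId`);
    } else {
      imageRefs = [];
      for (const img of resultImages) {
        const id = store(sid, img.data, img.mediaType);
        if (id) imageRefs.push({ id, mediaType: img.mediaType }); // rejected images are skipped
      }
    }
  }
  return imageRefs;
}
```

Nesting the session check inside the length check makes the two outcomes visibly exclusive, which is the point of the review note.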

dimakis commented Jun 20, 2026

Centaur Review

Found 8 issue(s) (5 warning).

server/image-store.ts

Solid reference-based image architecture that avoids base64 over WS. Main concerns: no global memory cap on the in-memory image store, a missing clearSessionImages call in the startChat exception handler, incomplete WS message type definitions, and gaps in integration test coverage for the core image flow.

  • 🟡 unsafe_assumptions (L25): The image store is a module-level Map with no global memory cap. With 50 images/session × 10 MB/image, a single session could consume 500 MB. Multiple concurrent sessions could exhaust server memory. Consider a global cap (e.g., total bytes across all sessions) or an LRU eviction policy. [fixable]
  • 🟡 unsafe_assumptions (L37): Per-session count check iterates all images in the store on every storeImage() call — O(n) per insert. With many sessions this becomes expensive. Consider a Map<sessionId, number> counter maintained alongside the main map. [fixable]
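
The counter idea from the last bullet could be sketched as a small secondary index. `SessionImageCounter` and its method names are hypothetical; the limit of 50 is the per-session cap this PR introduced.

```typescript
const MAX_IMAGES_PER_SESSION = 50;

// Tracks image counts per session so the cap check is O(1) instead of a full scan.
class SessionImageCounter {
  private counts = new Map<string, number>();

  // Reserves a slot for one more image; returns false when the session is at its cap.
  tryReserve(sessionId: string): boolean {
    const n = this.counts.get(sessionId) ?? 0;
    if (n >= MAX_IMAGES_PER_SESSION) return false;
    this.counts.set(sessionId, n + 1);
    return true;
  }

  // Call when a single image is removed from the main map.
  release(sessionId: string): void {
    const n = this.counts.get(sessionId) ?? 0;
    if (n <= 1) this.counts.delete(sessionId);
    else this.counts.set(sessionId, n - 1);
  }

  // Call from session cleanup, next to clearing the session's images.
  clear(sessionId: string): void {
    this.counts.delete(sessionId);
  }

  count(sessionId: string): number {
    return this.counts.get(sessionId) ?? 0;
  }
}
```

The trade-off is a second structure that must stay in sync with the main map on every insert, delete, and session cleanup.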

server/chat.ts

  • 🟡 bugs (L1067): The exception handler in _startChatInner (catch block at line 1054) calls cleanupSessionWorktrees and registry.abort but does not call clearSessionImages. If images were stored before the failure, they leak. All other cleanup paths include clearSessionImages. [fixable]

frontend/src/types/ws-messages.ts

  • 🟡 regressions (L73): ToolResultMsg and SubagentToolResultMsg interfaces are not updated with the optional images?: ToolResultImage[] field. The runtime works (protocol-parser uses type assertions), but the type definitions are incomplete — IDE autocomplete and future type-checked code will miss the field. [fixable]
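
A typed message with the optional field, plus a runtime guard in place of a bare type assertion, might look like the sketch below. Only `images?: ToolResultImage[]` comes from the review; the other `ToolResultMsg` fields and the `readImages` helper are assumptions.

```typescript
interface ToolResultImage {
  id: string;
  mediaType: string;
}

// Assumed message shape; only the optional images field is taken from the review.
interface ToolResultMsg {
  type: "tool_result";
  toolUseId: string;
  result: string;
  isError?: boolean;
  images?: ToolResultImage[];
}

function isToolResultImage(v: unknown): v is ToolResultImage {
  if (typeof v !== "object" || v === null) return false;
  const o = v as { id?: unknown; mediaType?: unknown };
  return typeof o.id === "string" && typeof o.mediaType === "string";
}

// Reads image refs from an untrusted WS payload, dropping malformed entries.
function readImages(msg: { images?: unknown }): ToolResultImage[] | undefined {
  if (!Array.isArray(msg.images)) return undefined;
  const valid = msg.images.filter(isToolResultImage);
  return valid.length > 0 ? valid : undefined;
}

const example: ToolResultMsg = {
  type: "tool_result",
  toolUseId: "t1",
  result: "",
  images: [{ id: "img-1", mediaType: "image/png" }],
};
const parsed = readImages(example);
```

Declaring the field on the interface restores autocomplete, and the guard covers the parser path that currently relies on assertions.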

server/query-loop.ts

  • 🟡 missing_tests (L1262): No test covers the integration path: SDK tool_result with image content → extractToolResultImages → storeImage → imageRefs emitted in WS event. The existing query-loop tests only use string content for tool results. This is the core new behavior and should be tested. [fixable]
  • 🔵 style (L1268): The two sequential if (resultImages.length > 0 && ...) checks could be a single if/else block for clarity — if (resultImages.length > 0) { if (!sid) { log.warn(...) } else { ... } }. The current structure makes it non-obvious that exactly one branch fires. [fixable]

server/app.ts

  • 🔵 missing_tests (L1364): No integration test for GET /api/images/:imageId — the 404 path, correct Content-Type/Content-Length headers, and Cache-Control behavior are untested. [fixable]
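
The behaviors listed above can be pinned down with a framework-agnostic sketch of the handler logic. `respondWithImage` is a hypothetical helper, the `Cache-Control` value is a guess, and auth is not modeled because the real route sits behind middleware.

```typescript
interface StoredImage {
  mediaType: string;
  data: Uint8Array;
}

interface ImageResponse {
  status: number;
  headers: Record<string, string>;
  body?: Uint8Array;
}

// Pure version of the GET /api/images/:imageId logic, easy to unit-test.
function respondWithImage(store: Map<string, StoredImage>, imageId: string): ImageResponse {
  const img = store.get(imageId);
  if (!img) return { status: 404, headers: {} };
  return {
    status: 200,
    headers: {
      "Content-Type": img.mediaType, // validated against the allowlist at store time
      "Content-Length": String(img.data.length),
      "Cache-Control": "private, max-age=3600", // assumed policy, not taken from the PR
    },
    body: img.data,
  };
}
```

A route test would then assert the same three things over HTTP: 404 for an unknown id, matching headers for a stored image, and rejection when unauthenticated.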

packages/client/src/slices/messages.ts

  • 🔵 missing_tests (L334): No test for the messages reducer handling TOOL_RESULT with images — patchToolResult's image propagation to both current and finished messages is exercised only through the protocol parser, not unit-tested directly. [fixable]

@dimakis dimakis merged commit f01e16c into main Jun 20, 2026
1 check passed