feat(images): reference-based image rendering for tool results#376
Conversation
dimakis
left a comment
There was a problem hiding this comment.
Centaur Review
Found 7 issue(s) (2 critical) (2 warning).
packages/protocol/src/content-blocks.ts
Critical type mismatch: extractToolResultImages returns objects shaped { data, mediaType } but is typed as ToolResultImage[] which expects { id, mediaType } — this won't pass tsc. Additionally, clearSessionImages is never called, creating an unbounded memory leak.
- 🔴 bugs (L55):
extractToolResultImagesdeclares return typeToolResultImage[](which has{ id: string; mediaType: string }), but pushes{ data: string, mediaType: string }. Theidfield is missing anddatadoesn't exist on the type. This is a compile error. A separate intermediate type (e.g.{ data: string; mediaType: string }) is needed for extracted raw images, distinct from the reference-basedToolResultImage.[fixable]
server/query-loop.ts
Critical type mismatch: extractToolResultImages returns objects shaped { data, mediaType } but is typed as ToolResultImage[] which expects { id, mediaType } — this won't pass tsc. Additionally, clearSessionImages is never called, creating an unbounded memory leak.
- 🔴 bugs (L1271):
img.datais accessed on aToolResultImagewhich has nodataproperty (onlyidandmediaType). This works at runtime because the actual object fromextractToolResultImageshasdata, but it's a type error and will failtsc. Same root cause as the type mismatch in content-blocks.ts.[fixable]
server/image-store.ts
Critical type mismatch: extractToolResultImages returns objects shaped { data, mediaType } but is typed as ToolResultImage[] which expects { id, mediaType } — this won't pass tsc. Additionally, clearSessionImages is never called, creating an unbounded memory leak.
- 🟡 unsafe_assumptions:
clearSessionImagesis defined but never called anywhere in the codebase. Images accumulate in the in-memory Map indefinitely with no TTL, no max-size cap, and no cleanup on session close/hide. For a long-running server this is an unbounded memory leak. WireclearSessionImagesintocloseSessionByUser(chat.ts:1342) and/or session abort/cleanup paths.[fixable] - 🟡 unsafe_assumptions (L29):
storeImageaccepts arbitrarily large base64 payloads with no size limit. A single large image (e.g. a high-res screenshot returned by a tool) could decode to tens of megabytes of Buffer. Consider a per-image size cap or at minimum a total store size budget.[fixable]
server/__tests__/image-store.test.ts
Critical type mismatch: extractToolResultImages returns objects shaped { data, mediaType } but is typed as ToolResultImage[] which expects { id, mediaType } — this won't pass tsc. Additionally, clearSessionImages is never called, creating an unbounded memory leak.
- 🔵 missing_tests (L1):
beforeEachis imported from vitest but never used. The module-levelimagesMap leaks state across tests. Add a reset hook or export a_resetForTesthelper to ensure test isolation. Currently the clear-by-session test (line 41) relies on no prior test having used 'session-1'.[fixable]
server/app.ts
Critical type mismatch: extractToolResultImages returns objects shaped { data, mediaType } but is typed as ToolResultImage[] which expects { id, mediaType } — this won't pass tsc. Additionally, clearSessionImages is never called, creating an unbounded memory leak.
- 🔵 missing_tests (L1343): No integration test for the
GET /api/images/:imageIdendpoint — covering the 404 case, successful retrieval with correct Content-Type, and auth rejection when unauthenticated.[fixable] - 🔵 unsafe_assumptions (L1343): The image endpoint serves any image to any authenticated user regardless of session ownership. If multi-user access is ever added, an attacker who guesses a UUID can read another user's tool result images. Low risk today (UUIDs are unguessable and single-user), but worth noting.
[fixable]
| const images: ToolResultImage[] = []; | ||
| for (const c of content) { | ||
| if (c.type === 'image' && c.source?.type === 'base64' && c.source.data && c.source.media_type) { | ||
| images.push({ data: c.source.data, mediaType: c.source.media_type }); |
There was a problem hiding this comment.
🔴 bugs: extractToolResultImages declares return type ToolResultImage[] (which has { id: string; mediaType: string }), but pushes { data: string, mediaType: string }. The id field is missing and data doesn't exist on the type. This is a compile error. A separate intermediate type (e.g. { data: string; mediaType: string }) is needed for extracted raw images, distinct from the reference-based ToolResultImage. [fixable]
| if (resultImages.length > 0 && sid) { | ||
| imageRefs = []; | ||
| for (const img of resultImages) { | ||
| const imgId = storeImage(sid, img.data, img.mediaType); |
There was a problem hiding this comment.
🔴 bugs: img.data is accessed on a ToolResultImage which has no data property (only id and mediaType). This works at runtime because the actual object from extractToolResultImages has data, but it's a type error and will fail tsc. Same root cause as the type mismatch in content-blocks.ts. [fixable]
| ): string | null { | ||
| if (!ALLOWED_MEDIA_TYPES.has(mediaType)) return null; | ||
| const id = randomUUID(); | ||
| images.set(id, { |
There was a problem hiding this comment.
🟡 unsafe_assumptions: storeImage accepts arbitrarily large base64 payloads with no size limit. A single large image (e.g. a high-res screenshot returned by a tool) could decode to tens of megabytes of Buffer. Consider a per-image size cap or at minimum a total store size budget. [fixable]
| @@ -0,0 +1,46 @@ | |||
| import { describe, it, expect, beforeEach } from 'vitest'; | |||
There was a problem hiding this comment.
🔵 missing_tests: beforeEach is imported from vitest but never used. The module-level images Map leaks state across tests. Add a reset hook or export a _resetForTest helper to ensure test isolation. Currently the clear-by-session test (line 41) relies on no prior test having used 'session-1'. [fixable]
| } | ||
| }); | ||
|
|
||
| // ── Tool result images (served from in-memory store) ────────────────────── |
There was a problem hiding this comment.
🔵 missing_tests: No integration test for the GET /api/images/:imageId endpoint — covering the 404 case, successful retrieval with correct Content-Type, and auth rejection when unauthenticated. [fixable]
| } | ||
| }); | ||
|
|
||
| // ── Tool result images (served from in-memory store) ────────────────────── |
There was a problem hiding this comment.
🔵 unsafe_assumptions: The image endpoint serves any image to any authenticated user regardless of session ownership. If multi-user access is ever added, an attacker who guesses a UUID can read another user's tool result images. Low risk today (UUIDs are unguessable and single-user), but worth noting. [fixable]
Centaur ReviewFound 7 issue(s) (3 warning).
|
Messages queued in connection.pendingSends during WS downtime (iOS background) were flushed to the wrong session after reconnect. Add clearPendingSends() and call it from newSession() and switchSession(). Co-Authored-By: Claude Opus 4.6 <[email protected]>
…dant comment Co-Authored-By: Claude Opus 4.6 <[email protected]>
When Claude reads an image file (JPEG, PNG, GIF, WebP) or reads a generated plot, the SDK returns image content blocks that were previously discarded. This adds end-to-end support for extracting and rendering those images inline in the ToolPill component. Data flow: SDK tool_result → extractToolResultImages() → WS event → protocol parser → messages reducer → ToolPill <img> render. Also handles subagent tool results with images. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Co-Authored-By: Claude Opus 4.6 <[email protected]>
Store tool result images server-side and serve via REST endpoint
instead of sending base64 data over WebSocket. The WS protocol
carries lightweight {id, mediaType} refs; frontend fetches images
via GET /api/images/:id with browser caching.
Supports JPEG, PNG, GIF, and WebB — covers screenshots, photos,
and generated plots (matplotlib etc) read by the Read tool.
Co-Authored-By: Claude Opus 4.6 <[email protected]>
…lation - Split ToolResultImage into RawToolResultImage (base64 extraction) and ToolResultImage (wire-format reference) to fix tsc errors - Add 10 MB per-image size cap in storeImage - Wire clearSessionImages into stopChat and closeSessionByUser - Add _resetForTest helper and beforeEach in tests for isolation - Add test for oversized image rejection Co-Authored-By: Claude Opus 4.6 <[email protected]>
840a329 to
841702d
Compare
dimakis
left a comment
There was a problem hiding this comment.
Centaur Review
Found 6 issue(s) (3 warning).
server/chat.ts
Solid reference-based image architecture that avoids base64 over WebSocket. Main concern is incomplete session cleanup — _closeoutSessionInner and abort listeners in closeSessionByUser don't call clearSessionImages, causing memory leaks for abandoned/timed-out sessions. Also needs a global store size cap and more test coverage for the image data path through the client reducer and protocol parser.
- 🟡 bugs: Memory leak:
_closeoutSessionInner(abandoned/detached sessions) never callsclearSessionImages. Both the early-return path (no inputQueue, line 1345–1359) and the abort listener (line 1381–1398) skip image cleanup. Abandoned sessions will leak stored images in the in-memory Map until server restart. The same gap exists incloseSessionByUser's abort listener (line 1460–1471) — when the agent is active and eventually timed out, images are not freed.[fixable]
server/image-store.ts
Solid reference-based image architecture that avoids base64 over WebSocket. Main concern is incomplete session cleanup — _closeoutSessionInner and abort listeners in closeSessionByUser don't call clearSessionImages, causing memory leaks for abandoned/timed-out sessions. Also needs a global store size cap and more test coverage for the image data path through the client reducer and protocol parser.
- 🟡 unsafe_assumptions: No global memory cap on the image store. Individual images are capped at 10 MB, but there is no limit on total image count or total memory usage. A session producing many tool result images (e.g., repeated screenshot tools) could grow the Map unboundedly. Consider adding a max-entry or max-total-bytes limit with LRU eviction, or at minimum a per-session cap.
[fixable]
packages/client/__tests__/messages-slice.test.ts
Solid reference-based image architecture that avoids base64 over WebSocket. Main concern is incomplete session cleanup — _closeoutSessionInner and abort listeners in closeSessionByUser don't call clearSessionImages, causing memory leaks for abandoned/timed-out sessions. Also needs a global store size cap and more test coverage for the image data path through the client reducer and protocol parser.
- 🟡 missing_tests: No test coverage for the
imagesfield flowing throughpatchToolResultor theTOOL_RESULTreducer action. The existing TOOL_RESULT tests don't passimagesand don't asserttoolResultImageson the patched block. At minimum, add a test that dispatches TOOL_RESULT with images and verifiestoolResultImagesis set on bothcurrentblocks and finalizedmessages.[fixable]
server/app.ts
Solid reference-based image architecture that avoids base64 over WebSocket. Main concern is incomplete session cleanup — _closeoutSessionInner and abort listeners in closeSessionByUser don't call clearSessionImages, causing memory leaks for abandoned/timed-out sessions. Also needs a global store size cap and more test coverage for the image data path through the client reducer and protocol parser.
- 🔵 missing_tests: No integration test for the
GET /api/images/:imageIdendpoint. The image-store unit tests cover the store itself, but there's no test verifying the HTTP endpoint returns correct Content-Type, Content-Length, 404 for missing images, or that auth middleware protects it.[fixable]
packages/client/__tests__/protocol-parser.test.ts
Solid reference-based image architecture that avoids base64 over WebSocket. Main concern is incomplete session cleanup — _closeoutSessionInner and abort listeners in closeSessionByUser don't call clearSessionImages, causing memory leaks for abandoned/timed-out sessions. Also needs a global store size cap and more test coverage for the image data path through the client reducer and protocol parser.
- 🔵 missing_tests: The protocol parser test for
tool_result(line 256) does not cover theimagesfield. Add a test that passesimages: [{id: '...', mediaType: 'image/png'}]in the WS message and verifies it appears in the dispatched TOOL_RESULT action.[fixable]
frontend/src/components/ToolPill.tsx
Solid reference-based image architecture that avoids base64 over WebSocket. Main concern is incomplete session cleanup — _closeoutSessionInner and abort listeners in closeSessionByUser don't call clearSessionImages, causing memory leaks for abandoned/timed-out sessions. Also needs a global store size cap and more test coverage for the image data path through the client reducer and protocol parser.
- 🔵 style: Minor behavior change: previously, an empty-string
toolResult("") would render a CodeBlock; nowhasTextrequires.length > 0, so empty results are hidden. This is likely intentional (no point showing an empty code block), but worth noting as a subtle contract change.
- Add clearSessionImages to all abort/closeout paths (memory leak fix) - Add per-session image count limit (50 images) - Log warning when tool result images are dropped due to missing sessionId - Add test for per-session image cap Co-Authored-By: Claude Opus 4.6 <[email protected]>
Centaur ReviewFound 8 issue(s) (1 critical) (4 warning).
|
Centaur ReviewFound 7 issue(s) (3 warning).
|
Centaur ReviewFound 8 issue(s) (5 warning).
|
Summary
{id, mediaType}refs, frontend fetches viaGET /api/images/:idwith browser cachingtype: 'image'content blocks from Claude SDK tool results (previously discarded byextractToolResultText)Changes
server/image-store.ts— in-memory image store with session-scoped cleanupserver/app.ts— REST endpointGET /api/images/:imageIdserver/query-loop.ts— extract images from tool results, store server-side, emit refspackages/protocol/—ToolResultImagetype,extractToolResultImages()function, updated block typespackages/client/— protocol parser + messages slice handle image refsfrontend/src/components/ToolPill.tsx— renders images via REST URLTest plan
extractToolResultImages(6 cases)image-store(5 cases: store/retrieve, MIME validation, session cleanup)🤖 Generated with Claude Code