fix: resolve all 35 issues from adversarial code review by ejwhite7 · Pull Request #7 · ejwhite7/ruby-pi

ejwhite7 · 2026-04-30T15:05:46Z

Review Remediation: Adversarial Review Round 2

This PR addresses 12 defects and incomplete items identified during the second adversarial review of the ruby-pi gem.

New Defects Fixed

String interpolation escaped in ProviderError (anthropic.rb:520-525)
- Backslash-escaped #{} in ProviderError message prevented actual tool name and parser error from appearing
- Fix: Removed backslash escaping so interpolation works correctly
Thread-unsafe streaming instance variables (anthropic.rb)
- @_stream_* instance variables leaked state across concurrent requests
- Fix: Replaced with method-local variables via process_anthropic_stream_event helper
Per-agent config threading (agent/core.rb, all providers)
- config: kwarg was decorative -- providers still read RubyPi.configuration directly
- Fix: BaseProvider now accepts config: kwarg; all providers use passed-in config
Streaming error body recovery (all providers)
- on_data callback consumed error response bodies, leaving ApiError with empty body
- Fix: Detect error status in on_data, accumulate error body separately, pass to handle_error_response
Consecutive user messages after compaction (compaction.rb)
- Compaction summary as role: :user could produce consecutive user messages (rejected by Anthropic)
- Fix: Changed to role: :assistant

Issues Not Fully Addressed from V1

OpenAI missing tool_call_id (openai.rb)
- Still fell back to "unknown" instead of failing fast
- Fix: Raises ProviderError with descriptive message (same as Anthropic)
Gemini streaming finish_reason (gemini.rb)
- Hardcoded "stop" instead of reading actual finishReason
- Fix: Parses finishReason from candidate object with "STOP" -> "stop" normalization
README incorrect event keys (README.md)
- References to e[:iteration] and event[:iterations] (nonexistent keys)
- Fix: Changed to e[:turn] and event[:result].turns
CHANGELOG update (CHANGELOG.md)
- Added comprehensive v0.1.4 entry documenting all fixes
Dead parse_sse_events method (base_provider.rb)
- Removed unused method (all providers now use real incremental streaming)
faraday-net_http version cap (Gemfile, gemspec)
- Removed arbitrary < 3.4 upper bound
BufferedStreamProxy breaks streaming UX (fallback.rb)
- Events were buffered until completion even on happy path
- Fix: Stream events directly to consumer; emit :fallback_start event on primary failure

Test Results

All 444 tests pass with 0 failures.

- Fix Agent constructor: remove stream:, add required system_prompt: - Fix event payload keys: e[:data] -> e[:content], e[:name] -> e[:tool_name] - Fix result.output -> result.content, result.iterations -> result.turns - Remove result.stop_reason (does not exist) - Fix context_compaction: -> compaction:, context_transform: -> transform_context: - Fix extensions: [] -> agent.use(ExtensionClass) (takes class, not instance) - Fix Compaction constructor: add required summary_model:, remove strategy: - Fix Transform: module with factory methods, not a class - Remove :before_tool_call/:after_tool_call from events lists (they are hooks) - Add Tool vs Tools namespace clarification - Update CI matrix reference in CLAUDE.md

- Remove unused ostruct ~> 0.6 dependency (#24) - Remove arbitrary faraday-net_http < 3.4 upper bound (#25) - Update placeholder email to maintainer's noreply address (#26) - Add Ruby 3.4 to CI matrix and document supported versions (#38)

…g (#27, #33) - Extract set_defaults method called by both initialize and reset! (#27) - Support per-agent Configuration instances via config: kwarg on Agent::Core (#33) - Add effective_config method that falls back to global config - Add comprehensive configuration specs

- Add mutex protection to all read methods: find, subset, by_category, all, names, size, registered? (#29) - Replace unconditional warn to stderr with logger.debug when a logger is configured; silent otherwise (#28) - Add thread safety spec with concurrent read/write test - Update overwrite test expectations for new logging behavior

…#37) - Improve emit comment to accurately explain recursion guard behavior: errors in :error handlers are silently swallowed to prevent unbounded recursion (#30) - Add :tool_call_delta to EVENTS constant for streaming tool call data (#37) - Add specs for tool_call_delta event subscription and emission

…#31, #33, #36) - Expose @extensions as attr_reader for introspection (#31) - Accept config: kwarg for per-agent configuration override (#33) - Add effective_config method falling back to global config - Plumb execution_mode and tool_timeout through constructor to Loop (#36) - Add specs for extensions reader, per-agent config, execution options

- Serialize error as { class: error.class.name, message: error.message } instead of just error.message, preserving diagnostic context - Add spec for custom error class preservation

… (#36, #37) - Accept execution_mode (:parallel/:sequential) and tool_timeout params in Loop constructor, passed through from Agent::Core (#36) - Emit :tool_call_delta event when provider yields tool-call streaming data during the think phase (#37) - Add specs for configurable execution options and tool_call_delta events

- Clarify that the empty text block is needed when assistant turns contain only tool_use calls with no accompanying text, to satisfy Anthropic's non-empty content constraint

…poisoning Compaction previously wrote the conversation summary as role: :system. When Loop#build_llm_messages prepended the real system prompt (also role: :system), two system messages existed in the array. The Anthropic provider extracts a single system_message by overwriting on each match (last wins), so the compaction summary silently replaced the actual system prompt. Fix: Changed compaction summary from role: :system to role: :user with a clear [Conversation Summary] prefix. The system prompt now always remains the authoritative system message. Tests updated to verify: - Summary uses role :user, not :system - No :system messages in compacted output - Only one :system message when loop prepends the real prompt

The retry loop incremented 'attempt' before the call and used 'attempt < max_retries' as the retry condition. With max_retries: 3, this gave only 2 retries (3 total attempts) instead of the expected 3 retries (4 total attempts). Fix: Changed condition to 'attempt <= max_retries' so max_retries: N means N retries after the initial failure, for N+1 total attempts. Tests added: - Verifies exactly 4 requests with max_retries: 3 (via WebMock count) - Verifies success on the last retry attempt (4th of 4)

All three providers (Anthropic, OpenAI, Gemini) previously did: response = conn.post(...) parse_sse_events(response.body) With net_http adapter and no on_data callback, Faraday buffers the entire response body before returning. No deltas reached the caller until the model finished generating -- fake streaming. Fix: Each provider's streaming method now uses req.options.on_data to process SSE chunks incrementally as they arrive from the API. An sse_buffer accumulates partial lines across chunks, and complete SSE events are parsed and yielded to the caller's block immediately. Tests added for all three providers verifying incremental event delivery.

Fallback < BaseProvider inherited the retry wrapper from BaseProvider#complete. Inside perform_complete, it called @primary.complete (which retries internally) then @fallback.complete (also retries). The retry layers composed multiplicatively: outer_retries x (primary_retries + fallback_retries). Fix: Override #complete in Fallback to call perform_complete directly, skipping the outer retry wrapper. Each inner provider handles its own retries independently. Tests added: - Verifies exact request counts (4 primary + 4 fallback, not multiplied) - Verifies no outer retry loop re-triggers the primary+fallback cycle

inject_datetime, inject_user_preferences, and inject_workspace_context all used state.system_prompt += to append content. Since transforms run before every LLM call in the loop, across N turns the system prompt accumulated N timestamps, N preference blocks, and N workspace contexts until it blew the context window. Fix: Each injection type now uses unique HTML comment markers. Before re-adding, the transform strips any existing injection matching its markers. This makes all injections idempotent. Tests added: - Each inject method called 5x produces exactly 1 injection - Base prompt preserved after multiple calls - Composed transform called 10x produces exactly 1 of each injection - nil preferences correctly strip previous injection

Issues fixed: - #9: Replace unsafe Timeout.timeout with thread+join for sequential mode - #10: Fix Future#value(timeout) nil misinterpretation using wait+fulfilled? - #11: Cancel/interrupt leaked future threads after timeout - #12: Guard JSON.parse against empty-string arguments in OpenAI/Anthropic - #13: Move Gemini API key from URL query string to x-goog-api-key header - #14: Rename NotImplementedError to AbstractMethodError to avoid shadowing stdlib - #15: Guard ToolCall#parse_arguments against non-string/non-hash inputs - #16: Add State#reset_iteration! method; reset counter in run() and continue() - #17: Guard against nil tools registry; raise NoToolsRegisteredError - #18: Re-raise programming errors (NoMethodError, NameError, etc.) in Loop - #19: Set stop_reason and error on max_iterations; success? returns false - #20: Remove dead faraday-retry dependency; fix build_connection docstring - #21: Add stream_options for OpenAI streaming usage data - #22: Guard Anthropic streaming JSON.parse with rescue JSON::ParserError - #23: Buffer streaming deltas in Fallback to prevent double-emit All fixes include comprehensive tests. Full test suite passes (398 examples, 0 failures).

…xes)

…remediation Consolidates all three feature branches: - fix/critical-correctness (Issues #2-#8) - fix/major-reliability (Issues #9-#23) - fix/docs-and-hygiene (Issues #24-#38) Conflicts resolved by integrating both sets of changes: - anthropic.rb: kept real streaming (critical) + tool_use_id validation & JSON guard (major) - openai.rb: kept real streaming (critical) + Issue #21 streaming usage capture (major) - fallback.rb: kept both comment blocks (no-extra-retry + streaming buffer docs) - result.rb: kept richer error hash (docs) + stop_reason/truncated fields (critical) - Gemfile/gemspec: kept stricter version pins from critical-correctness

- Fixed 3 Gemini test stubs using old ?key= query parameter URL pattern (provider now sends API key via X-Goog-Api-Key header) - Synced Gemfile.lock with gemspec (removed phantom ostruct dependency)

ejwhite7 · 2026-04-30T16:06:02Z

This PR has been superseded by #8 which addresses 12 additional defects from adversarial review round 2.

ejwhite7 and others added 19 commits April 30, 2026 14:44

fix: Result#to_h preserves error class name in serialization (#32)

27dc9de

- Serialize error as { class: error.class.name, message: error.message } instead of just error.message, preserving diagnostic context - Add spec for custom error class preservation

fix: improve Anthropic empty content block comment (#34)

79df5f6

- Clarify that the empty text block is needed when assistant turns contain only tool_use calls with no accompanying text, to satisfy Anthropic's non-empty content constraint

chore: update json gem to 2.7.2 for build compatibility

c811ebe

Merge fix/critical-correctness: Issues #2-#8 (critical correctness fi…

ef8c121

…xes)

fix: resolve test failures from branch merge

d8ddc1c

- Fixed 3 Gemini test stubs using old ?key= query parameter URL pattern (provider now sends API key via X-Goog-Api-Key header) - Synced Gemfile.lock with gemspec (removed phantom ostruct dependency)

ejwhite7 force-pushed the fix/review-remediation branch from 5da5e90 to d8ddc1c Compare April 30, 2026 15:25

ejwhite7 merged commit 99f55fa into main Apr 30, 2026
3 checks passed

ejwhite7 deleted the fix/review-remediation branch April 30, 2026 15:27

ejwhite7 mentioned this pull request Apr 30, 2026

fix: adversarial review remediation — release v0.1.5 (rounds 2 + 3) #8

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: resolve all 35 issues from adversarial code review#7

fix: resolve all 35 issues from adversarial code review#7
ejwhite7 merged 19 commits into
mainfrom
fix/review-remediation

ejwhite7 commented Apr 30, 2026 •

edited

Loading

Uh oh!

Uh oh!

ejwhite7 commented Apr 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ejwhite7 commented Apr 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review Remediation: Adversarial Review Round 2

New Defects Fixed

Issues Not Fully Addressed from V1

Test Results

Uh oh!

Uh oh!

ejwhite7 commented Apr 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ejwhite7 commented Apr 30, 2026 •

edited

Loading