From c50f8d1956e612a0a8319b2612d7c966d21b0386 Mon Sep 17 00:00:00 2001 From: EJ White Date: Tue, 9 Jun 2026 14:26:09 -0400 Subject: [PATCH] =?UTF-8?q?fix:=20adversarial=20review=20round=206=20?= =?UTF-8?q?=E2=80=94=20release=20v0.1.8?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Fixes all 13 findings from review round 6: High: - Honor server Retry-After on 429 (capped at 60s) instead of always using local exponential backoff (base_provider.rb) - Parallel executor timeout/rejection Results carry the real tool name instead of "unknown"; rejected futures no longer report a misleading full-timeout duration (executor.rb) - Tool blocks with keyword parameters now work — arguments are splatted to keywords when the block declares them (definition.rb) Medium: - Wire the Loop's emitter into Compaction so the documented :compaction event actually fires (loop.rb) - Buffer SSE chunks as BINARY and re-encode complete lines to UTF-8 in all three providers; streaming deltas are now always valid UTF-8 - :fallback_start payload includes partial_output/partial_chars so consumers can truncate partial primary text (fallback.rb) - Validate tool names against provider constraints at definition time - Declare json as a runtime dependency and require it in every file that uses it (composability; pinned by a source-scan spec) Low/docs: - Configuration rejects negative numeric settings; eager-init the global configuration to remove the first-access race - Document continue() per-invocation Result accounting, schema-as-hints (not validation), and unbounded message growth without compaction - Correct CLAUDE.md module map (agent/agent.rb never existed), version reference, and extension handler signature - CHANGELOG note: 0.1.4 was never actually released (version.rb went 0.1.3 → 0.1.5); created the missing v0.1.7 tag locally 22 new regression specs in spec/ruby_pi/fixes/review_round6_spec.rb. Suite: 492 examples, 0 failures. Co-Authored-By: Claude Fable 5 --- CHANGELOG.md | 23 ++ CLAUDE.md | 24 +- Gemfile.lock | 3 +- lib/ruby_pi.rb | 7 + lib/ruby_pi/agent/core.rb | 6 + lib/ruby_pi/agent/loop.rb | 10 + lib/ruby_pi/agent/state.rb | 6 + lib/ruby_pi/configuration.rb | 55 ++++- lib/ruby_pi/llm/anthropic.rb | 28 ++- lib/ruby_pi/llm/base_provider.rb | 25 +- lib/ruby_pi/llm/fallback.rb | 26 +- lib/ruby_pi/llm/gemini.rb | 23 +- lib/ruby_pi/llm/openai.rb | 24 +- lib/ruby_pi/llm/tool_call.rb | 2 + lib/ruby_pi/tools/definition.rb | 43 +++- lib/ruby_pi/tools/executor.rb | 20 +- lib/ruby_pi/tools/schema.rb | 10 + lib/ruby_pi/version.rb | 2 +- ruby-pi.gemspec | 4 + spec/ruby_pi/fixes/review_round6_spec.rb | 288 +++++++++++++++++++++++ 20 files changed, 578 insertions(+), 51 deletions(-) create mode 100644 spec/ruby_pi/fixes/review_round6_spec.rb diff --git a/CHANGELOG.md b/CHANGELOG.md index d9e9e05..f706f28 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,29 @@ All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). +## [0.1.8] - 2026-06-09 + +### Fixed (adversarial review round 6) + +- **`Retry-After` header was parsed but never honored (High)**: On a 429, `handle_error_response` stored the server's `Retry-After` on `RateLimitError#retry_after`, but the retry loop in `BaseProvider#complete` always slept the local exponential backoff (capped at `retry_max_delay`) — hammering a server that asked for a longer cooldown until the retry budget burned out. The retry delay now prefers a positive `retry_after` (capped at `RETRY_AFTER_CEILING`, 60s); HTTP-date values (which parse to `0.0`) and absent headers fall through to the computed backoff +- **Parallel executor timeout/rejection Results reported `name: "unknown"` (High)**: `execute_parallel` hardcoded `"unknown"` in the timeout and rejected-future branches, so with several tools timing out concurrently, logs and `:tool_execution_end` subscribers could not tell which tool hung. Futures are now zipped with their originating calls and failure Results carry the real tool name; the timeout message matches sequential mode (`Tool 'x' timed out after Ns`). The rejected-future branch also no longer reports a misleading full-timeout `duration_ms` for what may have been an instant failure (now `0.0`) +- **Keyword-parameter tool blocks failed on every call (High)**: `Definition#call` passed a single positional Hash, so a block written `{ |content:, platform:| ... }` — the natural style given named schema parameters — raised `ArgumentError: missing keyword` on every invocation (surfacing as a confusing failed Result). `Definition` now detects keyword parameters at construction and splats the arguments hash to keywords; positional-Hash blocks are unchanged. Keyword blocks without `**rest` raise on unexpected keys — strict by design, since the keys come from the LLM +- **`:compaction` event was never emitted in production (Medium)**: `Compaction#emitter` defaults to nil and nothing ever assigned it, so the documented `agent.on(:compaction)` subscription silently never fired — the only place the emitter was set was the spec itself. `Loop#initialize` now wires its emitter into the compaction strategy (an explicitly preassigned emitter is left untouched) +- **Streaming chunks were never normalized to UTF-8 (Medium)**: Faraday delivers `on_data` chunks as ASCII-8BIT; appending a chunk to a UTF-8 SSE buffer already holding non-ASCII text raises `Encoding::CompatibilityError`, and yielded deltas could carry binary encoding into consumers' UTF-8 buffers. All three providers now buffer in BINARY and re-encode each complete SSE line to UTF-8 (with `scrub` guarding invalid bytes) before parsing, so `:text_delta` events are always valid UTF-8 — including multi-byte characters split across network chunks +- **Streaming fallback gave consumers no way to truncate partial primary output (Medium)**: If the primary streamed text and then died mid-stream, the fallback streamed a complete fresh response — a delta-appending consumer rendered `""` with no signal of how much to discard. The `:fallback_start` payload now includes `partial_output` (Boolean) and `partial_chars` (characters already yielded), so consumers can deterministically reset +- **Tool names were not validated against provider constraints (Medium)**: A tool named `send.email` registered fine and then 400'd on every Anthropic request with an opaque server error. `Definition` now validates names against `/\A[a-zA-Z0-9_-]{1,64}\z/` (the strictest provider constraint) and raises `ArgumentError` at definition time with a pointed message +- **`json` was used everywhere but never declared or required (Medium)**: `JSON.parse`/`JSON.generate` are called throughout the providers and agent loop, but the gem relied on Faraday's transitive `json` dependency and the entry point's single `require "json"` — loading `agent/loop.rb` in isolation raised `NameError`, contradicting the composability principle. The gemspec now declares `json >= 2.0` and every file referencing `JSON` requires it directly (pinned by a source-scan spec) +- **Configuration accepted negative retry/timeout values (Low)**: `max_retries = -1` silently disabled retries and a negative delay raised deep inside the retry loop's `sleep`. The numeric settings now have validated writers that raise `ArgumentError` at assignment time +- **Global configuration first-access race (Low)**: `@configuration ||= Configuration.new` was unsynchronized; two threads racing the first call could each construct a Configuration with one silently discarded. The configuration is now eagerly initialized at require time +- **`continue()` Result accounting documented (Docs)**: Each `run`/`continue` builds a fresh Loop, so the returned Result's `usage`/`tool_calls_made`/`turns` cover only that invocation while `messages` is cumulative — an undocumented asymmetry, now documented on `Core#continue` +- **Schema DSL documented as LLM-facing hints, not validation (Docs)**: Nothing validates model-supplied arguments against `tool.parameters` before invoking the block — `required`/`enum`/`minimum` constrain what the model is asked to produce, with no runtime enforcement or type coercion. This is deliberate (anti-framework), but the schema header now says so loudly and directs tool blocks to treat arguments as untrusted input +- **`State#add_message` unbounded growth documented (Docs)**: Long-lived agents calling `continue()` repeatedly accumulate messages linearly without compaction configured; documented on the method +- **CLAUDE.md module map corrected (Docs)**: The map referenced a nonexistent `agent/agent.rb`, omitted `core.rb`/`loop.rb`/`state.rb`/`events.rb`, hardcoded version `0.1.0`, and the extension example used the one-arg `|event|` block signature instead of the actual `|data, agent|`. All corrected + +### Release-history note + +- **`[0.1.4]` below was never actually released**: `lib/ruby_pi/version.rb` went from `0.1.3` directly to `0.1.5` — the round-2 fixes documented under 0.1.4 shipped without a version bump and were first published as part of 0.1.5. There is intentionally no `v0.1.4` git tag or gem. (Discovered during round 6; the entry is kept for historical accuracy of *what* changed.) + ## [0.1.7] - 2026-05-28 ### Fixed (adversarial review round 5) diff --git a/CLAUDE.md b/CLAUDE.md index b228956..44bb354 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -41,7 +41,11 @@ RubyPi::Tools Tool definition and execution +-- Result Execution outcome (value or error + timing) RubyPi::Agent Think-act-observe loop - +-- Result Agent run outcome (output, messages, iterations) + |-- Core Public API: run/continue, extensions, event subscription + |-- Loop The think-act-observe cycle itself + |-- State Mutable run state (messages, iteration, hooks) + |-- Events EVENTS list + EventEmitter + +-- Result Agent run outcome (output, messages, turns) RubyPi::Context Conversation context management |-- Compaction Truncate/summarize to fit token limits @@ -59,7 +63,7 @@ RubyPi::Extensions Hook system for agent events lib/ ruby_pi.rb # Entry point, requires everything, LLM.model factory ruby_pi/ - version.rb # RubyPi::VERSION = "0.1.0" + version.rb # RubyPi::VERSION (single source of truth; see CHANGELOG.md) configuration.rb # RubyPi::Configuration (API keys, retries, defaults) errors.rb # Error hierarchy (ApiError, AuthenticationError, etc.) llm/ @@ -79,8 +83,11 @@ lib/ executor.rb # Parallel/sequential Executor result.rb # Execution Result agent/ - agent.rb # RubyPi::Agent (think-act-observe loop) - result.rb # Agent::Result (output, messages, iterations) + core.rb # Agent::Core — public API (run, continue, use, on) + loop.rb # Agent::Loop — the think-act-observe cycle + state.rb # Agent::State — messages, iteration counter, hooks + events.rb # EVENTS constant + EventEmitter + result.rb # Agent::Result (output, messages, turns) context/ compaction.rb # Context::Compaction transform.rb # Context::Transform @@ -90,8 +97,10 @@ lib/ spec/ spec_helper.rb # RSpec + WebMock setup ruby_pi/ + agent/ # Unit tests for core, loop, state, events, result llm/ # Unit tests for each LLM provider tools/ # Unit tests for tools, registry, executor + fixes/ # Regression specs pinned to review-round fixes integration/ agent_integration_spec.rb # End-to-end agent loop tests ``` @@ -188,17 +197,18 @@ The callable receives the `Agent::State` object and can mutate it (typically app ## Adding a New Extension 1. Subclass `RubyPi::Extensions::Base` -2. Use `on_event :event_name do |event| ... end` to subscribe to events +2. Use `on_event :event_name do |data, agent| ... end` to subscribe to events + (handlers receive the event payload hash and the agent instance) 3. Available events: `:turn_start`, `:turn_end`, `:text_delta`, `:tool_call_delta`, `:tool_execution_start`, `:tool_execution_end`, `:agent_end`, `:error`, `:compaction`, `:provider_fallback` Note: `:before_tool_call` and `:after_tool_call` are constructor hooks (Procs on Agent::Core), not subscribable events. 4. Register the extension: `agent.use(MyExtension)` (pass the CLASS, not an instance) ```ruby class MyExtension < RubyPi::Extensions::Base - on_event :agent_end do |event| + on_event :agent_end do |data, agent| # The :agent_end payload is { result:, success: }; the iteration count # lives on the Result object as `turns`. - puts "Agent finished after #{event[:result].turns} turns" + puts "Agent finished after #{data[:result].turns} turns" end end ``` diff --git a/Gemfile.lock b/Gemfile.lock index bb6e447..4d27ce2 100644 --- a/Gemfile.lock +++ b/Gemfile.lock @@ -1,10 +1,11 @@ PATH remote: . specs: - ruby-pi (0.1.7) + ruby-pi (0.1.8) concurrent-ruby (~> 1.2) faraday (~> 2.0) faraday-net_http (~> 3.0) + json (>= 2.0) GEM remote: https://rubygems.org/ diff --git a/lib/ruby_pi.rb b/lib/ruby_pi.rb index 0619a81..7f38367 100644 --- a/lib/ruby_pi.rb +++ b/lib/ruby_pi.rb @@ -82,6 +82,13 @@ def reset_configuration! end end + # Eagerly initialize the global configuration at load time. The lazy + # `@configuration ||= ...` in .configuration is not synchronized; two + # threads hitting it concurrently on first access could each construct a + # Configuration, with one silently discarded. Initializing here (requires + # run single-threaded) removes the race without adding a mutex to every read. + @configuration = Configuration.new + # Namespace for large language model providers and related abstractions. module LLM class << self diff --git a/lib/ruby_pi/agent/core.rb b/lib/ruby_pi/agent/core.rb index 6e61ce5..30c647a 100644 --- a/lib/ruby_pi/agent/core.rb +++ b/lib/ruby_pi/agent/core.rb @@ -140,12 +140,18 @@ def run(prompt) # the existing conversation history and appends the new prompt before # resuming the loop. # + # NOTE on Result accounting: each run/continue builds a fresh Loop, so + # the returned Result's `usage`, `tool_calls_made`, and `turns` cover + # ONLY this invocation — while `messages` is cumulative across the whole + # conversation. Sum the per-call Results if you need session totals. + # # Issue #16: Uses the encapsulated reset_iteration! method instead of # the old approach that bypassed encapsulation # and was fragile. # # @param prompt [String] the follow-up user message # @return [RubyPi::Agent::Result] the outcome of the continued run + # (usage/tool_calls_made/turns are per-invocation; messages cumulative) def continue(prompt) @state.reset_iteration! @state.add_message(role: :user, content: prompt) diff --git a/lib/ruby_pi/agent/loop.rb b/lib/ruby_pi/agent/loop.rb index b6f0d44..5e34506 100644 --- a/lib/ruby_pi/agent/loop.rb +++ b/lib/ruby_pi/agent/loop.rb @@ -10,6 +10,8 @@ # is reached. It handles streaming, lifecycle events, compaction, and all # pre/post tool call hooks. +require "json" + module RubyPi module Agent # Executes the think-act-observe cycle against a given State, emitting @@ -55,6 +57,14 @@ def initialize(state:, emitter:, compaction: nil, execution_mode: :parallel, too @state = state @emitter = emitter @compaction = compaction + # Wire the loop's emitter into the compaction strategy so the + # documented :compaction event actually reaches agent subscribers. + # Compaction#emitter defaults to nil and nothing else ever sets it — + # without this, `agent.on(:compaction)` never fires. An emitter that + # was already assigned explicitly is left untouched. + if @compaction.respond_to?(:emitter=) && @compaction.respond_to?(:emitter) && @compaction.emitter.nil? + @compaction.emitter = emitter + end @execution_mode = execution_mode @tool_timeout = tool_timeout @tool_calls_made = [] diff --git a/lib/ruby_pi/agent/state.rb b/lib/ruby_pi/agent/state.rb index 01b804c..1dd42dd 100644 --- a/lib/ruby_pi/agent/state.rb +++ b/lib/ruby_pi/agent/state.rb @@ -91,6 +91,12 @@ def initialize( # Appends a message to the conversation history. # + # NOTE: history grows without bound — there is no built-in cap. Growth + # per run is limited by max_iterations, but long-lived agents that call + # continue() repeatedly (or use a high max_iterations with large tool + # outputs) accumulate messages linearly. Configure + # Agent.new(compaction: ...) to keep the context bounded. + # # @param role [Symbol, String] the message role (:user, :assistant, :system, :tool) # @param content [String, nil] the text content of the message # @param options [Hash] additional fields (e.g., :tool_call_id, :tool_calls) diff --git a/lib/ruby_pi/configuration.rb b/lib/ruby_pi/configuration.rb index 1460954..8a4b05f 100644 --- a/lib/ruby_pi/configuration.rb +++ b/lib/ruby_pi/configuration.rb @@ -37,19 +37,53 @@ class Configuration attr_accessor :openai_api_key # @return [Integer] Maximum number of retry attempts for transient errors (default: 3) - attr_accessor :max_retries + attr_reader :max_retries # @return [Float] Base delay in seconds for exponential backoff (default: 1.0) - attr_accessor :retry_base_delay + attr_reader :retry_base_delay # @return [Float] Maximum delay in seconds between retries (default: 30.0) - attr_accessor :retry_max_delay + attr_reader :retry_max_delay # @return [Integer] HTTP request timeout in seconds (default: 120) - attr_accessor :request_timeout + attr_reader :request_timeout # @return [Integer] HTTP connection open timeout in seconds (default: 10) - attr_accessor :open_timeout + attr_reader :open_timeout + + # Validated writers for numeric settings. A negative max_retries silently + # disables retries and a negative delay raises deep inside the retry + # loop's sleep — fail fast at assignment time instead, where the typo is. + + # @param value [Integer] must be a non-negative integer + def max_retries=(value) + validate_numeric!(:max_retries, value) + @max_retries = value + end + + # @param value [Numeric] must be non-negative + def retry_base_delay=(value) + validate_numeric!(:retry_base_delay, value) + @retry_base_delay = value + end + + # @param value [Numeric] must be non-negative + def retry_max_delay=(value) + validate_numeric!(:retry_max_delay, value) + @retry_max_delay = value + end + + # @param value [Numeric] must be non-negative + def request_timeout=(value) + validate_numeric!(:request_timeout, value) + @request_timeout = value + end + + # @param value [Numeric] must be non-negative + def open_timeout=(value) + validate_numeric!(:open_timeout, value) + @open_timeout = value + end # @return [String] Default model name for Gemini provider attr_accessor :default_gemini_model @@ -78,6 +112,17 @@ def reset! private + # Raises unless the value is a non-negative Numeric. + # + # @param name [Symbol] the setting name (for the error message) + # @param value [Object] the value being assigned + # @raise [ArgumentError] if value is not a Numeric or is negative + def validate_numeric!(name, value) + return if value.is_a?(Numeric) && value >= 0 + + raise ArgumentError, "#{name} must be a non-negative number, got #{value.inspect}" + end + # Sets all configuration ivars to their default values. Called by both # initialize and reset! to ensure consistent defaults without the # anti-pattern of calling initialize from reset!. diff --git a/lib/ruby_pi/llm/anthropic.rb b/lib/ruby_pi/llm/anthropic.rb index c272bdc..b161810 100644 --- a/lib/ruby_pi/llm/anthropic.rb +++ b/lib/ruby_pi/llm/anthropic.rb @@ -6,6 +6,8 @@ # the Anthropic Messages API for both synchronous and streaming completions, # including tool_use block support. +require "json" + module RubyPi module LLM # Anthropic Claude provider implementation. Communicates with the Anthropic @@ -370,12 +372,19 @@ def perform_streaming_request(body, &block) # process complete lines incrementally so that deltas reach the caller # as soon as each SSE event is fully received — not after the entire # response has been buffered. - sse_buffer = +"" + # + # The buffer is BINARY because chunks arrive as ASCII-8BIT and may end + # mid-way through a multi-byte UTF-8 character; appending such a chunk + # to a UTF-8 buffer that already holds non-ASCII text raises + # Encoding::CompatibilityError. Each complete line is re-encoded to + # UTF-8 (and scrubbed) before parsing, so deltas reach the caller as + # valid UTF-8 strings. + sse_buffer = (+"").force_encoding(Encoding::BINARY) response_status = nil # Accumulate error response body separately so ApiError gets the # full body even though on_data consumed the chunks. - error_body = +"" + error_body = (+"").force_encoding(Encoding::BINARY) response = with_transport_errors do conn.post("/v1/messages") do |req| @@ -394,14 +403,17 @@ def perform_streaming_request(body, &block) # calls on_data for error responses too, which would otherwise # consume the body and leave response.body empty. if response_status && response_status >= 400 - error_body << chunk + error_body << chunk.b next end - sse_buffer << chunk - # Process all complete lines in the buffer + sse_buffer << chunk.b + # Process all complete lines in the buffer. A complete line holds + # complete UTF-8 sequences (multi-byte characters split across + # chunks are repaired by the buffering), so re-encode it to UTF-8 + # here; scrub guards against a server sending invalid bytes. while (line_end = sse_buffer.index("\n")) - line = sse_buffer.slice!(0, line_end + 1).strip + line = sse_buffer.slice!(0, line_end + 1).force_encoding(Encoding::UTF_8).scrub.strip next if line.empty? next unless line.start_with?("data: ") @@ -436,12 +448,12 @@ def perform_streaming_request(body, &block) unless response.success? # Reconstruct the response body from what on_data accumulated error_response = response - error_body_str = error_body.empty? ? response.body : error_body + error_body_str = error_body.empty? ? response.body : error_body.force_encoding(Encoding::UTF_8).scrub handle_error_response(error_response, override_body: error_body_str) end # Process any remaining data in the buffer after the connection closes - sse_buffer.each_line do |line| + sse_buffer.force_encoding(Encoding::UTF_8).scrub.each_line do |line| line = line.strip next if line.empty? next unless line.start_with?("data: ") diff --git a/lib/ruby_pi/llm/base_provider.rb b/lib/ruby_pi/llm/base_provider.rb index 385ba2d..2afd179 100644 --- a/lib/ruby_pi/llm/base_provider.rb +++ b/lib/ruby_pi/llm/base_provider.rb @@ -94,7 +94,7 @@ def complete(messages:, tools: [], stream: false, &block) # 3 retries + 1 initial = 4 total attempts. Previously used `< @max_retries` # which was off-by-one (only 2 retries with max_retries: 3). if attempt <= @max_retries - delay = calculate_backoff(attempt) + delay = retry_delay_for(e, attempt) log_retry(attempt, delay, e) sleep(delay) retry @@ -136,6 +136,29 @@ def perform_complete(messages:, tools:, stream:, &block) raise RubyPi::AbstractMethodError, :perform_complete end + # Maximum delay (seconds) honored from a server-provided Retry-After + # header. Caps pathological or misconfigured server values so a single + # 429 cannot stall the client indefinitely. + RETRY_AFTER_CEILING = 60.0 + + # Picks the delay before the next retry. A server-provided Retry-After + # on a 429 takes precedence over the local exponential backoff: the + # server knows its own cooldown window, and retrying earlier just burns + # the retry budget against guaranteed 429s. Retry-After parsed from an + # HTTP-date (rather than delta-seconds) arrives as 0.0 and falls through + # to the computed backoff. + # + # @param error [Exception] the error that triggered the retry + # @param attempt [Integer] the current attempt number (1-based) + # @return [Float] delay in seconds + def retry_delay_for(error, attempt) + if error.is_a?(RubyPi::RateLimitError) && error.retry_after&.positive? + [error.retry_after, RETRY_AFTER_CEILING].min + else + calculate_backoff(attempt) + end + end + # Calculates the backoff delay for a given retry attempt using # exponential backoff with jitter. # diff --git a/lib/ruby_pi/llm/fallback.rb b/lib/ruby_pi/llm/fallback.rb index 0ab5638..7f36238 100644 --- a/lib/ruby_pi/llm/fallback.rb +++ b/lib/ruby_pi/llm/fallback.rb @@ -149,6 +149,19 @@ def perform_complete_without_streaming(messages:, tools:, stream:, &block) # @yield [event] the consumer's streaming block # @return [RubyPi::LLM::Response] def perform_complete_with_streaming_fallback(messages:, tools:, &block) + # Count the characters of text already delivered to the consumer from + # the primary. If the primary fails mid-stream AFTER yielding text, + # the fallback streams a complete fresh response — a consumer that + # merely appends deltas would render the primary's partial text + # followed by the full fallback text. The :fallback_start payload + # carries partial_output/partial_chars so consumers can deterministically + # truncate what they already rendered. + partial_chars = 0 + counting_block = proc do |event| + partial_chars += event.data.to_s.length if event.text_delta? + block.call(event) + end + begin # Stream primary events directly to the consumer for real-time UX. # No buffering — tokens appear immediately as they arrive. @@ -156,7 +169,7 @@ def perform_complete_with_streaming_fallback(messages:, tools:, &block) messages: messages, tools: tools, stream: true, - &block + &counting_block ) response @@ -167,12 +180,17 @@ def perform_complete_with_streaming_fallback(messages:, tools:, &block) log_fallback(e) # Signal the consumer that the primary failed mid-stream and a - # fallback provider is taking over. Consumers should use this event - # to clear any partial output from the failed primary. + # fallback provider is taking over. Consumers MUST use this event + # to clear any partial output from the failed primary: + # partial_output — true when the primary yielded any text deltas + # partial_chars — how many characters were yielded (truncate by + # this amount if appending to a shared buffer) block.call(StreamEvent.new(type: :fallback_start, data: { failed_provider: @primary.provider_name, error: e.message, - fallback_provider: @fallback.provider_name + fallback_provider: @fallback.provider_name, + partial_output: partial_chars.positive?, + partial_chars: partial_chars })) # Stream directly from the fallback to the consumer's block. diff --git a/lib/ruby_pi/llm/gemini.rb b/lib/ruby_pi/llm/gemini.rb index a5f6050..517ba3e 100644 --- a/lib/ruby_pi/llm/gemini.rb +++ b/lib/ruby_pi/llm/gemini.rb @@ -6,6 +6,7 @@ # the Gemini REST API for both synchronous and streaming completions, including # tool/function calling support. +require "json" require "securerandom" module RubyPi @@ -305,9 +306,14 @@ def perform_streaming_request(body, &block) # which may split SSE events mid-line. We accumulate a line buffer and # process complete lines incrementally so that deltas reach the caller # as soon as each SSE event is fully received. - sse_buffer = +"" + # BINARY buffer: chunks arrive as ASCII-8BIT and may end mid-way + # through a multi-byte UTF-8 character; appending such a chunk to a + # UTF-8 buffer holding non-ASCII text raises + # Encoding::CompatibilityError. Complete lines are re-encoded to + # UTF-8 (and scrubbed) before parsing. + sse_buffer = (+"").force_encoding(Encoding::BINARY) response_status = nil - error_body = +"" + error_body = (+"").force_encoding(Encoding::BINARY) response = with_transport_errors do conn.post(url) do |req| @@ -324,14 +330,17 @@ def perform_streaming_request(body, &block) # If the HTTP status indicates an error, accumulate the body for # the error handler instead of parsing it as SSE events. if response_status && response_status >= 400 - error_body << chunk + error_body << chunk.b next end - sse_buffer << chunk - # Process all complete lines in the buffer + sse_buffer << chunk.b + # Process all complete lines in the buffer. A complete line holds + # complete UTF-8 sequences (multi-byte characters split across + # chunks are repaired by the buffering), so re-encode it to UTF-8 + # here; scrub guards against a server sending invalid bytes. while (line_end = sse_buffer.index("\n")) - line = sse_buffer.slice!(0, line_end + 1).strip + line = sse_buffer.slice!(0, line_end + 1).force_encoding(Encoding::UTF_8).scrub.strip next if line.empty? next unless line.start_with?("data: ") @@ -400,7 +409,7 @@ def perform_streaming_request(body, &block) # callback. Pass the accumulated error_body so ApiError carries the # full server message instead of an empty body. unless response.success? - error_body_str = error_body.empty? ? response.body : error_body + error_body_str = error_body.empty? ? response.body : error_body.force_encoding(Encoding::UTF_8).scrub handle_error_response(response, override_body: error_body_str) end diff --git a/lib/ruby_pi/llm/openai.rb b/lib/ruby_pi/llm/openai.rb index d90ccaf..d0e546c 100644 --- a/lib/ruby_pi/llm/openai.rb +++ b/lib/ruby_pi/llm/openai.rb @@ -6,6 +6,8 @@ # OpenAI Chat Completions API for both synchronous and streaming completions, # including function/tool calling support. +require "json" + module RubyPi module LLM # OpenAI provider implementation. Communicates with the OpenAI Chat @@ -318,9 +320,14 @@ def perform_streaming_request(body, &block) # which may split SSE events mid-line. We accumulate a line buffer and # process complete lines incrementally so that deltas reach the caller # as soon as each SSE event is fully received. - sse_buffer = +"" + # BINARY buffer: chunks arrive as ASCII-8BIT and may end mid-way + # through a multi-byte UTF-8 character; appending such a chunk to a + # UTF-8 buffer holding non-ASCII text raises + # Encoding::CompatibilityError. Complete lines are re-encoded to + # UTF-8 (and scrubbed) before parsing. + sse_buffer = (+"").force_encoding(Encoding::BINARY) response_status = nil - error_body = +"" + error_body = (+"").force_encoding(Encoding::BINARY) response = with_transport_errors do conn.post("/v1/chat/completions") do |req| @@ -337,14 +344,17 @@ def perform_streaming_request(body, &block) # If the HTTP status indicates an error, accumulate the body for # the error handler instead of parsing it as SSE events. if response_status && response_status >= 400 - error_body << chunk + error_body << chunk.b next end - sse_buffer << chunk - # Process all complete lines in the buffer + sse_buffer << chunk.b + # Process all complete lines in the buffer. A complete line holds + # complete UTF-8 sequences (multi-byte characters split across + # chunks are repaired by the buffering), so re-encode it to UTF-8 + # here; scrub guards against a server sending invalid bytes. while (line_end = sse_buffer.index("\n")) - line = sse_buffer.slice!(0, line_end + 1).strip + line = sse_buffer.slice!(0, line_end + 1).force_encoding(Encoding::UTF_8).scrub.strip next if line.empty? next unless line.start_with?("data: ") @@ -419,7 +429,7 @@ def perform_streaming_request(body, &block) # callback. Pass the accumulated error_body so ApiError carries the # full server message instead of an empty body. unless response.success? - error_body_str = error_body.empty? ? response.body : error_body + error_body_str = error_body.empty? ? response.body : error_body.force_encoding(Encoding::UTF_8).scrub handle_error_response(response, override_body: error_body_str) end diff --git a/lib/ruby_pi/llm/tool_call.rb b/lib/ruby_pi/llm/tool_call.rb index 93ea49c..ff46b6d 100644 --- a/lib/ruby_pi/llm/tool_call.rb +++ b/lib/ruby_pi/llm/tool_call.rb @@ -6,6 +6,8 @@ # decides to invoke a tool, it returns one or more ToolCall objects describing # which function to call and with what arguments. +require "json" + module RubyPi module LLM # A tool call extracted from an LLM response. Contains the unique call ID, diff --git a/lib/ruby_pi/tools/definition.rb b/lib/ruby_pi/tools/definition.rb index 9ea0940..4e14fe0 100644 --- a/lib/ruby_pi/tools/definition.rb +++ b/lib/ruby_pi/tools/definition.rb @@ -37,16 +37,32 @@ class Definition # @return [Hash] A JSON Schema hash describing the tool's parameters. attr_reader :parameters + # Tool names must satisfy the strictest provider constraint (Anthropic's + # ^[a-zA-Z0-9_-]{1,64}$). Without this guard, a name like "send.email" + # registers fine and then 400s on every API request with an opaque + # server-side validation error that doesn't point back to the tool. + NAME_FORMAT = /\A[a-zA-Z0-9_-]{1,64}\z/ + # Creates a new tool definition. # - # @param name [String, Symbol] Unique identifier for the tool. + # @param name [String, Symbol] Unique identifier for the tool. Must match + # NAME_FORMAT (letters, digits, underscore, hyphen; max 64 chars). # @param description [String] What the tool does (shown to the LLM). # @param category [Symbol, nil] Optional grouping category. # @param parameters [Hash] JSON Schema hash for the tool's input parameters. - # @yield [Hash] Block that implements the tool logic. Receives a hash of arguments. - # @raise [ArgumentError] If name or description is missing, or no block given. + # @yield [Hash] Block that implements the tool logic. Receives a hash of + # symbol-keyed arguments, or keyword arguments if the block declares + # keyword parameters (see #call). + # @raise [ArgumentError] If name is missing or violates NAME_FORMAT, + # description is missing, or no block given. def initialize(name:, description:, category: nil, parameters: {}, &block) raise ArgumentError, "Tool name is required" if name.nil? || name.to_s.strip.empty? + unless name.to_s.match?(NAME_FORMAT) + raise ArgumentError, + "Tool name #{name.to_s.inspect} is invalid — provider APIs require " \ + "names matching #{NAME_FORMAT.inspect} (letters, digits, underscore, " \ + "hyphen; 1-64 characters)" + end raise ArgumentError, "Tool description is required" if description.nil? || description.strip.empty? raise ArgumentError, "Tool implementation block is required" unless block_given? @@ -55,14 +71,33 @@ def initialize(name:, description:, category: nil, parameters: {}, &block) @category = category&.to_sym @parameters = parameters @implementation = block + # On Ruby 3.x a positional Hash is never auto-splatted to keywords, so + # a block written `{ |content:, platform:| ... }` — the natural style + # given named schema parameters — would fail every call with + # "missing keyword". Detect keyword parameters once here and splat in + # #call accordingly. + @expects_keywords = block.parameters.any? { |type, _| %i[key keyreq keyrest].include?(type) } end # Invokes the tool with the given arguments. # + # Blocks may be written either style: + # { |args| args[:content] } # single positional Hash + # { |content:, platform: "x"| ... } # keyword parameters + # + # When the block declares keyword parameters, the arguments hash is + # splatted to keywords. Note that a keyword-style block without **rest + # raises ArgumentError on unexpected keys — strict by design, since the + # keys come from the LLM. + # # @param args [Hash] The arguments to pass to the tool implementation. # @return [Object] Whatever the implementation block returns. def call(args = {}) - @implementation.call(args) + if @expects_keywords + @implementation.call(**args) + else + @implementation.call(args) + end end # Converts this tool definition to Google Gemini function declaration format. diff --git a/lib/ruby_pi/tools/executor.rb b/lib/ruby_pi/tools/executor.rb index 1dc9c67..7080c1a 100644 --- a/lib/ruby_pi/tools/executor.rb +++ b/lib/ruby_pi/tools/executor.rb @@ -115,7 +115,12 @@ def execute_parallel(calls) end # Collect results, respecting the configured timeout for each future. - futures.map do |future| + # Zip each future with its originating call so failure Results carry + # the real tool name — with several tools timing out in parallel, + # "unknown" Results are indistinguishable in logs and extension events. + calls.zip(futures).map do |call, future| + tool_name = (call[:name] || call["name"]).to_s + # Issue #10: Wait for the future to complete, then check its state # explicitly. Future#value returns nil both on timeout AND when the # block legitimately returned nil, so we cannot use || to distinguish. @@ -128,13 +133,16 @@ def execute_parallel(calls) else # Future was rejected (raised an exception within the block). # This shouldn't normally happen since execute_single rescues - # internally, but handle it defensively. + # internally, but handle it defensively. The actual run time is + # unknown here (the future failed at some point before the wait + # elapsed), so report 0.0 rather than a misleading full-timeout + # duration for what may have been an instant failure. error = future.reason Result.new( - name: "unknown", + name: tool_name, success: false, error: "#{error.class}: #{error.message}", - duration_ms: @timeout * 1000.0 + duration_ms: 0.0 ) end else @@ -147,9 +155,9 @@ def execute_parallel(calls) future.cancel if future.respond_to?(:cancel) Result.new( - name: "unknown", + name: tool_name, success: false, - error: "Tool execution timed out after #{@timeout}s", + error: "Tool '#{tool_name}' timed out after #{@timeout}s", duration_ms: @timeout * 1000.0 ) end diff --git a/lib/ruby_pi/tools/schema.rb b/lib/ruby_pi/tools/schema.rb index cdb05bf..ec8d4dd 100644 --- a/lib/ruby_pi/tools/schema.rb +++ b/lib/ruby_pi/tools/schema.rb @@ -13,6 +13,16 @@ # flag consumed by `.object` to populate the top-level "required" array. # It is stripped from the property's own schema hash before inclusion. # +# IMPORTANT: Schemas are LLM-facing hints, NOT runtime input validation. +# Nothing in the execution pipeline validates the model's arguments against +# the schema before invoking the tool block: `required`, `enum`, `minimum`, +# and type declarations constrain what the model is *asked* to produce, but a +# misbehaving model can still omit required fields, send extra keys, or pass +# a String where an Integer is declared — no coercion is performed. Tool +# blocks should treat their arguments as untrusted input and validate or +# coerce what they depend on. (This is deliberate, per the anti-framework +# philosophy: validation policy belongs to the tool, not the harness.) +# # Usage: # schema = RubyPi::Schema.object( # name: RubyPi::Schema.string("User's name", required: true), diff --git a/lib/ruby_pi/version.rb b/lib/ruby_pi/version.rb index 7deaa5f..d54577c 100644 --- a/lib/ruby_pi/version.rb +++ b/lib/ruby_pi/version.rb @@ -7,5 +7,5 @@ module RubyPi # The current version of the RubyPi gem, following Semantic Versioning. - VERSION = "0.1.7" + VERSION = "0.1.8" end diff --git a/ruby-pi.gemspec b/ruby-pi.gemspec index 9fed46d..831cc55 100644 --- a/ruby-pi.gemspec +++ b/ruby-pi.gemspec @@ -51,6 +51,10 @@ Gem::Specification.new do |spec| # Issue #20: Removed arbitrary < 3.4 upper bound — no known incompatibility. spec.add_dependency "faraday-net_http", "~> 3.0" spec.add_dependency "concurrent-ruby", "~> 1.2" + # JSON.parse/JSON.generate are used directly throughout the providers and + # agent loop. Declare it explicitly rather than relying on Faraday's + # transitive dependency (which could drop it) or the default-gem status. + spec.add_dependency "json", ">= 2.0" # Development dependencies spec.add_development_dependency "rspec", "~> 3.12" diff --git a/spec/ruby_pi/fixes/review_round6_spec.rb b/spec/ruby_pi/fixes/review_round6_spec.rb new file mode 100644 index 0000000..d587eef --- /dev/null +++ b/spec/ruby_pi/fixes/review_round6_spec.rb @@ -0,0 +1,288 @@ +# frozen_string_literal: true + +# spec/ruby_pi/fixes/review_round6_spec.rb +# +# Regression specs for adversarial review round 6: +# +# 1. BaseProvider honors a server-provided Retry-After on 429 instead of +# always using local exponential backoff. +# 2. Parallel executor timeout/rejection Results carry the real tool name +# (previously hardcoded "unknown"). +# 3. Tool blocks declared with keyword parameters work (previously every +# call raised "missing keyword"). +# 4. Tool names are validated against provider constraints at definition time. +# 5. The :compaction event emitter is wired by the Loop, so subscribers +# actually receive the documented event. +# 6. The :fallback_start payload reports how much primary text was already +# yielded (partial_output / partial_chars). +# 7. Configuration rejects negative numeric settings at assignment time. +# 8. Streaming text deltas are valid UTF-8 even though chunks arrive as +# binary, and the SSE buffer survives multi-byte characters. +# 9. Every lib file referencing JSON requires "json" itself (composability). + +require "spec_helper" + +RSpec.describe "Review round 6 fixes" do + describe "1. Retry-After is honored on 429" do + let(:provider) { RubyPi::LLM::Anthropic.new(model: "claude-test") } + + it "uses the server's Retry-After as the retry delay" do + error = RubyPi::RateLimitError.new("rate limited", retry_after: 7.5) + expect(provider.send(:retry_delay_for, error, 1)).to eq(7.5) + end + + it "caps a pathological Retry-After at the ceiling" do + error = RubyPi::RateLimitError.new("rate limited", retry_after: 86_400.0) + expect(provider.send(:retry_delay_for, error, 1)) + .to eq(RubyPi::LLM::BaseProvider::RETRY_AFTER_CEILING) + end + + it "falls back to exponential backoff when Retry-After is absent" do + error = RubyPi::RateLimitError.new("rate limited", retry_after: nil) + delay = provider.send(:retry_delay_for, error, 1) + # backoff = base * 2^0 + jitter, capped at retry_max_delay + expect(delay).to be_between(0, RubyPi.configuration.retry_max_delay) + end + + it "falls back to backoff when Retry-After was an HTTP-date (parses to 0.0)" do + error = RubyPi::RateLimitError.new("rate limited", retry_after: 0.0) + expect(provider).to receive(:calculate_backoff).with(2).and_return(0.01) + expect(provider.send(:retry_delay_for, error, 2)).to eq(0.01) + end + + it "uses backoff (not Retry-After) for non-rate-limit errors" do + error = RubyPi::ApiError.new("boom", status_code: 500) + expect(provider).to receive(:calculate_backoff).with(1).and_return(0.01) + expect(provider.send(:retry_delay_for, error, 1)).to eq(0.01) + end + + it "sleeps for the Retry-After duration when retrying a real 429" do + stub_request(:post, "https://api.anthropic.com/v1/messages") + .to_return( + { status: 429, headers: { "Retry-After" => "0.02" }, body: "rate limited" }, + { + status: 200, + headers: { "Content-Type" => "application/json" }, + body: JSON.generate( + id: "msg_1", type: "message", role: "assistant", + content: [{ type: "text", text: "ok" }], + stop_reason: "end_turn", + usage: { input_tokens: 1, output_tokens: 1 } + ) + } + ) + + expect(provider).to receive(:sleep).with(0.02) + response = provider.complete(messages: [{ role: :user, content: "hi" }]) + expect(response.content).to eq("ok") + end + end + + describe "2. Parallel executor failure Results carry the tool name" do + let(:registry) { RubyPi::Tools::Registry.new } + + it "reports the real tool name when a parallel tool times out" do + registry.register( + RubyPi::Tool.define(name: "slow_tool", description: "sleeps") { |_args| sleep(1) } + ) + registry.register( + RubyPi::Tool.define(name: "fast_tool", description: "returns") { |_args| "done" } + ) + + executor = RubyPi::Tools::Executor.new(registry, mode: :parallel, timeout: 0.05) + results = executor.execute([ + { name: "slow_tool", arguments: {} }, + { name: "fast_tool", arguments: {} } + ]) + + timed_out = results[0] + expect(timed_out.success?).to be(false) + expect(timed_out.name).to eq("slow_tool") + expect(timed_out.error).to include("slow_tool") + expect(timed_out.error).to include("timed out") + + expect(results[1].name).to eq("fast_tool") + expect(results[1].success?).to be(true) + end + end + + describe "3. Keyword-parameter tool blocks" do + it "splats arguments to a block with required keywords" do + tool = RubyPi::Tool.define(name: "kw_tool", description: "kw") do |content:| + content.upcase + end + expect(tool.call(content: "hi")).to eq("HI") + end + + it "splats arguments to a block with optional keywords and **rest" do + tool = RubyPi::Tool.define(name: "kw_rest", description: "kw") do |content:, platform: "x", **rest| + [content, platform, rest] + end + expect(tool.call(content: "a", extra: 1)).to eq(["a", "x", { extra: 1 }]) + end + + it "still passes a single positional Hash to positional-style blocks" do + tool = RubyPi::Tool.define(name: "pos_tool", description: "pos") do |args| + args[:content] + end + expect(tool.call(content: "hello")).to eq("hello") + end + end + + describe "4. Tool name validation" do + it "rejects names with characters providers refuse" do + ["send.email", "create post!", "a b", "tool/x"].each do |bad| + expect do + RubyPi::Tool.define(name: bad, description: "d") { |a| a } + end.to raise_error(ArgumentError, /invalid/) + end + end + + it "rejects names longer than 64 characters" do + expect do + RubyPi::Tool.define(name: "a" * 65, description: "d") { |a| a } + end.to raise_error(ArgumentError, /invalid/) + end + + it "accepts provider-safe names" do + %w[send_email tool-1 Get_Analytics2].each do |good| + expect do + RubyPi::Tool.define(name: good, description: "d") { |a| a } + end.not_to raise_error + end + end + end + + describe "5. :compaction event emitter wiring" do + let(:model) { double("model") } + let(:emitter) { Class.new { include RubyPi::Agent::EventEmitter }.new } + let(:state) do + RubyPi::Agent::State.new(system_prompt: "test", model: model) + end + + it "wires the loop's emitter into the compaction strategy" do + compaction = RubyPi::Context::Compaction.new(summary_model: model) + expect(compaction.emitter).to be_nil + + RubyPi::Agent::Loop.new(state: state, emitter: emitter, compaction: compaction) + expect(compaction.emitter).to be(emitter) + end + + it "does not overwrite an explicitly assigned emitter" do + compaction = RubyPi::Context::Compaction.new(summary_model: model) + custom = double("custom emitter") + compaction.emitter = custom + + RubyPi::Agent::Loop.new(state: state, emitter: emitter, compaction: compaction) + expect(compaction.emitter).to be(custom) + end + end + + describe "6. :fallback_start reports partial output" do + let(:primary) { double("primary", provider_name: :primary, model_name: "p") } + let(:backup) { double("backup", provider_name: :backup, model_name: "b") } + let(:provider) { RubyPi::LLM::Fallback.new(primary: primary, fallback: backup) } + let(:fallback_response) do + RubyPi::LLM::Response.new(content: "ok", tool_calls: [], usage: {}, finish_reason: "stop") + end + + before do + allow(backup).to receive(:complete) do |**_args, &block| + block.call(RubyPi::LLM::StreamEvent.new(type: :text_delta, data: "ok")) + fallback_response + end + end + + it "counts the characters already yielded by the failed primary" do + allow(primary).to receive(:complete) do |**_args, &block| + block.call(RubyPi::LLM::StreamEvent.new(type: :text_delta, data: "The answer ")) + block.call(RubyPi::LLM::StreamEvent.new(type: :text_delta, data: "is")) + raise RubyPi::ApiError.new("primary died mid-stream", status_code: 500) + end + + events = [] + provider.complete(messages: [], tools: [], stream: true) { |e| events << e } + + start = events.find { |e| e.type == :fallback_start } + expect(start.data[:partial_output]).to be(true) + expect(start.data[:partial_chars]).to eq("The answer is".length) + end + + it "reports no partial output when the primary failed before yielding text" do + allow(primary).to receive(:complete).and_raise( + RubyPi::ApiError.new("immediate failure", status_code: 500) + ) + + events = [] + provider.complete(messages: [], tools: [], stream: true) { |e| events << e } + + start = events.find { |e| e.type == :fallback_start } + expect(start.data[:partial_output]).to be(false) + expect(start.data[:partial_chars]).to eq(0) + end + end + + describe "7. Configuration validates numeric settings" do + let(:config) { RubyPi::Configuration.new } + + it "rejects negative values" do + expect { config.max_retries = -1 }.to raise_error(ArgumentError, /max_retries/) + expect { config.retry_base_delay = -0.5 }.to raise_error(ArgumentError, /retry_base_delay/) + expect { config.retry_max_delay = -1 }.to raise_error(ArgumentError, /retry_max_delay/) + expect { config.request_timeout = -10 }.to raise_error(ArgumentError, /request_timeout/) + expect { config.open_timeout = -1 }.to raise_error(ArgumentError, /open_timeout/) + end + + it "rejects non-numeric values" do + expect { config.max_retries = "three" }.to raise_error(ArgumentError, /max_retries/) + end + + it "accepts zero and positive values" do + expect { config.max_retries = 0 }.not_to raise_error + expect { config.retry_base_delay = 0.01 }.not_to raise_error + expect(config.retry_base_delay).to eq(0.01) + end + end + + describe "8. Streaming UTF-8 safety" do + let(:provider) { RubyPi::LLM::Anthropic.new(model: "claude-test") } + let(:emoji_text) { "Hello 👋 wörld" } + let(:sse_body) do + [ + %(data: {"type":"message_start","message":{"usage":{"input_tokens":5}}}\n\n), + %(data: {"type":"content_block_start","index":0,"content_block":{"type":"text"}}\n\n), + %(data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":#{JSON.generate(emoji_text)}}}\n\n), + %(data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":3}}\n\n), + %(data: {"type":"message_stop"}\n\n) + ].join.b # deliver as binary, like a real socket + end + + it "yields multi-byte text deltas as valid UTF-8" do + stub_request(:post, "https://api.anthropic.com/v1/messages") + .to_return(status: 200, headers: { "Content-Type" => "text/event-stream" }, body: sse_body) + + deltas = [] + response = provider.complete(messages: [{ role: :user, content: "hi" }], stream: true) do |event| + deltas << event.data if event.text_delta? + end + + text = deltas.join + expect(text).to eq(emoji_text) + expect(text.encoding).to eq(Encoding::UTF_8) + expect(text.valid_encoding?).to be(true) + expect(response.content).to eq(emoji_text) + end + end + + describe "9. JSON is required where it is used" do + it "every lib file that references JSON requires json itself" do + lib_root = File.expand_path("../../../lib", __dir__) + offenders = Dir[File.join(lib_root, "**", "*.rb")].select do |file| + src = File.read(file) + src.match?(/\bJSON\.(parse|generate)/) && !src.match?(/^require "json"$/) + end + expect(offenders).to be_empty, + "files use JSON without requiring it: #{offenders.inspect}" + end + end +end