
fix: resolve Pydantic delta serialization warnings in agent streaming #2005

Open
Vchen7629 wants to merge 7 commits into main from chat-message-pydantic-field-warning-log-fix

Conversation

@Vchen7629 Vchen7629 commented Jul 2, 2026

Collaborator

Summary

Port of PR #1924, retargeted at the main branch instead of release-0.5.1. Fixes #1821.

Resolves the repeated PydanticSerializationUnexpectedValue warnings logged for every streamed chat chunk. Langflow's /response SSE wraps text deltas as {"content": "..."} for every provider, but the openai SDK's response event models declare delta: str. When chunk.model_dump() runs, Pydantic's serializer checks delta against that str type, finds a dict, and logs a warning. This happens for every chunk and pollutes the openrag-backend logs.
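The mismatch described above can be reproduced in isolation. The following is a minimal sketch, assuming Pydantic v2; FakeTextDeltaEvent is a hypothetical stand-in for the SDK event model (not the real openai class), and its two fields are assumptions chosen to mirror the description.

```python
import warnings

from pydantic import BaseModel


class FakeTextDeltaEvent(BaseModel):
    """Hypothetical stand-in for an SDK event model that declares delta: str."""

    delta: str
    type: str


# model_construct() skips validation, so a dict can sit in the str-typed
# field, mirroring a chunk whose delta arrived as {"content": "..."}.
chunk = FakeTextDeltaEvent.model_construct(
    delta={"content": "Hello"}, type="response.output_text.delta"
)

# Record every warning emitted while dumping the mismatched chunk.
with warnings.catch_warnings(record=True) as recorded:
    warnings.simplefilter("always")
    dumped = chunk.model_dump()

# Collect the warning texts so they can be inspected or asserted on.
messages = [str(w.message) for w in recorded]
```

The dict still comes through in the dumped output; the cost of the mismatch is the warning, which in a streaming loop fires once per chunk.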

Changes

  • Exclude delta from model_dump() serialization (the step that triggers the type-check warning), then reattach the raw, unvalidated delta value to the resulting dict. The output shape sent to the frontend/SDK is unchanged; only the noisy serializer warning is avoided.
  • Added a comment above the agent.py changes explaining why the workaround exists and when it can be removed.
  • Added a simple regression unit test verifying that the Pydantic warning is no longer raised and that the dumped fields have the expected values.
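The exclude-and-reattach step from the first bullet can be sketched as follows. This assumes Pydantic v2; dump_chunk_preserving_delta and FakeTextDeltaEvent are hypothetical names used for illustration, and the actual code in agent.py may be structured differently.

```python
import warnings
from typing import Any

from pydantic import BaseModel


class FakeTextDeltaEvent(BaseModel):
    """Hypothetical stand-in for an SDK event model that declares delta: str."""

    delta: str
    type: str


def dump_chunk_preserving_delta(chunk: BaseModel) -> dict[str, Any]:
    """Hypothetical helper: dump without delta, then reattach the raw value."""
    # Excluding delta keeps the serializer from type-checking it at all.
    data = chunk.model_dump(exclude={"delta"})
    # Reattach the unvalidated value only when the chunk actually carries one.
    delta = getattr(chunk, "delta", None)
    if delta is not None:
        data["delta"] = delta
    return data


chunk = FakeTextDeltaEvent.model_construct(
    delta={"content": "Hello"}, type="response.output_text.delta"
)

with warnings.catch_warnings():
    # Turn any warning into an exception so a regression fails loudly.
    warnings.simplefilter("error")
    chunk_data = dump_chunk_preserving_delta(chunk)
```

Reading delta with getattr keeps the helper usable for event types that have no delta field, which matches the "when present" behavior described in the walkthrough.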

Summary by CodeRabbit

  • Bug Fixes

    • Improved streaming response handling for structured delta chunks to prevent serialization warnings while preserving the original delta content.
  • Tests

    • Added unit tests to verify delta serialization behavior when excluding the delta field: no warnings are emitted, and the dumped output retains the original delta content and event type.

@Vchen7629 Vchen7629 requested a review from lucaseduoli July 2, 2026 19:23
@github-actions github-actions Bot added backend 🔷 Issues related to backend services (OpenSearch, Langflow, APIs) tests bug 🔴 Something isn't working. labels Jul 2, 2026

coderabbitai Bot commented Jul 2, 2026

Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 99ed5438-6b7e-454b-bf46-f924b8e5d663

📥 Commits

Reviewing files that changed from the base of the PR and between f966225 and b19f1ce.

📒 Files selected for processing (1)
  • tests/unit/test_agent_stream_delta.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/unit/test_agent_stream_delta.py

Walkthrough

async_response_stream now serializes chunks with model_dump(exclude={"delta"}) and then restores chunk.delta. New unit tests cover warning behavior for direct dumping and the preserved delta shape with exclusion.

Changes

Delta serialization fix

| Layer / File(s) | Summary |
| --- | --- |
| Chunk serialization fix: src/agent.py | model_dump() now excludes delta, and the original chunk.delta value is reattached afterward when present. |
| Delta serialization tests: tests/unit/test_agent_stream_delta.py | Adds a mismatched Pydantic delta event fixture and tests warning behavior for direct dumping versus excluded dumping with delta restoration. |

Estimated code review effort: 2 (Simple) | ~10 minutes

Suggested labels: bug

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly matches the main change: fixing Pydantic serialization warnings in agent streaming. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@github-actions github-actions Bot added bug 🔴 Something isn't working. and removed bug 🔴 Something isn't working. labels Jul 2, 2026

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
tests/unit/test_agent_stream_delta.py (1)

13-25: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Solid regression test. Optionally strengthen it by also asserting that the plain chunk.model_dump() (without exclude) actually triggers the warning, proving this test would catch a regression if the workaround becomes unnecessary or ineffective.

Optional companion assertion
     with warnings.catch_warnings():
         warnings.simplefilter("error")
 
         chunk_data = chunk.model_dump(exclude={"delta"})
         chunk_data["delta"] = chunk.delta
 
     assert chunk_data["delta"] == {"content": "Hello"}
     assert chunk_data["type"] == "response.output_text.delta"
+
+
+def test_model_dump_without_exclude_raises_the_warning() -> None:
+    chunk = FakeTextDeltaEvent.model_construct(
+        delta={"content": "Hello"}, type="response.output_text.delta"
+    )
+    with warnings.catch_warnings():
+        warnings.simplefilter("error")
+        with pytest.raises(UserWarning):
+            chunk.model_dump()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/test_agent_stream_delta.py` around lines 13 - 25, The regression
test around FakeTextDeltaEvent.model_dump() should also prove the warning path
is exercised, not just the workaround. In
test_model_dump_excluding_delta_avoids_warning_and_preserves_dict_shape, add an
assertion that calling chunk.model_dump() without exclude={"delta"} raises the
expected warning, while keeping the existing excluded-dump check to verify the
fix still preserves the dict shape.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0aecb6c6-f811-4327-ae24-eaf120d41e07

📥 Commits

Reviewing files that changed from the base of the PR and between 4f104ec and 6d93ab0.

📒 Files selected for processing (2)
  • src/agent.py
  • tests/unit/test_agent_stream_delta.py

@github-actions github-actions Bot added bug 🔴 Something isn't working. and removed bug 🔴 Something isn't working. labels Jul 2, 2026

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
tests/unit/test_agent_stream_delta.py (2)

18-24: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Assertion doesn't verify the warning is the expected serialization warning.

assert recorded only checks that some warning fired, not that it's the PydanticSerializationUnexpectedValue/serializer warning this test is meant to demonstrate. If an unrelated warning were emitted (e.g., a deprecation warning from another import), this assertion would still pass, masking a regression where the actual serialization warning stops firing.

🔍 Suggested tightening
     with warnings.catch_warnings(record=True) as recorded:
         warnings.simplefilter("always")
         chunk.model_dump()

-    assert recorded, "expected a serialization warning when delta has a type mismatch"
+    assert any(
+        "serializer warnings" in str(w.message) for w in recorded
+    ), "expected a Pydantic serialization warning when delta has a type mismatch"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/test_agent_stream_delta.py` around lines 18 - 24, The test in
test_model_dump_without_exclusion_raises_serialization_warning only checks that
some warning was emitted, so tighten it to assert the warning is the expected
serialization warning from chunk.model_dump. Inspect the warnings captured in
recorded and verify the specific warning category/message associated with
PydanticSerializationUnexpectedValue rather than relying on assert recorded, so
the test fails if an unrelated warning appears or the serializer warning stops
firing.
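One way to make the check specific, sketched under the assumption that Pydantic v2 reports the mismatch as a UserWarning whose text contains "serializer warnings": filter the recorded warnings by category and message before asserting. serializer_warnings and FakeTextDeltaEvent are hypothetical names for illustration, and the unrelated DeprecationWarning is emitted on purpose to show that it is ignored.

```python
import warnings

from pydantic import BaseModel


class FakeTextDeltaEvent(BaseModel):
    """Hypothetical stand-in for an SDK event model that declares delta: str."""

    delta: str
    type: str


def serializer_warnings(
    recorded: list[warnings.WarningMessage],
) -> list[warnings.WarningMessage]:
    """Hypothetical helper: keep only Pydantic serializer warnings."""
    return [
        w
        for w in recorded
        if issubclass(w.category, UserWarning)
        and "serializer warnings" in str(w.message)
    ]


chunk = FakeTextDeltaEvent.model_construct(
    delta={"content": "Hello"}, type="response.output_text.delta"
)

with warnings.catch_warnings(record=True) as recorded:
    warnings.simplefilter("always")
    # An unrelated warning that a bare `assert recorded` would accept.
    warnings.warn("unrelated notice", DeprecationWarning)
    before_dump = serializer_warnings(recorded)
    chunk.model_dump()
    after_dump = serializer_warnings(recorded)
```

Matching on the message text is brittle if Pydantic rewords it, so a real test may prefer the loosest substring that still distinguishes the serializer warning from others.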

13-34: 📐 Maintainability & Code Quality | 🔵 Trivial | 🏗️ Heavy lift

Tests validate Pydantic's own behavior, not the actual fix in agent.py.

Both tests independently replicate the model_dump(exclude={"delta"}) + reattach pattern instead of invoking async_response_stream (the function this PR actually changes, see src/agent.py:174-191). If a future edit to async_response_stream regresses the exclude/reattach logic (e.g., typo in the exclude key, or forgetting to reattach delta), these tests will still pass since they don't exercise that code path at all — they only prove that Pydantic's model_dump behaves as expected in isolation.

Consider adding a test that drives an actual chunk through async_response_stream (with a minimal async generator stub emitting a FakeTextDeltaEvent-like chunk) and asserts on the resulting SSE payload, in addition to (or instead of) these Pydantic-behavior-only unit tests.
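The shape of such a test can be sketched with the standard library alone. Everything here is an assumption for illustration: stream_under_test stands in for async_response_stream (whose real signature and output framing are not shown in this PR), StubDeltaChunk is a duck-typed chunk rather than a Pydantic model, and the newline-delimited JSON payload is a guess at the wire format.

```python
import asyncio
import json
from typing import Any, AsyncIterator, Optional


class StubDeltaChunk:
    """Duck-typed stand-in for a streamed event; records the exclude argument."""

    def __init__(self, delta: Any, type_: str) -> None:
        self.delta = delta
        self.type = type_
        self.seen_exclude: Optional[set[str]] = None

    def model_dump(self, exclude: Optional[set[str]] = None) -> dict[str, Any]:
        # Remember what the caller excluded so the test can assert on it.
        self.seen_exclude = exclude
        data = {"type": self.type, "delta": self.delta}
        return {k: v for k, v in data.items() if k not in (exclude or set())}


async def stream_under_test(chunks: AsyncIterator[Any]) -> AsyncIterator[bytes]:
    """Hypothetical stand-in for async_response_stream's dump-and-emit loop."""
    async for chunk in chunks:
        data = chunk.model_dump(exclude={"delta"})
        delta = getattr(chunk, "delta", None)
        if delta is not None:
            data["delta"] = delta
        yield (json.dumps(data) + "\n").encode("utf-8")


async def one_chunk(chunk: Any) -> AsyncIterator[Any]:
    """Minimal async generator stub that emits a single chunk."""
    yield chunk


async def collect(chunk: Any) -> list[dict[str, Any]]:
    """Drive the stream and decode every emitted payload."""
    return [json.loads(p) async for p in stream_under_test(one_chunk(chunk))]


chunk = StubDeltaChunk({"content": "Hello"}, "response.output_text.delta")
payloads = asyncio.run(collect(chunk))
```

In a real test the stand-in function would be replaced by the import from src/agent.py, and asserting on seen_exclude would catch a typo in the exclude key while the payload assertion catches a forgotten reattach.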

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/test_agent_stream_delta.py` around lines 13 - 34, The current
tests only assert Pydantic’s standalone `model_dump` behavior and do not cover
the actual logic changed in `async_response_stream`. Update the tests to drive a
real chunk through `async_response_stream` in `src/agent.py` using a minimal
async generator stub that yields a `FakeTextDeltaEvent`-like event, then assert
the emitted SSE payload includes the expected `delta` and `type` fields. Keep or
replace the existing `chunk`-based checks only if they directly exercise
`async_response_stream`’s exclude-and-reattach path.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d04191fa-b886-490e-a4d5-011a4947f40e

📥 Commits

Reviewing files that changed from the base of the PR and between 6d93ab0 and f966225.

📒 Files selected for processing (1)
  • tests/unit/test_agent_stream_delta.py

@github-actions github-actions Bot added bug 🔴 Something isn't working. and removed bug 🔴 Something isn't working. labels Jul 2, 2026
@github-actions github-actions Bot added the lgtm label Jul 3, 2026

Labels

backend 🔷 Issues related to backend services (OpenSearch, Langflow, APIs) bug 🔴 Something isn't working. lgtm tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Chat streaming: delta field receives dict instead of string

2 participants