fix(realtime): send safe output guardrail feedback #3138
adityasingh2400 wants to merge 1 commit into openai:main
Replace internal realtime guardrail prompts with a user-facing recovery instruction so blocked audio output is canceled and followed by a safe replacement response.
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b7b0bd1aaf
    failure_details = [
        {
            "guardrail": result.guardrail.get_name(),
            "output_info": result.output.output_info,
Do not feed guardrail output_info back to the model
output_info is arbitrary guardrail data and often includes the exact matched text or internal policy reason. Serializing it into the user message means a PII/competitor/unsafe-output guardrail can re-inject the blocked content into the next model turn, where it may be spoken or exposed in conversation history, defeating the safe fallback path.
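One way to address this concern is sketched below, as an illustration rather than the PR's actual code. `GuardrailFailure` and `build_safe_feedback` are hypothetical names invented for this example and are not part of the SDK. The assumption is that the recovery message only needs the guardrail names, so `output_info` is never serialized into the text sent back to the model.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class GuardrailFailure:
    """Hypothetical stand-in for a tripped output guardrail result."""

    guardrail_name: str
    # May contain the exact blocked text or an internal policy reason,
    # so it must never be sent back to the model.
    output_info: Any


def build_safe_feedback(failures: list[GuardrailFailure]) -> str:
    """Build a recovery instruction that names guardrails but omits output_info."""
    # Deduplicate and sort the names so the message is deterministic.
    names = sorted({failure.guardrail_name for failure in failures})
    return (
        "The previous response was blocked by: "
        + ", ".join(names)
        + ". Apologize briefly and give a safe alternative response."
    )


failures = [
    GuardrailFailure("pii_guardrail", {"matched": "555-0100"}),
    GuardrailFailure("competitor_guardrail", {"matched": "Acme Corp"}),
]
message = build_safe_feedback(failures)
print(message)
```

Under this sketch the matched text stays in local logs or telemetry only. Whether guardrail names themselves are safe to expose is a separate policy decision.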
Even if the issues pointed out by Codex are resolved, we don't plan to accept this change. Thanks for your interest here.
Summary
Fixes #1912
Test plan
- uv run ruff format src/agents/realtime/session.py tests/realtime/test_session.py
- uv run ruff check src/agents/realtime/session.py tests/realtime/test_session.py
- uv run pytest tests/realtime/test_session.py -k guardrail -q
- uv run pytest tests/realtime/test_session.py -q
- uv run mypy src/agents/realtime/session.py tests/realtime/test_session.py
- make lint