
fix(realtime): send safe output guardrail feedback#3138

Closed
adityasingh2400 wants to merge 1 commit into openai:main from adityasingh2400:fix/realtime-output-guardrails-1912

Conversation

@adityasingh2400
Contributor

Summary

  • Replaces the internal realtime output-guardrail follow-up prompt with a user-facing safe recovery instruction.
  • Includes failed guardrail names and serialized output info so the model can recover without speaking the blocked content.
  • Keeps the existing forced interrupt/cancel behavior and adds coverage that the interrupt is forced before the feedback message is sent.
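The ordering described in the last bullet can be illustrated with a minimal sketch. Everything here is hypothetical: `RecordingModel`, `handle_tripwire`, and the message wording are stand-ins invented for illustration, not the SDK's session or transport API. The point is only that the interrupt is awaited before the feedback message goes out.

```python
import asyncio


class RecordingModel:
    """Hypothetical stand-in for a realtime model transport; records calls in order."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    async def interrupt(self, force: bool = False) -> None:
        self.calls.append(f"interrupt(force={force})")

    async def send_user_message(self, text: str) -> None:
        self.calls.append(f"message:{text}")


async def handle_tripwire(model: RecordingModel, guardrail_names: list[str]) -> None:
    # Cancel the in-flight response first so blocked audio stops playing.
    await model.interrupt(force=True)
    # Only then ask the model for a safe replacement response.
    names = ", ".join(guardrail_names)
    await model.send_user_message(f"Output guardrail triggered: {names}. Respond safely.")


model = RecordingModel()
asyncio.run(handle_tripwire(model, ["pii_check"]))
print(model.calls)
```

A test in this style can assert on the recorded call order rather than on timing, which keeps it deterministic.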

Fixes #1912

Test plan

  • uv run ruff format src/agents/realtime/session.py tests/realtime/test_session.py
  • uv run ruff check src/agents/realtime/session.py tests/realtime/test_session.py
  • uv run pytest tests/realtime/test_session.py -k guardrail -q
  • uv run pytest tests/realtime/test_session.py -q
  • uv run mypy src/agents/realtime/session.py tests/realtime/test_session.py
  • make lint

Replace internal realtime guardrail prompts with a user-facing recovery instruction so blocked audio output is canceled and followed by a safe replacement response.
github-actions bot added the bug (Something isn't working) and feature:realtime labels on May 6, 2026
@seratch
Member

seratch commented May 6, 2026

@codex review


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b7b0bd1aaf


failure_details = [
    {
        "guardrail": result.guardrail.get_name(),
        "output_info": result.output.output_info,

P1: Do not feed guardrail output_info back to the model

output_info is arbitrary guardrail data and often includes the exact matched text or internal policy reason. Serializing it into the user message means a PII/competitor/unsafe-output guardrail can re-inject the blocked content into the next model turn, where it may be spoken or exposed in conversation history, defeating the safe fallback path.
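One way to address this concern, sketched under stated assumptions: build the follow-up message from guardrail names only and never read `output_info`. The `TriggeredGuardrail` dataclass, `build_safe_feedback`, and the message wording are hypothetical names introduced for illustration; they are not part of the PR or the SDK.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class TriggeredGuardrail:
    """Hypothetical record of one tripped guardrail."""

    name: str
    output_info: Any  # may contain matched text or internal policy reasons


def build_safe_feedback(triggered: list[TriggeredGuardrail]) -> str:
    # Use only guardrail names; output_info is intentionally never read.
    names = ", ".join(sorted({t.name for t in triggered}))
    return (
        f"The previous response was blocked by output guardrails: {names}. "
        "Apologize briefly and continue without repeating the blocked content."
    )


triggered = [TriggeredGuardrail("pii_check", {"matched": "555-0100"})]
message = build_safe_feedback(triggered)
print(message)
```

With this shape, the matched text stays in the guardrail result for logging or tracing and never enters the next model turn.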


@seratch
Member

seratch commented May 6, 2026

Even if the issues pointed out by Codex are resolved, we don't plan to accept this change. Thanks for your interest here.

@seratch seratch closed this May 6, 2026
Development

Successfully merging this pull request may close these issues.

Python Realtime Voice Agent doesn’t handle output guardrails like Node.js SDK
