fix(realtime): send safe output guardrail feedback by adityasingh2400 · Pull Request #3138 · openai/openai-agents-python

adityasingh2400 · 2026-05-06T09:13:24Z

Summary

Replaces the internal realtime output-guardrail follow-up prompt with a user-facing safe recovery instruction.
Includes failed guardrail names and serialized output info so the model can recover without speaking the blocked content.
Keeps the existing forced interrupt/cancel behavior and adds coverage that the interrupt is forced before the feedback message is sent.
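
The ordering in the last bullet (force the interrupt, then send the feedback message) could be sketched roughly as below. This is an illustration only, not the code in this PR: `FakeTransport`, `interrupt`, `send_user_message`, and `recover_from_tripwire` are hypothetical names standing in for the realtime session and its model connection.

```python
import asyncio


class FakeTransport:
    """Hypothetical stand-in for the realtime model connection; records call order."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, object]] = []

    async def interrupt(self, force: bool = False) -> None:
        self.calls.append(("interrupt", force))

    async def send_user_message(self, text: str) -> None:
        self.calls.append(("message", text))


async def recover_from_tripwire(transport: FakeTransport, guardrail_names: list[str]) -> None:
    # Cancel the in-flight response first so the blocked audio stops playing.
    await transport.interrupt(force=True)
    # Only then ask the model for a safe replacement response.
    names = ", ".join(guardrail_names)
    await transport.send_user_message(
        f"The previous response was blocked by: {names}. Reply safely without repeating it."
    )


transport = FakeTransport()
asyncio.run(recover_from_tripwire(transport, ["pii_check"]))
call_order = [name for name, _ in transport.calls]
```

A test for this behavior would then assert on the recorded order, which is roughly what the added coverage is described as checking.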

Fixes #1912

Test plan

uv run ruff format src/agents/realtime/session.py tests/realtime/test_session.py
uv run ruff check src/agents/realtime/session.py tests/realtime/test_session.py
uv run pytest tests/realtime/test_session.py -k guardrail -q
uv run pytest tests/realtime/test_session.py -q
uv run mypy src/agents/realtime/session.py tests/realtime/test_session.py
make lint

Replace internal realtime guardrail prompts with a user-facing recovery instruction so blocked audio output is canceled and followed by a safe replacement response.

seratch · 2026-05-06T09:20:55Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b7b0bd1aaf


chatgpt-codex-connector · 2026-05-06T09:23:16Z

+    failure_details = [
+        {
+            "guardrail": result.guardrail.get_name(),
+            "output_info": result.output.output_info,


Do not feed guardrail output_info back to the model

output_info is arbitrary guardrail data and often includes the exact matched text or internal policy reason. Serializing it into the user message means a PII/competitor/unsafe-output guardrail can re-inject the blocked content into the next model turn, where it may be spoken or exposed in conversation history, defeating the safe fallback path.
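
One way to act on this suggestion is to build the recovery message from guardrail names only and leave `output_info` out entirely. The sketch below assumes a simplified result shape; `FakeGuardrailResult` and `build_safe_feedback` are hypothetical and are not the SDK's types or the code under review.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeGuardrailResult:
    """Hypothetical stand-in for a guardrail result; not the SDK's type."""

    name: str
    output_info: Any  # may contain the matched text or internal policy reasoning


def build_safe_feedback(results: list[FakeGuardrailResult]) -> str:
    # Use only the guardrail names; output_info is deliberately never serialized.
    names = sorted({r.name for r in results})
    return (
        "The previous response was blocked by these guardrails: "
        + ", ".join(names)
        + ". Apologize briefly and continue without repeating the blocked content."
    )


results = [FakeGuardrailResult("pii_check", {"matched": "555-0100"})]
message = build_safe_feedback(results)
```

With this shape, the matched text held in `output_info` never reaches the next model turn or the conversation history.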


seratch · 2026-05-06T09:56:01Z

Even if the issues pointed out by codex are resolved, we don't plan to have this change. Thanks for your interest here.

fix(realtime): send safe output guardrail feedback

b7b0bd1


github-actions Bot added bug Something isn't working feature:realtime labels May 6, 2026

chatgpt-codex-connector Bot reviewed May 6, 2026


seratch closed this May 6, 2026

adityasingh2400 wants to merge 1 commit into openai:main from adityasingh2400:fix/realtime-output-guardrails-1912
