Add interactive drive recording options #301

Open
gtong-nv wants to merge 4 commits into main from dev/gtong/demo-recording-option

@gtong-nv gtong-nv commented Jun 6, 2026

Collaborator

Add Interactive Drive Recording Options

Summary

Adds rollout recording support to the OmniDreams interactive-drive demo.

This PR lets world-model manifests enable recording, optionally auto-start recording when a rollout begins, and save the captured bundle under a repository-root-relative output directory by default.

Changes

  • Add RecordingConfig and InteractiveDriveRecorder for writing recording bundles.
  • Add manifest fields:
    • recording_enabled
    • recording_dir
    • recording_hotkey
    • recording_auto_start
  • Resolve relative recording_dir values from the flashdreams repository root instead of the manifest/config directory.
  • Preserve absolute recording_dir values as-is.
  • Wire recording config through AppConfig, InteractiveDriveApp, KeyboardState, and run_main_loop.
  • Support hotkey toggles through the SlangPy presenter, SlangPy HUD presenter, and MJPEG streaming presenter.
  • Add auto-start behavior so recording begins as soon as each rollout starts when recording_auto_start: true.
  • Close active recordings on loop end, manual reset, and OOB respawn.
  • Save each bundle with:
    • first_frame.png
    • prompt.txt
    • metadata.json
    • hdmap.mp4
    • inferred.mp4
  • Write first_frame.png from the first recorded inferred frame so it matches inferred.mp4, instead of using the input RGB frame.
  • Document recording options in the interactive-drive README.
  • Enable recording in the perf manifest with recording_hotkey: F9 and recording_auto_start: true.
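
The path rule from the list above (relative recording_dir values resolve from the repository root, absolute values are kept as-is) can be sketched as follows. This is a minimal illustration, not the PR's code: `resolve_recording_dir` and the example root path are hypothetical names chosen for the sketch.

```python
from pathlib import Path


def resolve_recording_dir(recording_dir: str, repo_root: Path) -> Path:
    """Resolve a manifest recording_dir value (hypothetical helper)."""
    path = Path(recording_dir)
    if path.is_absolute():
        # Absolute values are preserved as-is.
        return path
    # Relative values are anchored at the repository root,
    # not at the directory that contains the manifest.
    return repo_root / path


# Example root; the real root is detected by the CLI, which is not shown here.
root = Path("/work/flashdreams")
print(resolve_recording_dir("recordings/demo", root))
print(resolve_recording_dir("/tmp/recordings", root))
```

Anchoring at the repository root keeps output locations stable even when manifests live in different subdirectories.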

Notes

This PR supports configured single-key recording hotkeys such as F9 or r. Modifier combinations such as Alt+R still need separate modifier-state plumbing in the browser and SlangPy event paths.
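
As a rough sketch of what single-key handling could look like, the snippet below normalizes a configured hotkey, rejects modifier combinations, and checks for collisions with control keys. The function names and the control-key set are assumptions made for illustration; only the `r` reset key is mentioned in this PR's discussion.

```python
# Assumed control keys; only "r" (reset) is confirmed by this PR's discussion.
ASSUMED_CONTROL_KEYS = frozenset({"w", "a", "s", "d", "r"})


def normalize_hotkey(raw: str) -> str:
    """Return a lower-cased single-key name, rejecting modifier combos."""
    key = raw.strip().lower()
    if not key:
        raise ValueError("recording hotkey must not be empty")
    if len(key) > 1 and "+" in key:
        # Combinations such as Alt+R would need modifier-state plumbing.
        raise ValueError(f"modifier combinations are not supported: {raw!r}")
    return key


def collides_with_controls(raw: str) -> bool:
    """True when the hotkey would also trigger an assumed control key."""
    return normalize_hotkey(raw) in ASSUMED_CONTROL_KEYS


print(normalize_hotkey("F9"), collides_with_controls("F9"))
print(normalize_hotkey("r"), collides_with_controls("r"))
```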

Tests

Local verification:

uv run --no-sync --package flashdreams-omnidreams pytest \
  integrations/omnidreams/tests/interactive_drive/test_keyboard_state.py \
  integrations/omnidreams/tests/interactive_drive/test_presenter.py \
  integrations/omnidreams/tests/interactive_drive/test_recording.py \
  integrations/omnidreams/tests/interactive_drive/test_latency_loop.py \
  integrations/omnidreams/tests/interactive_drive/test_manifest.py \
  integrations/omnidreams/tests/interactive_drive/test_cli_manifest_resolution.py

Result:

55 passed in 18.86s

copy-pr-bot Bot commented Jun 6, 2026


This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

greptile-apps Bot commented Jun 6, 2026

Contributor

Greptile Summary

Adds rollout recording support to the OmniDreams interactive-drive demo. A new RecordingConfig / InteractiveDriveRecorder pair collects HD-map and inferred video frames into a bounded ring buffer and flushes a bundle (first_frame.png, prompt.txt, metadata.json, hdmap.mp4, inferred.mp4) on stop. Recording state is wired through AppConfig, KeyboardState, all three presenters, and run_main_loop, with graceful teardown on reset, OOB respawn, and session end.

  • New recording.py implements the recorder with a configurable frame cap (default 600), non-fatal I/O guards on both start() and stop(), and collision warnings when the recording hotkey overlaps a control key.
  • Manifest fields (recording_enabled, recording_dir, recording_hotkey, recording_auto_start) are parsed in world_model/manifest.py and threaded through cli.py, with relative paths resolved from the detected flashdreams repository root.
  • 55 new tests cover hotkey normalization, collision detection, bundle writing, buffer caps, start/stop error resilience, and loop integration.

Confidence Score: 5/5

Safe to merge; all recording I/O is non-fatal and teardown is handled in every exit path.

The feature is well-contained: recording failures in both start() and stop() are caught and logged rather than propagated, the frame buffer is bounded, and the recorder is cleanly closed on reset, OOB respawn, loop-end, and exceptions via app.py's finally block. The only finding is a minor session-count numbering inconsistency when a start() attempt fails, which has no runtime impact on correctness.

recording.py — session counter increment ordering; otherwise no files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| integrations/omnidreams/omnidreams/interactive_drive/recording.py | New file: RecordingConfig, InteractiveDriveRecorder, and helpers; session_count is incremented before I/O is confirmed, leading to numbering gaps when start() fails; buffer eviction uses list.pop(0), which is O(n). |
| integrations/omnidreams/omnidreams/interactive_drive/runtime/loop.py | Wires RecordingBackend protocol into run_main_loop; auto-start, toggle, frame recording, and graceful close on reset/respawn/loop-end are all handled correctly. |
| integrations/omnidreams/omnidreams/interactive_drive/app.py | Wraps the rollout loop in try/finally to ensure recorder.close() runs on session end; _build_recorder guards against missing scene; no issues. |
| integrations/omnidreams/omnidreams/interactive_drive/cli.py | Adds _find_flashdreams_root (with RuntimeWarning fallback), _resolve_recording_output_dir, and _recording_config_from_manifest; both raster and omnidreams backends populate RecordingConfig correctly. |
| integrations/omnidreams/omnidreams/interactive_drive/world_model/manifest.py | Adds parse_recording_manifest / load_recording_manifest and four recording fields to WorldModelManifest; collision warning at parse time is a nice UX touch. |
| integrations/omnidreams/omnidreams/interactive_drive/streaming_presenter.py | Recording toggle check added after reset branch with an early return; reset correctly takes priority when hotkey collides with r. |
| integrations/omnidreams/omnidreams/interactive_drive/input/keyboard.py | Adds recording_enabled / recording_hotkey state and request_recording_toggle / consume_recording_toggle_request with correct lock semantics; no issues. |
| integrations/omnidreams/tests/interactive_drive/test_recording.py | New test file; covers bundle write, video-write failure, start failure, frame cap, raster-only, and collision-suffix correctness with good coverage. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[app.py: _build_recorder] --> B[run_main_loop called per rollout]
    B --> C{recorder.auto_start?}
    C -->|yes| D[recorder.start]
    C -->|no| E[wait for hotkey]
    D --> F[per-frame: record_frame]
    E --> F
    F --> G{event?}
    G -->|toggle hotkey| H[recorder.toggle]
    H --> F
    G -->|reset requested| I[close reason=reset]
    I --> B
    G -->|OOB respawn| J[close reason=respawn]
    J --> B
    G -->|presenter closes| K[close reason=loop-end]
    K --> L[app.py finally: close reason=scene-end no-op]

Reviews (5): Last reviewed commit: "Save raster recording first frame"

Comment on lines +127 to +160
def stop(self, *, reason: str) -> Path | None:
    session = self._active_session
    if session is None:
        return None
    self._active_session = None
    metadata = {
        "scene_id": self._scene.scene_id,
        "scene_path": str(self._scene.scene_path),
        "fps": self._fps,
        "start_time_utc": session.start_time_utc,
        "stop_time_utc": datetime.now(timezone.utc).isoformat(),
        "reason": reason,
        "hdmap_frames": len(session.hdmap_frames),
        "inferred_frames": len(session.inferred_frames),
    }
    (session.output_dir / "metadata.json").write_text(
        json.dumps(metadata, indent=2, sort_keys=True) + "\n",
        encoding="utf-8",
    )
    _write_video(session.hdmap_frames, session.output_dir / "hdmap.mp4", self._fps)
    _write_video(
        session.inferred_frames,
        session.output_dir / "inferred.mp4",
        self._fps,
    )
    self._closed_paths.append(session.output_dir)
    print(
        "[recording] saved "
        f"dir={session.output_dir} "
        f"hdmap_frames={len(session.hdmap_frames)} "
        f"inferred_frames={len(session.inferred_frames)}",
        flush=True,
    )
    return session.output_dir

Contributor

P1 Unhandled exception from video write crashes the session

_active_session is cleared on line 131 before the video files are written. If _write_video raises (disk full, codec error, mediapy bug), the exception propagates through close() → _close_recording() → run_main_loop → the app.py finally block, terminating the entire interactive-drive session. The partial bundle directory (with metadata.json and prompt.txt) is also never added to closed_paths, so callers have no record of the attempt.

Wrapping the file-writing section in a try/except that logs and skips addition to closed_paths — or at least catching and logging the exception so the app continues — would make recording failures non-fatal.
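
One way the suggestion above could look is sketched below: the file-writing step is wrapped so that a failure is logged and reported as `None` instead of propagating. `flush_bundle` and the `write_videos` callback are hypothetical stand-ins for the recorder's stop() internals, not names from this PR.

```python
import json
import logging
import tempfile
from pathlib import Path
from typing import Callable, Optional

logger = logging.getLogger("recording_sketch")


def flush_bundle(
    output_dir: Path,
    metadata: dict,
    write_videos: Callable[[Path], None],
) -> Optional[Path]:
    """Write metadata and videos; return None instead of raising on failure."""
    try:
        (output_dir / "metadata.json").write_text(
            json.dumps(metadata, indent=2, sort_keys=True) + "\n",
            encoding="utf-8",
        )
        write_videos(output_dir)
    except Exception:
        # Log with traceback and keep the interactive session alive.
        logger.exception("recording flush failed dir=%s", output_dir)
        return None
    return output_dir


def failing_writer(_: Path) -> None:
    # Simulates a video encoder error such as a full disk.
    raise OSError("disk full")


with tempfile.TemporaryDirectory() as tmp:
    failed = flush_bundle(Path(tmp), {"reason": "reset"}, failing_writer)
    saved = flush_bundle(Path(tmp), {"reason": "reset"}, lambda _: None)
    print(failed, saved is not None)
```

Returning `None` on failure lets the caller decide whether to record the attempt, while the main loop continues running.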

Comment on lines +185 to +189
candidate = base
suffix = 1
while candidate.exists():
    suffix += 1
    candidate = root / f"{base.name}-{suffix:02d}"

Contributor

P2 The collision-avoidance loop initialises suffix = 1 but increments it to 2 before first use, so the first disambiguated directory produced is ...-02 while ...-01 is never generated. Starting suffix at 0 closes the gap.

Suggested change
  candidate = base
- suffix = 1
+ suffix = 0
  while candidate.exists():
      suffix += 1
      candidate = root / f"{base.name}-{suffix:02d}"

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
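
The effect of starting the suffix at 0 can be checked with a small self-contained sketch. `next_free_dir` is a hypothetical standalone version of the loop above, run against a temporary directory.

```python
import tempfile
from pathlib import Path


def next_free_dir(root: Path, base_name: str) -> Path:
    """Return root/base_name, or the first free root/base_name-NN variant."""
    candidate = root / base_name
    suffix = 0  # starting at 0 makes the first disambiguated name end in -01
    while candidate.exists():
        suffix += 1
        candidate = root / f"{base_name}-{suffix:02d}"
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    names = []
    for _ in range(3):
        path = next_free_dir(root, "scene-a")
        path.mkdir()
        names.append(path.name)
    print(names)  # ['scene-a', 'scene-a-01', 'scene-a-02']
```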

Collaborator

+1

@liruilong940607 liruilong940607 left a comment

Collaborator

I like this functionality!

We should consider exposing this functionality in the docs instead of burying it here in this config. The interactive demo now has many controls (for example, switching between RGB and HDMap, the controls that are supported on the job sticker, etc.), so we might consider opening another doc page [could be another level under omnidreams] to detail these user-facing instructions.

Comment on lines +185 to +189
candidate = base
suffix = 1
while candidate.exists():
    suffix += 1
    candidate = root / f"{base.name}-{suffix:02d}"

Collaborator

+1

@gtong-nv gtong-nv force-pushed the dev/gtong/demo-recording-option branch from 84a3838 to 75bf982 Compare June 10, 2026 21:40
Comment on lines +118 to +132
def start(self) -> None:
    if self.is_recording:
        return
    self._session_count += 1
    start_time = datetime.now(timezone.utc)
    output_dir = self._next_output_dir(start_time)
    output_dir.mkdir(parents=True, exist_ok=False)
    (output_dir / "prompt.txt").write_text(self._scene.prompt, encoding="utf-8")
    self._active_session = _RecordingSession(
        output_dir=output_dir,
        start_time_utc=start_time.isoformat(),
        hdmap_frames=[],
        inferred_frames=[],
    )
    print(f"[recording] started dir={output_dir}", flush=True)

Contributor

P1 start() I/O failures are not caught and crash the session

output_dir.mkdir() and prompt.txt write are not wrapped in a try/except. A permissions error, disk-full condition, or race on the output directory will raise and propagate through run_main_loop into the app.py try block, terminating the entire interactive-drive session — the same class of problem that was addressed for stop(). When triggered by a user hotkey mid-session or by auto_start at rollout begin, this would end the session with a Python traceback and no recording written, even though the world model was running fine. Wrapping the I/O in a try/except with a logger warning (matching the stop() pattern) would make start failures non-fatal.
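
A minimal sketch of a non-fatal start() follows. It also bumps the session counter only after the I/O succeeds, which avoids the numbering gaps noted in the review summary. `RecorderSketch` and its attributes are hypothetical and much simpler than the PR's InteractiveDriveRecorder.

```python
import logging
import tempfile
from pathlib import Path
from typing import Optional

logger = logging.getLogger("recording_sketch")


class RecorderSketch:
    """Hypothetical, stripped-down recorder showing only start() handling."""

    def __init__(self, root: Path, prompt: str) -> None:
        self._root = root
        self._prompt = prompt
        self.session_count = 0
        self.active_dir: Optional[Path] = None

    def start(self) -> bool:
        if self.active_dir is not None:
            return True  # already recording
        output_dir = self._root / f"session-{self.session_count + 1:03d}"
        try:
            output_dir.mkdir(parents=True, exist_ok=False)
            (output_dir / "prompt.txt").write_text(self._prompt, encoding="utf-8")
        except OSError as exc:
            # Permissions, disk-full, or a directory race: warn and carry on.
            logger.warning("recording start failed dir=%s error=%s", output_dir, exc)
            return False
        # Commit state only after the I/O has succeeded.
        self.session_count += 1
        self.active_dir = output_dir
        return True


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "session-001").mkdir()  # force a collision so mkdir fails
    recorder = RecorderSketch(root, "a prompt")
    started = recorder.start()
    count_after_failure = recorder.session_count
    print(started, count_after_failure)
```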

Comment on lines +182 to +194
def record_frame(self, frame: PresentedFrame) -> None:
    session = self._active_session
    if session is None:
        return
    session.hdmap_frames.append(_as_rgb_host_uint8(frame.rgb_host_uint8))
    if frame.model_rgb_host_uint8 is not None:
        inferred_frame = _as_rgb_host_uint8(frame.model_rgb_host_uint8)
        if not session.first_frame_written:
            Image.fromarray(inferred_frame).save(
                session.output_dir / "first_frame.png"
            )
            session.first_frame_written = True
        session.inferred_frames.append(inferred_frame)

Contributor

P1 Unbounded in-memory frame accumulation

hdmap_frames and inferred_frames grow without bound for the entire lifetime of a recording session. With recording_auto_start: true in the perf manifest, recording begins at rollout start and runs until reset, OOB respawn, or loop end. At a typical interactive-drive resolution (e.g. 640×360) and 10 fps, a 10-minute continuous rollout accumulates roughly 4 GB of uncompressed uint8 RGB frames per stream (about 8 GB across both streams) before stop() finally calls np.stack and writes the videos. On a system that is already GPU-memory-heavy, this can OOM the host. There is no frame cap, sliding window, or background-write path. Consider either capping the buffer (dropping the oldest frames) or streaming frames directly to the video file as they arrive.
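
The memory estimate and the capped-buffer option can be worked through in a short sketch. It assumes uncompressed uint8 RGB frames and the same resolution for both streams; `stream_bytes` is a hypothetical helper, and integers stand in for image arrays. The cap of 600 matches the default mentioned in the review summary.

```python
from collections import deque


def stream_bytes(width: int, height: int, fps: int, seconds: int, channels: int = 3) -> int:
    """Bytes held by one uncompressed uint8 stream buffered for `seconds`."""
    return width * height * channels * fps * seconds


per_stream = stream_bytes(640, 360, fps=10, seconds=600)
print(f"per stream: {per_stream / 1e9:.2f} GB, both: {2 * per_stream / 1e9:.2f} GB")

# A deque with maxlen drops the oldest entry in O(1) once the cap is reached,
# unlike list.pop(0), which shifts every remaining element.
MAX_FRAMES = 600
frames: deque = deque(maxlen=MAX_FRAMES)
for frame_index in range(1000):  # integers stand in for image arrays
    frames.append(frame_index)
print(len(frames), frames[0])
```

With the cap in place, the buffer holds at most 600 frames per stream, so memory use stays constant no matter how long the rollout runs; the trade-off is that only the most recent frames are saved.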

@gtong-nv

Collaborator Author

> I like this functionality!
>
> We should consider exposing this functionality in the docs instead of burying it here in this config. The interactive demo now has many controls (for example, switching between RGB and HDMap, the controls that are supported on the job sticker, etc.), so we might consider opening another doc page [could be another level under omnidreams] to detail these user-facing instructions.

Thanks @liruilong940607. I missed this feedback the other day.
Yeah, this is mostly for our internal use for now. I will create a doc page once the features are stabilized. Thanks for the review! Most Greptile comments have been addressed.
