Skip to content

Fix FrameView staleness for sensors parented under physics bodies#5477

Open
ooctipus wants to merge 1 commit intoisaac-sim:developfrom
ooctipus:octi/fabric-frameview-fix
Open

Fix FrameView staleness for sensors parented under physics bodies#5477
ooctipus wants to merge 1 commit intoisaac-sim:developfrom
ooctipus:octi/fabric-frameview-fix

Conversation

@ooctipus
Copy link
Copy Markdown
Collaborator

@ooctipus ooctipus commented May 3, 2026

Summary

Changes

file what
source/isaaclab_physx/.../physx_manager.py PhysicsManager.step calls _update_fabric(dt, sim_time) after each simulate/fetch_results with monotonic _fabric_sim_time. Body-write block is gated by the carb setting — call is free when nothing's registered.
source/isaaclab/isaaclab/app/app_launcher.py Stop unconditionally setting /physics/fabricUpdateTransformations to _rendering_enabled(). Respect the value if a kit file / kit_args / sim_launcher already set it True.
source/isaaclab_tasks/.../sim_launcher.py New _needs_fabric_reads predicate scanning env_cfg for RayCasterCfg / CameraCfg. When detected, launch_simulation injects --/app/useFabricSceneDelegate=1 --/physics/fabricUpdateTransformations=1 into kit_args before AppLauncher (FSD must be set at kit boot). Mirrors the existing _is_kit_camera auto-detect.
compute_kit_requirements signature Now returns 4-tuple including needs_fabric_reads. Three internal test files updated.
test_views_xform_prim_fabric.py Module-level AppLauncher passes the kit_args sim_launcher would inject. Regression test test_world_pose_tracks_physics_body_parent (also added in #5476) goes green.

Performance

anymal_c rough terrain, 1024 envs, 500 steps after 100 warmup, RTX 5090:

variant per-step env-steps/s vs baseline
fix off (today's broken state) 4.481 ms 228k baseline
fix on 5.073 ms 201k +13.2%
no FrameView env 4.456 ms 230k noise

~13% physics-step cost when a heightmap / camera sensor is in the env (cost of pushing all anymal articulation transforms into Fabric every step). Zero overhead for envs without FrameView sensors. This is the cost of correct observations vs the silent training corruption we had.

Why not just enable --enable_cameras?

That works but loads RTX shaders, hydra texture, replicator, and viewport extensions — adds seconds to startup and significant GPU memory for users who don't need rendering. The _needs_fabric_reads auto-detect uses only the FSD scene-delegate flag (free routing setting; no shaders, no extensions) plus the PhysX → Fabric writer toggle.

What about Newton?

Newton has its own implementation in newton_manager.sync_transforms_to_usd that doesn't depend on omni.physx.fabric, so this PR does not affect the Newton backend. (Aside: Newton's sync is gated on pre_render, which suggests a similar bug for pure-headless Newton training — worth a separate investigation.)

Test plan

  • CI for test_world_pose_tracks_physics_body_parent[cuda:0] goes green.
  • Existing test_views_xform_prim_fabric.py contract tests still pass (18 passed locally).
  • test_ray_caster_integration.py still passes (6 passed locally).
  • Isaac-Velocity-Rough-Anymal-C-v0 height_scan obs now varies as the robot walks (verified by re-recording the heightmap video from before/after).

Linked: #5476

Sensors that read pose through FrameView (RayCaster, Camera, IMU spawned
under articulation or rigid bodies) have been silently returning their
spawn-time pose forever in headless training. Height-scan obs in
rough-terrain locomotion was a frozen patch from spawn — agents trained
on a constant input.

Root cause is that FrameView reads from Fabric, but the PhysX -> Fabric
write path is gated by three independent flags, all of which are off in
headless mode as a perf optimization (the assumption being that without a
renderer nothing reads from Fabric):

  /app/useFabricSceneDelegate           - off in the headless kit file
  /physics/fabricUpdateTransformations  - stomped to False by AppLauncher
                                          when rendering is disabled
  per-step _update_fabric(dt, time)     - PhysicsManager.forward called it
                                          with (0, 0), which is a no-op
                                          (timestamp doesn't advance)

isaac-sim#5179 switched RayCaster from PhysX tensor views (always GPU-fresh) to
FrameView and quietly broke the assumption that "no renderer => nobody
reads Fabric". The existing FrameView contract tests only assert pose at
init, before any sim.step, so CI never caught it.

This commit:

- physx_manager.py: PhysicsManager.step pushes Fabric every step with
  monotonic _fabric_sim_time so the body-write path inside FabricManager
  isn't skipped on stale-timestamp grounds. Free when no FrameView
  consumer exists (the body-write block is gated by the carb setting).

- app_launcher.py: stop unconditionally setting
  /physics/fabricUpdateTransformations to _rendering_enabled() in
  _load_extensions; if it was already True (kit_args / kit file /
  sim_launcher), keep it True.

- sim_launcher.py: new _needs_fabric_reads predicate scanning env_cfg
  for RayCasterCfg / CameraCfg. When detected, launch_simulation injects
    --/app/useFabricSceneDelegate=1 --/physics/fabricUpdateTransformations=1
  into kit_args before AppLauncher. FSD must be set at kit boot — runtime
  flips don't take effect once SimulationContext has run. Mirrors the
  existing _is_kit_camera auto-detect pattern.

  compute_kit_requirements now returns a 4-tuple (..., needs_fabric_reads).
  Three internal test files updated for the new signature.

- test_views_xform_prim_fabric.py: regression test
  test_world_pose_tracks_physics_body_parent (drops a RigidBody parent
  under gravity, asserts FabricFrameView world pose drift > 0.5m). Pre-fix
  drift = 0.0000 m, post-fix = 4.95 m. Module-level AppLauncher now passes
  the same kit_args sim_launcher would inject for FrameView envs.

Performance (anymal_c rough, 1024 envs, 500 steps after 100 warmup):

  fix off (broken)   4.481 ms/step   228k env-steps/s   baseline
  fix on             5.073 ms/step   201k env-steps/s   +13.2%
  no FrameView env   4.456 ms/step   230k env-steps/s   noise

~13% physics-step cost when a heightmap/camera sensor is in the env
(this is the actual cost of pushing 1024 anymal articulations' transforms
into Fabric every step). Zero overhead for envs without FrameView
sensors.

Newton has its own implementation in newton_manager.sync_transforms_to_usd
that doesn't depend on omni.physx.fabric, so this PR does not affect the
Newton backend.
@github-actions github-actions Bot added bug Something isn't working isaac-lab Related to Isaac Lab team labels May 3, 2026
Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Isaac Lab Review Bot

Summary

This PR fixes a critical bug where FrameView-based sensors (RayCaster, Camera, IMU) parented under physics bodies returned stale spawn-time poses in headless training. The fix correctly identifies the root cause: PhysX→Fabric transform writes are gated by settings that are off by default in headless mode. The implementation adds per-step fabric updates in PhysxManager.step() and auto-detects FrameView sensors to enable the required settings at kit boot.

Architecture Impact

Cross-module impact is well-contained:

  • PhysxManager.step() now calls _update_fabric(dt, sim_time) after physics fetch — affects all PhysX-based simulations but is gated by the carb setting
  • compute_kit_requirements() signature change (3-tuple → 4-tuple) requires updates to all callers — correctly done in 3 test files
  • sim_launcher.launch_simulation() injects kit_args when FrameView sensors detected — only affects envs with RayCaster/Camera
  • AppLauncher._load_extensions() now preserves existing fabricUpdateTransformations setting — prevents stomping config set by kit files or sim_launcher

Implementation Verdict

Minor fixes needed — One logic gap in the auto-detection path that could silently miss enabling fabric writes.

Test Coverage

The new regression test test_world_pose_tracks_physics_body_parent directly validates the fix by spawning a physics body, letting it fall under gravity, and asserting the child FrameView tracks the parent. This is an excellent targeted test. However:

  • ✅ Test correctly sets up the required kit_args to mirror sim_launcher behavior
  • ✅ Test validates the actual failure mode (drift_z > 0.5m threshold)
  • ⚠️ No unit test for the _needs_fabric_reads() predicate itself

CI Status

No CI checks available yet — critical to verify test_world_pose_tracks_physics_body_parent[cuda:0] passes.

Findings

🔴 Critical: source/isaaclab_tasks/isaaclab_tasks/utils/sim_launcher.py:280 — Fabric settings not injected when has_kit_cameras=True

The condition if needs_kit and needs_fabric_reads and not has_kit_cameras: skips fabric setting injection when cameras are present. This assumes enable_cameras=True implicitly enables fabric writes, but that's not guaranteed. If a user has a RayCaster (no camera) + a Camera with Kit renderer, the RayCaster path depends on fabric writes that may not be enabled just because cameras are on.

# Current logic at line 280:
if needs_kit and needs_fabric_reads and not has_kit_cameras:
    fabric_kit_args = ...

The enable_cameras path enables rendering which does set /physics/fabricUpdateTransformations=True via _rendering_enabled(), but this relies on indirect coupling. Consider whether the not has_kit_cameras exclusion is intentional or should be removed.

🟡 Warning: source/isaaclab_physx/isaaclab_physx/physics/physx_manager.py:176-180 — Duplicate _fabric_sim_time declaration

The class variable _fabric_sim_time is declared twice with identical comments:

  • Line 176-180 (declaration with comment)
  • The same variable appears to already exist (based on the truncated file showing it at the class level)

Looking at the diff, line 176 adds the comment and line 180 adds the variable. But the full file at line 178 also shows:

_fabric_sim_time: ClassVar[float] = 0.0

This appears intentional as the PR adds both the comment and the variable together. No actual bug, but verify there's no duplicate in the full source.

🟡 Warning: source/isaaclab_tasks/isaaclab_tasks/utils/sim_launcher.py:114-125 — _needs_fabric_reads only checks RayCasterCfg and CameraCfg

The PR description mentions IMU sensors as affected, but _needs_fabric_reads() only checks for RayCasterCfg and CameraCfg. If IMU uses FrameView for pose reads, it should be added:

def _needs_fabric_reads(node) -> bool:
    from isaaclab.sensors.ray_caster import RayCasterCfg
    # Missing: ImuCfg if it uses FrameView
    return isinstance(node, (CameraCfg, RayCasterCfg))

Verify whether ImuCfg also needs fabric reads — if so, add it to the isinstance check.

🟡 Warning: source/isaaclab/isaaclab/app/app_launcher.py:1166-1172 — Type coercion from carb settings

The comment mentions settings.get() may return 1/0 from kit_args overrides, but bool(1) is True and bool(0) is False, so this works. However, if the setting doesn't exist, settings.get() may return None, and bool(None) is False, which is the desired default. This is correct but worth a comment:

# settings.get() returns None if unset, 1/0 from kit_args, or bool from prior set_bool
existing = bool(settings.get("/physics/fabricUpdateTransformations"))  # None → False

🔵 Improvement: source/isaaclab_physx/test/sim/test_views_xform_prim_fabric.py:111-181 — Test could be more robust

The test uses a hardcoded 0.5m threshold for free-fall validation. Under g≈9.81 m/s², 1 second of free fall should produce ~4.9m displacement. The 0.5m threshold is conservative but consider:

# More explicit physics validation
expected_min_fall = 0.5 * 9.81 * 1.0**2  # ~4.9m theoretical
# Using 0.5m as threshold is ~10% of expected — very conservative, which is fine

The test is correct but a comment explaining the threshold choice would help maintainability.

🔵 Improvement: source/isaaclab_tasks/isaaclab_tasks/utils/sim_launcher.py:122-125 — Import inside function

The RayCasterCfg import is inside _needs_fabric_reads() which is called during config tree scanning. While this avoids circular imports at module load time, consider whether this could be cached or moved if performance becomes a concern (unlikely given it's called once per launch).

Claude: I'll provide a thorough review of this PR.

🤖 Isaac Lab Review Bot

Summary

This PR fixes a critical bug where FrameView-based sensors (RayCaster, Camera) parented under physics bodies returned stale spawn-time poses in headless training. The root cause was that PhysX→Fabric transform writes are gated by three flags all off in headless mode. The fix adds per-step fabric updates in PhysxManager.step() and auto-enables required settings when FrameView sensors are detected.

Architecture Impact

  • PhysxManager.step(): Now calls _update_fabric(dt, sim_time) after fetch_results() — affects all PhysX simulations but is internally no-op when setting is off
  • compute_kit_requirements(): Signature changes from 3-tuple to 4-tuple — all 3 callers in test files updated correctly
  • sim_launcher.launch_simulation(): Injects kit_args when FrameView sensors detected and cameras aren't already enabled
  • AppLauncher._load_extensions(): Preserves existing fabricUpdateTransformations setting instead of stomping it

Implementation Verdict

Minor fixes needed — One logic gap in the auto-detection that could miss enabling fabric writes in edge cases.

Test Coverage

Excellent regression test test_world_pose_tracks_physics_body_parent directly validates the fix by spawning a physics body, letting it fall under gravity, and asserting the child FrameView tracks the parent. Test correctly mirrors the kit_args that sim_launcher would inject. Missing: unit test for _needs_fabric_reads() predicate itself.

CI Status

No CI checks available yet — critical to verify test_world_pose_tracks_physics_body_parent[cuda:0] passes.

Findings

🔴 Critical: source/isaaclab_tasks/isaaclab_tasks/utils/sim_launcher.py:280 — Fabric settings skipped when env has Kit cameras but also has RayCaster

The condition if needs_kit and needs_fabric_reads and not has_kit_cameras: assumes that when has_kit_cameras=True, the enable_cameras auto-enable path will set fabricUpdateTransformations. This is true because enable_cameras triggers rendering which calls _rendering_enabled()=True in AppLauncher line 1172. However, this coupling is implicit and fragile. If a user explicitly sets enable_cameras=False while having a Kit camera (unusual but possible), RayCaster sensors would still get stale poses. The logic is likely correct for real-world usage but the dependency chain should be documented.

🟡 Warning: source/isaaclab_tasks/isaaclab_tasks/utils/sim_launcher.py:114-125 — Missing ImuCfg from _needs_fabric_reads()

The PR description explicitly lists IMU as affected ("RayCaster, Camera, IMU under articulation / rigid bodies"), but the predicate only checks CameraCfg and RayCasterCfg:

def _needs_fabric_reads(node) -> bool:
    from isaaclab.sensors.ray_caster import RayCasterCfg
    return isinstance(node, (CameraCfg, RayCasterCfg))  # Missing ImuCfg?

If ImuCfg uses FrameView for pose reads, it should be added. If IMU doesn't use FrameView, the PR description should be corrected.

🟡 Warning: source/isaaclab_physx/isaaclab_physx/physics/physx_manager.py:178 — _fabric_sim_time reset in close() but not in reset()

Looking at line 348, _fabric_sim_time is reset to 0.0 in close(), but reset() (line 221-240) doesn't reset it. This means across soft resets, the sim_time continues accumulating. While this shouldn't cause functional issues (the write path just needs monotonically increasing time), it could cause float precision issues in very long training runs. Consider whether reset(soft=False) should also reset _fabric_sim_time = 0.0.

🟡 Warning: source/isaaclab_physx/test/sim/test_views_xform_prim_fabric.py:20-24 — Test hardcodes kit_args instead of importing from sim_launcher

The test manually specifies kit_args="--/app/useFabricSceneDelegate=1 --/physics/fabricUpdateTransformations=1". If the settings in sim_launcher change, this test won't track them. Consider extracting these to a shared constant or having the test import from sim_launcher.

🔵 Improvement: source/isaaclab/isaaclab/app/app_launcher.py:1170 — Comment could clarify None handling

existing = bool(settings.get("/physics/fabricUpdateTransformations"))

The settings.get() may return None if unset (which bool() converts to False), 1/0 from kit_args, or a bool from prior set_bool(). The code is correct but a brief inline comment would help future readers understand the type coercion.

🔵 Improvement: source/isaaclab_tasks/isaaclab_tasks/utils/sim_launcher.py:282-286 — Log message could be clearer

The log message mentions "RayCaster / Camera" but if ImuCfg is added later, the message would need updating. Consider making it more generic: "FrameView-based sensors that read pose from Fabric".

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 3, 2026

Greptile Summary

This PR fixes a silent training corruption bug where FrameView-based sensors (RayCaster, Camera) parented under physics bodies returned their spawn-time pose forever in headless runs. The fix has three coordinated parts: physx_manager.py now calls _update_fabric(dt, sim_time) with a monotonically advancing timestamp after every physics step; app_launcher.py preserves an already-True /physics/fabricUpdateTransformations instead of overwriting it; and sim_launcher.py auto-injects useFabricSceneDelegate + fabricUpdateTransformations kit boot args when a RayCasterCfg or CameraCfg is detected in the env config.

Confidence Score: 4/5

Safe to merge; fixes a confirmed silent training corruption with targeted, well-tested changes and only P2 findings.

No P0 or P1 issues found. The three-part fix correctly addresses the root cause and is verified by a new regression test. Minor gaps: launcher_args=None silently drops the FSD injection, and the implicit assumption that enabling kit cameras also enables FSD is undocumented.

source/isaaclab_tasks/isaaclab_tasks/utils/sim_launcher.py — the launcher_args=None silent-drop case and the not has_kit_cameras implicit assumption warrant a closer look.

Important Files Changed

Filename Overview
source/isaaclab_physx/isaaclab_physx/physics/physx_manager.py Adds _fabric_sim_time class variable and calls _update_fabric(dt, sim_time) with a monotonically advancing timestamp after each physics step, fixing stale FrameView poses in headless mode; the forward() no-op at (0.0, 0.0) is benign but undocumented.
source/isaaclab/isaaclab/app/app_launcher.py Preserves an existing True value for /physics/fabricUpdateTransformations instead of unconditionally overwriting it; prevents sim_launcher's kit_args injection from being stomped at AppLauncher init.
source/isaaclab_tasks/isaaclab_tasks/utils/sim_launcher.py Adds _needs_fabric_reads predicate and injects FSD + fabricUpdateTransformations kit args when env contains RayCasterCfg/CameraCfg; compute_kit_requirements now returns a 4-tuple; injection silently does nothing when launcher_args=None.
source/isaaclab_physx/test/sim/test_views_xform_prim_fabric.py Bootstraps AppLauncher with FSD/fabricUpdateTransformations kit args and adds a regression test verifying that a FabricFrameView child tracks a RigidBody parent through free-fall simulation.
source/isaaclab_tasks/test/test_distributed_device_resolution.py Mechanical update to mock compute_kit_requirements to return the new 4-tuple; no logic changes.
source/isaaclab_tasks/test/test_preset_kit_decision.py Mechanical 3-tuple → 4-tuple unpacking update; no logic changes.
source/isaaclab_tasks/test/test_sim_launcher_visualizer_intent.py Mechanical mock return-value update from 3-tuple to 4-tuple; no logic changes.

Sequence Diagram

sequenceDiagram
    participant SC as sim_launcher.launch_simulation
    participant AL as AppLauncher.__init__
    participant PM as PhysxManager.step
    participant FB as omni.physx.fabric._update_fabric
    participant FV as FabricFrameView.get_world_poses

    SC->>SC: compute_kit_requirements() → needs_fabric_reads=True
    SC->>SC: inject kit_args: useFabricSceneDelegate=1 + fabricUpdateTransformations=1
    SC->>AL: AppLauncher(launcher_args)
    AL->>AL: set_bool fabricUpdateTransformations = _rendering_enabled() OR existing=True
    Note over AL: Does NOT stomp True back to False

    loop Each physics step
        PM->>PM: physx_sim.simulate(dt) + fetch_results()
        PM->>PM: _fabric_sim_time += dt
        PM->>FB: _update_fabric(dt, _fabric_sim_time)
        Note over FB: Writes PhysX body world transforms into Fabric (monotonic time)
        PM-->>FV: Fabric transform data is fresh
        FV-->>FV: get_world_poses() returns correct post-step pose
    end
Loading

Comments Outside Diff (1)

  1. source/isaaclab_physx/isaaclab_physx/physics/physx_manager.py, line 248-254 (link)

    P2 forward() calls _update_fabric(0.0, 0.0) — permanently a no-op for body transforms

    The comment in step() explains that passing (0.0, 0.0) is skipped by DirectGpuHelper::update because the timestamp doesn't advance monotonically. With the new step() path handling monotonic-time writes, the _update_fabric(0.0, 0.0) call in forward() is now a no-op for rigid-body/articulation-link transforms. This is benign because step() already flushed them, but it may confuse future readers — consider documenting why this is intentionally left as a no-op, or removing it if it truly has no remaining effect.

Reviews (1): Last reviewed commit: "Fix FrameView staleness for sensors pare..." | Re-trigger Greptile

Comment on lines +281 to +287
)
if isinstance(launcher_args, argparse.Namespace):
existing = getattr(launcher_args, "kit_args", "") or ""
launcher_args.kit_args = (existing + " " + fabric_kit_args).strip()
elif isinstance(launcher_args, dict):
existing = launcher_args.get("kit_args", "") or ""
launcher_args["kit_args"] = (existing + " " + fabric_kit_args).strip()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Fabric args silently dropped when launcher_args is None

When launcher_args=None (which is explicitly allowed by the function signature and default), neither the Namespace nor dict branch executes, so useFabricSceneDelegate and fabricUpdateTransformations are never injected into the kit boot args. A caller doing launch_simulation(env_cfg) with a headless RayCaster/Camera env will hit the same stale-pose bug this PR was meant to fix, with no warning.

Consider adding a warning log for this case:

if isinstance(launcher_args, argparse.Namespace):
    existing = getattr(launcher_args, "kit_args", "") or ""
    launcher_args.kit_args = (existing + " " + fabric_kit_args).strip()
elif isinstance(launcher_args, dict):
    existing = launcher_args.get("kit_args", "") or ""
    launcher_args["kit_args"] = (existing + " " + fabric_kit_args).strip()
else:
    logger.warning(
        "Auto-Fabric detection found FrameView-based sensors but launcher_args is None — "
        "cannot inject --/app/useFabricSceneDelegate. Sensor poses may be stale in headless mode."
    )

# config contains FrameView-based sensors. FSD has to be set at kit boot (carb
# setting changes after SimulationContext is created don't take effect — PhysX
# registers bodies during warmup), so inject it via kit_args before AppLauncher.
if needs_kit and needs_fabric_reads and not has_kit_cameras:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 not has_kit_cameras guard relies on an undocumented assumption

When a scene contains both a CameraCfg (kit renderer) and a RayCasterCfg, has_kit_cameras=True blocks the fabric kit args injection. The code assumes that enabling cameras (enable_cameras=True) causes the Kit experience file to enable FSD, which in turn satisfies the RayCaster's fabric requirement. This implicit dependency is fragile — consider adding an inline comment clarifying this so future changes to the camera-enable path don't silently break RayCasters in mixed-sensor envs.

@kellyguo11 kellyguo11 moved this to In review in Isaac Lab May 5, 2026
@kellyguo11 kellyguo11 moved this from In review to Ready in Isaac Lab May 5, 2026
@kellyguo11 kellyguo11 moved this from Ready to In progress in Isaac Lab May 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working isaac-lab Related to Isaac Lab team

Projects

Status: In progress

Development

Successfully merging this pull request may close these issues.

2 participants