Launching a component with a missing binary can sometimes succeed #261

Description

@WilliamRoebuck

The log shows that the component was not started because the binary was missing. However, the state transition reports success.

Analysis results

This was caused by an incorrect check in Graph::queueHeadNodes(). In this specific case, a node is queued and fails before the function returns. The function then checks how many nodes are "in flight", sees 0, concludes that there was no work to do, and reports the state transition as an unconditional success. This is incorrect, since the transition failed.

Solution

The "nodes in flight" check has been replaced with a check on the number of executable nodes, and an additional error path has been added for the case where a node is not enqueued. See #262.
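The difference between the two checks can be sketched as follows. This is a simplified, single-threaded model and not the launch manager code: `Node`, `Result`, `enqueue()`, `queueHeadNodesFlawed()` and `queueHeadNodesFixed()` are hypothetical names, and the fast failure is modeled as happening synchronously inside `enqueue()`, whereas in the real system it depends on timing.

```cpp
#include <utility>
#include <vector>

// Hypothetical, simplified model for illustration only.
enum class Result { kSuccess, kInProgress, kFailed };

struct Node {
    bool executable;    // node is ready to run in this transition
    bool binaryExists;  // false models the missing-binary case
};

struct Graph {
    std::vector<Node> nodes;
    int inFlight = 0;

    explicit Graph(std::vector<Node> n) : nodes(std::move(n)) {}

    // Returns false if the node could not be started. A node that fails
    // fast is no longer counted as "in flight" when the caller looks.
    bool enqueue(const Node& n) {
        ++inFlight;
        if (!n.binaryExists) {
            --inFlight;
            return false;
        }
        return true;
    }

    // Flawed check: "0 nodes in flight" is read as "there was no work",
    // which cannot be told apart from "all work already failed".
    Result queueHeadNodesFlawed() {
        for (const Node& n : nodes) {
            if (n.executable) {
                enqueue(n);  // failure is silently dropped
            }
        }
        return inFlight == 0 ? Result::kSuccess : Result::kInProgress;
    }

    // Fixed check: count executable nodes up front and treat a node that
    // was not enqueued as an error.
    Result queueHeadNodesFixed() {
        int executableCount = 0;
        bool enqueueFailed = false;
        for (const Node& n : nodes) {
            if (!n.executable) {
                continue;
            }
            ++executableCount;
            if (!enqueue(n)) {
                enqueueFailed = true;
            }
        }
        if (enqueueFailed) {
            return Result::kFailed;
        }
        return executableCount == 0 ? Result::kSuccess : Result::kInProgress;
    }
};
```

In this model, a graph with a single executable node whose binary is missing yields `kSuccess` from the flawed variant and `kFailed` from the fixed one. A graph with no executable nodes still yields `kSuccess` in the fixed variant, so the "nothing to do" case is preserved.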

Error Occurrence Rate

Sporadic

How to reproduce

Run any integration test with a component binary path set incorrectly. Approximately 1 in 60 times, the transition will incorrectly succeed.
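Because the failure is sporadic, a single passing run says little. As a rough guide, the number of repetitions needed to see the bug at least once can be estimated as below. This sketch assumes the runs are independent and that the rate is exactly 1 in 60, which the observation above only approximates. `runsNeeded` is a hypothetical helper, not part of the test suite.

```cpp
#include <cassert>
#include <cmath>

// Smallest n such that 1 - (1 - failureRate)^n >= confidence,
// i.e. the chance of hitting the bug at least once in n runs.
int runsNeeded(double failureRate, double confidence) {
    assert(failureRate > 0.0 && failureRate < 1.0);
    assert(confidence > 0.0 && confidence < 1.0);
    const double n = std::log(1.0 - confidence) / std::log(1.0 - failureRate);
    return static_cast<int>(std::ceil(n));
}
```

Under these assumptions, `runsNeeded(1.0 / 60.0, 0.95)` evaluates to 179, so on the order of 180 repetitions are needed before an absence of failures becomes meaningful evidence.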

Supporting Information

[2026-06-24 09:02:48.794] [INFO] [launch_manager] 2026/6/24 9:2:48 LCLM LCLM DEBUG:  [ Start transition to MainPG/run_target_app_does_not_report_krunning_in_time for PG MainPG ]
[2026-06-24 09:02:48.794] [INFO] [launch_manager] 2026/6/24 9:2:48 LCLM LCLM DEBUG:  [ Graph::setState changes from kSuccess to kInTransition for PG 0 ( MainPG ) ]
[2026-06-24 09:02:48.794] [INFO] [launch_manager] 2026/6/24 9:2:48 LCLM LCLM DEBUG:  [ Stop Dependencies: 0 ]
[2026-06-24 09:02:48.794] [INFO] [launch_manager] 2026/6/24 9:2:48 LCLM LCLM DEBUG:  [ Stop Dependencies: 0 ]
[2026-06-24 09:02:48.794] [INFO] [launch_manager] 2026/6/24 9:2:48 LCLM LCLM DEBUG:  [ Stop Dependencies: 0 ]
[2026-06-24 09:02:48.794] [INFO] [launch_manager] 2026/6/24 9:2:48 LCLM LCLM DEBUG:  [ Start Dependencies: 0 ]
[2026-06-24 09:02:48.794] [INFO] [launch_manager] 2026/6/24 9:2:48 LCLM LCLM DEBUG:  [ Start Dependencies: 0 ]
[ Starting process 1 ( component_does_not_report_krunning_in_time ) from executable /tmp/tests/process_wrong_binary_failure/abc_complex_reporting_process ]
[2026-06-24 09:02:48.794] [INFO] [launch_manager] !!! -> 2026/6/24 9:2:48 LCLM LCLM ERROR:   [ File does not exist or is not executable: /tmp/tests/process_wrong_binary_failure/abc_complex_reporting_process ]
[2026-06-24 09:02:48.794] [INFO] [launch_manager] 2026/6/24 9:2:48 LCLM LCLM DEBUG:  [ Graph::setState changes from kInTransition to kAborting for PG 0 ( MainPG ) ]
[2026-06-24 09:02:48.794] [INFO] [launch_manager] 2026/6/24 9:2:48 LCLM LCLM DEBUG:  [ startProcess for MainPG process 1 ( component_does_not_report_krunning_in_time ) done ]
[2026-06-24 09:02:48.794] [INFO] [launch_manager] 2026/6/24 9:2:48 LCLM LCLM DEBUG:  [ Graph::setState changes from kAborting to kUndefinedState for PG 0 ( MainPG ) ]
[2026-06-24 09:02:48.794] [INFO] [launch_manager] 2026/6/24 9:2:48 LCLM LCLM DEBUG:  [ Control Client handler nudged ]
[2026-06-24 09:02:48.796] [INFO] [launch_manager] 2026/6/24 9:2:48 LCLM LCLM DEBUG:  [ Graph::setState changes from kUndefinedState to kUndefinedState for PG 0 ( MainPG ) ]
[2026-06-24 09:02:48.796] [INFO] [launch_manager] 2026/6/24 9:2:48 LCLM LCLM INFO:    [ Completed the request for PG MainPG to State MainPG/run_target_app_does_not_report_krunning_in_time in 1 ms ]
[2026-06-24 09:02:48.796] [INFO] [launch_manager] 2026/6/24 9:2:48 LCLM LCLM DEBUG:  [ Control Client handler nudged ]

Classification

Major

First Affected Release

0.7

Last Affected Release

0.7

Expected Fixed Release

0.8

Category

  • Safety Relevant
  • Security Relevant

Metadata

Labels

No labels

Type

No fields configured for Bug.

Projects

Status
In Progress

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
