Terminator calls system_shutdown immediately when parent goes down, ignoring shutdown_timeout for placed children #86

@samrat

We're using FLAME to persist WebRTC call-routing processes across deployments. The idea is to run packet-routing processes on FLAME runner nodes via place_child with link: false, so that when the parent node restarts during a deploy, the runner and its placed children survive until the new parent can reconnect.
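
For context, here is a minimal sketch of the setup described above. It is a fragment, not our real code: the module names (`MyApp.CallRunner`, `MyApp.PacketRouter`), the `call_id` value, and the pool sizes are placeholders. The options shown are the ones we understand `FLAME.Pool` and `FLAME.place_child/3` to accept.

```elixir
# In the application supervision tree (names and sizes are placeholders):
children = [
  {FLAME.Pool,
   name: MyApp.CallRunner,
   min: 0,
   max: 10,
   max_concurrency: 50,
   # The grace period we expected link: false children to get.
   shutdown_timeout: :timer.minutes(5)}
]

# Placeholder identifier for the call being routed.
call_id = "call-123"

# Start a routing process on a runner without linking it to the caller,
# so it is not taken down together with the calling process.
{:ok, pid} =
  FLAME.place_child(
    MyApp.CallRunner,
    {MyApp.PacketRouter, call_id: call_id},
    link: false
  )
```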

The place_child docs say:

:link – Whether the caller should be linked to the remote child process to prevent long-running orphaned resources. Defaults to true. Set to false to support long-running work that you want to complete within the :shutdown_timeout of the remote runner, even when the parent process or node is terminated.

However, when the parent node goes down, FLAME.Terminator calls system_stop/2 immediately. It does not wait for shutdown_timeout before initiating shutdown:

{:noreply, system_stop(state, message)}

The shutdown_timeout seems to apply only in terminate/2, where it is used to drain in-flight RPC calls; it doesn't delay the system_shutdown() call itself. As a result, placed children with link: false get no grace period to finish their work, and the runner starts terminating immediately.

Is this the intended behavior? We read the link: false docs as implying placed children would get up to shutdown_timeout to complete their work when the parent goes away.

Should the Terminator delay calling system_stop() to honor that, or is there a different mechanism we should be using for this use case?
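
To make the question concrete, here is a rough sketch of the behavior we expected, written as a standalone GenServer rather than a patch to FLAME.Terminator. Everything in it is hypothetical: `GraceTerminator`, the `:grace_elapsed` message, and the `:on_shutdown` callback are illustrative names, and the callback stands in for whatever the real terminator does to stop the node.

```elixir
defmodule GraceTerminator do
  use GenServer

  # opts: :parent (pid to monitor), :shutdown_timeout (ms),
  # :on_shutdown (zero-arity function that performs the actual shutdown)
  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    ref = Process.monitor(Keyword.fetch!(opts, :parent))

    {:ok,
     %{
       parent_ref: ref,
       shutdown_timeout: Keyword.get(opts, :shutdown_timeout, 30_000),
       on_shutdown: Keyword.fetch!(opts, :on_shutdown)
     }}
  end

  @impl true
  def handle_info({:DOWN, ref, :process, _pid, _reason}, %{parent_ref: ref} = state) do
    # Parent is gone: start the grace period instead of shutting down right away.
    Process.send_after(self(), :grace_elapsed, state.shutdown_timeout)
    {:noreply, state}
  end

  def handle_info(:grace_elapsed, state) do
    # Grace period is over: trigger the actual shutdown now.
    state.on_shutdown.()
    {:stop, :normal, state}
  end
end
```

A real version would presumably also shut down early once all placed children have exited, instead of always waiting for the full timeout.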

Environment

  • FLAME: v0.5.3
  • Backend: flame_k8s_backend v0.5.7
