You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
# ADR 0135: RWX volume strategy and RWO affinity fallback
2
+
3
+
**Date:** 22 April 2026
4
+
5
+
**Status**: Accepted
6
+
7
+
## Context
8
+
9
+
The Kubernetes hook implementation for GitHub Actions runners requires access to the runner's working directory (`_work`) within the dynamically created job pods. This shared access is typically managed via Persistent Volume Claims (PVCs).
10
+
11
+
Regardless of the storage strategy, job pods are always constrained to run on the same node as the runner pod to ensure consistent access to the local environment and state. The choice of volume access mode determines operational flexibility and multi-pod access capability rather than pod placement.
12
+
13
+
Depending on the storage provider and cluster configuration, operators may choose between `ReadWriteMany` (RWX) or `ReadWriteOnce` (RWO) access modes. RWX is preferred because it allows multiple pods to access the volume simultaneously, providing greater operational flexibility for future scaling or monitoring scenarios. RWO restricts volume access to a single pod at a time, locking the volume to that pod's specific node.
14
+
15
+
## Decision
16
+
17
+
We have decided to establish `ReadWriteMany` (RWX) as the preferred storage strategy for the Kubernetes hook. While job pods remain pinned to the runner's node, RWX provides superior operational flexibility by allowing multiple pods (such as sidecars or auxiliary tools) to access the same volume without storage-imposed locking constraints.
18
+
19
+
For environments where RWX is unavailable or undesirable, we support a `ReadWriteOnce` (RWO) fallback strategy. This fallback is implemented using node affinity to ensure that job pods are scheduled onto the same node as the runner pod that holds the RWO volume.
20
+
21
+
### Operational Guidance
22
+
23
+
1.**Preferred Model (RWX):** Operators should configure the runner with a PVC supporting `ReadWriteMany`.
24
+
2.**Fallback Model (RWO):** If using `ReadWriteOnce`, operators must enable the Kubernetes scheduler integration by setting `ACTIONS_RUNNER_USE_KUBE_SCHEDULER=true`.
25
+
3.**Node Selection:** When scheduler integration is enabled, the hook applies a `requiredDuringSchedulingIgnoredDuringExecution` node affinity targeting the runner's current node (`kubernetes.io/hostname`).
26
+
4.**Implementation Details:**
27
+
- The hook determines the node name via `getCurrentNodeName()` and applies affinity in `packages/k8s/src/k8s/index.ts` (lines 101, 165).
28
+
- The scheduler behavior is toggled by the `ACTIONS_RUNNER_USE_KUBE_SCHEDULER` environment variable, as defined in `packages/k8s/src/k8s/utils.ts` (line 16).
29
+
- The PVC claim name defaults to `${ACTIONS_RUNNER_POD_NAME}-work` unless overridden by `ACTIONS_RUNNER_CLAIM_NAME` (`packages/k8s/src/hooks/constants.ts`, lines 27-33).
30
+
31
+
### Non-Recommendations
32
+
33
+
We explicitly do **not** recommend the use of `spec.nodeName` for operator-driven scheduling. While the hook uses `nodeName` as a legacy fallback when `ACTIONS_RUNNER_USE_KUBE_SCHEDULER` is not set to `true` (`packages/k8s/src/k8s/index.ts`, lines 103, 167), this bypasses the Kubernetes scheduler and can lead to scheduling failures or resource imbalances. Operators should always prefer the affinity-based approach for RWO volumes.
34
+
35
+
## Alternatives
36
+
37
+
-**nodeName Bypass:** Directly setting `nodeName` bypasses the scheduler entirely. This was rejected as a recommendation because it prevents the scheduler from accounting for taints, tolerations, and resource pressure.
38
+
-**Local Volumes:** Using local volumes tied to specific nodes. This is a subset of the RWO fallback and is supported via the affinity mechanism.
39
+
40
+
## Consequences
41
+
42
+
-**Flexibility:** RWX users benefit from the ability to have multiple pods access the volume simultaneously, simplifying future operational extensions.
43
+
-**Node Coupling:** All users are coupled to the node where the runner pod is running. The hook ensures job pods are scheduled on the same node to maintain workspace integrity.
44
+
-**Configuration:** Operators must be aware of the `ACTIONS_RUNNER_USE_KUBE_SCHEDULER` toggle when moving from RWX to RWO. This toggle controls whether the hook uses `nodeName` (bypassing the scheduler) or node affinity (using the scheduler) to pin the pod to the runner's node.
45
+
46
+
## Migration Guidance
47
+
48
+
Operators migrating from an RWO setup that relied on default `nodeName` behavior to a more robust affinity-based setup should:
49
+
1. Ensure the runner pod has the `ACTIONS_RUNNER_USE_KUBE_SCHEDULER` environment variable set to `true`.
50
+
2. Verify that the runner's ServiceAccount has the necessary permissions to list pods (to determine its own node).
51
+
52
+
## Non-Goals
53
+
54
+
- This ADR does not recommend `nodeName` as a primary or secondary configuration path for operators.
55
+
- This ADR does not dictate specific storage providers (e.g., EBS vs. EFS vs. Azure Files), but rather the access mode strategy.
Copy file name to clipboardExpand all lines: packages/k8s/README.md
+21Lines changed: 21 additions & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -30,6 +30,27 @@ rules:
30
30
- The `ACTIONS_RUNNER_REQUIRE_JOB_CONTAINER` env should be set to true to prevent the runner from running any jobs outside of a container
31
31
- The runner pod should map a persistent volume claim into the `_work` directory
32
32
- The `ACTIONS_RUNNER_CLAIM_NAME` env should be set to the persistent volume claim that contains the runner's working directory, otherwise it defaults to `${ACTIONS_RUNNER_POD_NAME}-work`
33
+
- The `ACTIONS_RUNNER_USE_KUBE_SCHEDULER` env can be set to `true` to enable the Kubernetes scheduler for job pods. When set to `true`, the hook uses `nodeAffinity` to ensure job pods are scheduled correctly (essential for `ReadWriteOnce` volumes). If not set, the hook defaults to a legacy mode where job pods are pinned to the same node as the runner pod using `nodeName`.
34
+
35
+
## Storage Guidance
36
+
The K8s hooks require a shared volume between the runner pod and the job pods to share the workspace and other internal directories.
37
+
38
+
### RWX (Recommended)
39
+
The preferred way to configure storage is using a `ReadWriteMany` (RWX) Persistent Volume Claim. While job pods are always pinned to the runner's node, RWX provides better operational flexibility by allowing multiple pods to access the same workspace simultaneously.
40
+
41
+
To migrate from RWO to RWX:
42
+
1. Provision a new `ReadWriteMany` StorageClass if one is not available.
43
+
2. Update your PVC definition to use `accessModes: [ReadWriteMany]`.
44
+
3. Set `ACTIONS_RUNNER_USE_KUBE_SCHEDULER=true` to enable the scheduler-based node pinning (via affinity) instead of the default `nodeName` pinning.
45
+
46
+
### RWO Fallback (Affinity-based)
47
+
If `ReadWriteMany` storage is not available, you can use `ReadWriteOnce` (RWO) storage. In this mode, all job pods must be scheduled on the same node as the runner pod that owns the PVC.
48
+
49
+
To enable this safely:
50
+
1. Ensure `ACTIONS_RUNNER_USE_KUBE_SCHEDULER` is set to `true`.
51
+
2. The hooks will automatically add a `nodeAffinity` to the job pods, ensuring they are scheduled on the same node as the runner pod (`kubernetes.io/hostname` match).
52
+
53
+
> **Note:** We do not recommend manually setting `nodeName` in the pod template, as the hooks handle node placement automatically via affinity when the scheduler is enabled.
33
54
- Some actions runner env's are expected to be set. These are set automatically by the runner.
34
55
-`RUNNER_WORKSPACE` is expected to be set to the workspace of the runner
35
56
-`GITHUB_WORKSPACE` is expected to be set to the workspace of the job
0 commit comments