
Commit 4972708

Ensure read write many first
1 parent 6ecda1d commit 4972708

11 files changed

Lines changed: 891 additions & 650 deletions
Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
# ADR 0135: RWX volume strategy and RWO affinity fallback

**Date:** 22 April 2026

**Status:** Accepted

## Context

The Kubernetes hook implementation for GitHub Actions runners requires access to the runner's working directory (`_work`) within the dynamically created job pods. This shared access is typically managed via Persistent Volume Claims (PVCs).

Regardless of the storage strategy, job pods are always constrained to run on the same node as the runner pod to ensure consistent access to the local environment and state. The choice of volume access mode therefore determines operational flexibility and multi-pod access capability rather than pod placement.

Depending on the storage provider and cluster configuration, operators may choose between `ReadWriteMany` (RWX) or `ReadWriteOnce` (RWO) access modes. RWX is preferred because it allows multiple pods to access the volume simultaneously, providing greater operational flexibility for future scaling or monitoring scenarios. RWO restricts the volume to being mounted by a single node at a time, effectively locking access to the pods on that node.

## Decision

We have decided to establish `ReadWriteMany` (RWX) as the preferred storage strategy for the Kubernetes hook. While job pods remain pinned to the runner's node, RWX provides superior operational flexibility by allowing multiple pods (such as sidecars or auxiliary tools) to access the same volume without storage-imposed locking constraints.

For environments where RWX is unavailable or undesirable, we support a `ReadWriteOnce` (RWO) fallback strategy. This fallback is implemented using node affinity to ensure that job pods are scheduled onto the same node as the runner pod that holds the RWO volume.
### Operational Guidance

1. **Preferred Model (RWX):** Operators should configure the runner with a PVC supporting `ReadWriteMany`.
2. **Fallback Model (RWO):** If using `ReadWriteOnce`, operators must enable the Kubernetes scheduler integration by setting `ACTIONS_RUNNER_USE_KUBE_SCHEDULER=true`.
3. **Node Selection:** When scheduler integration is enabled, the hook applies a `requiredDuringSchedulingIgnoredDuringExecution` node affinity targeting the runner's current node (`kubernetes.io/hostname`); see the sketch after this list.
4. **Implementation Details:**
   - The hook determines the node name via `getCurrentNodeName()` and applies affinity in `packages/k8s/src/k8s/index.ts` (lines 101, 165).
   - The scheduler behavior is toggled by the `ACTIONS_RUNNER_USE_KUBE_SCHEDULER` environment variable, as defined in `packages/k8s/src/k8s/utils.ts` (line 16).
   - The PVC claim name defaults to `${ACTIONS_RUNNER_POD_NAME}-work` unless overridden by `ACTIONS_RUNNER_CLAIM_NAME` (`packages/k8s/src/hooks/constants.ts`, lines 27-33).
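To make the affinity shape concrete, the following TypeScript sketch builds the structure described in item 3 using the `@kubernetes/client-node` types the hook already depends on. The helper name `buildRunnerNodeAffinity` and the placeholder node name are illustrative; the actual wiring lives in `packages/k8s/src/k8s/index.ts` and is not reproduced here.

```typescript
import * as k8s from '@kubernetes/client-node'

// Illustrative helper (not the hook's actual code): pin a job pod to the
// runner's node with a required node affinity on kubernetes.io/hostname.
function buildRunnerNodeAffinity(runnerNodeName: string): k8s.V1Affinity {
  return {
    nodeAffinity: {
      requiredDuringSchedulingIgnoredDuringExecution: {
        nodeSelectorTerms: [
          {
            matchExpressions: [
              {
                key: 'kubernetes.io/hostname',
                operator: 'In',
                values: [runnerNodeName]
              }
            ]
          }
        ]
      }
    }
  }
}

// Usage sketch: attach the affinity to a job pod spec before creating the pod.
export const exampleJobPodSpec: k8s.V1PodSpec = {
  containers: [],
  affinity: buildRunnerNodeAffinity('worker-node-1') // placeholder node name
}
```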
### Non-Recommendations

We explicitly do **not** recommend the use of `spec.nodeName` for operator-driven scheduling. While the hook uses `nodeName` as a legacy fallback when `ACTIONS_RUNNER_USE_KUBE_SCHEDULER` is not set to `true` (`packages/k8s/src/k8s/index.ts`, lines 103, 167), this bypasses the Kubernetes scheduler and can lead to scheduling failures or resource imbalances. Operators should always prefer the affinity-based approach for RWO volumes.

## Alternatives

- **nodeName Bypass:** Directly setting `nodeName` bypasses the scheduler entirely. This was rejected as a recommendation because it prevents the scheduler from accounting for taints, tolerations, and resource pressure.
- **Local Volumes:** Using local volumes tied to specific nodes. This is a subset of the RWO fallback and is supported via the affinity mechanism.
## Consequences

- **Flexibility:** RWX users benefit from the ability to have multiple pods access the volume simultaneously, simplifying future operational extensions.
- **Node Coupling:** All users are coupled to the node where the runner pod is running. The hook ensures job pods are scheduled on the same node to maintain workspace integrity.
- **Configuration:** Operators must be aware of the `ACTIONS_RUNNER_USE_KUBE_SCHEDULER` toggle when moving from RWX to RWO. This toggle controls whether the hook uses `nodeName` (bypassing the scheduler) or node affinity (using the scheduler) to pin the pod to the runner's node.
## Migration Guidance

Operators migrating from an RWO setup that relied on the default `nodeName` behavior to the more robust affinity-based setup should:

1. Ensure the runner pod has the `ACTIONS_RUNNER_USE_KUBE_SCHEDULER` environment variable set to `true`.
2. Verify that the runner's ServiceAccount has the necessary permissions to list pods, so the hook can determine its own node (see the sketch after this list).
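For illustration, here is a minimal sketch of how a hook can discover its own node, which is why the pod-read permission matters. The real `getCurrentNodeName()` in `packages/k8s/src/k8s/index.ts` may differ in detail; the function name `getRunnerNodeNameSketch`, the namespace lookup via the conventional ServiceAccount file, and the pre-1.0 `@kubernetes/client-node` call style are assumptions for this sketch.

```typescript
import * as fs from 'fs'
import * as k8s from '@kubernetes/client-node'

// Illustrative sketch only: read the runner's own pod and return spec.nodeName.
// This is why the runner's ServiceAccount needs permission to read/list pods.
async function getRunnerNodeNameSketch(): Promise<string> {
  const kc = new k8s.KubeConfig()
  kc.loadFromDefault() // picks up the in-cluster config inside the runner pod

  const api = kc.makeApiClient(k8s.CoreV1Api)
  const podName = process.env.ACTIONS_RUNNER_POD_NAME as string
  // Conventional in-cluster namespace file; adjust if your setup differs.
  const namespace = fs
    .readFileSync('/var/run/secrets/kubernetes.io/serviceaccount/namespace', 'utf8')
    .trim()

  const { body } = await api.readNamespacedPod(podName, namespace)
  if (!body.spec?.nodeName) {
    throw new Error(`pod ${podName} has not been scheduled to a node yet`)
  }
  return body.spec.nodeName
}

export default getRunnerNodeNameSketch
```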
## Non-Goals

- This ADR does not recommend `nodeName` as a primary or secondary configuration path for operators.
- This ADR does not dictate specific storage providers (e.g., EBS vs. EFS vs. Azure Files), but rather the access mode strategy.

packages/k8s/README.md

Lines changed: 21 additions & 0 deletions
@@ -30,6 +30,27 @@ rules:
- The `ACTIONS_RUNNER_REQUIRE_JOB_CONTAINER` env should be set to true to prevent the runner from running any jobs outside of a container
- The runner pod should map a persistent volume claim into the `_work` directory
- The `ACTIONS_RUNNER_CLAIM_NAME` env should be set to the persistent volume claim that contains the runner's working directory, otherwise it defaults to `${ACTIONS_RUNNER_POD_NAME}-work`
- The `ACTIONS_RUNNER_USE_KUBE_SCHEDULER` env can be set to `true` to enable the Kubernetes scheduler for job pods. When set to `true`, the hook uses `nodeAffinity` to ensure job pods are scheduled correctly (essential for `ReadWriteOnce` volumes). If not set, the hook defaults to a legacy mode where job pods are pinned to the same node as the runner pod using `nodeName`. See the sketch after this list.
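To illustrate how the two environment variables above interact, here is a hedged TypeScript sketch of the claim-name defaulting and the placement toggle. The names `applyNodePinning` and `workVolume`, and the volume name `'work'`, are illustrative; the real logic lives in `packages/k8s/src/hooks/constants.ts`, `packages/k8s/src/k8s/utils.ts`, and `packages/k8s/src/k8s/index.ts`.

```typescript
import * as k8s from '@kubernetes/client-node'

// Claim name resolution as described above: explicit override, otherwise
// "<runner pod name>-work".
const claimName =
  process.env.ACTIONS_RUNNER_CLAIM_NAME ||
  `${process.env.ACTIONS_RUNNER_POD_NAME}-work`

// The workspace volume that runner and job pods mount, backed by the PVC.
// The volume name 'work' is a placeholder for this sketch.
export const workVolume: k8s.V1Volume = {
  name: 'work',
  persistentVolumeClaim: { claimName }
}

// Placement toggle sketch (not the hook's exact code): with the scheduler
// enabled the job pod gets a required node affinity; otherwise the legacy
// behavior pins it directly via nodeName.
export function applyNodePinning(
  podSpec: k8s.V1PodSpec,
  runnerNodeName: string
): void {
  if (process.env.ACTIONS_RUNNER_USE_KUBE_SCHEDULER === 'true') {
    podSpec.affinity = {
      nodeAffinity: {
        requiredDuringSchedulingIgnoredDuringExecution: {
          nodeSelectorTerms: [
            {
              matchExpressions: [
                {
                  key: 'kubernetes.io/hostname',
                  operator: 'In',
                  values: [runnerNodeName]
                }
              ]
            }
          ]
        }
      }
    }
  } else {
    // Legacy pinning: bypasses the Kubernetes scheduler entirely.
    podSpec.nodeName = runnerNodeName
  }
}
```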
## Storage Guidance

The K8s hooks require a volume shared between the runner pod and the job pods to expose the workspace and other internal directories.

### RWX (Recommended)

The preferred way to configure storage is using a `ReadWriteMany` (RWX) Persistent Volume Claim. While job pods are always pinned to the runner's node, RWX provides better operational flexibility by allowing multiple pods to access the same workspace simultaneously.

To migrate from RWO to RWX:

1. Provision a new `ReadWriteMany` StorageClass if one is not available.
2. Update your PVC definition to use `accessModes: [ReadWriteMany]` (see the sketch after this list).
3. Set `ACTIONS_RUNNER_USE_KUBE_SCHEDULER=true` to enable the scheduler-based node pinning (via affinity) instead of the default `nodeName` pinning.
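As a rough illustration of step 2, this is what an RWX-capable claim looks like when expressed with the `@kubernetes/client-node` types; the StorageClass name and size are placeholders you would replace with values from your cluster, and the claim name simply follows the default described above. Most operators will apply an equivalent YAML manifest with `kubectl` instead of creating the object programmatically.

```typescript
import * as k8s from '@kubernetes/client-node'

// Sketch of an RWX workspace claim. 'my-rwx-storageclass' and '10Gi' are
// placeholders, not values required by the hooks.
export const workClaim: k8s.V1PersistentVolumeClaim = {
  metadata: {
    name: `${process.env.ACTIONS_RUNNER_POD_NAME}-work` // default claim name
  },
  spec: {
    accessModes: ['ReadWriteMany'],
    storageClassName: 'my-rwx-storageclass',
    resources: {
      requests: { storage: '10Gi' }
    }
  }
}
```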
### RWO Fallback (Affinity-based)

If `ReadWriteMany` storage is not available, you can use `ReadWriteOnce` (RWO) storage. In this mode, all job pods must be scheduled on the same node as the runner pod that owns the PVC.

To enable this safely:

1. Ensure `ACTIONS_RUNNER_USE_KUBE_SCHEDULER` is set to `true`.
2. The hooks will automatically add a `nodeAffinity` to the job pods, ensuring they are scheduled on the same node as the runner pod (`kubernetes.io/hostname` match).

> **Note:** We do not recommend manually setting `nodeName` in the pod template, as the hooks handle node placement automatically via affinity when the scheduler is enabled.
- Some actions runner env's are expected to be set. These are set automatically by the runner.
- `RUNNER_WORKSPACE` is expected to be set to the workspace of the runner
- `GITHUB_WORKSPACE` is expected to be set to the workspace of the job

packages/k8s/src/hooks/prepare-job.ts

Lines changed: 29 additions & 48 deletions
@@ -1,4 +1,5 @@
 import * as core from '@actions/core'
+import * as io from '@actions/io'
 import * as k8s from '@kubernetes/client-node'
 import {
   JobContainerInfo,
@@ -7,33 +8,26 @@ import {
   writeToResponseFile,
   ServiceContainerInfo
 } from 'hooklib'
+import path from 'path'
 import {
   containerPorts,
-  createJobPod,
+  createPod,
   isPodContainerAlpine,
   prunePods,
   waitForPodPhases,
-  getPrepareJobTimeoutSeconds,
-  execCpToPod,
-  execPodStep
+  getPrepareJobTimeoutSeconds
 } from '../k8s'
 import {
-  CONTAINER_VOLUMES,
+  containerVolumes,
   DEFAULT_CONTAINER_ENTRY_POINT,
   DEFAULT_CONTAINER_ENTRY_POINT_ARGS,
   generateContainerName,
   mergeContainerWithOptions,
   readExtensionFromFile,
   PodPhase,
-  fixArgs,
-  prepareJobScript
+  fixArgs
 } from '../k8s/utils'
-import {
-  CONTAINER_EXTENSION_PREFIX,
-  getJobPodName,
-  JOB_CONTAINER_NAME
-} from './constants'
-import { dirname } from 'path'
+import { CONTAINER_EXTENSION_PREFIX, JOB_CONTAINER_NAME } from './constants'

 export async function prepareJob(
   args: PrepareJobArgs,
@@ -46,6 +40,7 @@ export async function prepareJob(
   await prunePods()

   const extension = readExtensionFromFile()
+  await copyExternalsToRoot()

   let container: k8s.V1Container | undefined = undefined
   if (args.container?.image) {
@@ -75,8 +70,7 @@ export async function prepareJob(

   let createdPod: k8s.V1Pod | undefined = undefined
   try {
-    createdPod = await createJobPod(
-      getJobPodName(),
+    createdPod = await createPod(
       container,
       services,
       args.container.registry,
@@ -96,13 +90,6 @@
     `Job pod created, waiting for it to come online ${createdPod?.metadata?.name}`
   )

-  const runnerWorkspace = dirname(process.env.RUNNER_WORKSPACE as string)
-
-  let prepareScript: { containerPath: string; runnerPath: string } | undefined
-  if (args.container?.userMountVolumes?.length) {
-    prepareScript = prepareJobScript(args.container.userMountVolumes || [])
-  }
-
   try {
     await waitForPodPhases(
       createdPod.metadata.name,
@@ -115,28 +102,6 @@
     throw new Error(`pod failed to come online with error: ${err}`)
   }

-  await execCpToPod(createdPod.metadata.name, runnerWorkspace, '/__w')
-
-  if (prepareScript) {
-    await execPodStep(
-      ['sh', '-e', prepareScript.containerPath],
-      createdPod.metadata.name,
-      JOB_CONTAINER_NAME
-    )
-
-    const promises: Promise<void>[] = []
-    for (const vol of args?.container?.userMountVolumes || []) {
-      promises.push(
-        execCpToPod(
-          createdPod.metadata.name,
-          vol.sourceVolumePath,
-          vol.targetVolumePath
-        )
-      )
-    }
-    await Promise.all(promises)
-  }
-
   core.debug('Job pod is ready for traffic')

   let isAlpine = false
@@ -180,8 +145,10 @@ function generateResponseFile(
   const mainContainerContextPorts: ContextPorts = {}
   if (mainContainer?.ports) {
     for (const port of mainContainer.ports) {
-      mainContainerContextPorts[port.containerPort] =
-        mainContainerContextPorts.hostPort
+      if (port.containerPort && port.hostPort) {
+        mainContainerContextPorts[port.containerPort.toString()] =
+          port.hostPort.toString()
+      }
     }
   }

@@ -217,6 +184,17 @@ function generateResponseFile(
   writeToResponseFile(responseFile, JSON.stringify(response))
 }

+async function copyExternalsToRoot(): Promise<void> {
+  const workspace = process.env['RUNNER_WORKSPACE']
+  if (workspace) {
+    await io.cp(
+      path.join(workspace, '../../externals'),
+      path.join(workspace, '../externals'),
+      { force: true, recursive: true, copySourceDirectory: false }
+    )
+  }
+}
+
 export function createContainerSpec(
   container: JobContainerInfo | ServiceContainerInfo,
   name: string,
@@ -250,7 +228,7 @@ export function createContainerSpec(
     container['environmentVariables'] || {}
   )) {
     if (value && key !== 'HOME') {
-      podContainer.env.push({ name: key, value })
+      podContainer.env.push({ name: key, value: value as string })
     }
   }

@@ -266,7 +244,10 @@ export function createContainerSpec(
     })
   }

-  podContainer.volumeMounts = CONTAINER_VOLUMES
+  podContainer.volumeMounts = containerVolumes(
+    container['userMountVolumes'],
+    jobContainer
+  )

   if (!extension) {
     return podContainer

0 commit comments
