Wait for a free GPU, claim it, set CUDA_VISIBLE_DEVICES, and run your command.
On a shared multi-GPU box without a cluster scheduler, starting a job usually
means watching nvidia-smi, picking a card by hand, exporting the env var, and
remembering to actually launch. gpu-gate is the small wait-pick-export-run
loop that does this for you, with a cooperative lock so two invocations on the
same host do not grab the same just-freed card. No daemon, no server, nothing
to administer.
$ gpu-gate run --min-free-mb 8000 -- python train.py
gpu-gate: waiting for a free GPU ...
# ... blocks until a card has >= 8 GB free, then runs train.py with
# CUDA_VISIBLE_DEVICES set to the chosen index

$ pip install gpu-gate # from PyPI, once released
$ pip install git+https://github.com/jmweb-org/gpu-gate # latest, available now

It requires an NVIDIA driver at run time. The NVML binding
(nvidia-ml-py) is pulled in automatically; the package still installs and
imports on machines without a GPU, so it is safe to add to shared requirements.
$ gpu-gate run -n 1 --min-free-mb 8000 -- python train.py --epochs 50

Everything after -- is the command. gpu-gate blocks until the requirements
are met, claims the chosen device(s), exports CUDA_VISIBLE_DEVICES, and execs
the command. Its own exit code is the command's exit code, so it drops cleanly
into scripts and CI.
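
The wait, claim, export, run sequence described above can be sketched in Python. This is a minimal illustration and not gpu-gate's actual implementation: `wait_and_run` and the `pick_devices` callback are hypothetical names, and the sketch assumes the command runs as a child process via `subprocess.run` rather than a literal exec.

```python
import os
import subprocess
import sys
import time
from typing import Callable, Optional, Sequence


def wait_and_run(
    command: Sequence[str],
    pick_devices: Callable[[], Optional[Sequence[int]]],  # hypothetical callback
    poll_seconds: float = 5.0,
    timeout_seconds: Optional[float] = None,
) -> int:
    """Poll until pick_devices() returns GPU indices, then run command on them."""
    deadline = None
    if timeout_seconds is not None:
        deadline = time.monotonic() + timeout_seconds

    while True:
        devices = pick_devices()
        if devices:
            break
        if deadline is not None and time.monotonic() >= deadline:
            return 124  # same code the README lists for a timeout
        time.sleep(poll_seconds)

    # Export the chosen indices only into the child's environment.
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in devices)

    # Forward the child's exit status unchanged.
    return subprocess.run(list(command), env=env).returncode


if __name__ == "__main__":
    # Pretend GPU 0 is always free; a real picker would query NVML.
    sys.exit(wait_and_run([sys.executable, "-c", "print('training...')"], lambda: [0]))
```

Passing the picker in as a callback keeps the loop independent of how GPU state is read, which also makes it easy to exercise on a machine without a GPU.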
Common options:
| Option | Meaning |
|---|---|
| -n, --count | Number of GPUs to claim (default 1) |
| --min-free-mb | Require at least this much free memory |
| --max-util | Skip cards busier than this percent |
| --only 0,1 | Restrict the search to these indices |
| --exclude 2,3 | Never pick these indices |
| --poll | Seconds between checks (default 5) |
| --timeout | Give up after N seconds (exit 124) |
$ export CUDA_VISIBLE_DEVICES=$(gpu-gate wait --min-free-mb 8000)

$ gpu-gate status
idx name free total util
0 NVIDIA L40S 44211 MiB 46068 MiB 3%
1 NVIDIA L40S 812 MiB 46068 MiB 97%
$ gpu-gate status --json

Exit codes:

| Code | Meaning |
|---|---|
| 0 | Command ran (its own code is forwarded) |
| 2 | Bad invocation (for example, no command after --) |
| 124 | Timed out waiting for a GPU |
| 3 | Requirements could never be met |
| 4 | Could not read GPU state (no driver / NVML error) |
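
A calling script can branch on these codes. The sketch below is one possible wrapper, not part of gpu-gate: `explain_exit` and `run_gated` are hypothetical helpers. Because the command's own exit code is forwarded, a status of 2, 3, 4, or 124 could presumably come from either gpu-gate or the command, so the messages say so.

```python
import subprocess
from typing import Sequence

# Codes taken from the table above.
GATE_CODES = {
    2: "bad invocation",
    3: "requirements could never be met",
    4: "could not read GPU state",
    124: "timed out waiting for a GPU",
}


def explain_exit(code: int) -> str:
    """Turn an exit status from `gpu-gate run` into a readable message."""
    if code == 0:
        return "command succeeded"
    if code in GATE_CODES:
        # Ambiguous: the wrapped command may have exited with the same status.
        return f"{GATE_CODES[code]} (or the command itself exited {code})"
    return f"command exited with status {code}"


def run_gated(gate_args: Sequence[str], command: Sequence[str]) -> str:
    """Run `gpu-gate run <gate_args> -- <command>` and describe the result."""
    argv = ["gpu-gate", "run", *gate_args, "--", *command]
    return explain_exit(subprocess.run(argv).returncode)
```

For example, `run_gated(["--min-free-mb", "8000", "--timeout", "600"], ["python", "train.py"])` uses only flags listed in the options table.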
A GPU is eligible when it has enough free memory, is below the utilization
ceiling, is not excluded, and is not currently locked by another gpu-gate
caller. Eligible cards are ranked by most free memory, then lowest
utilization, then index, and the top --count are chosen. The ordering is
fully deterministic.
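
The eligibility filter and ranking can be sketched as a filter followed by a sort on a three-part key. This is an illustration under stated assumptions, not the package's code: the `Gpu` record and `choose` function are hypothetical, and the utilization ceiling is assumed to be inclusive (a card exactly at `--max-util` is still eligible).

```python
from dataclasses import dataclass
from typing import AbstractSet, Iterable, List, Optional


@dataclass(frozen=True)
class Gpu:  # hypothetical record of one device's state
    index: int
    free_mb: int
    util_percent: int


def choose(
    gpus: Iterable[Gpu],
    count: int = 1,
    min_free_mb: int = 0,
    max_util: int = 100,
    only: Optional[AbstractSet[int]] = None,
    exclude: AbstractSet[int] = frozenset(),
    locked: AbstractSet[int] = frozenset(),
) -> Optional[List[int]]:
    """Return the indices of the `count` best eligible GPUs, or None."""
    eligible = [
        g for g in gpus
        if g.free_mb >= min_free_mb
        and g.util_percent <= max_util  # assumption: ceiling is inclusive
        and (only is None or g.index in only)
        and g.index not in exclude
        and g.index not in locked
    ]
    # Most free memory first, then lowest utilization, then lowest index.
    eligible.sort(key=lambda g: (-g.free_mb, g.util_percent, g.index))
    if len(eligible) < count:
        return None
    return [g.index for g in eligible[:count]]
```

Because the index is the final tie-breaker, the sort key never compares equal for two distinct devices, which is what makes the result deterministic for a given snapshot of GPU state.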
While a command runs, gpu-gate holds an advisory file lock per claimed
device under $GPU_GATE_LOCK_DIR (a per-user directory by default). Other
gpu-gate invocations skip locked devices, which avoids the classic race where
two jobs both see the same card free at the same instant. The lock is advisory:
it coordinates gpu-gate callers, not arbitrary CUDA programs.
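
One common way to build this kind of per-device advisory lock on Linux is `fcntl.flock` with a non-blocking exclusive lock. The sketch below shows that technique only; whether gpu-gate uses `flock`, and the `gpu<index>.lock` file name, are assumptions, and `try_lock_device` and `release` are hypothetical helpers.

```python
import fcntl
import os
from typing import Optional


def try_lock_device(lock_dir: str, index: int) -> Optional[int]:
    """Try to claim GPU `index`.

    Returns an open file descriptor that holds the lock, or None if another
    holder already has it. The lock lasts until the descriptor is closed.
    """
    os.makedirs(lock_dir, exist_ok=True)
    path = os.path.join(lock_dir, f"gpu{index}.lock")  # hypothetical file name
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def release(fd: int) -> None:
    """Drop the lock and close the descriptor."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
```

A useful property of `flock` is that the kernel drops the lock when the holding process exits, even after a crash, so a stale lock file on disk does not block later callers.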
MIT. See LICENSE.