jmweb-org

Small, well-made command-line tools for the daily workflow of machine learning and AI engineers. Each does one thing, runs offline, returns honest exit codes, and drops into CI. No services, no accounts, nothing to administer.

Data

dsdiff — A git-style diff between two dataset files: schema changes plus column-level distribution drift (PSI), with a CI gate.
splitcheck — Detect rows that leak between train, validation and test splits, exact and after normalization, and fail CI when they do.
pii-sweep — Scan dataset files for personally identifiable information, with a confidence per column and a CI gate, before the data leaves your hands.
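
For readers unfamiliar with PSI (Population Stability Index), the drift measure named in the dsdiff description, here is a minimal sketch of the standard formula for a categorical column. It is an illustration only, not dsdiff's implementation: the function name `psi`, the `eps` floor for empty categories, and the sample data are all assumptions. A numeric column would first need to be binned.

```python
import math
from collections import Counter


def psi(expected, actual, eps=1e-6):
    """Population Stability Index between two categorical samples.

    `expected` is the baseline column, `actual` is the new column.
    `eps` (an assumed choice) floors each proportion so that a category
    missing from one side does not cause a division by zero or log(0).
    """
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")
    expected_counts = Counter(expected)
    actual_counts = Counter(actual)
    total = 0.0
    for category in set(expected_counts) | set(actual_counts):
        e = max(expected_counts[category] / len(expected), eps)
        a = max(actual_counts[category] / len(actual), eps)
        # Each category contributes (a - e) * ln(a / e), which is never negative.
        total += (a - e) * math.log(a / e)
    return total


# Hypothetical data: the share of "a" moves from 50% to 90%.
baseline = ["a"] * 50 + ["b"] * 50
shifted = ["a"] * 90 + ["b"] * 10
drift = psi(baseline, shifted)
```

Identical distributions score 0 and the value grows as the two distributions diverge. A common rule of thumb treats values above roughly 0.25 as a significant shift, though the right threshold for a CI gate depends on the dataset.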

Environment and reproducibility

mlenv — Snapshot the full ML stack (Python, CUDA, driver, torch build, GPUs, env vars) to one file and diff two snapshots.
repro-manifest — Capture a portable manifest of a run's environment, code, config and seeds, then diff two manifests to explain why two runs differed.
gpu-gate — Wait for a free GPU, claim it, set CUDA_VISIBLE_DEVICES, and run your command. The wait-pick-export-run loop for shared multi-GPU boxes without a cluster scheduler.

Evaluation

evalgate — Decide whether an eval delta is a real regression or sampling noise, and fail CI only when it is real.
slicemap — Find the data slices where a new model regressed against an old one, ranked by how many rows are affected.
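
One common way to separate a real regression from sampling noise, as the evalgate description puts it, is a paired bootstrap over per-example scores. The sketch below illustrates that general technique and is not a description of how evalgate works: the function names, the 95% interval, the resample count, and the decision rule are assumptions.

```python
import random


def paired_bootstrap_ci(old_scores, new_scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile confidence interval for mean(new - old) on paired examples."""
    if len(old_scores) != len(new_scores) or not old_scores:
        raise ValueError("scores must be non-empty and paired one-to-one")
    rng = random.Random(seed)  # fixed seed so a CI run is repeatable
    diffs = [new - old for old, new in zip(old_scores, new_scores)]
    size = len(diffs)
    means = []
    for _ in range(n_resamples):
        # Resample examples with replacement and record the mean delta.
        sample = [diffs[rng.randrange(size)] for _ in range(size)]
        means.append(sum(sample) / size)
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[min(int((1 - alpha / 2) * n_resamples), n_resamples - 1)]
    return lower, upper


def is_real_regression(old_scores, new_scores, **kwargs):
    """Assumed rule: flag a regression only if the whole interval is below zero."""
    _, upper = paired_bootstrap_ci(old_scores, new_scores, **kwargs)
    return upper < 0
```

With this rule, a small drop whose interval still includes zero passes the gate, while a drop that holds up across resamples fails it.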

Serving and cost

servectl — Serve a model file over HTTP in one command, with health and Prometheus metrics built in.
tokenmeter — Count tokens and estimate cost for prompts before you send them, from the command line or as a CI budget gate.

Design

The tools share a deliberate shape. Each is a single command that takes a file or two and returns a verdict: a readable terminal view, --json for machines, and a meaningful exit code. Most ship a --check mode that fails CI on the change that matters, so they slot into a pipeline or a pre-commit hook without adopting a platform. They are offline-first and dependency-light, and each is small enough for one engineer to maintain.
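
The shared shape described above can be sketched in a few lines of Python. This is a hypothetical toy command, not code from any of the tools: the name `filecheck`, the byte-for-byte comparison, and the choice of exit code 1 are assumptions made for illustration. Only the `--json` and `--check` flags come from the description.

```python
import argparse
import json


def compare(path_a, path_b):
    """Toy verdict: do the two files differ byte for byte?"""
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        changed = file_a.read() != file_b.read()
    return {"a": path_a, "b": path_b, "changed": changed}


def main(argv=None):
    parser = argparse.ArgumentParser(prog="filecheck")  # hypothetical tool name
    parser.add_argument("a", help="baseline file")
    parser.add_argument("b", help="candidate file")
    parser.add_argument("--json", action="store_true", help="machine-readable output")
    parser.add_argument("--check", action="store_true", help="non-zero exit on change")
    args = parser.parse_args(argv)

    verdict = compare(args.a, args.b)
    if args.json:
        print(json.dumps(verdict))
    else:
        print("changed" if verdict["changed"] else "identical")

    # Exit code carries the verdict only in --check mode (assumed convention).
    return 1 if (args.check and verdict["changed"]) else 0


# In a real script the entry point would be: raise SystemExit(main())
```

Keeping the verdict in one dictionary means the terminal view, the JSON output, and the exit code are all derived from the same result and cannot disagree.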

More

More machine learning and MLOps work, along with the rest of these projects, is at jmwebsoluciones.com.
