Adapt aca-model to the pylcm #361 API restructure #12
Open
hmgaudecker wants to merge 21 commits into
Adopt pylcm's public `lcm/` / private `_lcm/` package split and the accompanying API reorganisation: the `Regime` two-class split, the grid renames (`Piece` → `PiecewiseGridSegment`, `*Process` classes), the `FlatParams` rename, and the `regime` → `regime_id` / `regime_name` distinction. Declare `distributed=True` on the assets grid for multi-GPU sharding, activate the beartype claw on the package, and apply the boilerplate update (hatch-vcs versioning, refreshed pre-commit hooks, expanded `.gitignore`). Co-Authored-By: Claude Opus 4.7 <[email protected]>
The terminal `dead` regime carries only a tiny `[pref_type, assets]` value function. Inheriting the distributed `assets` grid made its V-array topology claim a sharded assets axis, while the solver emits a replicated array — a mismatch that surfaces as an opaque XLA sharding error mid-solve on multi-GPU runs. Sharding a terminal 3x24 V-array buys nothing; declare `assets` non-distributed for `dead`. Co-Authored-By: Claude Opus 4.7 <[email protected]>
`GridConfig.subjects_batch_size_by_log_level` is a mapping from `log_level` string (`"off"`, `"warning"`, `"progress"`, `"debug"`) to the per-device simulate chunk size. Empty by default — the lookup helper `GridConfig.get_subjects_batch_size(log_level)` returns 0, matching the existing no-chunking behaviour. `create_model` (both baseline and aca variants) gains an optional `subjects_batch_size` keyword, forwarded to the pylcm `Model`. Callers in aca-estimation look up the value via `grid_config.get_subjects_batch_size(log_level)` keyed on the same `log_level` they pass to `model.simulate(...)`, so each task automatically gets the chunk size sized for its diagnostic budget. Co-Authored-By: Claude Opus 4.7 <[email protected]>
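A minimal sketch of the lookup described in this commit message. `GridConfigSketch` is a hypothetical stand-in, not the actual `GridConfig`; only the field name, the method name, and the fall-back to `0` come from the message, and the chunk sizes are made up for illustration.

```python
from collections.abc import Mapping
from dataclasses import dataclass, field
from types import MappingProxyType


@dataclass(frozen=True)
class GridConfigSketch:
    """Hypothetical stand-in for the real GridConfig (illustration only)."""

    # Maps a log_level string to the per-device simulate chunk size.
    subjects_batch_size_by_log_level: Mapping[str, int] = field(
        default_factory=lambda: MappingProxyType({})
    )

    def get_subjects_batch_size(self, log_level: str) -> int:
        # A missing key falls back to 0, i.e. the existing no-chunking behaviour.
        return self.subjects_batch_size_by_log_level.get(log_level, 0)


default_config = GridConfigSketch()
# Illustrative values only; real chunk sizes depend on the hardware.
tuned_config = GridConfigSketch(
    subjects_batch_size_by_log_level={"debug": 1_000, "off": 50_000}
)
```

With the empty default every log level resolves to `0`, so callers that never configure the mapping keep the unchunked path.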
aca-model now passes `subjects_batch_size` to `Model(...)` (see ec29319), which is a new field introduced on the `feat/distributed-V-arrays` branch (PR #364) on top of `refactor/phase-2-api-reorganisation` (PR #361). The CI pin still pointed at #361, so the `pip install` pulled a pylcm without `subjects_batch_size`, and every Model-construction test raised `TypeError: Model.__init__() got an unexpected keyword argument`. Re-point the CI pin to the PR #364 branch. Will move to `main` once #361 and #364 land. Co-Authored-By: Claude Opus 4.7 <[email protected]>
Solve's per-period `max_Q_over_a` integrand spans the full state grid by default; on an A100, even with the assets axis distributed across 4 GPUs, the working set runs up against the 80 GB device limit. Splaying the pref-type axis with a Python loop (`batch_size=1`) shrinks the per-kernel allocation by a factor of `n_pref_types`. The default stays `0` (single fused kernel); the production GridConfig overrides opt into `1` when the unsplayed kernel doesn't fit.
The assets axis is hardcoded as `distributed=True` in the regimes, and pylcm's grid-init guard rejects `batch_size > 0` combined with `distributed=True`. The default has to satisfy that constraint, or every fresh `GridConfig()` raises `GridInitializationError` at construction.
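The constraint can be pictured with a small stand-alone guard. This is a sketch under assumptions: `GridSketch` and the local `GridInitializationError` class are hypothetical stand-ins that only mirror the rule stated above, and the point count is arbitrary; pylcm's actual check may look different.

```python
from dataclasses import dataclass


class GridInitializationError(ValueError):
    """Local stand-in for pylcm's error of the same name."""


@dataclass(frozen=True)
class GridSketch:
    """Hypothetical grid carrying only the two flags discussed above."""

    n_points: int
    batch_size: int = 0
    distributed: bool = False

    def __post_init__(self) -> None:
        # Mirror of the described guard: a sharded axis cannot also be splayed.
        if self.distributed and self.batch_size > 0:
            msg = "batch_size > 0 cannot be combined with distributed=True"
            raise GridInitializationError(msg)


def try_build(batch_size: int) -> str:
    """Build a distributed grid and report whether the guard accepted it."""
    try:
        GridSketch(n_points=24, batch_size=batch_size, distributed=True)
    except GridInitializationError:
        return "rejected"
    return "ok"
```

Under this rule a default of `0` is the only value that lets a distributed axis construct without an override.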
The pylcm sharding-consistency validator now requires every state to carry the same `distributed` flag in every regime that declares it. Restoring `assets` to the shared grid lets the dead regime's V-array sharding match the alive regimes that transition into it. Also drops the test that codified the workaround. Co-Authored-By: Claude Opus 4.7 <[email protected]>
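A rough sketch of the consistency rule described above, written as a plain function over dictionaries. The function name, the input layout, and the `working` regime name are assumptions made for illustration (only `dead`, `assets`, and `pref_type` appear in this PR); it is not pylcm's validator.

```python
def find_inconsistent_states(
    regimes: dict[str, dict[str, bool]],
) -> dict[str, dict[str, bool]]:
    """Return states whose `distributed` flag differs across regimes.

    `regimes` maps regime name -> {state name -> distributed flag}. A state
    is only compared across the regimes that declare it.
    """
    flags_by_state: dict[str, dict[str, bool]] = {}
    for regime_name, states in regimes.items():
        for state_name, distributed in states.items():
            flags_by_state.setdefault(state_name, {})[regime_name] = distributed
    return {
        state: flags
        for state, flags in flags_by_state.items()
        if len(set(flags.values())) > 1
    }


# The earlier workaround: `dead` declares a non-distributed assets grid.
workaround = {
    "working": {"assets": True, "pref_type": False},
    "dead": {"assets": False, "pref_type": False},
}
# After this commit: `dead` shares the distributed assets grid again.
restored = {
    "working": {"assets": True, "pref_type": False},
    "dead": {"assets": True, "pref_type": False},
}
```

The workaround layout is flagged on `assets`, while the restored layout passes.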
… test. `consumption_dollars_floor` is a DAG function output; the scalar key in `params` is `consumption_equiv_floor`. Looking up the function name raised `KeyError` before the constraint check ran. Co-Authored-By: Claude Opus 4.7 <[email protected]>
…fixes. Picks up the validated cross-grid (`health_trans_cross`) and same-grid pre-65 (`health_trans_pre65`) transition matrices that now carry valid probability rows at every source-active age (51-64). Co-Authored-By: Claude Opus 4.7 <[email protected]>
…filter handles unreachable per-target. Co-Authored-By: Claude Opus 4.7 <[email protected]>
…ed_params.
The pkl key is consumed by `get_benchmark_params` via
`fixed_params.pop("max_consumption_dollars")` (benchmark.py:119) and
threaded into `inject_consumption_dollars_points`. Earlier regenerations
of the pkl dropped this entry, which broke both `tests/test_model_creation.py`
and pylcm's `benchmark-pr` GPU run (the latter installs aca-model from
this same pkl).
Value (300000.0) matches `MAX_CONSUMPTION_DOLLARS` in
`aca-data/src/aca_data/task_environment_constants.py:23`.
Co-Authored-By: Claude Opus 4.7 <[email protected]>
- `_common.py`: ContinuousGrid annotations replaced with concrete PiecewiseLinSpacedGrid (aime) and IrregSpacedGrid (consumption_dollars), matching the actual built types.
- `test_initial_conditions_extreme_assets`: import `validate_initial_conditions` through `lcm.model` instead of `_lcm.simulation.initial_conditions`.
`import plotly.io` + `pio.templates.default = ...` and the unused module-level `DataCatalog()` instance contributed ~250ms of import time to every task collection that touched aca_model. Plotting defaults belong in aca-post, where the plotting actually happens. Brings `import aca_model.baseline.regimes` from ~1.6s to ~1.07s (the remaining cost is dominated by the `import lcm` beartype claw). 🤖 Generated with Claude Code Co-Authored-By: Claude Opus 4.7 <[email protected]>
Wires `RouwenhorstAR1Process.batch_size` through `GridConfig` so the wage-residual stochastic productmap can be split with an inner Python loop. At `n_wage_res_batch_size=1` the per-target Q_and_F intermediate shrinks by `n_wage_res_gridpoints` (5), bringing the kernel under 80 GB for the ACA-overlay nongroup_nomc_* regimes where the unsplayed kernel hit 144 GB. `n_pref_type_batch_size` remains a no-op pending separate splay wiring. 🤖 Generated with Claude Code Co-Authored-By: Claude Opus 4.7 <[email protected]>
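The effect of splitting the shock axis with a Python loop can be illustrated with a toy expectation over shock points. This is a pure-Python sketch, not pylcm's kernel: the function names and inputs are hypothetical, and the element counts track only the weighted intermediate, not the accumulator.

```python
def expectation_fused(q_by_shock, weights):
    """Weight every shock's row at once, then sum over shocks."""
    # All n_shocks weighted rows are alive at the same time.
    weighted = [[w * q for q in row] for w, row in zip(weights, q_by_shock)]
    peak_elements = sum(len(row) for row in weighted)
    expectation = [sum(column) for column in zip(*weighted)]
    return expectation, peak_elements


def expectation_splayed(q_by_shock, weights):
    """Same expectation, looping over shocks so one weighted row is alive at a time."""
    expectation = [0.0] * len(q_by_shock[0])
    peak_elements = 0
    for w, row in zip(weights, q_by_shock):
        weighted_row = [w * q for q in row]
        peak_elements = max(peak_elements, len(weighted_row))
        expectation = [acc + x for acc, x in zip(expectation, weighted_row)]
    return expectation, peak_elements


# Toy inputs: 3 shock points, 2 states; the weights sum to 1.
q_by_shock = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
weights = [0.25, 0.5, 0.25]

fused_value, fused_peak = expectation_fused(q_by_shock, weights)
splayed_value, splayed_peak = expectation_splayed(q_by_shock, weights)

# Idealised estimate from the figures quoted above: 144 GB split over 5 gridpoints.
estimated_gb_per_call = 144 / 5
```

Both variants return the same expectation, while the splayed one holds a factor of `n_shocks` fewer intermediate elements at its peak. Applied to the quoted figures, and assuming the whole intermediate scales with the wage-residual axis, the per-call footprint would be about 28.8 GB.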
Threads the existing pylcm `distributed=True` semantics out of hardcoded values into GridConfig so production runs can swap which axis is sharded. Defaults preserve the assets-sharded layout. The production override pairs `assets_distributed=False + pref_type_distributed=True` for the 3-device pref_type-sharded mesh: zero cross-shard collectives (pref_type is a fixed state with no transitions) and 1/n_pref_types per-device V_arr reduction. 🤖 Generated with Claude Code Co-Authored-By: Claude Opus 4.7 <[email protected]>
Plumbs `n_health_batch_size`, `n_spousal_income_batch_size`, `n_lagged_labor_supply_batch_size`, `n_claimed_ss_batch_size` through GridConfig and into `build_states` via `Grids.grid_config`. The flags let production runs splay every non-sharded discrete state with a Python-level outer loop, compounding the per-call Q intermediate reduction by the product of the splayed axes' cardinalities — outer to the continuous block per the `_ordered_state_action_names` sort. 🤖 Generated with Claude Code Co-Authored-By: Claude Opus 4.7 <[email protected]>
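The compounding claim is plain arithmetic and can be sketched as follows. The helper names are hypothetical, the cardinalities are illustrative guesses rather than the real grid sizes, and the sketch assumes `batch_size == 0` keeps an axis fused while `batch_size > 0` loops over chunks of at most that many points.

```python
from math import prod


def per_call_points(cardinality: int, batch_size: int) -> int:
    """Grid points of one axis seen by a single kernel call."""
    return cardinality if batch_size == 0 else min(batch_size, cardinality)


def reduction_factor(
    cardinalities: dict[str, int], batch_sizes: dict[str, int]
) -> float:
    """Ratio of fused to splayed per-call size over the given discrete axes."""
    fused = prod(cardinalities.values())
    splayed = prod(
        per_call_points(n, batch_sizes.get(name, 0))
        for name, n in cardinalities.items()
    )
    return fused / splayed


# Illustrative cardinalities only; the real grids may differ.
cardinalities = {
    "health": 3,
    "spousal_income": 3,
    "lagged_labor_supply": 2,
    "claimed_ss": 2,
}
all_splayed = dict.fromkeys(cardinalities, 1)
```

With every axis splayed at `batch_size=1`, these example sizes give a 36-fold reduction (3 x 3 x 2 x 2); splaying only `health` gives a factor of 3.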
Continuous-grid sharding is rejected at pylcm grid init: every interpolation lookup over a sharded continuous axis compiles to an `all-gather` of the full V-array per device. The `assets` axis therefore cannot be sharded. The `assets_distributed` knob and its consumer in `build_grids` are removed; runs select `pref_type_distributed` (the remaining discrete-sharding option) or fall back to single-device splaying.
The `subjects_batch_size` knob was added alongside a pylcm change that has since been reverted; current pylcm `Model.__init__` does not accept it, so every `create_model` call was raising `TypeError` against unmodified pylcm. Drop the kwarg from both `baseline.create_model` and `aca.create_model`; drop the matching `subjects_batch_size_by_log_level` field and `get_subjects_batch_size` helper from `GridConfig`. If per-subject chunking turns out to be needed later, it should re-land together with the corresponding pylcm Model parameter — wired through a single source of truth, not split across two repos.
…tates
Add `lagged_labor_supply_distributed`, `claimed_ss_distributed`, and `spousal_income_distributed` to `GridConfig` and thread them through `build_states` into the inline-built `DiscreteGrid(...)` calls. Enables the 2x2 (lagged_labor_supply x claimed_ss) and 3-way (spousal_income) sharding configurations needed by the OOM / performance experiment matrix on Marvin. Co-Authored-By: Claude Opus 4.7 <[email protected]>
Summary
Adapt aca-model to pylcm's #361 API restructure:
- Public `lcm/` / private `_lcm/` package split and the API reorganisation.
- `Regime` two-class split, grid renames (`Piece` → `PiecewiseGridSegment`, `*Process` classes), the `FlatParams` rename, and the `regime` → `regime_id` / `regime_name` distinction.
- `distributed=True` on the assets grid for multi-GPU sharding.
- Boilerplate update (hatch-vcs versioning, refreshed pre-commit hooks, expanded `.gitignore`).
Test plan
- `pixi run -e tests-cpu tests`
- `pixi run -e type-checking ty`
- `prek run --all-files`
🤖 Generated with Claude Code