fix: serialize MultiIndex level names as JSON for scipy netCDF backend #674
Merged
FabianHofmann merged 4 commits into PyPSA:master on May 7, 2026
scipy's netCDF3 backend cannot write unicode-array (`<U_`) attributes,
so to_netcdf failed with `KeyError: ('U', 24)` when MultiIndex level
names were stored as a Python list of strings. Encode them as a
JSON-encoded scalar string instead (matching the existing pattern for
_relaxed_registry / _piecewise_formulations); read_netcdf still accepts
the legacy list form for backward compatibility. Also drops the now-dead
xarray 2024.1.x MultiIndex skipif since linopy requires xarray>=2024.2.0.
Closes PyPSA#525
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
- Type the helper as `str | Iterable[str] -> list[str]` instead of `Any`, and coerce items to `str` so backend differences (Python list vs numpy unicode array) don't leak into downstream code.
- Extend the scipy regression test to assert the on-disk attribute is a scalar string, locking in the actual fix, not just the round-trip.
- Add a backward-compat test that rewrites a netCDF file with the old list-of-strings `*_multiindex` attr and confirms read_netcdf still parses it (skipped if netCDF4 isn't installed, since scipy can't produce that format).
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Tightening parse_multiindex_attr from `Any` to `list[str]` exposed an
existing dict-invariance gap in xarray's set_index stub: a literal
`{dim: names}` with `names: list[str]` doesn't unify with
`Mapping[Any, Hashable | Sequence[Hashable]]` even though `list[str]`
is a valid `Sequence[Hashable]`. Add `type: ignore[dict-item]` with a
note explaining why.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Drop multi-line comment blocks that explained what the code does; keep one short line where the WHY is non-obvious (scipy attr limit, JSON-vs-legacy parser branch).
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
FabianHofmann
approved these changes
May 7, 2026
Summary
- `to_netcdf` fails with `KeyError: ('U', 24)` when only `scipy` is installed (no `netCDF4` / `h5netcdf`). xarray falls back to scipy's netCDF3 backend, which cannot write unicode-array (`<U_`) attributes, the format used by `linopy` to store MultiIndex level names as a Python list of strings.
- The level names are now encoded as a JSON-encoded scalar string instead, matching the existing pattern for `_relaxed_registry` / `_piecewise_formulations`. `read_netcdf` still accepts the legacy list form for backward compatibility with files written by older `linopy` versions.
- Drops the now-dead `skipif` on the existing MultiIndex test (the xarray bug it guarded was fixed in `xarray==2024.02.0`, and `pyproject.toml` already pins `xarray>=2024.2.0`).
Repro (without this patch, on any xarray version, including the current 2026.4.0)
The bug is a scipy-backend limitation, not an xarray regression: upgrading xarray does not fix it.
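The encoding change described in the summary can be sketched with the standard library alone. This is a minimal illustration, not the code merged in this PR: `parse_multiindex_attr` is the helper name mentioned in the commits, but its body here is an assumption, and `encode_multiindex_names` is a hypothetical name introduced only for this example.

```python
from __future__ import annotations

import json
from collections.abc import Iterable


def encode_multiindex_names(names: Iterable[str]) -> str:
    """Hypothetical writer side: pack level names into one JSON string.

    A single scalar string is an attribute type the scipy netCDF3 backend
    can write, unlike an array of unicode strings.
    """
    return json.dumps([str(name) for name in names])


def parse_multiindex_attr(value: str | Iterable[str]) -> list[str]:
    """Assumed reader side: accept the JSON form or the legacy list form."""
    if isinstance(value, str):
        # New format: a JSON-encoded list stored as a scalar string.
        return [str(name) for name in json.loads(value)]
    # Legacy format: a list (or array) of strings written by older versions.
    # Coercing each item to str hides backend differences from callers.
    return [str(name) for name in value]


attr = encode_multiindex_names(["level1", "level2"])
names_from_json = parse_multiindex_attr(attr)
names_from_legacy = parse_multiindex_attr(["level1", "level2"])
```

Both `names_from_json` and `names_from_legacy` come back as plain `list[str]`, so code downstream of the reader should not need to know which on-disk form a file used.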
Test plan
- `test_model_to_netcdf_with_multiindex_scipy_engine`: exercises `engine="scipy"` explicitly and asserts the on-disk `*_multiindex` attribute is a scalar string (locks in the actual fix, not just the round-trip).
- `test_read_netcdf_with_multiindex_legacy_list_attr`: backward-compat test that rewrites a netCDF file with the old list-of-strings attribute and confirms `read_netcdf` still parses it. Skipped when `netCDF4` isn't installed (scipy cannot produce that format).
- `test_model_to_netcdf_with_multiindex` still passes (default engine).
- `pytest test/test_io.py`: 25/25 pass.
- `ruff check` / `ruff format --check` / pre-commit clean.
Notes
Debatable whether we need this, I think.