fix: serialize MultiIndex level names as JSON for scipy netCDF backend #674

Merged
FabianHofmann merged 4 commits into PyPSA:master from FBumann:fix/issue-525-multiindex-scipy-netcdf
May 7, 2026

Conversation

@FBumann (Collaborator) commented May 7, 2026

Summary

  • Fixes #525 (ci: FAILED test/test_io.py::test_model_to_netcdf_with_multiindex - KeyError: ('U', 24)): to_netcdf fails with KeyError: ('U', 24) when only scipy is installed (no netCDF4 / h5netcdf). xarray falls back to scipy's netCDF3 backend, which cannot write unicode-array (<U_) attributes. linopy stores MultiIndex level names as a Python list of strings, which ends up as exactly that kind of attribute.
  • Encode the level names as a JSON-encoded scalar string instead, mirroring the existing pattern in the same file for _relaxed_registry / _piecewise_formulations. read_netcdf still accepts the legacy list form for backward compatibility with files written by older linopy versions.
  • Drops a now-dead skipif on the existing MultiIndex test (the xarray bug it guarded was fixed in xarray==2024.02.0, and pyproject.toml already pins xarray>=2024.2.0).
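The encode/decode idea from the summary can be sketched in a few lines. This is a minimal illustration, not linopy's actual code: `encode_multiindex_names` is a hypothetical name, and `parse_multiindex_attr` only borrows the helper name and the `str | Iterable[str] -> list[str]` signature mentioned in the commit messages; the body is an assumption.

```python
from __future__ import annotations

import json
from collections.abc import Iterable


def encode_multiindex_names(names: Iterable[object]) -> str:
    """Serialize level names into one JSON string (hypothetical helper)."""
    return json.dumps([str(name) for name in names])


def parse_multiindex_attr(value: str | Iterable[str]) -> list[str]:
    """Accept the JSON string form or the legacy list-like form."""
    if isinstance(value, str):
        # New format: a scalar string holding a JSON array.
        return [str(name) for name in json.loads(value)]
    # Legacy format: a list (or array) of strings; coerce items to plain str.
    return [str(name) for name in value]


encoded = encode_multiindex_names(["first", "second"])
decoded_new = parse_multiindex_attr(encoded)
decoded_legacy = parse_multiindex_attr(["first", "second"])
```

Coercing every item to `str` keeps the return type the same regardless of which backend produced the attribute.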

Repro (without this patch, on any xarray version including current 2026.4.0)

import pandas as pd
from linopy import LESS_EQUAL, Model

m = Model()
idx = pd.MultiIndex.from_tuples(
    [(1, "a"), (1, "b"), (2, "a"), (2, "b")], names=["first", "second"]
)
x = m.add_variables(4, pd.Series([8, 10, 12, 14], index=idx), name="x-var")
y = m.add_variables(0, pd.DataFrame([[1, 2], [3, 4], [5, 6], [7, 8]], index=idx), name="y-var")
m.add_constraints(x + y, LESS_EQUAL, 10, name="c1")
m.add_objective(2 * x + 3 * y)
m.to_netcdf("test.nc", engine="scipy")  # KeyError: ('U', 24)

The bug is a scipy-backend limitation, not an xarray regression — upgrading xarray does not fix it.
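The key in the error message reads like a (dtype character, item size in bytes) pair. The NumPy sketch below shows that the level names from the repro produce exactly that pair when converted to an array. Whether and where the backend performs this conversion is an assumption here, not something the PR states.

```python
import numpy as np

# Level names from the repro above.
level_names = ["first", "second"]

# A list of Python strings becomes a fixed-width unicode array.
as_array = np.asarray(level_names)

# The longest name has 6 characters and NumPy stores 4 bytes per character.
dtype_char = as_array.dtype.char
dtype_itemsize = as_array.dtype.itemsize

print(dtype_char, dtype_itemsize)
```

A single JSON string is a plain Python `str` rather than a unicode array, which is the form the fix relies on.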

Test plan

  • test_model_to_netcdf_with_multiindex_scipy_engine — exercises engine="scipy" explicitly and asserts the on-disk *_multiindex attribute is a scalar string (locks in the actual fix, not just the round-trip).
  • test_read_netcdf_with_multiindex_legacy_list_attr — backward-compat: rewrites a netCDF file with the old list-of-strings attribute and confirms read_netcdf still parses it. Skipped when netCDF4 isn't installed (scipy cannot produce that format).
  • Existing test_model_to_netcdf_with_multiindex still passes (default engine).
  • pytest test/test_io.py — 25/25 pass.
  • ruff check / ruff format --check / pre-commit clean.
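The scalar-string check from the first test can be approximated without linopy. The sketch below assumes xarray and scipy are installed and uses a hypothetical attribute name, `dim_0_multiindex`, that merely follows the `*_multiindex` pattern; it is not the test added in this PR.

```python
import json
import tempfile
from pathlib import Path

import numpy as np
import xarray as xr


def roundtrip_multiindex_attr() -> object:
    """Write a JSON-encoded attribute with engine='scipy' and read it back."""
    ds = xr.Dataset({"x": ("dim_0", np.arange(4.0))})
    # Hypothetical attribute name following the *_multiindex pattern.
    ds.attrs["dim_0_multiindex"] = json.dumps(["first", "second"])

    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "sketch.nc"
        ds.to_netcdf(path, engine="scipy")
        # load_dataset reads everything into memory and closes the file.
        loaded = xr.load_dataset(path, engine="scipy")
    return loaded.attrs["dim_0_multiindex"]


raw = roundtrip_multiindex_attr()
names = json.loads(raw)
```

Asserting on the raw attribute type, not only on the decoded names, is what distinguishes this check from a plain round-trip test.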

Notes

Debatable whether we need this, I think.

FBumann and others added 4 commits May 7, 2026 08:09
scipy's netCDF3 backend cannot write unicode-array (`<U_`) attributes,
so to_netcdf failed with `KeyError: ('U', 24)` when MultiIndex level
names were stored as a Python list of strings. Encode them as a
JSON-encoded scalar string instead (matching the existing pattern for
_relaxed_registry / _piecewise_formulations); read_netcdf still accepts
the legacy list form for backward compatibility. Also drops the now-dead
xarray 2024.1.x MultiIndex skipif since linopy requires xarray>=2024.2.0.

Closes PyPSA#525

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
- Type the helper as `str | Iterable[str] -> list[str]` instead of `Any`,
  and coerce items to `str` so backend differences (Python list vs
  numpy unicode array) don't leak into downstream code.
- Extend the scipy regression test to assert the on-disk attribute is a
  scalar string, locking in the actual fix not just the round-trip.
- Add a backward-compat test that rewrites a netCDF file with the old
  list-of-strings *_multiindex attr and confirms read_netcdf still
  parses it (skipped if netCDF4 isn't installed, since scipy can't
  produce that format).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Tightening parse_multiindex_attr from `Any` to `list[str]` exposed an
existing dict-invariance gap in xarray's set_index stub: a literal
`{dim: names}` with `names: list[str]` doesn't unify with
`Mapping[Any, Hashable | Sequence[Hashable]]` even though `list[str]`
is a valid `Sequence[Hashable]`. Add `type: ignore[dict-item]` with a
note explaining why.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Drop multi-line comment blocks that explained what the code does;
keep one short line where the WHY is non-obvious (scipy attr limit,
JSON-vs-legacy parser branch).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
@FabianHofmann FabianHofmann merged commit 73dd70b into PyPSA:master May 7, 2026
22 checks passed
Development

Successfully merging this pull request may close these issues.

ci: FAILED test/test_io.py::test_model_to_netcdf_with_multiindex - KeyError: ('U', 24)

2 participants