Skip to content

Declarative Apriel hybrid block layout via ListDispatchConfigConverter#518

Open
jlamypoirier wants to merge 6 commits into
mainfrom
jlp_apriel_list_dispatch
Open

Declarative Apriel hybrid block layout via ListDispatchConfigConverter#518
jlamypoirier wants to merge 6 commits into
mainfrom
jlp_apriel_list_dispatch

Conversation

@jlamypoirier
Copy link
Copy Markdown
Collaborator

Summary

  • Adds ListDispatchConfigConverter to the post-Declarative checkpoint config conversion #508 declarative converter framework: a primitive for HF formats that discriminate per-position block types via a positional list (e.g. Apriel's hybrid_block_layout) and flat-merge each block's HF claims at the HF root.
  • Migrates AprielBlockConverter / AprielDecoderConverter from imperative dispatchers to weight-side helpers + dispatch registries consumed by the new primitive; deletes ~50 lines of imperative import_config / export_config / _consumed_hf_paths plumbing.
  • Net diff is +117 / -94 across external.py, apriel.py, and test_converters.py. Coverage walker, _consumed_hf_paths, and the test fixture's _children walker each get one new branch for the primitive.

Test plan

  • tests/models/test_converters.py + test_checkpoint.py + test_hf_roundtrip.py (cluster): 281 passed, matches post-Declarative checkpoint config conversion #508 baseline.
  • fast_llm_external_models/tests/ (cluster, separate invocation): 2109 passed, 42 skipped, matches baseline.
  • Local end-to-end runtime smoke: AprielBaseModelConverter.export_config.import_config round-trips both Fixed (["t"]) and Pattern (["t", "t"]) layouts.

jlamypoirier and others added 6 commits May 19, 2026 14:52
Adds a ``ListDispatchConfigConverter`` primitive that subsumes Apriel's
positional ``hybrid_block_layout`` discriminator (one entry per pipeline position
selecting per-block converters whose HF claims flat-merge at the HF root). The
``AprielBlockConverter``/``AprielDecoderConverter`` imperative dispatchers shrink
to weight-side helpers plus the dispatch registries the primitive consumes.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
Addresses /review-coarse @ 3397144:

* ``ConfigSectionConverter._consumed_hf_paths`` no longer special-cases primitive types via an
  isinstance switch. Each primitive that has dynamic claims (Nested, Dispatch, TypedDictContainer,
  ListDispatch) overrides ``ConfigConverter._consumed_hf_paths``; the aggregator collapses to a
  single union loop. The framework knows nothing about which primitives are "special".
* ``ListDispatchConfigConverter`` moves out of ``fast_llm/engine/checkpoint/external.py`` and into
  ``fast_llm/models/gpt/conversion/apriel.py`` next to its only consumer. The engine→layers import
  (``FixedBlockSequenceConfig`` / ``PatternBlockSequenceConfig``) goes away.
* The primitive's ``__init__`` now asserts ``set(layout_names) <= set(registry)`` so the
  load-bearing subset invariant is encoded in code rather than prose.
* ``AprielDecoderConverter`` is deleted; ``AprielBaseModelConverter`` carries the dispatch
  registry directly (``block_dispatcher_class``, named to avoid colliding with the parent's
  ``block_converter_class: type[ConfigSectionConverter]``) and inlines the per-block-index
  weight loop into its own ``get_converters``.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
Addresses /review-fine @ 3397144:

* ``ListDispatchConfigConverter``, ``AprielBlockConverter``, ``AprielBaseModelConverter``
  docstrings rewritten to describe what each class is, not how the refactor arrived at it
  (no references to parent overrides, downstream consumers, or design intent).
* ``ListDispatchConfigConverter.export_to`` drops the ``list(...)`` wrap around
  ``block_sequence.blocks.values()`` — the values view is iterated once.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
…est walker

Addresses /review-coarse @ 07d77c2:

* ``AprielBaseModelConverter.get_converters`` now dispatches on the decoder type before
  iterating per-position blocks. The previous code indexed ``decoder.blocks`` /
  ``decoder.pattern`` unconditionally, which crashed for ``FixedBlockSequenceConfig``
  (the shape ``ListDispatchConfigConverter.import_to`` returns for a single-entry layout).
  This was a latent bug inherited from the deleted ``AprielDecoderConverter.get_converters``.
* ``tests/models/test_converters.py::_children`` now picks up any dispatch primitive
  exposing a ``_registry: dict`` via ``hasattr`` instead of an explicit isinstance tuple
  reaching across module boundaries. Drops the now-unneeded
  ``ListDispatchConfigConverter`` import from the apriel module.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
Addresses /review-fine @ 07d77c2:

* ``ListDispatchConfigConverter.export_to``'s ``NotImplementedError`` now reports the offending
  type plus the ``fast_llm_path`` (matches ``DispatchConfigConverter``'s pattern).
* ``Assert.custom`` lambda parameter renamed ``lns`` → ``names``.
* Docstring rewrites the disjoint-vs-shared claim so it describes the actual invariant
  (per-mixer nested subdicts disjoint; shared top-level scalars cross-validate).
* ``AprielBaseModelConverter.get_converters`` uses ``decoder.expanded_pattern`` instead of the
  manual ``pattern[i % len(pattern)]`` modulo (folded in per the reviewer's note).

Co-Authored-By: Claude Opus 4.7 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant