M2 epic: TWAP + EthFlow modules + module.toml manifests#22

Draft
brunota20 wants to merge 30 commits into nullislabs:main from bleu:dev/m2-base

@brunota20

M2 epic — consolidated for review

This PR aggregates the M2 deliverable for review. M2 ships two production-shaped modules that consume the M1 host surface end-to-end:

  • modules/twap-monitor/ — indexes ComposableCoW.ConditionalOrderCreated, polls watches via eth_call, builds OrderCreation and submits via cow-api, applies OrderPostError::retry_hint() for typed retry classification.
  • modules/ethflow-watcher/ — decodes CoWSwapEthFlow.OrderPlacement logs, lifts the embedded GPv2OrderData into an OrderCreation with Signature::Eip1271, submits, persists submitted:{uid} / dropped:{uid} / backoff:{uid} for re-delivery idempotency.

Plus module.toml manifests for both, exercising the capability declaration + subscription contracts.
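
The `submitted:{uid}` / `dropped:{uid}` / `backoff:{uid}` markers suggest a check-before-submit guard on re-delivered logs. A minimal sketch of that idea, assuming an in-memory `HashMap` stands in for the host's local-store; `Marker`, `key`, and `should_submit` are hypothetical names, not taken from the module source, and backoff timing is deliberately left out:

```rust
use std::collections::HashMap;

/// Hypothetical per-order states a watcher could persist.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Marker {
    Submitted,
    Dropped,
    Backoff,
}

impl Marker {
    fn prefix(self) -> &'static str {
        match self {
            Marker::Submitted => "submitted",
            Marker::Dropped => "dropped",
            Marker::Backoff => "backoff",
        }
    }
}

/// Builds the `<state>:<uid>` key shape described in the PR text.
fn key(marker: Marker, uid: &str) -> String {
    format!("{}:{}", marker.prefix(), uid)
}

/// Returns true when a re-delivered log should still trigger a submission.
/// Assumption: `submitted` and `dropped` are terminal states.
fn should_submit(store: &HashMap<String, Vec<u8>>, uid: &str) -> bool {
    let terminal = [Marker::Submitted, Marker::Dropped];
    !terminal.iter().any(|m| store.contains_key(&key(*m, uid)))
}

fn main() {
    let mut store: HashMap<String, Vec<u8>> = HashMap::new();
    let uid = "0xabc";
    println!("first delivery submits: {}", should_submit(&store, uid));
    store.insert(key(Marker::Submitted, uid), Vec::new());
    println!("re-delivery submits: {}", should_submit(&store, uid));
    println!("backoff key shape: {}", key(Marker::Backoff, uid));
}
```

Keeping the state in the key name means a single existence check answers the idempotency question without decoding a stored value.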

Note on diff scope

nullislabs:main is the pre-M1 baseline. Until your in-flight M1 PRs (#8 cow-api, #9 supervisor event loop, #12 ADR bundle, #15 cowprotocol patch) merge, the diff here also includes their contents. Once those land, this PR rebases clean to M2-only.

To focus the M2 review, the M2-specific paths are:

  • modules/twap-monitor/
  • modules/ethflow-watcher/
  • The cow-rs patch bump at the tip (chore(deps): bump cowprotocol patch to bleu/cow-rs main (BLEU-822 + BLEU-823 in))

Paths that belong to your M1 PRs and can be ignored for M2 review:

Validation

  • Unit tests: 20 host tests across the 2 modules (parsers, encoders, retry classifiers, idempotency guards).
  • Supervisor integration tests: each module loads under the real wit-bindgen + WitBindgenHost + supervisor dispatch path (5 integration tests; see the M3 epic PR for the SDK helper layer they consume).
  • Live testnet (Sepolia): documented in docs/operations/m2-testnet-runbook.md. Both modules boot against Sepolia public WS, subscriptions stay alive, and EthFlow round-trip was confirmed end-to-end via a real swap.cow.fi swap (decoder fired, build_eth_flow_creation rejected on the orderbook's app_data digest mismatch — the documented limitation).
  • cargo clippy --all-targets --workspace -- -D warnings clean.
  • cargo fmt --all --check clean.

cow-rs dependency

Patches cowprotocol to bleu/cow-rs main (rev 57f5f55). The fork carries:

Drop the patch once cowprotocol >= 1.0.0-alpha.4 ships upstream. Tracked as ADR-0007 + ADR-0004.

Architectural notes

  • M2 modules use the strategy / lib.rs split (strategy.rs is pure logic against &impl Host; lib.rs is the wit-bindgen adapter). ADR-0009 (added in the M3 epic) captures the decision; the M3 SDK enables it.
  • All wire-format error mapping (typed OrderPostErrorKind -> RetryAction) lives in the SDK (see M3 epic). M2 modules call classify_api_error(host_error.data.as_deref()).
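
To illustrate the call shape `classify_api_error(host_error.data.as_deref())`, here is a rough sketch of a classifier keyed on an error-type string. Everything in it is an assumption: `Action`, `classify`, the `Option<String>` payload type, and the mapping table are illustrative and do not reproduce the SDK's real classification.

```rust
/// Hypothetical outcome of classifying an orderbook rejection.
#[derive(Debug, PartialEq, Eq)]
enum Action {
    /// Treated as transient: try again on a later event.
    RetryLater,
    /// Treated as permanent: record the order as dropped and stop.
    Drop,
}

/// Classifies on an error-type string such as the `errorType` field of an
/// orderbook error body. The table is illustrative only.
fn classify(error_type: Option<&str>) -> Action {
    match error_type {
        // No structured payload: assume a transport-level failure.
        None => Action::RetryLater,
        // Assumed to be recoverable once the owner funds or approves.
        Some("InsufficientBalance") | Some("InsufficientAllowance") => Action::RetryLater,
        // Anything else is assumed to be permanent in this sketch.
        Some(_) => Action::Drop,
    }
}

fn main() {
    // Assumption: the host error carries its payload as Option<String>.
    let data: Option<String> = Some("InvalidSignature".to_string());
    // `as_deref` borrows Option<String> as Option<&str> without cloning.
    println!("{:?}", classify(data.as_deref()));
}
```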

Architectural surface for review

The host-module architecture surface lives in the M3 epic (the shepherd-sdk Host trait + ADR-0009); M2 itself consumes the M1 surface as-is and does not touch host architecture. The M3 epic PR carries the architecture write-up.

Closes BLEU-813, BLEU-818, BLEU-819, BLEU-820, BLEU-821, BLEU-822, BLEU-823, BLEU-834.

Linear milestone: M2 - TWAP + EthFlow modules.

@brunota20

Author

Fix-pass on the linearised stack: rebased dev/m2-base onto a doc-link backport so RUSTDOCFLAGS="-D warnings" cargo doc --workspace is green.

  • New tip: a0e34a3 (was 2b11f91)
  • Added: docs(nexum-engine): fix rustdoc intra-doc links after pub(crate) sweep
  • Reason: the M2 compliance commit narrowed manifest::{load, capabilities, ...} and host::{cow_orderbook, ...} re-exports from pub to pub(crate). Three M1-era intra-doc links (crate::host::impls, [load] -> module-vs-fn ambiguity in manifest/mod.rs, same in manifest/types.rs) no longer resolved. The fix matches the same change M3 already carries via a11068d (COW-1069).
  • 4 gates verified green at the new tip on a fresh detached worktree: cargo fmt --all --check, cargo clippy --workspace --all-targets --all-features -- -D warnings, cargo test --workspace --all-features (41/41 nexum-engine + 0 example), RUSTDOCFLAGS="-D warnings" cargo doc --workspace --all-features --no-deps.

@brunota20

Author

Audit-driven fix pass landed on dev/m2-base.

Before: a0e34a3
After: 184aa45

Fixes applied (3 commits):

Audit reference: bruno-brain/wiki/projects/shepherd-audits/milestone-rubric-grant-audit-2026-06-25.md
Gates green: fmt, clippy -D warnings, cargo test --workspace --all-features, RUSTDOCFLAGS=-D warnings cargo doc.

brunota20 added 28 commits June 25, 2026 20:10

Adds a `[workspace.dependencies]` table to the root manifest
consolidating every dep used by 2+ crates across the full nullis-
shepherd stack (anyhow, thiserror, tokio, futures, serde, serde_json,
tracing, tracing-subscriber, strum, alloy-*, cowprotocol, reqwest,
wit-bindgen, clap). Per-crate manifests inherit with `dep.workspace
= true`, and may add features per call site via `dep = { workspace
= true, features = ["extra"] }`. Single-consumer deps (wasmtime,
toml, redb, getrandom, url, hex, axum, rand, ...) stay per-crate.

Adds `[workspace.lints]` with light-touch defaults: `dbg_macro` and
`todo` denied via clippy, `unsafe_op_in_unsafe_fn` warned via rust.
`unsafe_code = deny` cannot be applied workspace-wide because every
wit-bindgen guest module emits an `unsafe extern "C"` shim.
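
The difference between the two lints can be sketched in a few lines. `unsafe_code` flags any use of `unsafe`, while `unsafe_op_in_unsafe_fn` only asks that unsafe operations inside an `unsafe fn` sit in their own `unsafe` block. The `shim` module and `read_u32` below are hypothetical stand-ins, not generated wit-bindgen code:

```rust
// Lint scoped to one module here; the workspace table applies it crate-wide.
#[warn(unsafe_op_in_unsafe_fn)]
mod shim {
    /// Reads one `u32` through a raw pointer.
    ///
    /// # Safety
    /// `ptr` must be non-null, aligned, and valid for a read of one `u32`.
    pub unsafe fn read_u32(ptr: *const u32) -> u32 {
        // With the lint enabled, the raw read needs its own explicit block
        // even though the enclosing function is already `unsafe`.
        unsafe { ptr.read() }
    }
}

fn main() {
    let value = 7u32;
    // SAFETY: the pointer comes from a live, aligned local variable.
    let got = unsafe { shim::read_u32(&value as *const u32) };
    println!("{got}");
}
```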

Also pre-declares `auto_impl` and `derive_more` in the workspace deps
table so future `Arc<dyn Trait>` boundaries and newtype-heavy crates
can opt in without touching the root manifest.

The version-drift failure mode (cowprotocol pinned to `1.0.0-alpha`
in nexum-engine but `1.0.0-alpha.3` in shepherd-sdk, flagged in the
2026-06-25 audit) is now impossible by construction: every consumer
inherits the single workspace pin.

Audit reference: milestone-rubric-grant-audit-2026-06-25.md, judgment
calls 1 + 3.

Replaces the `std::env::args().skip(1)` walker with a `#[derive(clap::
Parser)]` struct so the engine binary picks up `--help`, `--version`,
proper argument validation, and structured error reporting for free.

The positional surface is preserved one-for-one (`<wasm-path>
[manifest-path]`); behaviour for callers that already pass two paths
is identical. Help output now documents each argument inline rather
than hiding the usage in an anyhow message that only fires on misuse.

`clap.workspace = true` consumes the workspace dep added in the
prior commit; no new direct version pin in this crate.

Audit reference: milestone-rubric-grant-audit-2026-06-25.md, judgment
call 2.

…irection

A casual reader of `07-rpc-namespace-design.md` hitting the file
top or the "Method Allowlisting" subsection could plausibly walk
away believing the 0.2 runtime gates RPC methods on a read-only
allowlist and intercepts signing methods to delegate them to the
identity backend. The shipped host implementation does neither:
`chain::request` forwards any method string through to the
configured alloy provider.

Adds an explicit `Status: Future direction (0.3+ target)` callout
both at the file top and right above the "Method Allowlisting"
subsection so the gap between design intent and shipped behaviour
is visible without having to scroll the design narrative end-to-end.

Audit reference: milestone-rubric-grant-audit-2026-06-25.md, judgment
call 4.

Adds the dependencies the 0.2 host backends need:

- cowprotocol (1.0.0-alpha) for the cow-api submission path
  (OrderBookApi, OrderCreation, OrderUid, Chain).
- alloy-provider / -rpc-client / -transport-ws / -primitives (1.5)
  for the chain JSON-RPC dispatch. The reqwest feature on
  alloy-provider engages connect_http; the pubsub/ws features back
  eth_subscribe-class methods.
- redb (2) for local-store. Same crate cowprotocol's own watch-tower
  picked, so the dep tree does not bifurcate when both are used in
  the same workspace.
- reqwest (0.12, rustls-tls) — direct, so the import survives any
  future cowprotocol feature rearrangement.
- tracing + tracing-subscriber (env-filter + fmt) — replaces the 0.1
  eprintln! debug log so the engine can drop into a structured log
  pipeline without re-instrumenting every host call.
- thiserror (2) — typed error enums in each backend.
- tempfile + wiremock as dev-deps for the host backend tests.

Adds engine.example.toml documenting the [engine] state_dir + per-
chain RPC URLs the chain backend reads at boot; data/ is now
ignored so a local run does not leave the redb file in tree.

Replaces the 0.2 Unsupported stubs with working backends. Each
capability lives in its own host submodule so the trait impls in
main.rs stay thin (dispatch + project the backend's typed error
onto HostError).

cow_api::submit_order
  - Parses the guest's bytes as JSON cowprotocol::OrderCreation.
  - Dispatches via cowprotocol::OrderBookApi::post_order.
  - Returns the assigned OrderUid as a 0x-prefixed hex string.
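
For the `0x`-prefixed return value, a minimal sketch of a hex encoder that writes into one pre-sized buffer. The name `hex_encode` appears in the PR text, but this body is an assumption about how such a helper could look, not the engine's code:

```rust
use std::fmt::Write;

/// Hex-encodes `bytes` with a `0x` prefix into a single pre-sized `String`.
fn hex_encode(bytes: &[u8]) -> String {
    // Two characters for the prefix plus two per input byte.
    let mut out = String::with_capacity(2 + bytes.len() * 2);
    out.push_str("0x");
    for b in bytes {
        // Formatting into a `String` does not fail.
        write!(out, "{:02x}", b).expect("writing to a String is infallible");
    }
    out
}

fn main() {
    // A CoW order UID is 56 bytes: 32 (digest) + 20 (owner) + 4 (valid_to).
    let uid = [0xabu8; 56];
    let encoded = hex_encode(&uid);
    println!("{} chars: {}", encoded.len(), encoded);
}
```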

cow_api::request
  - REST passthrough. The base URL is whichever URL the pool's
    OrderBookApi client carries — so OrderBookApi::new_with_base_url
    overrides (staging, wiremock) flow through transparently.
  - Method/path validated host-side; orderbook 4xx/5xx bodies are
    surfaced verbatim so the guest can decode {errorType,description}.

chain::request
  - Raw JSON-RPC dispatch over an alloy DynProvider opened from
    engine.toml at boot. WebSocket URLs engage pubsub (eth_subscribe);
    HTTP URLs use the HTTP transport. Params are passed as
    serde_json::RawValue so alloy does not re-encode.
  - request-batch falls back to per-call dispatch (same shape as the
    earlier stub but now backed by real RPC).

local_store
  - redb file under engine_config.engine.state_dir.
  - Single shared table. Per-module namespacing is enforced
    host-side via [len:u8][module_name][raw_key] prefix on every
    key. list_keys strips the prefix before returning to the guest.
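
The `[len:u8][module_name][raw_key]` layout can be sketched as follows. This is an illustration under stated assumptions: a `BTreeMap` stands in for the redb table, the function names are hypothetical, and the 255-byte cap on module names is inferred from the one-byte length field rather than taken from the source.

```rust
use std::collections::BTreeMap;

/// Builds `[len:u8][module_name][raw_key]`. Returns `None` for an empty
/// module name or one longer than 255 bytes (it would not fit in `len`).
fn namespaced_key(module: &str, raw_key: &[u8]) -> Option<Vec<u8>> {
    let len = u8::try_from(module.len()).ok()?;
    if len == 0 {
        return None;
    }
    let mut out = Vec::with_capacity(1 + module.len() + raw_key.len());
    out.push(len);
    out.extend_from_slice(module.as_bytes());
    out.extend_from_slice(raw_key);
    Some(out)
}

/// Lists one module's keys with the namespace prefix stripped.
/// The ordered range scan visits only keys at or after the prefix and stops
/// at the first key outside the namespace.
fn list_keys(table: &BTreeMap<Vec<u8>, Vec<u8>>, module: &str) -> Vec<Vec<u8>> {
    let Some(prefix) = namespaced_key(module, b"") else {
        return Vec::new();
    };
    table
        .range(prefix.clone()..)
        .take_while(|(k, _)| k.starts_with(&prefix))
        .map(|(k, _)| k[prefix.len()..].to_vec())
        .collect()
}

fn main() {
    let mut table = BTreeMap::new();
    table.insert(namespaced_key("twap", b"k1").unwrap(), vec![1u8]);
    table.insert(namespaced_key("twapx", b"k2").unwrap(), vec![2u8]);
    // Only "k1" is listed: the length byte keeps "twap" and "twapx" apart.
    println!("{:?}", list_keys(&table, "twap"));
}
```

The length byte is what prevents one module's name plus key from colliding with another's: `"ab" + "c"` and `"a" + "bc"` start with different length bytes.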

logging
  - Routes through tracing::event! tagged with module=<namespace>.
  - Engine boot installs an EnvFilter-based subscriber; RUST_LOG
    overrides the engine.toml log_level.

identity / remote-store / messaging / http stay at Unsupported per
the 0.2 roadmap (keystore / Swarm / Waku land in 0.3).

Tests (14, all green):
  - cow_orderbook: pool default chains, unknown-chain typing, REST
    GET passthrough, relative-path resolution, unknown-method
    rejection, submit_order round-trip — last three under wiremock
    so the full HTTP path is exercised without hitting api.cow.fi.
  - provider_pool: empty pool surfaces UnknownChain.
  - local_store: roundtrip, namespace isolation, delete, list_keys
    prefix-stripping, empty-namespace rejection.

End-to-end against modules/example: example.wasm loads under the
new wiring, logs init + on_event through the tracing pipeline.

…ed_crate_dependencies, drop redundant map_err)
PR #9 specific:
- main: warn + return when block/log streams end (WebSocket dropped)
- supervisor: simplify dispatch_block by extracting chain_id before move
- supervisor: temp_local_store returns (TempDir, LocalStore) instead of leaking
- README: correct engine.toml chain syntax to [chains.<id>] with rpc_url

Rebased from PR #8:
- local_store_redb: table.range() instead of iter() for O(matching) keys
- provider_pool: dedupe method clone on the success path
- main: hex_encode writes into the pre-allocated buffer
- cow_orderbook: drop blank line nit
- manifest: collapse nested if and use ? operator (clippy)
- alloy_rpc_client / alloy_transport(_ws) imports as _ to satisfy
  unused_crate_dependencies.

Move the manifest.rs monolith into a directory module with four
focused submodules (types, load, capabilities, error). Includes the
Subscription enum and the four PR #9 tests for subscription parsing.

Behaviour unchanged - pure code motion.

main.rs went from 739 lines of mixed bootstrap + 8 Host trait impls +
CLI parser + event loop to ~125 lines of pure orchestration. New
layout:

- bindings.rs: wasmtime::component::bindgen!() moved out so other
  modules can name the generated types.
- cli.rs: Cli struct + manual parser.
- host/state.rs: HostState + WasiView impl.
- host/error.rs: unimplemented / internal_error / hex_encode helpers.
- host/impls/{chain,cow_api,identity,local_store,remote_store,messaging,
  logging,clock,random,http,types}.rs: one Host trait impl per file.
- runtime/limits.rs: DEFAULT_FUEL_PER_EVENT + DEFAULT_MEMORY_LIMIT.
- runtime/event_loop.rs: open_block_streams, open_log_streams, run,
  wait_for_shutdown_signal, TaggedBlockStream, TaggedLogStream.

Adding a new capability is now a single new file under host/impls/
rather than a 60-80 line diff in main.rs.

local_store_redb.rs was 89% tests, cow_orderbook.rs was 60%, and
supervisor.rs was 32% (205 lines absolute). Promote each to a directory
module with the test suite living in a sibling tests.rs so impl-side
diffs stop competing with test churn for attention.

Per ADR-0001 (module.toml schema), authored for the two M2
modules:

twap-monitor / module.toml
- capabilities.required = ["logging", "local-store", "chain",
  "cow-api"] — matches the Rust imports the BLEU-826/827/828
  paths exercise.
- [[subscription]] log on Sepolia (chain_id 11155111) against
  ComposableCoW (0xfdaFc9d1902f4e0b84f65F49f244b32b31013b74)
  with topic-0 keccak256(
    "ConditionalOrderCreated(address,(address,bytes32,bytes))"
  ) = 0x2cceac5555b0ca45a3744ced542f54b56ad2eb45e521962372eef212a2cbf361.
- [[subscription]] block on Sepolia for the BLEU-827 poll loop.

ethflow-watcher / module.toml
- Same capability set (chain reserved for a future eth_call —
  e.g. read the EthFlow refund pointer — without churning the
  manifest).
- [[subscription]] log on Sepolia against CoWSwapEthFlow
  production (0xbA3cB449bD2B4ADddBc894D8697F5170800EAdeC) with
  topic-0 keccak256(
    "OrderPlacement(address,(address,address,address,uint256,uint256,
     uint32,bytes32,uint256,bytes32,bool,bytes32,bytes32),
     (uint8,bytes),bytes)"
  ) = 0xcf5f9de2984132265203b5c335b25727702ca77262ff622e136baa7362bf1da9.

Both [capabilities.http].allow stay empty: all outbound HTTP
flows through the cow-api capability, which routes via the
host's pinned orderbook URL.

The content hash field is the 0.2 placeholder all-zero sha256;
0.3 will validate it against the loaded component bytes.

Linear: BLEU-834. Ref ADR-0001.

Three threads from the internal review mirror of upstream nullislabs/shepherd PR #17:

1. ethflow-watcher/module.toml capabilities: move `chain` from required to optional. The comment on the original manifest already said the module does not call `chain` today; declaring it as required widened the grant for a capability the module does not exercise. Optional keeps "future-proofing" (BLEU-855 can use it without manifest churn) without violating least-privilege.

2. ethflow-watcher/module.toml subscription comment: soften the "identical on every chain" claim. cow-rs::ETH_FLOW_PRODUCTION is identical across chains today, but unlike ComposableCoW's CREATE2 address, EthFlow has had multiple per-network and per-version deployments historically. Multi-chain config in M5 must re-check per `chain_id` instead of assuming the address carries over. The address itself stays unchanged: 0xbA3cB449bD2B4ADddBc894D8697F5170800EAdeC is verified against the live Sepolia deployment (event firing observed in the COW-1064 dry-run on 2026-06-18 + cow-rs canonical constant + multiple load-test runs).

3. README.md module manifest example: the documented `address` field said `0xC92E8bdf79f0507f65a392b0ab4667716BFE0110` labeled "ComposableCoW", but that is the GPv2VaultRelayer (per scripts/lib.sh). ComposableCoW is `0xfdaFc9d1902f4e0b84f65F49f244b32b31013b74`. Fixed the address; expanded the comment to clarify it is the canonical CREATE2 address (same on every supported chain).

Stays on `feat/m2-module-manifests-bleu-834` as a stacked branch so upstream PR #17 + internal mirror PR #54 can see the fixes as a separate, atomic commit.

…ance)

Filtered subset of the compliance applied in PRs #66/#67 of bleu/nullis-shepherd,
restricted to files that exist on the M2 epic head. M3+ files (shepherd-sdk, examples,
backtest, deploy artifacts) and M4-coupled hunks (ProviderError typed-source variants,
JoinSet reconnect tasks, supervisor restart helper) are skipped — they land via their
own upstream PRs.

Brings M2 epic in line with the repo-wide rust rubric (typed errors, no anyhow in libs,
em-dash sweep, #[non_exhaustive] on public error enums).

The M2 compliance pass (2b11f91) narrowed several manifest/host re-exports
from `pub` to `pub(crate)`. Three intra-doc links inherited from the wider
M1 docs no longer resolve under `-D warnings`:

- `crate::host::impls` — module is `mod impls;` (always private); the link
  doc-rendered as code at the M1-era visibility. Demote to a plain code
  span; the path is still grep-able and accurate.
- `manifest::mod` link to `[load]` — ambiguous now that `pub(crate) use
  load::{... load};` makes `load` both a module and a function. Use
  `mod@load` to disambiguate to the module (matches the surrounding
  prose, which describes the file's responsibilities).
- `manifest::types` link to `[super::load]` — same ambiguity, same fix:
  `mod@super::load`.
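
The module-versus-function ambiguity can be reproduced in miniature. This sketch uses a hypothetical `Manifest` type and a stand-in loader; only the `mod@` disambiguator and the re-export pattern come from the commit text, and `fn@` is shown for contrast:

```rust
pub mod manifest {
    /// Parsed manifest (hypothetical placeholder type).
    ///
    /// Parsing lives in the [`load`](mod@load) module, whose entry point is
    /// [`load()`](fn@load). A bare `[load]` link would be ambiguous because
    /// the name resolves to both a module and a function.
    pub struct Manifest {
        pub line_count: usize,
    }

    pub mod load {
        use super::Manifest;

        /// Stand-in loader: counts lines instead of parsing TOML.
        pub fn load(src: &str) -> Manifest {
            Manifest { line_count: src.lines().count() }
        }
    }

    // The re-export puts a function named `load` next to the module `load`.
    pub use load::load;
}

fn main() {
    // Both paths name the same function.
    let via_reexport = manifest::load("a = 1\nb = 2");
    let via_module = manifest::load::load("a = 1\nb = 2");
    println!("{} {}", via_reexport.line_count, via_module.line_count);
}
```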

Fixes the `RUSTDOCFLAGS="-D warnings" cargo doc` gate on dev/m2-base.

Audit reference: milestone-rubric-grant-audit-2026-06-25.md, Major #1.
Vendored rubric mandates `strum::IntoStaticStr` (with
`#[strum(serialize_all = "snake_case")]`) on every error enum so
`error_kind` labels on the `shepherd_chain_request_total` /
`shepherd_cow_api_*` counters stay in lock-step with the Rust source
of truth instead of growing a `match err { ... => "connect" ... }`
ladder per call site.
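
To show what the derive saves, here is a hand-written, std-only stand-in for the conversion that `strum::IntoStaticStr` with `serialize_all = "snake_case"` provides. The enum is hypothetical: the variant payloads are assumptions, and only the `UnknownChain` name and the `"connect"` label appear in the PR text.

```rust
/// Hypothetical error enum used only for this illustration.
#[derive(Debug)]
enum ProviderError {
    UnknownChain(u64),
    Connect(String),
}

/// One static snake_case label per variant, defined once next to the enum
/// instead of in a `match` ladder at every metrics call site.
impl From<&ProviderError> for &'static str {
    fn from(err: &ProviderError) -> Self {
        match err {
            ProviderError::UnknownChain(_) => "unknown_chain",
            ProviderError::Connect(_) => "connect",
        }
    }
}

fn main() {
    let err = ProviderError::Connect("refused".to_string());
    // The label can feed an `error_kind` metric dimension directly.
    let error_kind: &'static str = (&err).into();
    println!("error_kind={error_kind} err={err:?}");

    let other = ProviderError::UnknownChain(11155111);
    let other_kind: &'static str = (&other).into();
    println!("error_kind={other_kind} err={other:?}");
}
```

With the derive, adding a variant updates the label automatically, which is the lock-step property the commit message describes.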

Enums covered on this milestone (the ones present on dev/m2-base):
- `nexum_engine::host::cow_orderbook::CowApiError`
- `nexum_engine::host::provider_pool::ProviderError`
- `nexum_engine::manifest::error::ParseError`
- `nexum_engine::engine_config::EngineConfigError`

Also adds `#[non_exhaustive]` to `CowApiError` and `ProviderError`
(audit Major #2). The other two already carried it.

`strum = "0.26"` lands as a direct dep on nexum-engine. The
workspace-deps hoist (audit P1, Major #5) is intentionally a separate
judgment call left to Bruno; this commit ships the substantive rubric
fix without coupling to the broader Cargo.toml restructure.

Audit reference: milestone-rubric-grant-audit-2026-06-25.md,
duplication finding "internal_error('local-store', err.to_string())
map closures" (4 sites in one file).

The four `local-store` host endpoints all `.map_err`-ed the same
`StorageError -> HostError` conversion inline. Replace with a single
`local_store_err(StorageError) -> HostError` free function so a
future error-model change (richer kind, structured `data`) lands in
one place instead of four call sites.
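
A minimal sketch of the refactor's shape. `local_store_err` and the `internal_error("local-store", err.to_string())` call come from the commit text; the `StorageError` variant, the `HostError` fields, and the `backend_set` / `host_set` functions are hypothetical stand-ins.

```rust
use std::fmt;

/// Hypothetical storage-layer error; the real variants are not shown in the PR.
#[derive(Debug)]
enum StorageError {
    KeyTooLong(usize),
}

impl fmt::Display for StorageError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            StorageError::KeyTooLong(n) => write!(f, "key too long: {n} bytes"),
        }
    }
}

/// Hypothetical shape of the guest-facing error.
#[derive(Debug, PartialEq, Eq)]
struct HostError {
    domain: &'static str,
    message: String,
}

fn internal_error(domain: &'static str, message: String) -> HostError {
    HostError { domain, message }
}

/// Single conversion point shared by every local-store endpoint.
fn local_store_err(err: StorageError) -> HostError {
    internal_error("local-store", err.to_string())
}

/// Stand-in backend call that rejects keys longer than 8 bytes.
fn backend_set(key: &[u8]) -> Result<(), StorageError> {
    if key.len() > 8 {
        Err(StorageError::KeyTooLong(key.len()))
    } else {
        Ok(())
    }
}

/// An endpoint passes the function by name instead of repeating a closure.
fn host_set(key: &[u8]) -> Result<(), HostError> {
    backend_set(key).map_err(local_store_err)
}

fn main() {
    println!("{:?}", host_set(b"short"));
    println!("{:?}", host_set(b"way-too-long-key"));
}
```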

Behaviour is identical; the helper is `fn`, not a closure, so
codegen is one shared symbol.

Audit reference: milestone-rubric-grant-audit-2026-06-25.md, Major #10
(`eprintln!` in daemon-side manifest loader code).

Every other deprecation site in the engine routes through `tracing`
(the main binary installs a JSON `tracing_subscriber` and the
manifest loader itself uses `warn!`). The supervisor's `nexum.toml`
fallback was the lone `eprintln!` survivor, which bypasses the
structured-log pipeline and breaks operators who set
`RUST_LOG=warn,manifest=warn` for daemon log aggregation.

Switches to `warn!(target: "manifest", path = %legacy.display(),
...)` so the deprecation surfaces through the same channel as the
no-`[capabilities]` warning emitted a few lines later by
`manifest::load`.

@brunota20

Author

Audit judgment-call pass complete on top of #22. New tip: 8f0a4feba05393b819bc99f05ba2cf172f23a980 (was 184aa456...).

Changes layered on top by the bleu/nullis-shepherd audit pass:

  • JC1 (workspace deps hoist): [workspace.dependencies] now owns the table for every dep shared across the engine + module crates (anyhow, thiserror, tokio, futures, serde, serde_json, tracing, tracing-subscriber, strum, auto_impl, derive_more, clap, alloy stack, cowprotocol, reqwest, wit-bindgen). Crates inherit via dep.workspace = true; per-call-site features still opt in.
  • JC1 (workspace lints): [workspace.lints] published; unsafe_op_in_unsafe_fn = "warn" instead of unsafe_code = "deny" because every wit-bindgen guest emits an unsafe extern "C" shim that would otherwise trip the workspace lint.
  • JC2 (clap migration): nexum-engine's CLI moved off hand-rolled std::env::args into the workspace clap derive (matches every other binary crate). M4's --pretty-logs flag retained as a #[arg(long = "pretty-logs")] field on the same struct.
  • JC3 (auto_impl/derive_more): hoisted with default-features = false so opt-in is per-crate.
  • JC4: docs/07-rpc-namespace-design.md carries the allowlist header.

All 4 gates (fmt, clippy --workspace --all-targets --all-features -D warnings, test --workspace --all-features, RUSTDOCFLAGS="-D warnings" doc) green on the new tip. No upstream commits amended; the JC changes land as their own commits on top.

Pushed alongside as bleu/nullis-shepherd:feat/m2-module-manifests-bleu-834 for the PR head.
