[controller][schema] Coerce legacy numeric defaults during store migration#2802
[controller][schema] Coerce legacy numeric defaults during store migration#2802xunyin8 wants to merge 6 commits into
Conversation
| return changed; | ||
| } | ||
|
|
||
| private static JsonNode coerceNumber(JsonNode value, String declaredType) { |
There was a problem hiding this comment.
Jackson always parses a JSON float literal like 0.0 into a DoubleNode, never a FloatNode. so the value.isDouble() short-circuit on the "float" branch means a legacy schema written as {"type":"float","default":0.0} would slip through without being rewritten.
The existing tests cover IntNode-on-float (e.g. "default":0) but not DoubleNode-on-float, so I can't tell from the suite alone whether avroutil1's strict parser tolerates that combo or not. Could you please add a quick test for {"type":"float","default":0.0} running through coerceNumericDefaultsToFieldType followed by parseSchemaFromJSONStrictValidation?
| try { | ||
| AvroSchemaParseUtils.parseSchemaFromJSONStrictValidation(schemaStr); | ||
| return schemaStr; | ||
| } catch (Exception strictFailure) { |
There was a problem hiding this comment.
The catch (Exception strictFailure) block doesn't log or chain the original exception. That matters because the defensive parseSchemaFromJSONStrictValidation on the next line is the one most likely to re-throw for the non-numeric cases your comment already calls out (union default ordering, bad names, etc.), and when it does the operator has no idea what was wrong with the source schema.
A LOGGER.info("strict parse failed for store {}, attempting coercion", storeName, strictFailure) before the coercion attempt would cover it. Or chain strictFailure as the cause of whatever the second parse throws.
…ation
Adds a destination-side rewrite ({0 -> 0.0} on float-typed fields, etc.)
so legacy schemas registered before validateNumericDefaultValueTypes was
enforced can be migrated into clusters where the controller's STRICT
parser now rejects them. Gated on storeConfig.migrationDestCluster, so
non-migration writes are unaffected; defensively re-strict-parses the
output so non-numeric violations (bad names, dangling content, union
default not first branch) still fail loudly.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
The existing tests for parseSchemaFromJSONLooseNumericValidation and coerceNumericDefaultsToFieldType live in venice-push-job; diff coverage on venice-client-common (where the class lives) only sees branches exercised by tests in the same module. Adds 16 focused tests covering the strict/loose/loose-numeric parsers, the parseSchemaFromJSON wrapper with both extendedSchemaValidityCheckEnabled values, and every branch of the JSON walker (each numeric tier, nested record recursion, non-textual type passthrough, identity short-circuit for clean input, IOException → VeniceException wrap). Diff coverage on the changed lines: 76.92% branches (45% required). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Assert.assertSame on String operands compiles to ==, which trips ES_COMPARING_STRINGS_WITH_EQ. Switch to Assert.assertEquals — the observable behavior (walker returns the input unchanged for clean or non-textual-type schemas) is still verified. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Jackson parses any JSON decimal literal (e.g. 0.0) into a DoubleNode
regardless of the declared field type. The "float" branch of
coerceNumber short-circuits on value.isDouble(), so legacy schemas
written as {"type":"float","default":0.0} are NOT rewritten by the
walker — they pass through unchanged. Empirically avro-util1's STRICT
parser accepts DoubleNode-on-float (the numeric-tier check is
asymmetric: rejects IntNode-on-float, accepts DoubleNode-on-float), so
the output is still strict-clean.
Adds a regression-pinning test for this combination. If avro-util1 ever
tightens the float numeric-tier check to be symmetric, this test will
fail and the "float" branch will need to coerce DoubleNode -> FloatNode.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
When the initial strict parse fails inside normalizeSchemaForMigration, log strictFailure at INFO before attempting the LOOSE_NUMERICS-based coercion, and on the post-coercion strict re-check attach strictFailure as a suppressed exception of whatever the second parse throws. For non-numeric violations (union default not first branch, bad names, dangling content) the post-coercion strict parse still fails — and without chaining, the operator only sees that second exception and has no idea what was wrong with the source schema. The suppressed entry puts the original message into the same stack trace. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
fc5908f to
9c88232
Compare
Problem Statement
Store migration fails due to strict default numeric value check. Legacy stores did not enforce this check and now they cannot be migrated.
Solution
Adds a destination-side rewrite ({0 -> 0.0} on float-typed fields, etc.) so legacy schemas registered before validateNumericDefaultValueTypes was enforced can be migrated into clusters where the controller's STRICT parser now rejects them. Gated on storeConfig.migrationDestCluster, so non-migration writes are unaffected; defensively re-strict-parses the output so non-numeric violations (bad names, dangling content, union default not first branch) still fail loudly.
Code changes
Concurrency-Specific Checks
Both reviewer and PR author to verify
synchronized,RWLock) are used where needed.ConcurrentHashMap,CopyOnWriteArrayList).How was this PR tested?
Does this PR introduce any user-facing or breaking changes?