fix(scraper): RBI schema resilience + auto-sync banknames.json across… #465
Open
PriyankaMarbill wants to merge 1 commit into
… SDKs [ISS-1697801]
Three coordinated changes that together solve the first issue on
ISS-1697801 ("Bank Name Sync Issue / RBI Schema Change").
1. Read the bank-name column tolerantly. RBI periodically renames the
bank-name header in the NEFT/RTGS sheets. The scraper used to read
row['BANK'] directly, so a rename silently set every bank name to
nil. parse_csv now reads via read_bank_name(), which tries a list of
known header variants (BANK, BANK NAME, BANK_NAME, Bank Name, ...)
and normalises the value back into row['BANK']. A WARN fires if none
of the variants match, so the next rename is caught immediately
instead of failing silently.
2. Never drop a bank just because banknames.json is stale. merge_dataset
used to overwrite combined_data['BANK'] with the value returned by
bank_name_from_code (which reads only our local banknames.json), so
any bank RBI added that we had not registered yet ended up with no
name and effectively disappeared. It now falls back to the value we
captured from the RBI sheet when the local lookup is empty, and
surfaces the new bank via a WARN.
3. Auto-propagate new banks to every language SDK.
- After the export, generate.rb calls sync_banknames!(dataset). For
every IFSC, it derives the 4-char bank code and appends it to
src/banknames.json if missing, using the name from the RBI sheet.
- The name is passed through normalize_bank_name() which applies the
CONTRIBUTING.md "Bank Names Guidelines": drop trailing 'Ltd'/'Limited'
(rule 1), drop leading 'The ' (rule 3), canonicalise to
'Co-operative' with no trailing period (rules 4 + 9), 'sahkari' ->
'Sahakari' (rule 11), and collapse SHOUTY-CASE to Title Case while
preserving short acronyms (rule 2). Rules that need human judgement
(city-in-brackets, Grameen/Gramin spelling, unexpanded abbreviations)
are flagged via a WARN that points at CONTRIBUTING.md.
- When any new bank is added, generate.rb invokes
`make generate-constants`, which already regenerates bank.rb,
Bank.php, bank.js and constants.go from banknames.json via the Go
template generator. End result: a single scraper run picks up a new
bank from RBI, registers it in the source of truth, and updates all
four SDK files in lockstep — no manual `make` step required.
- A new `make check-constants` target regenerates the four files and
fails with a diff if they drift from banknames.json. Intended to
run in CI so that hand-edits to banknames.json without a regen, or
hand-edits to a generated file, fail loudly.
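Changes 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not the code in methods.rb: the header list is limited to the variants named above, `resolve_bank_name` is a hypothetical stand-in for the fallback inside merge_dataset, the `warn_io` parameter and the WARN wording are assumptions, and a plain Hash stands in for the CSV row.

```ruby
# Header variants named in the description; the real list may be longer.
BANK_HEADER_VARIANTS = ['BANK', 'BANK NAME', 'BANK_NAME', 'Bank Name'].freeze

# Returns the bank name found under the first known header present in the row.
# Warns (and returns nil) when none of the known headers exist, so a future
# header rename is visible instead of silently producing nil names.
def read_bank_name(row, warn_io: $stderr)
  header = BANK_HEADER_VARIANTS.find { |h| row.key?(h) }
  if header.nil?
    warn_io.puts "WARN: no known bank-name header in #{row.keys.inspect}"
    return nil
  end
  value = row[header].to_s.strip
  value.empty? ? nil : value
end

# Hypothetical helper: prefer the curated name from banknames.json and fall
# back to the name captured from the RBI sheet when the local lookup is empty.
def resolve_bank_name(code, sheet_name, banknames, warn_io: $stderr)
  local = banknames[code]
  return local unless local.nil? || local.strip.empty?

  warn_io.puts "WARN: #{code} missing from banknames.json; using RBI name #{sheet_name.inspect}"
  sheet_name
end

# Usage with a made-up row whose header has been renamed to 'Bank Name'.
row = { 'IFSC' => 'ABCD0000001', 'Bank Name' => ' Example Bank ' }
row['BANK'] = read_bank_name(row)
puts resolve_bank_name(row['IFSC'][0, 4], row['BANK'], { 'HDFC' => 'HDFC Bank' })
```

Keeping the curated name first means an existing entry in banknames.json is never overridden by the raw RBI spelling; the sheet value is only used to avoid losing a bank entirely.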
Verification done locally:
- ruby -c on methods.rb and generate.rb
- Dry-run of sync_banknames! on a synthetic dataset (existing bank,
brand-new bank, bank with nil name) -> correctly added the new bank,
skipped the nil one, kept banknames.json sorted with 2-space indent
- 14-case unit run of normalize_bank_name covering all CONTRIBUTING.md
guidelines (rules 1, 2, 3, 4, 9, 11, acronym preservation, nil/empty
input) -> all pass
- `make -n check-constants` expands to the expected `go run` invocations
plus the git-diff guard
- Not run locally: the end-to-end Go generator run requires Go, which is
not available in this sandbox; it will run in CI or on the maintainer's
machine
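The synthetic dry-run and a few of the normalisation cases listed above might look like the sketch below. It is an approximation under stated assumptions, not the PR's implementation: the regular expressions, the `KNOWN_ACRONYMS` allow-list (the PR preserves "short acronyms" by a heuristic not described here), the `SMALL_WORDS` list, the dataset shape (IFSC => row hash), and the non-mutating `sync_banknames` helper are all hypothetical. Rule numbers refer to CONTRIBUTING.md as cited in the description.

```ruby
require 'json'

KNOWN_ACRONYMS = %w[HDFC ICICI IDBI UCO].freeze # assumption: explicit allow-list
SMALL_WORDS    = %w[of and].freeze              # assumption: kept lowercase mid-name

def normalize_bank_name(name)
  return nil if name.nil?

  s = name.strip
  return nil if s.empty?

  s = s.sub(/\Athe\s+/i, '')                 # rule 3: drop leading "The "
  s = s.sub(/[\s,]+(ltd|limited)\.?\z/i, '') # rule 1: drop trailing Ltd/Limited
  if s == s.upcase                           # rule 2: only rewrite SHOUTY-CASE input
    s = s.split(/\s+/).each_with_index.map do |w, i|
      if KNOWN_ACRONYMS.include?(w) then w
      elsif i.positive? && SMALL_WORDS.include?(w.downcase) then w.downcase
      else w.capitalize
      end
    end.join(' ')
  end
  s = s.gsub(/\bco-?op(erative)?\b\.?/i, 'Co-operative') # rules 4 + 9
  s.gsub(/\bsahkari\b/i, 'Sahakari')                     # rule 11
end

# Returns [merged, added]: merged is the sorted bank-code => name map,
# added holds only the codes that were missing. Banks with no usable name
# are skipped rather than registered with a blank entry.
def sync_banknames(dataset, banknames)
  added = {}
  dataset.each do |ifsc, row|
    code = ifsc[0, 4] # the 4-char bank code is the IFSC prefix
    next if banknames.key?(code) || added.key?(code)

    name = normalize_bank_name(row['BANK'])
    next if name.nil?

    added[code] = name
  end
  [banknames.merge(added).sort.to_h, added]
end

# Synthetic dataset: existing bank, brand-new bank, bank with nil name.
dataset = {
  'HDFC0000001' => { 'BANK' => 'HDFC BANK LTD' },
  'ZZZZ0000001' => { 'BANK' => 'THE EXAMPLE SAHKARI BANK LIMITED' },
  'YYYY0000001' => { 'BANK' => nil }
}
merged, added = sync_banknames(dataset, { 'HDFC' => 'HDFC Bank' })
puts JSON.pretty_generate(merged) # pretty_generate indents with 2 spaces by default
```

Returning the list of added codes separately makes it easy for a caller to decide whether a regeneration step is needed at all.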
Refs: ISS-1697801
Compliance: DOCUMENTATION.md (release rules), CONTRIBUTING.md (bank names
+ code style + build-must-pass)
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Note: Please follow the points below when attaching the test cases document link:
- If the label `Tested` is added, the test cases document URL is mandatory.
- The link added must be a valid URL and accessible throughout the org.
- If the branch name contains hotfix / revert, the BVT workflow check passes by default.