Rasmapan v0.4.6 — Arabic writing studio pack (alphabet + word trace + score-me) #254

Draft
Umanistan wants to merge 22 commits into corpora-inc:main from Umanistan:arabic-calligraphy
Umanistan commented May 18, 2026

Summary

Rasmapan v0.4.6 — sister pack to Hanzipan for the Arabic abjad. Lives entirely under corpan/packs/rasmapan/; no changes to corpan-app, the catalog, root scripts, or dja/.

Pack source (Vite + vanilla JS IIFE), in-pack Python data builder (fontTools + Amiri OFL), runtime SQLite, lesson flow, brush canvas, host-i18next integration across 51 locales.

DRAFT — try it now

Preview build attached as a release asset on the fork. Paste this URL into Corpan → Settings → tap version 7× → Install from URL — works on macOS, iOS, iPadOS, Android:

https://github.com/Umanistan/encorpora/releases/download/rasmapan-v0.4.6-preview/rasmapan.zip

Iteration loop for testers: uninstall + reinstall from the same URL when a new build is published.

What's in v0.4.6

| Area | Detail |
| --- | --- |
| Letters mode | 28-letter alphabet with all four positional forms (isolated / initial / medial / final), lam-alif ligature. The glyph fills the trace canvas centered, sized from the writer's actual outline bbox (not the full viewBox) so tall narrow letters like alif span ~80% of the canvas height while wide letters like baa span the width. Inner counter shapes (ه م ظ و ف ق ل etc.) are honored: all contours combined into one Path2D, filled with even-odd, so counter holes subtract from the body fill and stay empty. Per-stroke highlight uses ctx.clip(target) + ctx.fill(combined, evenodd) so guided-trace mode also respects the holes. |
| Words mode | Canvas ctx.fillText with the Amiri webfont renders the active word as one big elegant glyph. Browser text engine handles RTL shaping, contextual forms, ligatures, kerning. Awaits document.fonts.load("Amiri") before first paint to avoid a Georgia-fallback flash. Conservative realH = fontPx * 1.6 height multiplier replaces actualBoundingBoxAscent/Descent (unreliable across Tauri WebKit/Blink builds) so words never overflow the trace canvas. |
| "Score me" | Geometric similarity scoring against the target outline (Letters mode) or against a rasterized canvas-text mask (Words mode). Two metrics: precision (fraction of user points inside the target shape) and coverage (probe points along the target edge that have a user point within 90 viewBox units). Quality = 0.6·coverage + 0.4·precision. Banner shows percentage + a feedback message. Pure JS, no recognition model, fully offline. |
| Swipe navigation | Fires from anywhere on the screen except the brush canvas + letter-picker row (their own gestures preserved). 32 px threshold, dominant-axis check, per-pointer state Map. Same direction convention as the lessons swipe: left → next, right → previous. |
| Intro lessons | 6-step intro flow (RTL direction → abjad → four positional forms → harakat → sound-to-letter → trace alif) + 4 classical-calligraphy notes (Naskh / Thuluth / Diwani / Kufic) + Bismillah phrase TTS lesson. Localized into 51 corpan locales. |
| 40 curated words | Hand-curated everyday vocabulary words (أب، أم، يد، سلام، ماء، خبز ...) with transliteration + multilingual gloss across 51 locales. Each word is traceable in Words mode. |
| Host corpus integration | hostApi.searchEntriesByText surfaces Arabic phrases from the main Corpan corpus in BOTH Letters mode (phrases containing the active letter) and Words mode (phrases containing the active word). Tap-to-speak chips per Arabic word. |
| Android safe-area floor | UA-detect Android, apply `max(env(safe-area-inset-*), <floor>)` so close button + chrome don't sit under the status bar when the host's viewport-fit=cover propagation isn't perfect. iOS keeps its real notch via env(). |
| Theme | Earthgate parchment + gold palette (mirrors Hanzipan). Amiri (SIL OFL 1.1) is the display font. |
| Build | Vite library build. New arabic_letter_writer / arabic_letter_note / arabic_word / arabic_lesson / arabic_ligature / arabic_style tables. Build runs under Python 3.11 (homebrew Python 3.14 has broken pyexpat). npm run pack:all produces ~1.2 MB rasmapan.zip. |
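
The "Score me" arithmetic in the table can be illustrated with a small sketch. It is written in Python (the language of the pack's data builder) purely for illustration; the pack's own scorer is JavaScript and may differ in detail. The function names, the simple-polygon target, the choice of probe points, and the reading of coverage as a fraction of probes are all assumptions. Only the 90-unit radius and the 0.6/0.4 weights come from the description above.

```python
from math import hypot


def point_in_polygon(pt, polygon):
    """Even-odd ray-casting test against a simple polygon of (x, y) vertices."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def score_sketch(user_points, target_polygon, probe_points, radius=90.0):
    """Hypothetical scorer: precision, coverage, and the weighted quality."""
    if not user_points or not probe_points:
        return {"precision": 0.0, "coverage": 0.0, "quality": 0.0}
    # Precision: fraction of user points that land inside the target shape.
    hits = sum(1 for p in user_points if point_in_polygon(p, target_polygon))
    precision = hits / len(user_points)
    # Coverage: fraction of edge probes with some user point within `radius`.
    covered = sum(
        1
        for px, py in probe_points
        if any(hypot(px - ux, py - uy) <= radius for ux, uy in user_points)
    )
    coverage = covered / len(probe_points)
    quality = 0.6 * coverage + 0.4 * precision
    return {"precision": precision, "coverage": coverage, "quality": quality}


# Illustrative data: a square target, edge-midpoint probes, three user points.
square = [(0, 0), (1000, 0), (1000, 1000), (0, 1000)]
probes = [(500, 0), (1000, 500), (500, 1000), (0, 500)]
result = score_sketch([(500, 50), (950, 500), (1200, 500)], square, probes)
```

In this toy case two of three user points are inside the square and two of four probes are covered, so the quality is 0.6 × 0.5 + 0.4 × 2/3.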

Journey: things tried and dropped

This PR squashes 6 iterations (v0.1.1 → v0.4.6) of experimentation. Things we tried that didn't make the cut:

  • Per-letter stroke-order animation from Calliar trajectories. Multiple attempts (raw primitives, outline-edge walk, outline-masked Calliar). All produced visually wrong traces — pen riding the top edge of bowls instead of the centerline, pens extending outside the glyph, composite letters with no animation. Removed in v0.1.1. The masked-Calliar medians stay in the DB as a potential scoring target.
  • Calligraphy mode with 1642 Calliar word/sentence recordings + a "watch a calligrapher" canvas. Raw handwriting samples didn't read as calligraphy — even with smoothed quadratic curves, slow playback, and a finished-frame default. Removed in v0.4.0. Pack zip dropped from 7.5 MB back to 1.2 MB. The extractor (build/extract_calliar_words.py) stays for a possible v0.5 revisit; the regenerable JSON output (17 MB) is gitignored.
  • Bismillah Watch button with inline calligrapher canvas. Removed in v0.4.0 — the phrase TTS + the existing styled HTML Amiri rendering is enough. v0.4.6 stripped the obsolete "tap the pen-tip below to watch a real calligrapher trace all 23 strokes" paragraph from the lesson body across 51 locales.

Test plan

  • iOS: install URL above → alphabet picker shows; pick any letter; trace; tap ✓ to score
  • iPadOS: same; verify the Words mode ghost renders big + centered without clipping
  • Android: same; verify the safe-area floor keeps close button clear of status bar
  • Letters with counters (ه م ظ و ف ق ل): visible empty inner hole (parchment shows through)
  • Tall-dot words (ث خبز شمس): fit inside the dashed canvas border, no clip top/bottom
  • Swipe: left/right anywhere except canvas + letter-picker row advances next/previous
  • Bismillah lesson (step 11): static phrase + TTS Play button only, no Watch button
  • 51-locale fallback: switch corpan primary to e.g. Greek → letter notes / word glosses resolve

🤖 Generated with Claude Code

Umanistan and others added 22 commits May 17, 2026 23:34
Sister pack to Hanzipan for the Arabic abjad. Self-contained under
corpan/packs/rasmapan/ with its own data builder, seed JSON, and
Amiri (SIL OFL 1.1) font. No changes to corpan-app, the catalog,
or root package scripts — install in dev via the standard manifest
URL input (Settings → About → tap version 7×).

What's in:
- 6-step intro lesson (RTL, abjad, four forms, harakat, sound-to-
  letter, trace alif).
- 28 letters with all positional forms (4 for connectors, 2 for
  non-connectors) — 100 glyph variants total. Outlines extracted
  from Amiri at build time via fontTools.
- Lam-alif ligature (isolated + final).
- 40 curated 2–4-letter words (kitaab, bayt, salam, ...) with
  per-language meaning glosses (en, es, fr).
- 4 calligraphic-style cards (Naskh, Thuluth, Diwani, Kufic) with
  short descriptions and Amiri-rendered sample text.
- Brush canvas with five preset profiles (reused from
  corpan/packs/hanzipan/brush/).
- Earthgate parchment + gold palette, RTL-scoped for the Arabic
  hero, examples, and word-trace run.

What's not in v0.1.0:
- Hand-authored stroke-order medians per letter (currently
  auto-derived per contour; scoring degrades to "inside outline ±
  tolerance" when no median is available).
- Native-language lesson copy (English-only for v0.1.0; will be
  routed through host i18n for v0.2).
- Public-domain calligraphy sample images for the Styles cards
  (cards render Amiri-rendered "بسم الله" as a placeholder).
- Validation inside the real Tauri corpan window with real
  searchEntriesByText / speak / queryPackDb (drafted from in-
  browser preview using sql.js-backed mock hostApi).

PR is DRAFT — not ready for review until real-corpan validation
and native-language lesson copy land.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Builds on 5c87418. Key changes since the original draft:

Data + corpus
- Fixed corpus search "no phrases found": the code was passing
  the presentation-form codepoint (Arabic Presentation Forms-B,
  U+FE70–U+FEFF) to searchEntriesByText while the corpus stores
  base Arabic letters (U+0628 ب, not U+FE8F ﺏ). Added a
  `base_letter` column to arabic_letter, populated from a new
  `base_unicode` seed field on each family. Every letter now
  surfaces real corpus phrases.
- queryPackDb no longer permanently disables on a single error
  (stale cached pack-DB connections after hot reinstall +
  schema change). loadFamilies / queryAlphabet / styles SELECT
  all tier their queries so missing columns fall back to legacy.
- Each example phrase tokenized via a port of juice-squeeze's
  Unicode-aware tokenizer (corpan/packs/juice-squeeze/src/data.ts
  pattern). Renders as RTL row of clickable word chips; tap a
  chip → speak just that word; words containing the active
  letter get a gold-accent highlight.
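
The presentation-form versus base-letter mismatch described above can be checked with the standard library. This is only a sketch: the commit populates `base_letter` from a `base_unicode` seed field, not from normalization, so `base_letter_via_nfkc` below is a hypothetical cross-check and not the pack's code.

```python
import unicodedata


def is_presentation_form_b(ch: str) -> bool:
    """True when ch lies in the Arabic Presentation Forms-B block."""
    return 0xFE70 <= ord(ch) <= 0xFEFF


def base_letter_via_nfkc(ch: str) -> str:
    """Map a presentation-form character to its base letter(s) with NFKC.

    Hypothetical helper: useful for validating a hand-written seed field.
    """
    return unicodedata.normalize("NFKC", ch)


beh_isolated = "\uFE8F"   # ﺏ, the form that produced no corpus hits
beh_base = base_letter_via_nfkc(beh_isolated)
lam_alif = base_letter_via_nfkc("\uFEFB")  # ligature decomposes to two letters
```

A ligature such as lam-alif maps to two base codepoints, which is one reason a per-family seed field can be easier to reason about than normalizing at query time.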

i18n / multilingual
- All chrome strings route through the host i18next instance
  (window.__corpanI18n) via a `rasmapan` namespace registered at
  mount with addResourceBundle. Subscribes to languageChanged
  so chrome re-renders the moment the user changes their
  primary language in corpan Settings.
- Locale bundles ship for every corpan-supported locale (51
  total: en/es/fr/it/pt-BR/pt-PT/de/nl/sv/no/da/fi/pl/cs/sk/sl/
  hr/sr/bg/ro/hu/el/uk/ru/tr/lt/ca/id/ms/sw/vi/th/ja/ko-polite/
  zh-Hans/zh-Hant/yue-Hant-HK/hi/bn/ta/te/kn/mr/gu/pa-Guru/
  pa-Arab/ur/fa/ne/he/ar). Scripts/build-locales.mjs aggregates
  src/locales/*.json into src/locales.generated.js because
  Vite's library-mode IIFE build drops import.meta.glob silently.
- pickByLang helper (locale-base fallback for "ko-polite" → "ko"
  etc.) used everywhere translation selection happens.
- speak() wrapper prefers hostApi.speakConcurrent when available
  (juice-squeeze pattern), falls back to speak.
- Speak now uses the BASE codepoint (U+0628) instead of the
  presentation form (U+FE8F) so macOS Arabic TTS actually
  pronounces individual letters beyond alif.

UX / layout
- Single-row horizontal-scroll letter picker (28 pills, scrolls
  L↔R when narrow) — replaced the multi-row wrap that squeezed
  the canvas. Generous padding-bottom keeps Arabic descenders
  (ج، ر، و، ي، ع) clear of the overflow-y clip line.
- Pointer-events fixed on canvas stack: only the draw canvas
  receives input; the fx layer (decorative) and ghost layer
  (read-only) are pointer-events: none. Cursor is crosshair on
  the draw canvas. Calligraphy works on desktop with mouse.
- Ghost-only redraw when in Letters mode — the previous handler
  redrew both layers, and the inactive WordTraceLayer's
  clearRect was wiping the letter ghost.
- Ghost guide opacity bumped 0.22 → 0.45 to read clearly against
  the parchment background. Eye icon (real eye, not graduation
  cap) toggles it; slashed-eye state when off.
- Always-visible Tutorial book icon in the top-left corner —
  mirrors the Exit X. Replays the intro flow from step 1 in
  whatever language is active.
- Restored chrome text the iconic pass over-stripped: position
  name beside hero label ("ألف · alif — isolated"), examples
  section headings, "3 strokes" / "5 letters" counts.
- Words mode hero-right shows the word's component letters as
  clickable chips (glyph + English name); tap one → drills into
  Letters mode for that family.
- Styles mode prev/next nav arrows cycle through the 4 cards;
  active card gets a gold-bordered highlight and scrolls into
  view. Style descriptions tier the SELECT so a stale connection
  without description_md_i18n still renders.
- Play button speaks the current letter/word (was a no-op).
- Lesson 5's tap-letters now play sound for every button (used
  the base codepoint so TTS works for all 28, not just alif).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
…1-locale pack content

- Styles become intro lessons 7–10 (naskh, thuluth, diwani, kufic);
  Styles mode tab removed
- Lesson overlay gains swipe-left/right navigation (pointer events)
- Android safe area via env(safe-area-inset-*); UA-based --safe-top
  override removed
- @media (max-width: 380px) tightens chrome for slim phones
- Words mode now loads each letter at its real positional form
  (medial/initial/final/isolated) — fixes "3 standalone letters"
  bug
- Pack content translated to 51 corpan locales: 40 words, 10 lessons,
  4 styles, and 28 letter notes (1456 note rows in pack DB)

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
…t pack content

The pack DB and lesson i18n objects key on the exact corpan locale code
(zh-Hans, zh-Hant, yue-Hant-HK, pa-Arab, pa-Guru, pt-BR, pt-PT). The old
`currentLanguage()` ran .toLowerCase(), turning "zh-Hans" into "zh-hans"
which is case-mismatched against every consumer:
  - SQL `WHERE language_code IN (?,?,?)` for `arabic_letter_note`
  - JS `lesson.i18n[<lang>]` lookups in lessons.js
  - JS `style.i18n[<lang>]` lookups (now folded into lessons)
Result: Chinese / Cantonese / Punjabi-Arab / Punjabi-Guru / pt-BR/PT
primary users fell back to English even though the translations were
present in the seed.

Removed `.toLowerCase()` — i18next preserves the registered casing,
so the host code matches the seed exactly. `pickByLang` already
case-folds both sides for word glosses, so it's unaffected.
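
To make the failure mode concrete, here is a Python sketch of the two lookup styles. The real `pickByLang` is a JavaScript helper whose exact behavior is not shown in this PR; `pick_by_lang_sketch`, its fallback order, and the English default are assumptions based on the descriptions above ("ko-polite" → "ko", case-folding on both sides).

```python
def pick_by_lang_sketch(table, lang, fallback="en"):
    """Resolve lang against table keys: exact, case-folded, then base locale."""
    if lang in table:
        return table[lang]
    folded = {key.lower(): value for key, value in table.items()}
    key = lang.lower()
    if key in folded:
        return folded[key]
    base = key.split("-", 1)[0]  # "ko-polite" -> "ko"
    if base in folded:
        return folded[base]
    return folded.get(fallback.lower())


notes = {"zh-Hans": "note (zh-Hans)", "ko": "note (ko)", "en": "note (en)"}

# An exact-case consumer (SQL IN (...), lesson.i18n[lang]) misses the row
# once the code has been lowercased:
exact_miss = notes.get("zh-Hans".lower())

# A case-folding helper still resolves it, and falls back to the base locale:
folded_hit = pick_by_lang_sketch(notes, "zh-hans")
base_hit = pick_by_lang_sketch(notes, "ko-polite")
default_hit = pick_by_lang_sketch(notes, "el")
```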

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
54 → 64 px on the scroller's bottom padding, pill min-height 52 → 58,
inner pill bottom-padding 18 → 22. Same proportional bump in the
@media (max-width: 380px) slim-screen block.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
… polylines)

scoring.js already implements scoreAgainstMedian() in full — endpoint
proximity, mean-distance, length-ratio, direction-agnostic. Every
writer record already ships an auto-derived median polyline extracted
from the Amiri outline at build time. The pipeline was just disabled
by an "outline" default. Flipping to median routes the existing data
through the existing scorer.

Per-letter overrides still work: any letter whose auto-derived median
turns out to be a poor approximation of actual stroke order can be
downgraded back to "outline" via the overrides dict, or upgraded with
a hand-tuned medians array.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Adds a canonical stroke-order preview that plays alongside TTS when
the user hits the Play/Speak button. Trajectories come from
Calliar (https://github.com/ARBML/Calliar, MIT-licensed), an open
Arabic-calligraphy dataset of 2,500 stroke-annotated samples from
real calligraphers.

Why not the auto-derived Amiri medians? They're geometric centerlines
of font contours, ordered however fontTools happens to return them —
no relation to actual calligraphic stroke order. Animating those
would teach learners a fake order that's hard to unlearn. The
previous commit (ae3ad4e) defaulted every letter to median scoring
on those fake medians; this commit reverts that default to "outline"
so only letters with real Calliar-derived data get the median path.

Pipeline:
  build/vendor/calliar/        — Calliar snapshot (LICENSE + scripts;
                                  dataset.zip gitignored, fetched on
                                  demand by fetch_dataset.sh)
  build/extract_calliar_strokes.py
                                — walks all 2,500 samples, groups
                                  strokes by primitive, picks one
                                  canonical trajectory per primitive
                                  by aspect-ratio + path-length
                                  heuristics, composes letters from
                                  primitives + classically-placed
                                  dots
  build/seed/stroke_orders_seed.json
                                — generated overrides (alif + baa
                                  for Phase A)
  src/trace.js                  — playStrokeOrder()/cancelAnimation()
                                  on LetterTraceLayer + WordTraceLayer;
                                  setFxCanvas() to receive the empty
                                  fx canvas as an animation surface
  src/main.js                   — wires fx canvas, fires animation
                                  from speak/replay actions alongside
                                  TTS, cancels on letter navigation
  LICENSES.md                   — Amiri + Calliar attributions

Phase B will extend LETTER_RECIPES in extract_calliar_strokes.py to
cover the remaining 26 letters. Letters without entries fall back to
the permissive outline scorer (no fake animation, no regression vs.
the previously-shipped state).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Extends `LETTER_RECIPES` in build/extract_calliar_strokes.py to cover
every Arabic letter:

  Single-primitive (12):  alif, Haa, daal, raa, siin, Saad, ain,
                          laam, miim, haa, waaw, kaaf
  Single-primitive + dots (14):  baa, taa, thaa, jiim, khaa, dhaal,
                                 zaay, shiin, Daad, ghain, faa,
                                 qaaf, nuun, yaa
  Two-primitive composites (2):  Taa (ط = alif + Saad-medial body),
                                 DHaa (ظ = Taa + dot above)

All trajectories come from Calliar (MIT, https://github.com/ARBML/Calliar)
— real Arabic calligraphers' hands. No geometric font-outline fakes,
no invented stroke order.

Selection pipeline (choose_canonical):
  1. Aspect-ratio band per primitive (ASPECT_HINTS) — rejects
     medial-position distortions where connecting strokes warp
     the shape.
  2. Path-length sanity (within [0.5×, 2.0×] of band median) —
     rejects micro-stubs and overlong flourishes.
  3. Lowest tortuosity (path length ÷ bbox diagonal) wins.
     Tortuosity 1.0 = a straight line; 1.2 = a clean curve; >1.6
     = a wandering or doubled-back stroke (sentence-context
     connector tails). Cleanest curve gets shipped.
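
The three selection steps above can be sketched as follows. This is not the code in build/extract_calliar_strokes.py: `choose_canonical_sketch`, the width/height definition of aspect ratio, and the interpretation of "band median" as the median path length of in-band candidates are assumptions.

```python
from math import hypot
from statistics import median


def path_length(pts):
    return sum(hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in zip(pts, pts[1:]))


def bbox(pts):
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return min(xs), min(ys), max(xs), max(ys)


def aspect_ratio(pts):
    x0, y0, x1, y1 = bbox(pts)
    return (x1 - x0) / max(y1 - y0, 1e-9)  # width / height (assumed convention)


def tortuosity(pts):
    x0, y0, x1, y1 = bbox(pts)
    diagonal = hypot(x1 - x0, y1 - y0)
    return path_length(pts) / diagonal if diagonal else float("inf")


def choose_canonical_sketch(candidates, aspect_band):
    lo, hi = aspect_band
    # 1. Keep candidates whose aspect ratio falls inside the primitive's band.
    in_band = [c for c in candidates if lo <= aspect_ratio(c) <= hi]
    if not in_band:
        return None
    # 2. Drop stubs and flourishes: length within [0.5x, 2.0x] of the median.
    mid = median(path_length(c) for c in in_band)
    sane = [c for c in in_band if 0.5 * mid <= path_length(c) <= 2.0 * mid]
    # 3. The lowest-tortuosity survivor wins.
    return min(sane, key=tortuosity) if sane else None


straight = [(0, 0), (5, 100)]          # clean tall stroke
wobbly = [(0, 0), (10, 50), (0, 100)]  # same height, bent in the middle
wide = [(0, 0), (100, 10)]             # wrong aspect for a tall primitive
stub = [(0, 0), (1, 10)]               # micro-stub
picked = choose_canonical_sketch([wobbly, wide, stub, straight], (0.0, 0.3))
```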

Two-primitive composites use find_canonical_pair(): scans Calliar
for adjacent stroke pairs tagged [alif, ﺻ] (or [ﺻ, alif] — order
varies by writer) and lifts BOTH strokes with their absolute
coords preserved, so the stem-to-base spatial relationship is
authentic. Output re-ordered to base-first per classical Naskh.

Dot placement is bbox-relative (DOT_GAP, DOT_SPACING_2,
DOT_SPACING_3, DOT_TRIANGLE_RISE constants) so dots sit just
above or below the actual primitive shape regardless of how tall
or wide the base naturally is. Three-dot patterns (thaa, shiin)
use a triangle layout with the middle dot raised away from the
primitive.

New `build/audit_strokes.py` renders per-letter SVGs and a
28-letter grid for visual review; outputs are gitignored. The
trace canvas's existing playStrokeOrder() animation (from Phase
A) now activates for all 28 letters automatically — `scoring:
"median"` flag in writer records routes through the same code
path.

Word-mode animation defers to Phase C — would require
positional-form (initial/medial/final) trajectories which
Calliar doesn't tag.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
… tip

Tuned for learning rather than snappiness:
  - strokeDuration: 750 → 1100 ms per stroke (eye can track the pen)
  - gapDuration: 200 → 250 ms between strokes
  - holdMs: 700 → 1500 ms final-state hold before clear (study time)
  - Trail line width scaled by DPR so it matches the user's own brush
    stroke width visually on Retina displays.
  - Pen tip now has a halo + core (sumi-ink center surrounded by a
    soft gold ring) so the moving point pops against any background.
  - Dot strokes get the same halo + core treatment, signalling that
    the same pen placed them.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Writer records for non-isolated positions (e.g. baa.initial, baa.medial,
baa.final) carry auto-derived medians from the Amiri outline as a scoring
fallback. Those are geometric contour centerlines, not real stroke order
— animating them would teach a fake sequence.

playStrokeOrder() now checks writer.scoring === "median" before
animating. The builder sets that flag only when a Calliar override is
applied, so positional forms (which currently only have isolated-form
Calliar data) silently skip animation. Tracing still works on every
position via the permissive outline scorer; just no preview animation
outside the isolated form.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Calliar writers don't agree on stroke direction — some draw alif
top-to-bottom, others bottom-to-top; some draw the baa bowl
right-to-left (RTL convention), others left-to-right. For the
scorer this is fine (it's direction-agnostic). For an instructional
animation it would teach inconsistency.

Adds DIRECTION_RULES per primitive (e.g. ا = top-to-bottom, ٮ =
right-to-left, د = top-right-to-bottom-left) and a
normalize_direction() pass that reverses the chosen polyline if
its start/end is the wrong way around. Complex primitives whose
direction varies legitimately by writer (waaw loop, ain enclosed
shape, Saad head+tail) get rule (0, 0) — trust the calligrapher's
hand.
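
A minimal sketch of such a normalization pass, assuming each rule is a pair of signs (dx, dy) for the expected start-to-end displacement and that y grows downward as in an SVG viewBox. Both are assumptions; the source only confirms that (0, 0) means "leave as recorded". The function name is hypothetical.

```python
def normalize_direction_sketch(polyline, rule):
    """Reverse the polyline when it runs against the rule's direction."""
    dx_sign, dy_sign = rule
    if (dx_sign, dy_sign) == (0, 0) or len(polyline) < 2:
        return list(polyline)  # trust the calligrapher's hand
    (x0, y0), (x1, y1) = polyline[0], polyline[-1]
    # Positive when the start-to-end displacement agrees with the rule.
    agreement = dx_sign * (x1 - x0) + dy_sign * (y1 - y0)
    return list(reversed(polyline)) if agreement < 0 else list(polyline)


TOP_TO_BOTTOM = (0, 1)     # e.g. alif, with y pointing down (assumed)
RIGHT_TO_LEFT = (-1, 0)    # e.g. the baa bowl
FREE = (0, 0)              # e.g. waaw loop

alif_upward = [(500, 900), (500, 100)]
bowl_ltr = [(100, 700), (500, 800), (900, 700)]

alif_fixed = normalize_direction_sketch(alif_upward, TOP_TO_BOTTOM)
bowl_fixed = normalize_direction_sketch(bowl_ltr, RIGHT_TO_LEFT)
loop_kept = normalize_direction_sketch(bowl_ltr, FREE)
```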

Applied to both single-primitive picks and the two-primitive
composite pairs (Taa, DHaa), so every stroke in the seed runs
the canonical Naskh direction consistently.

Verified: alif starts at top, daal/raa start top-right and end
bottom-left, siin/baa/nuun-bowls run right-to-left, laam/kaaf top
to bottom, Taa stem top-to-bottom.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
build/calliar_stroke_audit.svg shows every letter with its Amiri
outline (gray fill) + Calliar-derived stroke trajectory (gold
polyline) + start-position green ring + end-position red dot.
Lets reviewers eyeball the stroke order and direction at a glance
without running the audit script.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
The Naskh / Thuluth / Diwani / Kufic style lessons (intro lessons
7-10) referenced sample_image paths that didn't exist in the pack
— broken-image fallback. Adding real public-domain manuscript
samples from Wikimedia Commons resolves that.

Images:
  Naskh   — A folio of the Qur'an by Zayno'l-‘Abedin Esfahani
  Thuluth — Yaqut al-Musta'simi (13th century calligrapher)
  Diwani  — Ferman of Sultan Mehmed II (15th century Ottoman decree)
  Kufic   — Kufi parchment with Qur'anic verses (early Islamic)

All four are public domain (PD-old: works by authors who died
more than 70 years ago). Sourced via Wikimedia Commons; resized
to ≤ 800 px on the longer side and JPEG-recompressed at 75 for
pack-size economy (~480 KB combined across the four).

Source URLs + attributions documented in LICENSES.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
…, multi-writer variants

Second-pass audit of the Calliar dataset post-Phase B, ranked by
impact-per-effort. Top picks for follow-up iterations:

  1. Bismillah lesson (62 samples in Calliar, 23-stroke consistent
     primitive sequence) — half a day
  2. Positional-form animations (extract init/medial/final by
     inferring stroke position from neighbors) — full day
  3. Word-mode animation (depends on #2) — half a day

Lower-priority: multi-writer variant showcase (clustering already
exists upstream), pix2pix-generated style images (current Wikimedia
ones are more authoritative).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Adds intro lesson 11 (type "phrase") that animates "بسم الله الرحمن الرحيم"
in classical Naskh stroke order at the end of the intro flow.

Bismillah is the opening of every Qur'an chapter (except one) and
traditionally the first multi-letter phrase calligraphy students
learn. Calliar's authors curated 62 recordings of it explicitly
(see upstream `notebooks/Collect bism allah.ipynb`); 54 match the
canonical 23-stroke primitive sequence:

  ٮ . س م ا ل ل ه ا ل ر ح م . ں ا ل ر ح ى . . م

Pipeline:
  build/extract_calliar_bismillah.py
    walks Calliar samples, filters to the 23-stroke pattern,
    picks the median sample by total path length, normalizes to
    a wide 2000×500 viewBox, resamples each stroke to ~22 points.
    Outputs build/seed/phrases_seed.json.
  build/add_bismillah_lesson.py
    appends the lesson to lessons_seed.json with title + body
    translations across all 51 corpan locales (+ ar).
  src/lessons.js
    new "phrase" lesson type: large RTL Arabic text up top, wide
    canvas auto-playing the 23-stroke animation, Play button +
    transliteration + translation row underneath. Animation is
    standalone (separate from LetterTraceLayer) because the
    lesson canvas has its own DPR sizing and aspect.
  src/styles.css
    .lesson-phrase + child classes — wide canvas wrap with
    aspect-ratio padding-top trick, sumi-ink + gold accent
    matching the rest of the pack.
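
The "filter to the pattern, then pick the median sample by total path length" step of the extractor might look roughly like this. The sample layout (a list of strokes, each a dict with `primitive` and `points`) and the function names are hypothetical, and with an even number of matches this sketch takes the upper of the two middle samples.

```python
from math import hypot


def stroke_length(points):
    return sum(hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in zip(points, points[1:]))


def total_length(sample):
    return sum(stroke_length(stroke["points"]) for stroke in sample)


def pick_median_sample(samples, pattern):
    """Keep samples whose primitive sequence equals pattern; return the
    one in the middle when ranked by total path length."""
    matching = [s for s in samples if [st["primitive"] for st in s] == pattern]
    if not matching:
        return None
    ranked = sorted(matching, key=total_length)
    return ranked[len(ranked) // 2]


def make_sample(primitives, length):
    # Toy helper: every stroke is a horizontal line of the given length.
    return [{"primitive": p, "points": [(0, 0), (length, 0)]} for p in primitives]


toy_pattern = ["ٮ", ".", "س"]
samples = [
    make_sample(toy_pattern, 10),
    make_sample(toy_pattern, 30),
    make_sample(toy_pattern, 20),
    make_sample(["ٮ", "س"], 15),  # wrong sequence, filtered out
]
chosen = pick_median_sample(samples, toy_pattern)
```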

Auto-plays on first render so the learner immediately sees the
preview; the Play button replays. speak("ar", phrase_ar) fires in
parallel so the phrase is heard too.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
…iants

Three sister features that together complete the "every glyph
variant a user sees has a real Calliar-derived animation" story.

(1) POSITIONAL FORMS — new build/extract_calliar_positional.py
classifies each primitive's appearance by neighbors:
    prev connects forward AND this connects back   → linked right
    this connects forward AND next connects back   → linked left
    both linked → MEDIAL    only right → FINAL
    only left → INITIAL     neither → ISOLATED
Dots are skipped (part of the current letter, not boundaries).
Letters are then composed per position using the same recipe
machinery (primitive + dot pattern). 94 of 100 (family, position)
writer rows now ship with scoring="median" and real Naskh
trajectories. The remaining 6 are positional Taa/DHaa — they'd
need a 4-primitive (stem + body + tails) pair-search we defer.
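
The neighbor-based classification in (1) can be sketched as below. The record layout (`name`, `connects_forward`, `connects_back`, `is_dot`) and the assumption that the input is a single word in writing order are illustrative and not taken from build/extract_calliar_positional.py.

```python
def classify_positions_sketch(primitives):
    """Label each non-dot primitive as isolated / initial / medial / final."""
    letters = [p for p in primitives if not p.get("is_dot")]  # dots are skipped
    labelled = []
    for i, cur in enumerate(letters):
        prev = letters[i - 1] if i > 0 else None
        nxt = letters[i + 1] if i + 1 < len(letters) else None
        linked_right = bool(prev and prev["connects_forward"] and cur["connects_back"])
        linked_left = bool(nxt and cur["connects_forward"] and nxt["connects_back"])
        if linked_right and linked_left:
            position = "medial"
        elif linked_right:
            position = "final"
        elif linked_left:
            position = "initial"
        else:
            position = "isolated"
        labelled.append((cur["name"], position))
    return labelled


def connector(name):
    return {"name": name, "connects_forward": True, "connects_back": True}


# باب: baa joins both ways; alif joins only to the letter before it.
alif = {"name": "alif", "connects_forward": False, "connects_back": True}
dot = {"name": "dot", "is_dot": True}
word = [connector("baa"), dot, alif, connector("baa"), dot]
positions = classify_positions_sketch(word)
```

The toy word باب yields initial, final, isolated, which matches how the three letters are actually shaped.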

(2) WORD-MODE ANIMATION — new WordTraceLayer.playStrokeOrder()
overrides the inherited LetterTraceLayer version. Pre-projects
each letter's medians using a factored _layoutSlots() helper
(extracted from the existing redraw), then runs every letter's
strokes in turn at its per-slot transform with a 350ms gap
between letters. Letters without real medians (would-be-positional
forms that don't have Calliar data yet) are silently skipped so
the pen doesn't draw a fake trajectory.

(3) MULTI-WRITER VARIANTS — extract_calliar_positional.choose_variants()
picks 3 trajectories at the 25th/50th/75th percentile of each
primitive's isolated-form aspect distribution, then composes
them into full letters (base + dots). Stored on the isolated
writer as record["variants"]. A new "three-dots" icon chip
appears in the toolbar next to the Play button; tapping it
cycles through variants 0..N-1. Hidden when the current letter
has fewer than 2 variants (composite Taa/DHaa, kaaf with its
single isolated Calliar sample). 25 of 28 letters now ship
3 variants each.

Bundle: 117 KB (+3 KB vs. previous), pack ZIP: 1.2 MB.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
…e pen

Two coupled fixes for the "writing looks like a child who doesn't
know how to write" UX:

(1) ALIGNMENT — medians now fit inside the outline's natural bbox
instead of filling the whole viewBox. Calliar primitives are
extracted normalized to viewBox 1000×1000, which makes the audit
grid readable, but the visible Amiri glyph the user is trying to
copy occupies only a fraction of that area (e.g. alif outline
y=461..782, ~32% of viewBox). Previously the animation pen traced
a "letter-shaped" path through air at 3× the actual letter size,
completely off the ghost outline.

New `refit_medians_to_outline_bbox()` in build_arabic_pack.py
scales + translates each letter's strokes (including programmatic
dots) so their combined bbox fits inside the outline's bbox,
preserving aspect ratio + centering. Applied at build time to
both the canonical medians and every multi-writer variant, so the
chip-cycled trajectories all align too.

Verified: alif median y now spans 461..782 exactly (was 80..920);
baa median bbox is (313, 640, 680, 867), entirely inside the
outline (313, 609, 680, 898). Positional forms (baa.initial,
baa.medial, etc.) likewise fit inside their respective outline
bboxes.
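
A minimal sketch of an aspect-preserving, centered bbox refit of this kind. It is not the actual `refit_medians_to_outline_bbox()`; the function names and the stroke layout (a list of point lists) are assumptions. The baa outline bbox is taken from the figures above, while the alif x-range is invented for the example.

```python
def bbox_of(strokes):
    xs = [x for stroke in strokes for x, _ in stroke]
    ys = [y for stroke in strokes for _, y in stroke]
    return min(xs), min(ys), max(xs), max(ys)


def refit_to_bbox_sketch(strokes, target):
    """Uniformly scale and translate strokes so their combined bbox fits
    inside target = (x0, y0, x1, y1), centered on both axes."""
    sx0, sy0, sx1, sy1 = bbox_of(strokes)
    tx0, ty0, tx1, ty1 = target
    sw, sh = sx1 - sx0, sy1 - sy0
    tw, th = tx1 - tx0, ty1 - ty0
    ratios = [t / s for s, t in ((sw, tw), (sh, th)) if s > 0]
    scale = min(ratios) if ratios else 1.0  # degenerate single point: no scaling
    ox = tx0 + (tw - sw * scale) / 2
    oy = ty0 + (th - sh * scale) / 2
    return [
        [(ox + (x - sx0) * scale, oy + (y - sy0) * scale) for x, y in stroke]
        for stroke in strokes
    ]


# A wide stroke refit into the baa outline bbox quoted above.
baa = refit_to_bbox_sketch([[(0, 0), (734, 454)]], (313, 609, 680, 898))
baa_bbox = bbox_of(baa)

# A vertical stroke spanning y=80..920 refit into y=461..782
# (the x-range 470..530 is a made-up value for illustration).
alif = refit_to_bbox_sketch([[(500, 80), (500, 920)]], (470, 461, 530, 782))
alif_bbox = bbox_of(alif)
```

Because the scale is uniform, a stroke set whose aspect differs from the outline's leaves a margin on one axis, which is why the baa median sits at y=640..867 inside an outline spanning y=609..898.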

(2) PEN — drastically thinned. trace.js lineWidth 18 → 4 CSS-px
(8 device px on retina); tip core 10 → 4, halo 20 → 8; dot core
8 → 3, halo +10 → +5. lessons.js Bismillah canvas even finer:
lineWidth 8 → 3, since 23 strokes packed into a 4:1 wide canvas
need a fine qalam nib to avoid smearing into neighbors.

Net effect: the animation pen tip now traces inside the visible
letter, with an elegant fine line instead of a paintbrush smear.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Calliar-recorded trajectories were the wrong source: each
primitive's recorded stroke included trailing connecting strokes
that bled out of the canonical isolated-letter shape. Result was
animations that looked unrelated to the visible ghost outline.

Instead, flatten each Amiri contour to a polygon via a fontTools
BasePen-derived FlattenPen, then walk one side of the polygon
between two extreme vertices to derive a median that *provably*
traces the visible outline:

- Tall contours (alif, laam, kaaf, raa, …) — walk topmost to
  leftmost along the higher-X edge (the spine + hook).
- Wide / square contours (baa, siin, dots) — walk rightmost to
  leftmost along the lower-Y edge (the upper silhouette).

Median is arc-length-resampled to 28 points so the animation is
smooth. Every letter writer now ships scoring="median" and
animates on Play. Multi-writer variant chip auto-hides (no
variants left).
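
Arc-length resampling, which several of these commits rely on (28 points here), can be sketched in a few lines. The function name is hypothetical and the builder's implementation may differ, for example by using numpy.

```python
from math import hypot


def resample_by_arc_length(points, n):
    """Return n points spaced evenly along the polyline's length."""
    if n < 2 or len(points) < 2:
        return list(points)
    # Cumulative distance from the start to each vertex.
    cumulative = [0.0]
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        cumulative.append(cumulative[-1] + hypot(x2 - x1, y2 - y1))
    total = cumulative[-1]
    if total == 0:
        return [points[0]] * n
    resampled = []
    seg = 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target distance.
        while seg < len(points) - 2 and cumulative[seg + 1] < target:
            seg += 1
        span = cumulative[seg + 1] - cumulative[seg]
        t = (target - cumulative[seg]) / span if span else 0.0
        (x1, y1), (x2, y2) = points[seg], points[seg + 1]
        resampled.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return resampled


corner = [(0, 0), (10, 0), (10, 10)]
five = resample_by_arc_length(corner, 5)
smooth = resample_by_arc_length(corner, 28)
```

Even spacing along the path keeps the animated pen moving at a constant speed regardless of how densely the source contour was sampled.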

Also disable the Bismillah phrase-canvas animation — the 23-stroke
Calliar trajectory packed into a 4:1 canvas produced unreadable
overlap. The lesson keeps its phrase text + transliteration +
translation; Play is TTS-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
The breakthrough: combine Calliar's authentic calligrapher pen
recordings with the Amiri outline as a mask. Calliar gives
direction, curvature, and naturalness; Amiri tells us where the
canonical letter ENDS so we can trim off connecting tails.

Per letter:
- Independent-axis bbox-align each Calliar candidate to the
  letter's body outline.
- shapely Polygon.contains test → longest contiguous IN-run is
  the canonical letter portion.
- Pick the candidate with the highest in-polygon coverage.
- scipy Savitzky-Golay smooth, arc-length resample to 24 points.
- Orient so first point is closest to the polygon's bbox top-right
  corner (natural RTL start).
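
The "longest contiguous IN-run" and "highest in-polygon coverage" steps can be sketched without the geometry dependencies. Here `inside` is any point predicate; in the build it would presumably wrap shapely's `Polygon.contains`, but this example swaps in a plain rectangle test so it runs with the standard library only. The function names are hypothetical, and alignment, smoothing, and orientation are omitted.

```python
from itertools import groupby


def longest_inside_run(points, inside):
    """Longest run of consecutive points for which inside(point) is true."""
    best = []
    for is_in, group in groupby(points, key=inside):
        run = list(group)
        if is_in and len(run) > len(best):
            best = run
    return best


def coverage(points, inside):
    """Fraction of a candidate's points that fall inside the mask."""
    return sum(1 for p in points if inside(p)) / len(points) if points else 0.0


def pick_masked_candidate(candidates, inside):
    """Take the candidate with the highest coverage, trimmed to its IN-run."""
    best = max(candidates, key=lambda c: coverage(c, inside))
    return longest_inside_run(best, inside)


def in_box(p):
    # Stand-in mask: a 10x10 box instead of a glyph outline polygon.
    return 0 <= p[0] <= 10 and 0 <= p[1] <= 10


with_tail = [(-5, 5), (1, 5), (2, 5), (20, 5), (3, 5), (4, 5), (5, 5), (30, 5)]
mostly_out = [(-5, 5), (-4, 5), (1, 5), (40, 5)]
trimmed = pick_masked_candidate([mostly_out, with_tail], in_box)
```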

This replaces both prior approaches: Calliar's raw trajectories
(too much trailing tail) AND outline-edge walks (wrong shape for
anything wider than alif). The resulting medians visibly trace
through the centerline of each letter — what a 4-CSS-px pen tip
needs to draw inside the visible glyph.

Bismillah phrase animation stays disabled (TTS-only) for v0.1;
word-level masking is a follow-up.

New build deps: numpy, shapely, scipy. Build runs under Python
3.11 (homebrew Python 3.14 has a broken pyexpat).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Big-picture: between v0.1.0 and v0.4.6 the pack settled into its
shippable shape. The Calligraphy mode + Bismillah animation
experiments from v0.2 were stripped (the raw Calliar handwriting
samples didn't deliver visually). Words mode now renders each word
as a single big elegant Amiri-rendered glyph via canvas fillText.
Score-me works in both Letters and Words. Swipe nav fires from
anywhere on the screen except the brush canvas + letter-picker row.

Key changes:

- v0.1.1: removed misleading per-letter stroke-order animation.
- v0.2.0 → v0.4.0: spike + revert of the Calligraphy mode
  (1642 Calliar recordings, gallery, Bismillah Watch). Pack dropped
  from 7.5 MB back to 1.2 MB. Calliar extractor + recordings JSON
  kept on disk for a potential v0.5 revisit (gitignored, regenerable).
- v0.3.0: Score-me geometric similarity (precision + coverage)
  against the letter outline. Letters mode only first; v0.4 added
  Words mode via a canvas-text-rasterized mask.
- v0.4.0: Words ghost = ctx.fillText with Amiri webfont (proper RTL
  shaping, contextual forms, kerning). Replaces the per-letter
  slot-by-slot rendering that read as "separate letters."
- v0.4.1 → v0.4.4: iterated layout safety margins, found the real
  bug — DrawingEngine.resize() was forcing canvas-layer CSS width
  to the shell's bounding rect, pushing the canvas buffer 24 px past
  the visible overflow clip on the right + bottom and visually
  shifting every centered glyph toward the lower-right. Now reads
  the canvas-layer's CSS-governed rect via getBoundingClientRect.
- v0.4.4: counter holes honored (ه م ظ و ف ق ل etc.) by combining
  all outline contours into one Path2D + fill once with evenodd —
  not per-contour fill which painted the hole solid.
- v0.4.5: the dirEnabled per-stroke highlight was still
  overpainting the hole. Switched to clip(target) + fill(combined,
  evenodd) so the highlight respects the counter.
- v0.4.6: swipe fires on pointermove (snappier) at 32 px threshold;
  all listeners on root matching lessons.js _wireSwipe pattern;
  per-pointer state Map for multi-touch robustness. Bismillah
  lesson body stripped of the obsolete "watch a calligrapher trace
  all 23 strokes" paragraph across all 51 locales.
- Build pipeline: arabic_calliar_recording table dropped from
  build_arabic_pack.py (the data is regenerable). DB back to ~1.4 MB.
- Android safe-area UA-detect floor (28 px top, 18 px bottom,
  8 px lateral) kicks in only when env(safe-area-inset-*) returns
  0 — iOS keeps its real notch via env(); Android with edge-to-edge
  disabled gets a sensible floor.

Files added (lean, tracked):
- build/extract_calliar_words.py — Calliar word/phrase extractor
  for a possible v0.5 revisit.
- build/audit_render.py — Pillow-based audit renderer that
  rasterizes each letter's ghost outline + pen trail and reports
  coverage metrics. Useful when iterating on layout.

Files gitignored (regenerable):
- build/vendor/calliar_recordings.json (17 MB)
- build/render_audit/ (per-letter PNGs)

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Umanistan changed the title from "Rasmapan v0.1.0 (DRAFT) — Arabic writing studio pack" to "Rasmapan v0.4.6 — Arabic writing studio pack (alphabet + word trace + score-me)" on May 20, 2026