Add NVIDIA NeMo Text Processing example #438

Open
GeoSegun wants to merge 3 commits into main from nemo-text-processing

@GeoSegun
Member

Summary

This PR adds a new Saturn Cloud workspace template for NVIDIA NeMo Text Processing — a library for normalizing text in speech AI pipelines. No GPU or API key required; everything runs on CPU using Weighted Finite-State Transducers (WFST).

What the template demonstrates

Two core operations used in speech AI systems:

| Operation | Direction | Use case |
| --- | --- | --- |
| Text Normalization (TN) | Written → Spoken | Preprocess text before Text-to-Speech (TTS) |
| Inverse Text Normalization (ITN) | Spoken → Written | Post-process transcript from Automatic Speech Recognition (ASR) |
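
The two directions in the table can be sketched roughly as follows. This is an illustration, not code taken from the notebook in this PR: the import paths and the `normalize()` / `inverse_normalize()` calls follow the library's documented API as I understand it and should be checked against the installed version, and the helper names `load_english_models`, `to_spoken` and `to_written` are hypothetical.

```python
def load_english_models():
    """Build English TN and ITN objects.

    Imports are deferred so this module loads even where
    nemo_text_processing is not installed yet.
    """
    from nemo_text_processing.text_normalization.normalize import Normalizer
    from nemo_text_processing.inverse_text_normalization.inverse_normalize import (
        InverseNormalizer,
    )
    return Normalizer(input_case="cased", lang="en"), InverseNormalizer(lang="en")


def to_spoken(normalizer, text):
    # TN: written -> spoken, applied to text before it is sent to TTS.
    return normalizer.normalize(text, verbose=False)


def to_written(inverse_normalizer, transcript):
    # ITN: spoken -> written, applied to a transcript coming out of ASR.
    return inverse_normalizer.inverse_normalize(transcript, verbose=False)


# Example usage (requires the library to be installed):
#   tn, itn = load_english_models()
#   print(to_spoken(tn, "The meeting is at 3:30 pm on Jan 5."))
#   print(to_written(itn, "the total is twenty five dollars"))
```

Passing the model objects in as arguments keeps the relatively slow grammar construction in one place, so a notebook would build them once and reuse them across cells.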

Files added

| File | Purpose |
| --- | --- |
| `.saturn/saturn.json` | Workspace recipe: CPU Large, 10Gi disk, 1-hour auto-shutoff |
| `start.sh` | Installs nemo_text_processing on first start; skips if already installed |
| `nemo_text_processing_demo.ipynb` | Pre-built notebook with 6 sections users can run immediately |
| `README.md` | Full user-facing documentation |
| `.gitignore` | Excludes logs, cache and notebook checkpoints |

Notebook structure

  1. Verify Installation — confirms library is ready before running anything
  2. Text Normalization — numbers, dates, times, abbreviations and measurements in English
  3. Multilingual TN — same operations in German and Spanish to demonstrate 15-language support
  4. Inverse Text Normalization — converts ASR spoken output back to written form
  5. Batch Processing: normalize_list() for processing multiple texts in parallel
  6. TTS / ASR Pipeline Examples — end-to-end pre/post-processing scenarios showing real-world usage
  7. Try It Yourself — open sandbox cells for users to test their own text and language
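
As a rough sketch of the batch step, the helper below prefers `normalize_list()` and falls back to a per-string loop. The `n_jobs` and `batch_size` keyword arguments are assumed from the library's `normalize_list()` signature and should be verified against the installed version; `normalize_batch` itself is a hypothetical helper, not something shipped in the notebook.

```python
def normalize_batch(normalizer, texts, n_jobs=1, batch_size=1):
    """Normalize many strings, using normalize_list() when the object offers it."""
    if hasattr(normalizer, "normalize_list"):
        # Assumed signature: normalize_list(texts, verbose=..., batch_size=..., n_jobs=...)
        return list(
            normalizer.normalize_list(
                texts, verbose=False, batch_size=batch_size, n_jobs=n_jobs
            )
        )
    # Fallback: one call per string, in input order.
    return [normalizer.normalize(text, verbose=False) for text in texts]


# Example usage (requires the library to be installed):
#   from nemo_text_processing.text_normalization.normalize import Normalizer
#   tn = Normalizer(input_case="cased", lang="en")
#   print(normalize_batch(tn, ["Call 555-1234.", "It weighs 12 kg."], n_jobs=2))
```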

Design decisions

  • No API key, no GPU — pure WFST rules baked into the library; nothing external needed
  • Notebook-first — the NVIDIA documentation recommends notebooks for this library; aligns with user expectations
  • Standard Saturn Python image: nemo_text_processing installs cleanly via pip on Linux x86_64
  • Install in start_script — keeps the image standard; idempotent check skips reinstall on restart
  • Three of the 15 supported languages showcased: English is primary, with German and Spanish included to demonstrate multilingual capability without overwhelming the notebook
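
The idempotent install check can be illustrated with a short Python analogue. The actual `start.sh` is a shell script and is not reproduced here, so this is only a sketch of the same idea; `ensure_installed` and its `runner` parameter are hypothetical names introduced for illustration.

```python
import importlib.util
import subprocess
import sys


def ensure_installed(module_name, pip_name=None, runner=subprocess.check_call):
    """Install a package only when its top-level module cannot be found.

    Returns True if an install was attempted, False if it was skipped.
    """
    if importlib.util.find_spec(module_name) is not None:
        # Already importable: skip the install so restarts stay fast.
        return False
    runner([sys.executable, "-m", "pip", "install", pip_name or module_name])
    return True


# Example usage:
#   ensure_installed("nemo_text_processing")
```

Checking for the module before invoking pip is what makes a second start cheap: no network access happens when the package is already present.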

Prerequisites for users

None — just click Start.

Test plan

  • Workspace deployed on Saturn Cloud — nemo_text_processing installed in ~25 seconds
  • All 6 notebook sections run successfully with correct output
  • TN and ITN examples produce expected results
  • Multilingual normalizers (German, Spanish) work correctly
  • Batch processing cell completes without error
  • start.sh idempotent check verified — skips reinstall on second start
  • CI validation passes (recipe schema, naming, notebook outputs cleared)
  • Add entry to templates-hosted.json when promoting to the template gallery
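
For the "notebook outputs cleared" item, a local check along these lines may be useful before pushing. The CI implementation is not part of this PR, so this is only a sketch that assumes the standard nbformat 4 layout (`cells`, `cell_type`, `outputs`, `execution_count`); `cells_with_outputs` is a hypothetical helper name.

```python
import json


def cells_with_outputs(notebook_json):
    """Return indexes of code cells that still carry outputs or an execution count."""
    notebook = json.loads(notebook_json)
    dirty = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # Markdown and raw cells have no outputs to clear.
        if cell.get("outputs") or cell.get("execution_count") is not None:
            dirty.append(index)
    return dirty


# Example usage:
#   with open("nemo_text_processing_demo.ipynb", encoding="utf-8") as handle:
#       print(cells_with_outputs(handle.read()))
```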
