
Support StyleTTS2 OOD data requirements #816

Open
roedoejet wants to merge 18 commits into main from dev.ap/ood-data

@roedoejet

Member

PR Goal?

Figure out how to handle OOD (out-of-domain) data properly. OOD data is the text used for the WavLM-based loss in StyleTTS2. It is intentionally text from the same language that is unseen during training, and synthesizing it makes the synthesizer more robust to out-of-domain utterances. This PR:

Fixes?

Fixes part of #686

Feedback sought?

This is a big one, so mostly looking for feedback on the UX in the wizard for handling OOD data. Is it explained OK? How intuitive is it? Try configuring + preprocessing + training if you can.

Priority?

High. This is one of the last things for StyleTTS2, and probably the second-to-last major change.

Tests added?

I added a few.

How to test?

Configure a new project, add some OOD data, preprocess, train.

Confidence?

medium-high

Version change?

already changed for StyleTTS2

Related PRs?

EveryVoiceTTS/StyleTTS2#13

@semanticdiff-com

semanticdiff-com Bot commented Jun 11, 2026


Review changes with  SemanticDiff

Changed Files
File Status
  everyvoice/wizard/basic.py  64% smaller
  everyvoice/tests/test_cli.py  55% smaller
  everyvoice/tests/data/relative/config/everyvoice-text-to-wav.yaml  49% smaller
  mkdocs.yml  40% smaller
  everyvoice/tests/test_custom_g2p.py  39% smaller
  everyvoice/tests/test_wizard.py  23% smaller
  everyvoice/cli.py  9% smaller
  everyvoice/tests/test_preprocessing.py  6% smaller
  everyvoice/preprocessor/preprocessor.py  1% smaller
  Contributing.md Unsupported file format
  README.md Unsupported file format
  docs/guides/custom.md Unsupported file format
  docs/guides/fastspeech2.md Unsupported file format
  docs/guides/styletts2.md Unsupported file format
  everyvoice/.schema/everyvoice-text-to-wav-0.5.json  0% smaller
  everyvoice/base_cli/helpers.py  0% smaller
  everyvoice/model/e2e/StyleTTS2_lightning  0% smaller
  everyvoice/tests/data/mixed-cleaners-resume Unsupported file format
  everyvoice/wizard/__init__.py  0% smaller

@roedoejet roedoejet changed the title from "Dev.ap/ood data" to "Support StyleTTS2 OOD data requirements" on Jun 11, 2026

@github-actions

github-actions Bot commented Jun 11, 2026

Contributor
CLI load time: 0:00.28
Pull Request HEAD: 9b3edf619bc132c64553d9c61c33711eeadfb635

Imports that take more than 0.1 s:
import time: self [us] | cumulative | imported package
import time:       648 |     101588 |   typer
import time:      3603 |     101215 |             loguru
import time:      2554 |     150482 |           everyvoice.utils
import time:       849 |     151330 |         everyvoice.model.feature_prediction.FastSpeech2_lightning.fs2.cli.benchmark
import time:       531 |     159821 |       everyvoice.model.feature_prediction.FastSpeech2_lightning.fs2.cli.cli
import time:       321 |     160141 |     everyvoice.model.feature_prediction.FastSpeech2_lightning.fs2.cli
import time:      1309 |     161450 |   everyvoice.model.feature_prediction.FastSpeech2_lightning.fs2.cli.check_data
import time:      8340 |     407152 | everyvoice.cli
import time:      5385 |     132153 |   rich.markdown
import time:      4420 |     221490 | typer.rich_utils

@codecov

codecov Bot commented Jun 11, 2026


Codecov Report

❌ Patch coverage is 78.32168% with 62 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.41%. Comparing base (f7b2d4d) to head (9b3edf6).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| everyvoice/wizard/basic.py | 81.30% | 31 Missing and 12 partials ⚠️ |
| everyvoice/cli.py | 38.09% | 12 Missing and 1 partial ⚠️ |
| everyvoice/preprocessor/preprocessor.py | 82.35% | 3 Missing and 3 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #816      +/-   ##
==========================================
- Coverage   84.99%   84.41%   -0.58%     
==========================================
  Files          46       46              
  Lines        4219     4403     +184     
  Branches      632      656      +24     
==========================================
+ Hits         3586     3717     +131     
- Misses        494      533      +39     
- Partials      139      153      +14     
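
The figures in this report can be cross-checked with a little arithmetic. The sketch below is only an illustration: it copies the hit, miss, partial, and line counts from the tables above and recomputes the percentages. Two points are assumptions rather than anything stated in the report: the patch size of 286 lines is inferred from the 62 uncovered lines and the 78.32168% patch coverage, and the two-decimal percentages appear to be truncated rather than rounded.

```python
import math

# (missing, partial) counts per file, copied from the "Files with missing lines" table.
uncovered = {
    "everyvoice/wizard/basic.py": (31, 12),
    "everyvoice/cli.py": (12, 1),
    "everyvoice/preprocessor/preprocessor.py": (3, 3),
}
total_uncovered = sum(missing + partial for missing, partial in uncovered.values())


def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage of tracked lines."""
    return 100.0 * hits / lines


def truncate2(value: float) -> float:
    """Truncate (not round) to two decimals; assumed to match the report's display."""
    return math.floor(value * 100) / 100


# Project totals copied from the coverage diff: hits / lines for base and head.
base = coverage_pct(3586, 4219)
head = coverage_pct(3717, 4403)

# ASSUMPTION: 286 patch lines is inferred, not stated anywhere in the report.
patch_lines = 286
patch = coverage_pct(patch_lines - total_uncovered, patch_lines)

print(f"uncovered patch lines: {total_uncovered}")
print(f"base {truncate2(base):.2f}%  head {truncate2(head):.2f}%")
print(f"patch coverage: {patch:.5f}%")
```

Under those assumptions, the per-file counts add up to the 62 uncovered lines in the summary, and hits, misses, and partials add up to the line totals in both columns of the diff.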


@joanise joanise left a comment

Member

These are just my comments so far; I'll keep reviewing next week.

I know some of my comments are on text that you only moved rather than wrote, but I noticed things we could improve anyway.

I also know my comments so far are only on the docs, with nothing on the code yet. Next week...

Comment thread docs/guides/custom.md
Comment thread docs/guides/fastspeech2.md Outdated
Comment thread docs/guides/fastspeech2.md Outdated
Comment thread docs/guides/fastspeech2.md Outdated
Comment thread docs/guides/fastspeech2.md Outdated
Comment thread docs/guides/styletts2.md Outdated
@roedoejet roedoejet requested a review from joanise June 15, 2026 21:12
2 participants