Support StyleTTS2 OOD data requirements #816
Open
roedoejet wants to merge 18 commits into
separate fastspeech2 and styletts2 documentation
… all configs fix minor issues
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #816      +/-   ##
==========================================
- Coverage   84.99%   84.41%   -0.58%
==========================================
  Files          46       46
  Lines        4219     4403     +184
  Branches      632      656      +24
==========================================
+ Hits         3586     3717     +131
- Misses        494      533      +39
- Partials      139      153      +14
☔ View full report in Codecov by Harness.
Force-pushed from bff624b to c619313
This was referenced Jun 11, 2026
joanise (Member) reviewed on Jun 12, 2026 and left a comment:
Just my comments so far, I'll keep reviewing next week.
I know some of my comments are on text that you didn't just write, but rather only moved, but I noticed things we could improve anyway.
And I know, so far only on docs, nothing on the code yet. Next week...
small typos and some clarifications
PR Goal?
Figure out how to handle out-of-domain (OOD) data properly. OOD data is used for the WavLM-based loss in StyleTTS2: the model synthesizes text from the same language that is intentionally unseen during training, which makes the synthesizer more robust to out-of-domain utterances. This PR:
Fixes?
Fixes part of #686
Feedback sought?
This is a big one, so mostly looking for feedback on the UX in the wizard for handling OOD data. Is it explained OK? How intuitive is it? Try configuring + preprocessing + training if you can.
Priority?
High. This is one of the last things for StyleTTS2, and probably the second-to-last major change.
Tests added?
I added a few.
How to test?
Configure a new project, add some OOD data, preprocess, train.
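When preparing OOD data for the test above, it may help to confirm that the candidate text really is unseen in the training set. The sketch below is only an illustration of that idea, not code from this PR: `normalize` and `select_ood_texts` are hypothetical helper names, and the matching rule (NFC normalization, case folding, whitespace collapsing) is an assumption about what counts as "the same utterance".

```python
import unicodedata


def normalize(text: str) -> str:
    """Normalize text for comparison: NFC, case-folded, whitespace collapsed."""
    return " ".join(unicodedata.normalize("NFC", text).casefold().split())


def select_ood_texts(candidates, training_texts):
    """Return candidate utterances that do not appear in the training text.

    Hypothetical helper: exact match after normalization is an assumed
    definition of "seen during training", not the rule used by this PR.
    """
    seen = {normalize(t) for t in training_texts}
    selected = []
    for text in candidates:
        key = normalize(text)
        # Skip empty lines, utterances already in training, and duplicates.
        if key and key not in seen:
            seen.add(key)
            selected.append(text.strip())
    return selected


if __name__ == "__main__":
    training = ["Hello world.", "Good morning."]
    candidates = [
        "hello  world.",
        "The weather is nice today.",
        "",
        "The weather is nice today.",
    ]
    for line in select_ood_texts(candidates, training):
        print(line)
```

An exact-match filter like this only catches verbatim overlap; whether near-duplicates should also be excluded is a judgment call for the project.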
Confidence?
medium-high
Version change?
already changed for StyleTTS2
Related PRs?
EveryVoiceTTS/StyleTTS2#13