
docs: document the filtering process in a pr head that will persist#12

Open
joanise wants to merge 1 commit into main from dev.ej/document-filter

Conversation

joanise (Member) commented Jun 4, 2026

This branch of https://github.com/EveryVoiceTTS/StyleTTS2 is the original, unfiltered
branch, as forked from https://github.com/yl4579/StyleTTS2. It includes very large
pre-trained model files right in the repo, which make it expensive to clone.

We have now run git-filter-repo to remove the following large files from the history:

  • asr/epoch_00080.pth (94MB)
  • plbert/step_1000000.t7 (25MB)
  • jdc/bst.t7 (21MB)
  • data/OOD_texts.txt (31MB)
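
For reference, a removal like the one listed above can be sketched with git-filter-repo's `--invert-paths` and `--path` options. This is a sketch under assumptions, not a record of the command that was actually run: the function names `remove_large_files` and `history_touching` are made up for illustration, and the first one should only be run in a fresh clone because it rewrites history.

```shell
# Sketch only: the exact invocation used for this repo is not recorded here.
# Drops the four listed paths from every commit (requires git-filter-repo).
remove_large_files() {
    git filter-repo --invert-paths \
        --path asr/epoch_00080.pth \
        --path plbert/step_1000000.t7 \
        --path jdc/bst.t7 \
        --path data/OOD_texts.txt
}

# Lists commits, on any ref, that touch the given paths.
# Prints nothing when no commit in the history touches them.
history_touching() {
    git log --all --oneline -- "$@"
}
```

After filtering, `history_touching asr/epoch_00080.pth plbert/step_1000000.t7 jdc/bst.t7 data/OOD_texts.txt` should print nothing if the rewrite removed every trace of those paths.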

Instead of living in the repo with the source code, these files are now fetched
from HuggingFace repos when they are needed.

This message lives on a branch that is going to be deleted, so that cloning
StyleTTS2 no longer downloads the large files. The branch's commits will
nevertheless remain on GitHub, protected from garbage collection because they
are the head of this pull request. They will remain fetchable in the future by running:

git fetch origin pull/12/head:unfiltered-branch
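
GitHub advertises each pull request head as `refs/pull/<number>/head`, which is what the fetch above relies on. As a minimal sketch, assuming the remote is named `origin`, one could first confirm the ref is still advertised and only then fetch it. The function name `fetch_unfiltered` is hypothetical.

```shell
# Check that the PR head ref is still advertised by the remote, then fetch it
# into a local branch named unfiltered-branch.
fetch_unfiltered() {
    remote="${1:-origin}"
    # --exit-code makes ls-remote fail when no matching ref is found.
    if git ls-remote --exit-code "$remote" refs/pull/12/head >/dev/null; then
        git fetch "$remote" pull/12/head:unfiltered-branch
    else
        echo "refs/pull/12/head not found on $remote" >&2
        return 1
    fi
}
```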

Please never merge any commits from any unfiltered branch back into the filtered
main history! This PR is here only for future reference, and so that when we
check out EveryVoice commits from before we filtered this repo, git submodule
update is able to find the required commits.
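
One possible guard against an accidental merge, sketched under the assumption that the PR head has already been fetched locally as `unfiltered-branch`: ask git whether that tip is reachable from the branch about to be pushed. The function name `contains_unfiltered` is hypothetical, and the check only detects the tip of the unfiltered history, not every individual unfiltered commit.

```shell
# Returns 0 (and warns) if the tip of the unfiltered history is reachable from
# the given branch, 1 if it is not, and 2 if unfiltered-branch does not exist.
contains_unfiltered() {
    branch="${1:-main}"
    if ! git rev-parse --verify --quiet unfiltered-branch >/dev/null; then
        echo "unfiltered-branch not found; fetch pull/12/head first" >&2
        return 2
    fi
    if git merge-base --is-ancestor unfiltered-branch "$branch"; then
        echo "WARNING: $branch contains the unfiltered history" >&2
        return 0
    fi
    return 1
}
```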

joanise (Member, Author) commented Jun 4, 2026

Once we are ready to switch to the filtered history, do this:

  • rebase this PR onto main
  • fast-forward merge it (from the CLI, not from the GUI!)
  • force push the filtered-history onto main
  • delete or force push all other branches so there are no traces of the unfiltered history in existing branches (only in PR heads)
  • ask everyone to delete their existing sandboxes and start fresh, or else force update their sandboxes while making sure they never merge the unfiltered history back in
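
The steps above can be sketched as git commands. This is one reading of the list, not a tested procedure for this repo. It assumes the remote is named `origin`, that the filtered history sits in a local branch called `filtered-history` (a hypothetical name), and that the fast-forward merge is pushed to `main` before the force push. The function names are also made up for illustration.

```shell
# Maintainer side: land this PR on the unfiltered main with a fast-forward,
# then replace main on the remote with the filtered history.
switch_to_filtered_history() {
    git checkout dev.ej/document-filter &&
    git rebase main &&
    git checkout main &&
    git merge --ff-only dev.ej/document-filter &&
    git push origin main &&
    git push --force origin filtered-history:main
}

# Contributor side: force update an existing sandbox by discarding the local
# main and taking the rewritten one. Any local commits on main are lost.
reset_sandbox_to_filtered_main() {
    git fetch origin &&
    git checkout main &&
    git reset --hard origin/main
}
```

Deleting the remaining unfiltered branches is left out because the list of branches is not given here. Starting from a fresh clone remains the simpler option for contributors.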
