Thanks for the great set of templates for NLI model testing! I noticed that you provide error rates for five models in data/checklist_master.tsv (shown below):
BERT DistilBERT RoBERTa-large DeBERTa RoBERTa-SNLI-MNLI-FEVER-ANLI
I am trying to reproduce some of the errors on a smaller subset of your dataset. However, I am not sure exactly which models these correspond to. Are they models you fine-tuned yourself, or are they checkpoints that were already fine-tuned and hosted on the HuggingFace model hub?
If you fine-tuned your models yourself, would it be possible to share your fine-tuning scripts so that I could start from exactly the same setting (mostly the hyperparameters) as you did?
Thank you in advance!