ibraheemmoosa committed
Commit: 6107c56
1 Parent(s): 722d270

Add pretraining checkpoint documentation
README.md CHANGED

@@ -124,7 +124,11 @@ The details of the sentence order prediction example generation procedure for each
 - Split the sentence into two parts A and B at a random index.
 - With 50% probability swap the two parts.
 
-The model was pretrained on TPUv3-8 for 1M steps. We have checkpoints available every
+The model was pretrained on TPUv3-8 for 1M steps. We provide checkpoints at every 100k pretraining steps, each on a separate branch of this repository. You can load a specific checkpoint by passing the `revision` parameter; for example, to load the checkpoint at 500k steps:
+
+```python
+>>> AutoModel.from_pretrained('ibraheemmoosa/xlmindic-base-uniscript', revision='checkpoint_500k')
+```
 
 ## Evaluation results
 We evaluated this model on the Indo-Aryan subset of languages (Panjabi, Oriya, Assamese, Bangla, Hindi, Marathi, Gujarati) from the [IndicGLUE](https://huggingface.co/datasets/indic_glue) benchmark dataset. We report the mean and standard deviation of nine fine-tuning runs for this model. We compare with an [ablation model](https://huggingface.co/ibraheemmoosa/xlmindic-base-multiscript) that does not use transliteration and is instead trained on the original scripts.
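The snippet added by this commit assumes `AutoModel` is already imported. A minimal, self-contained sketch of loading a specific pretraining checkpoint, assuming the standard `transformers` Auto* API and that the tokenizer lives on the default branch (only the `checkpoint_500k` branch name is confirmed by this commit):

```python
# Minimal sketch: load a specific pretraining checkpoint by branch name.
# Only 'checkpoint_500k' is confirmed by this commit; other 100k-interval
# branch names would follow the same pattern but are an assumption.
from transformers import AutoModel, AutoTokenizer

repo = 'ibraheemmoosa/xlmindic-base-uniscript'

# The 500k-step checkpoint lives on its own branch, selected via `revision`.
model = AutoModel.from_pretrained(repo, revision='checkpoint_500k')

# Assumption: the tokenizer is identical across checkpoints, so the
# default (main) branch is used here.
tokenizer = AutoTokenizer.from_pretrained(repo)
```

Passing `revision` pins the download to a specific git branch, tag, or commit of the model repository, so the example also stays reproducible as the repository evolves.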