ibraheemmoosa committed
Commit: 6107c56
1 Parent(s): 722d270

Add pretraining checkpoint documentation
README.md CHANGED

@@ -124,7 +124,11 @@ The details of the sentence order prediction example generation procedure for each
 - Split the sentence into two parts A and B at a random index.
 - With 50% probability swap the two parts.
 
-The model was pretrained on TPUv3-8 for 1M steps. We have checkpoints available every
+The model was pretrained on TPUv3-8 for 1M steps. We provide checkpoints at every 100k pretraining steps, each on a separate branch of this repository. You can load a specific checkpoint by passing the `revision` parameter; for example, to load the checkpoint at 500k steps:
+
+```python
+>>> AutoModel.from_pretrained('ibraheemmoosa/xlmindic-base-uniscript', revision='checkpoint_500k')
+```
 
 ## Evaluation results
 We evaluated this model on the Indo-Aryan subset of languages (Panjabi, Oriya, Assamese, Bangla, Hindi, Marathi, Gujarati) from the [IndicGLUE](https://huggingface.co/datasets/indic_glue) benchmark dataset. We report the mean and standard deviation of nine fine-tuning runs for this model. We compare with an [ablation model](https://huggingface.co/ibraheemmoosa/xlmindic-base-multiscript) that does not use transliteration and is instead trained on the original scripts.
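The snippet added by this commit assumes `AutoModel` is already imported. A minimal, self-contained sketch of loading a specific pretraining checkpoint, assuming the standard `transformers` Auto* API and that the tokenizer lives on the default branch (only the `checkpoint_500k` branch name is confirmed by this commit):

```python
# Minimal sketch: load a specific pretraining checkpoint by branch name.
# Only 'checkpoint_500k' is confirmed by this commit; other 100k-interval
# branch names would follow the same pattern but are an assumption.
from transformers import AutoModel, AutoTokenizer

repo = 'ibraheemmoosa/xlmindic-base-uniscript'

# The 500k-step checkpoint lives on its own branch, selected via `revision`.
model = AutoModel.from_pretrained(repo, revision='checkpoint_500k')

# Assumption: the tokenizer is identical across checkpoints, so the
# default (main) branch is used here.
tokenizer = AutoTokenizer.from_pretrained(repo)
```

Passing `revision` pins the download to a specific git branch, tag, or commit of the model repository, so the example also stays reproducible as the repository evolves.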