morit committed on
Commit
ec4b402
1 Parent(s): 3dbf0b5

Update README.md

  1. README.md +9 -8
README.md CHANGED
@@ -50,17 +50,18 @@ classifier(sequence_to_classify, candidate_labels, hypothesis_template=hypothesi
50
 
51
 
52
  ## Training
53
- This model was pre-trained on a set of 100 languages and then further trained on 198M multilingual tweets, as described in the original [paper](https://arxiv.org/abs/2104.12250). Further, it was trained on the French training set of the XNLI dataset, which is a machine-translated version of the MNLI dataset. It was trained for 3 epochs with the following specifications
54
- - learning rate: 5e-5
 
 
55
  - batch size: 32
56
  - max sequence length: 128
57
 
58
- on one GPU (NVIDIA GeForce RTX 3090) resulting in a training time of 1h 47 mins.
59
 
60
-
61
  ## Evaluation
62
- The model was evaluated after each epoch on the eval set of the XNLI Corpus and at the end of training on the Test set of the XNLI corpus.
63
- Using the test set the model reached an accuracy of
64
- ```
65
- predict_accuracy = 77.72 %
66
  ```
 
 
 
50
 
51
 
52
  ## Training
53
+ This model was pre-trained on a set of 100 languages and then further trained on 198M multilingual tweets, as described in the original [paper](https://arxiv.org/abs/2104.12250). Further, it was trained on the French training set of the XNLI dataset, which is a machine-translated version of the MNLI dataset. Training ran for 5 epochs on the XNLI train set, and the model was evaluated on the XNLI eval set at the end of every epoch to find the best-performing checkpoint. The model with the highest accuracy on the eval set was chosen at the end.
54
+
55
+ ![Training Charts from wandb](screen_wandb.png)
56
+ - learning rate: 2e-5
57
  - batch size: 32
58
  - max sequence length: 128
59
 
60
+ Training ran on a GPU (NVIDIA GeForce RTX 3090) and took 1h 47 mins.
61
 
 
62
  ## Evaluation
63
+
64
+ The best-performing model was evaluated on the XNLI test set to get a comparable result:
 
 
65
  ```
66
+ predict_accuracy = 78.02 %
67
+ ```
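
The selection rule described in the updated Training section (evaluate at the end of every epoch, keep the checkpoint with the highest eval accuracy) can be sketched in plain Python. This is only an illustration and not the author's training script: `EpochResult`, `select_best_epoch`, and `HYPERPARAMS` are hypothetical names, the per-epoch accuracies are placeholders rather than the model's real scores, and breaking ties in favour of the earliest epoch is an assumption the README does not state.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EpochResult:
    """Eval-set accuracy recorded at the end of one training epoch."""
    epoch: int
    eval_accuracy: float


# Hyperparameters as listed in the updated README.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "batch_size": 32,
    "max_seq_length": 128,
    "num_epochs": 5,
}


def select_best_epoch(results: List[EpochResult]) -> EpochResult:
    """Return the epoch with the highest eval accuracy.

    Ties go to the earliest epoch (an assumption, not stated in the README).
    """
    if not results:
        raise ValueError("no epoch results to choose from")
    return max(results, key=lambda r: (r.eval_accuracy, -r.epoch))


# Placeholder accuracies for illustration only, NOT the model's real scores.
example_results = [
    EpochResult(epoch=1, eval_accuracy=0.70),
    EpochResult(epoch=2, eval_accuracy=0.74),
    EpochResult(epoch=3, eval_accuracy=0.76),
    EpochResult(epoch=4, eval_accuracy=0.75),
    EpochResult(epoch=5, eval_accuracy=0.76),
]

best = select_best_epoch(example_results)
print(f"best epoch: {best.epoch} (eval accuracy {best.eval_accuracy:.2f})")
```

Only the checkpoint chosen this way would then be scored once on the XNLI test set, which keeps the test set out of model selection.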