update model card
README.md
CHANGED
@@ -111,7 +111,7 @@ The following hyperparamenters were set on the Fairseq toolkit:
 | Dropout | 0.1 |
 | Label smoothing | 0.1 |
 
-The model was trained using shards of 10 million sentences, for a total of 8.548 updates. Weights were saved every 1000 updates and reported results are the average of the last
+The model was trained using shards of 10 million sentences, for a total of 8.548 updates. Weights were saved every 1000 updates and reported results are the average of the last 8 checkpoints.
 
 ## Evaluation
 
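
The changed line describes averaging the last 8 saved checkpoints. As a rough illustration only, the sketch below averages parameters element-wise across the most recent N checkpoints. It is plain Python and not the Fairseq implementation: the `average_checkpoints` function, the `StateDict` alias, and the toy checkpoint contents are all hypothetical, and each checkpoint is assumed to be a mapping from parameter name to a flat list of floats.

```python
from typing import Dict, List

# Hypothetical stand-in for a checkpoint: parameter name -> flat list of weights.
StateDict = Dict[str, List[float]]


def average_checkpoints(checkpoints: List[StateDict], last_n: int) -> StateDict:
    """Return the element-wise mean of the last `last_n` checkpoints.

    `checkpoints` is assumed to be ordered from oldest to newest, and every
    checkpoint is assumed to hold the same parameter names and shapes.
    """
    if last_n <= 0 or last_n > len(checkpoints):
        raise ValueError("last_n must be between 1 and the number of checkpoints")

    selected = checkpoints[-last_n:]
    averaged: StateDict = {}
    for name in selected[0]:
        # Group the i-th value of this parameter across all selected checkpoints.
        columns = zip(*(ckpt[name] for ckpt in selected))
        averaged[name] = [sum(column) / last_n for column in columns]
    return averaged


# Toy data: ten fake checkpoints, as if one were saved at each interval.
toy_checkpoints = [{"w": [float(i), 2.0 * i]} for i in range(1, 11)]
averaged = average_checkpoints(toy_checkpoints, last_n=8)
```

With real models the same idea is applied to tensors rather than lists, and the averaged weights are written out as a new checkpoint that is then used for evaluation.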