patrixtano/mt5-base-finetuned-anaphora_czech

Text2Text Generation

Generated from Trainer

Inference Endpoints


patrixtano committed on Sep 12

Commit ca67270 • 1 Parent(s): bef632d

End of training

Files changed (2)

README.md +9 -9
model.safetensors +1 -1

README.md CHANGED

@@ -16,8 +16,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google/mt5-base](https://huggingface.co/google/mt5-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0348
-- Score: 29.6164
+- Loss: 0.0213
+- Score: 28.7290
 - Char Order: 6
 - Word Order: 0
 - Beta: 2
@@ -40,8 +40,8 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
-- train_batch_size: 16
-- eval_batch_size: 16
+- train_batch_size: 4
+- eval_batch_size: 4
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
@@ -49,11 +49,11 @@ The following hyperparameters were used during training:
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | Score   | Char Order | Word Order | Beta |
-|:-------------:|:-----:|:----:|:---------------:|:-------:|:----------:|:----------:|:----:|
-| 0.1076        | 1.0   | 2900 | 0.0491          | 29.5250 | 6          | 0          | 2    |
-| 0.0713        | 2.0   | 5800 | 0.0363          | 29.5776 | 6          | 0          | 2    |
-| 0.0628        | 3.0   | 8700 | 0.0348          | 29.6164 | 6          | 0          | 2    |
+| Training Loss | Epoch | Step  | Validation Loss | Score   | Char Order | Word Order | Beta |
+|:-------------:|:-----:|:-----:|:---------------:|:-------:|:----------:|:----------:|:----:|
+| 0.0604        | 1.0   | 11598 | 0.0277          | 28.6422 | 6          | 0          | 2    |
+| 0.041         | 2.0   | 23196 | 0.0224          | 28.7007 | 6          | 0          | 2    |
+| 0.0366        | 3.0   | 34794 | 0.0213          | 28.7290 | 6          | 0          | 2    |
 ### Framework versions
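
The per-epoch step counts in the two training-results tables (2900 before, 11598 after) can be checked against the batch-size change from 16 to 4. The sketch below is only a rough consistency check. It assumes a single device, no gradient accumulation, and that the final partial batch is kept, none of which the diff states. Under those assumptions the steps per epoch equal ceil(N / batch_size), so each run bounds the training-set size N. `size_range` is a hypothetical helper introduced for illustration.

```python
def size_range(steps_per_epoch: int, batch_size: int) -> range:
    """All dataset sizes N for which ceil(N / batch_size) == steps_per_epoch."""
    upper = steps_per_epoch * batch_size
    lower = upper - batch_size + 1
    return range(lower, upper + 1)

# Steps at epoch 1.0 in each table, paired with the reported train_batch_size.
old = size_range(2900, 16)   # before this commit
new = size_range(11598, 4)   # after this commit

# Dataset sizes compatible with both runs (assumed to share one training set).
lo = max(old.start, new.start)
hi = min(old.stop, new.stop) - 1
consistent_sizes = range(lo, hi + 1)

print(list(consistent_sizes))  # [46389, 46390, 46391, 46392]
```

A non-empty overlap suggests both runs could have used the same training set of roughly 46,390 examples, although the diff alone does not confirm this.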

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ca8d75cab9af779afc73de06b281abe2cf9a34e46153129204dcb8804b0cab25
+oid sha256:0248c3310446c293f038466ed848a189b455bf3d7480182dbdb682652ae10db6
 size 2329638768
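
The model.safetensors entry tracked in Git is an LFS pointer: a small text file of `key value` lines. In this commit only the `oid` changes, while `size` stays at 2329638768 bytes. A minimal sketch for comparing two pointers follows; `parse_lfs_pointer` is a hypothetical helper written for illustration and is not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict[str, str]:
    """Split each 'key value' line of a Git LFS pointer into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# Pointer contents copied from the two sides of the diff above.
OLD = """version https://git-lfs.github.com/spec/v1
oid sha256:ca8d75cab9af779afc73de06b281abe2cf9a34e46153129204dcb8804b0cab25
size 2329638768
"""
NEW = """version https://git-lfs.github.com/spec/v1
oid sha256:0248c3310446c293f038466ed848a189b455bf3d7480182dbdb682652ae10db6
size 2329638768
"""

old, new = parse_lfs_pointer(OLD), parse_lfs_pointer(NEW)

# Keys whose values differ between the two pointers.
changed = sorted(k for k in old if old[k] != new.get(k))
print(changed)  # ['oid']

# The size field is a byte count; convert to GiB for readability.
size_gib = int(new["size"]) / 2**30
print(f"{size_gib:.2f} GiB")  # 2.17 GiB
```

An unchanged size with a new hash is what one would expect when the weights are retrained but the architecture and dtype stay the same.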