sunfu-chou/git-base-naruto

@@ -15,8 +15,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [microsoft/git-base](https://huggingface.co/microsoft/git-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0506
-- Wer Score: 1.3548
 ## Model description
@@ -36,11 +36,11 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
-- train_batch_size: 2
 - eval_batch_size: 2
 - seed: 42
 - gradient_accumulation_steps: 2
-- total_train_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 50
@@ -50,19 +50,33 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch   | Step | Validation Loss | Wer Score |
 |:-------------:|:-------:|:----:|:---------------:|:---------:|
-| 7.32          | 3.7037  | 50   | 4.4920          | 7.1290    |
-| 2.3088        | 7.4074  | 100  | 0.4180          | 0.3710    |
-| 0.1216        | 11.1111 | 150  | 0.0417          | 0.4677    |
-| 0.0159        | 14.8148 | 200  | 0.0359          | 0.4194    |
-| 0.0112        | 18.5185 | 250  | 0.0404          | 0.4516    |
-| 0.0089        | 22.2222 | 300  | 0.0454          | 0.4839    |
-| 0.0076        | 25.9259 | 350  | 0.0468          | 0.4677    |
-| 0.0065        | 29.6296 | 400  | 0.0474          | 0.4355    |
-| 0.0045        | 33.3333 | 450  | 0.0517          | 0.8548    |
-| 0.0032        | 37.0370 | 500  | 0.0494          | 3.0       |
-| 0.0018        | 40.7407 | 550  | 0.0511          | 1.0323    |
-| 0.0011        | 44.4444 | 600  | 0.0512          | 1.5323    |
-| 0.001         | 48.1481 | 650  | 0.0506          | 1.3548    |
 ### Framework versions

 This model is a fine-tuned version of [microsoft/git-base](https://huggingface.co/microsoft/git-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.0495
+- Wer Score: 4.7488
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
+- train_batch_size: 4
 - eval_batch_size: 2
 - seed: 42
 - gradient_accumulation_steps: 2
+- total_train_batch_size: 8
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 50
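The `total_train_batch_size` values in this diff are consistent with `train_batch_size` multiplied by `gradient_accumulation_steps` (2 × 2 = 4 before the change, 4 × 2 = 8 after). A minimal sketch of that arithmetic follows, assuming single-device training; the function name and the `num_devices` parameter are hypothetical and do not come from the model card.

```python
def effective_batch_size(train_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of samples contributing to one optimizer step.

    Assumption: one device unless stated otherwise; the card does not
    say how many devices were used.
    """
    return train_batch_size * gradient_accumulation_steps * num_devices


# Values taken from the two sides of the diff.
old_total = effective_batch_size(train_batch_size=2, gradient_accumulation_steps=2)
new_total = effective_batch_size(train_batch_size=4, gradient_accumulation_steps=2)
print(old_total, new_total)
```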
 | Training Loss | Epoch   | Step | Validation Loss | Wer Score |
 |:-------------:|:-------:|:----:|:---------------:|:---------:|
+| 7.2487        | 1.8182  | 50   | 4.3718          | 22.0605   |
+| 2.0953        | 3.6364  | 100  | 0.2876          | 4.0186    |
+| 0.0846        | 5.4545  | 150  | 0.0417          | 0.4419    |
+| 0.023         | 7.2727  | 200  | 0.0380          | 0.4233    |
+| 0.018         | 9.0909  | 250  | 0.0369          | 0.4186    |
+| 0.0144        | 10.9091 | 300  | 0.0393          | 3.0093    |
+| 0.0116        | 12.7273 | 350  | 0.0407          | 6.9628    |
+| 0.0087        | 14.5455 | 400  | 0.0406          | 3.5209    |
+| 0.0062        | 16.3636 | 450  | 0.0423          | 14.7023   |
+| 0.0034        | 18.1818 | 500  | 0.0429          | 9.0372    |
+| 0.0024        | 20.0    | 550  | 0.0471          | 8.3442    |
+| 0.0013        | 21.8182 | 600  | 0.0469          | 13.5907   |
+| 0.0009        | 23.6364 | 650  | 0.0464          | 14.6186   |
+| 0.0005        | 25.4545 | 700  | 0.0468          | 11.1674   |
+| 0.0004        | 27.2727 | 750  | 0.0476          | 7.9907    |
+| 0.0003        | 29.0909 | 800  | 0.0480          | 7.3070    |
+| 0.0003        | 30.9091 | 850  | 0.0480          | 7.2140    |
+| 0.0003        | 32.7273 | 900  | 0.0484          | 6.9628    |
+| 0.0003        | 34.5455 | 950  | 0.0487          | 6.8512    |
+| 0.0003        | 36.3636 | 1000 | 0.0489          | 6.0698    |
+| 0.0003        | 38.1818 | 1050 | 0.0491          | 5.4837    |
+| 0.0003        | 40.0    | 1100 | 0.0492          | 4.9256    |
+| 0.0002        | 41.8182 | 1150 | 0.0493          | 4.7860    |
+| 0.0002        | 43.6364 | 1200 | 0.0493          | 4.8140    |
+| 0.0002        | 45.4545 | 1250 | 0.0494          | 4.8       |
+| 0.0002        | 47.2727 | 1300 | 0.0495          | 4.7581    |
+| 0.0002        | 49.0909 | 1350 | 0.0495          | 4.7488    |
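Several `Wer Score` entries above exceed 1 (for example 22.0605 at step 50). If the column is a standard word error rate, which the card does not confirm, this is possible because WER divides the word-level edit distance by the number of reference words, so a hypothesis much longer than its reference can score above 1. The sketch below is a plain-Python illustration of that definition, not the evaluation code used for this model; the example captions are made up.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed prefix of
    # ref and the first j words of hyp.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# A one-word reference against a four-word hypothesis needs three
# insertions, so the rate is 3 / 1.
print(word_error_rate("ninja", "ninja ninja ninja ninja"))
```

Under this definition, a model that repeats tokens or fails to stop generating can show a rising WER even while validation loss stays nearly flat.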
 ### Framework versions

model.safetensors

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:548033d6446f937090522afaaa3225d214b0daa2b6e8988aeab94b8b15ca4601
 size 706516040

 version https://git-lfs.github.com/spec/v1
+oid sha256:a1b671164e2bd941a53638004974d052182327625ae9a94b38e72ee1b5fa1a7d
 size 706516040
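The pointer above records only a SHA-256 digest and a byte size, so a downloaded `model.safetensors` can be checked against it locally. The following is a minimal sketch using the Python standard library; the helper names are hypothetical, and it assumes the pointer text has no leading diff markers (`+` or `-`).

```python
import hashlib
import os


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large weights need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pointer(path: str, pointer_text: str) -> bool:
    """Return True if the file's size and SHA-256 match the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    algorithm, _, expected = fields["oid"].partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algorithm}")
    if os.path.getsize(path) != int(fields["size"]):
        return False
    return sha256_of_file(path) == expected
```

Because the size is unchanged at 706516040 bytes on both sides of the diff, only the digest distinguishes the old weights from the new ones.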