hugo-albert committed on
Commit 6ffc23c · verified · 1 Parent(s): e580162

Training complete

Files changed (4)
  1. README.md +19 -17
  2. adapter_config.json +2 -2
  3. adapter_model.bin +1 -1
  4. training_args.bin +1 -1
README.md CHANGED
@@ -17,9 +17,9 @@ should probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.7738
- - Bleu: 67.4647
- - Gen Len: 75.9455
+ - Loss: 1.1142
+ - Bleu: 58.3679
+ - Gen Len: 74.4727

  ## Model description

@@ -39,27 +39,29 @@ More information needed

  The following hyperparameters were used during training:
  - learning_rate: 0.0001
- - train_batch_size: 16
- - eval_batch_size: 16
+ - train_batch_size: 8
+ - eval_batch_size: 8
  - seed: 42
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 32
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
  - num_epochs: 10

  ### Training results

- | Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
- |:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
- | No log | 1.0 | 67 | 2.6896 | 29.0389 | 96.5455 |
- | No log | 2.0 | 134 | 1.6534 | 30.4693 | 96.6727 |
- | No log | 3.0 | 201 | 1.2046 | 55.0467 | 76.7455 |
- | No log | 4.0 | 268 | 1.0048 | 59.5519 | 76.9091 |
- | No log | 5.0 | 335 | 0.9176 | 64.2229 | 75.5455 |
- | No log | 6.0 | 402 | 0.8610 | 65.8311 | 73.6909 |
- | No log | 7.0 | 469 | 0.8160 | 65.5771 | 76.4727 |
- | 1.5731 | 8.0 | 536 | 0.7968 | 67.9558 | 74.7636 |
- | 1.5731 | 9.0 | 603 | 0.7794 | 67.5994 | 75.8 |
- | 1.5731 | 10.0 | 670 | 0.7738 | 67.4647 | 75.9455 |
+ | Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
+ |:-------------:|:-----:|:----:|:---------------:|:-------:|:--------:|
+ | No log | 0.99 | 33 | 3.5500 | 28.0752 | 98.4545 |
+ | No log | 2.0 | 67 | 2.6889 | 28.4762 | 97.7273 |
+ | No log | 2.99 | 100 | 2.1016 | 13.9425 | 131.9636 |
+ | No log | 4.0 | 134 | 1.6955 | 20.9551 | 114.3091 |
+ | No log | 4.99 | 167 | 1.4578 | 44.5358 | 83.4 |
+ | No log | 6.0 | 201 | 1.2986 | 53.9615 | 75.0545 |
+ | No log | 6.99 | 234 | 1.2113 | 56.6086 | 77.4182 |
+ | No log | 8.0 | 268 | 1.1550 | 57.2346 | 73.8364 |
+ | No log | 8.99 | 301 | 1.1222 | 58.1529 | 74.2 |
+ | No log | 9.85 | 330 | 1.1142 | 58.3679 | 74.4727 |


  ### Framework versions
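The Step and Epoch columns of the new results table follow from the changed batch settings: with `gradient_accumulation_steps: 4`, one optimizer step consumes four micro-batches of size 8, giving the reported `total_train_batch_size` of 32. The sketch below reproduces that arithmetic. It assumes a single device and 134 micro-batches of size 8 per epoch; that count is inferred from the table (step 67 lands on epoch 2.0) and is not stated in the commit. The helper names are hypothetical.

```python
# Sketch: relate optimizer steps to epochs under gradient accumulation.
TRAIN_BATCH_SIZE = 8           # from the new README
GRAD_ACCUM_STEPS = 4           # from the new README
MICRO_BATCHES_PER_EPOCH = 134  # ASSUMPTION: inferred from the table, not stated


def total_train_batch_size(per_device: int, accum: int, num_devices: int = 1) -> int:
    """Effective batch size per optimizer step (single device assumed)."""
    return per_device * accum * num_devices


def epoch_at_step(step: int,
                  accum: int = GRAD_ACCUM_STEPS,
                  micro_batches_per_epoch: int = MICRO_BATCHES_PER_EPOCH) -> float:
    """Fraction of epochs completed after `step` optimizer steps."""
    return step * accum / micro_batches_per_epoch


if __name__ == "__main__":
    print(total_train_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (33, 67, 100, 330):
        print(step, round(epoch_at_step(step), 2))
```

Under these assumptions the run ends at step 330, about 9.85 epochs, which would explain why the last row of the new table stops short of epoch 10. The new run also takes 330 optimizer steps where the previous one took 670.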
adapter_config.json CHANGED
@@ -20,8 +20,8 @@
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
- "v_proj",
- "q_proj"
+ "q_proj",
+ "v_proj"
  ],
  "task_type": "SEQ_2_SEQ_LM",
  "use_dora": false,
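The `target_modules` hunk only swaps the order of its two entries. A minimal sketch of checking this, assuming the order of `target_modules` is not significant for how the adapter selects modules (an assumption, not something this commit states); `same_modules` is a hypothetical helper.

```python
# Sketch: compare the old and new target_modules lists from the diff.
old_target_modules = ["v_proj", "q_proj"]  # removed lines
new_target_modules = ["q_proj", "v_proj"]  # added lines


def same_modules(a: list[str], b: list[str]) -> bool:
    """True when both lists name the same modules, ignoring order."""
    return set(a) == set(b)


if __name__ == "__main__":
    print("lists identical:", old_target_modules == new_target_modules)
    print("same module set:", same_modules(old_target_modules, new_target_modules))
```

If the assumption holds, this hunk changes serialization order only, and the difference in results comes from the retraining reflected in the README and the new `adapter_model.bin` hash.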
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d95e076dd55aa2db5b8d4f5e24176646443ea33ff2c8cbff3ca1d8b09db968d2
+ oid sha256:57fe5c5a9627945acff7a4c3b2533a16c27cb0237c2a7359e9fb3624bba63c68
  size 9490378
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:9568172fb4f6db188aab6fa3d5fb9b6a67ac546b3ef822e05a6a5d92fcc3da1c
+ oid sha256:1a706b21a1a550b35711b310fd213517ab949b1fbc24ae6e053bf6cb0d2a55fb
  size 4664