Marcos12886 committed on
Commit
4c99118
1 Parent(s): 969bb37

Upload folder using huggingface_hub

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. README.md +9 -33
  2. checkpoint-111/model.safetensors +1 -1
  3. checkpoint-111/optimizer.pt +1 -1
  4. checkpoint-111/scheduler.pt +1 -1
  5. checkpoint-111/trainer_state.json +33 -51
  6. checkpoint-111/training_args.bin +1 -1
  7. checkpoint-126/config.json +84 -0
  8. checkpoint-126/model.safetensors +3 -0
  9. checkpoint-126/optimizer.pt +3 -0
  10. checkpoint-126/rng_state.pth +3 -0
  11. checkpoint-126/scheduler.pt +3 -0
  12. checkpoint-126/trainer_state.json +105 -0
  13. checkpoint-126/training_args.bin +3 -0
  14. checkpoint-18/model.safetensors +1 -1
  15. checkpoint-18/optimizer.pt +1 -1
  16. checkpoint-18/scheduler.pt +1 -1
  17. checkpoint-18/trainer_state.json +8 -11
  18. checkpoint-18/training_args.bin +1 -1
  19. checkpoint-37/model.safetensors +1 -1
  20. checkpoint-37/optimizer.pt +1 -1
  21. checkpoint-37/scheduler.pt +1 -1
  22. checkpoint-37/trainer_state.json +14 -20
  23. checkpoint-37/training_args.bin +1 -1
  24. checkpoint-54/model.safetensors +1 -1
  25. checkpoint-54/optimizer.pt +1 -1
  26. checkpoint-54/rng_state.pth +2 -2
  27. checkpoint-54/trainer_state.json +25 -16
  28. checkpoint-54/training_args.bin +1 -1
  29. checkpoint-55/model.safetensors +1 -1
  30. checkpoint-55/optimizer.pt +1 -1
  31. checkpoint-55/scheduler.pt +1 -1
  32. checkpoint-55/trainer_state.json +18 -27
  33. checkpoint-55/training_args.bin +1 -1
  34. checkpoint-74/model.safetensors +1 -1
  35. checkpoint-74/optimizer.pt +1 -1
  36. checkpoint-74/scheduler.pt +1 -1
  37. checkpoint-74/trainer_state.json +23 -35
  38. checkpoint-74/training_args.bin +1 -1
  39. checkpoint-93/model.safetensors +1 -1
  40. checkpoint-93/optimizer.pt +1 -1
  41. checkpoint-93/scheduler.pt +1 -1
  42. checkpoint-93/trainer_state.json +29 -44
  43. checkpoint-93/training_args.bin +1 -1
  44. model.safetensors +1 -1
  45. runs/Sep02_21-37-15_ubumarcos/events.out.tfevents.1725305838.ubumarcos +3 -0
  46. runs/Sep02_23-16-29_ubumarcos/events.out.tfevents.1725311792.ubumarcos +3 -0
  47. runs/Sep02_23-18-00_ubumarcos/events.out.tfevents.1725311883.ubumarcos +3 -0
  48. runs/Sep03_00-14-06_ubumarcos/events.out.tfevents.1725315248.ubumarcos +3 -0
  49. runs/Sep03_13-16-32_ubumarcos/events.out.tfevents.1725362195.ubumarcos +3 -0
  50. runs/Sep03_13-18-34_ubumarcos/events.out.tfevents.1725362316.ubumarcos +3 -0
README.md CHANGED
@@ -8,9 +8,6 @@ datasets:
8
  - audiofolder
9
  metrics:
10
  - accuracy
11
- - f1
12
- - precision
13
- - recall
14
  model-index:
15
  - name: distilhubert-finetuned-mixed-data
16
  results:
@@ -26,16 +23,7 @@ model-index:
26
  metrics:
27
  - name: Accuracy
28
  type: accuracy
29
- value: 0.9026845637583892
30
- - name: F1
31
- type: f1
32
- value: 0.9017814679012008
33
- - name: Precision
34
- type: precision
35
- value: 0.901095676384633
36
- - name: Recall
37
- type: recall
38
- value: 0.9026845637583892
39
  ---
40
 
41
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -45,11 +33,8 @@ should probably proofread and complete it, then remove this comment. -->
45
 
46
  This model is a fine-tuned version of [ntu-spml/distilhubert](https://huggingface.co/ntu-spml/distilhubert) on the audiofolder dataset.
47
  It achieves the following results on the evaluation set:
48
- - Loss: 0.2976
49
- - Accuracy: 0.9027
50
- - F1: 0.9018
51
- - Precision: 0.9011
52
- - Recall: 0.9027
53
 
54
  ## Model description
55
 
@@ -77,24 +62,15 @@ The following hyperparameters were used during training:
77
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
78
  - lr_scheduler_type: cosine
79
  - lr_scheduler_warmup_ratio: 0.001
80
- - num_epochs: 12
81
 
82
  ### Training results
83
 
84
- | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall |
85
- |:-------------:|:-------:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
86
- | No log | 0.9664 | 18 | 0.6696 | 0.7819 | 0.7264 | 0.6898 | 0.7819 |
87
- | No log | 1.9866 | 37 | 0.5068 | 0.7752 | 0.7203 | 0.6849 | 0.7752 |
88
- | No log | 2.9530 | 55 | 0.4304 | 0.8087 | 0.7535 | 0.7242 | 0.8087 |
89
- | No log | 3.9732 | 74 | 0.4109 | 0.8523 | 0.8434 | 0.8728 | 0.8523 |
90
- | No log | 4.9933 | 93 | 0.3263 | 0.8725 | 0.8718 | 0.8719 | 0.8725 |
91
- | No log | 5.9597 | 111 | 0.3036 | 0.8826 | 0.8824 | 0.8824 | 0.8826 |
92
- | No log | 6.9799 | 130 | 0.3046 | 0.8893 | 0.8876 | 0.8892 | 0.8893 |
93
- | No log | 8.0 | 149 | 0.3244 | 0.8758 | 0.8770 | 0.8787 | 0.8758 |
94
- | No log | 8.9664 | 167 | 0.2962 | 0.9027 | 0.9018 | 0.9012 | 0.9027 |
95
- | No log | 9.9866 | 186 | 0.2971 | 0.9027 | 0.9010 | 0.9014 | 0.9027 |
96
- | No log | 10.9530 | 204 | 0.2974 | 0.9094 | 0.9082 | 0.9077 | 0.9094 |
97
- | No log | 11.5973 | 216 | 0.2976 | 0.9027 | 0.9018 | 0.9011 | 0.9027 |
98
 
99
 
100
  ### Framework versions
 
8
  - audiofolder
9
  metrics:
10
  - accuracy
 
 
 
11
  model-index:
12
  - name: distilhubert-finetuned-mixed-data
13
  results:
 
23
  metrics:
24
  - name: Accuracy
25
  type: accuracy
26
+ value: 0.7919463087248322
 
 
 
 
 
 
 
 
 
27
  ---
28
 
29
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 
33
 
34
  This model is a fine-tuned version of [ntu-spml/distilhubert](https://huggingface.co/ntu-spml/distilhubert) on the audiofolder dataset.
35
  It achieves the following results on the evaluation set:
36
+ - Loss: 0.4952
37
+ - Accuracy: 0.7919
 
 
 
38
 
39
  ## Model description
40
 
 
62
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
63
  - lr_scheduler_type: cosine
64
  - lr_scheduler_warmup_ratio: 0.001
65
+ - num_epochs: 3
66
 
67
  ### Training results
68
 
69
+ | Training Loss | Epoch | Step | Validation Loss | Accuracy |
70
+ |:-------------:|:------:|:----:|:---------------:|:--------:|
71
+ | No log | 0.9664 | 18 | 0.7078 | 0.7584 |
72
+ | No log | 1.9866 | 37 | 0.5109 | 0.7852 |
73
+ | No log | 2.8993 | 54 | 0.4952 | 0.7919 |
 
 
 
 
 
 
 
 
 
74
 
75
 
76
  ### Framework versions
checkpoint-111/model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:8afb68ff2611e3603dee528e572e6fc36c40e47cec34c6ee683636922c8055e1
3
  size 94765560
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d6cfaf117c65cf6ad96ea90019d829932297275d9e389ebcc4ceeb3e2d099f17
3
  size 94765560
checkpoint-111/optimizer.pt CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:105e2897a04f3a74cea8691093cd964cb3e2476cb37888e94bd97ac020d7bc23
3
  size 189556666
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:50c80d46c6ed45a1190e000a005d9359cffdc2f8b2d211aab96d7fddd708c357
3
  size 189556666
checkpoint-111/scheduler.pt CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:fd5bf82e806804b25d214305b99f2c178bc6f19a61f077621eaca5b3cb5523cd
3
  size 1064
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e2dc9ec36b920afe942421e68f88dd5c12762bff976ab7c419a2e088a9640109
3
  size 1064
checkpoint-111/trainer_state.json CHANGED
@@ -1,5 +1,5 @@
1
  {
2
- "best_metric": 0.8825503355704698,
3
  "best_model_checkpoint": "distilhubert-finetuned-mixed-data/checkpoint-111",
4
  "epoch": 5.959731543624161,
5
  "eval_steps": 500,
@@ -10,81 +10,63 @@
10
  "log_history": [
11
  {
12
  "epoch": 0.9664429530201343,
13
- "eval_accuracy": 0.7818791946308725,
14
- "eval_f1": 0.7264205130236912,
15
- "eval_loss": 0.669560968875885,
16
- "eval_precision": 0.689807639599501,
17
- "eval_recall": 0.7818791946308725,
18
- "eval_runtime": 0.9033,
19
- "eval_samples_per_second": 329.896,
20
- "eval_steps_per_second": 42.067,
21
  "step": 18
22
  },
23
  {
24
  "epoch": 1.9865771812080537,
25
- "eval_accuracy": 0.7751677852348994,
26
- "eval_f1": 0.7202681570933687,
27
- "eval_loss": 0.5067932605743408,
28
- "eval_precision": 0.684911313518696,
29
- "eval_recall": 0.7751677852348994,
30
- "eval_runtime": 0.907,
31
- "eval_samples_per_second": 328.546,
32
- "eval_steps_per_second": 41.895,
33
  "step": 37
34
  },
35
  {
36
  "epoch": 2.953020134228188,
37
- "eval_accuracy": 0.8087248322147651,
38
- "eval_f1": 0.7535236037076262,
39
- "eval_loss": 0.43038079142570496,
40
- "eval_precision": 0.7241626365959,
41
- "eval_recall": 0.8087248322147651,
42
- "eval_runtime": 0.8664,
43
- "eval_samples_per_second": 343.963,
44
- "eval_steps_per_second": 43.861,
45
  "step": 55
46
  },
47
  {
48
  "epoch": 3.9731543624161074,
49
- "eval_accuracy": 0.8523489932885906,
50
- "eval_f1": 0.8433916249277822,
51
- "eval_loss": 0.4109182059764862,
52
- "eval_precision": 0.8727817866814688,
53
- "eval_recall": 0.8523489932885906,
54
- "eval_runtime": 0.8712,
55
- "eval_samples_per_second": 342.059,
56
- "eval_steps_per_second": 43.618,
57
  "step": 74
58
  },
59
  {
60
  "epoch": 4.993288590604027,
61
- "eval_accuracy": 0.87248322147651,
62
- "eval_f1": 0.8717711524765707,
63
- "eval_loss": 0.3263051509857178,
64
- "eval_precision": 0.8718521382399975,
65
- "eval_recall": 0.87248322147651,
66
- "eval_runtime": 0.87,
67
- "eval_samples_per_second": 342.548,
68
- "eval_steps_per_second": 43.681,
69
  "step": 93
70
  },
71
  {
72
  "epoch": 5.959731543624161,
73
- "eval_accuracy": 0.8825503355704698,
74
- "eval_f1": 0.8824400125399595,
75
- "eval_loss": 0.3035907447338104,
76
- "eval_precision": 0.8824270850226767,
77
- "eval_recall": 0.8825503355704698,
78
- "eval_runtime": 0.8921,
79
- "eval_samples_per_second": 334.055,
80
- "eval_steps_per_second": 42.598,
81
  "step": 111
82
  }
83
  ],
84
  "logging_steps": 500,
85
- "max_steps": 216,
86
  "num_input_tokens_seen": 0,
87
- "num_train_epochs": 12,
88
  "save_steps": 500,
89
  "stateful_callbacks": {
90
  "EarlyStoppingCallback": {
 
1
  {
2
+ "best_metric": 0.8657718120805369,
3
  "best_model_checkpoint": "distilhubert-finetuned-mixed-data/checkpoint-111",
4
  "epoch": 5.959731543624161,
5
  "eval_steps": 500,
 
10
  "log_history": [
11
  {
12
  "epoch": 0.9664429530201343,
13
+ "eval_accuracy": 0.7583892617449665,
14
+ "eval_loss": 0.686046838760376,
15
+ "eval_runtime": 3.2719,
16
+ "eval_samples_per_second": 91.079,
17
+ "eval_steps_per_second": 11.614,
 
 
 
18
  "step": 18
19
  },
20
  {
21
  "epoch": 1.9865771812080537,
22
+ "eval_accuracy": 0.802013422818792,
23
+ "eval_loss": 0.46226799488067627,
24
+ "eval_runtime": 3.3286,
25
+ "eval_samples_per_second": 89.527,
26
+ "eval_steps_per_second": 11.416,
 
 
 
27
  "step": 37
28
  },
29
  {
30
  "epoch": 2.953020134228188,
31
+ "eval_accuracy": 0.8187919463087249,
32
+ "eval_loss": 0.4068666100502014,
33
+ "eval_runtime": 3.2087,
34
+ "eval_samples_per_second": 92.871,
35
+ "eval_steps_per_second": 11.843,
 
 
 
36
  "step": 55
37
  },
38
  {
39
  "epoch": 3.9731543624161074,
40
+ "eval_accuracy": 0.8355704697986577,
41
+ "eval_loss": 0.3811332583427429,
42
+ "eval_runtime": 3.2325,
43
+ "eval_samples_per_second": 92.188,
44
+ "eval_steps_per_second": 11.755,
 
 
 
45
  "step": 74
46
  },
47
  {
48
  "epoch": 4.993288590604027,
49
+ "eval_accuracy": 0.8355704697986577,
50
+ "eval_loss": 0.3542439937591553,
51
+ "eval_runtime": 3.2746,
52
+ "eval_samples_per_second": 91.003,
53
+ "eval_steps_per_second": 11.604,
 
 
 
54
  "step": 93
55
  },
56
  {
57
  "epoch": 5.959731543624161,
58
+ "eval_accuracy": 0.8657718120805369,
59
+ "eval_loss": 0.33795884251594543,
60
+ "eval_runtime": 3.2548,
61
+ "eval_samples_per_second": 91.556,
62
+ "eval_steps_per_second": 11.675,
 
 
 
63
  "step": 111
64
  }
65
  ],
66
  "logging_steps": 500,
67
+ "max_steps": 126,
68
  "num_input_tokens_seen": 0,
69
+ "num_train_epochs": 7,
70
  "save_steps": 500,
71
  "stateful_callbacks": {
72
  "EarlyStoppingCallback": {
checkpoint-111/training_args.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:296be9afae72ab3934d873f0cf92f87ef76899c18b11651de670afb49aa1a5d6
3
  size 5240
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4320ed7eb3857f3356f3c0fd71b66d450b29bc6f61001ac820f978865e977454
3
  size 5240
checkpoint-126/config.json ADDED
@@ -0,0 +1,84 @@
1
+ {
2
+ "_name_or_path": "ntu-spml/distilhubert",
3
+ "activation_dropout": 0.1,
4
+ "apply_spec_augment": false,
5
+ "architectures": [
6
+ "HubertForSequenceClassification"
7
+ ],
8
+ "attention_dropout": 0.1,
9
+ "bos_token_id": 1,
10
+ "classifier_proj_size": 256,
11
+ "conv_bias": false,
12
+ "conv_dim": [
13
+ 512,
14
+ 512,
15
+ 512,
16
+ 512,
17
+ 512,
18
+ 512,
19
+ 512
20
+ ],
21
+ "conv_kernel": [
22
+ 10,
23
+ 3,
24
+ 3,
25
+ 3,
26
+ 3,
27
+ 2,
28
+ 2
29
+ ],
30
+ "conv_stride": [
31
+ 5,
32
+ 2,
33
+ 2,
34
+ 2,
35
+ 2,
36
+ 2,
37
+ 2
38
+ ],
39
+ "ctc_loss_reduction": "sum",
40
+ "ctc_zero_infinity": false,
41
+ "do_stable_layer_norm": false,
42
+ "eos_token_id": 2,
43
+ "feat_extract_activation": "gelu",
44
+ "feat_extract_norm": "group",
45
+ "feat_proj_dropout": 0.0,
46
+ "feat_proj_layer_norm": false,
47
+ "final_dropout": 0.0,
48
+ "hidden_act": "gelu",
49
+ "hidden_dropout": 0.1,
50
+ "hidden_size": 768,
51
+ "id2label": {
52
+ "0": "1s_asphyxia",
53
+ "1": "1s_hunger",
54
+ "2": "1s_normal",
55
+ "3": "1s_pain"
56
+ },
57
+ "initializer_range": 0.02,
58
+ "intermediate_size": 3072,
59
+ "label2id": {
60
+ "1s_asphyxia": "0",
61
+ "1s_hunger": "1",
62
+ "1s_normal": "2",
63
+ "1s_pain": "3"
64
+ },
65
+ "layer_norm_eps": 1e-05,
66
+ "layerdrop": 0.0,
67
+ "mask_feature_length": 10,
68
+ "mask_feature_min_masks": 0,
69
+ "mask_feature_prob": 0.0,
70
+ "mask_time_length": 10,
71
+ "mask_time_min_masks": 2,
72
+ "mask_time_prob": 0.05,
73
+ "model_type": "hubert",
74
+ "num_attention_heads": 12,
75
+ "num_conv_pos_embedding_groups": 16,
76
+ "num_conv_pos_embeddings": 128,
77
+ "num_feat_extract_layers": 7,
78
+ "num_hidden_layers": 2,
79
+ "pad_token_id": 0,
80
+ "torch_dtype": "float32",
81
+ "transformers_version": "4.44.2",
82
+ "use_weighted_layer_sum": false,
83
+ "vocab_size": 32
84
+ }
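The label maps in this config can be used to turn a class index into a class name. Note that `label2id` stores its values as strings (`"0"`, `"1"`, ...), so a consumer will probably want to cast them to integers. The snippet below is a minimal sketch using only the standard library; the helper names and the example scores are hypothetical and are not part of this commit.

```python
import json

# Label maps copied from the checkpoint-126/config.json diff above.
CONFIG_FRAGMENT = """
{
  "id2label": {"0": "1s_asphyxia", "1": "1s_hunger", "2": "1s_normal", "3": "1s_pain"},
  "label2id": {"1s_asphyxia": "0", "1s_hunger": "1", "1s_normal": "2", "1s_pain": "3"}
}
"""


def load_label_maps(config_text):
    """Return (id2label, label2id) with integer class ids on both sides."""
    cfg = json.loads(config_text)
    id2label = {int(k): v for k, v in cfg["id2label"].items()}
    label2id = {k: int(v) for k, v in cfg["label2id"].items()}
    return id2label, label2id


def predict_label(scores, id2label):
    """Pick the class name whose score is highest (a plain argmax)."""
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return id2label[best_index]


id2label, label2id = load_label_maps(CONFIG_FRAGMENT)

# Hypothetical per-class scores for one audio clip, not real model output.
example_scores = [0.1, 2.3, -0.4, 0.9]
print(predict_label(example_scores, id2label))
```

Casting the ids once at load time keeps the two maps consistent with each other, so `label2id[id2label[i]] == i` holds for every class.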
checkpoint-126/model.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4510ed7e13f4bc4dd001eb3a533578bbc919b6f595428ad96456f352af1cc8e6
3
+ size 94765560
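The binary files in this commit appear as Git LFS pointer files: three `key value` lines giving the spec version, the content hash, and the size in bytes. The following is a small sketch of reading such a pointer; `parse_lfs_pointer` is a hypothetical helper written for illustration, not a function from `huggingface_hub` or Git LFS.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a small dict."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer copied from checkpoint-126/model.safetensors above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:4510ed7e13f4bc4dd001eb3a533578bbc919b6f595428ad96456f352af1cc8e6
size 94765560
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["size"])
```

Because only the `oid` line changes for most files in this commit while `size` stays the same, comparing parsed pointers is a quick way to confirm that a file's contents changed without downloading it.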
checkpoint-126/optimizer.pt ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a5d9819ed97d940763643a3cfd6dc3ffac7621d0614842f5a7e5119aa1adc61d
3
+ size 189556666
checkpoint-126/rng_state.pth ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2756daa1d15b38a73c17f51c8dd3dc3188afaca774220382259e764150eca057
3
+ size 14308
checkpoint-126/scheduler.pt ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1e1524b5fb3e7ae768641537cfd8745bb359ffd8b97cac004ef74c63f2b5b06c
3
+ size 1064
checkpoint-126/trainer_state.json ADDED
@@ -0,0 +1,105 @@
1
+ {
2
+ "best_metric": 0.8691275167785235,
3
+ "best_model_checkpoint": "distilhubert-finetuned-mixed-data/checkpoint-126",
4
+ "epoch": 6.76510067114094,
5
+ "eval_steps": 500,
6
+ "global_step": 126,
7
+ "is_hyper_param_search": false,
8
+ "is_local_process_zero": true,
9
+ "is_world_process_zero": true,
10
+ "log_history": [
11
+ {
12
+ "epoch": 0.9664429530201343,
13
+ "eval_accuracy": 0.7583892617449665,
14
+ "eval_loss": 0.686046838760376,
15
+ "eval_runtime": 3.2719,
16
+ "eval_samples_per_second": 91.079,
17
+ "eval_steps_per_second": 11.614,
18
+ "step": 18
19
+ },
20
+ {
21
+ "epoch": 1.9865771812080537,
22
+ "eval_accuracy": 0.802013422818792,
23
+ "eval_loss": 0.46226799488067627,
24
+ "eval_runtime": 3.3286,
25
+ "eval_samples_per_second": 89.527,
26
+ "eval_steps_per_second": 11.416,
27
+ "step": 37
28
+ },
29
+ {
30
+ "epoch": 2.953020134228188,
31
+ "eval_accuracy": 0.8187919463087249,
32
+ "eval_loss": 0.4068666100502014,
33
+ "eval_runtime": 3.2087,
34
+ "eval_samples_per_second": 92.871,
35
+ "eval_steps_per_second": 11.843,
36
+ "step": 55
37
+ },
38
+ {
39
+ "epoch": 3.9731543624161074,
40
+ "eval_accuracy": 0.8355704697986577,
41
+ "eval_loss": 0.3811332583427429,
42
+ "eval_runtime": 3.2325,
43
+ "eval_samples_per_second": 92.188,
44
+ "eval_steps_per_second": 11.755,
45
+ "step": 74
46
+ },
47
+ {
48
+ "epoch": 4.993288590604027,
49
+ "eval_accuracy": 0.8355704697986577,
50
+ "eval_loss": 0.3542439937591553,
51
+ "eval_runtime": 3.2746,
52
+ "eval_samples_per_second": 91.003,
53
+ "eval_steps_per_second": 11.604,
54
+ "step": 93
55
+ },
56
+ {
57
+ "epoch": 5.959731543624161,
58
+ "eval_accuracy": 0.8657718120805369,
59
+ "eval_loss": 0.33795884251594543,
60
+ "eval_runtime": 3.2548,
61
+ "eval_samples_per_second": 91.556,
62
+ "eval_steps_per_second": 11.675,
63
+ "step": 111
64
+ },
65
+ {
66
+ "epoch": 6.76510067114094,
67
+ "eval_accuracy": 0.8691275167785235,
68
+ "eval_loss": 0.33603718876838684,
69
+ "eval_runtime": 3.1993,
70
+ "eval_samples_per_second": 93.145,
71
+ "eval_steps_per_second": 11.877,
72
+ "step": 126
73
+ }
74
+ ],
75
+ "logging_steps": 500,
76
+ "max_steps": 126,
77
+ "num_input_tokens_seen": 0,
78
+ "num_train_epochs": 7,
79
+ "save_steps": 500,
80
+ "stateful_callbacks": {
81
+ "EarlyStoppingCallback": {
82
+ "args": {
83
+ "early_stopping_patience": 3,
84
+ "early_stopping_threshold": 0.0
85
+ },
86
+ "attributes": {
87
+ "early_stopping_patience_counter": 0
88
+ }
89
+ },
90
+ "TrainerControl": {
91
+ "args": {
92
+ "should_epoch_stop": false,
93
+ "should_evaluate": false,
94
+ "should_log": false,
95
+ "should_save": true,
96
+ "should_training_stop": true
97
+ },
98
+ "attributes": {}
99
+ }
100
+ },
101
+ "total_flos": 1.831207226112e+16,
102
+ "train_batch_size": 8,
103
+ "trial_name": null,
104
+ "trial_params": null
105
+ }
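The state above can be sanity-checked by replaying its evaluation history against the recorded early-stopping settings (patience 3, threshold 0.0). This is a simplified sketch of the idea, assuming an evaluation counts as an improvement only when accuracy strictly exceeds the best seen so far by more than the threshold; it is not the `EarlyStoppingCallback` implementation from `transformers`, and `replay_early_stopping` is a hypothetical name.

```python
# (step, eval_accuracy) pairs copied from the log_history above.
EVALS = [
    (18, 0.7583892617449665),
    (37, 0.802013422818792),
    (55, 0.8187919463087249),
    (74, 0.8355704697986577),
    (93, 0.8355704697986577),
    (111, 0.8657718120805369),
    (126, 0.8691275167785235),
]
PATIENCE = 3      # "early_stopping_patience" in the state file
THRESHOLD = 0.0   # "early_stopping_threshold" in the state file


def replay_early_stopping(evals, patience, threshold):
    """Return (best_step, best_value, counter) after replaying the evals."""
    best_step, best_value, counter = None, None, 0
    for step, value in evals:
        improved = best_value is None or (
            value > best_value and abs(value - best_value) > threshold
        )
        if improved:
            best_step, best_value, counter = step, value, 0
        else:
            counter += 1
        if counter >= patience:
            break  # training would stop here
    return best_step, best_value, counter


best_step, best_value, counter = replay_early_stopping(EVALS, PATIENCE, THRESHOLD)
print(best_step, best_value, counter)
```

Under these assumptions the replay ends with step 126 as the best checkpoint and a patience counter of 0, which agrees with the `best_metric`, `best_model_checkpoint`, and `early_stopping_patience_counter` values recorded in the file. The tie at step 93 briefly raises the counter to 1 before step 111 resets it.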
checkpoint-126/training_args.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4320ed7eb3857f3356f3c0fd71b66d450b29bc6f61001ac820f978865e977454
3
+ size 5240
checkpoint-18/model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:64f51b8591199469762fae24f74ba430f094c3690d9c48c1cd603b5de70546cc
3
  size 94765560
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:fedc5342e503764722ceaa1636b7c8498a2aa3bdaaef986214edb574f615af9b
3
  size 94765560
checkpoint-18/optimizer.pt CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:3e666f8131b7924c8e7c848969c0e2bd68f8876fd6d4f4be6390692b1d8660d2
3
  size 189556666
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:cc2385515662f218746793346cf7c0642966c0a2fc816dba4bfca3f4f9045571
3
  size 189556666
checkpoint-18/scheduler.pt CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:5c3bfa84a5a5584f0c303cb2a5ebd02ca6effcf09b8f28364ebe555fedff3ef2
3
  size 1064
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:79e952bae3d444fea9fb53c21720e163f1659ebc933422f10b6da73663ab7443
3
  size 1064
checkpoint-18/trainer_state.json CHANGED
@@ -1,5 +1,5 @@
1
  {
2
- "best_metric": 0.7818791946308725,
3
  "best_model_checkpoint": "distilhubert-finetuned-mixed-data/checkpoint-18",
4
  "epoch": 0.9664429530201343,
5
  "eval_steps": 500,
@@ -10,21 +10,18 @@
10
  "log_history": [
11
  {
12
  "epoch": 0.9664429530201343,
13
- "eval_accuracy": 0.7818791946308725,
14
- "eval_f1": 0.7264205130236912,
15
- "eval_loss": 0.669560968875885,
16
- "eval_precision": 0.689807639599501,
17
- "eval_recall": 0.7818791946308725,
18
- "eval_runtime": 0.9033,
19
- "eval_samples_per_second": 329.896,
20
- "eval_steps_per_second": 42.067,
21
  "step": 18
22
  }
23
  ],
24
  "logging_steps": 500,
25
- "max_steps": 216,
26
  "num_input_tokens_seen": 0,
27
- "num_train_epochs": 12,
28
  "save_steps": 500,
29
  "stateful_callbacks": {
30
  "EarlyStoppingCallback": {
 
1
  {
2
+ "best_metric": 0.7583892617449665,
3
  "best_model_checkpoint": "distilhubert-finetuned-mixed-data/checkpoint-18",
4
  "epoch": 0.9664429530201343,
5
  "eval_steps": 500,
 
10
  "log_history": [
11
  {
12
  "epoch": 0.9664429530201343,
13
+ "eval_accuracy": 0.7583892617449665,
14
+ "eval_loss": 0.7078245878219604,
15
+ "eval_runtime": 3.2791,
16
+ "eval_samples_per_second": 90.879,
17
+ "eval_steps_per_second": 11.589,
 
 
 
18
  "step": 18
19
  }
20
  ],
21
  "logging_steps": 500,
22
+ "max_steps": 54,
23
  "num_input_tokens_seen": 0,
24
+ "num_train_epochs": 3,
25
  "save_steps": 500,
26
  "stateful_callbacks": {
27
  "EarlyStoppingCallback": {
checkpoint-18/training_args.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:296be9afae72ab3934d873f0cf92f87ef76899c18b11651de670afb49aa1a5d6
3
  size 5240
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3b74ef34b0c98ff8e2f446712958b67f0e9dae2c892b2706cb653c6c4a3cba29
3
  size 5240
checkpoint-37/model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:bd1228d12cca62942f22b7cd2a53221799d4a8f6f0c65cecd82e15483991b3bc
3
  size 94765560
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0cb23af36fc9e2835fb254591859fcc2a29bed34d849c9632574cf4e143c2ffe
3
  size 94765560
checkpoint-37/optimizer.pt CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:770512ff23ef603be29f7cf1b3dd36bbdc3cd34ae0b9d588aa475cdae922d2c0
3
  size 189556666
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7b3ee13fed17092efc92242544dbde28f36e42511a859596375a636a90508461
3
  size 189556666
checkpoint-37/scheduler.pt CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:f744fe2b4201b36e575e55f1dd02cb6625ec800d3b29b1053597233d2f6239fa
3
  size 1064
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d61595ea9a3653f16ffab19b4744db7fac53b6e9dd3b6b622460e0cb5901f7cd
3
  size 1064
checkpoint-37/trainer_state.json CHANGED
@@ -1,6 +1,6 @@
1
  {
2
- "best_metric": 0.7818791946308725,
3
- "best_model_checkpoint": "distilhubert-finetuned-mixed-data/checkpoint-18",
4
  "epoch": 1.9865771812080537,
5
  "eval_steps": 500,
6
  "global_step": 37,
@@ -10,33 +10,27 @@
10
  "log_history": [
11
  {
12
  "epoch": 0.9664429530201343,
13
- "eval_accuracy": 0.7818791946308725,
14
- "eval_f1": 0.7264205130236912,
15
- "eval_loss": 0.669560968875885,
16
- "eval_precision": 0.689807639599501,
17
- "eval_recall": 0.7818791946308725,
18
- "eval_runtime": 0.9033,
19
- "eval_samples_per_second": 329.896,
20
- "eval_steps_per_second": 42.067,
21
  "step": 18
22
  },
23
  {
24
  "epoch": 1.9865771812080537,
25
- "eval_accuracy": 0.7751677852348994,
26
- "eval_f1": 0.7202681570933687,
27
- "eval_loss": 0.5067932605743408,
28
- "eval_precision": 0.684911313518696,
29
- "eval_recall": 0.7751677852348994,
30
- "eval_runtime": 0.907,
31
- "eval_samples_per_second": 328.546,
32
- "eval_steps_per_second": 41.895,
33
  "step": 37
34
  }
35
  ],
36
  "logging_steps": 500,
37
- "max_steps": 216,
38
  "num_input_tokens_seen": 0,
39
- "num_train_epochs": 12,
40
  "save_steps": 500,
41
  "stateful_callbacks": {
42
  "EarlyStoppingCallback": {
 
1
  {
2
+ "best_metric": 0.785234899328859,
3
+ "best_model_checkpoint": "distilhubert-finetuned-mixed-data/checkpoint-37",
4
  "epoch": 1.9865771812080537,
5
  "eval_steps": 500,
6
  "global_step": 37,
 
10
  "log_history": [
11
  {
12
  "epoch": 0.9664429530201343,
13
+ "eval_accuracy": 0.7583892617449665,
14
+ "eval_loss": 0.7078245878219604,
15
+ "eval_runtime": 3.2791,
16
+ "eval_samples_per_second": 90.879,
17
+ "eval_steps_per_second": 11.589,
 
 
 
18
  "step": 18
19
  },
20
  {
21
  "epoch": 1.9865771812080537,
22
+ "eval_accuracy": 0.785234899328859,
23
+ "eval_loss": 0.5109438300132751,
24
+ "eval_runtime": 3.2689,
25
+ "eval_samples_per_second": 91.162,
26
+ "eval_steps_per_second": 11.625,
 
 
 
27
  "step": 37
28
  }
29
  ],
30
  "logging_steps": 500,
31
+ "max_steps": 54,
32
  "num_input_tokens_seen": 0,
33
+ "num_train_epochs": 3,
34
  "save_steps": 500,
35
  "stateful_callbacks": {
36
  "EarlyStoppingCallback": {
checkpoint-37/training_args.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:296be9afae72ab3934d873f0cf92f87ef76899c18b11651de670afb49aa1a5d6
3
  size 5240
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3b74ef34b0c98ff8e2f446712958b67f0e9dae2c892b2706cb653c6c4a3cba29
3
  size 5240
checkpoint-54/model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:ddc06b1a02837e21e95387003f1e736a75ada2cc311f92602719c1edbbc04f50
3
  size 94765560
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5aa09bf08a82037b1a7e97fc26fb24670bc804169b2866e90999fb465178164d
3
  size 94765560
checkpoint-54/optimizer.pt CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:21244447ac8c1fa88b7aeba15d59e7279b15722946d0143bcbee60e3d2bf3ce7
3
  size 189556666
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:753d093b226b6f0025d58bd26e5f213b98a28689a979e9ea31621429593e3533
3
  size 189556666
checkpoint-54/rng_state.pth CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:9a2fce6e9deba3361ccb9abfc78de8b2f74d3006b3dab904337047451f90d0ba
3
- size 14244
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c9fc2ffa6937057fd69aff15425ea0616520fc92b279414eaf3b97409628bc19
3
+ size 14308
checkpoint-54/trainer_state.json CHANGED
@@ -1,5 +1,5 @@
1
  {
2
- "best_metric": 0.802013422818792,
3
  "best_model_checkpoint": "distilhubert-finetuned-mixed-data/checkpoint-54",
4
  "epoch": 2.899328859060403,
5
  "eval_steps": 500,
@@ -10,29 +10,29 @@
10
  "log_history": [
11
  {
12
  "epoch": 0.9664429530201343,
13
- "eval_accuracy": 0.7684563758389261,
14
- "eval_loss": 0.7515301704406738,
15
- "eval_runtime": 0.7411,
16
- "eval_samples_per_second": 402.118,
17
- "eval_steps_per_second": 51.277,
18
  "step": 18
19
  },
20
  {
21
  "epoch": 1.9865771812080537,
22
- "eval_accuracy": 0.7953020134228188,
23
- "eval_loss": 0.5268774628639221,
24
- "eval_runtime": 0.7559,
25
- "eval_samples_per_second": 394.247,
26
- "eval_steps_per_second": 50.273,
27
  "step": 37
28
  },
29
  {
30
  "epoch": 2.899328859060403,
31
- "eval_accuracy": 0.802013422818792,
32
- "eval_loss": 0.4906807839870453,
33
- "eval_runtime": 0.739,
34
- "eval_samples_per_second": 403.255,
35
- "eval_steps_per_second": 51.422,
36
  "step": 54
37
  }
38
  ],
@@ -42,6 +42,15 @@
42
  "num_train_epochs": 3,
43
  "save_steps": 500,
44
  "stateful_callbacks": {
 
 
 
 
 
 
 
 
 
45
  "TrainerControl": {
46
  "args": {
47
  "should_epoch_stop": false,
 
1
  {
2
+ "best_metric": 0.7919463087248322,
3
  "best_model_checkpoint": "distilhubert-finetuned-mixed-data/checkpoint-54",
4
  "epoch": 2.899328859060403,
5
  "eval_steps": 500,
 
10
  "log_history": [
11
  {
12
  "epoch": 0.9664429530201343,
13
+ "eval_accuracy": 0.7583892617449665,
14
+ "eval_loss": 0.7078245878219604,
15
+ "eval_runtime": 3.2791,
16
+ "eval_samples_per_second": 90.879,
17
+ "eval_steps_per_second": 11.589,
18
  "step": 18
19
  },
20
  {
21
  "epoch": 1.9865771812080537,
22
+ "eval_accuracy": 0.785234899328859,
23
+ "eval_loss": 0.5109438300132751,
24
+ "eval_runtime": 3.2689,
25
+ "eval_samples_per_second": 91.162,
26
+ "eval_steps_per_second": 11.625,
27
  "step": 37
28
  },
29
  {
30
  "epoch": 2.899328859060403,
31
+ "eval_accuracy": 0.7919463087248322,
32
+ "eval_loss": 0.4952092468738556,
33
+ "eval_runtime": 3.2798,
34
+ "eval_samples_per_second": 90.86,
35
+ "eval_steps_per_second": 11.586,
36
  "step": 54
37
  }
38
  ],
 
42
  "num_train_epochs": 3,
43
  "save_steps": 500,
44
  "stateful_callbacks": {
45
+ "EarlyStoppingCallback": {
46
+ "args": {
47
+ "early_stopping_patience": 3,
48
+ "early_stopping_threshold": 0.0
49
+ },
50
+ "attributes": {
51
+ "early_stopping_patience_counter": 0
52
+ }
53
+ },
54
  "TrainerControl": {
55
  "args": {
56
  "should_epoch_stop": false,
checkpoint-54/training_args.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:1c7f4e93a08117554edcac2e7ce68e97841c557e813806f1446c2ab82115baa2
3
  size 5240
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3b74ef34b0c98ff8e2f446712958b67f0e9dae2c892b2706cb653c6c4a3cba29
3
  size 5240
checkpoint-55/model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:c618f9432ed808a829c6b9322c2de70a050c8d68461263263a1ef25ba955522c
3
  size 94765560
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6a56d023e24e3488758e046ad64c3d057691e3e8253a5b6f0139dfed87b15032
3
  size 94765560
checkpoint-55/optimizer.pt CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:eda417e68dbd70d3239f0ee68d35d7c4c4ee28b662a007739b7af7e3358a7b8b
3
  size 189556666
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6832cd2547bddccb9a7ffc2cd19f963924a3e7ae37e344b0759b7f9310c16429
3
  size 189556666
checkpoint-55/scheduler.pt CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:b619a76e273bca0b5b749883b1e3347edb727e6888b4090828a33e2f2c1f4fae
3
  size 1064
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d4af5c5db34db2bd723c5a003a705e44ba7bbc4acb13f2908560574feaabe15f
3
  size 1064
checkpoint-55/trainer_state.json CHANGED
@@ -1,5 +1,5 @@
1
  {
2
- "best_metric": 0.8087248322147651,
+ "best_metric": 0.8187919463087249,
  "best_model_checkpoint": "distilhubert-finetuned-mixed-data/checkpoint-55",
  "epoch": 2.953020134228188,
  "eval_steps": 500,
@@ -10,45 +10,36 @@
  "log_history": [
  {
  "epoch": 0.9664429530201343,
- "eval_accuracy": 0.7818791946308725,
- "eval_f1": 0.7264205130236912,
- "eval_loss": 0.669560968875885,
- "eval_precision": 0.689807639599501,
- "eval_recall": 0.7818791946308725,
- "eval_runtime": 0.9033,
- "eval_samples_per_second": 329.896,
- "eval_steps_per_second": 42.067,
+ "eval_accuracy": 0.7583892617449665,
+ "eval_loss": 0.686046838760376,
+ "eval_runtime": 3.2719,
+ "eval_samples_per_second": 91.079,
+ "eval_steps_per_second": 11.614,
  "step": 18
  },
  {
  "epoch": 1.9865771812080537,
- "eval_accuracy": 0.7751677852348994,
- "eval_f1": 0.7202681570933687,
- "eval_loss": 0.5067932605743408,
- "eval_precision": 0.684911313518696,
- "eval_recall": 0.7751677852348994,
- "eval_runtime": 0.907,
- "eval_samples_per_second": 328.546,
- "eval_steps_per_second": 41.895,
+ "eval_accuracy": 0.802013422818792,
+ "eval_loss": 0.46226799488067627,
+ "eval_runtime": 3.3286,
+ "eval_samples_per_second": 89.527,
+ "eval_steps_per_second": 11.416,
  "step": 37
  },
  {
  "epoch": 2.953020134228188,
- "eval_accuracy": 0.8087248322147651,
- "eval_f1": 0.7535236037076262,
- "eval_loss": 0.43038079142570496,
- "eval_precision": 0.7241626365959,
- "eval_recall": 0.8087248322147651,
- "eval_runtime": 0.8664,
- "eval_samples_per_second": 343.963,
- "eval_steps_per_second": 43.861,
+ "eval_accuracy": 0.8187919463087249,
+ "eval_loss": 0.4068666100502014,
+ "eval_runtime": 3.2087,
+ "eval_samples_per_second": 92.871,
+ "eval_steps_per_second": 11.843,
  "step": 55
  }
  ],
  "logging_steps": 500,
- "max_steps": 216,
+ "max_steps": 126,
  "num_input_tokens_seen": 0,
- "num_train_epochs": 12,
+ "num_train_epochs": 7,
  "save_steps": 500,
  "stateful_callbacks": {
  "EarlyStoppingCallback": {
checkpoint-55/training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:296be9afae72ab3934d873f0cf92f87ef76899c18b11651de670afb49aa1a5d6
+ oid sha256:4320ed7eb3857f3356f3c0fd71b66d450b29bc6f61001ac820f978865e977454
  size 5240
checkpoint-74/model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a65df2df2e438591f2a6638db08aa114327846410ac2c5aa11c45b4447cc6349
+ oid sha256:0a5d3bc1abf730eda54b1adbdc9aa43335bea7850d95611eeed903f77b34a044
  size 94765560
checkpoint-74/optimizer.pt CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a56095d9e4fb10f1821c0613eb1e41b10331d6c0ba1a8e9c13f26b97709b275c
+ oid sha256:2ae95457eed519491999c6039260cf9d6720ed842d5964a72039871a9c30af7e
  size 189556666
checkpoint-74/scheduler.pt CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:33230a8f5e96901b9c18a66475024a2d6a7717cc358ba599df0c422ae9370052
+ oid sha256:e394f0e7c3dcb107b5480b00977436132ee632367118f6b993703acf51b4f795
  size 1064
checkpoint-74/trainer_state.json CHANGED
@@ -1,5 +1,5 @@
  {
- "best_metric": 0.8523489932885906,
+ "best_metric": 0.8355704697986577,
  "best_model_checkpoint": "distilhubert-finetuned-mixed-data/checkpoint-74",
  "epoch": 3.9731543624161074,
  "eval_steps": 500,
@@ -10,57 +10,45 @@
  "log_history": [
  {
  "epoch": 0.9664429530201343,
- "eval_accuracy": 0.7818791946308725,
- "eval_f1": 0.7264205130236912,
- "eval_loss": 0.669560968875885,
- "eval_precision": 0.689807639599501,
- "eval_recall": 0.7818791946308725,
- "eval_runtime": 0.9033,
- "eval_samples_per_second": 329.896,
- "eval_steps_per_second": 42.067,
+ "eval_accuracy": 0.7583892617449665,
+ "eval_loss": 0.686046838760376,
+ "eval_runtime": 3.2719,
+ "eval_samples_per_second": 91.079,
+ "eval_steps_per_second": 11.614,
  "step": 18
  },
  {
  "epoch": 1.9865771812080537,
- "eval_accuracy": 0.7751677852348994,
- "eval_f1": 0.7202681570933687,
- "eval_loss": 0.5067932605743408,
- "eval_precision": 0.684911313518696,
- "eval_recall": 0.7751677852348994,
- "eval_runtime": 0.907,
- "eval_samples_per_second": 328.546,
- "eval_steps_per_second": 41.895,
+ "eval_accuracy": 0.802013422818792,
+ "eval_loss": 0.46226799488067627,
+ "eval_runtime": 3.3286,
+ "eval_samples_per_second": 89.527,
+ "eval_steps_per_second": 11.416,
  "step": 37
  },
  {
  "epoch": 2.953020134228188,
- "eval_accuracy": 0.8087248322147651,
- "eval_f1": 0.7535236037076262,
- "eval_loss": 0.43038079142570496,
- "eval_precision": 0.7241626365959,
- "eval_recall": 0.8087248322147651,
- "eval_runtime": 0.8664,
- "eval_samples_per_second": 343.963,
- "eval_steps_per_second": 43.861,
+ "eval_accuracy": 0.8187919463087249,
+ "eval_loss": 0.4068666100502014,
+ "eval_runtime": 3.2087,
+ "eval_samples_per_second": 92.871,
+ "eval_steps_per_second": 11.843,
  "step": 55
  },
  {
  "epoch": 3.9731543624161074,
- "eval_accuracy": 0.8523489932885906,
- "eval_f1": 0.8433916249277822,
- "eval_loss": 0.4109182059764862,
- "eval_precision": 0.8727817866814688,
- "eval_recall": 0.8523489932885906,
- "eval_runtime": 0.8712,
- "eval_samples_per_second": 342.059,
- "eval_steps_per_second": 43.618,
+ "eval_accuracy": 0.8355704697986577,
+ "eval_loss": 0.3811332583427429,
+ "eval_runtime": 3.2325,
+ "eval_samples_per_second": 92.188,
+ "eval_steps_per_second": 11.755,
  "step": 74
  }
  ],
  "logging_steps": 500,
- "max_steps": 216,
+ "max_steps": 126,
  "num_input_tokens_seen": 0,
- "num_train_epochs": 12,
+ "num_train_epochs": 7,
  "save_steps": 500,
  "stateful_callbacks": {
  "EarlyStoppingCallback": {
checkpoint-74/training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:296be9afae72ab3934d873f0cf92f87ef76899c18b11651de670afb49aa1a5d6
+ oid sha256:4320ed7eb3857f3356f3c0fd71b66d450b29bc6f61001ac820f978865e977454
  size 5240
checkpoint-93/model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:0b1278f478e484fe8966052e7cedbe2cd627eca9979ce11b06e5b8f823427f29
+ oid sha256:0d3d69c192989ce2813f8d99d6435c07783c8a6f57b3311dc42e47e0b875e811
  size 94765560
checkpoint-93/optimizer.pt CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:eb3c381477600c922a9b63900e0bc0281efab5e1d3889d5971688dd7b8631338
+ oid sha256:a1ce9c56f3a961f219bca2e4bb08198e6845fd46ebfc30a3de72b3247c77dcea
  size 189556666
checkpoint-93/scheduler.pt CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b5a98f38af6530b97608c99604027e7e1fd08ea62f92b5d26f4379a4874e5a6e
+ oid sha256:c2bb70a20d69c42e89da01454789086991cde012c396cfc3a8588724e4a08637
  size 1064
checkpoint-93/trainer_state.json CHANGED
@@ -1,6 +1,6 @@
  {
- "best_metric": 0.87248322147651,
- "best_model_checkpoint": "distilhubert-finetuned-mixed-data/checkpoint-93",
+ "best_metric": 0.8355704697986577,
+ "best_model_checkpoint": "distilhubert-finetuned-mixed-data/checkpoint-74",
  "epoch": 4.993288590604027,
  "eval_steps": 500,
  "global_step": 93,
@@ -10,69 +10,54 @@
  "log_history": [
  {
  "epoch": 0.9664429530201343,
- "eval_accuracy": 0.7818791946308725,
- "eval_f1": 0.7264205130236912,
- "eval_loss": 0.669560968875885,
- "eval_precision": 0.689807639599501,
- "eval_recall": 0.7818791946308725,
- "eval_runtime": 0.9033,
- "eval_samples_per_second": 329.896,
- "eval_steps_per_second": 42.067,
+ "eval_accuracy": 0.7583892617449665,
+ "eval_loss": 0.686046838760376,
+ "eval_runtime": 3.2719,
+ "eval_samples_per_second": 91.079,
+ "eval_steps_per_second": 11.614,
  "step": 18
  },
  {
  "epoch": 1.9865771812080537,
- "eval_accuracy": 0.7751677852348994,
- "eval_f1": 0.7202681570933687,
- "eval_loss": 0.5067932605743408,
- "eval_precision": 0.684911313518696,
- "eval_recall": 0.7751677852348994,
- "eval_runtime": 0.907,
- "eval_samples_per_second": 328.546,
- "eval_steps_per_second": 41.895,
+ "eval_accuracy": 0.802013422818792,
+ "eval_loss": 0.46226799488067627,
+ "eval_runtime": 3.3286,
+ "eval_samples_per_second": 89.527,
+ "eval_steps_per_second": 11.416,
  "step": 37
  },
  {
  "epoch": 2.953020134228188,
- "eval_accuracy": 0.8087248322147651,
- "eval_f1": 0.7535236037076262,
- "eval_loss": 0.43038079142570496,
- "eval_precision": 0.7241626365959,
- "eval_recall": 0.8087248322147651,
- "eval_runtime": 0.8664,
- "eval_samples_per_second": 343.963,
- "eval_steps_per_second": 43.861,
+ "eval_accuracy": 0.8187919463087249,
+ "eval_loss": 0.4068666100502014,
+ "eval_runtime": 3.2087,
+ "eval_samples_per_second": 92.871,
+ "eval_steps_per_second": 11.843,
  "step": 55
  },
  {
  "epoch": 3.9731543624161074,
- "eval_accuracy": 0.8523489932885906,
- "eval_f1": 0.8433916249277822,
- "eval_loss": 0.4109182059764862,
- "eval_precision": 0.8727817866814688,
- "eval_recall": 0.8523489932885906,
- "eval_runtime": 0.8712,
- "eval_samples_per_second": 342.059,
- "eval_steps_per_second": 43.618,
+ "eval_accuracy": 0.8355704697986577,
+ "eval_loss": 0.3811332583427429,
+ "eval_runtime": 3.2325,
+ "eval_samples_per_second": 92.188,
+ "eval_steps_per_second": 11.755,
  "step": 74
  },
  {
  "epoch": 4.993288590604027,
- "eval_accuracy": 0.87248322147651,
- "eval_f1": 0.8717711524765707,
- "eval_loss": 0.3263051509857178,
- "eval_precision": 0.8718521382399975,
- "eval_recall": 0.87248322147651,
- "eval_runtime": 0.87,
- "eval_samples_per_second": 342.548,
- "eval_steps_per_second": 43.681,
+ "eval_accuracy": 0.8355704697986577,
+ "eval_loss": 0.3542439937591553,
+ "eval_runtime": 3.2746,
+ "eval_samples_per_second": 91.003,
+ "eval_steps_per_second": 11.604,
  "step": 93
  }
  ],
  "logging_steps": 500,
- "max_steps": 216,
+ "max_steps": 126,
  "num_input_tokens_seen": 0,
- "num_train_epochs": 12,
+ "num_train_epochs": 7,
  "save_steps": 500,
  "stateful_callbacks": {
  "EarlyStoppingCallback": {
checkpoint-93/training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:296be9afae72ab3934d873f0cf92f87ef76899c18b11651de670afb49aa1a5d6
+ oid sha256:4320ed7eb3857f3356f3c0fd71b66d450b29bc6f61001ac820f978865e977454
  size 5240
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:add11297401b595ef24d8a66b9cf4f5bca92dae882193b715a5917dff07be685
+ oid sha256:5aa09bf08a82037b1a7e97fc26fb24670bc804169b2866e90999fb465178164d
  size 94765560
runs/Sep02_21-37-15_ubumarcos/events.out.tfevents.1725305838.ubumarcos ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3bd2942fa11e2b96ec0bf193c02a3d90f545996ac5d6c9a2a1d79dc5b7e274c1
+ size 6562
runs/Sep02_23-16-29_ubumarcos/events.out.tfevents.1725311792.ubumarcos ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ce158ecf28ec84ea36b857f71a855b6145bd29512438400faabeae605fb1f97d
+ size 6562
runs/Sep02_23-18-00_ubumarcos/events.out.tfevents.1725311883.ubumarcos ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f455c26277a30f5cb10c5e6da88cd4c1e37afa9b14eb98e3175e7db787029f36
+ size 8464
runs/Sep03_00-14-06_ubumarcos/events.out.tfevents.1725315248.ubumarcos ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:31fa2415a5ff41201a78bc2d0fc4876051bc72fc9ea9d0a62bac3429895a1c37
+ size 5897
runs/Sep03_13-16-32_ubumarcos/events.out.tfevents.1725362195.ubumarcos ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e860ae2b515ea6737e89cdf0074e7f5ae63319c4e78e1224a05d1342ac0a1d07
+ size 5897
runs/Sep03_13-18-34_ubumarcos/events.out.tfevents.1725362316.ubumarcos ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0583e8a36e90e098e7c7ce2b2f54d745eb63200aa50d8790a3328390e051822a
+ size 6562