Marcos12886 committed on
Commit 608aecb · verified · 1 Parent(s): d28c8ad

Upload folder using huggingface_hub

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. README.md +20 -19
  2. checkpoint-100/config.json +85 -0
  3. checkpoint-100/model.safetensors +3 -0
  4. checkpoint-100/optimizer.pt +3 -0
  5. checkpoint-100/rng_state.pth +3 -0
  6. checkpoint-100/scheduler.pt +3 -0
  7. checkpoint-100/trainer_state.json +94 -0
  8. checkpoint-100/training_args.bin +3 -0
  9. checkpoint-180/config.json +4 -4
  10. checkpoint-180/model.safetensors +1 -1
  11. checkpoint-180/optimizer.pt +1 -1
  12. checkpoint-180/rng_state.pth +1 -1
  13. checkpoint-180/scheduler.pt +1 -1
  14. checkpoint-180/trainer_state.json +88 -123
  15. checkpoint-180/training_args.bin +1 -1
  16. checkpoint-200/config.json +85 -0
  17. checkpoint-200/model.safetensors +3 -0
  18. checkpoint-200/optimizer.pt +3 -0
  19. checkpoint-200/rng_state.pth +3 -0
  20. checkpoint-200/scheduler.pt +3 -0
  21. checkpoint-200/trainer_state.json +146 -0
  22. checkpoint-200/training_args.bin +3 -0
  23. checkpoint-270/config.json +7 -6
  24. checkpoint-270/model.safetensors +1 -1
  25. checkpoint-270/optimizer.pt +1 -1
  26. checkpoint-270/rng_state.pth +2 -2
  27. checkpoint-270/scheduler.pt +1 -1
  28. checkpoint-270/trainer_state.json +117 -51
  29. checkpoint-270/training_args.bin +1 -1
  30. checkpoint-40/config.json +85 -0
  31. checkpoint-40/model.safetensors +3 -0
  32. checkpoint-40/optimizer.pt +3 -0
  33. checkpoint-40/rng_state.pth +3 -0
  34. checkpoint-40/scheduler.pt +3 -0
  35. checkpoint-40/trainer_state.json +41 -0
  36. checkpoint-40/training_args.bin +3 -0
  37. model.safetensors +1 -1
  38. runs/Sep11_11-05-01_ubumarcos/events.out.tfevents.1726045650.ubumarcos +3 -0
  39. runs/Sep11_11-13-29_ubumarcos/events.out.tfevents.1726046010.ubumarcos +3 -0
  40. runs/Sep11_11-13-48_ubumarcos/events.out.tfevents.1726046029.ubumarcos +3 -0
  41. runs/Sep11_11-18-22_ubumarcos/events.out.tfevents.1726046303.ubumarcos +3 -0
  42. runs/Sep11_11-18-47_ubumarcos/events.out.tfevents.1726046328.ubumarcos +3 -0
  43. runs/Sep11_11-18-47_ubumarcos/events.out.tfevents.1726046857.ubumarcos +3 -0
  44. runs/Sep11_11-29-05_ubumarcos/events.out.tfevents.1726046946.ubumarcos +3 -0
  45. runs/Sep11_11-29-05_ubumarcos/events.out.tfevents.1726047291.ubumarcos +3 -0
  46. runs/Sep11_11-37-18_ubumarcos/events.out.tfevents.1726047439.ubumarcos +3 -0
  47. runs/Sep11_11-37-18_ubumarcos/events.out.tfevents.1726047783.ubumarcos +3 -0
  48. runs/Sep11_12-04-17_ubumarcos/events.out.tfevents.1726049058.ubumarcos +3 -0
  49. runs/Sep11_12-04-17_ubumarcos/events.out.tfevents.1726049410.ubumarcos +3 -0
  50. runs/Sep11_12-23-02_ubumarcos/events.out.tfevents.1726050183.ubumarcos +3 -0
README.md CHANGED
@@ -6,9 +6,9 @@ tags:
  - generated_from_trainer
  metrics:
  - accuracy
+ - f1
  - precision
  - recall
- - f1
  model-index:
  - name: distilhubert-finetuned-mixed-data
  results: []
@@ -21,11 +21,12 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [ntu-spml/distilhubert](https://huggingface.co/ntu-spml/distilhubert) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.5203
- - Accuracy: 0.7881
- - Precision: 0.7937
- - Recall: 0.7881
- - F1: 0.7617
+ - Loss: 0.7837
+ - Accuracy: 0.8498
+ - F1: 0.8473
+ - Precision: 0.8494
+ - Recall: 0.8498
+ - Confusion Matrix: [[76, 11, 0, 0], [7, 35, 17, 0], [0, 5, 59, 0], [1, 0, 0, 62]]
 
  ## Model description
 
@@ -45,24 +46,24 @@ More information needed
 
  The following hyperparameters were used during training:
  - learning_rate: 0.0003
- - train_batch_size: 8
- - eval_batch_size: 8
+ - train_batch_size: 64
+ - eval_batch_size: 64
  - seed: 123
- - gradient_accumulation_steps: 8
- - total_train_batch_size: 64
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 128
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_ratio: 0.4
- - num_epochs: 15
+ - lr_scheduler_type: cosine_with_restarts
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 30
+ - mixed_precision_training: Native AMP
+ - label_smoothing_factor: 0.1
 
  ### Training results
 
- | Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 |
- |:-------------:|:------:|:----:|:---------------:|:--------:|:---------:|:------:|:------:|
- | No log | 0.9897 | 24 | 1.2071 | 0.4935 | 0.2436 | 0.4935 | 0.3262 |
- | No log | 1.9794 | 48 | 0.9013 | 0.6718 | 0.7410 | 0.6718 | 0.6040 |
- | No log | 2.9691 | 72 | 0.6878 | 0.7235 | 0.7739 | 0.7235 | 0.6509 |
- | No log | 4.0 | 97 | 0.5203 | 0.7881 | 0.7937 | 0.7881 | 0.7617 |
+ | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall | Confusion Matrix |
+ |:-------------:|:-------:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|:--------------------------------------------------------------:|
+ | 0.5938 | 11.1111 | 100 | 0.7100 | 0.8278 | 0.8206 | 0.8212 | 0.8278 | [[79, 7, 0, 1], [11, 29, 19, 0], [0, 8, 56, 0], [1, 0, 0, 62]] |
+ | 0.3511 | 22.2222 | 200 | 0.7837 | 0.8498 | 0.8473 | 0.8494 | 0.8498 | [[76, 11, 0, 0], [7, 35, 17, 0], [0, 5, 59, 0], [1, 0, 0, 62]] |
 
 
  ### Framework versions
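The accuracy, precision, recall, and F1 in the updated README can be cross-checked against the confusion matrix it reports for step 200. The sketch below is not taken from the training script. It assumes rows are true classes, columns are predicted classes, and that precision, recall, and F1 are support-weighted averages. The helper name `weighted_metrics` is hypothetical.

```python
# Confusion matrix reported for step 200 in the README diff.
# Assumption: rows are true classes, columns are predicted classes.
CONFUSION = [
    [76, 11, 0, 0],
    [7, 35, 17, 0],
    [0, 5, 59, 0],
    [1, 0, 0, 62],
]


def weighted_metrics(cm):
    """Return accuracy and support-weighted precision, recall, and F1."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    support = [sum(row) for row in cm]  # true samples per class
    predicted = [sum(cm[r][c] for r in range(n)) for c in range(n)]  # predictions per class
    accuracy = sum(cm[i][i] for i in range(n)) / total

    precision = recall = f1 = 0.0
    for i in range(n):
        tp = cm[i][i]
        p = tp / predicted[i] if predicted[i] else 0.0
        r = tp / support[i] if support[i] else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support[i] / total  # weight each class by its share of true samples
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return accuracy, precision, recall, f1


acc, prec, rec, f1 = weighted_metrics(CONFUSION)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f1={f1:.4f}")
# accuracy=0.8498 precision=0.8494 recall=0.8498 f1=0.8473
```

Under these assumptions the output matches the README values. Weighted recall always equals accuracy, which is why those two columns are identical in the results table.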
checkpoint-100/config.json ADDED
@@ -0,0 +1,85 @@
1
+ {
2
+ "_name_or_path": "ntu-spml/distilhubert",
3
+ "activation_dropout": 0.1,
4
+ "apply_spec_augment": false,
5
+ "architectures": [
6
+ "HubertForSequenceClassification"
7
+ ],
8
+ "attention_dropout": 0.1,
9
+ "bos_token_id": 1,
10
+ "classifier_proj_size": 256,
11
+ "conv_bias": false,
12
+ "conv_dim": [
13
+ 512,
14
+ 512,
15
+ 512,
16
+ 512,
17
+ 512,
18
+ 512,
19
+ 512
20
+ ],
21
+ "conv_kernel": [
22
+ 10,
23
+ 3,
24
+ 3,
25
+ 3,
26
+ 3,
27
+ 2,
28
+ 2
29
+ ],
30
+ "conv_stride": [
31
+ 5,
32
+ 2,
33
+ 2,
34
+ 2,
35
+ 2,
36
+ 2,
37
+ 2
38
+ ],
39
+ "ctc_loss_reduction": "sum",
40
+ "ctc_zero_infinity": false,
41
+ "do_stable_layer_norm": false,
42
+ "eos_token_id": 2,
43
+ "feat_extract_activation": "gelu",
44
+ "feat_extract_norm": "group",
45
+ "feat_proj_dropout": 0.0,
46
+ "feat_proj_layer_norm": false,
47
+ "final_dropout": 0.0,
48
+ "finetuning_task": "audio-classification",
49
+ "hidden_act": "gelu",
50
+ "hidden_dropout": 0.1,
51
+ "hidden_size": 768,
52
+ "id2label": {
53
+ "0": "1s_normal",
54
+ "1": "1s_pain",
55
+ "2": "1s_hunger",
56
+ "3": "1s_asphyxia"
57
+ },
58
+ "initializer_range": 0.02,
59
+ "intermediate_size": 3072,
60
+ "label2id": {
61
+ "LABEL_0": 0,
62
+ "LABEL_1": 1,
63
+ "LABEL_2": 2,
64
+ "LABEL_3": 3
65
+ },
66
+ "layer_norm_eps": 1e-05,
67
+ "layerdrop": 0.0,
68
+ "mask_feature_length": 10,
69
+ "mask_feature_min_masks": 0,
70
+ "mask_feature_prob": 0.0,
71
+ "mask_time_length": 10,
72
+ "mask_time_min_masks": 2,
73
+ "mask_time_prob": 0.05,
74
+ "model_type": "hubert",
75
+ "num_attention_heads": 12,
76
+ "num_conv_pos_embedding_groups": 16,
77
+ "num_conv_pos_embeddings": 128,
78
+ "num_feat_extract_layers": 7,
79
+ "num_hidden_layers": 2,
80
+ "pad_token_id": 0,
81
+ "torch_dtype": "float32",
82
+ "transformers_version": "4.44.2",
83
+ "use_weighted_layer_sum": false,
84
+ "vocab_size": 32
85
+ }
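In this config, `id2label` carries the class names (`1s_normal`, `1s_pain`, `1s_hunger`, `1s_asphyxia`) while `label2id` uses generic `LABEL_n` keys, so the two mappings do not mirror each other. The diff does not say whether that is intended. The sketch below assumes it is not, and shows one way to derive a matching `label2id` by inverting `id2label`.

```python
# id2label as it appears in checkpoint-100/config.json.
id2label = {"0": "1s_normal", "1": "1s_pain", "2": "1s_hunger", "3": "1s_asphyxia"}

# Hypothetical fix-up: invert id2label so both mappings describe the same classes.
label2id = {label: int(idx) for idx, label in id2label.items()}

# Round-trip check: every id maps to a label that maps back to the same id.
assert all(label2id[id2label[str(i)]] == i for i in range(len(id2label)))
print(label2id)
```

If the mismatch is deliberate, nothing needs to change. The sketch only shows what a consistent pair would look like.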
checkpoint-100/model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:08e9cb732906b5f8e32e1a9e02636192812a954053f646a784f30a526c555580
+ size 94765560
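The binary files in this commit are stored as Git LFS pointers: three `key value` lines giving the spec version, the SHA-256 object id, and the size in bytes. The sketch below shows how such a pointer could be read. `parse_lfs_pointer` is a hypothetical helper, not part of any library.

```python
# Pointer content for checkpoint-100/model.safetensors, copied from the diff.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:08e9cb732906b5f8e32e1a9e02636192812a954053f646a784f30a526c555580
size 94765560
"""


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict (hypothetical helper)."""
    # Each non-empty line is "key value"; split on the first space only.
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


pointer = parse_lfs_pointer(POINTER)
print(pointer["hash_algo"], pointer["size"], f"{pointer['size'] / 2**20:.1f} MiB")
```

Because the pointer records a content hash, a changed `oid` with an unchanged `size` (as in the `checkpoint-180` and `checkpoint-270` files) means the contents changed while the file length stayed the same.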
checkpoint-100/optimizer.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:661cb52b3461bb635323ecab97b907de2bea8cfd2a616b17ca0b9680a81f2086
+ size 189556666
checkpoint-100/rng_state.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fa137f7efe9a3317de9d36d40c6ca56966dff64c2feada684ad95f64e143d9db
+ size 14308
checkpoint-100/scheduler.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a7d8b5ff37d2416ca0350cbb3fade2e2f68f962f7b1bdb89a67d154010119c7e
+ size 1064
checkpoint-100/trainer_state.json ADDED
@@ -0,0 +1,94 @@
1
+ {
2
+ "best_metric": 0.8205727670173567,
3
+ "best_model_checkpoint": "distilhubert-finetuned-mixed-data/checkpoint-100",
4
+ "epoch": 11.11111111111111,
5
+ "eval_steps": 100,
6
+ "global_step": 100,
7
+ "is_hyper_param_search": false,
8
+ "is_local_process_zero": true,
9
+ "is_world_process_zero": true,
10
+ "log_history": [
11
+ {
12
+ "epoch": 5.555555555555555,
13
+ "grad_norm": 3.0120761394500732,
14
+ "learning_rate": 0.0002954973225032615,
15
+ "loss": 0.9823,
16
+ "step": 50
17
+ },
18
+ {
19
+ "epoch": 11.11111111111111,
20
+ "grad_norm": 1.183480143547058,
21
+ "learning_rate": 0.0002441718187008148,
22
+ "loss": 0.5938,
23
+ "step": 100
24
+ },
25
+ {
26
+ "epoch": 11.11111111111111,
27
+ "eval_accuracy": 0.8278388278388278,
28
+ "eval_confusion_matrix": [
29
+ [
30
+ 79,
31
+ 7,
32
+ 0,
33
+ 1
34
+ ],
35
+ [
36
+ 11,
37
+ 29,
38
+ 19,
39
+ 0
40
+ ],
41
+ [
42
+ 0,
43
+ 8,
44
+ 56,
45
+ 0
46
+ ],
47
+ [
48
+ 1,
49
+ 0,
50
+ 0,
51
+ 62
52
+ ]
53
+ ],
54
+ "eval_f1": 0.8205727670173567,
55
+ "eval_loss": 0.7099524140357971,
56
+ "eval_precision": 0.8212472631153949,
57
+ "eval_recall": 0.8278388278388278,
58
+ "eval_runtime": 3.0504,
59
+ "eval_samples_per_second": 89.496,
60
+ "eval_steps_per_second": 1.639,
61
+ "step": 100
62
+ }
63
+ ],
64
+ "logging_steps": 50,
65
+ "max_steps": 270,
66
+ "num_input_tokens_seen": 0,
67
+ "num_train_epochs": 30,
68
+ "save_steps": 100,
69
+ "stateful_callbacks": {
70
+ "EarlyStoppingCallback": {
71
+ "args": {
72
+ "early_stopping_patience": 5,
73
+ "early_stopping_threshold": 0.001
74
+ },
75
+ "attributes": {
76
+ "early_stopping_patience_counter": 0
77
+ }
78
+ },
79
+ "TrainerControl": {
80
+ "args": {
81
+ "should_epoch_stop": false,
82
+ "should_evaluate": false,
83
+ "should_log": false,
84
+ "should_save": true,
85
+ "should_training_stop": false
86
+ },
87
+ "attributes": {}
88
+ }
89
+ },
90
+ "total_flos": 2.755907745408e+16,
91
+ "train_batch_size": 64,
92
+ "trial_name": null,
93
+ "trial_params": null
94
+ }
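The fractional epochs in this state file follow from the step counts. A minimal sketch of the arithmetic, using only values shown in this commit (`max_steps`, `num_train_epochs`, `train_batch_size` from `trainer_state.json`, `gradient_accumulation_steps` from the README) and assuming a single training device:

```python
max_steps = 270        # from trainer_state.json
num_train_epochs = 30  # from trainer_state.json
train_batch_size = 64  # per-device batch size, from trainer_state.json
grad_accum_steps = 2   # from the README hyperparameters

# Optimizer steps per epoch, and samples consumed per optimizer step
# (single device assumed, which the diff does not state).
steps_per_epoch = max_steps / num_train_epochs
effective_batch = train_batch_size * grad_accum_steps


def epoch_at(step):
    """Fractional epoch reached after `step` optimizer steps."""
    return step / steps_per_epoch


print(steps_per_epoch, effective_batch, epoch_at(50), epoch_at(100))
```

With 9 optimizer steps per epoch, steps 50 and 100 land at epochs 5.56 and 11.11, matching the `epoch` fields logged above, and the effective batch of 128 matches `total_train_batch_size` in the README.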
checkpoint-100/training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:07df27942fd7faa53557a040600995f1eec08c68543b8c4ec8f3d5cdb0edfa8f
+ size 5240
checkpoint-180/config.json CHANGED
@@ -58,10 +58,10 @@
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
- "1s_asphyxia": 3,
- "1s_hunger": 2,
- "1s_normal": 0,
- "1s_pain": 1
+ "LABEL_0": 0,
+ "LABEL_1": 1,
+ "LABEL_2": 2,
+ "LABEL_3": 3
  },
  "layer_norm_eps": 1e-05,
  "layerdrop": 0.0,
checkpoint-180/model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:acc680f6e102eeadd30e80f8d0917ef2babd69b26da754c6c1506b3aea183be1
+ oid sha256:8c9283b27b956b9acc632146b05e3c456a753be4f96c74f1376d03f33392f16e
  size 94765560
checkpoint-180/optimizer.pt CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5e17fe6819e7898ce8c3bcb0c40cdbf8d12ea8266388ff9d90e29fc7550d6f67
+ oid sha256:cd033d769b7d0993c3522a75d31e060fc9cf1f7b58cc6f8afabb3a28318771d8
  size 189556666
checkpoint-180/rng_state.pth CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:0c3793e89685615bff97fe439d283c746009106914c13ae7640b7ef61a5c6001
+ oid sha256:4dbe565c2ac73ca25b9906a99d6ed8b3f933fc7d815b6f485836239d23ea7fec
  size 14308
checkpoint-180/scheduler.pt CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a14a0dd800084fc8ff4e37ab512965705777f05e1aa8549e4415e97045313091
+ oid sha256:4765f377e252739e4ad79d9a9df21f1b09180125b26de4eebc24a3c4a7f2b691
  size 1064
checkpoint-180/trainer_state.json CHANGED
@@ -1,144 +1,109 @@
1
  {
2
- "best_metric": 0.8145695364238411,
3
- "best_model_checkpoint": "distilhubert-finetuned-mixed-data/checkpoint-169",
4
- "epoch": 9.536423841059603,
5
- "eval_steps": 500,
6
  "global_step": 180,
7
  "is_hyper_param_search": false,
8
  "is_local_process_zero": true,
9
  "is_world_process_zero": true,
10
  "log_history": [
11
  {
12
- "epoch": 0.9536423841059603,
13
- "eval_accuracy": 0.5298013245033113,
14
- "eval_f1": 0.43216404525386315,
15
- "eval_loss": 1.1326755285263062,
16
- "eval_precision": 0.5213817284211205,
17
- "eval_recall": 0.5298013245033113,
18
- "eval_runtime": 1.2241,
19
- "eval_samples_per_second": 246.709,
20
- "eval_steps_per_second": 31.043,
21
- "step": 18
22
  },
23
  {
24
- "epoch": 1.9602649006622517,
25
- "eval_accuracy": 0.6423841059602649,
26
- "eval_f1": 0.5806184720425087,
27
- "eval_loss": 0.9228919744491577,
28
- "eval_precision": 0.5520930002801896,
29
- "eval_recall": 0.6423841059602649,
30
- "eval_runtime": 1.2236,
31
- "eval_samples_per_second": 246.807,
32
- "eval_steps_per_second": 31.055,
33
- "step": 37
34
  },
35
  {
36
- "epoch": 2.966887417218543,
37
- "eval_accuracy": 0.7086092715231788,
38
- "eval_f1": 0.6539391094940458,
39
- "eval_loss": 0.7409619688987732,
40
- "eval_precision": 0.752516290193287,
41
- "eval_recall": 0.7086092715231788,
42
- "eval_runtime": 1.2459,
43
- "eval_samples_per_second": 242.403,
44
- "eval_steps_per_second": 30.501,
45
- "step": 56
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
46
  },
47
  {
48
- "epoch": 3.9735099337748343,
49
- "eval_accuracy": 0.7450331125827815,
50
- "eval_f1": 0.7012377717797856,
51
- "eval_loss": 0.6461689472198486,
52
- "eval_precision": 0.7242129191632504,
53
- "eval_recall": 0.7450331125827815,
54
- "eval_runtime": 1.229,
55
- "eval_samples_per_second": 245.723,
56
- "eval_steps_per_second": 30.919,
57
- "step": 75
58
- },
59
- {
60
- "epoch": 4.9801324503311255,
61
- "eval_accuracy": 0.7980132450331126,
62
- "eval_f1": 0.7903709596982513,
63
- "eval_loss": 0.5553261041641235,
64
- "eval_precision": 0.7925903096412185,
65
- "eval_recall": 0.7980132450331126,
66
- "eval_runtime": 1.2897,
67
- "eval_samples_per_second": 234.157,
68
- "eval_steps_per_second": 29.463,
69
- "step": 94
70
- },
71
- {
72
- "epoch": 5.986754966887418,
73
- "eval_accuracy": 0.7781456953642384,
74
- "eval_f1": 0.7717607879297459,
75
- "eval_loss": 0.5255588293075562,
76
- "eval_precision": 0.7771454278224522,
77
- "eval_recall": 0.7781456953642384,
78
- "eval_runtime": 1.2928,
79
- "eval_samples_per_second": 233.597,
80
- "eval_steps_per_second": 29.393,
81
- "step": 113
82
- },
83
- {
84
- "epoch": 6.993377483443709,
85
- "eval_accuracy": 0.7980132450331126,
86
- "eval_f1": 0.7833793670187674,
87
- "eval_loss": 0.5077652335166931,
88
- "eval_precision": 0.7917508237685551,
89
- "eval_recall": 0.7980132450331126,
90
- "eval_runtime": 1.2898,
91
- "eval_samples_per_second": 234.154,
92
- "eval_steps_per_second": 29.463,
93
- "step": 132
94
- },
95
- {
96
- "epoch": 8.0,
97
- "eval_accuracy": 0.8112582781456954,
98
- "eval_f1": 0.8021247299665692,
99
- "eval_loss": 0.4742371141910553,
100
- "eval_precision": 0.8054865043662888,
101
- "eval_recall": 0.8112582781456954,
102
- "eval_runtime": 1.381,
103
- "eval_samples_per_second": 218.682,
104
- "eval_steps_per_second": 27.516,
105
- "step": 151
106
- },
107
- {
108
- "epoch": 8.95364238410596,
109
- "eval_accuracy": 0.8145695364238411,
110
- "eval_f1": 0.805819805920304,
111
- "eval_loss": 0.4742475152015686,
112
- "eval_precision": 0.8065208989148904,
113
- "eval_recall": 0.8145695364238411,
114
- "eval_runtime": 1.2663,
115
- "eval_samples_per_second": 238.482,
116
- "eval_steps_per_second": 30.008,
117
- "step": 169
118
- },
119
- {
120
- "epoch": 9.536423841059603,
121
- "eval_accuracy": 0.8079470198675497,
122
- "eval_f1": 0.800196523810004,
123
- "eval_loss": 0.473416268825531,
124
- "eval_precision": 0.7994983293345093,
125
- "eval_recall": 0.8079470198675497,
126
- "eval_runtime": 1.2838,
127
- "eval_samples_per_second": 235.232,
128
- "eval_steps_per_second": 29.599,
129
- "step": 180
130
  }
131
  ],
132
- "logging_steps": 500,
133
  "max_steps": 180,
134
  "num_input_tokens_seen": 0,
135
- "num_train_epochs": 10,
136
- "save_steps": 500,
137
  "stateful_callbacks": {
138
  "EarlyStoppingCallback": {
139
  "args": {
140
- "early_stopping_patience": 3,
141
- "early_stopping_threshold": 0.0
142
  },
143
  "attributes": {
144
  "early_stopping_patience_counter": 0
@@ -155,8 +120,8 @@
155
  "attributes": {}
156
  }
157
  },
158
- "total_flos": 2.61990899712e+16,
159
- "train_batch_size": 8,
160
  "trial_name": null,
161
  "trial_params": null
162
  }
 
1
  {
2
+ "best_metric": 0.8109192582521296,
3
+ "best_model_checkpoint": "distilhubert-finetuned-mixed-data/checkpoint-100",
4
+ "epoch": 20.0,
5
+ "eval_steps": 100,
6
  "global_step": 180,
7
  "is_hyper_param_search": false,
8
  "is_local_process_zero": true,
9
  "is_world_process_zero": true,
10
  "log_history": [
11
  {
12
+ "epoch": 5.555555555555555,
13
+ "grad_norm": 5.775369644165039,
14
+ "learning_rate": 0.0002768979638879761,
15
+ "loss": 0.9596,
16
+ "step": 50
 
 
 
 
 
17
  },
18
  {
19
+ "epoch": 11.11111111111111,
20
+ "grad_norm": 1.9409778118133545,
21
+ "learning_rate": 0.0001558163056885225,
22
+ "loss": 0.57,
23
+ "step": 100
 
 
 
 
 
24
  },
25
  {
26
+ "epoch": 11.11111111111111,
27
+ "eval_accuracy": 0.8095238095238095,
28
+ "eval_confusion_matrix": [
29
+ [
30
+ 69,
31
+ 11,
32
+ 1,
33
+ 3
34
+ ],
35
+ [
36
+ 9,
37
+ 37,
38
+ 10,
39
+ 0
40
+ ],
41
+ [
42
+ 3,
43
+ 13,
44
+ 58,
45
+ 0
46
+ ],
47
+ [
48
+ 2,
49
+ 0,
50
+ 0,
51
+ 57
52
+ ]
53
+ ],
54
+ "eval_f1": 0.8109192582521296,
55
+ "eval_loss": 0.7797720432281494,
56
+ "eval_normalized_confusion_matrix": [
57
+ [
58
+ 0.8214285714285714,
59
+ 0.13095238095238096,
60
+ 0.011904761904761904,
61
+ 0.03571428571428571
62
+ ],
63
+ [
64
+ 0.16071428571428573,
65
+ 0.6607142857142857,
66
+ 0.17857142857142858,
67
+ 0.0
68
+ ],
69
+ [
70
+ 0.04054054054054054,
71
+ 0.17567567567567569,
72
+ 0.7837837837837838,
73
+ 0.0
74
+ ],
75
+ [
76
+ 0.03389830508474576,
77
+ 0.0,
78
+ 0.0,
79
+ 0.9661016949152542
80
+ ]
81
+ ],
82
+ "eval_precision": 0.8133752269841887,
83
+ "eval_recall": 0.8095238095238095,
84
+ "eval_runtime": 2.8407,
85
+ "eval_samples_per_second": 96.102,
86
+ "eval_steps_per_second": 1.76,
87
+ "step": 100
88
  },
89
  {
90
+ "epoch": 16.666666666666668,
91
+ "grad_norm": 0.1650346964597702,
92
+ "learning_rate": 2.9681521086743422e-05,
93
+ "loss": 0.3729,
94
+ "step": 150
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
95
  }
96
  ],
97
+ "logging_steps": 50,
98
  "max_steps": 180,
99
  "num_input_tokens_seen": 0,
100
+ "num_train_epochs": 20,
101
+ "save_steps": 100,
102
  "stateful_callbacks": {
103
  "EarlyStoppingCallback": {
104
  "args": {
105
+ "early_stopping_patience": 5,
106
+ "early_stopping_threshold": 0.001
107
  },
108
  "attributes": {
109
  "early_stopping_patience_counter": 0
 
120
  "attributes": {}
121
  }
122
  },
123
+ "total_flos": 4.9578139008e+16,
124
+ "train_batch_size": 64,
125
  "trial_name": null,
126
  "trial_params": null
127
  }
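The new `eval_normalized_confusion_matrix` entry in this file appears to be the raw `eval_confusion_matrix` with each row divided by its row sum, so every row reads as a per-class distribution of predictions. The sketch below reproduces it under that assumption. `normalize_rows` is a hypothetical helper, not the function the training script used.

```python
# eval_confusion_matrix logged at step 100 in checkpoint-180/trainer_state.json.
counts = [
    [69, 11, 1, 3],
    [9, 37, 10, 0],
    [3, 13, 58, 0],
    [2, 0, 0, 57],
]


def normalize_rows(cm):
    """Divide each row by its sum; an all-zero row stays all zeros."""
    out = []
    for row in cm:
        total = sum(row)
        out.append([value / total if total else 0.0 for value in row])
    return out


normalized = normalize_rows(counts)
for row in normalized:
    print([round(value, 4) for value in row])
```

Under row normalization the diagonal entries are the per-class recalls, for example 69/84 for the first class.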
checkpoint-180/training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b8f2331f3c3c1c25969cfb888574c70dd0e5a19519d8cecb6198afe5225b5a53
+ oid sha256:b720f57085bc18ee2568943908307783d85b14beab1b75a43dcb38d4ee4c11ea
  size 5240
checkpoint-200/config.json ADDED
@@ -0,0 +1,85 @@
1
+ {
2
+ "_name_or_path": "ntu-spml/distilhubert",
3
+ "activation_dropout": 0.1,
4
+ "apply_spec_augment": false,
5
+ "architectures": [
6
+ "HubertForSequenceClassification"
7
+ ],
8
+ "attention_dropout": 0.1,
9
+ "bos_token_id": 1,
10
+ "classifier_proj_size": 256,
11
+ "conv_bias": false,
12
+ "conv_dim": [
13
+ 512,
14
+ 512,
15
+ 512,
16
+ 512,
17
+ 512,
18
+ 512,
19
+ 512
20
+ ],
21
+ "conv_kernel": [
22
+ 10,
23
+ 3,
24
+ 3,
25
+ 3,
26
+ 3,
27
+ 2,
28
+ 2
29
+ ],
30
+ "conv_stride": [
31
+ 5,
32
+ 2,
33
+ 2,
34
+ 2,
35
+ 2,
36
+ 2,
37
+ 2
38
+ ],
39
+ "ctc_loss_reduction": "sum",
40
+ "ctc_zero_infinity": false,
41
+ "do_stable_layer_norm": false,
42
+ "eos_token_id": 2,
43
+ "feat_extract_activation": "gelu",
44
+ "feat_extract_norm": "group",
45
+ "feat_proj_dropout": 0.0,
46
+ "feat_proj_layer_norm": false,
47
+ "final_dropout": 0.0,
48
+ "finetuning_task": "audio-classification",
49
+ "hidden_act": "gelu",
50
+ "hidden_dropout": 0.1,
51
+ "hidden_size": 768,
52
+ "id2label": {
53
+ "0": "1s_normal",
54
+ "1": "1s_pain",
55
+ "2": "1s_hunger",
56
+ "3": "1s_asphyxia"
57
+ },
58
+ "initializer_range": 0.02,
59
+ "intermediate_size": 3072,
60
+ "label2id": {
61
+ "LABEL_0": 0,
62
+ "LABEL_1": 1,
63
+ "LABEL_2": 2,
64
+ "LABEL_3": 3
65
+ },
66
+ "layer_norm_eps": 1e-05,
67
+ "layerdrop": 0.0,
68
+ "mask_feature_length": 10,
69
+ "mask_feature_min_masks": 0,
70
+ "mask_feature_prob": 0.0,
71
+ "mask_time_length": 10,
72
+ "mask_time_min_masks": 2,
73
+ "mask_time_prob": 0.05,
74
+ "model_type": "hubert",
75
+ "num_attention_heads": 12,
76
+ "num_conv_pos_embedding_groups": 16,
77
+ "num_conv_pos_embeddings": 128,
78
+ "num_feat_extract_layers": 7,
79
+ "num_hidden_layers": 2,
80
+ "pad_token_id": 0,
81
+ "torch_dtype": "float32",
82
+ "transformers_version": "4.44.2",
83
+ "use_weighted_layer_sum": false,
84
+ "vocab_size": 32
85
+ }
checkpoint-200/model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:657233a1d355cbeb81915e1e45af39e18c323629f94e2cf7bf66d64986ad4832
+ size 94765560
checkpoint-200/optimizer.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8abe9d8b284fb228c59a6505b05996fbe18662ff5106ccde358c57e5c20c31f3
+ size 189556666
checkpoint-200/rng_state.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1394dd8f999426413eb77e99a83f1edbea2b0b3535f29939867074366221ef00
+ size 14308
checkpoint-200/scheduler.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f8175600f2585122ab6da303d674c8f8128e50c429c8813f5676e9f1a4cf29a9
+ size 1064
checkpoint-200/trainer_state.json ADDED
@@ -0,0 +1,146 @@
1
+ {
2
+ "best_metric": 0.8473173810316668,
3
+ "best_model_checkpoint": "distilhubert-finetuned-mixed-data/checkpoint-200",
4
+ "epoch": 22.22222222222222,
5
+ "eval_steps": 100,
6
+ "global_step": 200,
7
+ "is_hyper_param_search": false,
8
+ "is_local_process_zero": true,
9
+ "is_world_process_zero": true,
10
+ "log_history": [
11
+ {
12
+ "epoch": 5.555555555555555,
13
+ "grad_norm": 3.0120761394500732,
14
+ "learning_rate": 0.0002954973225032615,
15
+ "loss": 0.9823,
16
+ "step": 50
17
+ },
18
+ {
19
+ "epoch": 11.11111111111111,
20
+ "grad_norm": 1.183480143547058,
21
+ "learning_rate": 0.0002441718187008148,
22
+ "loss": 0.5938,
23
+ "step": 100
24
+ },
25
+ {
26
+ "epoch": 11.11111111111111,
27
+ "eval_accuracy": 0.8278388278388278,
28
+ "eval_confusion_matrix": [
29
+ [
30
+ 79,
31
+ 7,
32
+ 0,
33
+ 1
34
+ ],
35
+ [
36
+ 11,
37
+ 29,
38
+ 19,
39
+ 0
40
+ ],
41
+ [
42
+ 0,
43
+ 8,
44
+ 56,
45
+ 0
46
+ ],
47
+ [
48
+ 1,
49
+ 0,
50
+ 0,
51
+ 62
52
+ ]
53
+ ],
54
+ "eval_f1": 0.8205727670173567,
55
+ "eval_loss": 0.7099524140357971,
56
+ "eval_precision": 0.8212472631153949,
57
+ "eval_recall": 0.8278388278388278,
58
+ "eval_runtime": 3.0504,
59
+ "eval_samples_per_second": 89.496,
60
+ "eval_steps_per_second": 1.639,
61
+ "step": 100
62
+ },
63
+ {
64
+ "epoch": 16.666666666666668,
65
+ "grad_norm": 1.125301718711853,
66
+ "learning_rate": 0.0001548472927611466,
67
+ "loss": 0.3816,
68
+ "step": 150
69
+ },
70
+ {
71
+ "epoch": 22.22222222222222,
72
+ "grad_norm": 0.025351429358124733,
73
+ "learning_rate": 6.356684850666294e-05,
74
+ "loss": 0.3511,
75
+ "step": 200
76
+ },
77
+ {
78
+ "epoch": 22.22222222222222,
79
+ "eval_accuracy": 0.8498168498168498,
80
+ "eval_confusion_matrix": [
81
+ [
82
+ 76,
83
+ 11,
84
+ 0,
85
+ 0
86
+ ],
87
+ [
88
+ 7,
89
+ 35,
90
+ 17,
91
+ 0
92
+ ],
93
+ [
94
+ 0,
95
+ 5,
96
+ 59,
97
+ 0
98
+ ],
99
+ [
100
+ 1,
101
+ 0,
102
+ 0,
103
+ 62
104
+ ]
105
+ ],
106
+ "eval_f1": 0.8473173810316668,
107
+ "eval_loss": 0.7837256193161011,
108
+ "eval_precision": 0.8494091293737467,
109
+ "eval_recall": 0.8498168498168498,
110
+ "eval_runtime": 2.7557,
111
+ "eval_samples_per_second": 99.068,
112
+ "eval_steps_per_second": 1.814,
113
+ "step": 200
114
+ }
115
+ ],
116
+ "logging_steps": 50,
117
+ "max_steps": 270,
118
+ "num_input_tokens_seen": 0,
119
+ "num_train_epochs": 30,
120
+ "save_steps": 100,
121
+ "stateful_callbacks": {
122
+ "EarlyStoppingCallback": {
123
+ "args": {
124
+ "early_stopping_patience": 5,
125
+ "early_stopping_threshold": 0.001
126
+ },
127
+ "attributes": {
128
+ "early_stopping_patience_counter": 0
129
+ }
130
+ },
131
+ "TrainerControl": {
132
+ "args": {
133
+ "should_epoch_stop": false,
134
+ "should_evaluate": false,
135
+ "should_log": false,
136
+ "should_save": true,
137
+ "should_training_stop": false
138
+ },
139
+ "attributes": {}
140
+ }
141
+ },
142
+ "total_flos": 5.511815490816e+16,
143
+ "train_batch_size": 64,
144
+ "trial_name": null,
145
+ "trial_params": null
146
+ }
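The `learning_rate` values logged in this file decay from the 0.0003 peak, in line with the `cosine_with_restarts` scheduler and 0.1 warmup ratio listed in the README. The sketch below illustrates that schedule shape and is not the library's implementation. It assumes a single cosine cycle (`num_cycles=1`) and 27 warmup steps (0.1 × 270); neither value appears in the diff.

```python
import math


def lr_at(step, base_lr=3e-4, warmup_steps=27, total_steps=270, num_cycles=1):
    """Linear warmup followed by cosine decay with hard restarts (sketch).

    warmup_steps and num_cycles are assumptions, not values read from the diff.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    # Cosine decay; the modulo restarts the curve at the start of each cycle.
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * ((num_cycles * progress) % 1.0)))


for step in (0, 27, 50, 100, 150, 200, 270):
    print(step, f"{lr_at(step):.3e}")
```

The logged values are close to this curve but not identical: the sketch gives about 2.93e-4 at step 50, while the log shows 2.95e-4. One possible cause is optimizer steps skipped under mixed precision, which would delay the scheduler, but the diff does not confirm that.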
checkpoint-200/training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:07df27942fd7faa53557a040600995f1eec08c68543b8c4ec8f3d5cdb0edfa8f
+ size 5240
checkpoint-270/config.json CHANGED
@@ -45,22 +45,23 @@
  "feat_proj_dropout": 0.0,
  "feat_proj_layer_norm": false,
  "final_dropout": 0.0,
+ "finetuning_task": "audio-classification",
  "hidden_act": "gelu",
  "hidden_dropout": 0.1,
  "hidden_size": 768,
  "id2label": {
- "0": "1s_hunger",
+ "0": "1s_normal",
  "1": "1s_pain",
- "2": "1s_normal",
+ "2": "1s_hunger",
  "3": "1s_asphyxia"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
- "1s_asphyxia": "3",
- "1s_hunger": "0",
- "1s_normal": "2",
- "1s_pain": "1"
+ "LABEL_0": 0,
+ "LABEL_1": 1,
+ "LABEL_2": 2,
+ "LABEL_3": 3
  },
  "layer_norm_eps": 1e-05,
  "layerdrop": 0.0,
checkpoint-270/model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d636b53f9e0cf8edea24d717663234e1b5b0672fb530b0f38d2c2dc4591d9de3
+ oid sha256:50e9da842ee1c3049dcf4379a4a2fcb43d7950d7f42d9c0db4c11772ef6946f1
  size 94765560
checkpoint-270/optimizer.pt CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:0791bfebebac2dd1405853681ba5e039ad283f7f249c50aea7c048ce8baff45c
+ oid sha256:8be4abba30debeb9bc21a960ece8393bd20aff03351a2fa67803328459e9051c
  size 189556666
checkpoint-270/rng_state.pth CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:957afb8737ed599d979a7efaf0608ab852299442b9d657e0bab03a7da215279f
- size 14244
+ oid sha256:8e8b38147b55c07dde05ce314700178190f1d8e0b7b4e7ce52c23f13a13b1437
+ size 14308
checkpoint-270/scheduler.pt CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:aca50632b9dcfeaf56f29cc41af869dfc765fe5c731289691cb32c1dd52ebe96
+ oid sha256:2d5af78e1ff750d4062a7fde36fb446bd094216ad52be58bf2c1da246363d86b
  size 1064
checkpoint-270/trainer_state.json CHANGED
@@ -1,74 +1,140 @@
1
  {
2
- "best_metric": 0.8755980861244019,
3
- "best_model_checkpoint": "distilhubert-finetuned-mixed-data/checkpoint-270",
4
- "epoch": 5.917808219178082,
5
- "eval_steps": 500,
6
  "global_step": 270,
7
  "is_hyper_param_search": false,
8
  "is_local_process_zero": true,
9
  "is_world_process_zero": true,
10
  "log_history": [
11
  {
12
- "epoch": 0.9863013698630136,
13
- "eval_accuracy": 0.7942583732057417,
14
- "eval_loss": 0.5718334913253784,
15
- "eval_runtime": 1.0678,
16
- "eval_samples_per_second": 391.448,
17
- "eval_steps_per_second": 49.633,
18
- "step": 45
19
  },
20
  {
21
- "epoch": 1.9945205479452055,
22
- "eval_accuracy": 0.80622009569378,
23
- "eval_loss": 0.45444098114967346,
24
- "eval_runtime": 1.0491,
25
- "eval_samples_per_second": 398.444,
26
- "eval_steps_per_second": 50.52,
27
- "step": 91
28
  },
29
  {
30
- "epoch": 2.9808219178082194,
31
- "eval_accuracy": 0.8397129186602871,
32
- "eval_loss": 0.3911355435848236,
33
- "eval_runtime": 1.0412,
34
- "eval_samples_per_second": 401.464,
35
- "eval_steps_per_second": 50.903,
36
- "step": 136
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
37
  },
38
  {
39
- "epoch": 3.989041095890411,
40
- "eval_accuracy": 0.8516746411483254,
41
- "eval_loss": 0.431730717420578,
42
- "eval_runtime": 1.0687,
43
- "eval_samples_per_second": 391.122,
44
- "eval_steps_per_second": 49.592,
45
- "step": 182
46
  },
47
  {
48
- "epoch": 4.997260273972603,
49
- "eval_accuracy": 0.8708133971291866,
50
- "eval_loss": 0.42494523525238037,
51
- "eval_runtime": 1.0737,
52
- "eval_samples_per_second": 389.322,
53
- "eval_steps_per_second": 49.364,
54
- "step": 228
55
  },
56
  {
57
- "epoch": 5.917808219178082,
58
- "eval_accuracy": 0.8755980861244019,
59
- "eval_loss": 0.43608835339546204,
60
- "eval_runtime": 1.0689,
61
- "eval_samples_per_second": 391.068,
62
- "eval_steps_per_second": 49.585,
63
- "step": 270
64
  }
65
  ],
66
- "logging_steps": 500,
67
  "max_steps": 270,
68
  "num_input_tokens_seen": 0,
69
- "num_train_epochs": 6,
70
- "save_steps": 500,
71
  "stateful_callbacks": {
72
  "TrainerControl": {
73
  "args": {
74
  "should_epoch_stop": false,
@@ -80,8 +146,8 @@
80
  "attributes": {}
81
  }
82
  },
83
- "total_flos": 3.9287263824e+16,
84
- "train_batch_size": 8,
85
  "trial_name": null,
86
  "trial_params": null
87
  }
 
1
  {
2
+ "best_metric": 0.8473173810316668,
3
+ "best_model_checkpoint": "distilhubert-finetuned-mixed-data/checkpoint-200",
4
+ "epoch": 30.0,
5
+ "eval_steps": 100,
6
  "global_step": 270,
7
  "is_hyper_param_search": false,
8
  "is_local_process_zero": true,
9
  "is_world_process_zero": true,
10
  "log_history": [
11
  {
12
+ "epoch": 5.555555555555555,
13
+ "grad_norm": 3.0120761394500732,
14
+ "learning_rate": 0.0002954973225032615,
15
+ "loss": 0.9823,
16
+ "step": 50
17
  },
18
  {
19
+ "epoch": 11.11111111111111,
20
+ "grad_norm": 1.183480143547058,
21
+ "learning_rate": 0.0002441718187008148,
22
+ "loss": 0.5938,
23
+ "step": 100
24
  },
25
  {
26
+ "epoch": 11.11111111111111,
27
+ "eval_accuracy": 0.8278388278388278,
28
+ "eval_confusion_matrix": [
29
+ [
30
+ 79,
31
+ 7,
32
+ 0,
33
+ 1
34
+ ],
35
+ [
36
+ 11,
37
+ 29,
38
+ 19,
39
+ 0
40
+ ],
41
+ [
42
+ 0,
43
+ 8,
44
+ 56,
45
+ 0
46
+ ],
47
+ [
48
+ 1,
49
+ 0,
50
+ 0,
51
+ 62
52
+ ]
53
+ ],
54
+ "eval_f1": 0.8205727670173567,
55
+ "eval_loss": 0.7099524140357971,
56
+ "eval_precision": 0.8212472631153949,
57
+ "eval_recall": 0.8278388278388278,
58
+ "eval_runtime": 3.0504,
59
+ "eval_samples_per_second": 89.496,
60
+ "eval_steps_per_second": 1.639,
61
+ "step": 100
62
  },
63
  {
64
+ "epoch": 16.666666666666668,
65
+ "grad_norm": 1.125301718711853,
66
+ "learning_rate": 0.0001548472927611466,
67
+ "loss": 0.3816,
68
+ "step": 150
69
  },
70
  {
71
+ "epoch": 22.22222222222222,
72
+ "grad_norm": 0.025351429358124733,
73
+ "learning_rate": 6.356684850666294e-05,
74
+ "loss": 0.3511,
75
+ "step": 200
76
  },
77
  {
78
+ "epoch": 22.22222222222222,
79
+ "eval_accuracy": 0.8498168498168498,
80
+ "eval_confusion_matrix": [
81
+ [
82
+ 76,
83
+ 11,
84
+ 0,
85
+ 0
86
+ ],
87
+ [
88
+ 7,
89
+ 35,
90
+ 17,
91
+ 0
92
+ ],
93
+ [
94
+ 0,
95
+ 5,
96
+ 59,
97
+ 0
98
+ ],
99
+ [
100
+ 1,
101
+ 0,
102
+ 0,
103
+ 62
104
+ ]
105
+ ],
106
+ "eval_f1": 0.8473173810316668,
107
+ "eval_loss": 0.7837256193161011,
108
+ "eval_precision": 0.8494091293737467,
109
+ "eval_recall": 0.8498168498168498,
110
+ "eval_runtime": 2.7557,
111
+ "eval_samples_per_second": 99.068,
112
+ "eval_steps_per_second": 1.814,
113
+ "step": 200
114
+ },
115
+ {
116
+ "epoch": 27.77777777777778,
117
+ "grad_norm": 0.0050039030611515045,
118
+ "learning_rate": 7.1628171992377025e-06,
119
+ "loss": 0.349,
120
+ "step": 250
121
  }
122
  ],
123
+ "logging_steps": 50,
124
  "max_steps": 270,
125
  "num_input_tokens_seen": 0,
126
+ "num_train_epochs": 30,
127
+ "save_steps": 100,
128
  "stateful_callbacks": {
129
+ "EarlyStoppingCallback": {
130
+ "args": {
131
+ "early_stopping_patience": 5,
132
+ "early_stopping_threshold": 0.001
133
+ },
134
+ "attributes": {
135
+ "early_stopping_patience_counter": 0
136
+ }
137
+ },
138
  "TrainerControl": {
139
  "args": {
140
  "should_epoch_stop": false,
 
146
  "attributes": {}
147
  }
148
  },
149
+ "total_flos": 7.4367208512e+16,
150
+ "train_batch_size": 64,
151
  "trial_name": null,
152
  "trial_params": null
153
  }
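Reviewer note: the new `trainer_state.json` logs an `eval_confusion_matrix` next to `eval_accuracy`, so the two can be cross-checked. The sketch below is a minimal check, not part of the commit; the function names are hypothetical. It assumes rows are true labels and columns are predictions, which the source does not state, although the row sums are identical at steps 100 and 200, as expected for a fixed evaluation set. The accuracy figure does not depend on that assumption; the per-class recall does.

```python
from typing import List

Matrix = List[List[int]]

# Confusion matrices copied from the log_history entries at steps 100 and 200.
STEP_100: Matrix = [
    [79, 7, 0, 1],
    [11, 29, 19, 0],
    [0, 8, 56, 0],
    [1, 0, 0, 62],
]
STEP_200: Matrix = [
    [76, 11, 0, 0],
    [7, 35, 17, 0],
    [0, 5, 59, 0],
    [1, 0, 0, 62],
]


def accuracy_from_confusion(matrix: Matrix) -> float:
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_recall(matrix: Matrix) -> List[float]:
    """Recall per class, assuming each row holds one true label."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


if __name__ == "__main__":
    for step, matrix in ((100, STEP_100), (200, STEP_200)):
        print(step, accuracy_from_confusion(matrix), per_class_recall(matrix))
```

Under the row-as-true-label assumption, the second class has the lowest recall at both evaluations, which the single accuracy number hides.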
checkpoint-270/training_args.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:3c9d41ebe9cd4a0039236a5c6d5456e94aa46eef860ab4ccc1c1578a2a2cbc20
3
  size 5240
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:07df27942fd7faa53557a040600995f1eec08c68543b8c4ec8f3d5cdb0edfa8f
3
  size 5240
checkpoint-40/config.json ADDED
@@ -0,0 +1,85 @@
1
+ {
2
+ "_name_or_path": "ntu-spml/distilhubert",
3
+ "activation_dropout": 0.1,
4
+ "apply_spec_augment": false,
5
+ "architectures": [
6
+ "HubertForSequenceClassification"
7
+ ],
8
+ "attention_dropout": 0.1,
9
+ "bos_token_id": 1,
10
+ "classifier_proj_size": 256,
11
+ "conv_bias": false,
12
+ "conv_dim": [
13
+ 512,
14
+ 512,
15
+ 512,
16
+ 512,
17
+ 512,
18
+ 512,
19
+ 512
20
+ ],
21
+ "conv_kernel": [
22
+ 10,
23
+ 3,
24
+ 3,
25
+ 3,
26
+ 3,
27
+ 2,
28
+ 2
29
+ ],
30
+ "conv_stride": [
31
+ 5,
32
+ 2,
33
+ 2,
34
+ 2,
35
+ 2,
36
+ 2,
37
+ 2
38
+ ],
39
+ "ctc_loss_reduction": "sum",
40
+ "ctc_zero_infinity": false,
41
+ "do_stable_layer_norm": false,
42
+ "eos_token_id": 2,
43
+ "feat_extract_activation": "gelu",
44
+ "feat_extract_norm": "group",
45
+ "feat_proj_dropout": 0.0,
46
+ "feat_proj_layer_norm": false,
47
+ "final_dropout": 0.0,
48
+ "finetuning_task": "audio-classification",
49
+ "hidden_act": "gelu",
50
+ "hidden_dropout": 0.1,
51
+ "hidden_size": 768,
52
+ "id2label": {
53
+ "0": "1s_normal",
54
+ "1": "1s_pain",
55
+ "2": "1s_hunger",
56
+ "3": "1s_asphyxia"
57
+ },
58
+ "initializer_range": 0.02,
59
+ "intermediate_size": 3072,
60
+ "label2id": {
61
+ "LABEL_0": 0,
62
+ "LABEL_1": 1,
63
+ "LABEL_2": 2,
64
+ "LABEL_3": 3
65
+ },
66
+ "layer_norm_eps": 1e-05,
67
+ "layerdrop": 0.0,
68
+ "mask_feature_length": 10,
69
+ "mask_feature_min_masks": 0,
70
+ "mask_feature_prob": 0.0,
71
+ "mask_time_length": 10,
72
+ "mask_time_min_masks": 2,
73
+ "mask_time_prob": 0.05,
74
+ "model_type": "hubert",
75
+ "num_attention_heads": 12,
76
+ "num_conv_pos_embedding_groups": 16,
77
+ "num_conv_pos_embeddings": 128,
78
+ "num_feat_extract_layers": 7,
79
+ "num_hidden_layers": 2,
80
+ "pad_token_id": 0,
81
+ "torch_dtype": "float32",
82
+ "transformers_version": "4.44.2",
83
+ "use_weighted_layer_sum": false,
84
+ "vocab_size": 32
85
+ }
checkpoint-40/model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1be8794b6883271bd8159aa3eaf8a5c7dc9c7ca6f46499b1ed35605284a38f55
3
+ size 94765560
checkpoint-40/optimizer.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:31f63daa8bc8fcbf3743000e82b74820975a2e1180676884de9720d17a4b6d43
3
+ size 189556666
checkpoint-40/rng_state.pth ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0a2dcb8bbd39308e164c4a45b626770cbe1576511da87fee690d854f4d63cf7b
3
+ size 14308
checkpoint-40/scheduler.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:46a69db9ff1f01320d03ad1559b4a859900f84b5bee395f23ef9947fe1827b6b
3
+ size 1064
checkpoint-40/trainer_state.json ADDED
@@ -0,0 +1,41 @@
1
+ {
2
+ "best_metric": null,
3
+ "best_model_checkpoint": null,
4
+ "epoch": 17.77777777777778,
5
+ "eval_steps": 50,
6
+ "global_step": 40,
7
+ "is_hyper_param_search": false,
8
+ "is_local_process_zero": true,
9
+ "is_world_process_zero": true,
10
+ "log_history": [],
11
+ "logging_steps": 50,
12
+ "max_steps": 40,
13
+ "num_input_tokens_seen": 0,
14
+ "num_train_epochs": 20,
15
+ "save_steps": 50,
16
+ "stateful_callbacks": {
17
+ "EarlyStoppingCallback": {
18
+ "args": {
19
+ "early_stopping_patience": 5,
20
+ "early_stopping_threshold": 0.001
21
+ },
22
+ "attributes": {
23
+ "early_stopping_patience_counter": 0
24
+ }
25
+ },
26
+ "TrainerControl": {
27
+ "args": {
28
+ "should_epoch_stop": false,
29
+ "should_evaluate": false,
30
+ "should_log": false,
31
+ "should_save": true,
32
+ "should_training_stop": true
33
+ },
34
+ "attributes": {}
35
+ }
36
+ },
37
+ "total_flos": 4.417912515456e+16,
38
+ "train_batch_size": 128,
39
+ "trial_name": null,
40
+ "trial_params": null
41
+ }
checkpoint-40/training_args.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8d2bccfa101dc4f5033bf5e8f3e4573e0e6972a8c965140412fe062d105477c5
3
+ size 5176
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:5fce500dc6425a6a822f0b384979a4c7071ebbc180a97b6c838222031a60dac9
3
  size 94765560
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:657233a1d355cbeb81915e1e45af39e18c323629f94e2cf7bf66d64986ad4832
3
  size 94765560
runs/Sep11_11-05-01_ubumarcos/events.out.tfevents.1726045650.ubumarcos ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f5833aaf9fc14662047e11b7046e638c418b87d5b4f45ba8d382ea9fb6519c58
3
+ size 6410
runs/Sep11_11-13-29_ubumarcos/events.out.tfevents.1726046010.ubumarcos ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:dbd39bd168c806c70d4a064cbb9d695de26c815fa5c9247e12a5f9e77a94c865
3
+ size 5917
runs/Sep11_11-13-48_ubumarcos/events.out.tfevents.1726046029.ubumarcos ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a25d521c4d3ecadbebba3a65d7e1a0bca3d6d2e999d769fe99f4b3f975676ff9
3
+ size 5917
runs/Sep11_11-18-22_ubumarcos/events.out.tfevents.1726046303.ubumarcos ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d77ff01477f3d4027f08a49f7d60c36e48afe60e395f1520aad651498484bdc6
3
+ size 5919
runs/Sep11_11-18-47_ubumarcos/events.out.tfevents.1726046328.ubumarcos ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1285a64f0fd3204492b385f31275977d40fe9a5dc713e9a206f59affcb7c019f
3
+ size 6267
runs/Sep11_11-18-47_ubumarcos/events.out.tfevents.1726046857.ubumarcos ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b5d2c310fe3d10f735a08ab0bbcb27f7c42b3175fd2431b3e2031f9e93ea83bb
3
+ size 503
runs/Sep11_11-29-05_ubumarcos/events.out.tfevents.1726046946.ubumarcos ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f12f757556236c32a577f0a8730d8b37d38a63fcf594862d9cc07e17c1ac95d7
3
+ size 6267
runs/Sep11_11-29-05_ubumarcos/events.out.tfevents.1726047291.ubumarcos ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3f906fb6a5a845124f954924b8752f0d0293e57a4bbcdc444e91b5b786f968e8
3
+ size 503
runs/Sep11_11-37-18_ubumarcos/events.out.tfevents.1726047439.ubumarcos ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1682e0902b337d8d4c1d488eb1f89156d94955b8c9a481ec916d66728a60ac01
3
+ size 6267
runs/Sep11_11-37-18_ubumarcos/events.out.tfevents.1726047783.ubumarcos ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c39fd6308708aca67fa69163d9f19073416227f9014fdb60cc7f5a8d08b4e7f6
3
+ size 503
runs/Sep11_12-04-17_ubumarcos/events.out.tfevents.1726049058.ubumarcos ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1c42c5aa1c3d325a09248ecce2e51f09ced348f1ca3a548e271eee33ce7dc7ea
3
+ size 6267
runs/Sep11_12-04-17_ubumarcos/events.out.tfevents.1726049410.ubumarcos ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:75d3ff418ce5b75df3454561f1d02cc1e0acc6d9c79bb0bff2e0f2363ad04bec
3
+ size 503
runs/Sep11_12-23-02_ubumarcos/events.out.tfevents.1726050183.ubumarcos ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5a9ff9fb5aea8ac15571130c34e7b3752f026805ffe74e72e2586ff57e602994
3
+ size 5919