Upload folder using huggingface_hub
- README.md +10 -12
- checkpoint-100/model.safetensors +1 -1
- checkpoint-100/optimizer.pt +1 -1
- checkpoint-100/rng_state.pth +1 -1
- checkpoint-100/scheduler.pt +1 -1
- checkpoint-100/trainer_state.json +33 -33
- checkpoint-100/training_args.bin +1 -1
- checkpoint-120/config.json +85 -0
- checkpoint-120/model.safetensors +3 -0
- checkpoint-120/optimizer.pt +3 -0
- checkpoint-120/rng_state.pth +3 -0
- checkpoint-120/scheduler.pt +3 -0
- checkpoint-120/trainer_state.json +94 -0
- checkpoint-120/training_args.bin +3 -0
- model.safetensors +1 -1
- runs/Sep11_16-12-44_ubumarcos/events.out.tfevents.1726065155.ubumarcos +2 -2
- runs/Sep13_19-53-08_ubumarcos/events.out.tfevents.1726249989.ubumarcos +3 -0
- runs/Sep14_16-07-17_ubumarcos/events.out.tfevents.1726322838.ubumarcos +3 -0
- runs/Sep14_16-08-56_ubumarcos/events.out.tfevents.1726322937.ubumarcos +3 -0
- runs/Sep14_16-16-59_ubumarcos/events.out.tfevents.1726323420.ubumarcos +3 -0
- runs/Sep14_16-16-59_ubumarcos/events.out.tfevents.1726323549.ubumarcos +3 -0
- runs/Sep14_16-20-47_ubumarcos/events.out.tfevents.1726323648.ubumarcos +3 -0
- runs/Sep14_16-20-47_ubumarcos/events.out.tfevents.1726323777.ubumarcos +3 -0
- runs/Sep14_16-29-21_ubumarcos/events.out.tfevents.1726324163.ubumarcos +3 -0
- runs/Sep14_16-29-21_ubumarcos/events.out.tfevents.1726324291.ubumarcos +3 -0
- runs/Sep14_16-32-20_ubumarcos/events.out.tfevents.1726324341.ubumarcos +3 -0
- runs/Sep14_16-32-20_ubumarcos/events.out.tfevents.1726324471.ubumarcos +3 -0
- runs/Sep14_16-34-43_ubumarcos/events.out.tfevents.1726324484.ubumarcos +3 -0
- runs/Sep14_16-34-43_ubumarcos/events.out.tfevents.1726324615.ubumarcos +3 -0
- runs/Sep14_16-50-46_ubumarcos/events.out.tfevents.1726325447.ubumarcos +3 -0
- runs/Sep14_16-50-46_ubumarcos/events.out.tfevents.1726325703.ubumarcos +3 -0
- runs/Sep14_16-55-25_ubumarcos/events.out.tfevents.1726325726.ubumarcos +3 -0
- runs/Sep14_17-02-18_ubumarcos/events.out.tfevents.1726326139.ubumarcos +3 -0
- runs/Sep14_17-03-33_ubumarcos/events.out.tfevents.1726326215.ubumarcos +3 -0
- runs/Sep14_17-05-21_ubumarcos/events.out.tfevents.1726326322.ubumarcos +3 -0
- runs/Sep14_17-06-50_ubumarcos/events.out.tfevents.1726326411.ubumarcos +3 -0
- runs/Sep14_17-06-50_ubumarcos/events.out.tfevents.1726327100.ubumarcos +3 -0
- training_args.bin +1 -1
README.md
CHANGED
@@ -21,12 +21,12 @@ should probably proofread and complete it, then remove this comment. -->
|
|
21 |
|
22 |
This model is a fine-tuned version of [ntu-spml/distilhubert](https://huggingface.co/ntu-spml/distilhubert) on an unknown dataset.
|
23 |
It achieves the following results on the evaluation set:
|
24 |
-
- Loss: 0.
|
25 |
-
- Accuracy: 0.
|
26 |
-
- F1: 0.
|
27 |
-
- Precision: 0.
|
28 |
-
- Recall: 0.
|
29 |
-
- Confusion Matrix: [[
|
30 |
|
31 |
## Model description
|
32 |
|
@@ -46,24 +46,22 @@ More information needed
|
|
46 |
|
47 |
The following hyperparameters were used during training:
|
48 |
- learning_rate: 0.0003
|
49 |
-
- train_batch_size:
|
50 |
-
- eval_batch_size:
|
51 |
- seed: 123
|
52 |
- gradient_accumulation_steps: 2
|
53 |
-
- total_train_batch_size:
|
54 |
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
|
55 |
- lr_scheduler_type: cosine_with_restarts
|
56 |
- lr_scheduler_warmup_ratio: 0.1
|
57 |
- num_epochs: 30
|
58 |
-
- mixed_precision_training: Native AMP
|
59 |
- label_smoothing_factor: 0.1
|
60 |
|
61 |
### Training results
|
62 |
|
63 |
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall | Confusion Matrix |
|
64 |
|:-------------:|:-------:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|:--------------------------------------------------------------:|
|
65 |
-
| 0.
|
66 |
-
| 0.3511 | 22.2222 | 200 | 0.7837 | 0.8498 | 0.8473 | 0.8494 | 0.8498 | [[76, 11, 0, 0], [7, 35, 17, 0], [0, 5, 59, 0], [1, 0, 0, 62]] |
|
67 |
|
68 |
|
69 |
### Framework versions
|
|
|
21 |
|
22 |
This model is a fine-tuned version of [ntu-spml/distilhubert](https://huggingface.co/ntu-spml/distilhubert) on an unknown dataset.
|
23 |
It achieves the following results on the evaluation set:
|
24 |
+
- Loss: 0.7942
|
25 |
+
- Accuracy: 0.8242
|
26 |
+
- F1: 0.8278
|
27 |
+
- Precision: 0.8347
|
28 |
+
- Recall: 0.8242
|
29 |
+
- Confusion Matrix: [[51, 10, 0, 2], [5, 44, 9, 0], [1, 14, 67, 0], [7, 0, 0, 63]]
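The reported scores can be cross-checked against the confusion matrix. The following is a minimal sketch, assuming rows are true classes, columns are predicted classes, and precision/recall/F1 are support-weighted averages; the README does not state either convention, but these assumptions reproduce the numbers listed above. The function name `weighted_metrics` is hypothetical.

```python
# Confusion matrix copied from the README diff above.
# Assumption: rows are true classes, columns are predicted classes.
confusion = [
    [51, 10, 0, 2],
    [5, 44, 9, 0],
    [1, 14, 67, 0],
    [7, 0, 0, 63],
]


def weighted_metrics(cm):
    """Return accuracy and support-weighted precision, recall and F1."""
    n = len(cm)
    support = [sum(row) for row in cm]                                # true count per class
    predicted = [sum(cm[r][c] for r in range(n)) for c in range(n)]   # predicted count per class
    total = sum(support)
    accuracy = sum(cm[i][i] for i in range(n)) / total

    precision = recall = f1 = 0.0
    for i in range(n):
        tp = cm[i][i]
        p = tp / predicted[i] if predicted[i] else 0.0
        r = tp / support[i] if support[i] else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support[i] / total
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


metrics = weighted_metrics(confusion)
print(metrics)
```

With support-weighted averaging, recall is identical to accuracy by construction, which is consistent with both being reported as 0.8242.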
|
30 |
|
31 |
## Model description
|
32 |
|
|
|
46 |
|
47 |
The following hyperparameters were used during training:
|
48 |
- learning_rate: 0.0003
|
49 |
+
- train_batch_size: 128
|
50 |
+
- eval_batch_size: 128
|
51 |
- seed: 123
|
52 |
- gradient_accumulation_steps: 2
|
53 |
+
- total_train_batch_size: 256
|
54 |
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
|
55 |
- lr_scheduler_type: cosine_with_restarts
|
56 |
- lr_scheduler_warmup_ratio: 0.1
|
57 |
- num_epochs: 30
|
|
|
58 |
- label_smoothing_factor: 0.1
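As a sanity check on the hyperparameters above, the effective batch size and the logged learning rates can be recomputed by hand. This is a sketch under stated assumptions, not the training script: it assumes a single device, 12 warmup steps (10% of the `max_steps` value of 120 recorded in `trainer_state.json`), and one cosine cycle, which is not stated in the source. The function name `lr_at` is hypothetical.

```python
import math

TRAIN_BATCH_SIZE = 128
GRAD_ACCUM_STEPS = 2
BASE_LR = 3e-4      # learning_rate from the list above
MAX_STEPS = 120     # "max_steps" in trainer_state.json
WARMUP_STEPS = 12   # assumption: 10% of MAX_STEPS (lr_scheduler_warmup_ratio: 0.1)
NUM_CYCLES = 1      # assumption: not stated in the source

# Effective batch size per optimizer step (assumes one device).
total_train_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS


def lr_at(step):
    """Linear warmup followed by cosine decay with hard restarts."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, MAX_STEPS - WARMUP_STEPS)
    if progress >= 1.0:
        return 0.0
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * ((NUM_CYCLES * progress) % 1.0)))


print(total_train_batch_size)
print(lr_at(50), lr_at(100))
```

Under these assumptions the values at steps 50 and 100 agree closely with the `learning_rate` entries logged in `trainer_state.json`.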
|
59 |
|
60 |
### Training results
|
61 |
|
62 |
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall | Confusion Matrix |
|
63 |
|:-------------:|:-------:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|:--------------------------------------------------------------:|
|
64 |
+
| 0.3691 | 22.2222 | 100 | 0.7942 | 0.8242 | 0.8278 | 0.8347 | 0.8242 | [[51, 10, 0, 2], [5, 44, 9, 0], [1, 14, 67, 0], [7, 0, 0, 63]] |
|
|
|
65 |
|
66 |
|
67 |
### Framework versions
|
checkpoint-100/model.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 94765560
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:58cf140280d70389e1aece2ee9a69bdfb705db914d4944c5f4efd478daa1fd13
|
3 |
size 94765560
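Each large file in this commit is stored as a Git LFS pointer like the one above: a `version` line, an `oid` with the SHA-256 of the real file, and its `size` in bytes. A minimal sketch of parsing such a pointer and checking a downloaded file against it; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical.

```python
import hashlib

# Pointer text copied from the new side of the hunk above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:58cf140280d70389e1aece2ee9a69bdfb705db914d4944c5f4efd478daa1fd13
size 94765560
"""


def parse_lfs_pointer(text):
    """Split a pointer file into its version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer):
    """Return True if the file at `path` has the size and SHA-256 the pointer records."""
    sha = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            sha.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and sha.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
print(pointer)
```

Calling `matches_pointer("model.safetensors", pointer)` on a local copy would confirm that the download corresponds to this commit.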
|
checkpoint-100/optimizer.pt
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 189556666
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:142e3924a56b6cd9c933c65339620f75678c6c9d9e8e06ea806f34bf4ae56a9c
|
3 |
size 189556666
|
checkpoint-100/rng_state.pth
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 14308
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:17a31001969e690d6c1e7ecc15efb09ef0f6c296cc34685bfbe4d5ec7dbf4d37
|
3 |
size 14308
|
checkpoint-100/scheduler.pt
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 1064
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:233a3c28ff8d6558e54c4868ca9542c8aea9ce05af2702d9900bb59e7a606be3
|
3 |
size 1064
|
checkpoint-100/trainer_state.json
CHANGED
@@ -1,7 +1,7 @@
|
|
1 |
{
|
2 |
-
"best_metric": 0.
|
3 |
"best_model_checkpoint": "distilhubert-finetuned-mixed-data/checkpoint-100",
|
4 |
-
"epoch":
|
5 |
"eval_steps": 100,
|
6 |
"global_step": 100,
|
7 |
"is_hyper_param_search": false,
|
@@ -9,60 +9,60 @@
|
|
9 |
"is_world_process_zero": true,
|
10 |
"log_history": [
|
11 |
{
|
12 |
-
"epoch":
|
13 |
-
"grad_norm":
|
14 |
-
"learning_rate": 0.
|
15 |
-
"loss": 0.
|
16 |
"step": 50
|
17 |
},
|
18 |
{
|
19 |
-
"epoch":
|
20 |
-
"grad_norm":
|
21 |
-
"learning_rate":
|
22 |
-
"loss": 0.
|
23 |
"step": 100
|
24 |
},
|
25 |
{
|
26 |
-
"epoch":
|
27 |
-
"eval_accuracy": 0.
|
28 |
"eval_confusion_matrix": [
|
29 |
[
|
30 |
-
|
31 |
-
|
32 |
0,
|
33 |
-
|
34 |
],
|
35 |
[
|
36 |
-
|
37 |
-
|
38 |
-
|
39 |
0
|
40 |
],
|
41 |
[
|
42 |
-
|
43 |
-
|
44 |
-
|
45 |
0
|
46 |
],
|
47 |
[
|
48 |
-
|
49 |
0,
|
50 |
0,
|
51 |
-
|
52 |
]
|
53 |
],
|
54 |
-
"eval_f1": 0.
|
55 |
-
"eval_loss": 0.
|
56 |
-
"eval_precision": 0.
|
57 |
-
"eval_recall": 0.
|
58 |
-
"eval_runtime": 3.
|
59 |
-
"eval_samples_per_second":
|
60 |
-
"eval_steps_per_second":
|
61 |
"step": 100
|
62 |
}
|
63 |
],
|
64 |
"logging_steps": 50,
|
65 |
-
"max_steps":
|
66 |
"num_input_tokens_seen": 0,
|
67 |
"num_train_epochs": 30,
|
68 |
"save_steps": 100,
|
@@ -87,8 +87,8 @@
|
|
87 |
"attributes": {}
|
88 |
}
|
89 |
},
|
90 |
-
"total_flos":
|
91 |
-
"train_batch_size":
|
92 |
"trial_name": null,
|
93 |
"trial_params": null
|
94 |
}
|
|
|
1 |
{
|
2 |
+
"best_metric": 0.8277802155149765,
|
3 |
"best_model_checkpoint": "distilhubert-finetuned-mixed-data/checkpoint-100",
|
4 |
+
"epoch": 22.22222222222222,
|
5 |
"eval_steps": 100,
|
6 |
"global_step": 100,
|
7 |
"is_hyper_param_search": false,
|
|
|
9 |
"is_world_process_zero": true,
|
10 |
"log_history": [
|
11 |
{
|
12 |
+
"epoch": 11.11111111111111,
|
13 |
+
"grad_norm": 2.099414587020874,
|
14 |
+
"learning_rate": 0.00021731987703006933,
|
15 |
+
"loss": 0.7753,
|
16 |
"step": 50
|
17 |
},
|
18 |
{
|
19 |
+
"epoch": 22.22222222222222,
|
20 |
+
"grad_norm": 0.09225956350564957,
|
21 |
+
"learning_rate": 2.4676828288059558e-05,
|
22 |
+
"loss": 0.3691,
|
23 |
"step": 100
|
24 |
},
|
25 |
{
|
26 |
+
"epoch": 22.22222222222222,
|
27 |
+
"eval_accuracy": 0.8241758241758241,
|
28 |
"eval_confusion_matrix": [
|
29 |
[
|
30 |
+
51,
|
31 |
+
10,
|
32 |
0,
|
33 |
+
2
|
34 |
],
|
35 |
[
|
36 |
+
5,
|
37 |
+
44,
|
38 |
+
9,
|
39 |
0
|
40 |
],
|
41 |
[
|
42 |
+
1,
|
43 |
+
14,
|
44 |
+
67,
|
45 |
0
|
46 |
],
|
47 |
[
|
48 |
+
7,
|
49 |
0,
|
50 |
0,
|
51 |
+
63
|
52 |
]
|
53 |
],
|
54 |
+
"eval_f1": 0.8277802155149765,
|
55 |
+
"eval_loss": 0.7942458391189575,
|
56 |
+
"eval_precision": 0.8346819204947628,
|
57 |
+
"eval_recall": 0.8241758241758241,
|
58 |
+
"eval_runtime": 3.7898,
|
59 |
+
"eval_samples_per_second": 72.036,
|
60 |
+
"eval_steps_per_second": 0.792,
|
61 |
"step": 100
|
62 |
}
|
63 |
],
|
64 |
"logging_steps": 50,
|
65 |
+
"max_steps": 120,
|
66 |
"num_input_tokens_seen": 0,
|
67 |
"num_train_epochs": 30,
|
68 |
"save_steps": 100,
|
|
|
87 |
"attributes": {}
|
88 |
}
|
89 |
},
|
90 |
+
"total_flos": 5.511815490816e+16,
|
91 |
+
"train_batch_size": 128,
|
92 |
"trial_name": null,
|
93 |
"trial_params": null
|
94 |
}
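The `log_history` list in `trainer_state.json` mixes training entries (with `loss`) and evaluation entries (with `eval_*` keys). A minimal sketch of separating them, using a trimmed excerpt of the values shown above (the confusion matrix and runtime fields are omitted for brevity). That `best_metric` tracks `eval_f1` is an inference from the two values being equal, not something the file states.

```python
import json

# Trimmed excerpt of checkpoint-100/trainer_state.json as shown in the diff.
STATE_JSON = """
{
  "best_metric": 0.8277802155149765,
  "global_step": 100,
  "max_steps": 120,
  "log_history": [
    {"epoch": 11.11111111111111, "loss": 0.7753, "step": 50},
    {"epoch": 22.22222222222222, "loss": 0.3691, "step": 100},
    {"epoch": 22.22222222222222, "eval_accuracy": 0.8241758241758241,
     "eval_f1": 0.8277802155149765, "eval_loss": 0.7942458391189575, "step": 100}
  ]
}
"""

state = json.loads(STATE_JSON)

# Training entries carry "loss"; evaluation entries carry "eval_loss".
train_logs = [entry for entry in state["log_history"] if "loss" in entry]
eval_logs = [entry for entry in state["log_history"] if "eval_loss" in entry]

# Assumption: the best-model metric is F1, since best_metric equals eval_f1 here.
best_eval = max(eval_logs, key=lambda entry: entry["eval_f1"])

print([(entry["step"], entry["loss"]) for entry in train_logs])
print(best_eval["step"], best_eval["eval_f1"])
```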
|
checkpoint-100/training_args.bin
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 5240
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:04c36e688104e01b3bf86e2899b93cb9a0868d0f8f810b28125b46e47948bf14
|
3 |
size 5240
|
checkpoint-120/config.json
ADDED
@@ -0,0 +1,85 @@
|
1 |
+
{
|
2 |
+
"_name_or_path": "ntu-spml/distilhubert",
|
3 |
+
"activation_dropout": 0.1,
|
4 |
+
"apply_spec_augment": false,
|
5 |
+
"architectures": [
|
6 |
+
"HubertForSequenceClassification"
|
7 |
+
],
|
8 |
+
"attention_dropout": 0.1,
|
9 |
+
"bos_token_id": 1,
|
10 |
+
"classifier_proj_size": 256,
|
11 |
+
"conv_bias": false,
|
12 |
+
"conv_dim": [
|
13 |
+
512,
|
14 |
+
512,
|
15 |
+
512,
|
16 |
+
512,
|
17 |
+
512,
|
18 |
+
512,
|
19 |
+
512
|
20 |
+
],
|
21 |
+
"conv_kernel": [
|
22 |
+
10,
|
23 |
+
3,
|
24 |
+
3,
|
25 |
+
3,
|
26 |
+
3,
|
27 |
+
2,
|
28 |
+
2
|
29 |
+
],
|
30 |
+
"conv_stride": [
|
31 |
+
5,
|
32 |
+
2,
|
33 |
+
2,
|
34 |
+
2,
|
35 |
+
2,
|
36 |
+
2,
|
37 |
+
2
|
38 |
+
],
|
39 |
+
"ctc_loss_reduction": "sum",
|
40 |
+
"ctc_zero_infinity": false,
|
41 |
+
"do_stable_layer_norm": false,
|
42 |
+
"eos_token_id": 2,
|
43 |
+
"feat_extract_activation": "gelu",
|
44 |
+
"feat_extract_norm": "group",
|
45 |
+
"feat_proj_dropout": 0.0,
|
46 |
+
"feat_proj_layer_norm": false,
|
47 |
+
"final_dropout": 0.0,
|
48 |
+
"finetuning_task": "audio-classification",
|
49 |
+
"hidden_act": "gelu",
|
50 |
+
"hidden_dropout": 0.1,
|
51 |
+
"hidden_size": 768,
|
52 |
+
"id2label": {
|
53 |
+
"0": "1s_normal",
|
54 |
+
"1": "1s_pain",
|
55 |
+
"2": "1s_hunger",
|
56 |
+
"3": "1s_asphyxia"
|
57 |
+
},
|
58 |
+
"initializer_range": 0.02,
|
59 |
+
"intermediate_size": 3072,
|
60 |
+
"label2id": {
|
61 |
+
"LABEL_0": 0,
|
62 |
+
"LABEL_1": 1,
|
63 |
+
"LABEL_2": 2,
|
64 |
+
"LABEL_3": 3
|
65 |
+
},
|
66 |
+
"layer_norm_eps": 1e-05,
|
67 |
+
"layerdrop": 0.0,
|
68 |
+
"mask_feature_length": 10,
|
69 |
+
"mask_feature_min_masks": 0,
|
70 |
+
"mask_feature_prob": 0.0,
|
71 |
+
"mask_time_length": 10,
|
72 |
+
"mask_time_min_masks": 2,
|
73 |
+
"mask_time_prob": 0.05,
|
74 |
+
"model_type": "hubert",
|
75 |
+
"num_attention_heads": 12,
|
76 |
+
"num_conv_pos_embedding_groups": 16,
|
77 |
+
"num_conv_pos_embeddings": 128,
|
78 |
+
"num_feat_extract_layers": 7,
|
79 |
+
"num_hidden_layers": 2,
|
80 |
+
"pad_token_id": 0,
|
81 |
+
"torch_dtype": "float32",
|
82 |
+
"transformers_version": "4.44.2",
|
83 |
+
"use_weighted_layer_sum": false,
|
84 |
+
"vocab_size": 32
|
85 |
+
}
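In the config above, `id2label` carries class names while `label2id` still holds placeholder `LABEL_n` keys, so the two maps are not inverses of each other. A minimal sketch for detecting this and deriving a matching `label2id`; the helper names are hypothetical, and whether the mismatch matters depends on the code that consumes the config.

```python
# Label maps copied from checkpoint-120/config.json as shown in the diff.
id2label = {"0": "1s_normal", "1": "1s_pain", "2": "1s_hunger", "3": "1s_asphyxia"}
label2id = {"LABEL_0": 0, "LABEL_1": 1, "LABEL_2": 2, "LABEL_3": 3}


def rebuild_label2id(id_to_label):
    """Invert id2label so that every class name maps back to its integer id."""
    return {name: int(idx) for idx, name in id_to_label.items()}


def is_inverse(id_to_label, label_to_id):
    """Return True when label2id is exactly the inverse of id2label."""
    return rebuild_label2id(id_to_label) == label_to_id


consistent = is_inverse(id2label, label2id)
fixed_label2id = rebuild_label2id(id2label)

print(consistent)
print(fixed_label2id)
```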
|
checkpoint-120/model.safetensors
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:1d8db797f8abb6e63300b3c3787f2c734270d3192b7024f069b55ef27f94f271
|
3 |
+
size 94765560
|
checkpoint-120/optimizer.pt
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:a28cfe9149a4a414b2c23bd9f721401da187eb2ddd226a1ae973de2a767e72ba
|
3 |
+
size 189556666
|
checkpoint-120/rng_state.pth
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:654b7b30eb2788cf161ef643bc0017567b05c040c7c8800275f7d819599bc31f
|
3 |
+
size 14308
|
checkpoint-120/scheduler.pt
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:843e8fe068c808e64965cc91e30bb91768e712c534fccf13d58b96d94f1209bf
|
3 |
+
size 1064
|
checkpoint-120/trainer_state.json
ADDED
@@ -0,0 +1,94 @@
|
1 |
+
{
|
2 |
+
"best_metric": 0.8277802155149765,
|
3 |
+
"best_model_checkpoint": "distilhubert-finetuned-mixed-data/checkpoint-100",
|
4 |
+
"epoch": 26.666666666666668,
|
5 |
+
"eval_steps": 100,
|
6 |
+
"global_step": 120,
|
7 |
+
"is_hyper_param_search": false,
|
8 |
+
"is_local_process_zero": true,
|
9 |
+
"is_world_process_zero": true,
|
10 |
+
"log_history": [
|
11 |
+
{
|
12 |
+
"epoch": 11.11111111111111,
|
13 |
+
"grad_norm": 2.099414587020874,
|
14 |
+
"learning_rate": 0.00021731987703006933,
|
15 |
+
"loss": 0.7753,
|
16 |
+
"step": 50
|
17 |
+
},
|
18 |
+
{
|
19 |
+
"epoch": 22.22222222222222,
|
20 |
+
"grad_norm": 0.09225956350564957,
|
21 |
+
"learning_rate": 2.4676828288059558e-05,
|
22 |
+
"loss": 0.3691,
|
23 |
+
"step": 100
|
24 |
+
},
|
25 |
+
{
|
26 |
+
"epoch": 22.22222222222222,
|
27 |
+
"eval_accuracy": 0.8241758241758241,
|
28 |
+
"eval_confusion_matrix": [
|
29 |
+
[
|
30 |
+
51,
|
31 |
+
10,
|
32 |
+
0,
|
33 |
+
2
|
34 |
+
],
|
35 |
+
[
|
36 |
+
5,
|
37 |
+
44,
|
38 |
+
9,
|
39 |
+
0
|
40 |
+
],
|
41 |
+
[
|
42 |
+
1,
|
43 |
+
14,
|
44 |
+
67,
|
45 |
+
0
|
46 |
+
],
|
47 |
+
[
|
48 |
+
7,
|
49 |
+
0,
|
50 |
+
0,
|
51 |
+
63
|
52 |
+
]
|
53 |
+
],
|
54 |
+
"eval_f1": 0.8277802155149765,
|
55 |
+
"eval_loss": 0.7942458391189575,
|
56 |
+
"eval_precision": 0.8346819204947628,
|
57 |
+
"eval_recall": 0.8241758241758241,
|
58 |
+
"eval_runtime": 3.7898,
|
59 |
+
"eval_samples_per_second": 72.036,
|
60 |
+
"eval_steps_per_second": 0.792,
|
61 |
+
"step": 100
|
62 |
+
}
|
63 |
+
],
|
64 |
+
"logging_steps": 50,
|
65 |
+
"max_steps": 120,
|
66 |
+
"num_input_tokens_seen": 0,
|
67 |
+
"num_train_epochs": 30,
|
68 |
+
"save_steps": 100,
|
69 |
+
"stateful_callbacks": {
|
70 |
+
"EarlyStoppingCallback": {
|
71 |
+
"args": {
|
72 |
+
"early_stopping_patience": 5,
|
73 |
+
"early_stopping_threshold": 0.001
|
74 |
+
},
|
75 |
+
"attributes": {
|
76 |
+
"early_stopping_patience_counter": 0
|
77 |
+
}
|
78 |
+
},
|
79 |
+
"TrainerControl": {
|
80 |
+
"args": {
|
81 |
+
"should_epoch_stop": false,
|
82 |
+
"should_evaluate": false,
|
83 |
+
"should_log": false,
|
84 |
+
"should_save": true,
|
85 |
+
"should_training_stop": true
|
86 |
+
},
|
87 |
+
"attributes": {}
|
88 |
+
}
|
89 |
+
},
|
90 |
+
"total_flos": 6.619818670848e+16,
|
91 |
+
"train_batch_size": 128,
|
92 |
+
"trial_name": null,
|
93 |
+
"trial_params": null
|
94 |
+
}
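The `EarlyStoppingCallback` entry above records a patience of 5 and a threshold of 0.001, with the patience counter at 0. The sketch below illustrates the general patience rule with a simplified, hypothetical `PatienceTracker` class; it is not the library's implementation, and the metric sequence is invented for illustration. In this run the counter never advanced, and `should_training_stop` appears to be true simply because `global_step` reached `max_steps` (120).

```python
class PatienceTracker:
    """Simplified patience rule: stop after `patience` evaluations without
    an improvement larger than `threshold` over the best value seen so far."""

    def __init__(self, patience=5, threshold=0.001, greater_is_better=True):
        self.patience = patience
        self.threshold = threshold
        self.greater_is_better = greater_is_better
        self.best = None
        self.counter = 0

    def update(self, value):
        """Record one evaluation result; return True when training should stop."""
        if self.best is None:
            improved = True
        else:
            delta = value - self.best if self.greater_is_better else self.best - value
            improved = delta > self.threshold
        if improved:
            self.best = value
            self.counter = 0
        else:
            self.counter += 1
        return self.counter >= self.patience


# Hypothetical F1 values, one per evaluation; not taken from the run.
tracker = PatienceTracker(patience=5, threshold=0.001)
decisions = [tracker.update(v) for v in [0.80, 0.8278, 0.8279, 0.82, 0.81, 0.80, 0.79]]
print(decisions)
```

The third value improves on the best by only 0.0001, which is below the threshold, so it counts toward the patience budget rather than resetting it.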
|
checkpoint-120/training_args.bin
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:04c36e688104e01b3bf86e2899b93cb9a0868d0f8f810b28125b46e47948bf14
|
3 |
+
size 5240
|
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 94765560
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:58cf140280d70389e1aece2ee9a69bdfb705db914d4944c5f4efd478daa1fd13
|
3 |
size 94765560
|
runs/Sep11_16-12-44_ubumarcos/events.out.tfevents.1726065155.ubumarcos
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
-
size
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:41acb25ea83d4d052ef60cbac634ffd331f6acab73a0e8686008829f0e7b31a9
|
3 |
+
size 512
|
runs/Sep13_19-53-08_ubumarcos/events.out.tfevents.1726249989.ubumarcos
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:8b7930a8e4459eb47b5328c1b5e30d89ff8346cb1f4271988173ec984bfa9e8e
|
3 |
+
size 5940
|
runs/Sep14_16-07-17_ubumarcos/events.out.tfevents.1726322838.ubumarcos
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:d325f339559ba5e0776dae440f8a61994a1cdd3834c7a97959b6323dd31e5384
|
3 |
+
size 5939
|
runs/Sep14_16-08-56_ubumarcos/events.out.tfevents.1726322937.ubumarcos
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:24cb9264741d43a908761050e52831ae3e684db680705cb18c12affe05311a6c
|
3 |
+
size 5939
|
runs/Sep14_16-16-59_ubumarcos/events.out.tfevents.1726323420.ubumarcos
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:422bf20c8baed359f9d53d14d6233b9787fe5f2d149f0d2cddad7178d20025b8
|
3 |
+
size 6287
|
runs/Sep14_16-16-59_ubumarcos/events.out.tfevents.1726323549.ubumarcos
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:9a0d269252898a00bafb94438b55d577574bd1bca3e70bd0e261310d2f058f74
|
3 |
+
size 503
|
runs/Sep14_16-20-47_ubumarcos/events.out.tfevents.1726323648.ubumarcos
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:d893d8ff8727c98231a23f7d5e837886137d36b05ec61db4638d14adfcd881e7
|
3 |
+
size 6287
|
runs/Sep14_16-20-47_ubumarcos/events.out.tfevents.1726323777.ubumarcos
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:707ed4127dbf61a2650294365c8872721c86ea049c93edd03018bb76bcf23298
|
3 |
+
size 503
|
runs/Sep14_16-29-21_ubumarcos/events.out.tfevents.1726324163.ubumarcos
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:0b31c3d94122b47c9c3c886789a6e21fbe258ce29a30f2b9c912006d4a085218
|
3 |
+
size 6287
|
runs/Sep14_16-29-21_ubumarcos/events.out.tfevents.1726324291.ubumarcos
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:96b77523bddefded39d845fc6666652e8514daaa8785251783f4abe22c576849
|
3 |
+
size 503
|
runs/Sep14_16-32-20_ubumarcos/events.out.tfevents.1726324341.ubumarcos
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:0e8753489d542af8cd4cbbb14ebb1c2c5d2f32901ce10646e5e2af9e59d317c5
|
3 |
+
size 6287
|
runs/Sep14_16-32-20_ubumarcos/events.out.tfevents.1726324471.ubumarcos
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:672e5af548e05cc57039d009eefa4f17daacd54f0af72c6d2443d9b1b26cf6b0
|
3 |
+
size 503
|
runs/Sep14_16-34-43_ubumarcos/events.out.tfevents.1726324484.ubumarcos
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:5d2d005f924c4dfb09b2ba7b3382cf3bf9e0ff9ac7517a3c329d5b57e901db7e
|
3 |
+
size 6287
|
runs/Sep14_16-34-43_ubumarcos/events.out.tfevents.1726324615.ubumarcos
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:1fd4bca5fe257ec01088bbd5fd271a5b0b4df0a00f0eb386f781315d03ba930d
|
3 |
+
size 503
|
runs/Sep14_16-50-46_ubumarcos/events.out.tfevents.1726325447.ubumarcos
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:226d764f96374d32745973f2a033a19d4d49f4f55ee30a748ca58b7175c80746
|
3 |
+
size 6495
|
runs/Sep14_16-50-46_ubumarcos/events.out.tfevents.1726325703.ubumarcos
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:d1f3c03564384a0b7a95c1c5b0720f6187e99ff738cfaa214ba9eed4837d3dda
|
3 |
+
size 503
|
runs/Sep14_16-55-25_ubumarcos/events.out.tfevents.1726325726.ubumarcos
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:6fab9103bb09302f4a1c3f68effe3360ca543b35d8cb1ea70fbc27709a028d54
|
3 |
+
size 5940
|
runs/Sep14_17-02-18_ubumarcos/events.out.tfevents.1726326139.ubumarcos
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:63dbf1c9f1ac79baefdd0f471938a0f748e5749c1ff2783a18e74f6591be6b37
|
3 |
+
size 5940
|
runs/Sep14_17-03-33_ubumarcos/events.out.tfevents.1726326215.ubumarcos
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:9837dbddc3f665316e28bd2cb2fff8983951867a0a05467eb4199503ad969897
|
3 |
+
size 5942
|
runs/Sep14_17-05-21_ubumarcos/events.out.tfevents.1726326322.ubumarcos
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:d481f74702f97625cdfc65dadcf5104c072691ee5d120c4b4a3ffef89ff3f223
|
3 |
+
size 11844
|
runs/Sep14_17-06-50_ubumarcos/events.out.tfevents.1726326411.ubumarcos
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:4665947b0d19193d4067d8a629eb0dbf0702cbfc1837618455286fb6e0e1ff4a
|
3 |
+
size 7167
|
runs/Sep14_17-06-50_ubumarcos/events.out.tfevents.1726327100.ubumarcos
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:d62f4c346935efa7b0e7b0b76b2c91431f8f67b53dc9c6fc95a650671d29a17e
|
3 |
+
size 40
|
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 5240
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:04c36e688104e01b3bf86e2899b93cb9a0868d0f8f810b28125b46e47948bf14
|
3 |
size 5240
|