Update to model trained for 5 epochs
- README.md +49 -32
- all_results.json +10 -10
- eval_results.json +6 -6
- nncf_output.log +0 -0
- openvino_model.bin +2 -2
- openvino_model.xml +0 -0
- pytorch_model.bin +1 -1
- train_results.json +5 -5
- trainer_state.json +0 -0
- training_args.bin +1 -1
README.md
CHANGED
@@ -23,7 +23,7 @@ model-index:
|
|
23 |
metrics:
|
24 |
- name: Accuracy
|
25 |
type: accuracy
|
26 |
-
value: 0.
|
27 |
---
|
28 |
|
29 |
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
|
@@ -31,31 +31,14 @@ should probably proofread and complete it, then remove this comment. -->
|
|
31 |
|
32 |
# jpqd-bert-base-ft-sst2
|
33 |
|
34 |
-
|
35 |
-
> This model was trained for only 1 epoch and is shared for testing purposes.
|
36 |
|
37 |
-
This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the GLUE SST2 dataset.
|
38 |
It was compressed with [NNCF](https://github.com/openvinotoolkit/nncf) following the [Optimum JPQD text-classification
|
39 |
example](https://github.com/huggingface/optimum-intel/tree/main/examples/openvino/text-classification)
|
40 |
|
41 |
It achieves the following results on the evaluation set:
|
42 |
-
- Loss: 0.
|
43 |
-
- Accuracy: 0.
|
44 |
-
|
45 |
-
|
46 |
-
## Model description
|
47 |
-
|
48 |
-
More information needed
|
49 |
-
|
50 |
-
## Intended uses & limitations
|
51 |
-
|
52 |
-
More information needed
|
53 |
-
|
54 |
-
## Training and evaluation data
|
55 |
-
|
56 |
-
More information needed
|
57 |
-
|
58 |
-
## Training procedure
|
59 |
|
60 |
### Training hyperparameters
|
61 |
|
@@ -66,21 +49,55 @@ The following hyperparameters were used during training:
|
|
66 |
- seed: 42
|
67 |
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
|
68 |
- lr_scheduler_type: linear
|
69 |
-
- num_epochs:
|
70 |
- mixed_precision_training: Native AMP
|
71 |
|
72 |
### Training results
|
73 |
|
74 |
-
| Training Loss | Epoch | Step
|
75 |
-
|
76 |
-
| 0.
|
77 |
-
| 0.
|
78 |
-
| 0.
|
79 |
-
| 0.
|
80 |
-
| 0.
|
81 |
-
| 0.
|
82 |
-
| 0.
|
83 |
-
| 0.
|
84 |
|
85 |
|
86 |
### Framework versions
|
|
|
23 |
metrics:
|
24 |
- name: Accuracy
|
25 |
type: accuracy
|
26 |
+
value: 0.9162844036697247
|
27 |
---
|
28 |
|
29 |
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
|
|
|
31 |
|
32 |
# jpqd-bert-base-ft-sst2
|
33 |
|
34 |
+
This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the GLUE SST2 dataset.
|
|
|
35 |
|
|
|
36 |
It was compressed with [NNCF](https://github.com/openvinotoolkit/nncf) following the [Optimum JPQD text-classification
|
37 |
example](https://github.com/huggingface/optimum-intel/tree/main/examples/openvino/text-classification)
|
38 |
|
39 |
It achieves the following results on the evaluation set:
|
40 |
+
- Loss: 0.2798
|
41 |
+
- Accuracy: 0.9163
|
42 |
|
43 |
### Training hyperparameters
|
44 |
|
|
|
49 |
- seed: 42
|
50 |
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
|
51 |
- lr_scheduler_type: linear
|
52 |
+
- num_epochs: 5.0
|
53 |
- mixed_precision_training: Native AMP
|
54 |
|
55 |
### Training results
|
56 |
|
+ | Training Loss | Epoch | Step | Validation Loss | Accuracy |
+ |:-------------:|:-----:|:-----:|:---------------:|:--------:|
+ | 0.392 | 0.12 | 250 | 0.4535 | 0.8888 |
+ | 0.4413 | 0.24 | 500 | 0.4671 | 0.8899 |
+ | 0.29 | 0.36 | 750 | 0.3285 | 0.9128 |
+ | 0.2851 | 0.48 | 1000 | 0.2498 | 0.9151 |
+ | 0.3717 | 0.59 | 1250 | 0.2037 | 0.9243 |
+ | 0.2467 | 0.71 | 1500 | 0.2840 | 0.9174 |
+ | 0.2114 | 0.83 | 1750 | 0.2239 | 0.9243 |
+ | 0.1777 | 0.95 | 2000 | 0.1968 | 0.9266 |
+ | 2.6501 | 1.07 | 2250 | 2.8219 | 0.9255 |
+ | 6.4768 | 1.19 | 2500 | 6.5765 | 0.8979 |
+ | 9.3594 | 1.31 | 2750 | 9.4648 | 0.8819 |
+ | 11.5481 | 1.43 | 3000 | 11.5391 | 0.8567 |
+ | 12.7541 | 1.54 | 3250 | 12.8359 | 0.8578 |
+ | 13.6184 | 1.66 | 3500 | 13.6519 | 0.8429 |
+ | 13.9171 | 1.78 | 3750 | 14.0734 | 0.8475 |
+ | 13.9601 | 1.9 | 4000 | 14.1024 | 0.8578 |
+ | 0.2701 | 2.02 | 4250 | 0.3354 | 0.9048 |
+ | 0.2689 | 2.14 | 4500 | 0.3320 | 0.9048 |
+ | 0.1775 | 2.26 | 4750 | 0.2838 | 0.9163 |
+ | 0.1648 | 2.38 | 5000 | 0.2842 | 0.9128 |
+ | 0.1316 | 2.49 | 5250 | 0.2750 | 0.9163 |
+ | 0.2349 | 2.61 | 5500 | 0.2405 | 0.9232 |
+ | 0.066 | 2.73 | 5750 | 0.2695 | 0.9174 |
+ | 0.1285 | 2.85 | 6000 | 0.3017 | 0.9094 |
+ | 0.1813 | 2.97 | 6250 | 0.3472 | 0.9106 |
+ | 0.078 | 3.09 | 6500 | 0.2915 | 0.9140 |
+ | 0.0886 | 3.21 | 6750 | 0.2853 | 0.9151 |
+ | 0.117 | 3.33 | 7000 | 0.2689 | 0.9186 |
+ | 0.0894 | 3.44 | 7250 | 0.2748 | 0.9174 |
+ | 0.1023 | 3.56 | 7500 | 0.3279 | 0.9094 |
+ | 0.0495 | 3.68 | 7750 | 0.2988 | 0.9151 |
+ | 0.0899 | 3.8 | 8000 | 0.2796 | 0.9174 |
+ | 0.1102 | 3.92 | 8250 | 0.2667 | 0.9163 |
+ | 0.061 | 4.04 | 8500 | 0.2837 | 0.9174 |
+ | 0.0594 | 4.16 | 8750 | 0.2766 | 0.9151 |
+ | 0.1062 | 4.28 | 9000 | 0.2777 | 0.9140 |
+ | 0.0751 | 4.39 | 9250 | 0.2690 | 0.9220 |
+ | 0.0386 | 4.51 | 9500 | 0.2668 | 0.9163 |
+ | 0.0284 | 4.63 | 9750 | 0.2812 | 0.9186 |
+ | 0.1016 | 4.75 | 10000 | 0.2825 | 0.9163 |
+ | 0.0507 | 4.87 | 10250 | 0.2805 | 0.9140 |
+ | 0.0709 | 4.99 | 10500 | 0.2855 | 0.9140 |
|
101 |
|
102 |
|
103 |
### Framework versions
|
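The training results table added to README.md can be scanned programmatically, for example to find the evaluation step with the highest accuracy. The following is a minimal sketch, not part of the commit: it copies only four rows from the table as an excerpt, and the names `parse_rows` and `TABLE_EXCERPT` are hypothetical.

```python
# Four rows copied from the "Training results" table in the README diff.
# Columns: Training Loss | Epoch | Step | Validation Loss | Accuracy
TABLE_EXCERPT = """
| 0.392 | 0.12 | 250 | 0.4535 | 0.8888 |
| 0.1777 | 0.95 | 2000 | 0.1968 | 0.9266 |
| 13.9601 | 1.9 | 4000 | 14.1024 | 0.8578 |
| 0.0709 | 4.99 | 10500 | 0.2855 | 0.9140 |
"""


def parse_rows(table: str) -> list[dict]:
    """Turn pipe-delimited table rows into dictionaries (hypothetical helper)."""
    rows = []
    for line in table.strip().splitlines():
        # Drop the outer pipes, then split the remaining cells.
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        train_loss, epoch, step, val_loss, accuracy = cells
        rows.append(
            {
                "train_loss": float(train_loss),
                "epoch": float(epoch),
                "step": int(step),
                "val_loss": float(val_loss),
                "accuracy": float(accuracy),
            }
        )
    return rows


rows = parse_rows(TABLE_EXCERPT)
best = max(rows, key=lambda row: row["accuracy"])
print(f"best step in excerpt: {best['step']} (accuracy {best['accuracy']})")
```

Pasting the full table body in place of the excerpt gives the best checkpoint across the whole run.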
all_results.json
CHANGED
@@ -1,14 +1,14 @@
|
|
1 |
{
|
2 |
-
"epoch":
|
3 |
-
"eval_accuracy": 0.
|
4 |
-
"eval_loss": 0.
|
5 |
-
"eval_runtime": 22.
|
6 |
"eval_samples": 872,
|
7 |
-
"eval_samples_per_second": 39.
|
8 |
-
"eval_steps_per_second": 4.
|
9 |
-
"train_loss":
|
10 |
-
"train_runtime":
|
11 |
"train_samples": 67349,
|
12 |
-
"train_samples_per_second":
|
13 |
-
"train_steps_per_second": 1.
|
14 |
}
|
|
|
1 |
{
|
2 |
+
"epoch": 5.0,
|
3 |
+
"eval_accuracy": 0.9162844036697247,
|
4 |
+
"eval_loss": 0.27984118461608887,
|
5 |
+
"eval_runtime": 22.0602,
|
6 |
"eval_samples": 872,
|
7 |
+
"eval_samples_per_second": 39.528,
|
8 |
+
"eval_steps_per_second": 4.941,
|
9 |
+
"train_loss": 2.258239375927669,
|
10 |
+
"train_runtime": 6578.4214,
|
11 |
"train_samples": 67349,
|
12 |
+
"train_samples_per_second": 51.189,
|
13 |
+
"train_steps_per_second": 1.6
|
14 |
}
|
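The fields in the updated all_results.json can be cross-checked against each other. The sketch below is illustrative only and not part of the commit. It assumes that the evaluation throughput is `eval_samples / eval_runtime` and that the training throughput counts `train_samples * epoch` samples over `train_runtime`; the helper name `derived_metrics` is hypothetical.

```python
import json

# Values copied from the updated all_results.json in this commit.
ALL_RESULTS = json.loads("""
{
  "epoch": 5.0,
  "eval_accuracy": 0.9162844036697247,
  "eval_loss": 0.27984118461608887,
  "eval_runtime": 22.0602,
  "eval_samples": 872,
  "eval_samples_per_second": 39.528,
  "eval_steps_per_second": 4.941,
  "train_loss": 2.258239375927669,
  "train_runtime": 6578.4214,
  "train_samples": 67349,
  "train_samples_per_second": 51.189,
  "train_steps_per_second": 1.6
}
""")


def derived_metrics(results: dict) -> dict:
    """Recompute throughput and the correct-prediction count (hypothetical helper)."""
    return {
        # Assumption: throughput is samples divided by wall-clock runtime.
        "eval_samples_per_second": results["eval_samples"] / results["eval_runtime"],
        # Assumption: training throughput counts every epoch's samples.
        "train_samples_per_second": (
            results["train_samples"] * results["epoch"] / results["train_runtime"]
        ),
        # Accuracy times the evaluation set size gives the number of correct predictions.
        "eval_correct": round(results["eval_accuracy"] * results["eval_samples"]),
    }


derived = derived_metrics(ALL_RESULTS)
for name, value in derived.items():
    print(name, value)
```

Under these assumptions the recomputed throughputs agree with the reported values to three decimal places, and the reported accuracy corresponds to a whole number of correct predictions on the 872 evaluation samples.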
eval_results.json
CHANGED
@@ -1,9 +1,9 @@
|
|
1 |
{
|
2 |
-
"epoch":
|
3 |
-
"eval_accuracy": 0.
|
4 |
-
"eval_loss": 0.
|
5 |
-
"eval_runtime": 22.
|
6 |
"eval_samples": 872,
|
7 |
-
"eval_samples_per_second": 39.
|
8 |
-
"eval_steps_per_second": 4.
|
9 |
}
|
|
|
1 |
{
|
2 |
+
"epoch": 5.0,
|
3 |
+
"eval_accuracy": 0.9162844036697247,
|
4 |
+
"eval_loss": 0.27984118461608887,
|
5 |
+
"eval_runtime": 22.0602,
|
6 |
"eval_samples": 872,
|
7 |
+
"eval_samples_per_second": 39.528,
|
8 |
+
"eval_steps_per_second": 4.941
|
9 |
}
|
nncf_output.log
CHANGED
The diff for this file is too large to render.
See raw diff
|
|
openvino_model.bin
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
-
size
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:1a63ae7b71fd9c3734bf2de52f4a9a692b26e9a980b2530776e569786e39141d
|
3 |
+
size 75584740
|
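The three-line content shown for openvino_model.bin is a Git LFS pointer rather than the binary itself: it records the spec version, the object's hash, and its size in bytes. A minimal sketch of reading such a pointer follows; the function name `parse_lfs_pointer` is hypothetical, and the parser assumes the simple `key value` line layout seen in this diff.

```python
# Pointer text copied from the new side of the openvino_model.bin diff.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:1a63ae7b71fd9c3734bf2de52f4a9a692b26e9a980b2530776e569786e39141d
size 75584740
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its fields (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>"; split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER)
print(pointer["hash_algorithm"], pointer["size_bytes"])
```

The same parser applies to the pytorch_model.bin and training_args.bin pointers in this commit, whose sizes are unchanged while their digests differ.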
openvino_model.xml
CHANGED
The diff for this file is too large to render.
See raw diff
|
|
pytorch_model.bin
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 779394143
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:f4efc1ae0662c2b5924d475863eee93d9f7e666d0409607d5c514a010b5d352a
|
3 |
size 779394143
|
train_results.json
CHANGED
@@ -1,8 +1,8 @@
|
|
1 |
{
|
2 |
-
"epoch":
|
3 |
-
"train_loss":
|
4 |
-
"train_runtime":
|
5 |
"train_samples": 67349,
|
6 |
-
"train_samples_per_second":
|
7 |
-
"train_steps_per_second": 1.
|
8 |
}
|
|
|
1 |
{
|
2 |
+
"epoch": 5.0,
|
3 |
+
"train_loss": 2.258239375927669,
|
4 |
+
"train_runtime": 6578.4214,
|
5 |
"train_samples": 67349,
|
6 |
+
"train_samples_per_second": 51.189,
|
7 |
+
"train_steps_per_second": 1.6
|
8 |
}
|
trainer_state.json
CHANGED
The diff for this file is too large to render.
See raw diff
|
|
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 3579
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:09f3dd510fd602998d6109ba5bc629b379228700f8553c5a9420eb7d1e02ac27
|
3 |
size 3579
|