End of training

Files changed (8)

README.md +82 -25
config.json +4 -3
model.safetensors +2 -2
runs/Mar23_13-52-36_3015ac71de9b/events.out.tfevents.1711202013.3015ac71de9b.221.8 +3 -0
tokenizer.json +0 -0
tokenizer_config.json +1 -3
training_args.bin +2 -2
vocab.txt +0 -0

README.md CHANGED

@@ -1,6 +1,6 @@
 ---
 license: apache-2.0
-base_model: distilbert-base-uncased
 tags:
 - generated_from_trainer
 metrics:
@@ -18,13 +18,13 @@ should probably proofread and complete it, then remove this comment. -->
 # trainerL
-This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 4.6821
-- Precision: 0.6504
-- Recall: 0.6310
-- F1: 0.6292
-- Accuracy: 0.6310
 ## Model description
@@ -55,28 +55,85 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
 |:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
-| 0.0           | 0.28  | 15   | 4.7222          | 0.6565    | 0.6310 | 0.6300 | 0.6310   |
-| 0.0           | 0.57  | 30   | 4.7349          | 0.6565    | 0.6310 | 0.6300 | 0.6310   |
-| 0.0           | 0.85  | 45   | 4.7393          | 0.6565    | 0.6310 | 0.6300 | 0.6310   |
-| 0.0           | 1.13  | 60   | 4.7156          | 0.6453    | 0.6310 | 0.6279 | 0.6310   |
-| 0.0           | 1.42  | 75   | 4.7066          | 0.6453    | 0.6310 | 0.6279 | 0.6310   |
-| 0.0           | 1.7   | 90   | 4.7169          | 0.6453    | 0.6310 | 0.6279 | 0.6310   |
-| 0.0           | 1.98  | 105  | 4.6991          | 0.6453    | 0.6310 | 0.6279 | 0.6310   |
-| 0.0           | 2.26  | 120  | 4.6971          | 0.6453    | 0.6310 | 0.6279 | 0.6310   |
-| 0.0           | 2.55  | 135  | 4.6570          | 0.6504    | 0.6310 | 0.6292 | 0.6310   |
-| 0.0           | 2.83  | 150  | 4.6596          | 0.6504    | 0.6310 | 0.6292 | 0.6310   |
-| 0.0           | 3.11  | 165  | 4.6646          | 0.6504    | 0.6310 | 0.6292 | 0.6310   |
-| 0.0           | 3.4   | 180  | 4.6775          | 0.6504    | 0.6310 | 0.6292 | 0.6310   |
-| 0.0           | 3.68  | 195  | 4.6758          | 0.6504    | 0.6310 | 0.6292 | 0.6310   |
-| 0.0           | 3.96  | 210  | 4.6771          | 0.6504    | 0.6310 | 0.6292 | 0.6310   |
-| 0.0           | 4.25  | 225  | 4.6797          | 0.6504    | 0.6310 | 0.6292 | 0.6310   |
-| 0.0           | 4.53  | 240  | 4.6813          | 0.6504    | 0.6310 | 0.6292 | 0.6310   |
-| 0.0           | 4.81  | 255  | 4.6821          | 0.6504    | 0.6310 | 0.6292 | 0.6310   |
 ### Framework versions
-- Transformers 4.38.2
 - Pytorch 2.2.1+cu121
 - Datasets 2.18.0
 - Tokenizers 0.15.2

 ---
 license: apache-2.0
+base_model: distilbert-base-cased
 tags:
 - generated_from_trainer
 metrics:
 # trainerL
+This model is a fine-tuned version of [distilbert-base-cased](https://huggingface.co/distilbert-base-cased) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.1346
+- Precision: 0.8284
+- Recall: 0.8235
+- F1: 0.8240
+- Accuracy: 0.8235
 ## Model description
 | Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
 |:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
+| 0.7821        | 0.07  | 15   | 0.8793          | 0.7615    | 0.6975 | 0.6719 | 0.6975   |
+| 0.588         | 0.14  | 30   | 0.7151          | 0.8020    | 0.7759 | 0.7700 | 0.7759   |
+| 0.4638        | 0.2   | 45   | 0.6870          | 0.7940    | 0.7787 | 0.7765 | 0.7787   |
+| 0.5218        | 0.27  | 60   | 0.8146          | 0.7686    | 0.7283 | 0.7267 | 0.7283   |
+| 0.6686        | 0.34  | 75   | 0.6981          | 0.7978    | 0.7759 | 0.7758 | 0.7759   |
+| 0.5512        | 0.41  | 90   | 0.7200          | 0.7698    | 0.7423 | 0.7394 | 0.7423   |
+| 0.5709        | 0.47  | 105  | 0.6853          | 0.7978    | 0.7843 | 0.7827 | 0.7843   |
+| 0.5019        | 0.54  | 120  | 0.6668          | 0.8201    | 0.8151 | 0.8151 | 0.8151   |
+| 0.3757        | 0.61  | 135  | 0.6816          | 0.8097    | 0.7983 | 0.7991 | 0.7983   |
+| 0.4634        | 0.68  | 150  | 0.6682          | 0.8079    | 0.7899 | 0.7911 | 0.7899   |
+| 0.4107        | 0.74  | 165  | 0.7132          | 0.7975    | 0.7843 | 0.7821 | 0.7843   |
+| 0.5346        | 0.81  | 180  | 0.7647          | 0.7978    | 0.7703 | 0.7689 | 0.7703   |
+| 0.3629        | 0.88  | 195  | 0.7391          | 0.7925    | 0.7731 | 0.7732 | 0.7731   |
+| 0.3951        | 0.95  | 210  | 0.8193          | 0.8081    | 0.7927 | 0.7931 | 0.7927   |
+| 0.3729        | 1.01  | 225  | 0.8174          | 0.8081    | 0.7899 | 0.7876 | 0.7899   |
+| 0.1666        | 1.08  | 240  | 0.7480          | 0.7890    | 0.7843 | 0.7852 | 0.7843   |
+| 0.3315        | 1.15  | 255  | 0.8574          | 0.7918    | 0.7787 | 0.7782 | 0.7787   |
+| 0.1392        | 1.22  | 270  | 0.8227          | 0.8192    | 0.8039 | 0.8034 | 0.8039   |
+| 0.2186        | 1.28  | 285  | 0.8934          | 0.8208    | 0.8123 | 0.8121 | 0.8123   |
+| 0.2085        | 1.35  | 300  | 0.7850          | 0.8356    | 0.8263 | 0.8278 | 0.8263   |
+| 0.2709        | 1.42  | 315  | 0.9394          | 0.8115    | 0.8039 | 0.8024 | 0.8039   |
+| 0.1109        | 1.49  | 330  | 0.8446          | 0.8126    | 0.8039 | 0.8024 | 0.8039   |
+| 0.1927        | 1.55  | 345  | 1.0934          | 0.7690    | 0.7535 | 0.7521 | 0.7535   |
+| 0.1932        | 1.62  | 360  | 0.8842          | 0.8197    | 0.8179 | 0.8169 | 0.8179   |
+| 0.139         | 1.69  | 375  | 0.9567          | 0.8103    | 0.8039 | 0.8044 | 0.8039   |
+| 0.1989        | 1.76  | 390  | 0.9964          | 0.7918    | 0.7815 | 0.7827 | 0.7815   |
+| 0.1533        | 1.82  | 405  | 1.0390          | 0.7849    | 0.7703 | 0.7692 | 0.7703   |
+| 0.2209        | 1.89  | 420  | 1.0088          | 0.8096    | 0.8039 | 0.8039 | 0.8039   |
+| 0.1777        | 1.96  | 435  | 1.1335          | 0.7974    | 0.7843 | 0.7859 | 0.7843   |
+| 0.1449        | 2.03  | 450  | 0.9708          | 0.8071    | 0.8011 | 0.8003 | 0.8011   |
+| 0.212         | 2.09  | 465  | 1.0755          | 0.8066    | 0.7899 | 0.7889 | 0.7899   |
+| 0.0637        | 2.16  | 480  | 0.9715          | 0.8159    | 0.8123 | 0.8111 | 0.8123   |
+| 0.0571        | 2.23  | 495  | 1.1385          | 0.7922    | 0.7871 | 0.7862 | 0.7871   |
+| 0.005         | 2.3   | 510  | 1.1527          | 0.8012    | 0.7955 | 0.7952 | 0.7955   |
+| 0.0419        | 2.36  | 525  | 1.1564          | 0.8056    | 0.7983 | 0.7995 | 0.7983   |
+| 0.0054        | 2.43  | 540  | 1.1604          | 0.8103    | 0.8011 | 0.8021 | 0.8011   |
+| 0.1335        | 2.5   | 555  | 1.0291          | 0.8274    | 0.8235 | 0.8240 | 0.8235   |
+| 0.0137        | 2.57  | 570  | 1.1099          | 0.8299    | 0.8207 | 0.8211 | 0.8207   |
+| 0.0514        | 2.64  | 585  | 1.1275          | 0.8211    | 0.8151 | 0.8158 | 0.8151   |
+| 0.0742        | 2.7   | 600  | 1.0709          | 0.8282    | 0.8235 | 0.8240 | 0.8235   |
+| 0.0401        | 2.77  | 615  | 1.0292          | 0.8230    | 0.8207 | 0.8211 | 0.8207   |
+| 0.0838        | 2.84  | 630  | 1.0180          | 0.8242    | 0.8207 | 0.8207 | 0.8207   |
+| 0.0191        | 2.91  | 645  | 0.9939          | 0.8386    | 0.8347 | 0.8349 | 0.8347   |
+| 0.1275        | 2.97  | 660  | 1.0027          | 0.8268    | 0.8235 | 0.8236 | 0.8235   |
+| 0.0076        | 3.04  | 675  | 1.0779          | 0.8301    | 0.8235 | 0.8229 | 0.8235   |
+| 0.0581        | 3.11  | 690  | 1.0325          | 0.8257    | 0.8207 | 0.8204 | 0.8207   |
+| 0.0072        | 3.18  | 705  | 1.0599          | 0.8184    | 0.8151 | 0.8147 | 0.8151   |
+| 0.0442        | 3.24  | 720  | 1.1332          | 0.8107    | 0.8039 | 0.8040 | 0.8039   |
+| 0.001         | 3.31  | 735  | 1.1529          | 0.8029    | 0.7955 | 0.7955 | 0.7955   |
+| 0.0367        | 3.38  | 750  | 1.0967          | 0.8162    | 0.8123 | 0.8120 | 0.8123   |
+| 0.0052        | 3.45  | 765  | 1.1101          | 0.8175    | 0.8123 | 0.8124 | 0.8123   |
+| 0.0093        | 3.51  | 780  | 1.0600          | 0.8183    | 0.8123 | 0.8133 | 0.8123   |
+| 0.0079        | 3.58  | 795  | 1.0572          | 0.8191    | 0.8123 | 0.8136 | 0.8123   |
+| 0.0019        | 3.65  | 810  | 1.0715          | 0.8263    | 0.8207 | 0.8218 | 0.8207   |
+| 0.009         | 3.72  | 825  | 1.1091          | 0.8271    | 0.8207 | 0.8217 | 0.8207   |
+| 0.0063        | 3.78  | 840  | 1.1287          | 0.8254    | 0.8179 | 0.8189 | 0.8179   |
+| 0.0118        | 3.85  | 855  | 1.1054          | 0.8148    | 0.8095 | 0.8101 | 0.8095   |
+| 0.0148        | 3.92  | 870  | 1.1412          | 0.8190    | 0.8123 | 0.8128 | 0.8123   |
+| 0.0207        | 3.99  | 885  | 1.1313          | 0.8199    | 0.8123 | 0.8131 | 0.8123   |
+| 0.0011        | 4.05  | 900  | 1.1114          | 0.8283    | 0.8235 | 0.8238 | 0.8235   |
+| 0.0008        | 4.12  | 915  | 1.1179          | 0.8312    | 0.8263 | 0.8264 | 0.8263   |
+| 0.0012        | 4.19  | 930  | 1.1181          | 0.8277    | 0.8235 | 0.8235 | 0.8235   |
+| 0.0008        | 4.26  | 945  | 1.1248          | 0.8253    | 0.8207 | 0.8210 | 0.8207   |
+| 0.0587        | 4.32  | 960  | 1.1230          | 0.8306    | 0.8263 | 0.8265 | 0.8263   |
+| 0.0018        | 4.39  | 975  | 1.1217          | 0.8342    | 0.8291 | 0.8296 | 0.8291   |
+| 0.0125        | 4.46  | 990  | 1.1278          | 0.8316    | 0.8263 | 0.8269 | 0.8263   |
+| 0.0021        | 4.53  | 1005 | 1.1358          | 0.8316    | 0.8263 | 0.8269 | 0.8263   |
+| 0.0007        | 4.59  | 1020 | 1.1398          | 0.8266    | 0.8207 | 0.8211 | 0.8207   |
+| 0.0308        | 4.66  | 1035 | 1.1394          | 0.8316    | 0.8263 | 0.8269 | 0.8263   |
+| 0.0011        | 4.73  | 1050 | 1.1385          | 0.8316    | 0.8263 | 0.8269 | 0.8263   |
+| 0.0011        | 4.8   | 1065 | 1.1362          | 0.8284    | 0.8235 | 0.8240 | 0.8235   |
+| 0.0007        | 4.86  | 1080 | 1.1351          | 0.8284    | 0.8235 | 0.8240 | 0.8235   |
+| 0.0006        | 4.93  | 1095 | 1.1350          | 0.8284    | 0.8235 | 0.8240 | 0.8235   |
+| 0.0016        | 5.0   | 1110 | 1.1346          | 0.8284    | 0.8235 | 0.8240 | 0.8235   |
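
The added training log shows the final checkpoint (step 1110) is not the strongest row on either validation metric. As a rough sketch, the snippet below parses a handful of rows copied from the table (with the leading `+` diff marker stripped) and picks the best step by F1 and by validation loss. The `COLUMNS` names and the `parse_rows` helper are made up for illustration and are not part of the commit.

```python
# A subset of rows copied from the new README table, for brevity.
LOG = """
| 0.5019        | 0.54  | 120  | 0.6668          | 0.8201    | 0.8151 | 0.8151 | 0.8151   |
| 0.2085        | 1.35  | 300  | 0.7850          | 0.8356    | 0.8263 | 0.8278 | 0.8263   |
| 0.0191        | 2.91  | 645  | 0.9939          | 0.8386    | 0.8347 | 0.8349 | 0.8347   |
| 0.0018        | 4.39  | 975  | 1.1217          | 0.8342    | 0.8291 | 0.8296 | 0.8291   |
| 0.0016        | 5.0   | 1110 | 1.1346          | 0.8284    | 0.8235 | 0.8240 | 0.8235   |
"""

# Hypothetical short names for the eight table columns, in order.
COLUMNS = ("train_loss", "epoch", "step", "val_loss",
           "precision", "recall", "f1", "accuracy")


def parse_rows(table):
    """Turn markdown table rows into a list of dicts keyed by COLUMNS."""
    rows = []
    for line in table.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        row = dict(zip(COLUMNS, map(float, cells)))
        row["step"] = int(row["step"])
        rows.append(row)
    return rows


rows = parse_rows(LOG)
best_f1 = max(rows, key=lambda r: r["f1"])
best_loss = min(rows, key=lambda r: r["val_loss"])

print("best F1:", best_f1["step"], best_f1["f1"])
print("lowest validation loss:", best_loss["step"], best_loss["val_loss"])
```

Across the full table, step 645 has the highest F1 (0.8349) and step 120 the lowest validation loss (0.6668), while validation loss drifts upward as training loss approaches zero. `TrainingArguments` offers `load_best_model_at_end` and `metric_for_best_model` for keeping such a checkpoint; whether this run set them is not visible in the diff.
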
 ### Framework versions
+- Transformers 4.39.1
 - Pytorch 2.2.1+cu121
 - Datasets 2.18.0
 - Tokenizers 0.15.2
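
In every row of both result tables the Recall and Accuracy columns are identical. That is the pattern support-weighted recall would produce, since weighting each class's recall by its share of the samples reduces to overall accuracy. The diff does not say which averaging mode the metric function used, so treat this as a plausible reading rather than a confirmed one. A minimal pure-Python sketch with made-up labels:

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_recall(y_true, y_pred):
    """Per-class recall, averaged with weights equal to class support."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    n = len(y_true)
    return sum((support[c] / n) * (hits[c] / support[c]) for c in support)


# Toy labels, invented for illustration only (not from this model's data).
y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 0, 1, 1, 2, 2, 0, 2]

acc = accuracy(y_true, y_pred)
rec = weighted_recall(y_true, y_pred)
print(acc, rec)
```

The two values agree up to floating-point rounding for any label set, because each class's support cancels out of its term in the weighted sum.
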

config.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "distilbert-base-uncased",
   "activation": "gelu",
   "architectures": [
     "DistilBertForSequenceClassification"
@@ -31,6 +31,7 @@
   "model_type": "distilbert",
   "n_heads": 12,
   "n_layers": 6,
   "pad_token_id": 0,
   "problem_type": "single_label_classification",
   "qa_dropout": 0.1,
@@ -38,6 +39,6 @@
   "sinusoidal_pos_embds": false,
   "tie_weights_": true,
   "torch_dtype": "float32",
-  "transformers_version": "4.38.2",
-  "vocab_size": 30522
 }

 {
+  "_name_or_path": "distilbert-base-cased",
   "activation": "gelu",
   "architectures": [
     "DistilBertForSequenceClassification"
   "model_type": "distilbert",
   "n_heads": 12,
   "n_layers": 6,
+  "output_past": true,
   "pad_token_id": 0,
   "problem_type": "single_label_classification",
   "qa_dropout": 0.1,
   "sinusoidal_pos_embds": false,
   "tie_weights_": true,
   "torch_dtype": "float32",
+  "transformers_version": "4.39.1",
+  "vocab_size": 28996
 }

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f46e0ea74e12e325061e890bcb2c02997b8aaa17a36305adcee103f3f043bad6
-size 267847948

 version https://git-lfs.github.com/spec/v1
+oid sha256:06327ecc811b5a6912ad509715595c03f0aa1176b97eba74c3136d75f35acc2e
+size 263160068
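
The smaller `model.safetensors` lines up with the `vocab_size` change in `config.json`: a cased vocabulary with fewer entries means fewer rows in the word-embedding matrix. The arithmetic below is a back-of-the-envelope check, assuming a hidden size of 768 (the usual DistilBERT base value; the `dim` field is not shown in this diff) and 4 bytes per parameter for the `float32` dtype the config reports.

```python
# Values taken from the diff.
OLD_VOCAB, NEW_VOCAB = 30522, 28996          # "vocab_size" before / after
OLD_SIZE, NEW_SIZE = 267847948, 263160068    # model.safetensors LFS sizes (bytes)

# Assumptions, not shown in the diff.
HIDDEN_DIM = 768        # assumed DistilBERT base embedding width
BYTES_PER_PARAM = 4     # float32

removed_rows = OLD_VOCAB - NEW_VOCAB
expected_delta = removed_rows * HIDDEN_DIM * BYTES_PER_PARAM
actual_delta = OLD_SIZE - NEW_SIZE

print("embedding rows removed:", removed_rows)
print("expected size drop (bytes):", expected_delta)
print("actual size drop (bytes):", actual_delta)
print("unexplained residual (bytes):", actual_delta - expected_delta)
```

Under those assumptions the embedding matrix accounts for all but 8 bytes of the size difference. The diff does not explain the remainder.
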

runs/Mar23_13-52-36_3015ac71de9b/events.out.tfevents.1711202013.3015ac71de9b.221.8 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5ca6f7f494109f1a87cbda324259c65acd67da0e747f00e1a2902cb6902fff8a
+size 55659

tokenizer.json ADDED

The diff for this file is too large to render. See raw diff

tokenizer_config.json CHANGED

@@ -43,11 +43,9 @@
   },
   "clean_up_tokenization_spaces": true,
   "cls_token": "[CLS]",
-  "do_basic_tokenize": true,
-  "do_lower_case": true,
   "mask_token": "[MASK]",
   "model_max_length": 512,
-  "never_split": null,
   "pad_token": "[PAD]",
   "sep_token": "[SEP]",
   "strip_accents": null,

   },
   "clean_up_tokenization_spaces": true,
   "cls_token": "[CLS]",
+  "do_lower_case": false,
   "mask_token": "[MASK]",
   "model_max_length": 512,
   "pad_token": "[PAD]",
   "sep_token": "[SEP]",
   "strip_accents": null,

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d2115da520c41a9fcd77bb151c68f7019a71a96eb6f642a6f21401069003a5dc
-size 4856

 version https://git-lfs.github.com/spec/v1
+oid sha256:44a5e252b5c8e128c6368dd70996952ddbb6683541c83719c510c29f09800a38
+size 4920

vocab.txt CHANGED

The diff for this file is too large to render. See raw diff