Training in progress epoch 0

Files changed (8)

README.md +12 -13
config.json +1 -1
logs/train/events.out.tfevents.1714812347.8f02ccfbe71c.214.0.v2 +3 -0
logs/train/events.out.tfevents.1714812493.8f02ccfbe71c.214.1.v2 +3 -0
logs/train/events.out.tfevents.1714812839.8f02ccfbe71c.509.0.v2 +3 -0
logs/validation/events.out.tfevents.1714813637.8f02ccfbe71c.509.1.v2 +3 -0
tf_model.h5 +1 -1
tokenizer_config.json +1 -1

README.md CHANGED

@@ -15,13 +15,13 @@ probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [distilbert-base-cased](https://huggingface.co/distilbert-base-cased) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Train Loss: 1.7185
-- Train End Logits Accuracy: 0.5917
-- Train Start Logits Accuracy: 0.5638
-- Validation Loss: 2.0391
-- Validation End Logits Accuracy: 0.5252
-- Validation Start Logits Accuracy: 0.4886
-- Epoch: 1
+- Train Loss: 2.3101
+- Train End Logits Accuracy: 0.5128
+- Train Start Logits Accuracy: 0.5040
+- Validation Loss: 2.0900
+- Validation End Logits Accuracy: 0.5237
+- Validation Start Logits Accuracy: 0.4869
+- Epoch: 0
 ## Model description
@@ -40,20 +40,19 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': {'module': 'keras.optimizers.schedules', 'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 6786, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'registered_name': None}, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False}
+- optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': {'module': 'keras.optimizers.schedules', 'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 22620, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'registered_name': None}, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False}
 - training_precision: float32
 ### Training results
 | Train Loss | Train End Logits Accuracy | Train Start Logits Accuracy | Validation Loss | Validation End Logits Accuracy | Validation Start Logits Accuracy | Epoch |
 |:----------:|:-------------------------:|:---------------------------:|:---------------:|:------------------------------:|:--------------------------------:|:-----:|
-| 2.3543     | 0.5058                    | 0.4992                      | 2.0820          | 0.5253                         | 0.4917                           | 0     |
-| 1.7185     | 0.5917                    | 0.5638                      | 2.0391          | 0.5252                         | 0.4886                           | 1     |
+| 2.3101     | 0.5128                    | 0.5040                      | 2.0900          | 0.5237                         | 0.4869                           | 0     |
 ### Framework versions
-- Transformers 4.40.1
+- Transformers 4.39.3
 - TensorFlow 2.15.0
-- Datasets 2.19.0
-- Tokenizers 0.19.1
+- Datasets 2.18.0
+- Tokenizers 0.15.2
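
Note on the optimizer line in the README hunk: with `power: 1.0` and `cycle: False`, the `PolynomialDecay` config describes a linear ramp from `initial_learning_rate` down to `end_learning_rate` over `decay_steps` steps. The two sides of the diff differ only in `decay_steps` (6786 versus 22620). The following is a minimal pure-Python sketch of that formula, assuming the standard Keras definition of a non-cycling polynomial decay; the function name `polynomial_decay_lr` is illustrative and not part of this repository.

```python
def polynomial_decay_lr(step, initial_lr=2e-05, decay_steps=22620,
                        end_lr=0.0, power=1.0):
    """Learning rate at `step` for a non-cycling polynomial decay schedule."""
    # Past decay_steps the schedule stays at end_lr (cycle=False).
    step = min(step, decay_steps)
    fraction_left = 1.0 - step / decay_steps
    return (initial_lr - end_lr) * fraction_left ** power + end_lr


# Start, midpoint, and end of the schedule from the new README.
for s in (0, 11310, 22620):
    print(s, polynomial_decay_lr(s))

# Same step under the old decay_steps value, for comparison.
print(polynomial_decay_lr(3393, decay_steps=6786))
```

Passing `decay_steps=6786` reproduces the schedule from the removed line, so the same step count can be compared under both configurations.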

config.json CHANGED

@@ -19,6 +19,6 @@
   "seq_classif_dropout": 0.2,
   "sinusoidal_pos_embds": false,
   "tie_weights_": true,
-  "transformers_version": "4.40.1",
+  "transformers_version": "4.39.3",
   "vocab_size": 28996
 }

logs/train/events.out.tfevents.1714812347.8f02ccfbe71c.214.0.v2 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bdf83fa3c502b6cae61f0a8fb9561bf854aeb9aef5e58f8d204294a3d3fdbd99
+size 1441201

logs/train/events.out.tfevents.1714812493.8f02ccfbe71c.214.1.v2 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:78b5b6042c83cec3acae68c8f1d1c8e7ca4ad03c3da6225e8ca27d9952d5efcd
+size 1441406

logs/train/events.out.tfevents.1714812839.8f02ccfbe71c.509.0.v2 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f00767c9b41d904f8d3e6597d61436d01cff737a00ce11654764d48c90f5902f
+size 1441432

logs/validation/events.out.tfevents.1714813637.8f02ccfbe71c.509.1.v2 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9429176b2f9164e22ed3737db063baabb3209b0fef73dd4f89fe4ffb392d5299
+size 604
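
Note on the added log files: each one is committed as a Git LFS pointer, a short text file of `key value` lines (`version`, `oid`, `size`) that stands in for the real binary. The sketch below parses the validation-log pointer shown in this diff. `parse_lfs_pointer` is a hypothetical helper written for illustration, not part of Git LFS or any Hugging Face tooling.

```python
def parse_lfs_pointer(text):
    """Split a Git LFS pointer file into its version, hash, and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by a single space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:9429176b2f9164e22ed3737db063baabb3209b0fef73dd4f89fe4ffb392d5299
size 604
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

The same helper applies to the `tf_model.h5` pointer, where only the `oid` changes between the two sides while `size` stays at 260895720.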

tf_model.h5 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9b2c4e91a64fe2cd9972dec48b701c9b5e2f8c744d21fc97ac0bb3271abd673c
+oid sha256:2c58e00f15ef6d2c0995e6b9074d986b5a4314d5083ea5c760f580fc4e8452ca
 size 260895720

tokenizer_config.json CHANGED

@@ -45,7 +45,7 @@
   "cls_token": "[CLS]",
   "do_lower_case": false,
   "mask_token": "[MASK]",
-  "model_max_length": 1000000000000000019884624838656,
+  "model_max_length": 512,
   "pad_token": "[PAD]",
   "sep_token": "[SEP]",
   "strip_accents": null,
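
Note on the `model_max_length` change: the removed value is exactly what `int(1e30)` evaluates to in Python, since the float `1e30` is not exactly representable. It appears to be the placeholder that Transformers writes when no maximum length is known, which is an assumption here rather than something this diff states. The new value, 512, is an explicit limit. A quick check of the arithmetic:

```python
old_max_length = 1000000000000000019884624838656  # removed value
new_max_length = 512                              # added value

# The old value equals the integer conversion of the float 1e30.
is_placeholder = old_max_length == int(1e30)
print(is_placeholder)

# It is not the exact power of ten, because of float rounding.
differs_from_exact = old_max_length != 10 ** 30
print(differs_from_exact)
```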