gokuls/add_bert_12_layer_model_complete_training_new

@@ -15,8 +15,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: nan
-- Accuracy: 0.0000
 ## Model description
@@ -35,40 +35,35 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 0.0005
-- train_batch_size: 64
-- eval_batch_size: 64
 - seed: 10
 - distributed_type: multi-GPU
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 10000
 - num_epochs: 5
-- mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step   | Validation Loss | Accuracy |
 |:-------------:|:-----:|:------:|:---------------:|:--------:|
-| 0.0           | 0.11  | 10000  | nan             | 0.0000   |
-| 0.0           | 0.22  | 20000  | nan             | 0.0000   |
-| 0.0           | 0.33  | 30000  | nan             | 0.0000   |
-| 0.0           | 0.44  | 40000  | nan             | 0.0000   |
-| 0.0           | 0.55  | 50000  | nan             | 0.0000   |
-| 0.0           | 0.66  | 60000  | nan             | 0.0000   |
-| 0.0           | 0.76  | 70000  | nan             | 0.0000   |
-| 0.0           | 0.87  | 80000  | nan             | 0.0000   |
-| 0.0           | 0.98  | 90000  | nan             | 0.0000   |
-| 0.0           | 1.09  | 100000 | nan             | 0.0000   |
-| 0.0           | 1.2   | 110000 | nan             | 0.0000   |
-| 0.0           | 1.31  | 120000 | nan             | 0.0000   |
-| 0.0           | 1.42  | 130000 | nan             | 0.0000   |
-| 0.0           | 1.53  | 140000 | nan             | 0.0000   |
 ### Framework versions
-- Transformers 4.29.2
 - Pytorch 1.14.0a0+410ce96
 - Datasets 2.12.0
 - Tokenizers 0.13.3

 This model is a fine-tuned version of [](https://huggingface.co/) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 6.0612
+- Accuracy: 0.1510
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 1e-05
+- train_batch_size: 48
+- eval_batch_size: 48
 - seed: 10
 - distributed_type: multi-GPU
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 10000
 - num_epochs: 5
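
The scheduler settings in this list (linear schedule, 10000 warmup steps, peak learning rate 1e-05) can be illustrated with a small sketch. It assumes the usual "linear warmup, then linear decay to zero" shape. The function name `linear_warmup_lr` and the default `total_steps=600000` are hypothetical, since the card does not state the total number of optimizer steps.

```python
def linear_warmup_lr(step, peak_lr=1e-05, warmup_steps=10000, total_steps=600000):
    """Learning rate at `step`: linear warmup to peak_lr, then linear decay to 0.

    total_steps is an assumed placeholder; the model card does not report it.
    """
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to peak_lr over warmup_steps.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: ramp from peak_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for s in (0, 5000, 10000, 100000, 600000):
        print(s, linear_warmup_lr(s))
```

Under these assumptions the first evaluation row (step 10000) coincides with the end of warmup, so every later row in the results table would fall in the decay phase.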
 ### Training results
 | Training Loss | Epoch | Step   | Validation Loss | Accuracy |
 |:-------------:|:-----:|:------:|:---------------:|:--------:|
+| 6.6173        | 0.08  | 10000  | 6.5823          | 0.1275   |
+| 6.4901        | 0.16  | 20000  | 6.3925          | 0.1398   |
+| 6.2875        | 0.25  | 30000  | 6.3106          | 0.1439   |
+| 6.2516        | 0.33  | 40000  | 6.2511          | 0.1465   |
+| 6.2457        | 0.41  | 50000  | 6.2064          | 0.1478   |
+| 6.1715        | 0.49  | 60000  | 6.1655          | 0.1486   |
+| 6.0838        | 0.57  | 70000  | 6.1332          | 0.1494   |
+| 6.1575        | 0.66  | 80000  | 6.1049          | 0.1503   |
+| 6.0773        | 0.74  | 90000  | 6.0814          | 0.1504   |
+| 6.1838        | 0.82  | 100000 | 6.0612          | 0.1510   |
 ### Framework versions
+- Transformers 4.30.0
 - Pytorch 1.14.0a0+410ce96
 - Datasets 2.12.0
 - Tokenizers 0.13.3
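
The Epoch and Step columns of the updated results table allow a rough estimate of how many optimizer steps make up one epoch, and so how far through the 5 configured epochs the last logged row is. The sketch below is only an approximation: the Epoch column is rounded to two decimals, so it reports a range instead of a single value. The helper name `steps_per_epoch_bounds` is hypothetical.

```python
# (step, epoch) pairs copied from the first and last rows of the updated table.
FIRST_ROW = (10000, 0.08)
LAST_ROW = (100000, 0.82)
NUM_EPOCHS = 5  # from the hyperparameter list


def steps_per_epoch_bounds(step, epoch, rounding=0.005):
    """Return (low, high) bounds on steps per epoch.

    The epoch value is assumed to be rounded to two decimals, so the true
    value lies within +/- `rounding` of the printed one.
    """
    low = step / (epoch + rounding)
    high = step / (epoch - rounding)
    return low, high


if __name__ == "__main__":
    # The last row has the largest step count, so rounding matters least there.
    low, high = steps_per_epoch_bounds(*LAST_ROW)
    print(f"steps per epoch: between {low:.0f} and {high:.0f}")
    print(f"total steps for {NUM_EPOCHS} epochs: "
          f"between {low * NUM_EPOCHS:.0f} and {high * NUM_EPOCHS:.0f}")
```

Using the last row gives a much tighter range than the first row, because the same rounding error is spread over ten times as many steps.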