gokuls/bert_12_layer_model_v2_complete_training_new

@@ -15,8 +15,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 7.2742
-- Accuracy: 0.0465
+- Loss: 5.5272
+- Accuracy: 0.1983
 ## Model description
@@ -35,9 +35,9 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 0.001
-- train_batch_size: 64
-- eval_batch_size: 64
+- learning_rate: 1e-05
+- train_batch_size: 48
+- eval_batch_size: 48
 - seed: 10
 - distributed_type: multi-GPU
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
@@ -49,22 +49,17 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step   | Validation Loss | Accuracy |
 |:-------------:|:-----:|:------:|:---------------:|:--------:|
-| 7.2752        | 0.11  | 10000  | 7.2772          | 0.0466   |
-| 7.2774        | 0.22  | 20000  | 7.2771          | 0.0428   |
-| 7.2722        | 0.33  | 30000  | 7.2669          | 0.0429   |
-| 7.27          | 0.44  | 40000  | 7.2685          | 0.0466   |
-| 7.2687        | 0.55  | 50000  | 7.2724          | 0.0466   |
-| 7.2724        | 0.66  | 60000  | 7.2716          | 0.0465   |
-| 7.2689        | 0.76  | 70000  | 7.2689          | 0.0465   |
-| 7.268         | 0.87  | 80000  | 7.2708          | 0.0465   |
-| 7.2723        | 0.98  | 90000  | 7.2711          | 0.0465   |
-| 7.2724        | 1.09  | 100000 | 7.2714          | 0.0429   |
-| 7.2761        | 1.2   | 110000 | 7.2723          | 0.0465   |
-| 7.2694        | 1.31  | 120000 | 7.2685          | 0.0465   |
-| 7.2671        | 1.42  | 130000 | 7.2728          | 0.0466   |
-| 7.2664        | 1.53  | 140000 | 7.2714          | 0.0465   |
-| 7.2671        | 1.64  | 150000 | 7.2707          | 0.0465   |
-| 7.2663        | 1.75  | 160000 | 7.2742          | 0.0465   |
+| 6.5761        | 0.08  | 10000  | 6.5404          | 0.1269   |
+| 6.3286        | 0.16  | 20000  | 6.3053          | 0.1409   |
+| 6.2283        | 0.25  | 30000  | 6.2131          | 0.1449   |
+| 6.1756        | 0.33  | 40000  | 6.1536          | 0.1478   |
+| 6.1292        | 0.41  | 50000  | 6.1186          | 0.1487   |
+| 6.1008        | 0.49  | 60000  | 6.0845          | 0.1494   |
+| 6.0718        | 0.57  | 70000  | 6.0606          | 0.1504   |
+| 5.9008        | 0.66  | 80000  | 5.8655          | 0.1578   |
+| 5.797         | 0.74  | 90000  | 5.7561          | 0.1695   |
+| 5.6959        | 0.82  | 100000 | 5.6441          | 0.1832   |
+| 5.5955        | 0.9   | 110000 | 5.5272          | 0.1983   |
 ### Framework versions
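
The evaluation losses in this diff are easier to compare as perplexities. The following is a minimal sketch, assuming the reported loss is a mean per-token cross-entropy in nats (the card does not state this). The `perplexity` helper and the constant names are hypothetical and not part of the training code.

```python
import math

# Final evaluation losses copied from the two versions of the card in this diff.
OLD_EVAL_LOSS = 7.2742  # previous run (learning_rate 0.001, batch size 64)
NEW_EVAL_LOSS = 5.5272  # updated run (learning_rate 1e-05, batch size 48)


def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)


if __name__ == "__main__":
    old_ppl = perplexity(OLD_EVAL_LOSS)
    new_ppl = perplexity(NEW_EVAL_LOSS)
    print(f"old run perplexity: {old_ppl:.1f}")
    print(f"new run perplexity: {new_ppl:.1f}")
    # Ratio > 1 means the updated run is the better of the two under this metric.
    print(f"old / new ratio:    {old_ppl / new_ppl:.2f}")
```

Under that assumption, the drop in loss from 7.2742 to 5.5272 corresponds to a several-fold reduction in perplexity. The earlier run's loss stays near 7.27 across all logged steps, which suggests it was not learning.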