Goshective committed
Commit f6f5239 · verified · 1 parent: d4a0c1e

Model save

Files changed (1): README.md (+15 −14)

README.md CHANGED
@@ -18,12 +18,12 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [sberbank-ai/ruRoberta-large](https://huggingface.co/sberbank-ai/ruRoberta-large) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.3311
-- Accuracy: 0.3980
-- Top 2 Accuracy: 0.5651
-- Top 3 Accuracy: 0.6403
-- Roc Auc: 0.9489
-- F1: 0.3823
+- Loss: 3.3205
+- Accuracy: 0.3316
+- Top 2 Accuracy: 0.4452
+- Top 3 Accuracy: 0.5217
+- Roc Auc: 0.9245
+- F1: 0.2788
 
 ## Model description
 
@@ -42,23 +42,24 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 5e-05
-- train_batch_size: 8
-- eval_batch_size: 8
+- learning_rate: 1e-05
+- train_batch_size: 12
+- eval_batch_size: 16
 - seed: 42
-- gradient_accumulation_steps: 2
-- total_train_batch_size: 16
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- num_epochs: 2
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 4
 - mixed_precision_training: Native AMP
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Accuracy | Top 2 Accuracy | Top 3 Accuracy | Roc Auc | F1 |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|:--------------:|:--------------:|:-------:|:------:|
-| No log | 1.0 | 196 | 2.6514 | 0.3622 | 0.5102 | 0.5880 | 0.9384 | 0.3265 |
-| No log | 2.0 | 392 | 2.3311 | 0.3980 | 0.5651 | 0.6403 | 0.9489 | 0.3823 |
+| 4.9101 | 1.0 | 262 | 4.6292 | 0.0816 | 0.1301 | 0.1773 | 0.7633 | 0.0566 |
+| 4.4492 | 2.0 | 524 | 3.8332 | 0.2564 | 0.3737 | 0.4528 | 0.8992 | 0.2009 |
+| 3.9241 | 3.0 | 786 | 3.4448 | 0.3163 | 0.4222 | 0.5089 | 0.9194 | 0.2619 |
+| 3.0492 | 4.0 | 1048 | 3.3205 | 0.3316 | 0.4452 | 0.5217 | 0.9245 | 0.2788 |
 
 
 ### Framework versions
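
The new hyperparameters pair `lr_scheduler_type: linear` with `lr_scheduler_warmup_ratio: 0.1`: the learning rate ramps from 0 to the peak (1e-05) over the first 10% of the 1048 training steps, then decays linearly back to 0. A minimal stdlib-only sketch of that shape (an illustration of the schedule, not the Trainer's actual implementation):

```python
def linear_schedule_with_warmup(step, total_steps, peak_lr, warmup_ratio=0.1):
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # warmup phase: ramp proportionally to the step count
        return peak_lr * step / warmup_steps
    # decay phase: shrink linearly over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total_steps, peak_lr = 1048, 1e-05  # values from this training run
print(linear_schedule_with_warmup(0, total_steps, peak_lr))            # 0.0
print(linear_schedule_with_warmup(104, total_steps, peak_lr))          # peak: 1e-05
print(linear_schedule_with_warmup(total_steps, total_steps, peak_lr))  # 0.0
```

With a 0.1 ratio the peak is reached at step 104 and the rate is back to zero at step 1048, matching the run's final step count.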
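
The Top 2 / Top 3 Accuracy columns count a prediction as correct when the true label appears among the model's k highest-scoring classes, which is why they sit above plain accuracy. A stdlib-only sketch of the metric (the scores and labels below are illustrative, not from this model):

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # indices of the k largest scores in this row
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)

scores = [
    [0.1, 0.7, 0.2],  # highest score: class 1
    [0.5, 0.3, 0.2],  # highest score: class 0
    [0.2, 0.3, 0.5],  # highest score: class 2
]
labels = [1, 1, 0]

print(top_k_accuracy(scores, labels, 1))  # 1/3: only sample 0 is a top-1 hit
print(top_k_accuracy(scores, labels, 2))  # 2/3: sample 1's label enters its top 2
```

Top-k accuracy is monotonically non-decreasing in k, which matches the card's pattern of Top 2 ≥ Accuracy and Top 3 ≥ Top 2.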