---
tags:
- generated_from_trainer
datasets:
- gokuls/wiki_book_corpus_complete_processed_bert_dataset
metrics:
- accuracy
model-index:
- name: HBERTv1_emb_compress_48_L10_H512_A8
  results:
  - task:
      name: Masked Language Modeling
      type: fill-mask
    dataset:
      name: gokuls/wiki_book_corpus_complete_processed_bert_dataset
      type: gokuls/wiki_book_corpus_complete_processed_bert_dataset
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.17367944889882433
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# HBERTv1_emb_compress_48_L10_H512_A8

This model was trained for masked language modeling on the gokuls/wiki_book_corpus_complete_processed_bert_dataset dataset. The auto-generated card does not name a base checkpoint.
It achieves the following results on the evaluation set:
- Loss: 5.7680
- Accuracy: 0.1737
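
The loss above can be read more intuitively as a perplexity. The following is a minimal sketch, assuming the reported loss is a mean cross-entropy in nats over the masked tokens (the card does not state this explicitly); the function name `perplexity` is illustrative and not part of the original training code.

```python
import math

# Evaluation loss reported in this card.
# Assumption: it is a mean cross-entropy measured in nats.
eval_loss = 5.7680


def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (natural log base) to perplexity."""
    return math.exp(loss)


ppl = perplexity(eval_loss)
print(f"perplexity ~ {ppl:.1f}")
```

A lower loss gives a lower perplexity; a loss of 0 corresponds to a perplexity of 1.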

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 56
- eval_batch_size: 56
- seed: 10
- distributed_type: multi-GPU
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 10000
- num_epochs: 5
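
To illustrate how `learning_rate`, `lr_scheduler_type: linear`, and `lr_scheduler_warmup_steps` interact, here is a minimal sketch of a linear warmup followed by linear decay. The card does not state the total number of optimizer steps, so `total_steps=520_000` is a hypothetical placeholder (roughly five epochs at the step-to-epoch ratio seen in the results table), and `linear_schedule_lr` is an illustrative helper rather than the code used for training.

```python
def linear_schedule_lr(
    step: int,
    base_lr: float = 1e-05,       # learning_rate from the card
    warmup_steps: int = 10_000,   # lr_scheduler_warmup_steps from the card
    total_steps: int = 520_000,   # hypothetical: not stated in the card
) -> float:
    """Learning rate at `step` under linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: ramp linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for s in (0, 5_000, 10_000, 265_000, 520_000):
    print(s, linear_schedule_lr(s))
```

Under these assumptions the rate peaks at `1e-05` exactly when warmup ends (step 10000) and falls to zero at `total_steps`.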

### Training results

| Training Loss | Epoch | Step   | Validation Loss | Accuracy |
|:-------------:|:-----:|:------:|:---------------:|:--------:|
| 7.1035        | 0.10  | 10000  | 7.0837          | 0.0844   |
| 6.6799        | 0.19  | 20000  | 6.6737          | 0.1072   |
| 6.5327        | 0.29  | 30000  | 6.5279          | 0.1194   |
| 6.4362        | 0.38  | 40000  | 6.4358          | 0.1272   |
| 6.3648        | 0.48  | 50000  | 6.3700          | 0.1335   |
| 6.3181        | 0.57  | 60000  | 6.3158          | 0.1355   |
| 6.2776        | 0.67  | 70000  | 6.2769          | 0.1380   |
| 6.2469        | 0.76  | 80000  | 6.2438          | 0.1400   |
| 6.2180        | 0.86  | 90000  | 6.2187          | 0.1422   |
| 6.2036        | 0.96  | 100000 | 6.1963          | 0.1434   |
| 6.1806        | 1.05  | 110000 | 6.1776          | 0.1451   |
| 6.1591        | 1.15  | 120000 | 6.1621          | 0.1456   |
| 6.1503        | 1.24  | 130000 | 6.1473          | 0.1468   |
| 6.1391        | 1.34  | 140000 | 6.1357          | 0.1466   |
| 6.1260        | 1.43  | 150000 | 6.1230          | 0.1477   |
| 6.1145        | 1.53  | 160000 | 6.1133          | 0.1479   |
| 6.1067        | 1.62  | 170000 | 6.1040          | 0.1486   |
| 6.0970        | 1.72  | 180000 | 6.0966          | 0.1488   |
| 6.0825        | 1.82  | 190000 | 6.0875          | 0.1492   |
| 6.0783        | 1.91  | 200000 | 6.0797          | 0.1494   |
| 6.0673        | 2.01  | 210000 | 6.0730          | 0.1499   |
| 6.0660        | 2.10  | 220000 | 6.0623          | 0.1501   |
| 6.0534        | 2.20  | 230000 | 6.0510          | 0.1504   |
| 6.0004        | 2.29  | 240000 | 5.9972          | 0.1517   |
| 5.9609        | 2.39  | 250000 | 5.9492          | 0.1530   |
| 5.9300        | 2.49  | 260000 | 5.9169          | 0.1551   |
| 5.9058        | 2.58  | 270000 | 5.8895          | 0.1571   |
| 5.8834        | 2.68  | 280000 | 5.8618          | 0.1597   |
| 5.8572        | 2.77  | 290000 | 5.8394          | 0.1623   |
| 5.8296        | 2.87  | 300000 | 5.8168          | 0.1661   |
| 5.8085        | 2.96  | 310000 | 5.7926          | 0.1703   |
| 5.7873        | 3.06  | 320000 | 5.7663          | 0.1739   |


### Framework versions

- Transformers 4.33.2
- Pytorch 1.14.0a0+410ce96
- Datasets 2.14.5
- Tokenizers 0.13.3