ksirts committed on
Commit b0595c0
1 Parent(s): 99bd304

Upload 7 files
README.md CHANGED
@@ -1,3 +1,137 @@
  ---
- license: apache-2.0
  ---
  ---
+ tags:
+ - generated_from_trainer
+ datasets:
+ - sentiment_reduced
+ metrics:
+ - accuracy
+ model-index:
+ - name: estbert128_lr5e-5_b64_s2
+   results:
+   - task:
+       name: Text Classification
+       type: text-classification
+     dataset:
+       name: sentiment_reduced
+       type: sentiment_reduced
+       args: sentiment_reduced
+     metrics:
+     - name: Accuracy
+       type: accuracy
+       value: 0.7926136255264282
  ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # estbert128_lr5e-5_b64_s2
+
+ This model is a fine-tuned version of [tartuNLP/EstBERT](https://huggingface.co/tartuNLP/EstBERT) on the sentiment_reduced dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 2.2440
+ - Accuracy: 0.7926
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 16
+ - eval_batch_size: 16
+ - seed: 2
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 64
+ - optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-06
+ - lr_scheduler_type: polynomial
+ - num_epochs: 100
+ - mixed_precision_training: Native AMP
+
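The batch-size arithmetic above can be sketched in a few lines; this is a sketch under the assumption of standard Hugging Face Trainer semantics, and `power` and `end_lr` in the scheduler are the Trainer's defaults, not values stated in this card:

```python
# Effective batch size: gradients from 4 micro-batches of 16 are
# accumulated before each optimizer step, so one update sees 64 examples.
train_batch_size = 16
gradient_accumulation_steps = 4
total_train_batch_size = train_batch_size * gradient_accumulation_steps  # 64

def polynomial_lr(step, total_steps, base_lr=5e-5, power=1.0, end_lr=0.0):
    # Polynomial decay; with power=1.0 (the Trainer default) this is
    # linear decay from base_lr down to end_lr over total_steps updates.
    frac = 1.0 - step / total_steps
    return (base_lr - end_lr) * frac ** power + end_lr
```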
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Accuracy |
+ |:-------------:|:-----:|:----:|:---------------:|:--------:|
+ | 0.836         | 0.99  | 38   | 0.6966          | 0.7216   |
+ | 0.5336        | 1.99  | 76   | 0.5948          | 0.7699   |
+ | 0.2913        | 2.99  | 114  | 0.7197          | 0.7358   |
+ | 0.1048        | 3.99  | 152  | 0.9570          | 0.7557   |
+ | 0.0424        | 4.99  | 190  | 1.2144          | 0.7528   |
+ | 0.0262        | 5.99  | 228  | 1.2675          | 0.7727   |
+ | 0.0169        | 6.99  | 266  | 1.4788          | 0.7500   |
+ | 0.0048        | 7.99  | 304  | 1.5053          | 0.7699   |
+ | 0.0084        | 8.99  | 342  | 1.5368          | 0.7614   |
+ | 0.0087        | 9.99  | 380  | 1.6678          | 0.7699   |
+ | 0.0082        | 10.99 | 418  | 1.7598          | 0.7642   |
+ | 0.0104        | 11.99 | 456  | 1.6951          | 0.7528   |
+ | 0.0115        | 12.99 | 494  | 1.7123          | 0.7727   |
+ | 0.0111        | 13.99 | 532  | 1.7577          | 0.7528   |
+ | 0.0028        | 14.99 | 570  | 1.7383          | 0.7727   |
+ | 0.0032        | 15.99 | 608  | 2.0254          | 0.7727   |
+ | 0.0107        | 16.99 | 646  | 2.2123          | 0.7415   |
+ | 0.0056        | 17.99 | 684  | 1.9406          | 0.7614   |
+ | 0.0078        | 18.99 | 722  | 2.2002          | 0.7642   |
+ | 0.0041        | 19.99 | 760  | 2.0157          | 0.7670   |
+ | 0.0087        | 20.99 | 798  | 2.1228          | 0.7642   |
+ | 0.0113        | 21.99 | 836  | 2.3692          | 0.7727   |
+ | 0.0025        | 22.99 | 874  | 2.2211          | 0.7500   |
+ | 0.0083        | 23.99 | 912  | 2.2120          | 0.7841   |
+ | 0.0104        | 24.99 | 950  | 2.1478          | 0.7614   |
+ | 0.0041        | 25.99 | 988  | 2.1118          | 0.7756   |
+ | 0.002         | 26.99 | 1026 | 1.9929          | 0.7699   |
+ | 0.001         | 27.99 | 1064 | 2.0295          | 0.7841   |
+ | 0.003         | 28.99 | 1102 | 2.3142          | 0.7699   |
+ | 0.006         | 29.99 | 1140 | 2.2957          | 0.7642   |
+ | 0.0005        | 30.99 | 1178 | 2.0661          | 0.7642   |
+ | 0.0007        | 31.99 | 1216 | 2.4220          | 0.7614   |
+ | 0.0088        | 32.99 | 1254 | 2.2842          | 0.7614   |
+ | 0.0           | 33.99 | 1292 | 2.4060          | 0.7585   |
+ | 0.0           | 34.99 | 1330 | 2.2088          | 0.7585   |
+ | 0.0           | 35.99 | 1368 | 2.2181          | 0.7614   |
+ | 0.0           | 36.99 | 1406 | 2.2560          | 0.7784   |
+ | 0.0           | 37.99 | 1444 | 2.4803          | 0.7585   |
+ | 0.0           | 38.99 | 1482 | 2.1163          | 0.7812   |
+ | 0.0087        | 39.99 | 1520 | 2.3410          | 0.7500   |
+ | 0.0021        | 40.99 | 1558 | 2.3583          | 0.7500   |
+ | 0.0054        | 41.99 | 1596 | 2.3546          | 0.7642   |
+ | 0.0051        | 42.99 | 1634 | 2.2295          | 0.7812   |
+ | 0.0           | 43.99 | 1672 | 2.2440          | 0.7926   |
+ | 0.0019        | 44.99 | 1710 | 2.3248          | 0.7784   |
+ | 0.0044        | 45.99 | 1748 | 2.3058          | 0.7841   |
+ | 0.0006        | 46.99 | 1786 | 2.3588          | 0.7784   |
+ | 0.0007        | 47.99 | 1824 | 2.6541          | 0.7670   |
+ | 0.0001        | 48.99 | 1862 | 2.4621          | 0.7614   |
+ | 0.0           | 49.99 | 1900 | 2.4696          | 0.7727   |
+ | 0.0           | 50.99 | 1938 | 2.4981          | 0.7670   |
+ | 0.0031        | 51.99 | 1976 | 2.6702          | 0.7670   |
+ | 0.0           | 52.99 | 2014 | 2.4448          | 0.7756   |
+ | 0.0           | 53.99 | 2052 | 2.4214          | 0.7756   |
+ | 0.0           | 54.99 | 2090 | 2.4308          | 0.7841   |
+ | 0.0001        | 55.99 | 2128 | 2.5869          | 0.7642   |
+ | 0.0007        | 56.99 | 2166 | 2.4803          | 0.7727   |
+ | 0.0           | 57.99 | 2204 | 2.4557          | 0.7784   |
+ | 0.0           | 58.99 | 2242 | 2.4702          | 0.7784   |
+ | 0.0           | 59.99 | 2280 | 2.5165          | 0.7784   |
+ | 0.0013        | 60.99 | 2318 | 2.6322          | 0.7727   |
+ | 0.0001        | 61.99 | 2356 | 2.6253          | 0.7756   |
+ | 0.0011        | 62.99 | 2394 | 2.6303          | 0.7841   |
+ | 0.0002        | 63.99 | 2432 | 2.5646          | 0.7614   |
+
+
+ ### Framework versions
+
+ - Transformers 4.14.1
+ - Pytorch 1.10.1+cu113
+ - Datasets 1.16.1
+ - Tokenizers 0.10.3
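A minimal usage sketch for the card above. Note that the Hub repository id used as the default below is a guess (the card does not state the final repo path), so treat it as a hypothetical placeholder and substitute the actual id or a local checkpoint directory:

```python
def load_sentiment_pipeline(model_id="ksirts/estbert128_lr5e-5_b64_s2"):
    """Load the fine-tuned classifier as a text-classification pipeline.

    The default model_id is a hypothetical repo path; pass the real Hub
    id or a local checkpoint directory. Requires `transformers` and
    network/disk access, so the import is kept inside the function.
    """
    from transformers import pipeline
    return pipeline("text-classification", model=model_id)

# Example (downloads roughly 500 MB of weights on first use):
# clf = load_sentiment_pipeline()
# clf("See film oli suurepärane!")  # Estonian: "This film was great!"
```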
config.json ADDED
@@ -0,0 +1,40 @@
+ {
+   "_name_or_path": "tartuNLP/EstBERT",
+   "architectures": [
+     "BertForSequenceClassification"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "classifier_dropout": null,
+   "eos_token_ids": 0,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "id2label": {
+     "0": "negatiivne",
+     "1": "neutraalne",
+     "2": "positiivne"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "label2id": {
+     "negatiivne": 0,
+     "neutraalne": 1,
+     "positiivne": 2
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "output_past": true,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "problem_type": "single_label_classification",
+   "torch_dtype": "float32",
+   "transformers_version": "4.14.1",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 50000
+ }
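A small sketch of how the `id2label` map in this config turns raw classifier scores into the Estonian sentiment labels (negative / neutral / positive). The logits below are made-up illustration values, not real model output:

```python
# Label map taken from config.json above.
id2label = {0: "negatiivne", 1: "neutraalne", 2: "positiivne"}

def predict_label(logits):
    # Index of the highest class score, looked up in the label map.
    best = max(range(len(logits)), key=lambda i: logits[i])
    return id2label[best]

print(predict_label([-1.2, 0.3, 2.1]))  # → positiivne
```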
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:839b854121412990ae68b7979977fcf2a15d35d9051f3c82c9cfb0d433a92597
+ size 497858733
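The three lines above are a Git LFS pointer file, not the model weights themselves: `oid` is the SHA-256 of the blob's raw bytes and `size` is its byte length. A sketch of that relationship, hashing dummy bytes since the real 497 MB `pytorch_model.bin` is not on hand:

```python
import hashlib

# Build a pointer for a stand-in blob, the same way Git LFS does:
# SHA-256 over the raw file bytes, plus the byte count.
blob = b"dummy model weights"
pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    f"oid sha256:{hashlib.sha256(blob).hexdigest()}\n"
    f"size {len(blob)}\n"
)
print(pointer)
```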
special_tokens_map.json ADDED
@@ -0,0 +1 @@
+ {"unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]"}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1 @@
+ {"do_lower_case": true, "unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]", "tokenize_chinese_chars": true, "strip_accents": null, "special_tokens_map_file": null, "full_tokenizer_file": null, "name_or_path": "tartuNLP/EstBERT", "do_basic_tokenize": true, "never_split": null, "tokenizer_class": "BertTokenizer"}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff