yoshitomo-matsubara committed
Commit
eca7edf
1 Parent(s): 6d02f69

initial commit

README.md ADDED
@@ -0,0 +1,20 @@
+ ---
+ language: en
+ tags:
+ - bert
+ - mnli
+ - ax
+ - glue
+ - kd
+ - torchdistill
+ license: apache-2.0
+ datasets:
+ - mnli
+ - ax
+ metrics:
+ - accuracy
+ ---
+
+ `bert-base-uncased` fine-tuned on the MNLI dataset, using a fine-tuned `bert-large-uncased` as the teacher model and [***torchdistill***](https://github.com/yoshitomo-matsubara/torchdistill) with [Google Colab](https://colab.research.google.com/github/yoshitomo-matsubara/torchdistill/blob/master/demo/glue_kd_and_submission.ipynb) for knowledge distillation.
+ The training configuration (including hyperparameters) is available [here](https://github.com/yoshitomo-matsubara/torchdistill/blob/main/configs/sample/glue/mnli/kd/bert_base_uncased_from_bert_large_uncased.yaml).
+ I submitted prediction files to [the GLUE leaderboard](https://gluebenchmark.com/leaderboard), and the overall GLUE score was **78.9**.
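As a quick sanity check of the files in this commit, a minimal inference sketch follows. The repo id below is a placeholder for wherever this checkpoint is hosted, and the entailment/neutral/contradiction ordering is an assumption based on the usual GLUE MNLI label convention, since `config.json` only names `LABEL_0`/`LABEL_1`/`LABEL_2`.

```python
# Minimal inference sketch. The repo id is hypothetical; the label order
# is assumed from the GLUE MNLI convention (0 = entailment, 1 = neutral,
# 2 = contradiction), since the config only exposes LABEL_0/1/2.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "yoshitomo-matsubara/bert-base-uncased-mnli"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

inputs = tokenizer(
    "A soccer game with multiple males playing.",  # premise
    "Some men are playing a sport.",               # hypothesis
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits

labels = ["entailment", "neutral", "contradiction"]  # assumed GLUE order
print(labels[logits.argmax(dim=-1).item()])
```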
config.json ADDED
@@ -0,0 +1,36 @@
+ {
+   "_name_or_path": "bert-base-uncased",
+   "architectures": [
+     "BertForSequenceClassification"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "finetuning_task": "mnli",
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "id2label": {
+     "0": "LABEL_0",
+     "1": "LABEL_1",
+     "2": "LABEL_2"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "label2id": {
+     "LABEL_0": 0,
+     "LABEL_1": 1,
+     "LABEL_2": 2
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "problem_type": "single_label_classification",
+   "transformers_version": "4.6.1",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
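The config is a stock `bert-base` layout (12 layers, 12 heads, hidden size 768) with a 3-way classification head. A small sketch for loading and inspecting it with `transformers`, assuming the file above is saved locally as `config.json`:

```python
# Sketch: load the config above and inspect a few fields.
from transformers import BertConfig

config = BertConfig.from_json_file("config.json")
print(config.num_hidden_layers)  # 12 (bert-base)
print(config.id2label)           # {0: 'LABEL_0', 1: 'LABEL_1', 2: 'LABEL_2'}
```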
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ca0f84905c30911b5d2640e1c54998e5870c550466984e9af1094a0a84f9fcb7
+ size 438027529
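These three lines are a Git LFS pointer, not the weights themselves: `oid` records the SHA-256 of the actual file and `size` its byte count. A minimal sketch for verifying a downloaded `pytorch_model.bin` against the pointer:

```python
# Sketch: check a downloaded pytorch_model.bin against the LFS pointer above.
import hashlib
import os

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    # Stream the file in 1 MiB chunks to avoid loading ~438 MB into memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

expected_oid = "ca0f84905c30911b5d2640e1c54998e5870c550466984e9af1094a0a84f9fcb7"
expected_size = 438027529

assert os.path.getsize("pytorch_model.bin") == expected_size
assert sha256_of("pytorch_model.bin") == expected_oid
```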
special_tokens_map.json ADDED
@@ -0,0 +1 @@
+ {"unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]"}
tokenizer.json ADDED
The diff for this file is too large to render.
tokenizer_config.json ADDED
@@ -0,0 +1 @@
+ {"do_lower_case": true, "unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]", "tokenize_chinese_chars": true, "strip_accents": null, "do_lower": true, "model_max_length": 512, "special_tokens_map_file": null, "name_or_path": "bert-base-uncased"}
training.log ADDED
@@ -0,0 +1,88 @@
+ 2021-05-31 19:12:19,502 INFO __main__ Namespace(adjust_lr=False, config='torchdistill/configs/sample/glue/mnli/kd/bert_base_uncased_from_bert_large_uncased.yaml', log='log/glue/mnli/kd/bert_base_uncased_from_bert_large_uncased.txt', private_output='leaderboard/glue/kd/bert_base_uncased_from_bert_large_uncased/', seed=None, student_only=False, task_name='mnli', test_only=False, world_size=1)
+ 2021-05-31 19:12:19,563 INFO __main__ Distributed environment: NO
+ Num processes: 1
+ Process index: 0
+ Local process index: 0
+ Device: cuda
+ Use FP16 precision: True
+
+ 2021-05-31 19:12:19,941 INFO filelock Lock 140082792337040 acquired on /root/.cache/huggingface/transformers/5b5f978453cf40beb680cdd3d4aa881c966097f83937fbf475e0ed640062dbca.c73d14e62466b28d4e1ef822a490987b8f83b052127d2564f2e5bbce495e3c09.lock
+ 2021-05-31 19:12:20,295 INFO filelock Lock 140082792337040 released on /root/.cache/huggingface/transformers/5b5f978453cf40beb680cdd3d4aa881c966097f83937fbf475e0ed640062dbca.c73d14e62466b28d4e1ef822a490987b8f83b052127d2564f2e5bbce495e3c09.lock
+ 2021-05-31 19:12:21,006 INFO filelock Lock 140082831894224 acquired on /root/.cache/huggingface/transformers/7a67abdbf71b85cb08398b0be2f83bb90b20e212c99600e63836e4a37df7de29.d789d64ebfe299b0e416afc4a169632f903f693095b4629a7ea271d5a0cf2c99.lock
+ 2021-05-31 19:12:21,516 INFO filelock Lock 140082831894224 released on /root/.cache/huggingface/transformers/7a67abdbf71b85cb08398b0be2f83bb90b20e212c99600e63836e4a37df7de29.d789d64ebfe299b0e416afc4a169632f903f693095b4629a7ea271d5a0cf2c99.lock
+ 2021-05-31 19:12:21,871 INFO filelock Lock 140082823814352 acquired on /root/.cache/huggingface/transformers/696f700b8d350ef06d6b7bb1d40f1727616b761551d519a1b9e473493d622f2d.6dc9f54d5893dc361ac6ccee1865622847ad90bf0536eeb2043f3e3e2f41078a.lock
+ 2021-05-31 19:12:22,393 INFO filelock Lock 140082823814352 released on /root/.cache/huggingface/transformers/696f700b8d350ef06d6b7bb1d40f1727616b761551d519a1b9e473493d622f2d.6dc9f54d5893dc361ac6ccee1865622847ad90bf0536eeb2043f3e3e2f41078a.lock
+ 2021-05-31 19:12:23,095 INFO filelock Lock 140082823814352 acquired on /root/.cache/huggingface/transformers/0a91d20dc356a0ee3b87e1e02495dfcdc9770ce1b64f4426459748fcdbca17e7.dd8bd9bfd3664b530ea4e645105f557769387b3da9f79bdb55ed556bdd80611d.lock
+ 2021-05-31 19:12:23,448 INFO filelock Lock 140082823814352 released on /root/.cache/huggingface/transformers/0a91d20dc356a0ee3b87e1e02495dfcdc9770ce1b64f4426459748fcdbca17e7.dd8bd9bfd3664b530ea4e645105f557769387b3da9f79bdb55ed556bdd80611d.lock
+ 2021-05-31 19:12:23,803 INFO filelock Lock 140082823814352 acquired on /root/.cache/huggingface/transformers/f9a57124cc0406fe634d8934f74efb446b8d92423e8720867cec3ee4291518a6.0f95f2171d2c33a9e9e088c1e5decb2dfb3a22fb00d904f96183827da9540426.lock
+ 2021-05-31 19:12:24,158 INFO filelock Lock 140082823814352 released on /root/.cache/huggingface/transformers/f9a57124cc0406fe634d8934f74efb446b8d92423e8720867cec3ee4291518a6.0f95f2171d2c33a9e9e088c1e5decb2dfb3a22fb00d904f96183827da9540426.lock
+ 2021-05-31 19:12:24,537 INFO filelock Lock 140082823814992 acquired on /root/.cache/huggingface/transformers/465d4939e3c54729c9bce27016baac778f168894b55701482c8ae4fa40953841.b487d9e34b8144fa22e4e1c7ea1213577af73f111e06c948c8cfa936dcc453aa.lock
+ 2021-05-31 19:13:00,303 INFO filelock Lock 140082823814992 released on /root/.cache/huggingface/transformers/465d4939e3c54729c9bce27016baac778f168894b55701482c8ae4fa40953841.b487d9e34b8144fa22e4e1c7ea1213577af73f111e06c948c8cfa936dcc453aa.lock
+ 2021-05-31 19:14:53,610 INFO __main__ Start training
+ 2021-05-31 19:14:53,610 INFO torchdistill.models.util [teacher model]
+ 2021-05-31 19:14:53,610 INFO torchdistill.models.util Using the original teacher model
+ 2021-05-31 19:14:53,610 INFO torchdistill.models.util [student model]
+ 2021-05-31 19:14:53,611 INFO torchdistill.models.util Using the original student model
+ 2021-05-31 19:14:53,611 INFO torchdistill.core.distillation Loss = 1.0 * OrgLoss
+ 2021-05-31 19:14:53,611 INFO torchdistill.core.distillation Freezing the whole teacher model
+ 2021-05-31 19:14:58,197 INFO torchdistill.misc.log Epoch: [0] [ 0/12272] eta: 0:26:53 lr: 9.999728378965668e-05 sample/s: 38.52969437529281 loss: 0.0905 (0.0905) time: 0.1315 data: 0.0277 max mem: 2519
+ 2021-05-31 19:17:04,033 INFO torchdistill.misc.log Epoch: [0] [ 1000/12272] eta: 0:23:38 lr: 9.728107344632768e-05 sample/s: 25.678521357422294 loss: 0.0229 (0.0347) time: 0.1315 data: 0.0046 max mem: 5109
+ 2021-05-31 19:19:10,890 INFO torchdistill.misc.log Epoch: [0] [ 2000/12272] eta: 0:21:37 lr: 9.45648631029987e-05 sample/s: 33.98564182345601 loss: 0.0153 (0.0267) time: 0.1355 data: 0.0044 max mem: 5109
+ 2021-05-31 19:21:17,630 INFO torchdistill.misc.log Epoch: [0] [ 3000/12272] eta: 0:19:32 lr: 9.184865275966971e-05 sample/s: 30.293297895006734 loss: 0.0145 (0.0230) time: 0.1215 data: 0.0044 max mem: 5109
+ 2021-05-31 19:23:24,094 INFO torchdistill.misc.log Epoch: [0] [ 4000/12272] eta: 0:17:26 lr: 8.913244241634072e-05 sample/s: 39.35939116542367 loss: 0.0144 (0.0208) time: 0.1229 data: 0.0045 max mem: 5109
+ 2021-05-31 19:25:30,963 INFO torchdistill.misc.log Epoch: [0] [ 5000/12272] eta: 0:15:20 lr: 8.641623207301173e-05 sample/s: 31.891785647075373 loss: 0.0108 (0.0192) time: 0.1368 data: 0.0047 max mem: 5109
+ 2021-05-31 19:27:37,490 INFO torchdistill.misc.log Epoch: [0] [ 6000/12272] eta: 0:13:13 lr: 8.370002172968275e-05 sample/s: 30.313604538761055 loss: 0.0109 (0.0179) time: 0.1267 data: 0.0047 max mem: 5109
+ 2021-05-31 19:29:45,181 INFO torchdistill.misc.log Epoch: [0] [ 7000/12272] eta: 0:11:08 lr: 8.098381138635376e-05 sample/s: 42.336344641721595 loss: 0.0095 (0.0170) time: 0.1268 data: 0.0045 max mem: 5109
+ 2021-05-31 19:31:52,182 INFO torchdistill.misc.log Epoch: [0] [ 8000/12272] eta: 0:09:01 lr: 7.826760104302477e-05 sample/s: 31.78104944118204 loss: 0.0112 (0.0162) time: 0.1264 data: 0.0046 max mem: 5109
+ 2021-05-31 19:33:59,788 INFO torchdistill.misc.log Epoch: [0] [ 9000/12272] eta: 0:06:55 lr: 7.555139069969579e-05 sample/s: 30.615916348838482 loss: 0.0089 (0.0155) time: 0.1314 data: 0.0045 max mem: 5109
+ 2021-05-31 19:36:07,595 INFO torchdistill.misc.log Epoch: [0] [10000/12272] eta: 0:04:48 lr: 7.283518035636681e-05 sample/s: 37.10492838754766 loss: 0.0072 (0.0149) time: 0.1298 data: 0.0048 max mem: 5109
+ 2021-05-31 19:38:13,949 INFO torchdistill.misc.log Epoch: [0] [11000/12272] eta: 0:02:41 lr: 7.011897001303781e-05 sample/s: 32.78477658489305 loss: 0.0090 (0.0144) time: 0.1288 data: 0.0045 max mem: 5109
+ 2021-05-31 19:40:21,535 INFO torchdistill.misc.log Epoch: [0] [12000/12272] eta: 0:00:34 lr: 6.740275966970883e-05 sample/s: 37.34245014245014 loss: 0.0079 (0.0140) time: 0.1317 data: 0.0050 max mem: 5109
+ 2021-05-31 19:40:56,676 INFO torchdistill.misc.log Epoch: [0] Total time: 0:25:58
+ 2021-05-31 19:41:04,501 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/mnli/default_experiment-1-0.arrow
+ 2021-05-31 19:41:04,501 INFO __main__ Validation: accuracy = 0.8412633723892002
+ 2021-05-31 19:41:04,501 INFO __main__ Updating ckpt at ./resource/ckpt/glue/mnli/kd/mnli-bert-base-uncased_from_bert-large-uncased
+ 2021-05-31 19:41:05,722 INFO torchdistill.misc.log Epoch: [1] [ 0/12272] eta: 0:31:19 lr: 6.666395045632334e-05 sample/s: 31.762036742544716 loss: 0.0031 (0.0031) time: 0.1532 data: 0.0272 max mem: 5109
+ 2021-05-31 19:43:13,358 INFO torchdistill.misc.log Epoch: [1] [ 1000/12272] eta: 0:23:58 lr: 6.394774011299436e-05 sample/s: 37.225953324487556 loss: 0.0044 (0.0051) time: 0.1300 data: 0.0046 max mem: 5109
+ 2021-05-31 19:45:20,181 INFO torchdistill.misc.log Epoch: [1] [ 2000/12272] eta: 0:21:47 lr: 6.123152976966536e-05 sample/s: 37.15834562552879 loss: 0.0047 (0.0051) time: 0.1284 data: 0.0045 max mem: 5109
+ 2021-05-31 19:47:26,919 INFO torchdistill.misc.log Epoch: [1] [ 3000/12272] eta: 0:19:38 lr: 5.851531942633638e-05 sample/s: 39.3462836451305 loss: 0.0042 (0.0050) time: 0.1197 data: 0.0043 max mem: 5109
+ 2021-05-31 19:49:32,833 INFO torchdistill.misc.log Epoch: [1] [ 4000/12272] eta: 0:17:28 lr: 5.5799109083007396e-05 sample/s: 33.65857162059412 loss: 0.0040 (0.0050) time: 0.1264 data: 0.0043 max mem: 5109
+ 2021-05-31 19:51:40,796 INFO torchdistill.misc.log Epoch: [1] [ 5000/12272] eta: 0:15:23 lr: 5.30828987396784e-05 sample/s: 26.070806883959442 loss: 0.0046 (0.0050) time: 0.1288 data: 0.0045 max mem: 5109
+ 2021-05-31 19:53:48,528 INFO torchdistill.misc.log Epoch: [1] [ 6000/12272] eta: 0:13:17 lr: 5.036668839634942e-05 sample/s: 32.30201815220279 loss: 0.0045 (0.0049) time: 0.1212 data: 0.0044 max mem: 5109
+ 2021-05-31 19:55:53,950 INFO torchdistill.misc.log Epoch: [1] [ 7000/12272] eta: 0:11:08 lr: 4.765047805302043e-05 sample/s: 36.80166358838471 loss: 0.0038 (0.0049) time: 0.1297 data: 0.0044 max mem: 5109
+ 2021-05-31 19:57:59,848 INFO torchdistill.misc.log Epoch: [1] [ 8000/12272] eta: 0:09:01 lr: 4.493426770969144e-05 sample/s: 33.594812965184154 loss: 0.0041 (0.0048) time: 0.1258 data: 0.0044 max mem: 5109
+ 2021-05-31 20:00:06,135 INFO torchdistill.misc.log Epoch: [1] [ 9000/12272] eta: 0:06:54 lr: 4.221805736636245e-05 sample/s: 25.64719622596514 loss: 0.0045 (0.0048) time: 0.1241 data: 0.0046 max mem: 5109
+ 2021-05-31 20:02:14,011 INFO torchdistill.misc.log Epoch: [1] [10000/12272] eta: 0:04:48 lr: 3.9501847023033466e-05 sample/s: 33.08476073658346 loss: 0.0043 (0.0048) time: 0.1239 data: 0.0047 max mem: 5109
+ 2021-05-31 20:04:21,426 INFO torchdistill.misc.log Epoch: [1] [11000/12272] eta: 0:02:41 lr: 3.6785636679704476e-05 sample/s: 25.415942288056765 loss: 0.0039 (0.0047) time: 0.1303 data: 0.0045 max mem: 5109
+ 2021-05-31 20:06:28,294 INFO torchdistill.misc.log Epoch: [1] [12000/12272] eta: 0:00:34 lr: 3.406942633637549e-05 sample/s: 37.492242198062506 loss: 0.0038 (0.0047) time: 0.1308 data: 0.0051 max mem: 5109
+ 2021-05-31 20:07:02,616 INFO torchdistill.misc.log Epoch: [1] Total time: 0:25:57
+ 2021-05-31 20:07:10,347 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/mnli/default_experiment-1-0.arrow
+ 2021-05-31 20:07:10,348 INFO __main__ Validation: accuracy = 0.8530820173204279
+ 2021-05-31 20:07:10,348 INFO __main__ Updating ckpt at ./resource/ckpt/glue/mnli/kd/mnli-bert-base-uncased_from_bert-large-uncased
+ 2021-05-31 20:07:11,549 INFO torchdistill.misc.log Epoch: [2] [ 0/12272] eta: 0:27:31 lr: 3.3330617122990006e-05 sample/s: 36.21437806918137 loss: 0.0018 (0.0018) time: 0.1346 data: 0.0241 max mem: 5109
+ 2021-05-31 20:09:18,600 INFO torchdistill.misc.log Epoch: [2] [ 1000/12272] eta: 0:23:52 lr: 3.061440677966102e-05 sample/s: 37.12356586986009 loss: 0.0023 (0.0024) time: 0.1341 data: 0.0045 max mem: 5109
+ 2021-05-31 20:11:24,788 INFO torchdistill.misc.log Epoch: [2] [ 2000/12272] eta: 0:21:40 lr: 2.789819643633203e-05 sample/s: 32.69698623302515 loss: 0.0021 (0.0023) time: 0.1271 data: 0.0046 max mem: 5109
+ 2021-05-31 20:13:32,260 INFO torchdistill.misc.log Epoch: [2] [ 3000/12272] eta: 0:19:36 lr: 2.5181986093003048e-05 sample/s: 39.77255238497116 loss: 0.0019 (0.0023) time: 0.1264 data: 0.0047 max mem: 5109
+ 2021-05-31 20:15:38,928 INFO torchdistill.misc.log Epoch: [2] [ 4000/12272] eta: 0:17:29 lr: 2.2465775749674055e-05 sample/s: 37.40997928507879 loss: 0.0019 (0.0023) time: 0.1170 data: 0.0045 max mem: 5109
+ 2021-05-31 20:17:46,151 INFO torchdistill.misc.log Epoch: [2] [ 5000/12272] eta: 0:15:22 lr: 1.974956540634507e-05 sample/s: 39.26708624043028 loss: 0.0018 (0.0023) time: 0.1322 data: 0.0045 max mem: 5109
+ 2021-05-31 20:19:53,077 INFO torchdistill.misc.log Epoch: [2] [ 6000/12272] eta: 0:13:16 lr: 1.7033355063016082e-05 sample/s: 26.89458075644343 loss: 0.0019 (0.0022) time: 0.1324 data: 0.0045 max mem: 5109
+ 2021-05-31 20:21:59,132 INFO torchdistill.misc.log Epoch: [2] [ 7000/12272] eta: 0:11:08 lr: 1.4317144719687093e-05 sample/s: 32.304879269842495 loss: 0.0017 (0.0022) time: 0.1225 data: 0.0044 max mem: 5109
+ 2021-05-31 20:24:05,638 INFO torchdistill.misc.log Epoch: [2] [ 8000/12272] eta: 0:09:01 lr: 1.1600934376358105e-05 sample/s: 39.57945395824831 loss: 0.0021 (0.0022) time: 0.1263 data: 0.0044 max mem: 5109
+ 2021-05-31 20:26:11,594 INFO torchdistill.misc.log Epoch: [2] [ 9000/12272] eta: 0:06:54 lr: 8.884724033029119e-06 sample/s: 25.860594738237488 loss: 0.0023 (0.0022) time: 0.1262 data: 0.0044 max mem: 5109
+ 2021-05-31 20:28:18,549 INFO torchdistill.misc.log Epoch: [2] [10000/12272] eta: 0:04:47 lr: 6.168513689700131e-06 sample/s: 32.40314814635983 loss: 0.0019 (0.0022) time: 0.1260 data: 0.0045 max mem: 5109
+ 2021-05-31 20:30:24,951 INFO torchdistill.misc.log Epoch: [2] [11000/12272] eta: 0:02:41 lr: 3.452303346371143e-06 sample/s: 42.08254363214055 loss: 0.0021 (0.0022) time: 0.1241 data: 0.0044 max mem: 5109
+ 2021-05-31 20:32:31,971 INFO torchdistill.misc.log Epoch: [2] [12000/12272] eta: 0:00:34 lr: 7.360930030421556e-07 sample/s: 33.625651129091416 loss: 0.0019 (0.0022) time: 0.1307 data: 0.0044 max mem: 5109
+ 2021-05-31 20:33:06,083 INFO torchdistill.misc.log Epoch: [2] Total time: 0:25:54
+ 2021-05-31 20:33:13,819 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/mnli/default_experiment-1-0.arrow
+ 2021-05-31 20:33:13,820 INFO __main__ Validation: accuracy = 0.8582781456953642
+ 2021-05-31 20:33:13,820 INFO __main__ Updating ckpt at ./resource/ckpt/glue/mnli/kd/mnli-bert-base-uncased_from_bert-large-uncased
+ 2021-05-31 20:33:15,094 INFO __main__ [Teacher: bert-large-uncased]
+ 2021-05-31 20:33:28,908 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/mnli/default_experiment-1-0.arrow
+ 2021-05-31 20:33:28,908 INFO __main__ Test: accuracy = 0.8665308201732043
+ 2021-05-31 20:33:32,568 INFO __main__ [Student: bert-base-uncased]
+ 2021-05-31 20:33:40,325 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/mnli/default_experiment-1-0.arrow
+ 2021-05-31 20:33:40,326 INFO __main__ Test: accuracy = 0.8582781456953642
+ 2021-05-31 20:33:40,326 INFO __main__ Start prediction for private dataset(s)
+ 2021-05-31 20:33:40,327 INFO __main__ mnli/test_m: 9796 samples
+ 2021-05-31 20:33:47,980 INFO __main__ mnli/test_mm: 9847 samples
+ 2021-05-31 20:33:55,598 INFO __main__ ax/test_ax: 1104 samples
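The `lr` column in the log traces a linear decay with no warmup, from a peak of 1e-4 down to ~0 over 3 × 12272 steps (e.g. ~9.99973e-05 after the first step, ~7.36e-07 near the last). A hedged reconstruction of that schedule follows; the peak lr and step counts are read off the log, while the optimizer choice here is an assumption:

```python
# Sketch: rebuild the linear lr schedule visible in training.log.
# Peak lr (1e-4) and step counts come from the log; AdamW is an assumption.
from torch.optim import AdamW
from transformers import BertForSequenceClassification, get_linear_schedule_with_warmup

model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=3)
optimizer = AdamW(model.parameters(), lr=1e-4)

num_epochs, steps_per_epoch = 3, 12272
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=0,
    num_training_steps=num_epochs * steps_per_epoch,
)
# After t optimizer steps, lr = 1e-4 * (1 - t / 36816), which matches the
# logged values, e.g. ~9.9997e-05 at t=1 and ~7.36e-07 at t=36545.
```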
vocab.txt ADDED
The diff for this file is too large to render.