Commit fd33d3c by junnyu
1 parent: a2514a0

Create README.md

Files changed (1): README.md (added, +37 -0)
 
---
language: "en"
thumbnail: "https://github.com/junnyu"
tags:
- pytorch
- electra
- openwebtext
license: "MIT"
datasets:
- openwebtext
---
# ELECTRA-Small trained on the openwebtext dataset
# Reproduction results
|Model|CoLA|SST|MRPC|STS|QQP|MNLI|QNLI|RTE|Avg. of Avg.|
|---|---|---|---|---|---|---|---|---|---|
|ELECTRA-Small-OWT (original)|56.8|88.3|87.4|86.8|88.3|78.9|87.9|68.5|80.36|
|**ELECTRA-Small-OWT (this)**|55.82|89.67|87.0|86.96|89.28|80.08|87.50|66.07|80.30|

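
As a quick sanity check, the last column can be recomputed from the per-task scores. The sketch below assumes that column is the unweighted mean of the eight task scores in each row; the README does not define "Avg. of Avg.", so this is an assumption, and the variable names are illustrative only.

```python
# Scores copied from the results table, in column order:
# CoLA, SST, MRPC, STS, QQP, MNLI, QNLI, RTE
scores = {
    "ELECTRA-Small-OWT (original)": [56.8, 88.3, 87.4, 86.8, 88.3, 78.9, 87.9, 68.5],
    "ELECTRA-Small-OWT (this)": [55.82, 89.67, 87.0, 86.96, 89.28, 80.08, 87.50, 66.07],
}

# Assumption: the final column is the plain mean over the eight tasks.
averages = {
    name: round(sum(values) / len(values), 2)
    for name, values in scores.items()
}

for name, avg in averages.items():
    print(f"{name}: {avg:.2f}")
```

Under that assumption, both recomputed means agree with the values reported in the table to two decimal places.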
# Training details
- Dataset: openwebtext
- Training batch size (batch_size): 256
- Learning rate (lr): 2e-4
- Maximum sequence length (max_seqlen): 128
- Total training steps: 625000

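
The hyperparameters listed here imply a rough training budget. The following is a back-of-the-envelope sketch, assuming every step processes a full batch and every sequence is filled to the maximum length, so the token count is an upper bound. The variable names are illustrative and are not taken from the training code.

```python
batch_size = 256        # sequences per training step
max_seqlen = 128        # maximum tokens per sequence
total_steps = 625_000   # total training steps

# Number of sequences processed over the whole run.
sequences_seen = batch_size * total_steps

# Upper bound on tokens processed, assuming full-length sequences.
tokens_seen_upper_bound = sequences_seen * max_seqlen

print(f"sequences seen: {sequences_seen:,}")
print(f"tokens seen (upper bound): {tokens_seen_upper_bound:,}")
```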
# Usage
```python
import torch
from transformers.models.electra import ElectraModel, ElectraTokenizer

# Load the tokenizer and the pretrained discriminator weights.
tokenizer = ElectraTokenizer.from_pretrained("junnyu/electra_small_discriminator")
model = ElectraModel.from_pretrained("junnyu/electra_small_discriminator")

inputs = tokenizer("Beijing is the capital of China.", return_tensors="pt")
# Inference only, so gradient tracking is disabled.
with torch.no_grad():
    outputs = model(**inputs)

# outputs[0] is the last hidden state: (batch, sequence length, hidden size).
print(outputs[0].shape)
```