monologg committed on
Commit
adb10ae
1 Parent(s): fd0295b

update readme

Files changed (1)
  1. README.md +37 -0
README.md ADDED
@@ -0,0 +1,37 @@
+ ---
+ language: ko
+ ---
+
+ # KoBigBird
+
+ Pretrained BigBird Model for Korean (**kobigbird-bert-base**)
+
+ ## About
+
+ BigBird is a sparse-attention-based transformer that extends Transformer-based models such as BERT to much longer sequences.
+
+ BigBird relies on **block sparse attention** instead of full attention (i.e. BERT's attention) and can handle sequences of up to 4096 tokens at a much lower compute cost than BERT. It has achieved SOTA on various tasks involving very long sequences, such as long-document summarization and question answering with long contexts.
+
+ The model is warm-started from Korean BERT's checkpoint.
+
+ ## How to use
+
+ WARN: Please use `BertTokenizer` instead of `BigBirdTokenizer`.
+
+ ```python
+ from transformers import AutoModel, AutoTokenizer
+
+ # By default the model is in `block_sparse` mode with num_random_blocks=3, block_size=64
+ model = AutoModel.from_pretrained("monologg/kobigbird-bert-base")
+
+ # You can switch `attention_type` to full attention like this:
+ model = AutoModel.from_pretrained("monologg/kobigbird-bert-base", attention_type="original_full")
+
+ # You can change `block_size` and `num_random_blocks` like this:
+ model = AutoModel.from_pretrained("monologg/kobigbird-bert-base", block_size=16, num_random_blocks=2)
+
+ tokenizer = AutoTokenizer.from_pretrained("monologg/kobigbird-bert-base")
+ text = "한국어 BigBird 모델을 공개합니다!"  # "We are releasing a Korean BigBird model!"
+ encoded_input = tokenizer(text, return_tensors="pt")
+ output = model(**encoded_input)
+ ```
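
As a rough illustration of what the `block_size` and `num_random_blocks` settings in the README snippet control, the sketch below counts how many key positions a single query attends to under block sparse attention compared with full attention. It is not part of the commit. It assumes the usual BigBird layout of 2 global blocks, 3 sliding-window blocks, and `num_random_blocks` random blocks per query block; edge blocks behave differently, so the numbers are approximate. The helper names are hypothetical and are not part of `transformers`. The `min_block_sparse_length` rule, `(5 + 2 * num_random_blocks) * block_size`, is an assumption about the Hugging Face fallback to full attention for short inputs and should be checked against the installed version.

```python
def sparse_keys_per_query(block_size: int, num_random_blocks: int,
                          num_window_blocks: int = 3,
                          num_global_blocks: int = 2) -> int:
    """Approximate number of key positions one query attends to
    under block sparse attention (hypothetical helper)."""
    blocks = num_global_blocks + num_window_blocks + num_random_blocks
    return blocks * block_size


def min_block_sparse_length(block_size: int, num_random_blocks: int) -> int:
    """Assumed length at or below which block sparse attention is not used.

    Assumption: mirrors a (5 + 2 * num_random_blocks) * block_size check
    in the Hugging Face BigBird implementation; verify before relying on it.
    """
    return (5 + 2 * num_random_blocks) * block_size


if __name__ == "__main__":
    seq_len = 4096  # maximum sequence length stated in the README
    # The two configurations that appear in the README snippet.
    for block_size, num_random_blocks in [(64, 3), (16, 2)]:
        keys = sparse_keys_per_query(block_size, num_random_blocks)
        threshold = min_block_sparse_length(block_size, num_random_blocks)
        print(
            f"block_size={block_size}, num_random_blocks={num_random_blocks}: "
            f"about {keys} keys per query (full attention: {seq_len}), "
            f"assumed sparse threshold: {threshold} tokens"
        )
```

Under these assumptions, the default configuration attends to roughly 512 of 4096 positions per query. A short input such as the sample sentence in the README falls well below the assumed threshold, so it would likely be processed with full attention regardless of the configured `attention_type`.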