monologg/kobigbird-bert-base

monologg committed on Oct 20, 2021

Commit 3767791 · 1 Parent(s): 68b4d7c

Update README.md

Files changed (1)

README.md +1 -1

@@ -10,7 +10,7 @@ Pretrained BigBird Model for Korean (**kobigbird-bert-base**)
 BigBird, is a sparse-attention based transformer which extends Transformer based models, such as BERT to much longer sequences.
-BigBird relies on **block sparse attention** instead of normal attention (i.e. BERT's attention) and can handle sequences up to a length of 4096 at a much lower compute cost compared to BERT. It has achieved SOTA on various tasks involving very long sequences such as long documents summarization, question-answering with long contexts.
+BigBird relies on **block sparse attention** instead of normal attention (i.e. BERT's attention) and can handle sequences up to a length of 4096 at a much lower compute cost compared to BERT.
 Model is warm started from Korean BERT’s checkpoint.
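
Reviewer note: the retained sentence claims a "much lower compute cost" at length 4096. A rough back-of-the-envelope sketch of that claim follows. It is not part of the commit and not the model's implementation. It assumes a block size of 64 and that each query block attends to about 3 sliding-window blocks, 2 global blocks, and 3 random blocks. These values are assumptions chosen to resemble common BigBird settings, and the count ignores edge blocks. The function names are hypothetical.

```python
def full_attention_pairs(seq_len: int) -> int:
    """Query-key pairs scored by dense (BERT-style) attention."""
    return seq_len * seq_len


def block_sparse_pairs(
    seq_len: int,
    block_size: int = 64,     # assumed block size
    window_blocks: int = 3,   # assumed sliding-window width, in blocks
    global_blocks: int = 2,   # assumed number of global blocks
    random_blocks: int = 3,   # assumed number of random blocks
) -> int:
    """Approximate query-key pairs scored by block sparse attention.

    Every query token is assumed to attend to a fixed number of key
    blocks; special handling of the first and last blocks is ignored.
    """
    if seq_len % block_size != 0:
        raise ValueError("seq_len must be a multiple of block_size")
    num_blocks = seq_len // block_size
    # A query cannot attend to more blocks than the sequence contains.
    blocks_per_query = min(window_blocks + global_blocks + random_blocks, num_blocks)
    return seq_len * blocks_per_query * block_size


if __name__ == "__main__":
    for n in (512, 1024, 4096):
        dense = full_attention_pairs(n)
        sparse = block_sparse_pairs(n)
        print(f"seq_len={n}: dense={dense}, sparse~{sparse}, ratio={dense / sparse:.1f}x")
```

Under these assumptions the sparse count grows linearly with sequence length while the dense count grows quadratically, so the gap widens as sequences get longer and disappears at BERT's usual length of 512.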