gxy committed
Commit d3876ab
1 Parent(s): e4f529d

FEAT: first commit

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -10,7 +10,7 @@ widget:
 
 
 ---
-# Randeng-BART-759M-BertTokenizer model (Chinese), one model of [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM)
+# Randeng-BART-759M-Chinese-BertTokenizer model (Chinese), one model of [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM)
 
 The 759M parameter Randeng-BART large model was trained on 180G of Chinese data with 8 A100 (40G) GPUs for 7 days; it uses an encoder-decoder transformer structure.
 
@@ -18,7 +18,7 @@ We use bert vocab as our tokenizer.
 
 ## Task Description
 
-Randeng-BART-759M-BertTokenizer is pre-trained with the text-infilling task from the BART [paper](https://readpaper.com/pdf-annotate/note?noteId=675945911766249472&pdfId=550970997159968917).
+Randeng-BART-759M-Chinese-BertTokenizer is pre-trained with the text-infilling task from the BART [paper](https://readpaper.com/pdf-annotate/note?noteId=675945911766249472&pdfId=550970997159968917).
 
 You can find our pretraining code in [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM/tree/main/fengshen/examples/pretrain_randeng_bart)
 
@@ -28,8 +28,8 @@ You can find our pretraining code in [Fengshenbang-LM](https://github.com/IDEA-C
 from transformers import BartForConditionalGeneration, AutoTokenizer, Text2TextGenerationPipeline
 import torch
 
-tokenizer = AutoTokenizer.from_pretrained('IDEA-CCNL/Randeng-BART-759M-BertTokenizer', use_fast=False)
-model = BartForConditionalGeneration.from_pretrained('IDEA-CCNL/Randeng-BART-759M-BertTokenizer')
+tokenizer = AutoTokenizer.from_pretrained('IDEA-CCNL/Randeng-BART-759M-Chinese-BertTokenizer', use_fast=False)
+model = BartForConditionalGeneration.from_pretrained('IDEA-CCNL/Randeng-BART-759M-Chinese-BertTokenizer')
 text = '桂林是著名的[MASK],它有很多[MASK]。'
 text2text_generator = Text2TextGenerationPipeline(model, tokenizer)
 print(text2text_generator(text, max_length=50, do_sample=False))
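
For reference, below is a minimal sketch of running the renamed snippet end to end, assuming the checkpoint `IDEA-CCNL/Randeng-BART-759M-Chinese-BertTokenizer` is published on the Hugging Face Hub. It swaps the `Text2TextGenerationPipeline` for a direct `model.generate()` call, which is roughly what the pipeline does internally; the example sentence comes from the card, but the decoded output is not guaranteed.

```python
# A minimal sketch of the usage snippet above, assuming the renamed checkpoint
# 'IDEA-CCNL/Randeng-BART-759M-Chinese-BertTokenizer' is published on the Hub.
from transformers import BartForConditionalGeneration, AutoTokenizer
import torch

model_id = 'IDEA-CCNL/Randeng-BART-759M-Chinese-BertTokenizer'
# use_fast=False keeps the slow tokenizer, as in the snippet; the checkpoint
# pairs BART weights with a BERT vocabulary.
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)
model = BartForConditionalGeneration.from_pretrained(model_id)
model.eval()

# Text-infilling input: each [MASK] marks a span for the model to fill in.
text = '桂林是著名的[MASK],它有很多[MASK]。'
input_ids = tokenizer(text, return_tensors='pt').input_ids

# Greedy decoding, mirroring max_length=50, do_sample=False in the pipeline call.
with torch.no_grad():
    output_ids = model.generate(input_ids, max_length=50, do_sample=False)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```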