Matttttttt committed
Commit cf1b5e8
Parent: 963405b

fixed a description error in README

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -43,7 +43,7 @@ We used the following corpora for pre-training:
  We first segmented texts in the corpora into words using [Juman++](https://github.com/ku-nlp/jumanpp).
  Then, we built a sentencepiece model with 32000 tokens including words ([JumanDIC](https://github.com/ku-nlp/JumanDIC)) and subwords induced by the unigram language model of [sentencepiece](https://github.com/google/sentencepiece).
 
- We tokenized the segmented corpora into subwords using the sentencepiece model and trained the Japanese BART model using [transformers](https://github.com/huggingface/transformers) library.
+ We tokenized the segmented corpora into subwords using the sentencepiece model and trained the Japanese BART model using [fairseq](https://github.com/facebookresearch/fairseq) library.
  The training took about 1 month using 4 Tesla V100 GPUs.
 
  The following hyperparameters were used during pre-training:
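
As a rough illustration of the sentencepiece step the diff describes, the sketch below trains a 32000-token unigram model and tokenizes pre-segmented text. The vocabulary size and unigram model type come from the README text; the file names, output prefix, and sample sentence are assumptions, not taken from the commit.

```python
# Minimal sketch of the subword step described above. Assumes "corpus.txt"
# (hypothetical path) holds text already segmented into words by Juman++,
# one sentence per line with words separated by spaces.
import sentencepiece as spm

# Build a sentencepiece model with a 32000-token vocabulary using the
# unigram language model, as the README states.
spm.SentencePieceTrainer.train(
    input="corpus.txt",            # assumed corpus location
    model_prefix="japanese_bart",  # assumed output name
    vocab_size=32000,
    model_type="unigram",
)

# Tokenize segmented text into subwords with the trained model.
sp = spm.SentencePieceProcessor(model_file="japanese_bart.model")
print(sp.encode("これ は テスト です", out_type=str))
```

The resulting subword corpus would then be fed to fairseq for pre-training; the fairseq invocation itself is not shown in this commit.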