---
language:
- ja
license: cc-by-sa-4.0
datasets:
- wikipedia
- cc100
widget:
- text: "早稲田 大学 で 自然 言語 処理 を"
---

# nlp-waseda/gpt2-xl-japanese

This model is a Japanese GPT-2 model pretrained on Japanese Wikipedia and CC-100.
The model's hyperparameters are based on [Radford+ 2019](https://paperswithcode.com/paper/language-models-are-unsupervised-multitask).

## Intended uses & limitations

You can use the raw model for text generation or fine-tune it on a downstream task.

Note that input texts should be segmented into words with Juman++ in advance.

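As a minimal sketch (not officially documented usage), the snippet below loads the checkpoint through the 🤗 Transformers Auto classes and generates from a prompt that is already word-segmented, reusing the widget example above; the sampling settings are illustrative assumptions.

```python
# Minimal sketch: loading via the Auto classes and the sampling settings below
# are assumptions, not settings documented for this model.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("nlp-waseda/gpt2-xl-japanese")
model = AutoModelForCausalLM.from_pretrained("nlp-waseda/gpt2-xl-japanese")

# The prompt must already be segmented into words with Juman++ (spaces between words).
prompt = "早稲田 大学 で 自然 言語 処理 を"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```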

### Preprocessing

The texts are normalized with zenhan, segmented into words with Juman++, and tokenized with SentencePiece. Juman++ 2.0.0-rc3 was used for pretraining.

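Purely as an illustration of this pipeline, the sketch below uses the `zenhan` package for normalization and `pyknp` as a Juman++ wrapper; the normalization direction and options used for pretraining are assumptions here, not documented settings.

```python
# Illustrative sketch of the preprocessing pipeline; assumes the zenhan and
# pyknp packages and a local Juman++ installation. The h2z normalization
# direction is an assumption, not a documented setting of this model.
import zenhan
from pyknp import Juman

jumanpp = Juman()  # wraps the locally installed Juman++ binary

def preprocess(text: str) -> str:
    normalized = zenhan.h2z(text)  # convert half-width characters to full-width
    words = [m.midasi for m in jumanpp.analysis(normalized).mrph_list()]  # Juman++ word segmentation
    return " ".join(words)  # space-separated words, as the model expects

print(preprocess("早稲田大学で自然言語処理を"))
```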

The model was trained on 8 NVIDIA A100 GPUs.