izumilab committed
Commit fbada5d
1 Parent(s): cd68869

add model card

Files changed (1): README.md (+72, -0)
README.md ADDED
@@ -0,0 +1,72 @@
---
language: ja
license: cc-by-sa-4.0
tags:
- finance
datasets:
- wikipedia
- securities reports
- summaries of financial results
widget:
- text: 流動[MASK]は、1億円となりました。
---

# BERT small Japanese finance

This is a [BERT](https://github.com/google-research/bert) model pretrained on Japanese texts.

The code for the pretraining is available at [retarfi/language-pretraining](https://github.com/retarfi/language-pretraining/tree/v1.0).
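
As a minimal usage sketch, the model can be queried with the `transformers` fill-mask pipeline. The repository ID below is an assumption (it is not stated in this card), and the tokenizer requires the `fugashi` and `ipadic` packages for MeCab.

```python
# Minimal fill-mask sketch; the model ID below is a placeholder, not taken
# from this card. BertJapaneseTokenizer needs `fugashi` and `ipadic` installed.
from transformers import pipeline

fill_mask = pipeline(
    "fill-mask",
    model="izumi-lab/bert-small-japanese-fin",  # hypothetical model ID
)

# The example sentence is the widget input from the card metadata.
for prediction in fill_mask("流動[MASK]は、1億円となりました。"):
    print(prediction["token_str"], prediction["score"])
```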

## Model architecture

The model architecture is the same as BERT small in the [original ELECTRA paper](https://arxiv.org/abs/2003.10555): 12 layers, a hidden size of 256, and 4 attention heads.
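
For reference, here is a sketch of these hyperparameters expressed as a `transformers` `BertConfig`. Only the values stated in this card (plus the vocabulary size from the Tokenization section) are grounded; `intermediate_size=1024` follows the ELECTRA small configuration and is an assumption, and everything else is left at library defaults.

```python
# Sketch of the stated architecture as a BertConfig; values not given in the
# card (e.g. intermediate_size) are assumptions or library defaults.
from transformers import BertConfig

config = BertConfig(
    vocab_size=32768,         # from the Tokenization section below
    hidden_size=256,
    num_hidden_layers=12,
    num_attention_heads=4,
    intermediate_size=1024,   # assumption: ELECTRA small feed-forward size
)
print(config)
```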

## Training Data

The models are trained on the Japanese version of Wikipedia. The training corpus is generated from the Wikipedia dump file as of June 1, 2021.

The corpus file is 2.9GB, consisting of approximately 20M sentences.

## Tokenization

The texts are first tokenized by MeCab with the IPA dictionary and then split into subwords by the WordPiece algorithm.

The vocabulary size is 32,768.
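
A sketch of this two-step tokenization using `BertJapaneseTokenizer`, assuming the tokenizer is distributed in that format with MeCab word segmentation and WordPiece subwords; the model ID is again a placeholder.

```python
# Tokenization sketch: MeCab (IPA dictionary) segmentation, then WordPiece.
# The model ID is a placeholder; `fugashi` and `ipadic` must be installed.
from transformers import BertJapaneseTokenizer

tokenizer = BertJapaneseTokenizer.from_pretrained(
    "izumi-lab/bert-small-japanese-fin"  # hypothetical model ID
)

print(tokenizer.tokenize("流動資産は、1億円となりました。"))
print(tokenizer.vocab_size)  # expected to be 32768 per this card
```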

## Training

The models are trained with the same configuration as BERT small in the [original ELECTRA paper](https://arxiv.org/abs/2003.10555): 128 tokens per instance, 128 instances per batch, and 1.45M training steps.
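
As a back-of-the-envelope check (not from the card), these numbers imply roughly 23.8B tokens processed over the course of pretraining:

```python
# Rough token budget implied by the stated pretraining configuration.
tokens_per_instance = 128
instances_per_batch = 128
training_steps = 1_450_000

total_tokens = tokens_per_instance * instances_per_batch * training_steps
print(f"{total_tokens:,}")  # 23,756,800,000 tokens (~23.8B)
```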

## Citation

**Another paper on this pretrained model is forthcoming. Be sure to check here again when you cite.**
```
@inproceedings{bert_electra_japanese,
  title = {Construction and Validation of a Pre-Trained Language Model Using Financial Documents},
  author = {Masahiro Suzuki and Hiroki Sakaji and Masanori Hirano and Kiyoshi Izumi},
  month = {oct},
  year = {2021},
  booktitle = {Proceedings of JSAI Special Interest Group on Financial Informatics (SIG-FIN) 27}
}
```

## Licenses

The pretrained models are distributed under the terms of the [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/).

## Acknowledgments

This work was supported by JSPS KAKENHI Grant Number JP21K12010.