megagonlabs
/

transformers-ud-japanese-electra-base-ginza

Inference Endpoints

Model card Files Files and versions Community

hiroshi-matsuda-rit commited on Aug 23, 2021

Commit

5073cca

•

1 Parent(s): 1665f8c

Update README.md

Files changed (1) hide show

README.md +1 -1

README.md CHANGED Viewed

@@ -9,7 +9,7 @@ datasets:
 This is an [ELECTRA](https://github.com/google-research/electra) model pretrained on approximately 200M Japanese sentences extracted from the [mC4](https://huggingface.co/datasets/mc4) and finetuned by [spaCy v3](https://spacy.io/usage/v3) on [UD\_Japanese\_BCCWJ r2.8](https://universaldependencies.org/treebanks/ja_bccwj/index.html).
-The base pretrain model is [megagonlabs/transformers-ud-japanese-electra-base-discrimininator](https://huggingface.co/megagonlabs/transformers-ud-japanese-electra-base-discriminator).
 The entire spaCy v3 model is distributed as a python package named [`ja_ginza_electra`](https://pypi.org/project/ja-ginza-electra/) from PyPI along with [`GiNZA v5`](https://github.com/megagonlabs/ginza) which provides some custom pipeline components to recognize the Japanese bunsetu-phrase structures.
 Try running it as below:

 This is an [ELECTRA](https://github.com/google-research/electra) model pretrained on approximately 200M Japanese sentences extracted from the [mC4](https://huggingface.co/datasets/mc4) and finetuned by [spaCy v3](https://spacy.io/usage/v3) on [UD\_Japanese\_BCCWJ r2.8](https://universaldependencies.org/treebanks/ja_bccwj/index.html).
+The base pretrain model is [megagonlabs/transformers-ud-japanese-electra-base-discrimininator](https://huggingface.co/megagonlabs/transformers-ud-japanese-electra-base-discriminator), and requires [SudachiTra](https://github.com/WorksApplications/SudachiTra) for tokenization.
 The entire spaCy v3 model is distributed as a python package named [`ja_ginza_electra`](https://pypi.org/project/ja-ginza-electra/) from PyPI along with [`GiNZA v5`](https://github.com/megagonlabs/ginza) which provides some custom pipeline components to recognize the Japanese bunsetu-phrase structures.
 Try running it as below: