t5-base-japanese-web (with Byte-fallback, 8K)

Description

megagonlabs/t5-base-japanese-web is a T5 (Text-to-Text Transfer Transformer) model pre-trained on Japanese web texts.
The training code is available on GitHub.

The vocabulary size of this model is 8K. A 32K version is also available.
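
As a minimal sketch of how the model might be loaded with the Hugging Face `transformers` library: the model ID below is copied from the description above, and the 8K checkpoint may be published under a different repository ID, so treat it as an assumption. The helper name `load_t5` is hypothetical, and the `sentencepiece` package is assumed to be installed for the tokenizer.

```python
# Model ID as written in the description; the exact repository ID of the
# 8K checkpoint is an assumption and should be checked on the Hub.
DEFAULT_MODEL_ID = "megagonlabs/t5-base-japanese-web"


def load_t5(model_id: str = DEFAULT_MODEL_ID):
    """Load the tokenizer and the T5 model from the Hugging Face Hub.

    Hypothetical helper for illustration; it downloads the files on
    first use and therefore needs network access.
    """
    # Imported lazily so the module can be imported even when the heavy
    # transformers dependency is not installed.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)
    return tokenizer, model
```

After `tokenizer, model = load_t5()`, `len(tokenizer)` reports the vocabulary size including special tokens. Because this is a pre-trained checkpoint, it would normally be fine-tuned on a downstream task before its generated text is useful.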

Corpora

We used the following corpora for pre-training: mC4 and wiki40b (see Citations).

Tokenizer

We used Japanese Wikipedia to train SentencePiece.
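
The byte-fallback behaviour named in the title can be illustrated with a toy sketch. With byte fallback enabled, SentencePiece splits a piece that is not in the vocabulary into its UTF-8 bytes, written as `<0xXX>` tokens, instead of mapping it to an unknown token. The code below is not SentencePiece itself: it segments by single characters rather than by learned subwords, and the function names and the tiny vocabulary are hypothetical.

```python
def tokenize_with_byte_fallback(text, vocab):
    """Toy character-level tokenizer with byte fallback.

    Characters found in `vocab` are kept as pieces; any other character
    is replaced by one <0xXX> piece per UTF-8 byte.
    """
    pieces = []
    for ch in text:
        if ch in vocab:
            pieces.append(ch)
        else:
            pieces.extend(f"<0x{b:02X}>" for b in ch.encode("utf-8"))
    return pieces


def detokenize(pieces):
    """Invert tokenize_with_byte_fallback by reassembling the bytes."""
    buf = bytearray()
    for piece in pieces:
        if len(piece) == 6 and piece.startswith("<0x") and piece.endswith(">"):
            buf.append(int(piece[3:5], 16))  # byte piece such as <0xE9>
        else:
            buf.extend(piece.encode("utf-8"))  # ordinary vocabulary piece
    return buf.decode("utf-8")


# Hypothetical tiny vocabulary that lacks the rare kanji 鮨 (U+9BA8).
vocab = {"を", "食", "べ", "る"}
pieces = tokenize_with_byte_fallback("鮨を食べる", vocab)
print(pieces)
# ['<0xE9>', '<0xAE>', '<0xA8>', 'を', '食', 'べ', 'る']
print(detokenize(pieces))
# 鮨を食べる
```

This is why a small 8K vocabulary can still encode any input losslessly: rare characters cost several byte tokens each, but nothing is lost to an unknown token.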

Parameters

Pre-training took about 126 hours on a TPU v3-8.

Related models

License

Apache License 2.0

Citations

  • mC4

Contains information from mC4 which is made available under the ODC Attribution License.

@article{2019t5,
    author = {Colin Raffel and Noam Shazeer and Adam Roberts and Katherine Lee and Sharan Narang and Michael Matena and Yanqi Zhou and Wei Li and Peter J. Liu},
    title = {Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer},
    journal = {arXiv e-prints},
    year = {2019},
    archivePrefix = {arXiv},
    eprint = {1910.10683},
}

  • wiki40b

@inproceedings{49029,
    title = {Wiki-40B: Multilingual Language Model Dataset},
    author = {Mandy Guo and Zihang Dai and Denny Vrandecic and Rami Al-Rfou},
    year = {2020},
    booktitle = {LREC 2020},
}