---
license: cc-by-nc-sa-4.0
language:
- zh
- ja
- en
tags:
- translation
widget:
- text: "ja2zh: 吾輩は猫である。名前はまだ無い。"
---

# Model Card for mt5-zh-ja-en-trimmed

# Model Details

## Model Description

More information needed

- **Developed by:** K024
- **Shared by [Optional]:** K024
- **Model type:** Translation
- **Language(s) (NLP):** Japanese, Chinese, English
- **License:** [![CC BY-NC-SA 4.0](https://licensebuttons.net/l/by-nc-sa/4.0/88x31.png)](https://creativecommons.org/licenses/by-nc-sa/4.0/)
- **Parent Model:** [mt5-base](https://huggingface.co/google/mt5-base)
- **Resources for more information:**
  - [mT5 GitHub Repo](https://github.com/google-research/multilingual-t5)
  - [Associated Paper](https://arxiv.org/abs/2010.11934)

# Uses

## Direct Use

This model can be used for the task of translation between Chinese, Japanese, and English.

## Downstream Use [Optional]

More information needed.

## Out-of-Scope Use

The model should not be used to intentionally create hostile or alienating environments for people.

# Bias, Risks, and Limitations

Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.

## Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

# Training Details

## Training Data

The model vocabulary is trimmed to about one third of the original mT5 vocabulary by keeping the 85,000 tokens that appear most frequently in the training data. The code to trim the vocabulary can be found [here](https://gist.github.com/K024/4a100a0f4f4b07208958e0f3244da6ad); a minimal sketch of the idea follows the corpus list below.

The training data consists of the following parallel corpora:

```
wikimedia-en-ja
wikimedia-en-zh
wikimedia-ja-zh
wikititles-ja-en
wikititles-zh-en
wikimatrix-ja-zh
news-commentary-en-ja
news-commentary-en-zh
news-commentary-ja-zh
ted2020-en-ja
ted2020-en-zh
ted2020-ja-zh
```
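For reference, here is a minimal sketch of the trimming idea, not the authoritative implementation (see the gist linked above). It assumes a hypothetical `corpus_lines` iterable over the training text, takes the 85,000-token budget from this card, and omits rebuilding the SentencePiece vocabulary to match the new token ids.

```python
from collections import Counter

import torch
from transformers import MT5ForConditionalGeneration, T5Tokenizer

model = MT5ForConditionalGeneration.from_pretrained("google/mt5-base")
tokenizer = T5Tokenizer.from_pretrained("google/mt5-base")

# 1. Count token-id frequencies over the training data.
#    `corpus_lines` is a hypothetical iterable of training sentences.
counts = Counter()
for line in corpus_lines:
    counts.update(tokenizer.encode(line))

# 2. Keep the special tokens plus the most frequent ids (85000 total here).
keep = 85000
special = set(tokenizer.all_special_ids)
frequent = [i for i, _ in counts.most_common() if i not in special]
kept_ids = sorted(special | set(frequent[: keep - len(special)]))

# 3. Slice the input embedding and the (untied) LM head down to the kept rows.
d_model = model.config.d_model
new_emb = torch.nn.Embedding(len(kept_ids), d_model)
new_emb.weight.data = model.get_input_embeddings().weight.data[kept_ids].clone()
model.set_input_embeddings(new_emb)

new_head = torch.nn.Linear(d_model, len(kept_ids), bias=False)
new_head.weight.data = model.lm_head.weight.data[kept_ids].clone()
model.lm_head = new_head
model.config.vocab_size = len(kept_ids)

# The SentencePiece model must also be rebuilt so that token ids map to the
# new, smaller embedding table; that step is omitted from this sketch.
```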
## Training Procedure

### Preprocessing

More information needed

### Speeds, Sizes, Times

This model is finetuned from [mt5-base](https://huggingface.co/google/mt5-base).

# Evaluation

## Testing Data, Factors & Metrics

### Testing Data

More information needed

### Factors

More information needed

### Metrics

More information needed

## Results

More information needed

# Model Examination

More information needed

# Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** More information needed
- **Hours used:** More information needed
- **Cloud Provider:** More information needed
- **Compute Region:** More information needed
- **Carbon Emitted:** More information needed

# Technical Specifications [optional]

## Model Architecture and Objective

More information needed

## Compute Infrastructure

More information needed

### Hardware

More information needed

### Software

More information needed

# Citation

**BibTeX:**

```bibtex
@misc{xue2020mt5,
  doi       = {10.48550/ARXIV.2010.11934},
  url       = {https://arxiv.org/abs/2010.11934},
  author    = {Xue, Linting and Constant, Noah and Roberts, Adam and Kale, Mihir and Al-Rfou, Rami and Siddhant, Aditya and Barua, Aditya and Raffel, Colin},
  keywords  = {Computation and Language (cs.CL), FOS: Computer and information sciences},
  title     = {mT5: A massively multilingual pre-trained text-to-text transformer},
  publisher = {arXiv},
  year      = {2020},
  copyright = {arXiv.org perpetual, non-exclusive license}
}
```

# Glossary [optional]

More information needed

# More Information [optional]

More information needed

# Model Card Authors [optional]

K024 in collaboration with Ezi Ozoani and the Hugging Face team

# Model Card Contact

More information needed

# How to Get Started with the Model

Use the code below to get started with the model.
```python
from transformers import (
    T5Tokenizer,
    MT5ForConditionalGeneration,
    Text2TextGenerationPipeline,
)

path = "K024/mt5-zh-ja-en-trimmed"
pipe = Text2TextGenerationPipeline(
    model=MT5ForConditionalGeneration.from_pretrained(path),
    tokenizer=T5Tokenizer.from_pretrained(path),
)

# The translation direction is selected by the input prefix, e.g. "ja2zh:".
sentence = "ja2zh: 吾輩は猫である。名前はまだ無い。"
res = pipe(sentence, max_length=100, num_beams=4)
print(res[0]["generated_text"])
```
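Only the `ja2zh:` prefix is confirmed by this card's widget example. Given the model's name, the other directions presumably follow the same `src2tgt` pattern (e.g. `zh2ja:`, `en2zh:`), but treat that as an assumption to verify. Continuing from the snippet above:

```python
# Assumption: direction prefixes other than "ja2zh:" follow the same src2tgt
# naming; only "ja2zh:" is confirmed by the card's widget example.
inputs = [
    "ja2zh: 吾輩は猫である。名前はまだ無い。",
    "zh2ja: 我是一只猫,还没有名字。",  # assumed prefix
]
for src in inputs:
    out = pipe(src, max_length=100, num_beams=4)
    print(out[0]["generated_text"])
```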