output

This model is a fine-tuned version of google/mt5-base, trained on the x-tech/cantonese-mandarin-translations dataset.

Model description

The model translates Mandarin sentences to Cantonese.

Intended uses & limitations

When using the model, prepend the task prefix to the text you want to translate, in the form translate mandarin to cantonese: <sentence> (note the space after the colon).
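The prefix convention above can be sketched as follows. This is a minimal illustration, not the authors' own inference code: the Hub ID botisan-ai/mt5-translate-zh-yue is assumed to be this model's repository, the helper names build_prompt and translate are hypothetical, and max_length=64 is an arbitrary choice.

```python
PREFIX = "translate mandarin to cantonese: "  # note the trailing space after the colon


def build_prompt(sentence: str) -> str:
    """Prepend the task prefix to a Mandarin sentence."""
    return PREFIX + sentence.strip()


def translate(sentence: str,
              model_id: str = "botisan-ai/mt5-translate-zh-yue",  # assumed Hub ID
              max_length: int = 64) -> str:
    """Translate one sentence; downloads the model on first use."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(build_prompt(sentence), return_tensors="pt")
    output_ids = model.generate(**inputs, max_length=max_length)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling translate("你在做什么？") would send the string "translate mandarin to cantonese: 你在做什么？" to the model; the output depends on the checkpoint and is not shown here.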

Training and evaluation data

Training Dataset: x-tech/cantonese-mandarin-translations

Training procedure

Training is based on an example in the transformers library.

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3.0
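For readers who want to reproduce a similar setup, the listed values map roughly onto keyword arguments of Seq2SeqTrainingArguments in transformers, as sketched below. Several points are assumptions: that train_batch_size and eval_batch_size correspond to the per-device settings, that no warmup was used (none is listed), and the output_dir name, which is hypothetical. The linear_lr helper only illustrates how a linear schedule without warmup decays the learning rate; it is not the library's implementation.

```python
# Hyperparameters from the list above, keyed by TrainingArguments parameter names.
TRAINING_KWARGS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 1,  # assumed to match train_batch_size
    "per_device_eval_batch_size": 8,   # assumed to match eval_batch_size
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3.0,
}


def make_training_args(output_dir: str = "mt5-translate-zh-yue-run"):  # hypothetical path
    """Build training arguments; requires transformers to be installed."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


def linear_lr(step: int, total_steps: int, base_lr: float = 5e-5) -> float:
    """Learning rate under a linear decay to zero, assuming zero warmup steps."""
    return base_lr * max(0.0, (total_steps - step) / total_steps)
```

Under these assumptions the learning rate starts at 5e-05, is halved by the midpoint of training, and reaches zero at the final step.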

Training results

We have not yet set up a validation set, so there are no training results to report.

Framework versions

  • Transformers 4.12.5
  • PyTorch 1.8.1
  • Datasets 1.15.1
  • Tokenizers 0.10.3

Model size

582M params (Safetensors, tensor type F32)

Finetuned from: google/mt5-base

Dataset used to train botisan-ai/mt5-translate-zh-yue: x-tech/cantonese-mandarin-translations