jq/nllb-1.3B-many-to-many-pronouncorrection-charaug

	---
	tags:
	- generated_from_trainer
	base_model: jq/nllb-1.3B-many-to-many-step-2k
	datasets:
	- generator
	model-index:
	- name: nllb-1.3B-many-to-many-pronouncorrection-charaug
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# nllb-1.3B-many-to-many-pronouncorrection-charaug

	This model is a fine-tuned version of [jq/nllb-1.3B-many-to-many-step-2k](https://huggingface.co/jq/nllb-1.3B-many-to-many-step-2k) on the generator dataset.
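	It achieves the following results on the evaluation set. The figures match the last logged evaluation (step 700), and each BLEU entry names a language pair by ISO 639-3 code (ach, lgg, lug, nyn, teo, eng):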
	- Loss: 1.2075
	- Bleu Ach Eng: 28.371
	- Bleu Lgg Eng: 30.45
	- Bleu Lug Eng: 41.978
	- Bleu Nyn Eng: 32.296
	- Bleu Teo Eng: 30.422
	- Bleu Eng Ach: 20.972
	- Bleu Eng Lgg: 22.362
	- Bleu Eng Lug: 30.359
	- Bleu Eng Nyn: 15.305
	- Bleu Eng Teo: 21.391
	- Bleu Mean: 27.391

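The reported Bleu Mean is consistent with an unweighted average of the ten per-direction scores. The sketch below checks that arithmetic. It assumes (the card does not say so) that the mean was computed this way, and the dictionary keys and the `mean_bleu` helper are illustrative names only.

```python
# Per-direction BLEU scores copied from the evaluation results listed above.
# Keys are illustrative labels built from the metric names in the card.
bleu_scores = {
    "ach_eng": 28.371,
    "lgg_eng": 30.45,
    "lug_eng": 41.978,
    "nyn_eng": 32.296,
    "teo_eng": 30.422,
    "eng_ach": 20.972,
    "eng_lgg": 22.362,
    "eng_lug": 30.359,
    "eng_nyn": 15.305,
    "eng_teo": 21.391,
}


def mean_bleu(scores):
    """Unweighted arithmetic mean of a mapping of BLEU scores."""
    return sum(scores.values()) / len(scores)


# Overall mean across all ten pairs, rounded like the card's figures.
bleu_mean = round(mean_bleu(bleu_scores), 3)

# Separate means for pairs ending in "eng" and pairs starting with "eng".
ending_in_eng = {k: v for k, v in bleu_scores.items() if k.endswith("_eng")}
starting_with_eng = {k: v for k, v in bleu_scores.items() if k.startswith("eng_")}

print(bleu_mean)  # 27.391
print(round(mean_bleu(ending_in_eng), 3), round(mean_bleu(starting_with_eng), 3))
```

The overall value reproduces the 27.391 listed in the card. The two sub-means are a convenience for comparing the two groups of pairs and are not reported by the card itself.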
	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0003
	- train_batch_size: 25
	- eval_batch_size: 25
	- seed: 42
	- gradient_accumulation_steps: 120
	- total_train_batch_size: 3000
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- training_steps: 1500
	- mixed_precision_training: Native AMP

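As a rough sketch, the hyperparameters above can be written as keyword arguments and checked for internal consistency. The key names follow the argument names of `transformers.TrainingArguments`, which is an assumption about how the run was configured. The single-device count, the reading of "Native AMP" as `fp16`, and zero warmup steps are also assumptions, since the card does not state them. `effective_batch_size` and `linear_lr` are hypothetical helpers written for this illustration.

```python
# Values copied from the hyperparameter list; key names are an assumption
# (they mirror transformers.TrainingArguments argument names).
training_kwargs = {
    "learning_rate": 3e-4,
    "per_device_train_batch_size": 25,
    "per_device_eval_batch_size": 25,
    "seed": 42,
    "gradient_accumulation_steps": 120,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "max_steps": 1500,
    "fp16": True,  # assumption: "Native AMP" read as fp16 mixed precision
}

NUM_DEVICES = 1  # assumption: the card does not state the device count


def effective_batch_size(kwargs, num_devices=NUM_DEVICES):
    """Examples consumed per optimizer step."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step, base_lr=3e-4, total_steps=1500, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size(training_kwargs))  # 3000
print(linear_lr(0), linear_lr(700), linear_lr(1500))
```

Under these assumptions, 25 × 120 × 1 reproduces the listed total_train_batch_size of 3000, and the learning rate would fall linearly from 0.0003 at step 0 to zero at step 1500.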
	### Training results

	| Training Loss | Epoch  | Step | Validation Loss | Bleu Ach Eng | Bleu Lgg Eng | Bleu Lug Eng | Bleu Nyn Eng | Bleu Teo Eng | Bleu Eng Ach | Bleu Eng Lgg | Bleu Eng Lug | Bleu Eng Nyn | Bleu Eng Teo | Bleu Mean |
	|:-------------:|:------:|:----:|:---------------:|:------------:|:------------:|:------------:|:------------:|:------------:|:------------:|:------------:|:------------:|:------------:|:------------:|:---------:|
	| No log        | 0.0667 | 100  | 1.1541          | 29.033       | 31.47        | 41.596       | 34.169       | 32.442       | 19.677       | 19.657       | 27.889       | 14.554       | 19.143       | 26.963    |
	| No log        | 1.0301 | 200  | 1.1570          | 27.473       | 31.853       | 41.934       | 32.575       | 31.606       | 20.25        | 20.634       | 28.592       | 13.672       | 19.997       | 26.859    |
	| No log        | 1.0968 | 300  | 1.1288          | 29.086       | 33.257       | 43.387       | 33.678       | 33.579       | 20.377       | 20.91        | 28.906       | 14.992       | 21.013       | 27.919    |
	| No log        | 2.0603 | 400  | 1.1620          | 28.122       | 31.46        | 42.491       | 33.304       | 32.331       | 20.282       | 21.604       | 29.577       | 14.961       | 20.94        | 27.507    |
	| 0.7273        | 3.0237 | 500  | 1.1661          | 28.311       | 32.122       | 42.825       | 32.333       | 32.415       | 19.799       | 22.287       | 29.558       | 15.708       | 21.948       | 27.731    |
	| 0.7273        | 3.0904 | 600  | 1.1652          | 28.593       | 30.62        | 41.964       | 33.383       | 32.08        | 21.142       | 21.8         | 30.215       | 14.717       | 21.744       | 27.626    |
	| 0.7273        | 4.0538 | 700  | 1.2075          | 28.371       | 30.45        | 41.978       | 32.296       | 30.422       | 20.972       | 22.362       | 30.359       | 15.305       | 21.391       | 27.391    |

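The training results table can also be compared across evaluation steps. The sketch below copies the step, validation loss, and Bleu Mean columns from the table and finds the step that scores best on each. The `eval_log` name and its tuple layout are illustrative, and the card does not say which checkpoint was actually uploaded.

```python
# (step, validation_loss, bleu_mean) copied from the training results table.
eval_log = [
    (100, 1.1541, 26.963),
    (200, 1.1570, 26.859),
    (300, 1.1288, 27.919),
    (400, 1.1620, 27.507),
    (500, 1.1661, 27.731),
    (600, 1.1652, 27.626),
    (700, 1.2075, 27.391),
]

# Highest mean BLEU and lowest validation loss among the logged evaluations.
best_by_bleu = max(eval_log, key=lambda row: row[2])
best_by_loss = min(eval_log, key=lambda row: row[1])

print(best_by_bleu[0], best_by_loss[0])  # 300 300
```

Both criteria select step 300 (validation loss 1.1288, Bleu Mean 27.919), whereas the headline results in this card correspond to step 700.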

	### Framework versions

	- Transformers 4.40.1
	- PyTorch 2.2.0
	- Datasets 2.19.0
	- Tokenizers 0.19.1