kalese/opus-mt-en-bkm

	---
	license: apache-2.0
	base_model: Helsinki-NLP/opus-mt-en-ro
	tags:
	- generated_from_trainer
	datasets:
	- arrow
	metrics:
	- bleu
	model-index:
	- name: opus-mt-en-bkm
	  results:
	  - task:
	      name: Sequence-to-sequence Language Modeling
	      type: text2text-generation
	    dataset:
	      name: arrow
	      type: arrow
	      config: default
	      split: train
	      args: default
	    metrics:
	    - name: Bleu
	      type: bleu
	      value: 14.5684
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# opus-mt-en-bkm

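	This model is a fine-tuned version of [Helsinki-NLP/opus-mt-en-ro](https://huggingface.co/Helsinki-NLP/opus-mt-en-ro), an English-to-Romanian Marian model, trained on a dataset that the Trainer recorded only as `arrow`.
	After the final epoch (epoch 25), it achieves the following results on the evaluation set: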
	- Loss: 1.1597
	- Bleu: 14.5684
	- Gen Len: 58.4294
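
For readers who want to try the checkpoint, a minimal inference sketch follows. It assumes the repository id is `kalese/opus-mt-en-bkm`, that the checkpoint still loads with the Marian classes used by its base model, and that `transformers`, `torch`, and `sentencepiece` are installed. The helper name `translate` and the generation settings are illustrative choices, not values from this card.

```python
from typing import List


def translate(texts: List[str], model_id: str = "kalese/opus-mt-en-bkm") -> List[str]:
    """Translate English sentences with the fine-tuned checkpoint.

    Assumption: the checkpoint is hosted under `model_id` and loads with the
    Marian classes, like its base model Helsinki-NLP/opus-mt-en-ro.
    """
    # Imported lazily so this module can be read or imported without
    # transformers being installed.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_id)
    model = MarianMTModel.from_pretrained(model_id)

    # Pad to the longest sentence in the batch and return PyTorch tensors.
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)

    # max_new_tokens and num_beams are illustrative, not taken from the card.
    generated = model.generate(**batch, max_new_tokens=128, num_beams=4)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate(["Good morning."])` downloads the weights on first use, so it needs network access.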

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 25
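
The list above maps fairly directly onto `Seq2SeqTrainingArguments`. The following is a sketch of one configuration consistent with it, not the author's actual script: the output path, the single-device reading of the batch size, per-epoch evaluation, and `predict_with_generate` are assumptions inferred from the results table and are marked as such in the comments.

```python
# Hyperparameters copied from the card; entries marked "assumption" are not
# stated there.
TRAINING_KWARGS = {
    "output_dir": "opus-mt-en-bkm",     # assumption: hypothetical output path
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumption: one device, so this equals train_batch_size
    "per_device_eval_batch_size": 16,   # assumption: one device, so this equals eval_batch_size
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 25,
    # assumption: the results table has one row per epoch. The argument is
    # called `evaluation_strategy` in Transformers 4.39.3; newer releases
    # use `eval_strategy` instead.
    "evaluation_strategy": "epoch",
    # assumption: BLEU and Gen Len are computed on generated text.
    "predict_with_generate": True,
}


def build_training_args():
    """Build Seq2SeqTrainingArguments from TRAINING_KWARGS."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**TRAINING_KWARGS)
```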

	### Training results

	| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
	|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
	| 3.3983        | 1.0   | 974   | 1.9251          | 3.7894  | 60.1579 |
	| 1.9429        | 2.0   | 1948  | 1.6720          | 5.7083  | 58.6443 |
	| 1.7118        | 3.0   | 2922  | 1.5389          | 7.1977  | 58.8536 |
	| 1.5647        | 4.0   | 3896  | 1.4484          | 8.4631  | 57.9068 |
	| 1.4611        | 5.0   | 4870  | 1.3836          | 9.5314  | 59.3106 |
	| 1.3735        | 6.0   | 5844  | 1.3357          | 10.1879 | 59.5501 |
	| 1.3078        | 7.0   | 6818  | 1.3014          | 10.9172 | 59.4968 |
	| 1.245         | 8.0   | 7792  | 1.2737          | 11.445  | 59.585  |
	| 1.2048        | 9.0   | 8766  | 1.2485          | 11.9346 | 58.3275 |
	| 1.1648        | 10.0  | 9740  | 1.2298          | 12.3049 | 58.7768 |
	| 1.1272        | 11.0  | 10714 | 1.2176          | 12.7287 | 58.1549 |
	| 1.086         | 12.0  | 11688 | 1.2043          | 13.0962 | 59.2217 |
	| 1.0595        | 13.0  | 12662 | 1.1973          | 13.3375 | 58.6736 |
	| 1.0343        | 14.0  | 13636 | 1.1844          | 13.3963 | 58.2763 |
	| 1.0174        | 15.0  | 14610 | 1.1797          | 13.7067 | 58.1738 |
	| 0.9923        | 16.0  | 15584 | 1.1757          | 13.9467 | 59.3246 |
	| 0.9703        | 17.0  | 16558 | 1.1704          | 14.1023 | 58.9813 |
	| 0.9589        | 18.0  | 17532 | 1.1663          | 14.2842 | 58.401  |
	| 0.9472        | 19.0  | 18506 | 1.1662          | 14.2109 | 58.4796 |
	| 0.9262        | 20.0  | 19480 | 1.1635          | 14.3872 | 58.1601 |
	| 0.9147        | 21.0  | 20454 | 1.1606          | 14.4983 | 58.7417 |
	| 0.9162        | 22.0  | 21428 | 1.1630          | 14.5229 | 58.4345 |
	| 0.9012        | 23.0  | 22402 | 1.1607          | 14.6204 | 58.0767 |
	| 0.899         | 24.0  | 23376 | 1.1600          | 14.5681 | 58.4357 |
	| 0.8934        | 25.0  | 24350 | 1.1597          | 14.5684 | 58.4294 |
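
A few quantities can be read off this table with simple arithmetic. The sketch below derives the range of training-set sizes consistent with 974 steps per epoch, the learning rate under linear decay, and the epochs with the best BLEU and the lowest validation loss. It assumes a single device, no gradient accumulation, a kept final partial batch, and zero warmup steps, none of which the card states. Only the last five rows are copied in, since both the best BLEU and the lowest loss fall among them.

```python
STEPS_PER_EPOCH = 974   # step count at epoch 1.0
NUM_EPOCHS = 25
TRAIN_BATCH_SIZE = 16
BASE_LR = 2e-5
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # matches the final step, 24350

# (epoch, validation loss, BLEU) for the last five rows of the table.
LAST_ROWS = [
    (21, 1.1606, 14.4983),
    (22, 1.1630, 14.5229),
    (23, 1.1607, 14.6204),
    (24, 1.1600, 14.5681),
    (25, 1.1597, 14.5684),
]


def train_set_size_bounds(steps_per_epoch: int, batch_size: int) -> tuple:
    """Smallest and largest n with ceil(n / batch_size) == steps_per_epoch.

    Assumption: one device, no gradient accumulation, last partial batch kept.
    """
    return (steps_per_epoch - 1) * batch_size + 1, steps_per_epoch * batch_size


def linear_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate under linear decay to zero. Assumption: no warmup."""
    return base_lr * max(0.0, 1.0 - step / total_steps)


best_bleu_epoch = max(LAST_ROWS, key=lambda row: row[2])[0]
lowest_loss_epoch = min(LAST_ROWS, key=lambda row: row[1])[0]

print(train_set_size_bounds(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE))
print(best_bleu_epoch, lowest_loss_epoch)
```

Under these assumptions the training set holds between 15,569 and 15,584 examples. BLEU peaks at epoch 23 (14.6204), while validation loss is lowest at epoch 25 (1.1597), so the reported final checkpoint is marginally below the best BLEU seen during training.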


	### Framework versions

	- Transformers 4.39.3
	- Pytorch 2.2.1+cu121
	- Datasets 2.18.0
	- Tokenizers 0.15.2
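
To compare a local environment against the versions listed above, a small standard-library check can help. This is a sketch: the helper name `version_mismatches` is hypothetical, the distribution names are assumed to be the usual PyPI ones, and the `+cu121` suffix only matches a CUDA 12.1 build of PyTorch, so a mismatch on `torch` is not necessarily a problem.

```python
from importlib import metadata

# Versions listed in the card, keyed by the assumed PyPI distribution name.
EXPECTED_VERSIONS = {
    "transformers": "4.39.3",
    "torch": "2.2.1+cu121",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}


def version_mismatches(expected=EXPECTED_VERSIONS):
    """Return {name: (expected, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches
```

An empty dictionary from `version_mismatches()` means the environment matches the card exactly.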