
	---
	language:
	- en
	- is
	- multilingual
	tags:
	- translation
	inference:
	  parameters:
	    src_lang: en_XX
	    tgt_lang: is_IS
	    decoder_start_token_id: 2
	    max_length: 512
	widget:
	- text: I once owned a horse. It was black and white.
	---
	# mBART-based translation model
	This model was trained to translate multiple sentences at once, rather than one sentence at a time.

	It will occasionally combine sentences or add an extra sentence.
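
Because the output may contain merged or extra sentences, it can help to flag translations whose sentence count differs from the input. A minimal sketch, assuming a naive punctuation-based splitter; `count_sentences` and `sentence_count_mismatch` are hypothetical helpers and not part of the model or of `transformers`:

```python
import re

# Assumption for illustration: a sentence ends at '.', '!' or '?' followed by
# whitespace or the end of the text. Abbreviations such as "e.g." are miscounted.
_SENTENCE_END = re.compile(r"[.!?]+(?:\s+|$)")


def count_sentences(text: str) -> int:
    """Return a rough sentence count for `text`."""
    text = text.strip()
    if not text:
        return 0
    count = len(_SENTENCE_END.findall(text))
    # Trailing text without final punctuation still counts as one sentence.
    if not re.search(r"[.!?]\s*$", text):
        count += 1
    return count


def sentence_count_mismatch(source: str, translation: str) -> bool:
    """True when source and translation appear to differ in sentence count."""
    return count_sentences(source) != count_sentences(translation)
```

A mismatch is only a hint that the output deserves a closer look; it does not by itself show that the translation is wrong.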

	This is the same model as the one provided on CLARIN: https://repository.clarin.is/repository/xmlui/handle/20.500.12537/278

	You can use the following example to get started (note that the model's `decoder_start_token_id` must be set explicitly):

	from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
	import torch

	# Use the current GPU if one is available, otherwise run on the CPU (-1).
	device = torch.cuda.current_device() if torch.cuda.is_available() else -1

	tokenizer = AutoTokenizer.from_pretrained("mideind/nmt-doc-en-is-2022-10", src_lang="en_XX", tgt_lang="is_IS")

	model = AutoModelForSeq2SeqLM.from_pretrained("mideind/nmt-doc-en-is-2022-10")
	# The decoder start token has to be overridden for this model.
	model.config.decoder_start_token_id = 2

	translate = pipeline("translation_XX_to_YY", model=model, tokenizer=tokenizer, device=device, src_lang="en_XX", tgt_lang="is_IS")

	target_seq = translate("I am using a translation model to translate text from English to Icelandic.", src_lang="en_XX", tgt_lang="is_IS", max_length=128)
	# strip('YY ') removes any 'Y' characters and spaces from both ends of the output,
	# so a translation that itself starts or ends with 'Y' is clipped as well.
	print(target_seq[0]['translation_text'].strip('YY '))
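
Since the model translates several sentences per call and the output length is capped by `max_length`, longer documents need to be split into multi-sentence chunks first. A minimal sketch, assuming the sentences are already split and that a whitespace word count is an acceptable stand-in for the model's subword length; `chunk_sentences` and its `max_units` budget are hypothetical and not part of the model or of `transformers`:

```python
from typing import Callable, List


def chunk_sentences(
    sentences: List[str],
    max_units: int,
    length_fn: Callable[[str], int] = lambda s: len(s.split()),
) -> List[str]:
    """Greedily pack consecutive sentences into chunks of at most `max_units`.

    A single sentence longer than the budget is kept as its own chunk.
    """
    chunks: List[str] = []
    current: List[str] = []
    current_len = 0
    for sentence in sentences:
        n = length_fn(sentence)
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and current_len + n > max_units:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk can then be passed to the `translate` pipeline. For a budget measured in model tokens rather than words, `length_fn` could be `lambda s: len(tokenizer(s)["input_ids"])` with the tokenizer loaded in the example.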