ychenNLP/nllb-200-3.3B-easyproject

text2text-generation

ychenNLP committed on Apr 19, 2023

Commit 7fb5df0 · 1 Parent(s): a2b9149

Update README.md

Files changed (1)

README.md +27 -0

README.md CHANGED

@@ -1,3 +1,30 @@
 ---
 license: mit
 ---

+```python
+from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
+import torch
+from tqdm import tqdm  # progress bar used by the translation loop
+tokenizer = AutoTokenizer.from_pretrained(
+        "facebook/nllb-200-distilled-600M",  src_lang="eng_Latn")
+print("Loading model")
+model = AutoModelForSeq2SeqLM.from_pretrained("ychenNLP/nllb-200-3.3b-ep")
+model.cuda()
+input_chunks = ["A translator always risks inadvertently introducing source-language words, grammar, or syntax into the target-language rendering."]
+print("Start translation...")
+output_result = []
+batch_size = 8  # assumed value; the original snippet never defines batch_size
+for idx in tqdm(range(0, len(input_chunks), batch_size)):
+    start_idx = idx
+    end_idx = idx + batch_size
+    inputs = tokenizer(input_chunks[start_idx: end_idx], padding=True, truncation=True, max_length=128, return_tensors="pt").to('cuda')
+    with torch.no_grad():
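+        # convert_tokens_to_ids is used because tokenizer.lang_code_to_id is deprecated in recent transformers releases
+        translated_tokens = model.generate(**inputs, forced_bos_token_id=tokenizer.convert_tokens_to_ids("zho_Hans"),
+                        max_length=128, num_beams=5, num_return_sequences=1, early_stopping=True)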
+    output = tokenizer.batch_decode(translated_tokens, skip_special_tokens=True)
+    output_result.extend(output)
+```
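
The loop in the README snippet walks `input_chunks` in steps of `batch_size` and slices out one batch per iteration. As a minimal sketch of that slicing on its own, with no model or GPU involved, the helper below (`iter_batches` is a hypothetical name, not part of this repository or of transformers) yields the same slices the loop would pass to the tokenizer:

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of `items` holding at most `batch_size` elements.

    Hypothetical helper for illustration; it mirrors the
    input_chunks[start_idx:end_idx] slicing used in the README loop.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), batch_size):
        # Slicing past the end of a list is safe, so the last batch may be shorter.
        yield items[start:start + batch_size]


# Placeholder sentences, used only to show how the batches are formed.
chunks = ["sentence one", "sentence two", "sentence three"]
batches = list(iter_batches(chunks, batch_size=2))
```

With three inputs and `batch_size=2`, the final batch holds a single sentence, which is why the README loop relies on `padding=True` per batch rather than on a fixed batch shape.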