---
Language Pair Finetuned:
- en-mr

Metrics:
- sacrebleu
- WAT 2021: 16.11
---

# mbart-large-finetuned-en-mr

## Model Description
This is the mbart-large-50 model fine-tuned on an English-Marathi (En-Mr) parallel corpus.

## Intended uses and limitations
This model is most useful for English-to-Marathi translation, but the underlying mbart-large-50 model also supports other language pairs.

### How to use
```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

model = MBartForConditionalGeneration.from_pretrained("shivam/mbart-large-50-finetuned-en-mr")
tokenizer = MBart50TokenizerFast.from_pretrained("shivam/mbart-large-50-finetuned-en-mr", src_lang="en_XX", tgt_lang="mr_IN")

english_input_sentence = "The Prime Minister said that cleanliness, or Swachhta, is one of the most important aspects of preventive healthcare."
model_inputs = tokenizer(english_input_sentence, return_tensors="pt")

# Force the decoder to start with the Marathi language token.
generated_tokens = model.generate(
    **model_inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["mr_IN"]
)
marathi_output_sentence = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)

print(marathi_output_sentence)
# ['स्वच्छता हा प्रतिबंधात्मक आरोग्य सेवेतील सर्वात महत्त्वाचा पैलू आहे, असे पंतप्रधान म्हणाले.']
# English: "Cleanliness is the most important aspect of preventive healthcare, the Prime Minister said."
```
#### Limitations
The model was trained on Google Colab; because full training is very time-consuming, it was trained for only a short time and a small number of epochs, which may limit translation quality.

## Eval results
sacreBLEU on WAT 2021 (En-Mr): 16.11