parsi-ai-nlpclass/PersianEase

PardisSzah committed on Mar 3

Commit 9208d25 • 1 Parent(s): 6684d78

Update README.md

Files changed (1)

README.md +55 -0

@@ -1,3 +1,58 @@
 ---
+language: fa
 license: mit
+pipeline_tag: text2text-generation
 ---
+# PersianEase
+This model converts formal Persian text into informal (colloquial) Persian. It was fine-tuned on the Mohavere Dataset (Takalli, Vahideh; Kalantari, Fateme; Shamsfard, Mehrnoush, *Developing an Informal-Formal Persian Corpus*, 2022), starting from the pretrained model [persian-t5-formality-transfer](https://huggingface.co/erfan226/persian-t5-formality-transfer).
+## Evaluation Metrics
+| Metric               | Basic Model | Base Persian T5 | Our Model   |
+|----------------------|-------------|-----------------|-------------|
+| BLEU-1               | 0.524       | 0.212           | **0.636**   |
+| BLEU-2               | 0.358       | 0.137           | **0.511**   |
+| BLEU-3               | 0.254       | 0.096           | **0.416**   |
+| BLEU-4               | 0.18        | 0.068           | **0.337**   |
+| BERTScore Precision  | 0.671       | 0.537           | **0.797**   |
+| BERTScore Recall     | 0.712       | 0.570           | **0.805**   |
+| BERTScore F1 Score   | 0.690       | 0.549           | **0.800**   |
+| ROUGE-1 F1 Score     | 0.553       | -               | **0.645**   |
+| ROUGE-2 F1 Score     | 0.274       | -               | **0.427**   |
+| ROUGE-L F1 Score     | 0.522       | -               | **0.628**   |
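The card does not say which implementation or tokenization produced the BLEU rows in the table above. As a rough illustration of what a BLEU-n score measures, the following sketch assumes cumulative n-gram BLEU with uniform weights, a brevity penalty, whitespace tokenization, and no smoothing. The function names and the toy English sentences are hypothetical and are not taken from the authors' evaluation code.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(reference: List[str], hypothesis: List[str], n: int) -> float:
    """Clipped n-gram precision of the hypothesis against one reference."""
    hyp = ngrams(hypothesis, n)
    ref = ngrams(reference, n)
    total = sum(hyp.values())
    if total == 0:
        return 0.0
    # Each hypothesis n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref[gram]) for gram, count in hyp.items())
    return clipped / total


def bleu(reference: List[str], hypothesis: List[str], max_n: int) -> float:
    """Cumulative BLEU-max_n with uniform weights (assumed definition, unsmoothed)."""
    precisions = [modified_precision(reference, hypothesis, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: hypotheses shorter than the reference are penalized.
    if len(hypothesis) >= len(reference):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(reference) / len(hypothesis))
    return brevity * math.exp(log_mean)


# Toy example (not from the Mohavere data).
reference = "the cat sat on the mat".split()
hypothesis = "the cat sat on a mat".split()
print(round(bleu(reference, hypothesis, 1), 3))  # 0.833
print(round(bleu(reference, hypothesis, 2), 3))  # 0.707
```

Because higher-order n-grams match less often, BLEU-n under this definition can only stay level or fall as n grows, which is consistent with the downward trend from BLEU-1 to BLEU-4 in each column.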
+## Usage
+```python
+import torch
+from transformers import AutoTokenizer, T5ForConditionalGeneration
+
+# Load the model and tokenizer once and move the model to the available device.
+device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
+model = T5ForConditionalGeneration.from_pretrained('parsi-ai-nlpclass/PersianEase').to(device)
+tokenizer = AutoTokenizer.from_pretrained('parsi-ai-nlpclass/PersianEase')
+
+def test_model(text):
+  # Prepend the "formal: " task prefix; the tokenizer call also returns the attention mask.
+  inputs = tokenizer("formal: " + text, return_tensors='pt', max_length=128, truncation=True).to(device)
+  # Beam search without sampling ignores temperature, so it is not passed here.
+  outputs = model.generate(**inputs, max_length=128, num_beams=4)
+  print("Output:", tokenizer.decode(outputs[0], skip_special_tokens=True))
+
+# English gloss of the input: "I just wanted to say how grateful I am for everything you have done for me.
+# Your friendship is a great gift to me, and I am always happy to have a friend like you."
+text = "   من فقط می‌خواستم بگویم که چقدر قدردان همه چیزهایی هستم که برای من انجام داده ای. دوستی تو برای من یک هدیه بزرگ است و من همیشه از داشتن یک دوست مانند تو خوشحال هستم."
+print("Original:", text)
+test_model(text)
+# Output reported in the original README: من فقط میخوام بگم که چقدر قدردان همه کاریم که برای من انجام دادی. دوستی تو برای من یه هدیه بزرگه و من همیشه از داشتن یه دوست مثل تو خوشحالم.
+```
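The usage snippet above converts one sentence per call. To process several sentences at once, a batched variant might look like the sketch below. The helper names `build_inputs` and `to_informal`, the default `batch_size`, and the whitespace stripping are assumptions made for illustration and are not part of the original README. The model id, the `"formal: "` prefix, `max_length=128`, and `num_beams=4` are taken from the snippet.

```python
from typing import List

MODEL_ID = "parsi-ai-nlpclass/PersianEase"  # repository named in the README
PREFIX = "formal: "  # task prefix used in the README's usage snippet


def build_inputs(texts: List[str]) -> List[str]:
    """Strip surrounding whitespace and prepend the task prefix (illustrative helper)."""
    return [PREFIX + text.strip() for text in texts]


def to_informal(texts: List[str], batch_size: int = 8) -> List[str]:
    """Hypothetical batch helper: rewrite formal sentences informally, a batch at a time."""
    # Imported lazily so build_inputs stays usable without the heavy dependencies.
    import torch
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = T5ForConditionalGeneration.from_pretrained(MODEL_ID).to(device).eval()

    prompts = build_inputs(texts)
    results: List[str] = []
    for start in range(0, len(prompts), batch_size):
        batch = prompts[start:start + batch_size]
        # padding=True pads to the longest sequence in the batch and returns
        # the attention mask that generate() needs to ignore the padding.
        encoded = tokenizer(batch, return_tensors="pt", max_length=128,
                            truncation=True, padding=True).to(device)
        with torch.no_grad():
            outputs = model.generate(**encoded, max_length=128, num_beams=4)
        results.extend(tokenizer.batch_decode(outputs, skip_special_tokens=True))
    return results
```

Calling `to_informal([...])` with a list of formal sentences should return one informal string per input, in the same order. Loading the model inside the function keeps the sketch short; a long-running service would more likely load it once and reuse it.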