---
language: fa
tags:
- Style transfer
- Formality style transfer
widget:
- text: "من با دوستام میرم بازی."
- text: "من به خونه دوستم رفتم."

---

# Persian-t5-formality-transfer

This is a formality style transfer model for Persian that converts colloquial text into formal text. It is based on the [monolingual T5 model for Persian](https://huggingface.co/Ahmad/parsT5-base) and the [Persian T5 paraphraser](https://huggingface.co/erfan226/persian-t5-paraphraser).

Note: This model is still in development, so output quality may be limited. You can experiment with different decoding parameters (for example, the number of beams or the sampling settings) to get better results. For more information, see this [guide to text generation](https://huggingface.co/blog/how-to-generate).

## Usage

```python
# Install the dependency first (run in a shell, not in Python):
#   pip install transformers
from transformers import T5ForConditionalGeneration, AutoTokenizer, pipeline

model_path = 'erfan226/persian-t5-formality-transfer'
model = T5ForConditionalGeneration.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)
pipe = pipeline(task='text2text-generation', model=model, tokenizer=tokenizer)

def paraphrase(text):
    # Generate three candidates; sampling is enabled, so each call can return a different result.
    for _ in range(3):
        out = pipe(text, encoder_no_repeat_ngram_size=4, do_sample=True, num_beams=5, max_length=128)[0]['generated_text']
        print("Paraphrase:", out)

text = "من با دوستام میرم بازی"
print("Original:", text)
paraphrase(text)

# Example output (sampled, so your results will vary):
# Original: من با دوستام میرم بازی
# Paraphrase: دوست دارم با دوستانم بازی کنم.
# Paraphrase: من با دوستانم میرم...
# Paraphrase: من با دوستام بازی می کنم.

```
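The note above suggests experimenting with decoding values. A minimal sketch of how that comparison could be organized is shown below. The helper `compare_decoding` and the `DECODING_PRESETS` dictionary are hypothetical names introduced for illustration, and the preset values are untuned assumptions rather than settings recommended for this model. The keyword arguments themselves (`num_beams`, `do_sample`, `top_k`, `top_p`, `max_length`) are standard generation parameters in `transformers`.

```python
from typing import Any, Callable, Dict, List

# Hypothetical presets for illustration only; the values are not tuned for this model.
DECODING_PRESETS: Dict[str, Dict[str, Any]] = {
    "beam_search": {"num_beams": 5, "do_sample": False},
    "top_k_sampling": {"do_sample": True, "top_k": 50},
    "top_p_sampling": {"do_sample": True, "top_p": 0.92, "top_k": 0},
}


def compare_decoding(
    pipe: Callable[..., List[Dict[str, str]]],
    text: str,
    max_length: int = 128,
) -> Dict[str, str]:
    """Run `text` through `pipe` once per preset and collect the generated text."""
    results: Dict[str, str] = {}
    for name, params in DECODING_PRESETS.items():
        # The pipeline forwards these keyword arguments to the model's generate() call.
        results[name] = pipe(text, max_length=max_length, **params)[0]["generated_text"]
    return results


# Usage, assuming `pipe` is the text2text-generation pipeline built in the Usage section:
# for name, out in compare_decoding(pipe, "من با دوستام میرم بازی").items():
#     print(f"{name}: {out}")
```

Passing the pipeline in as an argument keeps the helper independent of how the model was loaded, so the same presets can be reused with a different checkpoint.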

## Training data
TBD