---
language:
- tr
tags:
- paraphrasing
- encoder-decoder
- seq2seq
- bert
---
# Bert2Bert Turkish Paraphrase Generation
# INISTA 2021
# Comparison of Turkish Paraphrase Generation Models
# Dataset
The training dataset combines a translation of the QQP dataset with a manually generated dataset.
Dataset [Link](https://drive.google.com/file/d/1-2l9EwIzXZ7fUkNW1vdeF3lzQp2pygp_/view?usp=sharing)
# How To Use
```python
from transformers import BertTokenizerFast, EncoderDecoderModel

# Load the Turkish BERT tokenizer and the fine-tuned bert2bert model
tokenizer = BertTokenizerFast.from_pretrained("dbmdz/bert-base-turkish-cased")
model = EncoderDecoderModel.from_pretrained("ahmetbagci/bert2bert-turkish-paraphrase-generation")

text = "son model arabalar çevreye daha mı az zarar veriyor?"
input_ids = tokenizer(text, return_tensors="pt").input_ids
output_ids = model.generate(input_ids)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

# Sample output:
# son model arabalar çevre için daha az zararlı mı?
```
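If more than one paraphrase per input is needed, beam search with several returned sequences is one option. The following is a minimal sketch, not part of the original model card: the helper names `unique_paraphrases` and `generate_paraphrases`, and the values for `num_beams`, `num_candidates`, and `max_length`, are illustrative assumptions and have not been tuned for this model.

```python
def unique_paraphrases(source, candidates):
    """Drop candidates that repeat the source or each other.

    Comparison is a rough case-insensitive match after trimming whitespace
    (an assumption for illustration; it is not Turkish-locale aware).
    """
    seen = {source.strip().lower()}
    kept = []
    for cand in candidates:
        key = cand.strip().lower()
        if key and key not in seen:
            seen.add(key)
            kept.append(cand.strip())
    return kept


def generate_paraphrases(text, model, tokenizer,
                         num_candidates=3, num_beams=5, max_length=64):
    """Return up to `num_candidates` distinct paraphrases of `text`.

    `model` and `tokenizer` are expected to be the objects loaded in the
    example above; the decoding settings are hypothetical defaults.
    """
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(
        inputs.input_ids,
        attention_mask=inputs.attention_mask,
        num_beams=num_beams,                  # must be >= num_candidates
        num_return_sequences=num_candidates,
        max_length=max_length,
    )
    decoded = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
    return unique_paraphrases(text, decoded)
```

With `model` and `tokenizer` loaded as in the example above, `generate_paraphrases(text, model, tokenizer)` returns a list of strings. The list may be shorter than `num_candidates` when beams decode to the same sentence or reproduce the input.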
# Cite
```bibtex
@inproceedings{bagci2021paraphrase,
title={Comparison of Turkish Paraphrase Generation Models},
author={Ba{\u{g}}c{\i}, Ahmet and Amasyali, Mehmet Fatih},
booktitle={2021 International Conference on INnovations in Intelligent SysTems and Applications (INISTA)},
year={2021},
organization={IEEE}
}
```