---
language:
- ms
tags:
- paraphrase
metrics:
- sacrebleu
---
# finetune-paraphrase-t5-small-standard-bahasa-cased
T5 small fine-tuned on Malay (MS) paraphrase tasks.
## Dataset
1. translated PAWS, https://huggingface.co/datasets/mesolitica/translated-PAWS
2. translated MRPC, https://huggingface.co/datasets/mesolitica/translated-MRPC
3. translated ParaSCI, https://huggingface.co/datasets/mesolitica/translated-paraSCI
## Finetune details
1. Fine-tuned on a single RTX 3090 Ti.
2. Training scripts are at https://github.com/huseinzol05/malaya/tree/master/session/paraphrase/hf-t5
## Supported prefix
1. `parafrasa: {string}`, for MS paraphrase.
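
A minimal usage sketch of the prefix with the `transformers` library follows. The repository id `mesolitica/finetune-paraphrase-t5-small-standard-bahasa-cased`, the helper names `build_input` and `paraphrase`, and the `max_length` value are assumptions made for illustration; they are not taken from the training scripts.

```python
# Assumed Hub repository id; adjust it if the model is hosted elsewhere.
MODEL_ID = "mesolitica/finetune-paraphrase-t5-small-standard-bahasa-cased"
PREFIX = "parafrasa: "


def build_input(text: str) -> str:
    """Prepend the supported prefix to the sentence to paraphrase."""
    return PREFIX + text.strip()


def paraphrase(text: str, model_id: str = MODEL_ID, max_length: int = 256) -> str:
    """Generate one paraphrase (hypothetical helper, default greedy decoding)."""
    # Imported lazily so build_input works without transformers installed.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)
    inputs = tokenizer(build_input(text), return_tensors="pt")
    output_ids = model.generate(**inputs, max_length=max_length)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `paraphrase("Saya suka makan nasi goreng.")` downloads the weights on first use and returns the generated Malay paraphrase as a string.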
## Evaluation
Evaluated on the MRPC validation set and the ParaSCI arXiv test set, scored with sacreBLEU:
```
{'name': 'BLEU',
'score': 37.598729045833316,
'_mean': -1.0,
'_ci': -1.0,
'_verbose': '62.6/42.5/33.2/27.0 (BP = 0.957 ratio = 0.958 hyp_len = 96781 ref_len = 101064)',
'bp': 0.9567103919247614,
'counts': [60539, 38753, 28443, 21680],
'totals': [96781, 91237, 85693, 80149],
'sys_len': 96781,
'ref_len': 101064,
'precisions': [62.55256713611143,
42.47509234192268,
33.19174261608299,
27.049620082596164],
'prec_str': '62.6/42.5/33.2/27.0',
'ratio': 0.9576209134805668}
```
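
As a sanity check on how the reported fields relate, the sketch below recomputes the score from `counts`, `totals`, `sys_len`, and `ref_len` using the standard BLEU definition (geometric mean of the n-gram precisions times the brevity penalty). The function name `bleu_from_stats` is hypothetical, and the sketch assumes no smoothing is involved, which holds here because all n-gram counts are nonzero.

```python
import math

# Statistics copied from the evaluation output above.
counts = [60539, 38753, 28443, 21680]   # matched 1- to 4-grams
totals = [96781, 91237, 85693, 80149]   # hypothesis 1- to 4-grams
sys_len, ref_len = 96781, 101064


def bleu_from_stats(counts, totals, sys_len, ref_len):
    """Recompute corpus BLEU from n-gram statistics (no smoothing)."""
    # n-gram precisions in percent, as in the 'precisions' field.
    precisions = [100.0 * c / t for c, t in zip(counts, totals)]
    # The brevity penalty only applies when the hypotheses are shorter.
    bp = 1.0 if sys_len >= ref_len else math.exp(1.0 - ref_len / sys_len)
    # Geometric mean of the precisions, computed in log space.
    log_mean = sum(math.log(p) for p in precisions) / len(precisions)
    return bp * math.exp(log_mean), bp, precisions


score, bp, precisions = bleu_from_stats(counts, totals, sys_len, ref_len)
print(f"BLEU = {score:.2f}, BP = {bp:.3f}")
```

Because the hypotheses are about 4% shorter than the references (`ratio` is roughly 0.958), the brevity penalty lowers the geometric mean of the precisions to the reported `score`.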