---
language:
- ms
tags:
- paraphrase
metrics:
- sacrebleu
---
# finetune-paraphrase-t5-base-standard-bahasa-cased
T5 base fine-tuned on Malay (MS) paraphrase tasks.
## Dataset
1. translated PAWS, https://huggingface.co/datasets/mesolitica/translated-PAWS
2. translated MRPC, https://huggingface.co/datasets/mesolitica/translated-MRPC
3. translated ParaSCI, https://huggingface.co/datasets/mesolitica/translated-paraSCI
## Finetune details
1. Fine-tuned on a single RTX 3090 Ti.
Scripts are available at https://github.com/huseinzol05/malaya/tree/master/session/paraphrase/hf-t5
## Supported prefix
1. `parafrasa: {string}`, for Malay (MS) paraphrasing.
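
A minimal sketch of how the prefix might be applied with Hugging Face `transformers`. The Hub ID in `MODEL_ID`, the helper names `build_prompt` and `paraphrase`, and the `max_length` value are assumptions for illustration, not taken from the training scripts; adjust them to your setup.

```python
# Assumed Hub ID (hypothetical; replace with the actual repository name).
MODEL_ID = "mesolitica/finetune-paraphrase-t5-base-standard-bahasa-cased"
PREFIX = "parafrasa: "  # the supported prefix listed above


def build_prompt(text: str) -> str:
    """Prepend the paraphrase prefix to the input sentence."""
    return f"{PREFIX}{text.strip()}"


def paraphrase(text: str, max_length: int = 256) -> str:
    """Generate one paraphrase; requires `transformers` and `torch`."""
    # Imported lazily so build_prompt can be used without these packages.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = T5ForConditionalGeneration.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(text), return_tensors="pt")
    output_ids = model.generate(**inputs, max_length=max_length)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `paraphrase("Saya suka makan nasi goreng.")` downloads the weights on first use. The model and tokenizer are reloaded on every call here for brevity; load them once if you paraphrase many sentences.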
## Evaluation
Evaluated on the MRPC validation set and the ParaSCI arXiv test set, scored with SacreBLEU.
```
{'name': 'BLEU',
'score': 35.95965899952292,
'_mean': -1.0,
'_ci': -1.0,
'_verbose': '61.7/41.3/32.0/25.8 (BP = 0.944 ratio = 0.946 hyp_len = 95593 ref_len = 101064)',
'bp': 0.9443747373110852,
'counts': [59014, 37157, 27016, 20383],
'totals': [95593, 90049, 84505, 78961],
'sys_len': 95593,
'ref_len': 101064,
'precisions': [61.73464584226878,
41.263090095392506,
31.969705934560086,
25.81400944770203],
'prec_str': '61.7/41.3/32.0/25.8',
'ratio': 0.9458659859099184}
```
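
The reported score can be cross-checked from the n-gram statistics above. The sketch below applies the standard corpus BLEU formula (brevity penalty times the geometric mean of the 1- to 4-gram precisions) to the `counts`, `totals`, `sys_len`, and `ref_len` values; the function name `bleu_from_stats` is made up for illustration, and no smoothing is applied because every count is non-zero.

```python
import math


def bleu_from_stats(counts, totals, sys_len, ref_len):
    """Corpus BLEU from matched and total n-gram counts (no smoothing)."""
    # Brevity penalty: penalise hypotheses shorter than the references.
    bp = 1.0 if sys_len >= ref_len else math.exp(1 - ref_len / sys_len)
    # Geometric mean of the n-gram precisions, computed in log space.
    log_precisions = [math.log(c / t) for c, t in zip(counts, totals)]
    geo_mean = math.exp(sum(log_precisions) / len(log_precisions))
    return 100 * bp * geo_mean, bp


# Values copied from the evaluation output above.
score, bp = bleu_from_stats(
    counts=[59014, 37157, 27016, 20383],
    totals=[95593, 90049, 84505, 78961],
    sys_len=95593,
    ref_len=101064,
)
print(f"BLEU = {score:.2f}, BP = {bp:.3f}")
```

This should reproduce the reported BLEU of about 35.96 with a brevity penalty of about 0.944, which indicates the generated paraphrases are on average slightly shorter than the references.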