---
language:
- ms
tags:
- paraphrase
metrics:
- sacrebleu
---
# finetune-paraphrase-t5-tiny-standard-bahasa-cased
T5 tiny fine-tuned on Malay (MS) paraphrase tasks.
## Dataset
1. translated PAWS, https://huggingface.co/datasets/mesolitica/translated-PAWS
2. translated MRPC, https://huggingface.co/datasets/mesolitica/translated-MRPC
3. translated ParaSCI, https://huggingface.co/datasets/mesolitica/translated-paraSCI
## Finetune details
1. Fine-tuned on a single RTX 3090 Ti.
2. Scripts are available at https://github.com/huseinzol05/malaya/tree/master/session/paraphrase/hf-t5
## Supported prefix
1. `parafrasa: {string}`, for Malay (MS) paraphrasing.
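
A minimal usage sketch with the `transformers` library, showing how the prefix is prepended before generation. The model id below is an assumption (the Hub namespace is not stated in this card), and `build_input`, `paraphrase`, and `max_length=256` are illustrative choices rather than the author's settings.

```python
PREFIX = "parafrasa: "

# Assumed Hub id; replace with the actual repository name.
ASSUMED_MODEL_ID = "mesolitica/finetune-paraphrase-t5-tiny-standard-bahasa-cased"


def build_input(text: str) -> str:
    """Prepend the supported prefix to the sentence to paraphrase."""
    return PREFIX + text.strip()


def paraphrase(text: str, model_id: str = ASSUMED_MODEL_ID, max_length: int = 256) -> str:
    """Generate one paraphrase; downloads the model on first call."""
    # Imported here so build_input works without transformers installed.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)
    inputs = tokenizer(build_input(text), return_tensors="pt")
    outputs = model.generate(**inputs, max_length=max_length)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Calling `paraphrase("Saya suka makan nasi.")` sends `parafrasa: Saya suka makan nasi.` to the model and returns the decoded output.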
## Evaluation
Evaluated with sacreBLEU on the MRPC validation set and the ParaSCI arXiv test set.
```
{'name': 'BLEU',
'score': 36.92696648298233,
'_mean': -1.0,
'_ci': -1.0,
'_verbose': '62.5/42.3/33.0/26.9 (BP = 0.943 ratio = 0.945 hyp_len = 95496 ref_len = 101064)',
'bp': 0.9433611337299734,
'counts': [59650, 38055, 27875, 21217],
'totals': [95496, 89952, 84408, 78864],
'sys_len': 95496,
'ref_len': 101064,
'precisions': [62.46334925023038,
42.30589647812167,
33.02412093640413,
26.90327652667884],
'prec_str': '62.5/42.3/33.0/26.9',
'ratio': 0.944906198052719}
```
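
As a sanity check, the reported score can be recomputed from the `counts`, `totals`, `sys_len`, and `ref_len` fields above. The sketch below applies the standard BLEU definition (brevity penalty times the geometric mean of the 1- to 4-gram precisions) and assumes no smoothing is needed, since every count is nonzero.

```python
import math

# Values copied from the evaluation output above.
counts = [59650, 38055, 27875, 21217]   # matched n-grams, n = 1..4
totals = [95496, 89952, 84408, 78864]   # hypothesis n-grams, n = 1..4
sys_len, ref_len = 95496, 101064

# Modified n-gram precisions, in percent.
precisions = [100.0 * c / t for c, t in zip(counts, totals)]

# Brevity penalty: penalises hypotheses shorter than the references.
bp = 1.0 if sys_len >= ref_len else math.exp(1.0 - ref_len / sys_len)

# BLEU = BP * geometric mean of the precisions.
score = bp * math.exp(sum(math.log(p) for p in precisions) / len(precisions))

print(f"BP = {bp:.3f}, BLEU = {score:.2f}")
```

The result agrees with the reported `score` and `bp` up to floating-point rounding.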