---
language:
- ms
tags:
- paraphrase
metrics:
- sacrebleu
---

# finetune-paraphrase-t5-small-standard-bahasa-cased

T5 small fine-tuned for Malay (MS) paraphrasing.

## Dataset

1. Translated PAWS, https://huggingface.co/datasets/mesolitica/translated-PAWS
2. Translated MRPC, https://huggingface.co/datasets/mesolitica/translated-MRPC
3. Translated ParaSCI, https://huggingface.co/datasets/mesolitica/translated-paraSCI

## Finetune details

1. Fine-tuned on a single RTX 3090 Ti.

Training scripts are available at https://github.com/huseinzol05/malaya/tree/master/session/paraphrase/hf-t5

## Supported prefix

1. `parafrasa: {string}`, for Malay (MS) paraphrasing.
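
The prefix above can be applied as in the following minimal sketch. It assumes the model loads through the standard Hugging Face `transformers` classes and that the Hub repository id is `mesolitica/finetune-paraphrase-t5-small-standard-bahasa-cased`; this card gives only the model name, so the namespace is a guess. The helper names `build_input` and `paraphrase`, and the `max_length` value, are illustrative and do not come from the training scripts.

```python
PREFIX = "parafrasa: "

# Assumed Hub repository id; the card itself only states the model name.
MODEL_ID = "mesolitica/finetune-paraphrase-t5-small-standard-bahasa-cased"


def build_input(text: str) -> str:
    """Prepend the paraphrase prefix listed under 'Supported prefix'."""
    return PREFIX + text.strip()


def paraphrase(text: str, model_id: str = MODEL_ID, max_length: int = 256) -> str:
    """Generate one paraphrase for a Malay sentence (hypothetical helper)."""
    # Imported lazily so build_input stays usable without transformers installed.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)

    # Tokenize the prefixed input and decode the first generated sequence.
    inputs = tokenizer(build_input(text), return_tensors="pt")
    output_ids = model.generate(**inputs, max_length=max_length)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `paraphrase("Saya suka makan nasi lemak pada waktu pagi.")` downloads the weights on first use. Decoding settings such as beam search or sampling are left at their defaults here and may need tuning.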

## Evaluation

Evaluated on the MRPC validation set and the ParaSCI arXiv test set.

```
{'name': 'BLEU',
 'score': 37.598729045833316,
 '_mean': -1.0,
 '_ci': -1.0,
 '_verbose': '62.6/42.5/33.2/27.0 (BP = 0.957 ratio = 0.958 hyp_len = 96781 ref_len = 101064)',
 'bp': 0.9567103919247614,
 'counts': [60539, 38753, 28443, 21680],
 'totals': [96781, 91237, 85693, 80149],
 'sys_len': 96781,
 'ref_len': 101064,
 'precisions': [62.55256713611143,
  42.47509234192268,
  33.19174261608299,
  27.049620082596164],
 'prec_str': '62.6/42.5/33.2/27.0',
 'ratio': 0.9576209134805668}
```
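
As a reading aid, the fields in the result above are related by the standard corpus BLEU formula: the score is the brevity penalty times the geometric mean of the 1- to 4-gram precisions. The sketch below recomputes `precisions`, `bp`, `ratio`, and `score` from the reported `counts`, `totals`, `sys_len`, and `ref_len`. It assumes no smoothing was needed (all counts are non-zero) and is not taken from the evaluation script itself.

```python
import math

# Values copied from the reported evaluation result.
counts = [60539, 38753, 28443, 21680]   # matched n-grams, n = 1..4
totals = [96781, 91237, 85693, 80149]   # hypothesis n-grams, n = 1..4
sys_len = 96781                         # total hypothesis length
ref_len = 101064                        # total reference length

# n-gram precisions in percent.
precisions = [100.0 * c / t for c, t in zip(counts, totals)]

# Brevity penalty: penalizes hypotheses shorter than the references.
bp = 1.0 if sys_len >= ref_len else math.exp(1.0 - ref_len / sys_len)

# Length ratio of hypotheses to references.
ratio = sys_len / ref_len

# BLEU = BP * geometric mean of the four precisions.
score = bp * math.exp(sum(math.log(p) for p in precisions) / len(precisions))

print(f"precisions={[round(p, 1) for p in precisions]}")
print(f"bp={bp:.3f} ratio={ratio:.3f} score={score:.2f}")
```

The recomputed values should agree with the reported `bp`, `ratio`, and `score` up to floating-point rounding. This confirms that the hypotheses are about 4% shorter than the references and that the brevity penalty accounts for the gap between the geometric mean of the precisions and the final score.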