metadata
language:
- ms
tags:
- paraphrase
datasets: mesolitica/translated-PAWS
metrics:
- sacrebleu
finetune-paraphrase-t5-base-standard-bahasa-cased
T5 base fine-tuned on Malay (MS) paraphrase tasks.
Dataset
- translated PAWS, https://huggingface.co/datasets/mesolitica/translated-PAWS
- translated MRPC, https://huggingface.co/datasets/mesolitica/translated-MRPC
- translated ParaSCI, https://huggingface.co/datasets/mesolitica/translated-paraSCI
Finetune details
- Fine-tuned on a single RTX 3090 Ti.
Scripts at https://github.com/huseinzol05/malaya/tree/master/session/paraphrase/hf-t5
Supported prefix
- parafrasa: {string}, for MS paraphrase.
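The prefix above can be applied as in the following minimal sketch. It assumes the model is published on the Hugging Face Hub under `mesolitica/finetune-paraphrase-t5-base-standard-bahasa-cased` (the organisation name is inferred from the dataset URLs, not stated in this card) and that it loads through the standard `transformers` T5 classes. The helper names `build_prompt` and `paraphrase` are hypothetical, and the generation settings are illustrative defaults rather than the settings used for the reported evaluation.

```python
PREFIX = "parafrasa: "

# Assumption: Hub id inferred from the model name and the dataset organisation.
MODEL_ID = "mesolitica/finetune-paraphrase-t5-base-standard-bahasa-cased"


def build_prompt(text: str) -> str:
    """Prepend the supported prefix to the input sentence."""
    return f"{PREFIX}{text.strip()}"


def paraphrase(text: str, model_id: str = MODEL_ID, max_new_tokens: int = 128) -> str:
    """Generate one paraphrase; downloads the model on first call."""
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)

    inputs = tokenizer(build_prompt(text), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (requires network access and the transformers + torch packages):
# print(paraphrase("Saya suka makan nasi goreng pada waktu pagi."))
```

Loading the tokenizer and model inside `paraphrase` keeps the sketch short; for repeated calls it would be better to load them once and reuse them.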
Evaluation
Evaluated on the MRPC validation set and the ParaSCI arXiv test set.
{'name': 'BLEU',
'score': 35.95965899952292,
'_mean': -1.0,
'_ci': -1.0,
'_verbose': '61.7/41.3/32.0/25.8 (BP = 0.944 ratio = 0.946 hyp_len = 95593 ref_len = 101064)',
'bp': 0.9443747373110852,
'counts': [59014, 37157, 27016, 20383],
'totals': [95593, 90049, 84505, 78961],
'sys_len': 95593,
'ref_len': 101064,
'precisions': [61.73464584226878,
41.263090095392506,
31.969705934560086,
25.81400944770203],
'prec_str': '61.7/41.3/32.0/25.8',
'ratio': 0.9458659859099184}
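As a sanity check on how the fields above relate, the sketch below recomputes the brevity penalty and the BLEU score from the reported `counts`, `totals`, `sys_len`, and `ref_len`. It uses the standard corpus-level BLEU definition (brevity penalty times the geometric mean of the 1- to 4-gram precisions) and only the Python standard library; the function names are hypothetical and this is not the evaluation script used by the authors.

```python
import math

# Values copied from the reported result above.
counts = [59014, 37157, 27016, 20383]   # matched n-grams, n = 1..4
totals = [95593, 90049, 84505, 78961]   # hypothesis n-grams, n = 1..4
sys_len, ref_len = 95593, 101064


def brevity_penalty(sys_len: int, ref_len: int) -> float:
    """Penalise hypotheses that are shorter than the references."""
    if sys_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / sys_len)


def bleu_from_stats(counts, totals, sys_len, ref_len) -> float:
    """Corpus BLEU on a 0-100 scale from n-gram match statistics."""
    precisions = [100.0 * c / t for c, t in zip(counts, totals)]
    # Geometric mean computed in log space.
    log_mean = sum(math.log(p) for p in precisions) / len(precisions)
    return brevity_penalty(sys_len, ref_len) * math.exp(log_mean)


bp = brevity_penalty(sys_len, ref_len)
score = bleu_from_stats(counts, totals, sys_len, ref_len)
print(f"BP = {bp:.3f}, BLEU = {score:.2f}")
```

The recomputed values agree with the reported `bp` of about 0.944 and `score` of about 35.96. The `_mean` and `_ci` fields are -1.0, which indicates that no bootstrap confidence interval was computed for this run.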