---
language:
- ms
tags:
- paraphrase
metrics:
- sacrebleu
---

# finetune-paraphrase-t5-small-standard-bahasa-cased

Finetuned T5 small on MS paraphrase tasks.

## Dataset

1. translated PAWS, https://huggingface.co/datasets/mesolitica/translated-PAWS
2. translated MRPC, https://huggingface.co/datasets/mesolitica/translated-MRPC
3. translated ParaSCI, https://huggingface.co/datasets/mesolitica/translated-paraSCI

## Finetune details

1. Finetune using single RTX 3090 Ti.

Scripts at https://github.com/huseinzol05/malaya/tree/master/session/paraphrase/hf-t5

## Supported prefix

1. `parafrasa: {string}`, for MS paraphrase.

## Evaluation

Evaluated on MRPC validation set and ParaSCI Arxiv test set.

```
{'name': 'BLEU',
 'score': 37.598729045833316,
 '_mean': -1.0,
 '_ci': -1.0,
 '_verbose': '62.6/42.5/33.2/27.0 (BP = 0.957 ratio = 0.958 hyp_len = 96781 ref_len = 101064)',
 'bp': 0.9567103919247614,
 'counts': [60539, 38753, 28443, 21680],
 'totals': [96781, 91237, 85693, 80149],
 'sys_len': 96781,
 'ref_len': 101064,
 'precisions': [62.55256713611143, 42.47509234192268, 33.19174261608299, 27.049620082596164],
 'prec_str': '62.6/42.5/33.2/27.0',
 'ratio': 0.9576209134805668}
```
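
As a sanity check on how the fields in the reported output relate to each other, the sketch below recomputes the score from `counts`, `totals`, `sys_len` and `ref_len`. It assumes the standard corpus-level BLEU definition (geometric mean of the four n-gram precisions, multiplied by the brevity penalty) with no smoothing; it is not the evaluation script used for this model, and the variable names are illustrative only.

```python
import math

# Values copied from the reported evaluation output.
counts = [60539, 38753, 28443, 21680]   # matched n-grams for n = 1..4
totals = [96781, 91237, 85693, 80149]   # hypothesis n-grams for n = 1..4
sys_len = 96781                         # total hypothesis length
ref_len = 101064                        # total reference length

# n-gram precisions, expressed in percent.
precisions = [100.0 * c / t for c, t in zip(counts, totals)]

# Brevity penalty: penalises hypotheses shorter than the references.
bp = 1.0 if sys_len >= ref_len else math.exp(1.0 - ref_len / sys_len)

# BLEU = BP * geometric mean of the precisions.
score = bp * math.exp(sum(math.log(p) for p in precisions) / len(precisions))

# Length ratio of hypotheses to references.
ratio = sys_len / ref_len

print(f"precisions = {[round(p, 1) for p in precisions]}")
print(f"bp = {bp:.3f}, ratio = {ratio:.3f}, BLEU = {score:.2f}")
```

The brevity penalty below 1 reflects that the generated paraphrases are, in total, about 4% shorter than the references.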