---
language:
- ms
tags:
- paraphrase
metrics:
- sacrebleu
---

# finetune-paraphrase-t5-base-standard-bahasa-cased

T5 base finetuned on Malay (MS) paraphrase tasks.

## Dataset

1. translated PAWS, https://huggingface.co/datasets/mesolitica/translated-PAWS
2. translated MRPC, https://huggingface.co/datasets/mesolitica/translated-MRPC
3. translated ParaSCI, https://huggingface.co/datasets/mesolitica/translated-paraSCI

## Finetune details

1. Finetuned on a single RTX 3090 Ti. Scripts at https://github.com/huseinzol05/malaya/tree/master/session/paraphrase/hf-t5

## Supported prefix

1. `parafrasa: {string}`, for MS paraphrase.

## Evaluation

Evaluated on the MRPC validation set and the ParaSCI Arxiv test set, scored with sacrebleu.

```
{'name': 'BLEU',
 'score': 35.95965899952292,
 '_mean': -1.0,
 '_ci': -1.0,
 '_verbose': '61.7/41.3/32.0/25.8 (BP = 0.944 ratio = 0.946 hyp_len = 95593 ref_len = 101064)',
 'bp': 0.9443747373110852,
 'counts': [59014, 37157, 27016, 20383],
 'totals': [95593, 90049, 84505, 78961],
 'sys_len': 95593,
 'ref_len': 101064,
 'precisions': [61.73464584226878,
  41.263090095392506,
  31.969705934560086,
  25.81400944770203],
 'prec_str': '61.7/41.3/32.0/25.8',
 'ratio': 0.9458659859099184}
```
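To make the relationship between the fields in the evaluation output easier to follow, here is a minimal sketch that recomputes the corpus BLEU score from the reported `counts`, `totals`, `sys_len` and `ref_len`. It uses the standard BLEU definition (brevity penalty times the geometric mean of the 1- to 4-gram precisions). The function name `corpus_bleu_from_stats` is hypothetical and is not part of sacrebleu.

```python
import math

# Statistics copied from the evaluation output in this card.
counts = [59014, 37157, 27016, 20383]   # matched n-grams for n = 1..4
totals = [95593, 90049, 84505, 78961]   # hypothesis n-grams for n = 1..4
sys_len, ref_len = 95593, 101064        # hypothesis and reference lengths


def corpus_bleu_from_stats(counts, totals, sys_len, ref_len):
    """Hypothetical helper: return (bleu, brevity_penalty, precisions), 0-100 scale."""
    # Modified n-gram precisions, expressed as percentages.
    precisions = [100.0 * c / t for c, t in zip(counts, totals)]
    # The brevity penalty only applies when the hypothesis is shorter than the reference.
    bp = 1.0 if sys_len >= ref_len else math.exp(1.0 - ref_len / sys_len)
    # Geometric mean of the precisions, computed in log space.
    log_mean = sum(math.log(p) for p in precisions) / len(precisions)
    return bp * math.exp(log_mean), bp, precisions


bleu, bp, precisions = corpus_bleu_from_stats(counts, totals, sys_len, ref_len)
print(f"BLEU = {bleu:.2f}, BP = {bp:.3f}")
```

The result should agree with the reported `score` and `bp` up to floating-point rounding. Because `sys_len` is smaller than `ref_len`, the outputs are on average shorter than the references, which is why the brevity penalty is below 1.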
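The `parafrasa: ` prefix listed under "Supported prefix" has to be prepended to the input text before generation. The sketch below shows one way to do that with Hugging Face `transformers`. Two things are assumptions rather than facts from this card: the Hub id in `MODEL_ID` (inferred from the card title and the organisation used in the dataset links) and the `max_length` value. The helper names `build_prompt` and `paraphrase` are hypothetical.

```python
PREFIX = "parafrasa: "

# Assumption: Hub id inferred from the card title; adjust it if the model lives elsewhere.
MODEL_ID = "mesolitica/finetune-paraphrase-t5-base-standard-bahasa-cased"


def build_prompt(text: str) -> str:
    """Prepend the task prefix this model was finetuned with."""
    return PREFIX + text.strip()


def paraphrase(text: str, model_id: str = MODEL_ID, max_length: int = 256) -> str:
    """Generate one paraphrase; max_length is an assumed, untuned value."""
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)
    inputs = tokenizer(build_prompt(text), return_tensors="pt")
    output_ids = model.generate(**inputs, max_length=max_length)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `paraphrase("Saya suka makan nasi goreng.")` downloads the weights on first use and returns the generated Malay paraphrase as a string. For repeated calls, load the tokenizer and model once and reuse them rather than reloading inside the function.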