(WIP!) iva_mt_wslot-m2m100_418M-en-pl-lora_adapter

Notice: Although the training results look good, inference results are, for reasons that are not yet clear, rather poor. I'm leaving this model here as a proof of concept that PEFT LoRA adaptation of M2M100 is possible.

This model is a LoRA-adapted version of facebook/m2m100_418M, fine-tuned on the iva_mt_wslot dataset. It achieves the following result on the test set (measured with sacrebleu):

  • BLEU: 9.33
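
A score like this can be computed with the sacrebleu Python package. A minimal sketch (the hypothesis and reference strings below are illustrative; the actual evaluation script is not included in this card):

import sacrebleu

# hypotheses: model outputs; references: one stream of gold Polish translations
# (both lists here are made-up examples, not data from the iva_mt_wslot test set)
hypotheses = ["ustaw minutnik na pięć minut"]
references = [["ustaw minutnik na 5 minut"]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.2f}")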

Usage

The model can be used as follows:

First, clone the repository and navigate to the project directory:

git clone https://github.com/cartesinus/multiverb_iva_mt
cd multiverb_iva_mt

Then:

from iva_mt.iva_mt import IVAMT

# Target language and the matching adapter for this card (en -> pl)
lang = "pl"
translator = IVAMT(lang,
                   peft_model_id="cartesinus/iva_mt_wslot-m2m100_418M-en-pl-lora_adapter",
                   device="cuda:0",
                   batch_size=128)
# translate() returns a list of translations; take the first result
trans = translator.translate("your example sentence here")[0]
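
Alternatively, the adapter can be applied directly with the transformers and peft libraries, without the helper repository. A minimal sketch (the input sentence and generation settings are illustrative and may differ from what IVAMT does internally):

from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer
from peft import PeftModel

# Load the base checkpoint, then apply the LoRA adapter on top of it
base = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
model = PeftModel.from_pretrained(base, "cartesinus/iva_mt_wslot-m2m100_418M-en-pl-lora_adapter")
tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")

tokenizer.src_lang = "en"
inputs = tokenizer("set the timer for five minutes", return_tensors="pt")

# M2M100 requires the target-language token to be forced as the first generated token
generated = model.generate(**inputs,
                           forced_bos_token_id=tokenizer.get_lang_id("pl"),
                           max_length=64)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))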

Training results

Epoch | Training Loss | Validation Loss | BLEU    | Gen Len
------|---------------|-----------------|---------|--------
1     | 7.8621        | 7.6870          | 24.9063 | 19.3322
2     | 7.6340        | 7.5312          | 29.7956 | 19.7533
3     | 7.5582        | 7.4595          | 34.8184 | 20.1269
4     | 7.5047        | 7.4264          | 36.1874 | 20.5621
5     | 7.4888        | 7.4167          | 36.2287 | 20.4417
6     | 7.4560        | 7.4013          | 36.6355 | 20.2241
7     | 7.4477        | 7.3907          | 37.0554 | 20.0945
8     | 7.4422        | 7.3743          | 37.7549 | 20.1589
9     | 7.4311        | 7.3748          | 37.5705 | 19.9370
10    | 7.4294        | 7.3679          | 37.5343 | 20.2241
11    | 7.4114        | 7.3697          | 38.1872 | 20.3836
12    | 7.4224        | 7.3620          | 38.1759 | 20.1785
13    | 7.4334        | 7.3608          | 38.0895 | 20.2996
14    | 7.4133        | 7.3621          | 38.2365 | 20.2948
15    | 7.4158        | 7.3599          | 38.1056 | 20.2010
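
The adapter's hyperparameters are not documented in this card. Purely as an illustration of what PEFT LoRA adaptation of M2M100 involves, a configuration might look like the sketch below; every value in it is an assumption, not the setting used for this adapter:

from transformers import M2M100ForConditionalGeneration
from peft import LoraConfig, TaskType, get_peft_model

base = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")

# Hypothetical LoRA settings -- the values actually used for this adapter are not documented
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,                                  # rank of the low-rank update matrices (assumption)
    lora_alpha=32,                        # scaling factor (assumption)
    lora_dropout=0.1,                     # dropout on the LoRA layers (assumption)
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt (assumption)
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only a small fraction of the weights is trainable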

Framework versions

  • PEFT 0.5.0
