(WIP!) iva_mt_wslot-m2m100_418M-en-pl-lora_adapter

Notice: Although the training results look good, inference results are, for reasons that are not yet clear, rather poor. I'm leaving this model here as a proof of concept that PEFT LoRA adaptation of M2M100 is possible.

This model is a LoRA-adapted version of facebook/m2m100_418M, fine-tuned on the iva_mt_wslot dataset. It achieves the following result on the test set (measured with sacrebleu):

  • BLEU: 9.33
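
A score like this can be computed with the sacrebleu Python package. A minimal sketch (the hypothesis and reference strings below are illustrative; the actual evaluation script is not included in this card):

import sacrebleu

# hypotheses: model outputs; references: one stream of gold Polish translations
# (both lists here are made-up examples, not data from the iva_mt_wslot test set)
hypotheses = ["ustaw minutnik na pięć minut"]
references = [["ustaw minutnik na 5 minut"]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.2f}")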

Usage

The model can be used as follows:

First, clone the repository and navigate to the project directory:

git clone https://github.com/cartesinus/multiverb_iva_mt
cd multiverb_iva_mt

Then:

from iva_mt.iva_mt import IVAMT

# Target language and the matching adapter for this card (en -> pl)
lang = "pl"
translator = IVAMT(lang,
                   peft_model_id="cartesinus/iva_mt_wslot-m2m100_418M-en-pl-lora_adapter",
                   device="cuda:0",
                   batch_size=128)
# translate() returns a list of translations; take the first result
trans = translator.translate("your example sentence here")[0]
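
Alternatively, the adapter can be applied directly with the transformers and peft libraries, without the helper repository. A minimal sketch (the input sentence and generation settings are illustrative and may differ from what IVAMT does internally):

from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer
from peft import PeftModel

# Load the base checkpoint, then apply the LoRA adapter on top of it
base = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
model = PeftModel.from_pretrained(base, "cartesinus/iva_mt_wslot-m2m100_418M-en-pl-lora_adapter")
tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")

tokenizer.src_lang = "en"
inputs = tokenizer("set the timer for five minutes", return_tensors="pt")

# M2M100 requires the target-language token to be forced as the first generated token
generated = model.generate(**inputs,
                           forced_bos_token_id=tokenizer.get_lang_id("pl"),
                           max_length=64)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))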

Training results

Epoch | Training Loss | Validation Loss | BLEU    | Gen Len
------|---------------|-----------------|---------|--------
1     | 7.8621        | 7.6870          | 24.9063 | 19.3322
2     | 7.6340        | 7.5312          | 29.7956 | 19.7533
3     | 7.5582        | 7.4595          | 34.8184 | 20.1269
4     | 7.5047        | 7.4264          | 36.1874 | 20.5621
5     | 7.4888        | 7.4167          | 36.2287 | 20.4417
6     | 7.4560        | 7.4013          | 36.6355 | 20.2241
7     | 7.4477        | 7.3907          | 37.0554 | 20.0945
8     | 7.4422        | 7.3743          | 37.7549 | 20.1589
9     | 7.4311        | 7.3748          | 37.5705 | 19.9370
10    | 7.4294        | 7.3679          | 37.5343 | 20.2241
11    | 7.4114        | 7.3697          | 38.1872 | 20.3836
12    | 7.4224        | 7.3620          | 38.1759 | 20.1785
13    | 7.4334        | 7.3608          | 38.0895 | 20.2996
14    | 7.4133        | 7.3621          | 38.2365 | 20.2948
15    | 7.4158        | 7.3599          | 38.1056 | 20.2010
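
The adapter's hyperparameters are not documented in this card. Purely as an illustration of what PEFT LoRA adaptation of M2M100 involves, a configuration might look like the sketch below; every value in it is an assumption, not the setting used for this adapter:

from transformers import M2M100ForConditionalGeneration
from peft import LoraConfig, TaskType, get_peft_model

base = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")

# Hypothetical LoRA settings -- the values actually used for this adapter are not documented
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,                                  # rank of the low-rank update matrices (assumption)
    lora_alpha=32,                        # scaling factor (assumption)
    lora_dropout=0.1,                     # dropout on the LoRA layers (assumption)
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt (assumption)
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only a small fraction of the weights is trainable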

Framework versions

  • PEFT 0.5.0
