Lutech-AI/I-SPIn
Italian-Sentence Pair Inference, AKA I-SPIn.
This is a fine-tuned version of the model paraphrase-multilingual-mpnet-base-v2.
Its main task is to perform Natural Language Inference (NLI) in Italian.
The prediction labels can take three values:
- 1 means the model predicts entailment;
- 0 represents the neutral case;
- -1 corresponds to contradiction.
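If you prefer label names over integers, a hypothetical mapping (not part of the package) could be:
LABELS = {1: 'entailment', 0: 'neutral', -1: 'contradiction'}
print(LABELS[-1]) # -> contradiction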
How it was trained
- Train paraphrase-multilingual-mpnet-base-v2 on the NLI task;
- Apply knowledge distillation to the output of step 1 with an IT-EN translation dataset, to retain NLI knowledge and improve Italian-language comprehension (a rough sketch of this step follows below).
More details are available in the paper: https://arxiv.org/abs/2309.02887
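As a rough illustration of step 2 (not the actual training script), the distillation objective can be sketched as pulling the student's embeddings of an English sentence and of its Italian translation towards the teacher's embedding of the English sentence; 'teacher', 'student', 'en_batch' and 'it_batch' below are assumed names:
import torch

# Hypothetical sketch of the distillation step.
# teacher: the NLI-tuned model from step 1 (frozen); student: the model being distilled;
# en_batch / it_batch: parallel English-Italian sentences from a translation dataset.
def distillation_loss(teacher, student, en_batch, it_batch):
    with torch.no_grad():
        target = teacher(en_batch)  # teacher embeddings for the English sentences
    mse = torch.nn.functional.mse_loss
    return mse(student(en_batch), target) + mse(student(it_batch), target)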
Usage #1 (HuggingFace Transformers)
In the environment in which you want to run the project, type:
pip install --extra-index-url https://test.pypi.org/simple/ ispin
NOTE: during the first execution, two models will be downloaded:
- I-SPIn;
- paraphrase-multilingual-mpnet-base-v2.
Each is roughly 1 GB in size.
Retrieve embeddings
If you installed the package correctly, you can retrieve embeddings in the following way:
from ispin.ISPIn import ISPIn
model = ISPIn.from_pretrained('Lutech-AI/I-SPIn')
sentences = ['Questa è una frase di prova', 'Testando il funzionamento del modello']
sentence_embeddings = model(sentences)
print(sentence_embeddings.shape) # -> torch.Size([2, 768])
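The returned embeddings are plain torch tensors, so you can, for example, compare the two sentences with cosine similarity (a hypothetical use of the output above):
import torch

# Cosine similarity between the two example sentences' embeddings.
similarity = torch.nn.functional.cosine_similarity(
    sentence_embeddings[0], sentence_embeddings[1], dim=0
)
print(similarity) # scalar tensor in [-1, 1]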
Retrieve labels
If you installed the package correctly, you can retrieve labels in the following way:
from ispin.ISPIn import ISPIn
model = ISPIn.from_pretrained('Lutech-AI/I-SPIn')
premises = ['Il modello sta funzionando correttamente', 'Il modello non funziona correttamente']
hypothesis = ['Testando il funzionamento del modello']
premises_embeddings = model(premises)
hypothesis_embeddings = model(hypothesis)
predictions = model.predict(
premises_embeddings,
hypothesis_embeddings,
one_to_many = False
)
print(predictions) # -> [0 -1]
The computation is split into two steps (embedding, classification) to simplify custom fine-tuning.
If you want to further fine-tune the classification head, you can deepcopy its layers and continue training (choose which layers by slicing the list):
import torch
import copy

# Copy the classification-head layers you want to keep training; 'start' and 'end'
# are slice indices of your choice (the head has six linear layers in total).
module_list = torch.nn.ModuleList(list(copy.deepcopy(model.layers))[start:end])
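A minimal training sketch for the copied layers, assuming the slice keeps the whole head (so the first layer expects 1536 input features) and assuming GELU between layers as in the architecture listed below; 'pair_embeddings' and 'labels' are stand-in data, not provided by the package:
import torch

# Hypothetical continuation of training on the copied head (assumed data and label encoding).
pair_embeddings = torch.randn(8, 1536)   # stand-in: concatenated premise/hypothesis embeddings
labels = torch.randint(0, 3, (8,))       # stand-in: class indices for the 3 NLI labels

activation = torch.nn.GELU()
optimizer = torch.optim.AdamW(module_list.parameters(), lr=1e-4)
loss_fn = torch.nn.CrossEntropyLoss()

def forward_head(x):
    # Apply the copied linear layers with GELU in between, mirroring the listed architecture.
    for layer in module_list[:-1]:
        x = activation(layer(x))
    return module_list[-1](x)

for epoch in range(3):
    optimizer.zero_grad()
    loss = loss_fn(forward_head(pair_embeddings), labels)
    loss.backward()
    optimizer.step()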
Usage #2 (cloning repo) (will be deleted)
In a terminal opened in your project folder, run: 'git clone https://huggingface.co/Lutech-AI/I-SPIn/ ISPIn'.
Please specify the final 'ISPIn' to avoid complications when calling the Python module.
Then, in the code where you call the model, substitute the line:
model = ISPIn.from_pretrained('Lutech-AI/I-SPIn')
with:
model = ISPIn.from_pretrained('[your/path]/I-SPIn')
Full model architecture
ISPIn(
(encoder): XLMRobertaModel(...) # transformers internal implementation of 'paraphrase-multilingual-mpnet-base-v2'
(layers): ModuleList(
(0): Linear(in_features=1536, out_features=1024, bias=True)
(1): Linear(in_features=1024, out_features=512, bias=True)
(2): Linear(in_features=512, out_features=256, bias=True)
(3): Linear(in_features=256, out_features=128, bias=True)
(4): Linear(in_features=128, out_features=64, bias=True)
(5): Linear(in_features=64, out_features=3, bias=True)
)
(activation): GELU()
)
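The first linear layer takes 1536 input features, i.e. twice the 768-dimensional sentence embedding, which suggests the head classifies the concatenation of the premise and hypothesis embeddings. A hedged sketch of that flow (an assumption, not the package's actual predict implementation):
import torch

# Assumed forward pass of the classification head for a single sentence pair.
def classify_pair(model, premise_embedding, hypothesis_embedding):
    x = torch.cat([premise_embedding, hypothesis_embedding], dim=-1)  # 768 + 768 = 1536
    for layer in model.layers[:-1]:
        x = model.activation(layer(x))
    logits = model.layers[-1](x)  # 3 logits
    return logits.argmax(dim=-1)  # class index; mapping to {-1, 0, 1} is assumed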
Evaluation results
| Dataset | Metric | Performance |
|---|---|---|
| RTE3-ITA | Accuracy | 68% |
| RTE3-ITA | Min F1-Score | 60% |
| RTE-2009-ITA | Accuracy | 59% |
| RTE-2009-ITA | Min F1-Score | 31% |
| SNLI (IT) translated w/NLLB-600M | Accuracy | 74% |
| MNLI-Matched (IT) translated w/NLLB-600M | Accuracy | 72% |
| MNLI-Mismatched (IT) translated w/NLLB-600M | Accuracy | 73% |
NOTE: RTE3-ITA and RTE-2009-ITA have no 'neutral' class. Hence, during testing on those datasets, whenever the model predicted 'neutral' for a sentence pair, the prediction was relabeled as 'contradiction'.
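For example, that relabeling can be reproduced with a hypothetical snippet like this (stand-in arrays, not the actual evaluation code):
import numpy as np

# Collapse 'neutral' (0) into 'contradiction' (-1) before scoring on two-class datasets.
predictions = np.array([1, 0, -1, 0])   # stand-in model outputs
gold = np.array([1, -1, -1, 1])         # stand-in gold labels (entailment / contradiction only)
predictions[predictions == 0] = -1
accuracy = (predictions == gold).mean()
print(accuracy)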