Lutech-AI/I-SPIn
Italian-Sentence Pair Inference, AKA I-SPIn.
This is a fine-tuned version of the model paraphrase-multilingual-mpnet-base-v2.
Its main task is Natural Language Inference (NLI) in Italian.
Each prediction takes one of three label values:
- 1 means the model predicts entailment;
- 0 represents the neutral case;
- -1 corresponds to contradiction.
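The label convention above can be wrapped in a small lookup so that raw predictions print as readable names. This is a minimal sketch: `LABEL_NAMES` and `decode_labels` are illustrative names, not part of the `ispin` package.

```python
# Label ids as documented above; the dict and helper names are hypothetical.
LABEL_NAMES = {1: "entailment", 0: "neutral", -1: "contradiction"}


def decode_labels(predictions):
    """Convert an iterable of label ids (-1, 0, 1) into readable names."""
    return [LABEL_NAMES[int(p)] for p in predictions]


print(decode_labels([0, -1]))  # ['neutral', 'contradiction']
```

Casting each item with `int()` lets the helper accept plain Python integers as well as integer-like values from an array.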
How it was trained
1. Train paraphrase-multilingual-mpnet-base-v2 on the NLI task;
2. Apply knowledge distillation to the model obtained in step 1, using an IT-EN translation dataset, to retain NLI knowledge and improve Italian language comprehension.
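The exact distillation objective is not given here. As a rough illustration only, one common formulation for transferring sentence embeddings to another language uses a translation pair: the student is trained so that its embeddings of both the English sentence and its Italian translation match the teacher's embedding of the English sentence, under a mean squared error. Whether I-SPIn uses this precise loss is an assumption, and all names below are hypothetical.

```python
def mse(u, v):
    """Mean squared error between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)


def distillation_loss(teacher_en, student_en, student_it):
    """Assumed objective: pull both student embeddings toward the teacher's English embedding."""
    return mse(teacher_en, student_en) + mse(teacher_en, student_it)


# Toy 3-dimensional embeddings standing in for real sentence vectors.
teacher_en = [1.0, 0.0, 2.0]
student_en = [1.0, 0.0, 2.0]  # already matches the teacher
student_it = [0.0, 0.0, 2.0]  # Italian translation, off in one dimension
loss = distillation_loss(teacher_en, student_en, student_it)
print(loss)
```

In a real training loop the vectors would be batches of model outputs and the loss would be minimized by gradient descent on the student's weights.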
More details are available in the paper: https://arxiv.org/abs/2309.02887
Usage #1 (HuggingFace Transformers)
In the environment where you want to run the project, type:
pip install --extra-index-url https://test.pypi.org/simple/ ispin
NOTE: during the first execution, a total of two different models will be downloaded:
- I-SPIn;
- paraphrase-multilingual-mpnet-base-v2.
Each is roughly 1 GB in size.
Retrieve embeddings
Once the package is installed, you can retrieve embeddings as follows:
from ispin.ISPIn import ISPIn
model = ISPIn.from_pretrained('Lutech-AI/I-SPIn')
sentences = ['Questa è una frase di prova', 'Testando il funzionamento del modello']
sentence_embeddings = model(sentences)
print(sentence_embeddings.shape) # -> torch.Size([2, 768])
Retrieve labels
Once the package is installed, you can retrieve labels as follows:
from ispin.ISPIn import ISPIn
model = ISPIn.from_pretrained('Lutech-AI/I-SPIn')
premises = ['Il modello sta funzionando correttamente', 'Il modello non funziona correttamente']
hypothesis = ['Testando il funzionamento del modello']
premises_embeddings = model(premises)
hypothesis_embeddings = model(hypothesis)
predictions = model.predict(
    premises_embeddings,
    hypothesis_embeddings,
    one_to_many=False
)
print(predictions) # -> [0 -1]
The computation is split into two steps (embedding, then classification) to simplify custom fine-tuning.
To optimize the classification head further, you can deep-copy its layers and continue training; slice the list to choose which layers to keep:
import torch
import copy
# 'start' and 'end' are placeholders: set them to the range of layers you want to keep
module_list = torch.nn.ModuleList(list(copy.deepcopy(model.layers))[start:end])
Usage #2 (cloning repo) (will be deleted)
In a terminal opened in your project folder, run: 'git clone https://huggingface.co/Lutech-AI/I-SPIn/ ISPIn'.
Keep the final 'ISPIn' argument: it names the target folder and avoids complications when calling the Python module.
Then, in the code where you call the model, replace the line:
model = ISPIn.from_pretrained('Lutech-AI/I-SPIn')
with:
model = ISPIn.from_pretrained('[your/path]/ISPIn')
Full model architecture
ISPIn(
  (encoder): XLMRobertaModel(...) # transformers internal implementation of 'paraphrase-multilingual-mpnet-base-v2'
  (layers): ModuleList(
    (0): Linear(in_features=1536, out_features=1024, bias=True)
    (1): Linear(in_features=1024, out_features=512, bias=True)
    (2): Linear(in_features=512, out_features=256, bias=True)
    (3): Linear(in_features=256, out_features=128, bias=True)
    (4): Linear(in_features=128, out_features=64, bias=True)
    (5): Linear(in_features=64, out_features=3, bias=True)
  )
  (activation): GELU()
)
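The size of the classification head follows directly from the layer shapes printed above. The sketch below copies those shapes and counts weights plus biases. The input width of 1536 is exactly twice the 768-dimensional sentence embedding, which is consistent with the head receiving premise and hypothesis embeddings joined together, although how they are combined is an assumption. `HEAD_LAYERS` and `linear_params` are illustrative names.

```python
# (in_features, out_features) for each Linear layer, copied from the printout above.
HEAD_LAYERS = [(1536, 1024), (1024, 512), (512, 256), (256, 128), (128, 64), (64, 3)]


def linear_params(in_features, out_features):
    """Weights plus biases of a single fully connected layer."""
    return in_features * out_features + out_features


# Each layer's output width must equal the next layer's input width.
assert all(a[1] == b[0] for a, b in zip(HEAD_LAYERS, HEAD_LAYERS[1:]))

head_params = sum(linear_params(i, o) for i, o in HEAD_LAYERS)
print(head_params)  # 2271363
```

At roughly 2.3 million parameters, the head is small next to the encoder, which is why continuing training on the head alone is comparatively cheap.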
Evaluation results
Dataset | Metric | Performance |
---|---|---|
RTE3-ITA | Accuracy | 68% |
RTE3-ITA | Min F1-Score | 60% |
RTE-2009-ITA | Accuracy | 59% |
RTE-2009-ITA | Min F1-Score | 31% |
SNLI (IT) translated w/NLLB-600M | Accuracy | 74% |
MNLI-Matched (IT) translated w/NLLB-600M | Accuracy | 72% |
MNLI-Mismatched (IT) translated w/NLLB-600M | Accuracy | 73% |
NOTE: RTE3-ITA and RTE-2009-ITA have no 'neutral' class. During testing on those datasets, whenever the model classified a sentence pair as 'neutral', the prediction was relabeled as 'contradiction'.
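The relabeling described in the note can be sketched as a post-processing step applied before scoring. The example below is a minimal illustration on toy data, not the authors' evaluation script. It reads "Min F1-Score" as the lowest per-class F1, which is an assumption, and all function names are hypothetical.

```python
def collapse_neutral(predictions):
    """Map 'neutral' (0) to 'contradiction' (-1); leave other labels untouched."""
    return [-1 if p == 0 else p for p in predictions]


def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_class(y_true, y_pred, label):
    """F1 score of one class, treating it as the positive class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def min_f1(y_true, y_pred, labels=(1, -1)):
    """Lowest per-class F1 (assumed meaning of 'Min F1-Score')."""
    return min(f1_for_class(y_true, y_pred, label) for label in labels)


gold = [1, 1, -1, -1]        # two-class gold labels (toy data)
raw = [1, 0, 0, -1]          # three-class model output (toy data)
pred = collapse_neutral(raw)
print(pred)                  # [1, -1, -1, -1]
print(accuracy(gold, pred))  # 0.75
print(min_f1(gold, pred))
```

Reporting the minimum per-class F1 alongside accuracy exposes cases where one class is predicted much worse than the other, as the gap between the two RTE-2009-ITA rows suggests.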