Finetuning

by HassanStar - opened Sep 29, 2023

Discussion

HassanStar

Sep 29, 2023

Can you please explain in more detail how to fine-tune it on a new classification task, and share the code to use?

akhtet

Oct 2, 2023

It is likely that the model already works with your task without additional fine-tuning, as it supports zero-shot classification. You can try the sample code provided on the model card.

MoritzLaurer

Owner Oct 5, 2023

Hi @HassanStar, you have two options for fine-tuning: (1) Standard fine-tuning, as with any BERT model. This deletes the universal NLI task head and creates a new classification head for your task. (2) Continued fine-tuning of the model including the universal NLI head. This is recommended if you have roughly <= 1000 datapoints; if you have more than 2000 datapoints, normal fine-tuning is probably better. You can find example code for this in notebook nr. 4 here: https://github.com/MoritzLaurer/summer-school-transformers-2023

And as @akhtet said, you can also use it without fine-tuning with the example code from the model card.
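
For option (1), a minimal sketch might look like the following. It assumes the Hugging Face `transformers` library; `model_name` is a placeholder for the Hub ID of the model you want to tune, and the helper names `build_label_maps` and `load_with_new_head` are hypothetical, not part of any library.

```python
def build_label_maps(class_names):
    """Return (id2label, label2id) dictionaries for a list of class names."""
    id2label = dict(enumerate(class_names))
    label2id = {name: idx for idx, name in id2label.items()}
    return id2label, label2id


def load_with_new_head(model_name, class_names):
    """Load a checkpoint with a classification head sized for class_names.

    model_name is a placeholder: pass the Hub ID of the model to fine-tune.
    """
    # Imported lazily so the label helper above works without transformers.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    id2label, label2id = build_label_maps(class_names)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name,
        num_labels=len(class_names),
        id2label=id2label,
        label2id=label2id,
        # If the checkpoint's NLI head has a different output size than
        # len(class_names), its weights are discarded and a new head is
        # randomly initialized instead of raising a size-mismatch error.
        ignore_mismatched_sizes=True,
    )
    return tokenizer, model
```

The returned model can then be trained like any other sequence classifier, for example with the `Trainer` class. Note that the new head starts from random weights, which is one reason this route tends to need more labeled data than option (2).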

Gurdikyan1

Jun 14, 2024

"you can continuously fine-tune the model including the universal NLI head (this is recommended if you have roughly <= 1000 datapoints. If you have more data than 2000, normal fine-tuning is probably better). You can find example code for this in notebook nr. 4 here: https://github.com/MoritzLaurer/summer-school-transformers-2023".

I was wondering how multiple classes would change the data preparation.

In the example notebook, we have a "positive military" hypothesis, a "negative military" hypothesis and a "not about military" hypothesis, each labeled true or not true. What about when we have more than these 3 classes? For example:

classifying a "food related prompt" into 4 classes: "relating to fruit", "relating to vegetables", "relating to chocolate", "relating to meat".

Would it look something like:

text: "I really like strawberries", hypothesis: "this text relates to fruit", label_nli_explicit: True
text: "I really like strawberries", hypothesis: "this text relates to vegetables", label_nli_explicit: False
text: "I really like strawberries", hypothesis: "this text relates to chocolate", label_nli_explicit: False
text: "I really like strawberries", hypothesis: "this text relates to meat", label_nli_explicit: False

akhtet

Jun 14, 2024

@Gurdikyan1 I think your example is mostly correct, but for NLI heads there should be 3 label classes (Entailment, Contradiction and Neutral) instead of True/False.

When you prepare the fine-tuning dataset, your examples should include all 3 different types.

MoritzLaurer

Owner Jun 27, 2024

@Gurdikyan1 yes, that format would work. In practice, with multiple classes, I've always provided only the True class and then randomly chosen ONE False class. Otherwise the dataset becomes huge with many classes, training takes much longer and the model learns to overpredict "False".
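
As a rough sketch of that sampling scheme, using the food example from above: each text gets one row for its true hypothesis and one row for a single randomly chosen false hypothesis. The function name `build_nli_pairs`, the `HYPOTHESES` mapping and the `seed` parameter are assumptions made for illustration; only the `label_nli_explicit` column name comes from this thread.

```python
import random

# Hypothetical mapping from class name to hypothesis text.
HYPOTHESES = {
    "fruit": "this text relates to fruit",
    "vegetables": "this text relates to vegetables",
    "chocolate": "this text relates to chocolate",
    "meat": "this text relates to meat",
}


def build_nli_pairs(examples, hypotheses, seed=0):
    """Turn (text, class_name) pairs into NLI-style rows.

    Each example yields two rows: the true hypothesis (label True) and
    one randomly sampled hypothesis of another class (label False).
    """
    rng = random.Random(seed)  # seeded so the sampling is reproducible
    rows = []
    for text, true_class in examples:
        rows.append({
            "text": text,
            "hypothesis": hypotheses[true_class],
            "label_nli_explicit": True,
        })
        # Sample exactly ONE wrong class instead of adding all of them.
        other_classes = [c for c in hypotheses if c != true_class]
        false_class = rng.choice(other_classes)
        rows.append({
            "text": text,
            "hypothesis": hypotheses[false_class],
            "label_nli_explicit": False,
        })
    return rows


rows = build_nli_pairs([("I really like strawberries", "fruit")], HYPOTHESES)
for row in rows:
    print(row)
```

This keeps the dataset at two rows per text regardless of the number of classes, and the True/False labels stay balanced.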

@akhtet, classical NLI has 3 labels, but for 0-shot or few-shot classification you actually only need two (True/False). You can merge the contradiction and neutral classes into the False class.
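
If you start from a classical 3-label NLI dataset, the merge could be sketched like this. The string label names are an assumption; many NLI datasets store integer IDs instead, and the ID order differs between datasets, so check the mapping of your dataset first. `to_binary_label` is a hypothetical helper name.

```python
def to_binary_label(nli_label):
    """Collapse a 3-class NLI label name into True/False.

    "entailment" becomes True; "contradiction" and "neutral" both become False.
    """
    normalized = nli_label.strip().lower()
    if normalized == "entailment":
        return True
    if normalized in {"contradiction", "neutral"}:
        return False
    raise ValueError(f"Unknown NLI label: {nli_label!r}")


labels = ["Entailment", "Neutral", "Contradiction"]
binary_labels = [to_binary_label(label) for label in labels]
print(binary_labels)
```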
