sileod/deberta-v3-base-tasksource-nli


Which datasets are included in the NLI training data / NLI head?

by MoritzLaurer HF staff - opened Jul 10, 2023


Very interesting model and multi-task learning approach!
Which datasets were included for training the NLI classification head that is used for 0-shot classification? I understand that it is mostly this collection https://huggingface.co/datasets/tasksource/zero-shot-label-nli ? Was something else included for training the NLI head?
Did you use a binary head (entailment vs. contradiction+neutral)?

sileod

Owner Jul 10, 2023

Hi, thank you! It's a three-way NLI head.
I used weight sharing for almost all three-way classification tasks, so the backbone + head was trained on dozens of NLI datasets (including label-nli).
Thus, the checkpoint as loaded by Hugging Face can perform zero-shot classification (it was trained to do so) as well as standard NLI.
In addition, using tasknet.load_pipeline, you can change the head and a task embedding. By changing the head, you can directly predict for a new task (e.g. sentiment analysis). The task embedding helps the model focus on a task. For NLI tasks, even if you don't change the head, you can change the task embedding with tasknet.load_pipeline and make the model a bit more focused on entailment-based zero-shot NLI, for instance.
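
For readers new to entailment-based zero-shot classification, here is a minimal sketch of how a three-way NLI head can be turned into label scores: each candidate label becomes a hypothesis, the entailment logit is read off for each one, and a softmax is taken across labels. The logits below are made up for illustration, and `entailment_index=0` is an assumption (check `model.config.id2label` for the real order).

```python
import math

def zero_shot_scores(logits_per_label, entailment_index=0):
    """Turn per-label three-way NLI logits into a distribution over labels.

    logits_per_label maps each candidate label to the three logits the NLI
    head produced for (text, hypothesis-about-that-label).
    entailment_index is an assumption; verify it against the model config.
    """
    labels = list(logits_per_label)
    entail = [logits_per_label[label][entailment_index] for label in labels]
    # Numerically stable softmax over the entailment logits only.
    top = max(entail)
    exps = [math.exp(x - top) for x in entail]
    total = sum(exps)
    return {label: e / total for label, e in zip(labels, exps)}

# Made-up logits, one triple per candidate label (illustration only).
example_logits = {
    "Christmas": [3.1, 0.2, -2.5],
    "Halloween": [-1.0, 0.5, 1.8],
    "Thanksgiving": [0.4, 0.9, -0.3],
}
scores = zero_shot_scores(example_logits)
best_label = max(scores, key=scores.get)
```

This mirrors the single-label mode of zero-shot classification; it is not the model's training code.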

MoritzLaurer changed discussion status to closed Jul 10, 2023

automatron900

Jan 25, 2024


@sileod , thank you for this model!
I have been playing around with this model for a while now and it is really interesting. I'm new to NLI, so forgive my dumb questions xD
I have been using this model within the Hugging Face pipeline, somewhat like this: nlp = pipeline('zero-shot-classification', model=model_dir, token=token_dir). This has been working fine, but when I looked into improving it without fine-tuning, I stumbled upon this comment. How would one go about making this more specific to, let's say, classification of some popular US holidays?
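
A slightly fuller sketch of that pipeline call, for reference. The holiday labels and the hypothesis template are placeholders chosen for illustration, and `model_dir` / `token` stand for whatever path and access token you already use. The import is done lazily so the snippet can be read and loaded without transformers installed.

```python
# Placeholder labels and template (assumptions, adjust to your use case).
HOLIDAY_LABELS = ["Christmas", "Thanksgiving", "Halloween", "Independence Day"]
HYPOTHESIS_TEMPLATE = "This text is about {}."

def build_classifier(model_dir, token=None):
    """Return a function that classifies one text against HOLIDAY_LABELS."""
    from transformers import pipeline  # imported here to keep the module light

    nlp = pipeline("zero-shot-classification", model=model_dir, token=token)

    def classify(text):
        # The pipeline fills the template with each label to form hypotheses.
        return nlp(
            text,
            candidate_labels=HOLIDAY_LABELS,
            hypothesis_template=HYPOTHESIS_TEMPLATE,
        )

    return classify
```

Calling `build_classifier(model_dir, token)("We carved pumpkins last night")` should return a dict with `sequence`, `labels` and `scores` keys. Rewording the template and the label names is the main lever available without any fine-tuning.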

sileod

Owner Jan 25, 2024


@automatron900
Hi, thank you for your kind words!
Improving the model is mostly done through some form of fine-tuning.
I suggest using tasknet for an easier experience: https://github.com/sileod/tasknet
The main thing to do is to convert your dataset to a Hugging Face Dataset.
For more info about how it works:
https://huggingface.co/docs/transformers/tasks/sequence_classification

automatron900

Jan 26, 2024

Yup, that was the direction i was thinking in, however i am worried about overfitting the model. I could choose x instances of yay and nay for all classes. But how would you ensure model generalization abilities remain?

sileod

Owner Jan 26, 2024

You should monitor validation accuracy.
You can try to make the validation split as different as possible from the training split to emphasize generalization.
Early stopping also helps preserve generalization.
You can even formulate your own data as an NLI task to make the most of the initial capabilities.
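
As a rough sketch of the last two suggestions, labeled examples can be expanded into premise/hypothesis pairs, one per candidate label. Everything here is illustrative: the helper name, the example rows and the templates are made up, and marking non-matching labels as "contradiction" (rather than "neutral") is an assumption that suits mutually exclusive classes.

```python
def to_nli_examples(rows, labels, template="This text is about {}."):
    """Expand (text, gold_label) rows into NLI-style training examples."""
    examples = []
    for text, gold in rows:
        for label in labels:
            examples.append({
                "premise": text,
                "hypothesis": template.format(label),
                # Assumption: classes are mutually exclusive.
                "label": "entailment" if label == gold else "contradiction",
            })
    return examples

labels = ["Christmas", "Halloween", "Thanksgiving"]

# Made-up rows, for illustration only.
train_rows = [
    ("We decorated the tree and opened presents.", "Christmas"),
    ("The kids went trick-or-treating.", "Halloween"),
]
val_rows = [("Everyone gathered for turkey and pie.", "Thanksgiving")]

train_pairs = to_nli_examples(train_rows, labels)
# A different template on the validation side makes the split less similar
# to training, which puts more weight on generalization.
val_pairs = to_nli_examples(val_rows, labels, template="The holiday described is {}.")
```

The resulting dicts can then be loaded into a Hugging Face Dataset and used to monitor validation accuracy during fine-tuning.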

sileod changed discussion status to open Jan 26, 2024

automatron900

Jan 26, 2024

Awesome!
I like the idea of creating an NLI task for this with my own dataset. Is tasknet the way to do it?

sileod

Owner Feb 1, 2024

Tasknet, or this for few-shot specifically: https://github.com/Knowledgator/LiqFit

automatron900

Feb 2, 2024

Oh, this is very simple and perfect!
Thank you!
