Zero-shot classification pipeline and manual PyTorch versions give different results

#3 · opened by james92

Hi, thanks for the model. Correct me if I am wrong, please. I took both versions, i.e. the code under the zero-shot classification pipeline and the code under the manual PyTorch section, and ran them against the labels ['Positive', 'Neutral', 'Negative'] for the sequence "one day I will see the world". Below are the results.

Results (from zero-shot classification pipeline)
{'sequence': 'one day I will see the world', 'labels': ['Positive', 'Negative', 'Neutral'], 'scores': [0.48784172534942627, 0.26007547974586487, 0.25208279490470886]}

Results (from manual PyTorch version, for the label 'Positive')
tensor([0.2946], grad_fn=<SelectBackward0>)

If you compare the two results for the label 'Positive', there is a huge variation. I ran the exact code given on the model page in order to test it. Am I doing anything wrong? Please help me. Thank you.

Extra Information
The output of the manual PyTorch method after applying softmax to all three logits:
tensor([[0.0874, 0.8761, 0.0365]], grad_fn=<SoftmaxBackward0>)

Same here, @james92, were you able to solve it?

Sorry, no, I couldn't. How about you?

Would you be able to share your code, please, @james92? It seems hard to debug without it. Thanks! :)

Hi, I think I found something in the model's config. Just print model.config and you will see the following:
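For example (a minimal sketch, assuming the standard Auto classes):

```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("facebook/bart-large-mnli")
print("model config:", model.config)
```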

model config: BartConfig {
  "_name_or_path": "facebook/bart-large-mnli",
  "_num_labels": 3,
  "activation_dropout": 0.0,
  "activation_function": "gelu",
  "add_final_layer_norm": false,
  "architectures": [
    "BartForSequenceClassification"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 0,
  "classif_dropout": 0.0,
  "classifier_dropout": 0.0,
  "d_model": 1024,
  "decoder_attention_heads": 16,
  "decoder_ffn_dim": 4096,
  "decoder_layerdrop": 0.0,
  "decoder_layers": 12,
  "decoder_start_token_id": 2,
  "dropout": 0.1,
  "encoder_attention_heads": 16,
  "encoder_ffn_dim": 4096,
  "encoder_layerdrop": 0.0,
  "encoder_layers": 12,
  "eos_token_id": 2,
  "forced_eos_token_id": 2,
  "gradient_checkpointing": false,
  "id2label": {
    "0": "contradiction",
    "1": "neutral",
    "2": "entailment"
  },
  "init_std": 0.02,
  "is_encoder_decoder": true,
  "label2id": {
    "contradiction": 0,
    "entailment": 2,
    "neutral": 1
  },
  "max_position_embeddings": 1024,
  "model_type": "bart",
  "normalize_before": false,
  "num_hidden_layers": 12,
  "output_past": false,
  "pad_token_id": 1,
  "scale_embedding": false,
  "transformers_version": "4.27.3",
  "use_cache": true,
  "vocab_size": 50265
}

As you can see, the default id2label mapping is:

"0": "contradiction",
"1": "neutral",
"2": "entailment"

so a softmax over all three logits gives MNLI-class probabilities, not probabilities for your candidate labels; the highest probability in your tensor (0.8761, at index 1) is actually "neutral".
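For reference, here is roughly what the model card's manual recipe does (a sketch, assuming the standard transformers API; variable names are mine):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "facebook/bart-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
nli_model = AutoModelForSequenceClassification.from_pretrained(model_name)

premise = "one day I will see the world"
label = "Positive"
hypothesis = f"This example is {label}."  # the model card's hypothesis template

inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = nli_model(**inputs).logits  # shape (1, 3): [contradiction, neutral, entailment]

# A softmax over all three logits yields MNLI-class probabilities
# (like the tensor you posted above), NOT the probability of the label:
print(logits.softmax(dim=-1))

# The model card instead throws away "neutral" (index 1) and softmaxes
# over [contradiction, entailment]; the entailment probability is then
# read as the probability that the label is true:
entail_contradiction_logits = logits[:, [0, 2]]
probs = entail_contradiction_logits.softmax(dim=-1)
print(probs[:, 1])  # P(label is true)
```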

Try changing the hypothesis to 'This example is positive.'
You may find that the probability of entailment becomes the highest one.
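One more detail that may account for the remaining gap: 0.0365 / (0.0874 + 0.0365) ≈ 0.2946, which is exactly the manual number you posted, so your drop-"neutral" computation itself looks right. The pipeline reports different scores because, with multi_label=False (its default when you pass several candidate labels), it softmaxes the entailment logits across all candidate labels. A sketch of that normalization (reusing premise, tokenizer, and nli_model from the snippet above):

```python
# One entailment logit per candidate label, then a softmax across labels --
# this is roughly how the pipeline turns per-label NLI scores into the
# numbers it reports when multi_label=False:
candidate_labels = ["Positive", "Neutral", "Negative"]
entail_logits = []
for label in candidate_labels:
    hypothesis = f"This example is {label}."  # the pipeline's default template
    inputs = tokenizer(premise, hypothesis, return_tensors="pt")
    with torch.no_grad():
        logits = nli_model(**inputs).logits
    entail_logits.append(logits[0, 2])  # index 2 = "entailment"

scores = torch.stack(entail_logits).softmax(dim=-1)
print(dict(zip(candidate_labels, scores.tolist())))
```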
