Identical Probabilities

#1
by lucas-weinfurtner - opened

Hello Mr. Sun,

I find this work really interesting and appreciate the effort you put into it.
I tried to use the model with the provided sample code on a variety of sentences with different meanings. However, the returned probabilities of the five personality traits are identical for every sentence. For each sentence they look like this:
[0.3964995 0.6713814 0.3282923 0.002636 0.46445456], as also presented in your example.
I also tested the Hugging Face app at 'https://huggingface.co/spaces/KevSun/Personality_Test' with different sentences. In the space the probabilities differ from those of the locally executed model, but they are again identical across different sentences.
The probabilities in the space look like this:
Extraversion: 0.2002
Agreeableness: 0.2635
Conscientiousness: 0.1870
Neuroticism: 0.1350
Openness: 0.2143

Do you have an idea what could cause this issue or how to resolve it?

I would gladly use the model in my research.

Best Regards
Lucas

Owner

Hi Lucas,

Many thanks for your interest in the model.

The demo example in the README shows output values; however, those values are not real results computed by the model.

Best, Kevin

Hi Kevin,

thank you for your quick answer to my question.
I am running not only the demo example but also a variety of sentences. In the following code I analyze three sentences with different meanings, yet the output probabilities are identical (up to the 6th decimal place):

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model = AutoModelForSequenceClassification.from_pretrained("KevSun/Personality_LM")
tokenizer = AutoTokenizer.from_pretrained("KevSun/Personality_LM")

sent_1 = "I like to talk to other people."
sent_2 = "I like to be alone."
sent_3 = "Other people hate me."

for sent in [sent_1, sent_2, sent_3]:

    sent_tokenized = tokenizer(sent, return_tensors='pt', padding=True, truncation=True, max_length=64)

    model.eval()

    with torch.no_grad():
        outputs = model(**sent_tokenized)

    predictions = outputs.logits.squeeze()
    predicted_scores = predictions.numpy()

    print(predicted_scores)

The output is the following:
[0.39649963 0.6713815 0.32829234 0.00263596 0.46445447]
[0.39649957 0.67138135 0.32829234 0.00263597 0.46445444]
[0.39649963 0.6713814 0.32829228 0.00263601 0.46445447]

Did I do something wrong when calling the model?
I appreciate your time in providing an answer to my question.

Best Regards,
Lucas

Owner

Hi Lucas,

Thanks for your message!

There are two ways to compute scores with this model.

The "torch.nn.functional.softmax" function converts the logits to probabilities, which is the appropriate way to use the model for classification. With longer, more complex sentences the outputs are likely to show more variation.

Once you use longer sentences as input, the results vary considerably. Also note that this is a personality model, not an emotion model; longer texts provide more information for the model to work with.

It seems that the model is not sensitive to short sentences. The reason is that the training dataset consists of long texts.
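
For example, a minimal sketch of the conversion (assuming the model and tokenizer loaded as in your code):

import torch

inputs = tokenizer("I like to talk to other people.", return_tensors='pt')
with torch.no_grad():
    outputs = model(**inputs)

# Raw logits: this is what your code prints, and it stays almost constant across sentences.
print(outputs.logits.squeeze().tolist())

# Softmax turns the logits into a probability distribution over the five traits.
probs = torch.nn.functional.softmax(outputs.logits, dim=-1)
print(probs[0].tolist())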

Best, Kevin

I am facing the same issue. I have tried all kinds of sentences (longer and shorter), but I am getting similar values for all inputs:

Analysis Result:
Agreeableness: 0.3910
Openness: 0.6922
Conscientiousness: 0.3259
Extraversion: 0.0042
Neuroticism: 0.4663

Owner

The original model may not be sensitive to short sentences. I have upgraded the model. Using the examples from Lucas, we get the following results:

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model = AutoModelForSequenceClassification.from_pretrained("personality_model3b")
tokenizer = AutoTokenizer.from_pretrained("personality_model3b")

sent_1 = "I like to talk to other people."
sent_2 = "I like to be alone."
sent_3 = "Other people hate me."
sent_4 = "I am very new to Python and need some help."

for sent in [sent_1, sent_2, sent_3, sent_4]:

    sent_tokenized = tokenizer(sent, return_tensors='pt', padding=True, truncation=True, max_length=64)

    model.eval()

    with torch.no_grad():
        outputs = model(**sent_tokenized)

    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_scores = predictions[0].tolist()

    print(predicted_scores)

output:
[0.3072283864021301, 0.2409825474023819, 0.15583530068397522, 0.13852088153362274, 0.1574328988790512]
[0.29344865679740906, 0.2337641716003418, 0.17011244595050812, 0.14260680973529816, 0.16006793081760406]
[0.2811926603317261, 0.2377626895904541, 0.15991169214248657, 0.151767760515213, 0.16936518251895905]
[0.2979387044906616, 0.24921320378780365, 0.15489214658737183, 0.1372152864933014, 0.1607406884431839]

The output values now differ from each other, and the differences become more pronounced as the input sentences get longer.
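
For instance, you could pass a longer, multi-sentence text through the same pipeline (a made-up example text, reusing the imports, model and tokenizer loaded above):

# Hypothetical longer input: a whole multi-sentence review instead of a single sentence.
review = ("I ordered this product last week and it arrived quickly. "
          "The build quality is better than I expected, although the manual is confusing. "
          "I would happily recommend it to friends who like tinkering with gadgets.")

inputs = tokenizer(review, return_tensors='pt', truncation=True, max_length=128)
with torch.no_grad():
    probs = torch.nn.functional.softmax(model(**inputs).logits, dim=-1)
print(probs[0].tolist())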

Please do not use the following to compute probabilities:
#predictions = outputs.logits.squeeze()
#predicted_scores = predictions.numpy()

Hello Kevin,

thank you for the updates; they look promising.
I tested the updated model on the above sentences with identical code. Although the code executes, I get the following warning:

Some weights of the model checkpoint at KevSun/Personality_LM were not used when initializing RobertaForSequenceClassification: ['hidden_layer.bias', 'hidden_layer.weight', 'output_layer.bias', 'output_layer.weight', 'transformer.embeddings.LayerNorm.bias', 'transformer.embeddings.LayerNorm.weight', 'transformer.embeddings.position_embeddings.weight', 'transformer.embeddings.token_type_embeddings.weight', 'transformer.embeddings.word_embeddings.weight', 'transformer.encoder.layer.0.attention.output.LayerNorm.bias', 'transformer.encoder.layer.0.attention.output.LayerNorm.weight', 'transformer.encoder.layer.0.attention.output.dense.bias', 'transformer.encoder.layer.0.attention.output.dense.weight', 'transformer.encoder.layer.0.attention.self.key.bias', 'transformer.encoder.layer.0.attention.self.key.weight', 'transformer.encoder.layer.0.attention.self.query.bias', 'transformer.encoder.layer.0.attention.self.query.weight', 'transformer.encoder.layer.0.attention.self.value.bias', 'transformer.encoder.layer.0.attention.self.value.weight', 'transformer.encoder.layer.0.intermediate.dense.bias', 'transformer.encoder.layer.0.intermediate.dense.weight', 'transformer.encoder.layer.0.output.LayerNorm.bias', 'transformer.encoder.layer.0.output.LayerNorm.weight', 'transformer.encoder.layer.0.output.dense.bias', 'transformer.encoder.layer.0.output.dense.weight', 'transformer.encoder.layer.1.attention.output.LayerNorm.bias', 'transformer.encoder.layer.1.attention.output.LayerNorm.weight', 'transformer.encoder.layer.1.attention.output.dense.bias', 'transformer.encoder.layer.1.attention.output.dense.weight', 'transformer.encoder.layer.1.attention.self.key.bias', 'transformer.encoder.layer.1.attention.self.key.weight', 'transformer.encoder.layer.1.attention.self.query.bias', 'transformer.encoder.layer.1.attention.self.query.weight', 'transformer.encoder.layer.1.attention.self.value.bias', 'transformer.encoder.layer.1.attention.self.value.weight', 'transformer.encoder.layer.1.intermediate.dense.bias', 'transformer.encoder.layer.1.intermediate.dense.weight', 'transformer.encoder.layer.1.output.LayerNorm.bias', 'transformer.encoder.layer.1.output.LayerNorm.weight', 'transformer.encoder.layer.1.output.dense.bias', 'transformer.encoder.layer.1.output.dense.weight', 'transformer.encoder.layer.10.attention.output.LayerNorm.bias', 'transformer.encoder.layer.10.attention.output.LayerNorm.weight', 'transformer.encoder.layer.10.attention.output.dense.bias', 'transformer.encoder.layer.10.attention.output.dense.weight', 'transformer.encoder.layer.10.attention.self.key.bias', 'transformer.encoder.layer.10.attention.self.key.weight', 'transformer.encoder.layer.10.attention.self.query.bias', 'transformer.encoder.layer.10.attention.self.query.weight', 'transformer.encoder.layer.10.attention.self.value.bias', 'transformer.encoder.layer.10.attention.self.value.weight', 'transformer.encoder.layer.10.intermediate.dense.bias', 'transformer.encoder.layer.10.intermediate.dense.weight', 'transformer.encoder.layer.10.output.LayerNorm.bias', 'transformer.encoder.layer.10.output.LayerNorm.weight', 'transformer.encoder.layer.10.output.dense.bias', 'transformer.encoder.layer.10.output.dense.weight', 'transformer.encoder.layer.11.attention.output.LayerNorm.bias', 'transformer.encoder.layer.11.attention.output.LayerNorm.weight', 'transformer.encoder.layer.11.attention.output.dense.bias', 'transformer.encoder.layer.11.attention.output.dense.weight', 'transformer.encoder.layer.11.attention.self.key.bias', 
'transformer.encoder.layer.11.attention.self.key.weight', 'transformer.encoder.layer.11.attention.self.query.bias', 'transformer.encoder.layer.11.attention.self.query.weight', 'transformer.encoder.layer.11.attention.self.value.bias', 'transformer.encoder.layer.11.attention.self.value.weight', 'transformer.encoder.layer.11.intermediate.dense.bias', 'transformer.encoder.layer.11.intermediate.dense.weight', 'transformer.encoder.layer.11.output.LayerNorm.bias', 'transformer.encoder.layer.11.output.LayerNorm.weight', 'transformer.encoder.layer.11.output.dense.bias', 'transformer.encoder.layer.11.output.dense.weight', 'transformer.encoder.layer.2.attention.output.LayerNorm.bias', 'transformer.encoder.layer.2.attention.output.LayerNorm.weight', 'transformer.encoder.layer.2.attention.output.dense.bias', 'transformer.encoder.layer.2.attention.output.dense.weight', 'transformer.encoder.layer.2.attention.self.key.bias', 'transformer.encoder.layer.2.attention.self.key.weight', 'transformer.encoder.layer.2.attention.self.query.bias', 'transformer.encoder.layer.2.attention.self.query.weight', 'transformer.encoder.layer.2.attention.self.value.bias', 'transformer.encoder.layer.2.attention.self.value.weight', 'transformer.encoder.layer.2.intermediate.dense.bias', 'transformer.encoder.layer.2.intermediate.dense.weight', 'transformer.encoder.layer.2.output.LayerNorm.bias', 'transformer.encoder.layer.2.output.LayerNorm.weight', 'transformer.encoder.layer.2.output.dense.bias', 'transformer.encoder.layer.2.output.dense.weight', 'transformer.encoder.layer.3.attention.output.LayerNorm.bias', 'transformer.encoder.layer.3.attention.output.LayerNorm.weight', 'transformer.encoder.layer.3.attention.output.dense.bias', 'transformer.encoder.layer.3.attention.output.dense.weight', 'transformer.encoder.layer.3.attention.self.key.bias', 'transformer.encoder.layer.3.attention.self.key.weight', 'transformer.encoder.layer.3.attention.self.query.bias', 'transformer.encoder.layer.3.attention.self.query.weight', 'transformer.encoder.layer.3.attention.self.value.bias', 'transformer.encoder.layer.3.attention.self.value.weight', 'transformer.encoder.layer.3.intermediate.dense.bias', 'transformer.encoder.layer.3.intermediate.dense.weight', 'transformer.encoder.layer.3.output.LayerNorm.bias', 'transformer.encoder.layer.3.output.LayerNorm.weight', 'transformer.encoder.layer.3.output.dense.bias', 'transformer.encoder.layer.3.output.dense.weight', 'transformer.encoder.layer.4.attention.output.LayerNorm.bias', 'transformer.encoder.layer.4.attention.output.LayerNorm.weight', 'transformer.encoder.layer.4.attention.output.dense.bias', 'transformer.encoder.layer.4.attention.output.dense.weight', 'transformer.encoder.layer.4.attention.self.key.bias', 'transformer.encoder.layer.4.attention.self.key.weight', 'transformer.encoder.layer.4.attention.self.query.bias', 'transformer.encoder.layer.4.attention.self.query.weight', 'transformer.encoder.layer.4.attention.self.value.bias', 'transformer.encoder.layer.4.attention.self.value.weight', 'transformer.encoder.layer.4.intermediate.dense.bias', 'transformer.encoder.layer.4.intermediate.dense.weight', 'transformer.encoder.layer.4.output.LayerNorm.bias', 'transformer.encoder.layer.4.output.LayerNorm.weight', 'transformer.encoder.layer.4.output.dense.bias', 'transformer.encoder.layer.4.output.dense.weight', 'transformer.encoder.layer.5.attention.output.LayerNorm.bias', 'transformer.encoder.layer.5.attention.output.LayerNorm.weight', 'transformer.encoder.layer.5.attention.output.dense.bias', 
'transformer.encoder.layer.5.attention.output.dense.weight', 'transformer.encoder.layer.5.attention.self.key.bias', 'transformer.encoder.layer.5.attention.self.key.weight', 'transformer.encoder.layer.5.attention.self.query.bias', 'transformer.encoder.layer.5.attention.self.query.weight', 'transformer.encoder.layer.5.attention.self.value.bias', 'transformer.encoder.layer.5.attention.self.value.weight', 'transformer.encoder.layer.5.intermediate.dense.bias', 'transformer.encoder.layer.5.intermediate.dense.weight', 'transformer.encoder.layer.5.output.LayerNorm.bias', 'transformer.encoder.layer.5.output.LayerNorm.weight', 'transformer.encoder.layer.5.output.dense.bias', 'transformer.encoder.layer.5.output.dense.weight', 'transformer.encoder.layer.6.attention.output.LayerNorm.bias', 'transformer.encoder.layer.6.attention.output.LayerNorm.weight', 'transformer.encoder.layer.6.attention.output.dense.bias', 'transformer.encoder.layer.6.attention.output.dense.weight', 'transformer.encoder.layer.6.attention.self.key.bias', 'transformer.encoder.layer.6.attention.self.key.weight', 'transformer.encoder.layer.6.attention.self.query.bias', 'transformer.encoder.layer.6.attention.self.query.weight', 'transformer.encoder.layer.6.attention.self.value.bias', 'transformer.encoder.layer.6.attention.self.value.weight', 'transformer.encoder.layer.6.intermediate.dense.bias', 'transformer.encoder.layer.6.intermediate.dense.weight', 'transformer.encoder.layer.6.output.LayerNorm.bias', 'transformer.encoder.layer.6.output.LayerNorm.weight', 'transformer.encoder.layer.6.output.dense.bias', 'transformer.encoder.layer.6.output.dense.weight', 'transformer.encoder.layer.7.attention.output.LayerNorm.bias', 'transformer.encoder.layer.7.attention.output.LayerNorm.weight', 'transformer.encoder.layer.7.attention.output.dense.bias', 'transformer.encoder.layer.7.attention.output.dense.weight', 'transformer.encoder.layer.7.attention.self.key.bias', 'transformer.encoder.layer.7.attention.self.key.weight', 'transformer.encoder.layer.7.attention.self.query.bias', 'transformer.encoder.layer.7.attention.self.query.weight', 'transformer.encoder.layer.7.attention.self.value.bias', 'transformer.encoder.layer.7.attention.self.value.weight', 'transformer.encoder.layer.7.intermediate.dense.bias', 'transformer.encoder.layer.7.intermediate.dense.weight', 'transformer.encoder.layer.7.output.LayerNorm.bias', 'transformer.encoder.layer.7.output.LayerNorm.weight', 'transformer.encoder.layer.7.output.dense.bias', 'transformer.encoder.layer.7.output.dense.weight', 'transformer.encoder.layer.8.attention.output.LayerNorm.bias', 'transformer.encoder.layer.8.attention.output.LayerNorm.weight', 'transformer.encoder.layer.8.attention.output.dense.bias', 'transformer.encoder.layer.8.attention.output.dense.weight', 'transformer.encoder.layer.8.attention.self.key.bias', 'transformer.encoder.layer.8.attention.self.key.weight', 'transformer.encoder.layer.8.attention.self.query.bias', 'transformer.encoder.layer.8.attention.self.query.weight', 'transformer.encoder.layer.8.attention.self.value.bias', 'transformer.encoder.layer.8.attention.self.value.weight', 'transformer.encoder.layer.8.intermediate.dense.bias', 'transformer.encoder.layer.8.intermediate.dense.weight', 'transformer.encoder.layer.8.output.LayerNorm.bias', 'transformer.encoder.layer.8.output.LayerNorm.weight', 'transformer.encoder.layer.8.output.dense.bias', 'transformer.encoder.layer.8.output.dense.weight', 'transformer.encoder.layer.9.attention.output.LayerNorm.bias', 
'transformer.encoder.layer.9.attention.output.LayerNorm.weight', 'transformer.encoder.layer.9.attention.output.dense.bias', 'transformer.encoder.layer.9.attention.output.dense.weight', 'transformer.encoder.layer.9.attention.self.key.bias', 'transformer.encoder.layer.9.attention.self.key.weight', 'transformer.encoder.layer.9.attention.self.query.bias', 'transformer.encoder.layer.9.attention.self.query.weight', 'transformer.encoder.layer.9.attention.self.value.bias', 'transformer.encoder.layer.9.attention.self.value.weight', 'transformer.encoder.layer.9.intermediate.dense.bias', 'transformer.encoder.layer.9.intermediate.dense.weight', 'transformer.encoder.layer.9.output.LayerNorm.bias', 'transformer.encoder.layer.9.output.LayerNorm.weight', 'transformer.encoder.layer.9.output.dense.bias', 'transformer.encoder.layer.9.output.dense.weight', 'transformer.pooler.dense.bias', 'transformer.pooler.dense.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at KevSun/Personality_LM and are newly initialized: ['classifier.dense.bias', 'classifier.dense.weight', 'classifier.out_proj.bias', 'classifier.out_proj.weight', 'embeddings.LayerNorm.bias', 'embeddings.LayerNorm.weight', 'embeddings.position_embeddings.weight', 'embeddings.token_type_embeddings.weight', 'embeddings.word_embeddings.weight', 'encoder.layer.0.attention.output.LayerNorm.bias', 'encoder.layer.0.attention.output.LayerNorm.weight', 'encoder.layer.0.attention.output.dense.bias', 'encoder.layer.0.attention.output.dense.weight', 'encoder.layer.0.attention.self.key.bias', 'encoder.layer.0.attention.self.key.weight', 'encoder.layer.0.attention.self.query.bias', 'encoder.layer.0.attention.self.query.weight', 'encoder.layer.0.attention.self.value.bias', 'encoder.layer.0.attention.self.value.weight', 'encoder.layer.0.intermediate.dense.bias', 'encoder.layer.0.intermediate.dense.weight', 'encoder.layer.0.output.LayerNorm.bias', 'encoder.layer.0.output.LayerNorm.weight', 'encoder.layer.0.output.dense.bias', 'encoder.layer.0.output.dense.weight', 'encoder.layer.1.attention.output.LayerNorm.bias', 'encoder.layer.1.attention.output.LayerNorm.weight', 'encoder.layer.1.attention.output.dense.bias', 'encoder.layer.1.attention.output.dense.weight', 'encoder.layer.1.attention.self.key.bias', 'encoder.layer.1.attention.self.key.weight', 'encoder.layer.1.attention.self.query.bias', 'encoder.layer.1.attention.self.query.weight', 'encoder.layer.1.attention.self.value.bias', 'encoder.layer.1.attention.self.value.weight', 'encoder.layer.1.intermediate.dense.bias', 'encoder.layer.1.intermediate.dense.weight', 'encoder.layer.1.output.LayerNorm.bias', 'encoder.layer.1.output.LayerNorm.weight', 'encoder.layer.1.output.dense.bias', 'encoder.layer.1.output.dense.weight', 'encoder.layer.10.attention.output.LayerNorm.bias', 'encoder.layer.10.attention.output.LayerNorm.weight', 'encoder.layer.10.attention.output.dense.bias', 'encoder.layer.10.attention.output.dense.weight', 'encoder.layer.10.attention.self.key.bias', 'encoder.layer.10.attention.self.key.weight', 'encoder.layer.10.attention.self.query.bias', 'encoder.layer.10.attention.self.query.weight', 'encoder.layer.10.attention.self.value.bias', 'encoder.layer.10.attention.self.value.weight', 'encoder.layer.10.intermediate.dense.bias', 'encoder.layer.10.intermediate.dense.weight', 'encoder.layer.10.output.LayerNorm.bias', 'encoder.layer.10.output.LayerNorm.weight', 'encoder.layer.10.output.dense.bias', 'encoder.layer.10.output.dense.weight', 'encoder.layer.11.attention.output.LayerNorm.bias', 'encoder.layer.11.attention.output.LayerNorm.weight', 'encoder.layer.11.attention.output.dense.bias', 'encoder.layer.11.attention.output.dense.weight', 'encoder.layer.11.attention.self.key.bias', 'encoder.layer.11.attention.self.key.weight', 'encoder.layer.11.attention.self.query.bias', 'encoder.layer.11.attention.self.query.weight', 'encoder.layer.11.attention.self.value.bias', 'encoder.layer.11.attention.self.value.weight', 'encoder.layer.11.intermediate.dense.bias', 'encoder.layer.11.intermediate.dense.weight', 'encoder.layer.11.output.LayerNorm.bias', 'encoder.layer.11.output.LayerNorm.weight', 'encoder.layer.11.output.dense.bias', 'encoder.layer.11.output.dense.weight', 'encoder.layer.2.attention.output.LayerNorm.bias', 'encoder.layer.2.attention.output.LayerNorm.weight', 'encoder.layer.2.attention.output.dense.bias', 'encoder.layer.2.attention.output.dense.weight', 
'encoder.layer.2.attention.self.key.bias', 'encoder.layer.2.attention.self.key.weight', 'encoder.layer.2.attention.self.query.bias', 'encoder.layer.2.attention.self.query.weight', 'encoder.layer.2.attention.self.value.bias', 'encoder.layer.2.attention.self.value.weight', 'encoder.layer.2.intermediate.dense.bias', 'encoder.layer.2.intermediate.dense.weight', 'encoder.layer.2.output.LayerNorm.bias', 'encoder.layer.2.output.LayerNorm.weight', 'encoder.layer.2.output.dense.bias', 'encoder.layer.2.output.dense.weight', 'encoder.layer.3.attention.output.LayerNorm.bias', 'encoder.layer.3.attention.output.LayerNorm.weight', 'encoder.layer.3.attention.output.dense.bias', 'encoder.layer.3.attention.output.dense.weight', 'encoder.layer.3.attention.self.key.bias', 'encoder.layer.3.attention.self.key.weight', 'encoder.layer.3.attention.self.query.bias', 'encoder.layer.3.attention.self.query.weight', 'encoder.layer.3.attention.self.value.bias', 'encoder.layer.3.attention.self.value.weight', 'encoder.layer.3.intermediate.dense.bias', 'encoder.layer.3.intermediate.dense.weight', 'encoder.layer.3.output.LayerNorm.bias', 'encoder.layer.3.output.LayerNorm.weight', 'encoder.layer.3.output.dense.bias', 'encoder.layer.3.output.dense.weight', 'encoder.layer.4.attention.output.LayerNorm.bias', 'encoder.layer.4.attention.output.LayerNorm.weight', 'encoder.layer.4.attention.output.dense.bias', 'encoder.layer.4.attention.output.dense.weight', 'encoder.layer.4.attention.self.key.bias', 'encoder.layer.4.attention.self.key.weight', 'encoder.layer.4.attention.self.query.bias', 'encoder.layer.4.attention.self.query.weight', 'encoder.layer.4.attention.self.value.bias', 'encoder.layer.4.attention.self.value.weight', 'encoder.layer.4.intermediate.dense.bias', 'encoder.layer.4.intermediate.dense.weight', 'encoder.layer.4.output.LayerNorm.bias', 'encoder.layer.4.output.LayerNorm.weight', 'encoder.layer.4.output.dense.bias', 'encoder.layer.4.output.dense.weight', 'encoder.layer.5.attention.output.LayerNorm.bias', 'encoder.layer.5.attention.output.LayerNorm.weight', 'encoder.layer.5.attention.output.dense.bias', 'encoder.layer.5.attention.output.dense.weight', 'encoder.layer.5.attention.self.key.bias', 'encoder.layer.5.attention.self.key.weight', 'encoder.layer.5.attention.self.query.bias', 'encoder.layer.5.attention.self.query.weight', 'encoder.layer.5.attention.self.value.bias', 'encoder.layer.5.attention.self.value.weight', 'encoder.layer.5.intermediate.dense.bias', 'encoder.layer.5.intermediate.dense.weight', 'encoder.layer.5.output.LayerNorm.bias', 'encoder.layer.5.output.LayerNorm.weight', 'encoder.layer.5.output.dense.bias', 'encoder.layer.5.output.dense.weight', 'encoder.layer.6.attention.output.LayerNorm.bias', 'encoder.layer.6.attention.output.LayerNorm.weight', 'encoder.layer.6.attention.output.dense.bias', 'encoder.layer.6.attention.output.dense.weight', 'encoder.layer.6.attention.self.key.bias', 'encoder.layer.6.attention.self.key.weight', 'encoder.layer.6.attention.self.query.bias', 'encoder.layer.6.attention.self.query.weight', 'encoder.layer.6.attention.self.value.bias', 'encoder.layer.6.attention.self.value.weight', 'encoder.layer.6.intermediate.dense.bias', 'encoder.layer.6.intermediate.dense.weight', 'encoder.layer.6.output.LayerNorm.bias', 'encoder.layer.6.output.LayerNorm.weight', 'encoder.layer.6.output.dense.bias', 'encoder.layer.6.output.dense.weight', 'encoder.layer.7.attention.output.LayerNorm.bias', 'encoder.layer.7.attention.output.LayerNorm.weight', 'encoder.layer.7.attention.output.dense.bias', 
'encoder.layer.7.attention.output.dense.weight', 'encoder.layer.7.attention.self.key.bias', 'encoder.layer.7.attention.self.key.weight', 'encoder.layer.7.attention.self.query.bias', 'encoder.layer.7.attention.self.query.weight', 'encoder.layer.7.attention.self.value.bias', 'encoder.layer.7.attention.self.value.weight', 'encoder.layer.7.intermediate.dense.bias', 'encoder.layer.7.intermediate.dense.weight', 'encoder.layer.7.output.LayerNorm.bias', 'encoder.layer.7.output.LayerNorm.weight', 'encoder.layer.7.output.dense.bias', 'encoder.layer.7.output.dense.weight', 'encoder.layer.8.attention.output.LayerNorm.bias', 'encoder.layer.8.attention.output.LayerNorm.weight', 'encoder.layer.8.attention.output.dense.bias', 'encoder.layer.8.attention.output.dense.weight', 'encoder.layer.8.attention.self.key.bias', 'encoder.layer.8.attention.self.key.weight', 'encoder.layer.8.attention.self.query.bias', 'encoder.layer.8.attention.self.query.weight', 'encoder.layer.8.attention.self.value.bias', 'encoder.layer.8.attention.self.value.weight', 'encoder.layer.8.intermediate.dense.bias', 'encoder.layer.8.intermediate.dense.weight', 'encoder.layer.8.output.LayerNorm.bias', 'encoder.layer.8.output.LayerNorm.weight', 'encoder.layer.8.output.dense.bias', 'encoder.layer.8.output.dense.weight', 'encoder.layer.9.attention.output.LayerNorm.bias', 'encoder.layer.9.attention.output.LayerNorm.weight', 'encoder.layer.9.attention.output.dense.bias', 'encoder.layer.9.attention.output.dense.weight', 'encoder.layer.9.attention.self.key.bias', 'encoder.layer.9.attention.self.key.weight', 'encoder.layer.9.attention.self.query.bias', 'encoder.layer.9.attention.self.query.weight', 'encoder.layer.9.attention.self.value.bias', 'encoder.layer.9.attention.self.value.weight', 'encoder.layer.9.intermediate.dense.bias', 'encoder.layer.9.intermediate.dense.weight', 'encoder.layer.9.output.LayerNorm.bias', 'encoder.layer.9.output.LayerNorm.weight', 'encoder.layer.9.output.dense.bias', 'encoder.layer.9.output.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

What could be the cause of this warning? Is the model on Hugging Face not identical to the local one you used in your answer?

Another aspect I noticed is that the model does not behave deterministically. When I run the same input multiple times, it returns varying probabilities. Is there a way to make the model deterministic?
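
For reference, this is roughly how I would try to pin the randomness down (just a sketch: fixing the seed before loading, so that any randomly initialized weights are at least reproducible between runs):

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

torch.manual_seed(0)  # fix the seed before loading, so any randomly initialized weights are reproducible

model = AutoModelForSequenceClassification.from_pretrained("KevSun/Personality_LM")
tokenizer = AutoTokenizer.from_pretrained("KevSun/Personality_LM")
model.eval()  # disable dropout so repeated forward passes give the same result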

Regarding the length of the input: you mentioned that longer input is better. Does that mean you suggest not running the model on individual sentences?
In my use case I have product reviews from users, which can contain many sentences. My initial intention was to run the model on each sentence of each review and then average the personality trait scores over the sentences, roughly as sketched below. Would it be better, in your view, to run the model on whole reviews rather than on individual sentences?
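
For context, a rough sketch of that plan (hypothetical review sentences, averaging the per-sentence softmax scores, reusing the imports, model and tokenizer loaded above):

# Hypothetical: the sentences of one user review, already split.
review_sentences = [
    "The delivery was fast.",
    "The product broke after two days.",
    "Support never answered my emails.",
]

model.eval()
trait_scores = []
for sent in review_sentences:
    inputs = tokenizer(sent, return_tensors='pt', truncation=True, max_length=64)
    with torch.no_grad():
        logits = model(**inputs).logits
    trait_scores.append(torch.nn.functional.softmax(logits, dim=-1)[0])

# Mean over the sentences -> one score per personality trait for this review
review_mean = torch.stack(trait_scores).mean(dim=0)
print(review_mean.tolist())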
Sorry also for the misuse of logits as probabilities.

Best Regards
Lucas

Owner

Hi Lucas,

Such warnings are relatively common when using pre-trained models, especially when working with transformers and other deep learning frameworks. They can occur for a number of reasons and do not affect how the model runs. With the following code you can avoid such warnings.

import warnings
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

warnings.filterwarnings('ignore')

model = AutoModelForSequenceClassification.from_pretrained("personality_model", ignore_mismatched_sizes=True)
tokenizer = AutoTokenizer.from_pretrained("personality_model")


sent_1 = "I like to talk to other people."
sent_2 = "I like to be alone."
sent_3 = "Other people hate me."
sent_4 = "I am very new to Python and need some help."


sentences = [sent_1, sent_2, sent_3, sent_4]


for sent in sentences:
    # Tokenize the sentence
    sent_tokenized = tokenizer(sent, return_tensors='pt', padding=True, truncation=True, max_length=64)

    # Set the model to evaluation mode
    model.eval()

    # Make predictions without computing gradients
    with torch.no_grad():
        outputs = model(**sent_tokenized)

    # Apply softmax to get prediction probabilities
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_scores = predictions[0].tolist()

    # Print the predicted scores
    print(predicted_scores)
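
If the checkpoint message still appears, note that it comes from the transformers library's own logger rather than from Python's warnings module, so as far as I understand it can also be silenced by lowering the library's verbosity, for example:

from transformers import logging as hf_logging

# Only show errors from transformers; this hides the "Some weights ..." notices.
hf_logging.set_verbosity_error()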

Hello Kevin,

thank you for the clarification. I just wanted to let you know, since this warning was not present in earlier model versions.

Could you also elaborate, if your time permits, on the other aspects mentioned in my previous message regarding non-determinism and input length?

Best Regards
Lucas
