How to figure out the output sequence

by bhuvneshsaini - opened Dec 26, 2023

Dec 26, 2023

When I pass the same text shown on the README page to Hugging Face inference, it works fine, but on my local machine it gives a different answer.

DarwinAnim8or

Koala AI org Dec 31, 2023

Hi, are you running this through the transformers library's "pipeline" or using the model directly? If the latter, are you changing any settings on it before generating? Thank you

elontusk404

Mar 5, 2024

Hey guys, I'm using only the code from the example, in Colab. How do I interpret the results?
SequenceClassifierOutput(loss=None, logits=tensor([[ 0.1169, -1.8691, -0.9551, 5.5512, -0.7662, -1.3808, 0.0202, -0.7166,
-1.2170]], grad_fn=), hidden_states=None, attentions=None)

dr4g0n7ly

Mar 7, 2024

edited Mar 7, 2024

I have the same problem

DarwinAnim8or

Koala AI org Mar 7, 2024

@dr4g0n7ly @elontusk404 I don't know which colab you're referring to, so I'm not entirely sure what you're encountering.
If you want to run the model locally (or in a colab) you can use the following code:

from transformers import pipeline

# Load the classifier (the pipeline also loads the matching tokenizer)
classifier = pipeline("text-classification", model="KoalaAI/Text-Moderation")

# Sample text to classify
text = "This is absolutely okay!"

# Get the scores for every label directly from the classifier.
# Note: newer transformers releases deprecate return_all_scores in favor of
# top_k=None, which may return a differently shaped result than shown below.
outputs = classifier(text, return_all_scores=True)

# Pick the entry with the highest score (the returned list is not sorted)
best = max(outputs[0], key=lambda item: item["score"])
predicted_label = best["label"]

# Get the index-to-label mapping from the model config (this might still be useful)
labels = classifier.model.config.id2label

# Print the results
print("All Scores:", outputs)

The output of this code will look like this:

All Scores: [[{'label': 'H', 'score': 0.0026952188927680254}, {'label': 'H2', 'score': 0.0004286111507099122}, {'label': 'HR', 'score': 0.0007978402427397668}, {'label': 'OK', 'score': 0.9905979037284851}, {'label': 'S', 'score': 0.0008165662293322384}, {'label': 'S3', 'score': 0.000507685006596148}, {'label': 'SH', 'score': 0.0020117328967899084}, {'label': 'V', 'score': 0.0012887571938335896}, {'label': 'V2', 'score': 0.000855746679008007}]]

I hope that this helps you use the model more easily :)
Note that the scores come back in label order rather than ranked, so you would have to sort the outputs yourself with a bit of extra code, but that shouldn't be difficult.
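
For the sorting step mentioned above, a minimal sketch in plain Python could look like this. It assumes the nested list-of-dicts format shown in the output above (one inner list per input text). The `sort_scores` helper name is made up for illustration and is not part of transformers, and the scores are copied from that output and rounded to six decimals.

```python
# Scores copied from the pipeline output above (rounded to 6 decimals).
outputs = [[
    {"label": "H", "score": 0.002695},
    {"label": "H2", "score": 0.000429},
    {"label": "HR", "score": 0.000798},
    {"label": "OK", "score": 0.990598},
    {"label": "S", "score": 0.000817},
    {"label": "S3", "score": 0.000508},
    {"label": "SH", "score": 0.002012},
    {"label": "V", "score": 0.001289},
    {"label": "V2", "score": 0.000856},
]]


def sort_scores(scores):
    """Return the label/score dicts ordered from highest to lowest score.

    Hypothetical helper for illustration; not part of transformers.
    """
    return sorted(scores, key=lambda item: item["score"], reverse=True)


# outputs[0] holds the scores for the first (and only) input text
ranked = sort_scores(outputs[0])
for entry in ranked:
    print(f"{entry['label']}: {entry['score']:.4f}")
```

The first entry of `ranked` is then the predicted label, and the rest can be compared against whatever threshold suits your moderation use case.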

dr4g0n7ly

Mar 8, 2024

@DarwinAnim8or

We were referring to the code at the end of the https://huggingface.co/KoalaAI/Text-Moderation page.
But yeah this code is really helpful. Thank you so much!

DarwinAnim8or

Koala AI org Mar 8, 2024

@dr4g0n7ly

Ah, that code. You could retrieve the labels from the model and print them out, a bit like so:

from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("KoalaAI/Text-Moderation")
tokenizer = AutoTokenizer.from_pretrained("KoalaAI/Text-Moderation")

inputs = tokenizer("I love AutoTrain", return_tensors="pt")
outputs = model(**inputs)

# Get the predicted logits
logits = outputs.logits

# Apply softmax to get probabilities (scores)
probabilities = logits.softmax(dim=-1).squeeze()

# Retrieve the labels
id2label = model.config.id2label
labels = [id2label[idx] for idx in range(len(probabilities))]

# Combine labels and probabilities, then sort
label_prob_pairs = list(zip(labels, probabilities))
label_prob_pairs.sort(key=lambda item: item[1], reverse=True)  

# Print the sorted results
for label, probability in label_prob_pairs:
    print(f"Label: {label} - Probability: {probability:.4f}")

Which outputs:

Label: OK - Probability: 0.9840
Label: H - Probability: 0.0043
Label: SH - Probability: 0.0039
Label: V - Probability: 0.0019
Label: S - Probability: 0.0018
Label: HR - Probability: 0.0015
Label: V2 - Probability: 0.0011
Label: S3 - Probability: 0.0010
Label: H2 - Probability: 0.0006

Lemme know which of the two snippets you prefer, and I'll update the readme to include that version; personally I prefer the pipeline as I find it simplifies the code, but lemme know what you think!
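
To tie this back to the raw SequenceClassifierOutput posted earlier in the thread: the logits are unnormalized scores, one per label, and softmax turns them into probabilities. Here is a rough sketch using only the standard library, with the logits copied from that post. The label order is an assumption taken from the pipeline output earlier in the thread, so check `model.config.id2label` before relying on it.

```python
import math

# Logits copied from the SequenceClassifierOutput posted earlier in the thread.
logits = [0.1169, -1.8691, -0.9551, 5.5512, -0.7662, -1.3808, 0.0202, -0.7166, -1.2170]

# ASSUMPTION: label order taken from the pipeline output earlier in the thread;
# confirm it against model.config.id2label.
labels = ["H", "H2", "HR", "OK", "S", "S3", "SH", "V", "V2"]


def softmax(values):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


probabilities = softmax(logits)

# Pair each label with its probability and rank from most to least likely
ranked = sorted(zip(labels, probabilities), key=lambda pair: pair[1], reverse=True)

for label, probability in ranked:
    print(f"Label: {label} - Probability: {probability:.4f}")
```

The largest logit sits at index 3, so under the assumed label order that post's text would be classified as OK.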

DarwinAnim8or

Koala AI org May 8, 2024

Closing since no updates in over a month; hope it helped!

DarwinAnim8or changed discussion status to closed May 8, 2024
