Completely different predictions with the same input

#2
by mox - opened

Hi,
I tested the model with the following (German) string:

"Dağdelen erzählt immer noch ihr Märchen davon, dass man den Krieg in der Ukraine nicht mit Sanktionen stoppen könne, im Gegenteil - wir müssen das Krokodil weiter demütig füttern in der Hoffnung, dass es uns erst zuletzt frisst. Geh' zum Teufel, du dumme Kuh!"

(Roughly: "Dağdelen is still telling her fairy tale that the war in Ukraine cannot be stopped with sanctions; on the contrary, we must keep humbly feeding the crocodile in the hope that it eats us last. Go to hell, you stupid cow!")

When using the code from the model card, the string is classified as neutral with 99.9% confidence:


When I try the same string in the hosted inference API on the website, it predicts the class "negative" with 59.3%, which seems more correct to me.


What could be the reason for this?

Thanks in advance!

Hi @mox ,
the germansentiment lib applies the same preprocessing that I used when training the model, but in some cases this preprocessing does not improve the results.
If you want to use the same inference code as the API, without any preprocessing, you can simply use the pipeline (code below). However, you might want to collect a meaningful number of samples and decide which version (with or without preprocessing) works better for you.

from transformers import pipeline
test = pipeline("text-classification", "oliverguhr/german-sentiment-bert")
test("Dağdelen erzählt immer noch ihr Märchen davon, dass man den Krieg in der Ukraine nicht mit Sanktionen stoppen könne, im Gegenteil - wir müssen das Krokodil weiter demütig füttern in der Hoffnung, dass es uns erst zuletzt frisst. Geh' zum Teufel, du dumme Kuh!")

> [{'label': 'negative', 'score': 0.5927214622497559}]
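To see why the two code paths can disagree, it helps to look at what text cleaning does to an input before it reaches the model. The sketch below is illustrative only, assuming a typical cleanup of punctuation, digits, and extra whitespace; it is not the actual germansentiment implementation, and the function name is hypothetical:

```python
import re

def clean_text(text: str) -> str:
    """Illustrative preprocessing sketch (not germansentiment's real code):
    strip digits and punctuation, collapse whitespace."""
    text = text.replace("\n", " ")
    text = re.sub(r"\d+", " ", text)         # remove numbers
    text = re.sub(r"[^\w\s]", " ", text)     # remove punctuation
    text = re.sub(r"\s+", " ", text).strip() # collapse whitespace
    return text

print(clean_text("Geh' zum Teufel, du dumme Kuh!"))
# -> Geh zum Teufel du dumme Kuh
```

Dropping punctuation like the exclamation mark can remove cues the model uses for negative sentiment, which is one plausible reason the preprocessed path lands on "neutral" while the raw-text pipeline predicts "negative".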
mox changed discussion status to closed
