Do you want failure cases?

#5
by grofte - opened

Hi Dan

Do you want a collection of failures, edge cases?

I don't have a lot, but I thought some feedback might be useful.

E.g. in "MAGNUM ALMOND 3-PA", "MAGNUM ALMOND" is identified as a person.

Yep, that'd be great! Feel free to just post all of them here in this thread :)

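I have only looked at people because that's all I care about. I also don't have the opposite kind of failure, where the model fails to tag an actual person; those are really hard to search for. Everything below was tagged as a person. "But not X" means the variant X was not tagged, and "also X" means X was tagged as well.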
ANANAS, but not ananas
Benaliv, but not benaliv
ARLA KÆRNEM%LK, but not ARLA KÆRNEMÆLK
CHAMPIGN BRUNE, also CHAMPIGNON BRUNE
HEINZ TOMAT KET, also HEINZ TOMAT KETCHUP
GALBANI MOZZARELLA, but not galbani mozzarella
OLIE, but not Olie
Comfortis K
The "INGRID MARIE" part of ÆBLER INGRID MARIE 70/80MM DK STK KL1, but I think that's fair and doesn't actually matter
ARUNA ROSMARIN, but Aruna is an Indian name. Rosmarin is typically not a name. Aruna Rosmarin is not detected as a person.
Amlodipin Aurovitas, which is a drug, not a person from Harry Potter
"Cater Køl" in Helæg Cater Køl, pretty cool name
KITCHEN BOARD
SOMFY SUNILUS, and I have no idea what that is
Elvira rogn, where only the first part is a name
TORTILLA, but not tortilla or Tortilla
COSYLAN MIKST
ROZES INFANTA
TORTILLA CHIPS
CIRKEL NICARAGUA
KETTLE SEA
SMOOT.BENDIT ANANA, which is tagged as two names: one before the period and one after
POMMES FAMILY ELDORADO
Bocca
Crestor
TOLKO ANANAS CASTELLO
JOYA DUBAI SANDAL
GEVALIA GOLD

It seems from these examples that I should probably lowercase my data before running it through the NER.

Kørsel, but not kørsel
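
A rough way to check the lowercasing idea before committing to it is to run the tagger on both the original and the lowercased text and see which false positives go away. The sketch below is only an illustration: `tag_persons` is a hypothetical callable that wraps whatever NER model is in use and returns the substrings tagged as persons, and `toy_tagger` is a made-up stand-in (not the real model) so the example runs on its own.

```python
from typing import Callable, Dict, List

# Hypothetical interface: takes a string, returns the substrings tagged as persons.
PersonTagger = Callable[[str], List[str]]


def compare_casings(text: str, tag_persons: PersonTagger) -> Dict[str, List[str]]:
    """Run the tagger on the original text and on its lowercased form."""
    return {
        "original": tag_persons(text),
        "lowercased": tag_persons(text.lower()),
    }


def fixed_by_lowercasing(texts: List[str], tag_persons: PersonTagger) -> List[str]:
    """Return the texts whose person hits disappear once the text is lowercased."""
    fixed = []
    for text in texts:
        result = compare_casings(text, tag_persons)
        if result["original"] and not result["lowercased"]:
            fixed.append(text)
    return fixed


def toy_tagger(text: str) -> List[str]:
    """Stand-in for illustration only: flags every fully uppercase word as a person."""
    return [word for word in text.split() if word.isupper()]


if __name__ == "__main__":
    samples = ["ANANAS", "OLIE", "tortilla", "Elvira rogn"]
    # With a real model, replace toy_tagger with a wrapper around its predictions.
    print(fixed_by_lowercasing(samples, toy_tagger))
```

One caveat: a cased model may rely on capitalization to find real names, so lowercasing everything could trade false positives for missed people. Running the same comparison on text that contains actual names would show whether that happens.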
