The underlying data set is flawed

#1
by OliP - opened

Just wanted to point out this analysis: https://www.kaggle.com/code/josutk/only-one-word-99-2
It shows biases in several directions. In particular, a classifier based solely on whether "reuters" appears in the text reaches 99% accuracy.
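For anyone who wants to check this kind of leakage themselves, here is a minimal sketch of a one-keyword baseline. It is not the code from the linked notebook. The function names, the toy texts and the label encoding (1 = the class in which the keyword tends to appear) are all assumptions made up for illustration, so swap in the real texts and labels from the data set.

```python
def keyword_predict(text, keyword="reuters"):
    """Return 1 if the keyword occurs in the text (case-insensitive), else 0."""
    return int(keyword in text.lower())


def accuracy(texts, labels, keyword="reuters"):
    """Fraction of examples where the keyword rule agrees with the label."""
    correct = sum(keyword_predict(t, keyword) == y for t, y in zip(texts, labels))
    return correct / len(labels)


# Made-up toy examples, NOT taken from the data set.
toy_texts = [
    "WASHINGTON (Reuters) - Lawmakers voted on the bill on Tuesday.",
    "You won't believe what happened next!",
    "LONDON (Reuters) - Markets closed higher.",
    "Shocking secret they do not want you to know",
]
# Assumed encoding: 1 = class in which the keyword tends to appear.
toy_labels = [1, 0, 1, 0]

print(accuracy(toy_texts, toy_labels))
```

If a rule this simple scores close to a trained model on the real data, the model is probably picking up the source marker rather than anything about the content.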
