Tags: Sentence Similarity · setfit · PyTorch · bert · feature-extraction · e5
KnutJaegersberg committed
Commit 3a1c4d1
1 Parent(s): 12f15f6

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -11,10 +11,10 @@ datasets:
 - KnutJaegersberg/wikipedia_categories_labels
 ---
 
-This English model predicts the top two levels of the Wikipedia categories (roughly 1,100 labels). It is trained on the concatenated headlines of the articles in the lower-level categories in a few-shot setting (i.e. 8 subcategories with their headline concatenations per level-2 category).
-Accuracy on the test split is 73% for the higher category level (37 labels) and 60% for level 2.
+This English model (based on e5-large) predicts Wikipedia categories (roughly 37 labels). It is trained on the concatenated headlines of the articles in the lower-level categories in a few-shot setting (i.e. 8 subcategories with their headline concatenations per level-2 category).
+Accuracy on the test split is 85%.
 Note that these numbers are just an indicator that training worked; performance will differ in production settings, which is why this classifier is meant for corpus exploration.
-Use the wikipedia_categories_labels dataset as the key.
+Use the wikipedia_categories_labels dataset as the key.
 
 
 
@@ -24,4 +24,4 @@ Download from Hub and run inference
 model = SetFitModel.from_pretrained("KnutJaegersberg/wikipedia_categories_setfit")
 
 Run inference
-preds = model(["Rachel Dolezal Faces Felony Charges For Welfare Fraud", "Elon Musk just got lucky", "The hype on AI is different from the hype on other tech topics"])
+preds = model(["i loved the spiderman movie!", "pineapple on pizza is the worst 🤮"])
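
For context, a minimal end-to-end sketch of the usage the updated README describes: load the SetFit classifier, predict on a few texts, and resolve the predictions with the wikipedia_categories_labels dataset that the README says to use as the key. The split name ("train") and the column names ("label", "category") are assumptions about that dataset's schema, not something stated in the commit; adjust them to the actual dataset.

# Minimal usage sketch, assuming the labels dataset maps integer class ids to category names
from datasets import load_dataset
from setfit import SetFitModel

# Download the fine-tuned classifier from the Hub
model = SetFitModel.from_pretrained("KnutJaegersberg/wikipedia_categories_setfit")

# Classify a batch of texts; the model returns one prediction per input
preds = model(["i loved the spiderman movie!", "pineapple on pizza is the worst 🤮"])

# Resolve predictions via the labels dataset the README points to.
# "train", "label", and "category" are assumed names, not confirmed by the commit.
key = load_dataset("KnutJaegersberg/wikipedia_categories_labels", split="train")
id2name = {int(row["label"]): row["category"] for row in key}
print([id2name.get(int(p), str(p)) for p in preds])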