ryantwolf and BoLiu committed on
Commit
add3e60
1 Parent(s): a19ba92

Update README.md (#2)


- Update README.md (42c077982c645be8e9718ac85519c6d1b58cd07b)


Co-authored-by: Bo Liu <BoLiu@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +6 -5
README.md CHANGED
@@ -9,6 +9,7 @@ This is a text classification model to classify documents into one of 26 domain
 
 # Model Architecture
 The model architecture is Deberta V3 Base
+
 Context length is 512 tokens
 
 # Training (details)
@@ -17,11 +18,7 @@ Context length is 512 tokens
 - 500k Wikipedia articles, curated using Wikipedia-API: https://pypi.org/project/Wikipedia-API/
 
 ## Training steps:
-- Train a first model on Wikipedia data
-- Randomly sample 1 million Common Crawl documents; label them using the Google Cloud API
-- Predict these 1 million samples using the first model
-- Google's labels and the first model's predictions agree on about 500k samples
-- Split these 500k samples 80%/20%. Train the final model on the 80%, and evaluate on the 20%
+The model was trained in multiple rounds using Wikipedia and Common Crawl data, labeled by a combination of pseudo labels and the Google Cloud API.
 
 # How To Use This Model
 
@@ -29,6 +26,7 @@ Context length is 512 tokens
 The model takes one or several paragraphs of text as input.
 
 Example input:
+```
 q Directions
 
 1. Mix 2 flours and baking powder together
@@ -36,12 +34,15 @@ q Directions
 3. Heat frying pan on medium
 4. Pour batter into pan and then put blueberries on top before flipping
 5. Top with desired toppings!
+```
 
 ## Output
 The model outputs one of the 26 domain classes as the predicted domain for each input sample.
 
 Example output:
+```
 Food_and_Drink
+```
 
 # Evaluation Benchmarks
 Accuracy on 500 human annotated samples
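
The removed "Training steps" bullets describe an agreement-based pseudo-labeling pipeline: keep only the Common Crawl samples where the Google Cloud API label matches the prediction of the first, Wikipedia-trained model, then split the surviving ~500k samples 80%/20%. A minimal sketch of that filtering step, assuming in-memory lists; the function name and data layout are hypothetical, not from the repo:

```python
import random

def agreement_filter_and_split(samples, api_labels, model_preds, seed=0):
    """Keep samples where the two labeling sources agree, then split 80/20."""
    agreed = [
        (text, label)
        for text, label, pred in zip(samples, api_labels, model_preds)
        if label == pred  # Google Cloud API label and first-model prediction agree
    ]
    random.Random(seed).shuffle(agreed)
    cut = int(0.8 * len(agreed))  # 80% to train the final model, 20% to evaluate it
    return agreed[:cut], agreed[cut:]

# Toy usage: two of the three samples agree and survive the filter.
train_set, eval_set = agreement_filter_and_split(
    ["doc a", "doc b", "doc c"],
    ["Food_and_Drink", "Sports", "Games"],
    ["Food_and_Drink", "News", "Games"],
)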
 
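The "How To Use This Model" section in this diff shows only example input and output, not code. A hedged inference sketch with Hugging Face transformers; the model id below is a placeholder, and loading the checkpoint as a standard sequence-classification head is an assumption rather than something this diff confirms:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "your-org/domain-classifier"  # hypothetical placeholder, not from the diff
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

text = "Directions\n1. Mix 2 flours and baking powder together\n..."
# Truncate to the 512-token context length stated in the README.
inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
# Map the top logit to one of the 26 domain labels, e.g. "Food_and_Drink".
print(model.config.id2label[int(logits.argmax(dim=-1))])
```

If the repository ships a custom architecture instead of a standard head, `from_pretrained` may need `trust_remote_code=True`.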