ryantwolf and BoLiu committed on
Commit
add3e60
1 Parent(s): a19ba92

Update README.md (#2)


- Update README.md (42c077982c645be8e9718ac85519c6d1b58cd07b)


Co-authored-by: Bo Liu <BoLiu@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +6 -5
README.md CHANGED
@@ -9,6 +9,7 @@ This is a text classification model to classify documents into one of 26 domain
 
 # Model Architecture
 The model architecture is Deberta V3 Base
+
 Context length is 512 tokens
 
 # Training (details)
@@ -17,11 +18,7 @@ Context length is 512 tokens
 - 500k Wikipedia articles, curated using Wikipedia-API: https://pypi.org/project/Wikipedia-API/
 
 ## Training steps:
-- Train a first model on Wikipedia data
-- Randomly sample 1 million Common Crawl documents; label them using the Google Cloud API
-- Predict these 1 million samples using the first model
-- Google's labels and the first model's predictions agree on about 500k samples
-- Split these 500k samples 80%/20%. Train the final model on the 80%, and evaluate on the 20%
+The model was trained in multiple rounds using Wikipedia and Common Crawl data, labeled by a combination of pseudo labels and the Google Cloud API.
 
 # How To Use This Model
 
@@ -29,6 +26,7 @@ Context length is 512 tokens
 The model takes one or several paragraphs of text as input.
 
 Example input:
+```
 q Directions
 
 1. Mix 2 flours and baking powder together
@@ -36,12 +34,15 @@ q Directions
 3. Heat frying pan on medium
 4. Pour batter into pan and then put blueberries on top before flipping
 5. Top with desired toppings!
+```
 
 ## Output
 The model outputs one of the 26 domain classes as the predicted domain for each input sample.
 
 Example output:
+```
 Food_and_Drink
+```
 
 # Evaluation Benchmarks
 Accuracy on 500 human annotated samples
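
The removed "Training steps" bullets describe an agreement-based pseudo-labeling pipeline: keep only the Common Crawl samples where the Google Cloud API label matches the prediction of the first, Wikipedia-trained model, then split the surviving ~500k samples 80%/20%. A minimal sketch of that filtering step, assuming in-memory lists; the function name and data layout are hypothetical, not from the repo:

```python
import random

def agreement_filter_and_split(samples, api_labels, model_preds, seed=0):
    """Keep samples where the two labeling sources agree, then split 80/20."""
    agreed = [
        (text, label)
        for text, label, pred in zip(samples, api_labels, model_preds)
        if label == pred  # Google Cloud API label and first-model prediction agree
    ]
    random.Random(seed).shuffle(agreed)
    cut = int(0.8 * len(agreed))  # 80% to train the final model, 20% to evaluate it
    return agreed[:cut], agreed[cut:]

# Toy usage: two of the three samples agree and survive the filter.
train_set, eval_set = agreement_filter_and_split(
    ["doc a", "doc b", "doc c"],
    ["Food_and_Drink", "Sports", "Games"],
    ["Food_and_Drink", "News", "Games"],
)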
 
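The "How To Use This Model" section in this diff shows only example input and output, not code. A hedged inference sketch with Hugging Face transformers; the model id below is a placeholder, and loading the checkpoint as a standard sequence-classification head is an assumption rather than something this diff confirms:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "your-org/domain-classifier"  # hypothetical placeholder, not from the diff
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

text = "Directions\n1. Mix 2 flours and baking powder together\n..."
# Truncate to the 512-token context length stated in the README.
inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
# Map the top logit to one of the 26 domain labels, e.g. "Food_and_Drink".
print(model.config.id2label[int(logits.argmax(dim=-1))])
```

If the repository ships a custom architecture instead of a standard head, `from_pretrained` may need `trust_remote_code=True`.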