Commit e9f5cc1 by kenhktsui (1 parent: a2af9e6)

docs: update README.md

Files changed (1)
  1. README.md +8 -3
README.md CHANGED
@@ -9,14 +9,19 @@ metrics:
  - recall
  - f1
  model-index:
- - name: llm-data-quality-classifer-compare
+ - name: llm-data-textbook-quality-classifer-v1
  results: []
+ datasets:
+ - kenhktsui/llm-data-quality-tokenized
+ language:
+ - en
  ---
 
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->
 
- # llm-data-quality-classifer-compare
+ # llm-data-textbook-quality-classifer-v1
+ This model can classify if a text is of textbook quality data. It can be used as a filter for data curation when training a LLM.
 
  This model is a fine-tuned version of [FacebookAI/xlm-roberta-base](https://huggingface.co/FacebookAI/xlm-roberta-base) on an unknown dataset.
  It achieves the following results on the evaluation set:
@@ -187,4 +192,4 @@ The following hyperparameters were used during training:
  - Transformers 4.35.2
  - Pytorch 2.1.0+cu121
  - Datasets 2.16.1
- - Tokenizers 0.15.0
+ - Tokenizers 0.15.0