Update README.md
README.md
CHANGED
@@ -4,13 +4,14 @@ language:
 - en
 license: apache-2.0
 datasets:
-- HuggingFaceFW/
+- HuggingFaceFW/finepdfs_eng_Latn_labeled
+
 ---
 
 # FinePDFs-Edu classifier (English)
 
 ## Model summary
-This is a classifier for judging the educational value of web pages. It was developed to filter and curate educational content from web datasets and was trained on
+This is a classifier for judging the educational value of web pages. It was developed to filter and curate educational content from web datasets and was trained on 1304547 [annotations](https://huggingface.co/datasets/HuggingFaceFW/finepdfs_fw_edu_labeled) generated by [Qwen3-235B-A22B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507) for web samples from [FinePDFs](https://huggingface.co/datasets/HuggingFaceFW/finepdfs) dataset.
 
 We used this classifier to build [FinePDFs-Edu](https://huggingface.co/datasets/HuggingFaceFW/finepdfs-edu) dataset.
 ### How to use in transformers