Ronysalem committed on
Commit
0e27ddd
1 Parent(s): b660e31

Update README.md

  1. README.md +17 -5
README.md CHANGED
@@ -16,7 +16,8 @@ should probably proofread and complete it, then remove this comment. -->
 
 # bert-finetuned-sem_eval-english
 
-This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on an unknown dataset.
+This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on[Multi-Label Classification Dataset
+](https://www.kaggle.com/datasets/shivanandmn/multilabel-classification-dataset).
 It achieves the following results on the evaluation set:
 - Loss: 0.1673
 - F1: 0.8389
@@ -25,15 +26,26 @@ It achieves the following results on the evaluation set:
 
 ## Model description
 
-More information needed
+This model is a BERT base uncased model fine-tuned for multi-label classification of research papers into 6 categories: Computer Science, Physics, Mathematics, Statistics, Quantitative Biology, and Quantitative Finance. It classifies papers based on their title and abstract text.
 
 ## Intended uses & limitations
 
-More information needed
-
+This model can be used to automatically tag research papers with relevant categories based on the paper's title and abstract. It works best on academic papers in quantitative research fields. Performance may be lower on papers from other domains or with very short abstracts.
 ## Training and evaluation data
 
-More information needed
+The model was trained on a dataset of ~15,000 research paper abstracts labeled with one or more of 6 category tags:
+
+* Computer Science
+* Physics
+* Mathematics
+* Statistics
+* Quantitative Biology
+* Quantitative Finance
+*
+The training data includes papers from arXiv and peer-reviewed journals.
+
+The model was evaluated on a held-out test set of ~3,000 labeled research paper abstracts drawn from the same distribution as the training data.
+
 
 ## Training procedure
 
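The model card describes multi-label tagging: a paper may receive several of the six categories at once. A minimal sketch of how per-label scores are commonly turned into tags follows, assuming the model emits one raw logit per category, each passed through a sigmoid independently and compared to a cutoff. The label order, the 0.5 threshold, the helper names `sigmoid` and `decode_multilabel`, and the example logits are all illustrative assumptions; none of them come from the README or the commit.

```python
import math

# Label order is an assumption for illustration; the real order would come
# from the model's own configuration, which this diff does not show.
LABELS = [
    "Computer Science",
    "Physics",
    "Mathematics",
    "Statistics",
    "Quantitative Biology",
    "Quantitative Finance",
]


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in [0, 1] without overflowing."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_multilabel(logits, labels=LABELS, threshold=0.5):
    """Return every label whose sigmoid probability meets the threshold.

    Each label is scored independently, so zero, one, or several labels
    can be returned for the same paper.
    """
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    return [label for label, z in zip(labels, logits) if sigmoid(z) >= threshold]


# Hypothetical logits for one paper (made-up numbers, not model output).
example_logits = [2.1, -3.0, 0.4, 1.2, -1.5, -4.2]
print(decode_multilabel(example_logits))
# → ['Computer Science', 'Mathematics', 'Statistics']
```

Unlike a softmax over the six categories, the per-label sigmoid lets the scores be judged independently, which is what allows a single abstract to be tagged with more than one field. The threshold is a tunable choice; a different cutoff trades precision against recall.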