---
license: apache-2.0
base_model: bert-base-uncased
tags:
- generated_from_trainer
metrics:
- f1
- accuracy
model-index:
- name: Multiclass-Classifcation[Bert]
  results: []
---

# bert-finetuned-sem_eval-english

This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the [Multi-Label Classification Dataset](https://www.kaggle.com/datasets/shivanandmn/multilabel-classification-dataset).
It achieves the following results on the evaluation set:
- Loss: 0.1673
- F1: 0.8389
- Roc Auc: 0.8999
- Accuracy: 0.7046

## Model description

This model is a BERT base uncased model fine-tuned for multi-label classification of research papers into 6 categories: Computer Science, Physics, Mathematics, Statistics, Quantitative Biology, and Quantitative Finance. It classifies papers based on their title and abstract text.
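
A minimal inference sketch (the repo id below is a placeholder; substitute the actual model path). Since this is a multi-label setup, each category gets an independent sigmoid score, thresholded at 0.5:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder repo id; replace with the actual model path or Hub repo.
model_id = "path/to/this-model"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Input is the paper's title and abstract concatenated into one string.
text = "Paper title. Abstract text of the paper goes here."
inputs = tokenizer(text, truncation=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Multi-label: independent sigmoid per class, threshold at 0.5.
probs = torch.sigmoid(logits)[0]
labels = [model.config.id2label[i] for i, p in enumerate(probs) if p > 0.5]
print(labels)
```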

## Intended uses & limitations

This model can be used to automatically tag research papers with relevant categories based on the paper's title and abstract. It works best on academic papers in quantitative research fields. Performance may be lower on papers from other domains or with very short abstracts.

## Training and evaluation data

The model was trained on a dataset of ~15,000 research paper abstracts labeled with one or more of 6 category tags:

* Computer Science
* Physics
* Mathematics
* Statistics
* Quantitative Biology
* Quantitative Finance

The training data includes papers from arXiv and peer-reviewed journals.

The model was evaluated on a held-out test set of ~3,000 labeled research paper abstracts drawn from the same distribution as the training data.
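
Because each paper can carry several tags, the labels need to be multi-hot vectors rather than a single class index. A sketch of the preparation step, assuming the Kaggle CSV layout (a `TITLE` and `ABSTRACT` column plus one 0/1 column per category; verify against the actual file):

```python
import pandas as pd

# Column names assumed from the Kaggle dataset layout; check the actual CSV.
LABELS = ["Computer Science", "Physics", "Mathematics",
          "Statistics", "Quantitative Biology", "Quantitative Finance"]

df = pd.read_csv("train.csv")

# Concatenate title and abstract into the model's input text.
texts = (df["TITLE"] + ". " + df["ABSTRACT"]).tolist()

# One 6-dim multi-hot float vector per example (BCE loss expects floats).
label_vectors = df[LABELS].astype(float).values.tolist()
```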


## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
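
These settings map directly onto `TrainingArguments`; a sketch using only the values above (the output directory and per-epoch evaluation/save cadence are assumptions for illustration, and the listed Adam betas/epsilon are the Trainer's optimizer defaults, so they need no explicit argument):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-finetuned-multilabel",  # assumed name
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=5,
    evaluation_strategy="epoch",  # assumed: the results table is per-epoch
    save_strategy="epoch",
)
```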

### Training results

| Training Loss | Epoch | Step  | Validation Loss | F1     | Roc Auc | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:-------:|:--------:|
| 0.1857        | 1.0   | 2098  | 0.1924          | 0.8143 | 0.8825  | 0.6760   |
| 0.1586        | 2.0   | 4196  | 0.1673          | 0.8389 | 0.8999  | 0.7046   |
| 0.1194        | 3.0   | 6294  | 0.1777          | 0.8361 | 0.8982  | 0.6989   |
| 0.0975        | 4.0   | 8392  | 0.1958          | 0.8312 | 0.8932  | 0.6946   |
| 0.0695        | 5.0   | 10490 | 0.2113          | 0.8315 | 0.8957  | 0.6918   |
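
Validation loss, F1, ROC AUC, and accuracy all peak at epoch 2, which is the checkpoint the headline numbers above refer to. The columns are consistent with the standard multi-label metric recipe (sigmoid over logits, 0.5 threshold, micro-averaged scores, exact-match accuracy); a sketch of one common way to compute them, not necessarily the exact function used here:

```python
import numpy as np
from sklearn.metrics import f1_score, roc_auc_score, accuracy_score

def compute_metrics(eval_pred):
    """Multi-label metrics: sigmoid over logits, then threshold at 0.5."""
    logits, labels = eval_pred
    probs = 1 / (1 + np.exp(-logits))      # sigmoid
    preds = (probs >= 0.5).astype(int)
    return {
        "f1": f1_score(labels, preds, average="micro"),
        "roc_auc": roc_auc_score(labels, probs, average="micro"),
        "accuracy": accuracy_score(labels, preds),  # exact-match (subset) accuracy
    }
```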


### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.0+cu118
- Tokenizers 0.15.0