Ihor committed on
Commit
8f37347
1 Parent(s): 6a9dd1f

Update README.md

Files changed (1)
  1. README.md +37 -0
README.md CHANGED
@@ -1,3 +1,40 @@
  ---
  license: mit
+ language:
+ - en
+ metrics:
+ - f1
+ - accuracy
+ - precision
+ library_name: transformers
+ pipeline_tag: text-classification
  ---
+
+ **DILI-scibert**
+ This is a text classification model based on [SciBERT](allenai/scibert_scivocab_uncased), fine-tuned on a binary text classification dataset to recognize papers that mention drug-induced liver injury (DILI).
+
+ The model was trained for the CAMDA challenge; the dataset and details of the challenge can be found [here](https://bipress.boku.ac.at/camda2022/).
+
+ ### Dataset
+ The CAMDA committee and FDA initially provided a training set of approximately 14,000 DILI-related papers from LiverTox, equally split into positive and negative examples.
+ The challenge participants also received test and validation sets with varying levels of imbalance, incorporating increasing numbers of true negatives to mirror real-world task complexity.
+ The first validation set had 6,494 abstracts, the second 32,814, and the third 100,265. Additionally, to evaluate model overfitting, the fourth validation set comprised 14,000 expert summaries instead of article abstracts.
+
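As a rough illustration of why the increasingly imbalanced validation sets are harder, the sketch below computes the metrics listed in the model card metadata (F1, accuracy, precision) from confusion-matrix counts. The helper names, the set sizes, and the 95% recall and specificity are hypothetical values chosen for illustration; they are not results from the challenge or the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute precision, recall, F1 and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


def metrics_at_imbalance(n_pos, n_neg, recall=0.95, specificity=0.95):
    """Metrics for a hypothetical classifier with fixed per-class error rates.

    recall and specificity are illustrative assumptions, not measured values.
    """
    tp = round(n_pos * recall)
    fn = n_pos - tp
    tn = round(n_neg * specificity)
    fp = n_neg - tn
    return classification_metrics(tp, fp, tn, fn)


# Same hypothetical classifier, evaluated on a balanced and an imbalanced set.
balanced = metrics_at_imbalance(n_pos=1000, n_neg=1000)
imbalanced = metrics_at_imbalance(n_pos=1000, n_neg=20000)

print("balanced:  ", balanced)
print("imbalanced:", imbalanced)
```

Under these assumed rates, accuracy stays the same on both sets while precision and F1 fall as negatives are added, which is one reason to report precision and F1 alongside accuracy.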
+ ### Training
+ After 90% of the data was selected for training, the following hyperparameters were used:
+ * learning rate: 2e-5;
+ * weight decay: 0.001;
+ * batch size: 12;
+ * focal loss gamma: 2;
+ * focal loss alpha: 0.3.
+
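For readers unfamiliar with the focal loss hyperparameters, here is a minimal sketch of the standard binary focal loss from Lin et al. (2017), using gamma = 2 and alpha = 0.3 as defaults. This is not the training code for this model. The function name is hypothetical, and the convention that alpha weights the positive class (with 1 - alpha for the negative class) is an assumption, since the README does not say how alpha was applied.

```python
import math


def binary_focal_loss(p_positive, label, gamma=2.0, alpha=0.3, eps=1e-12):
    """Focal loss for a single example: -alpha_t * (1 - p_t)**gamma * log(p_t).

    p_positive: predicted probability of the positive class.
    label: 1 for a positive example, 0 for a negative one.
    """
    # p_t is the probability the model assigns to the true class.
    p_t = p_positive if label == 1 else 1.0 - p_positive
    # Assumption: alpha weights positives and (1 - alpha) weights negatives.
    alpha_t = alpha if label == 1 else 1.0 - alpha
    # Clamp to avoid log(0).
    p_t = min(max(p_t, eps), 1.0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct prediction contributes far less than a poor one.
easy = binary_focal_loss(0.95, label=1)
hard = binary_focal_loss(0.30, label=1)
print(f"easy example loss: {easy:.6f}")
print(f"hard example loss: {hard:.6f}")
```

With gamma = 0 the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy; larger gamma values shrink the contribution of examples the model already classifies well.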
+ ### Citation
+ If using these models, please cite the following paper:
+ @article{Stepanov2023ComparativeAO,
+ title={Comparative analysis of classification techniques for topic-based biomedical literature categorisation},
+ author={Ihor Stepanov and Arsentii Ivasiuk and Oleksandr Yavorskyi and Alina Frolova},
+ journal={Frontiers in Genetics},
+ year={2023},
+ volume={14},
+ url={https://api.semanticscholar.org/CorpusID:265428155}
+ }