afg1 committed on
Commit
8a13977
1 Parent(s): 71763de

Create README.md

  1. README.md +33 -0
README.md ADDED
@@ -0,0 +1,33 @@
+ ---
+ license: cc-by-4.0
+ language:
+ - en
+ metrics:
+ - f1
+ - accuracy
+ library_name: transformers
+ pipeline_tag: text-classification
+ ---
+
+ # lncrna-biocontext
+ This model classifies whether a given abstract discusses a long non-coding RNA (lncRNA) in the context of disease.
+
+ The model was trained on data from [lncBook-Wiki](https://ngdc.cncb.ac.cn/lncbook/), a collection of papers
+ that experts have curated according to the biological context they discuss. We collected the
+ abstracts of these papers and simplified the labels to disease/not disease. We then fine-tuned a
+ [Longformer](https://huggingface.co/allenai/longformer-base-4096) model for this binary classification task.
+
+ On the test set, the model achieves:
+
+ | Metric | Score |
+ |-|-|
+ | Accuracy | 0.84 |
+ | F1 | 0.82 |
+ | ROC AUC | 0.98 |
+
+ Note that the test set contains only 59 examples, 22 of which discuss disease, so these scores are rough estimates.
+
+ The next step is to classify both the specific disease (e.g. lung adenocarcinoma) and the specific non-disease
+ context (e.g. localisation) that a paper discusses.
+
+
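
To put the README's caveat about the 59-example test set in perspective, here is a minimal sketch that estimates the uncertainty on the reported accuracy. It assumes a 95% Wilson score interval is a reasonable choice (the README does not say whether or how uncertainty was estimated) and treats the reported 0.84 as the observed proportion. The function name `wilson_interval` and the constant names are illustrative, not taken from the repository.

```python
import math


def wilson_interval(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion p_hat observed on n examples.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Figures reported in the README.
N_TEST = 59       # test set size
N_DISEASE = 22    # test abstracts that discuss disease
ACCURACY = 0.84   # reported accuracy (assumed here to be the observed proportion)

low, high = wilson_interval(ACCURACY, N_TEST)

# Accuracy of a trivial classifier that always predicts "not disease".
majority_baseline = (N_TEST - N_DISEASE) / N_TEST

print(f"95% interval for accuracy: {low:.2f} to {high:.2f}")
print(f"Majority-class baseline accuracy: {majority_baseline:.2f}")
```

Under these assumptions the interval spans roughly 0.73 to 0.91, against about 0.63 for always predicting "not disease". The reported accuracy is therefore clearly above the trivial baseline, but its exact value is only loosely pinned down by a test set of this size.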