afg1
/

lncrna-biocontext-longformer


afg1 committed on Jan 15

Commit

8a13977

1 Parent(s): 71763de

Create README.md

Files changed (1)

README.md +33 -0

README.md ADDED

	@@ -0,0 +1,33 @@

+---
+license: cc-by-4.0
+language:
+- en
+metrics:
+- f1
+- accuracy
+library_name: transformers
+pipeline_tag: text-classification
+---
+# lncrna-biocontext
+This model is designed to determine whether a given abstract discusses an lncRNA in the context of disease.
+The model was trained on data from [lncBook-Wiki](https://ngdc.cncb.ac.cn/lncbook/), in which experts have
+curated papers according to the biological context they discuss. We collected the
+abstracts for these papers and simplified the classification into disease/not disease. We then fine-tuned a
+[longformer](https://huggingface.co/allenai/longformer-base-4096) model to make the binary classification.
+On the test set, the model achieves the following scores:
+| Metric | Score |
+|-|-|
+| Accuracy | 0.84 |
+| F1 | 0.82 |
+| ROC AUC | 0.98 |
+Note, however, that the test set contains only 59 examples, 22 of which discuss disease.
+The next step is to classify both the specific disease (e.g. lung adenocarcinoma) and the non-disease
+context (e.g. localisation) that a paper discusses.
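
As a rough illustration of the README's caveat about the size of the test set, the sketch below computes a 95% Wilson score interval for the reported accuracy of 0.84 on 59 examples. The function name `wilson_interval` and the choice of the Wilson method are assumptions made here for illustration; the README does not say how, or whether, uncertainty was estimated.

```python
import math


def wilson_interval(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion p_hat observed over n trials.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if n <= 0:
        raise ValueError("n must be a positive number of trials")
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must lie in [0, 1]")

    z2 = z * z
    denom = 1.0 + z2 / n
    # The centre of the interval is pulled slightly towards 0.5.
    centre = (p_hat + z2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z2 / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Values taken from the README: accuracy 0.84 on a 59-example test set.
# The README reports a rounded accuracy, so the interval is only approximate.
low, high = wilson_interval(0.84, 59)
print(f"95% Wilson interval for accuracy: [{low:.2f}, {high:.2f}]")
```

With so few test examples the interval is wide, on the order of ±0.1 around the point estimate, so the reported scores are best read as indicative only.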