afg1/lncrna-biocontext-longformer


	---
	license: cc-by-4.0
	language:
	- en
	metrics:
	- f1
	- accuracy
	library_name: transformers
	pipeline_tag: text-classification
	---

	# lncrna-biocontext
	This model classifies whether a given abstract discusses an lncRNA in the context of disease.

	The model was trained on data from [lncBook-Wiki](https://ngdc.cncb.ac.cn/lncbook/), in which experts have
	curated papers according to the biological context they discuss. We collected the abstracts for these papers
	and simplified the curated contexts into two classes: disease and not disease. We then fine-tuned a
	[longformer](https://huggingface.co/allenai/longformer-base-4096) model for this binary classification task.
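
A minimal sketch of running the classifier on abstracts with the `transformers` pipeline API. The helper names `load_classifier` and `classify_abstracts` are hypothetical, and the example sentence is made up for illustration. The label strings the model returns are not documented in this card, so treat them as an assumption and check `model.config.id2label` before mapping them to disease / not disease.

```python
from typing import Any, Dict, List

# Repo ID as shown on the model page.
MODEL_ID = "afg1/lncrna-biocontext-longformer"


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline (downloads weights on first use)."""
    # Imported lazily so the helper below can be used without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify_abstracts(classifier, abstracts: List[str]) -> List[Dict[str, Any]]:
    """Return one {"label", "score"} dict per abstract."""
    # truncation=True guards against abstracts longer than the model's input limit.
    results = classifier(abstracts, truncation=True)
    return [{"label": r["label"], "score": float(r["score"])} for r in results]


# Usage (needs network access and `pip install transformers torch`):
# clf = load_classifier()
# print(classify_abstracts(clf, ["The lncRNA MALAT1 is upregulated in lung adenocarcinoma."]))
```

The score is the model's probability for the predicted label, not necessarily for the disease class, so downstream code should look at both fields.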

	On the test set, the model achieves the following scores:

	| Metric | Score |
	|---|---|
	| Accuracy | 0.84 |
	| F1 | 0.82 |
	| ROC AUC | 0.98 |

	Note that the test set is small: only 59 examples, 22 of which discuss disease, so these scores should be read with caution.
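
To give a sense of how much uncertainty a 59-example test set implies, the sketch below computes a 95% Wilson score interval for the reported accuracy of 0.84. This is an illustrative calculation, not part of the original evaluation; it treats the rounded accuracy as the observed proportion, and the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(p_hat: float, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Reported accuracy (rounded) and test-set size from the model card.
low, high = wilson_interval(0.84, 59)
print(f"95% interval for accuracy: {low:.2f} to {high:.2f}")
```

Under these assumptions the interval spans roughly 0.73 to 0.91, which is why a larger test set would be needed to pin the accuracy down more tightly.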

	The next step is to classify both the specific disease (e.g. lung adenocarcinoma) and the non-disease
	context (e.g. localisation) that a paper discusses.