---

language: en
tags:
- token-classification
- named-entity-recognition
- multi_class_classification

task:
- token-classification
- named-entity-recognition
- multi_class_classification

license: cc
datasets:
- ncbi_disease
metrics:
- precision
- recall
- f1
- accuracy
widget:
- text: " The risk of cancer, especially lymphoid neoplasias, is substantially elevated in A-T patients and has long been associated with chromosomal instability."

---

## Model information:
A distilbert-base-uncased model fine-tuned on the ncbi_disease dataset from the datasets library.

## Intended uses:
This model is intended for named entity recognition tasks: it identifies disease entities in text. The model predicts labels based on the NCBI-disease dataset; please see the dataset information for details.
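
As a rough illustration of what token-level disease labels look like once decoded, the sketch below groups BIO tags into disease mentions. The tag mapping (`O`, `B-Disease`, `I-Disease`) is an assumption taken from the ncbi_disease dataset and is not confirmed for this model's configuration; the helper `extract_disease_spans` and the example tag ids are hypothetical.

```python
# Assumed tag scheme, borrowed from the ncbi_disease dataset.
# The label names stored in this model's config are not stated in this card.
ID2LABEL = {0: "O", 1: "B-Disease", 2: "I-Disease"}


def extract_disease_spans(tokens, tag_ids):
    """Group B-/I- tagged tokens into disease mentions (hypothetical helper)."""
    spans, current = [], []
    for token, tag_id in zip(tokens, tag_ids):
        label = ID2LABEL[tag_id]
        if label == "B-Disease":
            # A new mention starts; close any mention still open.
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif label == "I-Disease" and current:
            # Continue the mention that is currently open.
            current.append(token)
        else:
            # "O", or a stray I- tag with no open mention (dropped).
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


tokens = ["The", "risk", "of", "cancer", ",", "especially", "lymphoid", "neoplasias"]
tag_ids = [0, 0, 0, 1, 0, 0, 1, 2]  # made-up predictions, for illustration only
print(extract_disease_spans(tokens, tag_ids))  # ['cancer', 'lymphoid neoplasias']
```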

## Limitations:
Note that the dataset and model may not be fully representative or suitable for all needs. It is recommended that the dataset paper and the base model card be reviewed before using the model:
- [NCBI Disease](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3951655/pdf/nihms557856.pdf)
- [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased)


## How to use:
Load the tokenizer and model from the following checkpoint:
```python
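from transformers import AutoTokenizer, AutoModelForTokenClassification

# AutoModelForTokenClassification keeps the fine-tuned NER head;
# plain AutoModel would load only the encoder without it.
tokenizer = AutoTokenizer.from_pretrained("sarahmiller137/distilbert-base-uncased-ft-ncbi-disease")
model = AutoModelForTokenClassification.from_pretrained("sarahmiller137/distilbert-base-uncased-ft-ncbi-disease")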
```
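
For quick experiments, the checkpoint can probably also be run through the `transformers` token-classification pipeline. The following is a minimal sketch, not a tested recipe for this model: the helper names `format_entities` and `find_diseases` and the `min_score` threshold of 0.5 are hypothetical, and the label names returned depend on the model's config, which this card does not document.

```python
MODEL_ID = "sarahmiller137/distilbert-base-uncased-ft-ncbi-disease"


def format_entities(entities, min_score=0.5):
    """Reduce aggregated pipeline results to (text, label, score) tuples.

    min_score is an illustrative cut-off, not a recommended value.
    """
    return [
        (e["word"], e["entity_group"], round(float(e["score"]), 3))
        for e in entities
        if e["score"] >= min_score
    ]


def find_diseases(text):
    """Run the checkpoint on text and return the formatted entities."""
    # Imported lazily so the formatting helper works without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into whole entity spans.
    ner = pipeline(
        "token-classification",
        model=MODEL_ID,
        aggregation_strategy="simple",
    )
    return format_entities(ner(text))
```

Calling `find_diseases("The risk of cancer, especially lymphoid neoplasias, is substantially elevated in A-T patients.")` downloads the checkpoint on first use and returns one tuple per detected span.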