---
tags:
- flair
- hunflair
- token-classification
- sequence-tagger-model
language: en
widget:
- text: It contains a functional GCGGCGGCG Egr-1-binding site
---

## HunFlair2 model for TFBS

[HunFlair2](https://github.com/flairNLP/flair/blob/master/resources/docs/HUNFLAIR2.md) (biomedical Flair) model for recognizing transcription factor binding site (TFBS) entities:

- pre-trained language model: `michiyasunaga/BioLinkBERT-base`
- fine-tuned on the RegEl corpus for the `Tfbs` entity type

Predicts 1 tag:

| **tag** | **meaning**                              |
| ------- | ---------------------------------------- |
| Tfbs    | DNA region bound by transcription factor |

______________________________________________________________________

## Info

### Demo: How to use in Flair

Requires:

- **[Flair](https://github.com/flairNLP/flair/)>=0.14.0** (`pip install flair` or `pip install git+https://github.com/flairNLP/flair.git`)

```python
from flair.data import Sentence
from flair.nn import Classifier
from flair.tokenization import SciSpacyTokenizer

text = "We found that Egr-1 specifically binds to the PTEN 5' untranslated region, which contains a functional GCGGCGGCG Egr-1-binding site."
# SciSpacyTokenizer needs the scispacy package and its en_core_sci_sm model installed
sentence = Sentence(text, use_tokenizer=SciSpacyTokenizer())

# load the tagger and annotate the sentence in place
tagger = Classifier.load("regel-corpus/hunflair2-regel-tfbs")
tagger.predict(sentence)

print('The following NER tags are found:')
# iterate over entities and print
for entity in sentence.get_spans('ner'):
    print(entity)
```
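
For downstream processing it is often handier to turn the predicted spans into plain records than to print them. The following is a minimal sketch, not part of the original demo. It assumes each span exposes `text`, `start_position`, `end_position`, `tag` and `score`, as Flair spans do in recent releases. `FakeSpan`, `fake_span`, `spans_to_records` and `min_score` are hypothetical names introduced here, and the demo mentions and scores are invented so the snippet runs without downloading the model.

```python
from dataclasses import dataclass


@dataclass
class FakeSpan:
    """Hypothetical stand-in exposing the attributes assumed on a Flair span."""
    text: str
    start_position: int
    end_position: int
    tag: str
    score: float


def spans_to_records(spans, min_score=0.0):
    """Convert span-like objects into dicts, skipping those scored below min_score."""
    records = []
    for span in spans:
        if span.score < min_score:
            continue
        records.append(
            {
                "text": span.text,
                "start": span.start_position,
                "end": span.end_position,
                "label": span.tag,
                "score": span.score,
            }
        )
    return records


# Illustrative input only: these mentions and scores are made up,
# they are not actual output of the model.
TEXT = "It contains a functional GCGGCGGCG Egr-1-binding site"


def fake_span(mention, score):
    """Build a FakeSpan whose character offsets are looked up in TEXT."""
    start = TEXT.index(mention)
    return FakeSpan(mention, start, start + len(mention), "Tfbs", score)


demo_spans = [
    fake_span("GCGGCGGCG Egr-1-binding site", 0.90),
    fake_span("functional", 0.30),
]

records = spans_to_records(demo_spans, min_score=0.5)
for record in records:
    print(record)
```

With the real pipeline, the same helper would presumably be called as `spans_to_records(sentence.get_spans('ner'), min_score=0.5)`. The 0.5 threshold is an arbitrary example value, not a recommendation from the model authors.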