The host model is a Named Entity Recognition (NER) model that identifies and annotates the host (the living organism) of microbiome samples in text.
It is a fine-tuned BioBERT model. The training dataset is available at https://gitlab.com/maaly7/emerald_metagenomics_annotations
Example sentences for testing:
- Turkestan cockroach nymphs (Finke, 2013) were fed to the treefrogs at a quantity of 10% of treefrog biomass twice a week.
- Samples were collected from clinically healthy giant pandas (five females and four males) at the China Conservation and Research Center for Giant Pandas (Ya'an, China).
- Field-collected bee samples were dissected on dry ice and separated into head, thorax (excluding legs and wings), and abdomens.
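
The sentences above can be passed to the model through the Hugging Face `transformers` token-classification pipeline. What follows is a minimal sketch rather than the authors' own inference code: the helper names `annotate_hosts` and `load_host_tagger` are hypothetical, the model identifier is not stated in this document and must be supplied by the caller, and the `[HOST: ...]` marker is only an illustrative output format. The sketch assumes predictions arrive as dictionaries with character offsets `start` and `end`, which is what the pipeline returns when `aggregation_strategy="simple"` is set.

```python
from typing import Dict, List


def annotate_hosts(text: str, entities: List[Dict]) -> str:
    """Wrap each predicted span in an illustrative [HOST: ...] marker.

    `entities` is assumed to be a list of dicts with character offsets
    `start` and `end`, as produced by a token-classification pipeline
    using aggregation_strategy="simple". Spans must not overlap.
    """
    pieces = []
    cursor = 0
    for ent in sorted(entities, key=lambda e: e["start"]):
        pieces.append(text[cursor:ent["start"]])                      # text before the span
        pieces.append("[HOST: " + text[ent["start"]:ent["end"]] + "]")  # the marked span
        cursor = ent["end"]
    pieces.append(text[cursor:])                                      # remaining tail
    return "".join(pieces)


def load_host_tagger(model_id: str):
    """Build a token-classification pipeline for a caller-supplied model id.

    The import is done lazily so annotate_hosts can be used without
    transformers installed.
    """
    from transformers import pipeline

    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge word pieces into whole spans
    )


if __name__ == "__main__":
    sentence = "Field-collected bee samples were dissected on dry ice."
    # Hand-written span for illustration only; this is NOT real model output.
    start = sentence.index("bee")
    demo_entities = [{"start": start, "end": start + len("bee")}]
    print(annotate_hosts(sentence, demo_entities))
```

With a real model, the flow would be `tagger = load_host_tagger(model_id)` followed by `annotate_hosts(sentence, tagger(sentence))`. The label names the model emits are not documented here, so the sketch relies only on the span offsets.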