lighteternal's picture
Update README.md
fdb646e
---
language: en
tags:
- textual-entailment
- nli
- pytorch
datasets:
- mnli
license: mit
widget :
- text: "EpCAM is overexpressed in breast cancer. </s></s> EpCAM is downregulated in breast cancer."
---
# BiomedNLP-PubMedBERT finetuned on textual entailment (NLI)
The [microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext](https://huggingface.co/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext?text=%5BMASK%5D+is+a+tumor+suppressor+gene) finetuned on the MNLI dataset. It should be useful in textual entailment tasks involving biomedical corpora.
## Usage
Given two sentences (a premise and a hypothesis), the model outputs the logits of entailment, neutral or contradiction.
You can test the model using the HuggingFace model widget on the side:
- Input two sentences (premise and hypothesis) one after the other.
- The model returns the probabilities of 3 labels: entailment(LABEL:0), neutral(LABEL:1) and contradiction(LABEL:2) respectively.
To use the model locally on your machine:
```python
# import torch
# device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("lighteternal/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext-finetuned-mnli")
model = AutoModelForSequenceClassification.from_pretrained("lighteternal/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext-finetuned-mnli")
premise = 'EpCAM is overexpressed in breast cancer'
hypothesis = 'EpCAM is downregulated in breast cancer.'
# run through model pre-trained on MNLI
x = tokenizer.encode(premise, hypothesis, return_tensors='pt',
truncation_strategy='only_first')
logits = model(x)[0]
probs = logits.softmax(dim=1)
print('Probabilities for entailment, neutral, contradiction \n', np.around(probs.cpu().
detach().numpy(),3))
# Probabilities for entailment, neutral, contradiction
# 0.001 0.001 0.998
```
## Metrics
Evaluation on classification accuracy (entailment, contradiction, neutral) on MNLI test set:
| Metric | Value |
| --- | --- |
| Accuracy | 0.8338|
See Training Metrics tab for detailed info.