---
language:
- en
tags:
- generated_from_trainer
datasets:
- glue
model-index:
- name: debug
  results: []
---

# BioLinkBERT-large-mnli-snli

This model is a fine-tuned version of [BioLinkBERT-large](https://huggingface.co/michiyasunaga/BioLinkBERT-large) on the GLUE [MNLI](https://huggingface.co/datasets/multi_nli) and [SNLI](https://huggingface.co/datasets/snli) datasets.

The results are:

| **Model**                   | **Dataset** | **Acc** |
|-----------------------------|-------------|---------|
| Roberta-large-mnli          | MNLI dev mm | 90.12   |
|                             | MNLI dev m  | 90.59   |
|                             | SNLI test   | 88.25   |
| BioLinkBERT-large           | MNLI dev mm | 33.56   |
|                             | MNLI dev m  | 33.18   |
|                             | SNLI test   | 32.66   |
| BioLinkBERT-large-mnli-snli | MNLI dev mm | 85.75   |
|                             | MNLI dev m  | 85.30   |
|                             | SNLI test   | 89.82   |

The labels are:
- "0": "entailment"
- "1": "neutral"
- "2": "contradiction"

For finetuning [BioLinkBERT-large](https://huggingface.co/michiyasunaga/BioLinkBERT-large) on
- only MNLI (i.e. not SNLI), see [`cnut1648/biolinkbert-mnli`](https://huggingface.co/cnut1648/biolinkbert-mnli)
- MedNLI, see [`cnut1648/biolinkbert-mednli`](https://huggingface.co/cnut1648/biolinkbert-mednli)
- MNLI resampled to the same size as MedNLI, see [`cnut1648/biolinkbert-large-mnli-resampled`](https://huggingface.co/cnut1648/biolinkbert-large-mnli-resampled)

## Training procedure

This model checkpoint was created by [mnli.py](https://huggingface.co/cnut1648/biolinkbert-large-mnli-snli/blob/main/mnli.py) with the following command:

```shell
CUDA_VISIBLE_DEVICES=0,1,2 python -m torch.distributed.launch \
  --nproc_per_node 3 mnli.py \
  --model_name_or_path michiyasunaga/BioLinkBERT-large --task_name mnli --add_snli \
  --do_train --max_seq_length 512 --fp16 --per_device_train_batch_size 16 --gradient_accumulation_steps 2 \
  --learning_rate 3e-5 --warmup_ratio 0.5 --num_train_epochs 10 \
  --output_dir ./biolinkbert_mnli_snli
```

which creates a folder `biolinkbert_mnli_snli` containing the checkpoints. All checkpoints are then evaluated on MNLI dev (mismatched and matched) & SNLI test by:

```shell
name=biolinkbert_mnli_snli
root=$PWD/$name
mkdir -p $PWD/eval/$name
for run in $root/checkpoint-*; do
  step=$(echo $run | rg "checkpoint-(?P<step>\d+)" -or '$step')
  out="./eval/${name}/eval_$step"
  echo "eval of $step ---- save to $out"
  CUDA_VISIBLE_DEVICES=0 python mnli.py \
    --model_name_or_path $run --task_name mnli --add_snli \
    --do_eval --max_seq_length 512 --fp16 --report_to none \
    --per_device_eval_batch_size 8 --output_dir $out
done
```

which creates a folder `eval/biolinkbert_mnli_snli` containing the evaluation results for MNLI dev & SNLI test.
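Each eval run is expected to write an `all_results.json` under `eval/biolinkbert_mnli_snli/eval_<step>`, and the selection script below reads its accuracy keys. A minimal sketch of inspecting a single checkpoint's results (the step number `2000` is illustrative, and the metrics are assumed to be stored as fractions, hence the `* 100`):

```python
import json
from pathlib import Path

# Hypothetical eval output for one checkpoint; the step number is illustrative.
result_file = Path("eval/biolinkbert_mnli_snli/eval_2000/all_results.json")

with open(result_file) as f:
    metrics = json.load(f)

# Accuracy keys consumed by the selection script below; values assumed to be fractions.
for key in ("mnli_eval_accuracy", "mnli-mm_eval_accuracy", "snli-test_eval_accuracy"):
    print(f"{key}: {metrics[key] * 100:.2f}")
```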
The best checkpoint is then selected according to `mnli-m` (matched dev accuracy):

```python
import json
import os
from pathlib import Path

import pandas as pd
from rich.console import Console
from rich.table import Table

pwd = Path(__file__).parent.resolve()
name = "eval/biolinkbert_mnli_snli"

# Collect accuracies from each eval_<step>/all_results.json, as percentages.
results = []
for evalckpt in os.listdir(pwd / name):
    step = evalckpt.split("_")[1]
    with open(pwd / name / evalckpt / "all_results.json") as f:
        data = json.load(f)
    results.append([
        int(step),
        data["mnli-mm_eval_accuracy"] * 100,
        data["mnli_eval_accuracy"] * 100,
        data["snli-test_eval_accuracy"] * 100,
    ])

results = pd.DataFrame(results, columns=["step", "mnli-mm", "mnli-m", "snli"]).sort_values(by=["step"])

# Render the full results frame in a single-column rich table headed by the eval folder name.
console = Console()
table = Table(name)
table.add_row(results.to_string(float_format=lambda _: "{:.3f}".format(_)))
console.print(table)

# Pick the checkpoint with the highest matched-dev (mnli-m) accuracy.
best = results["mnli-m"].idxmax()
print(results.loc[best])
```

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.5
- num_epochs: 10.0
- mixed_precision_training: Native AMP

### Framework versions

- Transformers 4.22.2
- Pytorch 1.12.1+cu113
- Datasets 2.4.0
- Tokenizers 0.12.1
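For reference, a minimal inference sketch with 🤗 Transformers, using the label mapping listed above (the premise/hypothesis pair is made up for illustration):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "cnut1648/biolinkbert-large-mnli-snli"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Illustrative premise/hypothesis pair, encoded as a sentence pair (NLI-style input).
premise = "The patient was given aspirin to reduce the fever."
hypothesis = "The patient received medication."
inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    logits = model(**inputs).logits

# 0: entailment, 1: neutral, 2: contradiction (see the label mapping above)
labels = ["entailment", "neutral", "contradiction"]
print(labels[logits.argmax(dim=-1).item()])
```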