---
language:
- en
tags:
- generated_from_trainer
datasets:
- glue
model-index:
- name: debug
  results: []
---

# BioLinkBERT-large-mnli-snli

This model is a fine-tuned version of [BioLinkBERT-large](https://huggingface.co/michiyasunaga/BioLinkBERT-large) on the GLUE [MNLI](https://huggingface.co/datasets/multi_nli) and [SNLI](https://huggingface.co/datasets/snli) datasets.

The results are:

| **Model**                   | **Dataset** | **Acc** |
|-----------------------------|-------------|---------|
| Roberta-large-mnli          | MNLI dev mm | 90.12   |
|                             | MNLI dev m  | 90.59   |
|                             | SNLI test   | 88.25   |
| BioLinkBERT-large           | MNLI dev mm | 33.56   |
|                             | MNLI dev m  | 33.18   |
|                             | SNLI test   | 32.66   |
| BioLinkBERT-large-mnli-snli | MNLI dev mm | 85.75   |
|                             | MNLI dev m  | 85.30   |
|                             | SNLI test   | 89.82   |

The labels are:
- "0": "entailment"
- "1": "neutral"
- "2": "contradiction"

For finetuning [BioLinkBERT-large](https://huggingface.co/michiyasunaga/BioLinkBERT-large) on
- only MNLI (i.e. not SNLI), see [`cnut1648/biolinkbert-mnli`](https://huggingface.co/cnut1648/biolinkbert-mnli)
- MedNLI, see [`cnut1648/biolinkbert-mednli`](https://huggingface.co/cnut1648/biolinkbert-mednli)
- MNLI resampled to the same size as MedNLI, see [`cnut1648/biolinkbert-large-mnli-resampled`](https://huggingface.co/cnut1648/biolinkbert-large-mnli-resampled)

## Training procedure

This model checkpoint was created by [mnli.py](https://huggingface.co/cnut1648/biolinkbert-large-mnli-snli/blob/main/mnli.py) with the following command:

```shell
CUDA_VISIBLE_DEVICES=0,1,2 python -m torch.distributed.launch \
  --nproc_per_node 3 mnli.py \
  --model_name_or_path michiyasunaga/BioLinkBERT-large --task_name mnli --add_snli \
  --do_train --max_seq_length 512 --fp16 --per_device_train_batch_size 16 --gradient_accumulation_steps 2 \
  --learning_rate 3e-5 --warmup_ratio 0.5 --num_train_epochs 10 \
  --output_dir ./biolinkbert_mnli_snli
```

which creates a folder `biolinkbert_mnli_snli` containing the checkpoints. All checkpoints are then evaluated on MNLI dev (mismatched and matched) & SNLI test by:

```shell
name=biolinkbert_mnli_snli
root=$PWD/$name
mkdir -p $PWD/eval/$name
for run in $root/checkpoint-*; do
  step=$(echo $run | rg "checkpoint-(?P<step>\d+)" -or '$step')
  out="./eval/${name}/eval_$step"
  echo "eval of $step ---- save to $out"
  CUDA_VISIBLE_DEVICES=0 python mnli.py \
    --model_name_or_path $run --task_name mnli --add_snli \
    --do_eval --max_seq_length 512 --fp16 --report_to none \
    --per_device_eval_batch_size 8 --output_dir $out
done
```

which creates a folder `eval/biolinkbert_mnli_snli` containing the evaluation results for MNLI dev & SNLI test.
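Each eval run is expected to write an `all_results.json` under `eval/biolinkbert_mnli_snli/eval_<step>`, and the selection script below reads its accuracy keys. A minimal sketch of inspecting a single checkpoint's results (the step number `2000` is illustrative, and the metrics are assumed to be stored as fractions, hence the `* 100`):

```python
import json
from pathlib import Path

# Hypothetical eval output for one checkpoint; the step number is illustrative.
result_file = Path("eval/biolinkbert_mnli_snli/eval_2000/all_results.json")

with open(result_file) as f:
    metrics = json.load(f)

# Accuracy keys consumed by the selection script below; values assumed to be fractions.
for key in ("mnli_eval_accuracy", "mnli-mm_eval_accuracy", "snli-test_eval_accuracy"):
    print(f"{key}: {metrics[key] * 100:.2f}")
```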
The best checkpoint is then selected according to `mnli-m` (matched dev accuracy):

```python
import json
import os
from pathlib import Path

import pandas as pd
from rich.console import Console
from rich.table import Table

pwd = Path(__file__).parent.resolve()
name = "eval/biolinkbert_mnli_snli"

# Collect accuracies from each eval_<step>/all_results.json, as percentages.
results = []
for evalckpt in os.listdir(pwd / name):
    step = evalckpt.split("_")[1]
    with open(pwd / name / evalckpt / "all_results.json") as f:
        data = json.load(f)
    results.append([
        int(step),
        data["mnli-mm_eval_accuracy"] * 100,
        data["mnli_eval_accuracy"] * 100,
        data["snli-test_eval_accuracy"] * 100,
    ])

results = pd.DataFrame(results, columns=["step", "mnli-mm", "mnli-m", "snli"]).sort_values(by=["step"])

# Render the full results frame in a single-column rich table headed by the eval folder name.
console = Console()
table = Table(name)
table.add_row(results.to_string(float_format=lambda _: "{:.3f}".format(_)))
console.print(table)

# Pick the checkpoint with the highest matched-dev (mnli-m) accuracy.
best = results["mnli-m"].idxmax()
print(results.loc[best])
```

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.5
- num_epochs: 10.0
- mixed_precision_training: Native AMP

### Framework versions

- Transformers 4.22.2
- Pytorch 1.12.1+cu113
- Datasets 2.4.0
- Tokenizers 0.12.1
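For reference, a minimal inference sketch with 🤗 Transformers, using the label mapping listed above (the premise/hypothesis pair is made up for illustration):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "cnut1648/biolinkbert-large-mnli-snli"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Illustrative premise/hypothesis pair, encoded as a sentence pair (NLI-style input).
premise = "The patient was given aspirin to reduce the fever."
hypothesis = "The patient received medication."
inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    logits = model(**inputs).logits

# 0: entailment, 1: neutral, 2: contradiction (see the label mapping above)
labels = ["entailment", "neutral", "contradiction"]
print(labels[logits.argmax(dim=-1).item()])
```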