biolinkbert-mednli

This model is a fine-tuned version of michiyasunaga/BioLinkBERT-large on MedNLI. It achieves the following results on the evaluation set:

{
    "eval_accuracy": 0.8788530230522156,
    "eval_loss": 0.7843484878540039,
    "eval_runtime": 39.7009,
    "eval_samples": 1395,
    "eval_samples_per_second": 35.138,
    "eval_steps_per_second": 1.108
}

The results on the test set are:

{
    "eval_accuracy": 0.8607594966888428,
    "eval_loss": 0.879707932472229,
    "eval_runtime": 27.4404,
    "eval_samples": 1395,
    "eval_samples_per_second": 51.821,
    "eval_steps_per_second": 1.64
}

The labels are:

"id2label": {
    "0": "entailment",
    "1": "neutral",
    "2": "contradiction"
}
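
At inference time the checkpoint can be loaded with the standard sequence-classification classes and fed a (premise, hypothesis) pair. The sketch below is illustrative only: the repository id is a placeholder for wherever this checkpoint is hosted, and the example sentences are made up.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "your-namespace/biolinkbert-mednli"  # placeholder; use the actual Hub id or a local path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

premise = "The patient denies any chest pain."   # illustrative premise
hypothesis = "The patient has no chest pain."    # illustrative hypothesis

inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    logits = model(**inputs).logits

pred = logits.argmax(dim=-1).item()
print(model.config.id2label[pred])  # one of "entailment", "neutral", "contradiction"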

Training procedure

This checkpoint was produced by mednli.py with the following command:

root=/path/to/mednli/;
python mednli.py \
    --model_name_or_path michiyasunaga/BioLinkBERT-large \
    --do_train --train_file ${root}/mli_train_v1.jsonl \
    --do_eval --validation_file ${root}/mli_dev_v1.jsonl \
    --do_predict --test_file ${root}/mli_test_v1.jsonl \
    --max_seq_length 512 --fp16 --per_device_train_batch_size 16 --gradient_accumulation_steps 2 \
    --learning_rate 3e-5 --warmup_ratio 0.5 --num_train_epochs 10 \
    --output_dir ./biolinkbert_mednli
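
mednli.py itself is not reproduced on this card; it presumably follows the Hugging Face text-classification example. As a rough, illustrative sketch of the data handling it implies (assuming the MedNLI JSONL files expose the sentence1, sentence2, and gold_label fields of the official release):

from datasets import load_dataset
from transformers import AutoTokenizer

root = "/path/to/mednli"  # same placeholder directory as in the command above
raw = load_dataset("json", data_files={
    "train": f"{root}/mli_train_v1.jsonl",
    "validation": f"{root}/mli_dev_v1.jsonl",
    "test": f"{root}/mli_test_v1.jsonl",
})

label2id = {"entailment": 0, "neutral": 1, "contradiction": 2}  # matches id2label above
tokenizer = AutoTokenizer.from_pretrained("michiyasunaga/BioLinkBERT-large")

def preprocess(batch):
    # Encode each premise/hypothesis pair and map string labels to ids.
    enc = tokenizer(batch["sentence1"], batch["sentence2"], truncation=True, max_length=512)
    enc["label"] = [label2id[l] for l in batch["gold_label"]]
    return enc

tokenized = raw.map(preprocess, batched=True)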

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 3e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.5
  • num_epochs: 10.0
  • mixed_precision_training: Native AMP
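
Expressed as TrainingArguments, the settings above correspond roughly to the following (a sketch, not the exact configuration built by mednli.py):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./biolinkbert_mednli",
    learning_rate=3e-5,
    per_device_train_batch_size=16,   # 16 x 2 accumulation steps = total batch size 32
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,
    num_train_epochs=10,
    lr_scheduler_type="linear",
    warmup_ratio=0.5,
    seed=42,
    fp16=True,                        # native AMP mixed precision
)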

Framework versions

  • Transformers 4.22.2
  • Pytorch 1.13.0+cu117
  • Datasets 2.4.0
  • Tokenizers 0.12.1