
MIReAD Neuro

This model is a fine-tuned version of arazd/MIReAD, trained on a journal classification task over neuroscience papers from 200 journals collected from various sources. It achieves the following results on the evaluation set:

  • Loss: 2.7117
  • Accuracy: 0.4011
  • F1: 0.3962
  • Precision: 0.4066
  • Recall: 0.3999

Model description

This model is a BERT sequence classifier trained to predict which of 200 journals a neuroscience paper belongs to, given its title and abstract.

Intended uses & limitations

The model is intended for creating abstract embeddings for semantic similarity search over neuroscience-related articles.
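
Semantic similarity search over embeddings usually comes down to ranking stored vectors by cosine similarity against a query vector. The following is a minimal, dependency-free sketch under that assumption. The helper names `cosine_similarity` and `rank_by_similarity` and the toy three-dimensional vectors are hypothetical; real embeddings from this model are much higher-dimensional and could be converted to plain lists with `.tolist()` before being passed in.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float],
    corpus: Sequence[Sequence[float]],
    top_k: int = 5,
) -> List[Tuple[int, float]]:
    """Return (corpus index, similarity) pairs, most similar first."""
    scores = [(i, cosine_similarity(query, vec)) for i, vec in enumerate(corpus)]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Toy vectors standing in for abstract embeddings (hypothetical data).
corpus = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
query = [1.0, 0.05, 0.0]
ranked = rank_by_similarity(query, corpus, top_k=2)
print(ranked)
```

For a large collection, a vectorized implementation or a dedicated vector index would be a more practical choice than this per-pair loop.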

Model Usage

To load the model:

import torch
from transformers import BertForSequenceClassification, AutoTokenizer
model_path = "biodatlab/MIReAD-Neuro"
model = BertForSequenceClassification.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

To create an embedding for, or classify, a title and abstract:

# sample abstract & title text
title = "Why Brain Criticality Is Clinically Relevant: A Scoping Review."
abstract = "The past 25 years have seen a strong increase in the number of publications related to criticality in different areas of neuroscience..."
text = title + tokenizer.sep_token + abstract
tokens = tokenizer(
    text,
    max_length=512,
    padding=True,
    truncation=True,
    return_tensors="pt"
)

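# to generate an embedding from a given title and abstract
# (the final hidden state of the first, [CLS], token)
with torch.no_grad():
    output = model.bert(**tokens)
    embedding = output.last_hidden_state[:, 0, :]

# to classify a given title and abstract into one of the 200 journals
with torch.no_grad():
    output = model(**tokens)
logits = output.logits
predicted_class = logits.argmax(dim=-1).item()  # index of the highest-scoring journal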
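
The logits hold one score per journal class, so a softmax turns them into probabilities that can be ranked. Below is a minimal sketch, assuming the logits for a single input have been converted to a plain list (for example with `output.logits[0].tolist()`). The helpers `softmax` and `top_k_classes` are hypothetical names, and the four toy values stand in for the model's 200 logits. The sketch returns class indices only, since the mapping from indices to journal names is not described here.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)  # subtract the max to avoid overflow in exp
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_classes(logits: Sequence[float], k: int = 3) -> List[Tuple[int, float]]:
    """Return the k most probable (class index, probability) pairs."""
    probs = softmax(logits)
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy logits (hypothetical); real input would have 200 entries.
toy_logits = [0.1, 2.0, -1.0, 1.5]
top = top_k_classes(toy_logits, k=2)
print(top)
```

Given the reported evaluation accuracy of about 0.40, looking at several top-ranked journals rather than only the single best one may be more informative.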

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • num_epochs: 6
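
For readers who want to set up a comparable run, the hyperparameters above could be expressed roughly as follows. This is a sketch that assumes training used the Hugging Face `Trainer`, which this card does not confirm; the `output_dir` value is a hypothetical placeholder, and any setting not listed above (optimizer, scheduler, seed, and so on) is left at the library default rather than guessed.

```python
from transformers import TrainingArguments

# Only the four values listed in this card are set explicitly.
training_args = TrainingArguments(
    output_dir="mireAD-neuro-finetune",  # hypothetical path, not from the card
    learning_rate=3e-5,
    per_device_train_batch_size=16,  # assumes the listed batch size is per device
    per_device_eval_batch_size=16,
    num_train_epochs=6,
)
```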