bert-finetuned-mlm-accelerate

This repository contains a BERT model fine-tuned for Masked Language Modeling (MLM) using the 🤗 Accelerate library for efficient training and evaluation.


🧠 Model Overview

  • Base model: BERT (pretrained, e.g., bert-base-uncased)
  • Task: Masked Language Modeling (MLM)
  • Fine-tuning framework: Hugging Face Transformers + Accelerate
  • Optimizer: AdamW
  • Learning rate scheduler: Linear scheduler
  • Training epochs: 3
  • Loss metric: Cross-entropy loss over masked tokens
  • Evaluation metric: Perplexity
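To make the linear scheduler listed above concrete, here is a minimal pure-Python sketch of a learning rate that decays linearly to zero over all training steps. It assumes no warmup phase (the card does not mention one), and `steps_per_epoch` and `base_lr` are hypothetical values chosen only for illustration. In Transformers, a schedule of this shape is typically obtained with `get_scheduler("linear", ...)` and `num_warmup_steps=0`, though the card does not show the exact call used.

```python
def linear_decay_lr(step: int, total_steps: int, base_lr: float) -> float:
    """Learning rate after `step` optimizer updates, decaying linearly to zero."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


# Hypothetical values for illustration; the card does not state them.
steps_per_epoch = 100
base_lr = 5e-5

num_epochs = 3  # taken from the card
total_steps = steps_per_epoch * num_epochs

# Print the learning rate at the start, midpoint, and end of training.
for step in (0, total_steps // 2, total_steps):
    print(step, linear_decay_lr(step, total_steps, base_lr))
```

The learning rate starts at `base_lr`, reaches half of it at the midpoint, and is zero at the final step.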

⚙️ Training Details

  • Optimizer used: AdamW from PyTorch.
  • Learning rate scheduler: Linear decay over total training steps.
  • Epochs: 3
  • Sequence length: (e.g., 128 or 512; fill in based on your setup)
  • Batch size: (fill in if known)
  • Mixed precision training with 🤗 Accelerate
  • Concatenated dataset split into fixed-length chunks to avoid truncation and padding inefficiencies.
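The chunking step in the last bullet can be sketched as follows. This is an illustrative pure-Python version, not the training code used for this model: the function name `group_texts`, the toy token IDs, and the choice to drop the trailing remainder are all assumptions.

```python
from itertools import chain


def group_texts(tokenized_docs, chunk_size):
    """Concatenate token-ID lists and split them into equal-length chunks."""
    concatenated = list(chain.from_iterable(tokenized_docs))
    # Assumption: drop the trailing remainder so every chunk has exactly
    # chunk_size tokens and no padding is needed.
    usable_length = (len(concatenated) // chunk_size) * chunk_size
    return [
        concatenated[i : i + chunk_size]
        for i in range(0, usable_length, chunk_size)
    ]


# Toy token IDs standing in for tokenizer output (hypothetical values).
docs = [[101, 7, 8, 102], [101, 9, 102], [101, 10, 11, 12, 102]]
chunks = group_texts(docs, chunk_size=4)
print(chunks)
```

Because documents are concatenated before splitting, a chunk may span a document boundary, as the second chunk does here. In exchange, no sequence is truncated and no padding tokens are processed.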

📊 Evaluation Results

Epoch   Perplexity
0       12.03
1       11.55
2       11.32

Perplexity was calculated on the validation set after each epoch.
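For reference, perplexity is conventionally the exponential of the mean cross-entropy loss. The sketch below assumes that definition applies here, since the card does not include its evaluation code. The per-token losses are hypothetical; only the value 11.32 comes from the table above.

```python
import math


def perplexity(masked_token_losses):
    """Perplexity as exp of the mean cross-entropy over masked tokens."""
    mean_loss = sum(masked_token_losses) / len(masked_token_losses)
    return math.exp(mean_loss)


# Hypothetical per-token losses, for illustration only.
losses = [2.1, 2.6, 2.4, 2.5]
print(perplexity(losses))

# Inverse direction: the mean loss implied by the reported final perplexity.
implied_loss = math.log(11.32)
print(implied_loss)
```

Under this definition, the final perplexity of 11.32 corresponds to a mean masked-token loss of roughly 2.43 nats.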


🧪 How to Use

You can load the model for masked language modeling using transformers:

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Replace with the actual repository ID on the Hub.
model_name = "your-username/bert-finetuned-mlm-accelerate"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

text = "The capital of France is [MASK]."
inputs = tokenizer(text, return_tensors="pt")

# Run inference without tracking gradients.
with torch.no_grad():
    outputs = model(**inputs)

# Locate the [MASK] position and take the highest-scoring token there.
mask_token_index = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_token_id = outputs.logits[0, mask_token_index, :].argmax(dim=-1)
print(tokenizer.decode(predicted_token_id))