MARIA-OLMo-7B

MARIA ("Masked Autoregressive Infilling Adapter") is a small trained head that lets a frozen autoregressive (AR) language model fill in masked tokens. This checkpoint bundles:

Component      Source                          Frozen?
AR backbone    allenai/OLMo-7B-0724-hf         ✅
MLM backbone   answerdotai/ModernBERT-large    ✅
Fusion head    trained (this repo)             –

Only the fusion head (≈257M params) was trained. Both backbones are frozen and shipped here unchanged.
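
The ≈257M figure is consistent with a single linear layer that maps the concatenated hidden states of both backbones to vocabulary logits. The sketch below only checks that arithmetic. The hidden sizes (4096 and 1024), the vocabulary size (50304), and the single-linear-layer shape of the head are all assumptions that this card does not state; verify them against each backbone's config.json and the training code before relying on them.

```python
# Assumed values, not stated in this card: check each backbone's config.json.
AR_HIDDEN = 4096      # assumed hidden size of allenai/OLMo-7B-0724-hf
MLM_HIDDEN = 1024     # assumed hidden size of answerdotai/ModernBERT-large
VOCAB_SIZE = 50304    # assumed output vocabulary size of the head


def linear_head_params(in_features: int, out_features: int, bias: bool = False) -> int:
    """Parameter count of a dense layer mapping in_features -> out_features."""
    return in_features * out_features + (out_features if bias else 0)


# Hypothetical head: concatenate both hidden states, project to the vocabulary.
fusion_params = linear_head_params(AR_HIDDEN + MLM_HIDDEN, VOCAB_SIZE)
print(f"{fusion_params:,} parameters (~{fusion_params / 1e6:.1f}M)")
```

If the real head has a different shape (a bias term, extra layers, or a different vocabulary size), the count will differ; the point is only that the reported size is of the order expected for a projection over both backbones' features.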

Paper: Enabling Autoregressive Models to Fill In Masked Tokens (Israel et al., 2025)
Code & training scripts: https://github.com/danielmisrael/maria

Quickstart

import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model = AutoModelForMaskedLM.from_pretrained(
    "dmisrael/maria-olmo-7b", trust_remote_code=True, torch_dtype=torch.bfloat16
).to("cuda").eval()
tok = AutoTokenizer.from_pretrained("dmisrael/maria-olmo-7b")

text = f"The capital of France is {tok.mask_token}."
ids = tok(text, return_tensors="pt").input_ids.cuda()
out = model.infill(ids, mask_token_id=tok.mask_token_id)
print(tok.decode(out[0], skip_special_tokens=True))
# -> "The capital of France is Paris."

API

The model class exposes two inference methods on top of the standard HF interface (see the GitHub README for full docs):

  • model.infill(input_ids, mask_token_id, greedy=True): fills every position where input_ids == mask_token_id, proceeding left to right.
  • model.compute_nll(input_ids, labels, reduction='mean'): returns the negative log-likelihood over positions where labels != -100.
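
To make the two conventions above concrete, here is a toy, model-free sketch in plain Python. The names toy_infill and toy_nll are hypothetical, and the next_token callable and log_probs table are stand-ins for whatever the real model computes. It illustrates left-to-right filling and the -100 ignore convention only; it is not the repository's implementation.

```python
import math
from typing import Callable, List, Sequence

IGNORE_INDEX = -100  # label value marking positions excluded from the loss


def toy_infill(ids: Sequence[int], mask_id: int,
               next_token: Callable[[List[int], int], int]) -> List[int]:
    """Fill masked positions one at a time, from left to right.

    next_token(current_ids, position) is a hypothetical predictor. It sees
    tokens already filled to the left and still-masked tokens to the right.
    """
    out = list(ids)
    for pos, tok in enumerate(ids):
        if tok == mask_id:
            out[pos] = next_token(out, pos)
    return out


def toy_nll(log_probs: Sequence[Sequence[float]], labels: Sequence[int],
            reduction: str = "mean") -> float:
    """NLL over positions whose label is not IGNORE_INDEX.

    log_probs[i][v] is the log-probability of token v at position i.
    """
    losses = [-log_probs[i][y] for i, y in enumerate(labels) if y != IGNORE_INDEX]
    if reduction == "sum":
        return sum(losses)
    if reduction == "mean":
        return sum(losses) / len(losses) if losses else 0.0
    raise ValueError(f"unsupported reduction: {reduction!r}")


# Token 0 plays the mask; the stub predictor returns the left neighbour + 1.
filled = toy_infill([5, 0, 0, 9], mask_id=0,
                    next_token=lambda cur, pos: cur[pos - 1] + 1)

# Two positions, two-token vocabulary; only position 1 is scored.
lp = [[math.log(0.5), math.log(0.5)], [math.log(0.25), math.log(0.75)]]
nll = toy_nll(lp, labels=[IGNORE_INDEX, 1])
```

Because the second mask is filled after the first, its prediction can depend on the token just written to its left, which is the practical meaning of "left to right" here.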

Citation

@article{israel2025maria,
  title  = {Enabling Autoregressive Models to Fill In Masked Tokens},
  author = {Israel, Daniel and Grover, Aditya and Van den Broeck, Guy},
  journal= {arXiv preprint arXiv:2502.06901},
  year   = {2025}
}

License

Apache 2.0, inherited from the underlying allenai/OLMo-7B-0724-hf and answerdotai/ModernBERT-large backbones.
