opt-mlops-merged

facebook/opt-125m fine-tuned with a LoRA adapter (atulkrs/opt-mlops-lora) and fully merged into base weights via PeftModel.merge_and_unload().

The adapter deltas are baked in, so no PEFT dependency is needed at inference time.

Load (full precision)

from transformers import AutoModelForCausalLM, AutoTokenizer

model     = AutoModelForCausalLM.from_pretrained("atulkrs/opt-mlops-merged")
tokenizer = AutoTokenizer.from_pretrained("atulkrs/opt-mlops-merged")

Load in 4-bit with BitsAndBytes (recommended for GPU inference)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",           # NormalFloat4, from the QLoRA paper
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "atulkrs/opt-mlops-merged",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("atulkrs/opt-mlops-merged")

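Tip: on Ampere or newer GPUs (A100, RTX 30xx and later), swap torch.float16 for torch.bfloat16. bfloat16 keeps the float32 exponent range, so it is less prone to overflow, and on these GPUs it typically costs nothing in speed.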
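As a rough sketch of how that choice could be automated, the helper below maps a CUDA compute capability to a dtype name. pick_compute_dtype is a hypothetical name, not part of transformers or bitsandbytes, and the rule assumes that Ampere corresponds to compute capability 8.0 and that older cards should stay on float16.

```python
def pick_compute_dtype(capability):
    """Return a torch dtype name for a (major, minor) CUDA compute capability.

    Hypothetical helper: Ampere GPUs report major version 8, so 8+ gets bfloat16.
    """
    major, _minor = capability
    return "bfloat16" if major >= 8 else "float16"


# Example capabilities: (8, 0) is an A100, (7, 5) is a Turing card such as a T4.
for cap in [(8, 0), (8, 6), (7, 5)]:
    print(cap, "->", pick_compute_dtype(cap))
```

With PyTorch installed, torch.cuda.get_device_capability() returns such a tuple, and getattr(torch, name) turns the returned name into the dtype to pass as bnb_4bit_compute_dtype.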

Size & load-time benchmark

| Format           | Size     | Notes                         |
|------------------|----------|-------------------------------|
| FP32 (merged)    | 477.8 MB | measured via param_size_mb()  |
| 4-bit NF4 (est.) | 59.7 MB  | approx. FP32 / 8              |
| Reduction        | ~8x      |                               |
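The figures above can be cross-checked with simple arithmetic. The sketch below assumes a parameter count of 125,239,296 for opt-125m (worked out from the published OPT-125M configuration with tied input and output embeddings; this card does not state it) and that MB here means 2**20 bytes. estimate_size_mb is a hypothetical helper, not the param_size_mb() used for the measurement.

```python
ASSUMED_NUM_PARAMS = 125_239_296  # assumption: opt-125m parameter count, tied embeddings


def estimate_size_mb(num_params, bits_per_param):
    """Estimate weight storage in MB (2**20 bytes) at a given precision."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 2**20


fp32_mb = estimate_size_mb(ASSUMED_NUM_PARAMS, 32)
# Idealized 4-bit figure: ignores quantization metadata and unquantized layers.
nf4_mb = estimate_size_mb(ASSUMED_NUM_PARAMS, 4)

print(f"FP32:  {fp32_mb:.1f} MB")
print(f"4-bit: {nf4_mb:.1f} MB")
print(f"Reduction: {fp32_mb / nf4_mb:.0f}x")
```

The 4-bit number is a lower bound rather than a measurement: a real bitsandbytes load also stores quantization metadata and may keep some layers in higher precision, so the actual footprint is likely somewhat larger.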

The 4-bit load-time benchmark requires Linux, CUDA, and bitsandbytes; the estimated load time on GPU is typically 2 to 5 s for a 125M-parameter model.
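For anyone who wants to measure the load time on their own hardware, a minimal timing harness might look like the following. time_load and fake_load are hypothetical names; fake_load is only a stand-in so the sketch runs without a GPU.

```python
import time


def time_load(load_fn, repeats=3):
    """Call load_fn `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        load_fn()
        durations.append(time.perf_counter() - start)
    return durations


def fake_load():
    # Stand-in for a real loader so the sketch runs without a GPU.
    time.sleep(0.01)


timings = time_load(fake_load)
print(f"best of {len(timings)}: {min(timings):.3f}s")
```

To time the real model, replace fake_load with a function that wraps the AutoModelForCausalLM.from_pretrained(...) call from the 4-bit section. The first run may include download time, so later repeats are usually more representative.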

Merge details

| Field        | Value                        |
|--------------|------------------------------|
| Base model   | facebook/opt-125m            |
| Adapter      | atulkrs/opt-mlops-lora       |
| Merge method | PeftModel.merge_and_unload() |
| Saved format | PyTorch bin (fp32)           |

Why merge?

Merging removes the adapter overhead entirely: no extra matrix multiplications at inference, no PEFT dependency, and the weights load like any standard transformers checkpoint. The only trade-off is that you can no longer swap adapters without re-loading the base model.
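The equivalence behind this can be checked on a toy layer. The sketch below assumes the standard LoRA formulation, where the adapted output is W x + (alpha / r) * B A x, and shows that folding the scaled product B A into W gives the same result with a single matrix multiplication. All matrices and values are made up for illustration; this is not the PEFT implementation.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matvec(m, v):
    """Multiply a matrix by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


# Toy layer: 3 outputs, 2 inputs, LoRA rank 1 (all values are illustrative).
W = [[0.5, -1.0], [2.0, 0.25], [-0.75, 1.5]]  # base weight, shape (3, 2)
B = [[1.0], [0.5], [-2.0]]                    # LoRA up-projection, shape (3, 1)
A = [[0.25, -0.5]]                            # LoRA down-projection, shape (1, 2)
alpha, r = 2.0, 1
scale = alpha / r
x = [1.0, 2.0]

# Unmerged: base output plus the scaled adapter path (two extra multiplications).
base_out = matvec(W, x)
adapter_out = matvec(B, matvec(A, x))
unmerged = [b + scale * a for b, a in zip(base_out, adapter_out)]

# Merged: fold the scaled delta into W once, then one multiplication at inference.
delta = matmul(B, A)
W_merged = [
    [w + scale * d for w, d in zip(w_row, d_row)]
    for w_row, d_row in zip(W, delta)
]
merged = matvec(W_merged, x)

print("unmerged:", unmerged)
print("merged:  ", merged)
```

Once W_merged is computed, A and B are no longer needed, which is why the merged checkpoint loads without PEFT and why a different adapter cannot be applied without going back to the original base weights.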
