opt-mlops-merged
facebook/opt-125m fine-tuned with a LoRA adapter (atulkrs/opt-mlops-lora) and fully merged into the base weights via PeftModel.merge_and_unload().
The adapter deltas are baked in, so no PEFT dependency is needed at inference time.
Load (full precision)
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("atulkrs/opt-mlops-merged")
tokenizer = AutoTokenizer.from_pretrained("atulkrs/opt-mlops-merged")
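Once the model and tokenizer are loaded, generation follows the usual transformers pattern. The helper below is a minimal sketch and not part of the original card: the name `generate_text`, the `max_new_tokens` default, and the choice of greedy decoding are illustrative assumptions.

```python
def generate_text(prompt: str, max_new_tokens: int = 40) -> str:
    """Greedy-decode a continuation of `prompt` with the merged checkpoint."""
    # Imported inside the function so that defining the helper does not
    # itself require torch/transformers or trigger a download.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = "atulkrs/opt-mlops-merged"
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,  # greedy decoding keeps the output deterministic
        )
    # The decoded string contains the prompt followed by the continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("some prompt")` downloads the checkpoint on first use and returns the prompt plus its continuation as a single string.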
Load in 4-bit with BitsAndBytes (recommended for GPU inference)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",  # NormalFloat4, from the QLoRA paper
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "atulkrs/opt-mlops-merged",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("atulkrs/opt-mlops-merged")
Tip: swap torch.float16 for torch.bfloat16 on Ampere+ GPUs (A100, RTX 30xx+) for better numerical stability at no speed cost.
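The compute dtype can also be chosen at runtime instead of being hard-coded. The following is a small sketch under the assumption that a CUDA build of PyTorch is in use; the helper name `pick_compute_dtype` is hypothetical.

```python
def pick_compute_dtype():
    """Return torch.bfloat16 when the current GPU supports it, else torch.float16."""
    import torch  # imported lazily so the helper can be defined without torch installed

    # is_bf16_supported() is only queried when a CUDA device is actually present.
    if torch.cuda.is_available() and torch.cuda.is_bf16_supported():
        return torch.bfloat16
    return torch.float16
```

The result can be passed directly as `bnb_4bit_compute_dtype=pick_compute_dtype()` when building the `BitsAndBytesConfig`.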
Size & load-time benchmark
| Format | Size | Notes |
|---|---|---|
| FP32 (merged) | 477.8 MB | measured via param_size_mb() |
| 4-bit NF4 (est.) | 59.7 MB | approx fp32 / 8 |
| Reduction | ~8x | |
The 4-bit load-time benchmark requires Linux + CUDA + bitsandbytes; estimated load time on GPU is typically 2-5 s for a 125M-parameter model.
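The size figures in the table can be reproduced with simple arithmetic. This sketch assumes a parameter count of 125,239,296 for facebook/opt-125m (not stated in this card) and divides by 1024², which yields the table's 477.8 figure. The helper `size_mb` is hypothetical and is not the `param_size_mb()` function mentioned above.

```python
def size_mb(num_params: int, bits_per_param: float) -> float:
    """Size in MiB of `num_params` weights stored at `bits_per_param` bits each."""
    return num_params * bits_per_param / 8 / 1024**2


# Assumed parameter count for facebook/opt-125m (not stated in this card).
OPT_125M_PARAMS = 125_239_296

fp32_mb = size_mb(OPT_125M_PARAMS, 32)
nf4_mb = size_mb(OPT_125M_PARAMS, 4)  # idealized: every weight at exactly 4 bits

print(f"FP32:      {fp32_mb:.1f} MB")
print(f"4-bit NF4: {nf4_mb:.1f} MB (idealized)")
print(f"Reduction: {fp32_mb / nf4_mb:.0f}x")
```

The 4-bit number is a lower bound: it ignores quantization constants and any layers that stay in higher precision (such as embeddings), so the real footprint will likely be somewhat larger.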
Merge details
| Field | Value |
|---|---|
| Base model | facebook/opt-125m |
| Adapter | atulkrs/opt-mlops-lora |
| Merge method | PeftModel.merge_and_unload() |
| Saved format | PyTorch bin (fp32) |
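For readers who want to reproduce a merge like this, the steps in the table can be sketched as follows. This is an assumed reconstruction, not the author's actual script: the function name `merge_lora_adapter`, the output directory, and the exact save call are illustrative.

```python
def merge_lora_adapter(output_dir: str = "opt-mlops-merged") -> None:
    """Fold the LoRA adapter into the base weights and save a plain checkpoint."""
    # Lazy imports: peft and transformers are only needed when the merge runs.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
    model = PeftModel.from_pretrained(base, "atulkrs/opt-mlops-lora")

    # merge_and_unload() adds the adapter deltas to the base weights and
    # returns the underlying transformers model without PEFT wrappers.
    merged = model.merge_and_unload()

    # safe_serialization=False writes PyTorch .bin weights, matching the
    # "Saved format" row; the author's exact save call is not shown in the card.
    merged.save_pretrained(output_dir, safe_serialization=False)
    AutoTokenizer.from_pretrained("facebook/opt-125m").save_pretrained(output_dir)
```

The saved directory can then be loaded with `AutoModelForCausalLM.from_pretrained(output_dir)` without importing peft.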
Why merge?
Merging removes the adapter overhead entirely: no extra matrix multiplications at
inference, no PEFT dependency, and the weights load like any standard
transformers checkpoint. The only trade-off is that you can no longer swap
adapters without re-loading the base model.
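The claim about extra matrix multiplications can be illustrated on a single linear layer. The NumPy sketch below is a toy illustration of the standard LoRA update W + (alpha / r) · B · A, not the PEFT implementation; the dimensions, rank `r`, and `alpha` are made-up values.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 6, 8, 2, 16  # toy sizes; r and alpha are illustrative

W = rng.normal(size=(d_out, d_in))   # frozen base weight
A = rng.normal(size=(r, d_in))       # LoRA down-projection
B = rng.normal(size=(d_out, r))      # LoRA up-projection
x = rng.normal(size=(d_in,))         # one input vector

scale = alpha / r

# Unmerged: base matmul plus two extra adapter matmuls on every forward pass.
y_adapter = W @ x + scale * (B @ (A @ x))

# Merged: fold the low-rank update into W once, then a single matmul suffices.
W_merged = W + scale * (B @ A)
y_merged = W_merged @ x

print(np.allclose(y_adapter, y_merged))
```

The two outputs agree to floating-point tolerance, and `W_merged` has the same shape as `W`, which is why a merged checkpoint loads like any ordinary one.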