MiniCPM5-1B-Hindi-Instruct

A Hindi instruction-tuned variant of openbmb/MiniCPM5-1B, fine-tuned for Hindi (हिंदी) conversational and instruction-following tasks.

Part of the 🇮🇳 Hindi LLM Series by @pankajpandey-dev.

Model Details

  • Base model: openbmb/MiniCPM5-1B (1.1B parameters)
  • Language: Hindi (हिंदी), with English understanding retained from the base
  • Fine-tuning method: LoRA (r=32, alpha=64) merged into base weights
  • Training framework: Unsloth + TRL
  • License: Apache 2.0
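As a rough illustration of what "merged into base weights" means, the sketch below applies the usual LoRA merge rule, W' = W + (alpha / r) · B · A, to a tiny toy matrix in plain Python. Only r=32 and alpha=64 come from this card; the matrices, their shapes, and the `merge_lora` helper are hypothetical, and the sketch assumes the standard alpha / r scaling rather than a rank-stabilized variant.

```python
# Toy illustration of merging a LoRA update into a base weight matrix.
# Only r=32 and alpha=64 come from the model card; the matrices are made up.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(w, lora_b, lora_a, scaling):
    """Return W + scaling * (B @ A), the standard LoRA merge rule."""
    delta = matmul(lora_b, lora_a)
    return [
        [w_ij + scaling * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]

LORA_RANK = 32
LORA_ALPHA = 64
SCALING = LORA_ALPHA / LORA_RANK  # 2.0 (assumes standard alpha / r scaling)

# A 2x2 base weight with a rank-1 update (toy sizes, not the real layer shapes).
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[0.5], [0.25]]  # shape (2, 1)
A = [[1.0, 2.0]]     # shape (1, 2)

merged = merge_lora(W, B, A, SCALING)
print(merged)
```

After merging, the adapter adds no extra layers at inference time, which is why the merged checkpoint loads like an ordinary model.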

Training Data

Fine-tuned on 4,000 high-quality Hindi instruction examples sampled from the AI4Bharat indic-instruct-data dataset.

All examples ≤ 2048 tokens, formatted with the MiniCPM5 ChatML template.
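The formatting and length filter described above might look roughly like the sketch below. It assumes the common ChatML markers (`<|im_start|>` and `<|im_end|>`); the exact MiniCPM5 template may differ. The `to_chatml`, `count_tokens`, and `keep_example` helpers are hypothetical, and `count_tokens` is a crude whitespace stand-in for the real tokenizer.

```python
# Sketch of ChatML-style formatting plus a length filter.
# Assumptions: the <|im_start|>/<|im_end|> markers follow the common ChatML
# convention, and count_tokens is a whitespace stand-in for the real tokenizer.

MAX_TOKENS = 2048  # limit stated in the model card

def to_chatml(messages):
    """Render a list of {"role", "content"} dicts as a ChatML-style string."""
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )

def count_tokens(text):
    """Placeholder token counter; a real pipeline would use the model tokenizer."""
    return len(text.split())

def keep_example(messages, max_tokens=MAX_TOKENS):
    """Keep an example only if its formatted length fits within the limit."""
    return count_tokens(to_chatml(messages)) <= max_tokens

example = [
    {"role": "user", "content": "मशीन लर्निंग क्या है?"},
    {"role": "assistant", "content": "मशीन लर्निंग डेटा से सीखने की एक विधि है।"},
]
formatted = to_chatml(example)
print(formatted)
print(keep_example(example))
```

In practice the tokenizer's own `apply_chat_template` is the safer way to produce the prompt, since it carries the template the model was trained with.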

Training Configuration

| Hyperparameter | Value |
|---|---|
| LoRA rank | 32 |
| LoRA alpha | 64 |
| LoRA dropout | 0.0 |
| Target modules | q, k, v, o, gate, up, down |
| Batch size (effective) | 16 |
| Learning rate | 2e-4 |
| LR scheduler | cosine |
| Warmup steps | 15 |
| Epochs | 2 |
| Total steps | 500 |
| Precision | fp16 (4-bit base) |
| Hardware | NVIDIA Tesla T4 (Colab) |
| Training time | ~60 minutes |
| Final training loss | 1.108 |
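The step count follows from the other numbers: 4,000 examples at an effective batch size of 16 give 250 steps per epoch, so 2 epochs give 500 steps. The sketch below reproduces that arithmetic and traces a cosine schedule with linear warmup. The `lr_at` helper is hypothetical, and it assumes the rate decays to zero at the final step, which is a common default but is not stated in this card.

```python
import math

NUM_EXAMPLES = 4000        # from the Training Data section
EFFECTIVE_BATCH_SIZE = 16
EPOCHS = 2
WARMUP_STEPS = 15
PEAK_LR = 2e-4

steps_per_epoch = math.ceil(NUM_EXAMPLES / EFFECTIVE_BATCH_SIZE)  # 250
total_steps = steps_per_epoch * EPOCHS                            # 500

def lr_at(step):
    """Linear warmup to PEAK_LR, then cosine decay to zero (assumed shape)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (total_steps - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print(steps_per_epoch, total_steps)
for step in (0, 15, 250, 500):
    print(step, lr_at(step))
```

With these values the warmup covers 15 of 500 steps, or 3% of training.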

Usage

With Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "pankajpandey-dev/MiniCPM5-1B-Hindi-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)

# Prompt in English: "Hello! Write a short poem about a rainy day."
messages = [
    {"role": "user", "content": "नमस्ते! बारिश के दिन पर एक छोटी कविता लिखो।"}
]

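# return_dict=True returns input_ids together with attention_mask,
# so generate() receives both regardless of the library's default return type.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.1,
)

# Decode only the newly generated tokens, skipping the prompt.
prompt_length = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True))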

Recommended Generation Parameters

  • temperature: 0.7 (lower = more focused, higher = more creative)
  • top_p: 0.9
  • repetition_penalty: 1.1
  • max_new_tokens: 256–512 depending on task
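To give a feel for what these settings do, here is a toy, pure-Python sketch of the three sampling controls applied to a made-up four-token vocabulary. It follows the usual definitions of repetition penalty, temperature, and top-p (nucleus) filtering; it is not the Transformers implementation, and the logits and helper names are hypothetical.

```python
import math

def softmax(logits):
    """Convert logits to probabilities, subtracting the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def apply_repetition_penalty(logits, seen_ids, penalty):
    """Make already-generated tokens less likely (usual rule: divide positive
    logits by the penalty, multiply negative ones)."""
    out = list(logits)
    for i in set(seen_ids):
        out[i] = out[i] / penalty if out[i] > 0 else out[i] * penalty
    return out

def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

logits = [2.0, 1.0, 0.5, -1.0]  # toy vocabulary of 4 tokens
penalized = apply_repetition_penalty(logits, seen_ids=[0], penalty=1.1)
probs = softmax([x / 0.7 for x in penalized])  # temperature = 0.7
nucleus = top_p_filter(probs, top_p=0.9)
print(nucleus)
```

In this toy case the least likely token falls outside the nucleus and can never be sampled, while the repeated token 0 keeps its lead but with a slightly lower logit.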

LoRA Adapter Only

If you prefer to load the LoRA adapter on top of the base model (~85 MB vs 2.2 GB), it's available in the lora_adapter/ folder of this repo:

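from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach the adapter stored in the lora_adapter/ subfolder.
base = AutoModelForCausalLM.from_pretrained("openbmb/MiniCPM5-1B", trust_remote_code=True)
model = PeftModel.from_pretrained(base, "pankajpandey-dev/MiniCPM5-1B-Hindi-Instruct", subfolder="lora_adapter")

# Use the tokenizer from this repo, as in the Transformers example above.
tokenizer = AutoTokenizer.from_pretrained("pankajpandey-dev/MiniCPM5-1B-Hindi-Instruct", trust_remote_code=True)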

Example Outputs

Prompt: बारिश के दिन पर एक छोटी कविता लिखो। ("Write a short poem about a rainy day.")
Response: (creative Hindi poetry generation)

Prompt: मशीन लर्निंग क्या है? सरल हिंदी में समझाइए। ("What is machine learning? Explain in simple Hindi.")
Response: (simplified Hindi explanation of ML)

Prompt: नमस्ते! अपना परिचय दीजिए। ("Hello! Please introduce yourself.")
Response: (conversational Hindi self-introduction)

Quantized Versions (GGUF)

GGUF quantizations are intended for running the model locally with llama.cpp, Ollama, LM Studio, or other GGUF-compatible inference engines.

Acknowledgements

  • OpenBMB for the MiniCPM5-1B base model
  • AI4Bharat (IIT Madras) for the indic-instruct-data dataset
  • Unsloth for the training framework

Citation

If you use this model in your work, please cite:

@misc{pandey2026minicpm5hindi,
  title  = {MiniCPM5-1B-Hindi-Instruct},
  author = {Pankaj Pandey},
  year   = {2026},
  url    = {https://huggingface.co/pankajpandey-dev/MiniCPM5-1B-Hindi-Instruct}
}

Part of an ongoing effort to bring strong open-source LLMs to Indian languages. Feedback and contributions welcome via the community tab.
