NLLB Bishnupriya Manipuri V8.4

LoRA fine-tune of facebook/nllb-200-distilled-600M for English → Bishnupriya Manipuri.

Status: Production. Outputs Bishnupriya Manipuri (BPY), not Assamese or Bengali.

Training: 2,558 sentence pairs, 400 of them weighted for core vocabulary. Validation loss: ~0.85.

Quick start

import torch
from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

BASE_ID = "facebook/nllb-200-distilled-600M"
ADAPTER_ID = "Emarthar/nllb-bpy-beng-v8_4"

# Load the base NLLB model, then attach the LoRA adapter on top of it.
base = AutoModelForSeq2SeqLM.from_pretrained(BASE_ID)
model = PeftModel.from_pretrained(base, ADAPTER_ID)
tokenizer = AutoTokenizer.from_pretrained(ADAPTER_ID)
model.eval()  # inference mode: disables dropout

def translate(text):
    """Translate one English sentence to Bishnupriya Manipuri."""
    tokenizer.src_lang = "eng_Latn"  # source language tag: English
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model.generate(
            **inputs,
            # The asm_Beng tag is forced as the target-language token;
            # NLLB-200 does not appear to ship a dedicated BPY code.
            forced_bos_token_id=tokenizer.convert_tokens_to_ids("asm_Beng"),
            max_new_tokens=64,
            num_beams=5,
        )
    return tokenizer.batch_decode(out, skip_special_tokens=True)[0]

print(translate("Water is important"))  # পানীহান দরকারি
print(translate("The sky is blue"))     # হাগহান নীলুৱাহান
print(translate("My name is Arunita"))  # মর নাংহান অরুনিতা
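Sanity check (sketch)

The helper below is not part of the model or its training pipeline. It is a minimal sketch of a cheap check you might run on the output of translate(): it confirms that every letter in a string falls in the Bengali-Assamese Unicode block (U+0980 to U+09FF), which flags output that fell back to Latin script. The name is_bengali_script is hypothetical. The check looks only at script, so it cannot tell Bishnupriya Manipuri apart from Assamese or Bengali, which share the same block.

```python
import unicodedata

# Bengali-Assamese Unicode block, U+0980 through U+09FF.
BENGALI_BLOCK = range(0x0980, 0x0A00)
# Danda and double danda sit in the Devanagari block but are used as
# sentence punctuation in Bengali-script text, so they are allowed.
DANDAS = {"\u0964", "\u0965"}

def is_bengali_script(text: str) -> bool:
    """Return True if text has at least one letter and every letter or
    combining mark lies in the Bengali-Assamese block.

    Hypothetical helper: a script check only, not language identification.
    """
    seen_letter = False
    for ch in text:
        if ch in DANDAS:
            continue
        # Categories starting with L are letters, M are combining marks.
        if unicodedata.category(ch)[0] in ("L", "M"):
            if ord(ch) not in BENGALI_BLOCK:
                return False
            seen_letter = True
    return seen_letter

print(is_bengali_script("পানীহান দরকারি"))      # True
print(is_bengali_script("Water is important"))  # False
```

Spaces, digits, and other punctuation are ignored, and an empty string returns False. Telling BPY apart from Assamese or Bengali would need a language-identification step or review by a speaker.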
