LocalTranslate EN -> RO

This is an English -> Romanian translation model trained by Mihai Popa (with the help of Kimi K2.6 Thinking)! It's based on BART, and at roughly 24M parameters it's a lot smaller than most other translation models.

Why?

Because it's for my own upcoming LocalTranslate project! Unlike Google Translate, LocalTranslate uses models (LTR files in the future) that you download and store on your device, so translation happens locally!

Notes

  • Trained in just 22 minutes on a Colab T4 GPU. The model has 24.2M parameters (stored as F32) and is 92 MB in size!
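The size follows from the parameter count. A quick check, assuming about 24.2M parameters (the count shown on the model page) stored as 4-byte float32 values, and ignoring file metadata:

```python
# Rough size check: parameter count times bytes per parameter.
num_params = 24_200_000      # approximate count, as shown on the model page
bytes_per_param = 4          # float32 weights

size_bytes = num_params * bytes_per_param
size_mb = size_bytes / 1_000_000    # decimal megabytes
size_mib = size_bytes / 1024**2     # binary mebibytes

print(f"{size_mb:.1f} MB, {size_mib:.1f} MiB")   # 96.8 MB, 92.3 MiB
```

So the 92 MB in the note is most likely a binary (MiB) figure.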

Model Configurations

Parameter                 Value
Tokenizer                 BPE
Vocabulary Size           16384 tokens
Batch Size                128 x 1 = 128
Context Window            128 tokens
max_position_embeddings   128
encoder_layers            6
decoder_layers            6
encoder_attention_heads   6
decoder_attention_heads   6
encoder_ffn_dim           768
decoder_ffn_dim           768
d_model                   384
dropout                   0.1
attention_dropout         0.1
activation_function       "gelu"
init_std                  0.02
scale_embedding           True
normalize_before          True
add_final_layer_norm      True
pad_token_id              tokenizer.pad_token_id
bos_token_id              tokenizer.bos_token_id
eos_token_id              tokenizer.eos_token_id
decoder_start_token_id    tokenizer.eos_token_id
forced_eos_token_id       tokenizer.eos_token_id
tie_word_embeddings       True
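
The values above are enough to estimate the parameter count by hand. This is a back-of-envelope sketch, not the training code. It assumes the stock Hugging Face BART layout (linear layers with biases, learned position embeddings with an offset of 2, a LayerNorm after the embeddings, token embeddings shared between encoder, decoder and output layer). Any extra final layer norms would add only about 1.5k parameters.

```python
# Back-of-envelope parameter count for the configuration above.
# Assumption: stock Hugging Face BART layer layout (see the text above).
d_model, ffn_dim, vocab, max_pos = 384, 768, 16384, 128
enc_layers, dec_layers = 6, 6

def linear(n_in, n_out):
    return n_in * n_out + n_out            # weight matrix + bias

layer_norm = 2 * d_model                   # scale + shift
attention = 4 * linear(d_model, d_model)   # q, k, v and output projections
ffn = linear(d_model, ffn_dim) + linear(ffn_dim, d_model)

encoder_layer = attention + ffn + 2 * layer_norm
decoder_layer = 2 * attention + ffn + 3 * layer_norm   # self- and cross-attention

token_embeddings = vocab * d_model         # counted once (tie_word_embeddings)
positions = 2 * (max_pos + 2) * d_model    # encoder + decoder position tables
embedding_norms = 2 * layer_norm

total = (token_embeddings + positions + embedding_norms
         + enc_layers * encoder_layer + dec_layers * decoder_layer)
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")   # 24,152,064 parameters (~24.2M)
```

Under these assumptions, roughly a quarter of the total (about 6.3M) is the shared token embedding matrix.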

Training Configurations

Hyperparameter           Value
output_dir               "./localtranslate_en_ro"
do_train                 True
do_eval                  True
eval_strategy            "steps"
eval_steps               500
learning_rate            3e-4
weight_decay             0.01
max_steps                6000
warmup_steps             1000
logging_steps            50
save_steps               500
save_total_limit         5000
fp16                     True
gradient_checkpointing   True
label_smoothing_factor   0.1
predict_with_generate    False
report_to                "none"
dataloader_num_workers   2
remove_unused_columns    False
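
To put these numbers in perspective, here is a small sketch of the training budget. It makes two assumptions that this page does not confirm. The first is that the "Total" figure from the data table (343488 pairs) is the training set size. The second is that the learning rate follows the Trainer's default linear schedule, since no scheduler is listed above. The helper `lr_at` is hypothetical and exists only for illustration.

```python
# Training-budget arithmetic from the tables on this page.
batch_size, max_steps, warmup_steps, peak_lr = 128, 6000, 1000, 3e-4
total_pairs = 343_488                      # "Total" row of the data table (assumed training size)

samples_seen = batch_size * max_steps
epochs = samples_seen / total_pairs

def lr_at(step):
    """Linear warmup, then linear decay to zero (assumed default schedule)."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * max(0.0, (max_steps - step) / (max_steps - warmup_steps))

print(f"{samples_seen:,} pairs seen, about {epochs:.2f} epochs")
for step in (500, 1000, 3500, 6000):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the model sees each training pair a little more than twice.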

Limitations

  • Not Perfect: Like any model, it can produce incorrect translations, and it scores much lower on Flores 200 than on its own dev split (BLEU 11.31 vs. 33.74 with greedy search)!
  • One Direction Only: It translates English -> Romanian, NOT the other way around!

Evaluation Results

Evaluation setting                  BLEU    chrF++
Dev split, greedy search            33.74   61.35
Dev split, beam search (2 beams)    34.67   62.09
Dev split, beam search (3 beams)    34.83   62.29
Flores 200, greedy search           11.31   39.49
Flores 200, beam search (3 beams)   12.35   40.70

Usage

The code below was written by Gemini 3 Flash / Kimi K2.6 Thinking, with a few small modifications of my own:

from transformers import BartForConditionalGeneration, AutoTokenizer
import torch

# 1. Load the model and tokenizer from the Hugging Face Hub
model_id = "MihaiPopa-1/LocalTranslate_EN_RO"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = BartForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float32, # Standard for CPU
    device_map="cpu"           # Forces CPU usage
)

# 2. Translate to Romanian
inputs = tokenizer("At your current usage level, this runtime may last up to 1 hour.", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=64,
    num_beams=2,
    early_stopping=True,
    forced_eos_token_id=tokenizer.eos_token_id,
)
translation = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(translation)
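
Since the context window is 128 tokens, longer texts need to be split before they are translated. The following is only a sketch of one way to do that. `chunk_sentences` and `count_tokens` are hypothetical names, not part of this repo, and the sentence splitting is a naive regex on end punctuation.

```python
import re
from typing import Callable, List

def chunk_sentences(text: str,
                    count_tokens: Callable[[str], int],
                    max_tokens: int = 128) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_tokens tokens.

    A single sentence that is longer than max_tokens is kept as its own
    chunk (it would still be truncated by the tokenizer or model).
    """
    # Naive split: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and count_tokens(candidate) > max_tokens:
            chunks.append(current)      # current chunk is full, start a new one
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

# Stand-in counter for a quick demo: whitespace-separated words.
# With the real tokenizer you could pass:
#   count_tokens=lambda s: len(tokenizer(s)["input_ids"])
demo = chunk_sentences("One two. Three four five. Six.",
                       count_tokens=lambda s: len(s.split()),
                       max_tokens=5)
print(demo)   # ['One two. Three four five.', 'Six.']
```

Each chunk can then be passed through the tokenizer and `model.generate` as shown above. Europarl and Tatoeba are sentence-aligned corpora, so a small `max_tokens` that keeps chunks close to one sentence may match the training data better.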

Data Used

Dataset      Translation Pairs
Europarl 7   ~399k raw -> ~340k cleaned
Tatoeba+     16k
Total        343488
Dev Split    2000