Tara10M SFT v1 2K

Tara10M SFT v1 2K is a small, 10.4M-parameter Burmese-English causal language model exported in a Hugging Face Llama-compatible format.

This is a learning/school-project model. It is not a production assistant.

Model Details

  • Architecture: Llama-style decoder-only causal LM
  • Parameters: 10,390,784
  • Layers: 6
  • Hidden size: 256
  • Attention heads: 4
  • Vocabulary: 16,000 SentencePiece tokens
  • Context length: 1,024 tokens
  • Base checkpoint: Tara10M Colab base checkpoint
  • SFT data: 2,000 cleaned synthetic Burmese-English instruction examples
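
The parameter count above can be sanity-checked from the listed dimensions. The sketch below assumes a standard Llama block: four square attention projections (no grouped-query attention), a gated MLP with three weight matrices, two RMSNorm weights per layer, one final norm, and no biases. The MLP intermediate size of 1,024 and the tied input/output embeddings are assumptions that this card does not state (config.json is the authoritative source); under them the total matches the reported 10,390,784.

```python
def llama_param_count(vocab_size, hidden_size, num_layers,
                      intermediate_size, tie_embeddings=True):
    """Count the weights of a Llama-style decoder with no bias terms."""
    embed = vocab_size * hidden_size              # token embedding table
    attn = 4 * hidden_size * hidden_size          # q, k, v and o projections
    mlp = 3 * hidden_size * intermediate_size     # gate, up and down projections
    norms = 2 * hidden_size                       # two RMSNorm weights per layer
    per_layer = attn + mlp + norms
    final_norm = hidden_size
    # A tied output head reuses the embedding table, so it adds nothing.
    lm_head = 0 if tie_embeddings else vocab_size * hidden_size
    return embed + num_layers * per_layer + final_norm + lm_head


# intermediate_size=1_024 and tie_embeddings=True are assumptions, not card values.
total = llama_param_count(16_000, 256, 6, intermediate_size=1_024)
print(total)
```

The number of attention heads does not appear in the formula because, without grouped-query attention, splitting the hidden size into heads does not change the size of the projection matrices.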

Intended Use

Best test areas:

  • short English to Burmese translation
  • short Burmese to English translation
  • simple Burmese rewrite
  • travel phrasebook style prompts

Prompt format:

Instruction: Translate this sentence to Burmese.
Input: I will go home tomorrow.
Response:
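
A small helper can keep prompts consistent with the format above. This is a minimal sketch: the function name `build_prompt` is hypothetical, and dropping the Input line when there is no input text, as well as ending the prompt directly after "Response:" with no trailing space, are assumptions about how the training data was laid out.

```python
def build_prompt(instruction: str, input_text: str = "") -> str:
    """Assemble an Instruction / Input / Response prompt as plain text."""
    lines = [f"Instruction: {instruction.strip()}"]
    # Assumption: prompts without input text simply omit the Input line.
    if input_text.strip():
        lines.append(f"Input: {input_text.strip()}")
    # The model is expected to continue right after this label.
    lines.append("Response:")
    return "\n".join(lines)


prompt = build_prompt(
    "Translate this sentence to Burmese.",
    "I will go home tomorrow.",
)
print(prompt)
```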

Limitations

This model is very small. It still repeats itself, drifts off topic, and produces incorrect translations, so it should be used for experimentation only.

Known issues:

  • weak factual reliability
  • repeated phrases
  • mixed Burmese/English output
  • poor long-form generation
  • not suitable for safety-critical use
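
Given these issues, short greedy generations with a repetition penalty are a reasonable starting point for experiments. The sketch below uses the Hugging Face `transformers` API and assumes the tokenizer files load through `AutoTokenizer`. The repository id `aungkomyint/tara10m-sft-v1-2k` is taken from the model's hosting page, the function name `generate_response` is hypothetical, and the decoding values are illustrative guesses rather than settings tuned or recommended by the author.

```python
# Illustrative decoding settings (assumptions, not tuned values).
GENERATION_KWARGS = {
    "max_new_tokens": 48,        # keep outputs short; long-form generation is poor
    "do_sample": False,          # greedy decoding for repeatable experiments
    "repetition_penalty": 1.2,   # discourage repeated tokens
    "no_repeat_ngram_size": 3,   # block exact 3-gram repeats
}


def generate_response(prompt, repo_id="aungkomyint/tara10m-sft-v1-2k"):
    """Generate a continuation for an already formatted prompt string."""
    # Imported lazily so the settings above can be inspected without
    # transformers and torch installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        **GENERATION_KWARGS,
    )
    # Drop the prompt tokens and decode only the newly generated part.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Calling `generate_response` downloads the weights on first use. Outputs should still be checked by a speaker of both languages, since the penalty reduces repetition but does not fix wrong translations.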

Training Summary

SFT examples: 2,000
Train examples: 1,800
Validation examples: 200
Max steps: 400
Learning rate: 2e-5
Best validation loss: 2.4228
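
For context, the figures above correspond to a 90/10 train/validation split, and the best validation loss converts to a perplexity of about 11.3. The conversion below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal LM training but is not stated in this card.

```python
import math

sft_examples = 2_000
train_examples = 1_800
val_examples = 200

# The two splits should account for every SFT example.
assert train_examples + val_examples == sft_examples
val_fraction = val_examples / sft_examples

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is exp(loss).
best_val_loss = 2.4228
perplexity = math.exp(best_val_loss)

print(f"validation fraction: {val_fraction:.0%}")
print(f"validation perplexity: {perplexity:.2f}")
```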

Files

  • model.safetensors
  • config.json
  • tokenizer.model
  • tokenizer_config.json
  • special_tokens_map.json
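
When working from a local copy, a quick check that all listed files are present can save a confusing load error. This is a minimal sketch; the helper name `missing_files` and the idea of a local model directory are illustrative assumptions.

```python
from pathlib import Path

# File names as listed in this card.
EXPECTED_FILES = [
    "model.safetensors",
    "config.json",
    "tokenizer.model",
    "tokenizer_config.json",
    "special_tokens_map.json",
]


def missing_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list from `missing_files("path/to/model")` means every file named above was found.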