F1-1B-TH-Base MLOps Test
This repository contains an early MLOps checkpoint for F1-1B-TH-Base, a Thai decoder-only causal language model trained from scratch with a Qwen/Llama-style architecture.
Important
This is an infrastructure checkpoint, not a production-quality language model.
- Parameters: 1,244,760,064
- Architecture: Qwen/Llama-style decoder-only Transformer
- Training type: from scratch, random initialization
- Pretrained external weights: none
- Intended use: MLOps loading, checkpointing, training continuation, smoke generation
- Expected output quality: poor/unstable because the checkpoint was trained on a tiny token budget
Architecture
- RMSNorm
- RoPE positional embedding
- grouped-query causal self-attention
- SwiGLU feedforward
- 24 layers
- hidden size 2048
- 16 attention heads
- 8 KV heads
- context length 512
- vocab size 32000
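The listed dimensions can be checked against the stated parameter count with a little arithmetic. The sketch below is only a sanity check, not code from this repository. It assumes three things the card does not state: a SwiGLU intermediate size of 5504, untied input and output embeddings, and no bias terms. These values were chosen because they reproduce 1,244,760,064 exactly; config.json remains the authoritative source.

```python
def count_parameters(
    vocab_size=32000,
    hidden_size=2048,
    num_layers=24,
    num_heads=16,
    num_kv_heads=8,
    intermediate_size=5504,  # assumption: not stated in the model card
    tie_embeddings=False,    # assumption: separate input and output embeddings
):
    """Estimate the parameter count of a bias-free Llama-style decoder."""
    head_dim = hidden_size // num_heads
    kv_dim = num_kv_heads * head_dim

    # Grouped-query attention: Q and O are hidden x hidden, K and V are hidden x kv_dim.
    attention = 2 * hidden_size * hidden_size + 2 * hidden_size * kv_dim
    # SwiGLU feedforward: gate, up, and down projections.
    feed_forward = 3 * hidden_size * intermediate_size
    # Two RMSNorm weight vectors per layer.
    norms = 2 * hidden_size
    per_layer = attention + feed_forward + norms

    embeddings = vocab_size * hidden_size * (1 if tie_embeddings else 2)
    final_norm = hidden_size
    return num_layers * per_layer + embeddings + final_norm


print(f"{count_parameters():,}")  # 1,244,760,064
```

With tied embeddings the remaining budget does not divide evenly across 24 layers, which is why the sketch assumes untied embeddings.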
Files
- f1_1b_th_base_final.pt: PyTorch checkpoint
- config.json: architecture and training metadata
- f1_model.py: minimal model definition for loading
- load_model.py: example loader
- tokenizer.model and tokenizer.vocab: optional tokenizer files, if uploaded
Load Example
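import torch
from f1_model import F1ForCausalLM, ModelConfig

# Build the model skeleton from the architecture metadata.
cfg = ModelConfig.from_json("config.json")
model = F1ForCausalLM(cfg)

# Load on CPU first; move the model to an accelerator afterwards if needed.
payload = torch.load("f1_1b_th_base_final.pt", map_location="cpu")
# The checkpoint may be a raw state dict or a dict that wraps it under "model".
state = payload["model"] if isinstance(payload, dict) and "model" in payload else payload
model.load_state_dict(state)
model.eval()  # inference mode for smoke generation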
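When validating checkpoint loading in a pipeline, a common failure is a key mismatch caused by wrapper prefixes: DistributedDataParallel prepends "module." and torch.compile prepends "_orig_mod." to parameter names. Whether this checkpoint carries such prefixes is an assumption, not something the repository states. The hypothetical helper below, `extract_state_dict`, applies the same unwrapping rule as the load example and strips those prefixes if present.

```python
from typing import Any, Mapping

# Prefixes added by common PyTorch wrappers. Their presence in this
# checkpoint is an assumption, not something the repository states.
WRAPPER_PREFIXES = ("module.", "_orig_mod.")


def extract_state_dict(payload: Any) -> dict:
    """Return a name -> tensor mapping from a loaded checkpoint payload."""
    # Same rule as the load example: prefer payload["model"] when it exists.
    if isinstance(payload, Mapping) and "model" in payload:
        state = payload["model"]
    else:
        state = payload

    cleaned = {}
    for name, value in state.items():
        # Strip repeatedly so nested wrappers ("_orig_mod.module.x") are handled.
        stripped = True
        while stripped:
            stripped = False
            for prefix in WRAPPER_PREFIXES:
                if name.startswith(prefix):
                    name = name[len(prefix):]
                    stripped = True
        cleaned[name] = value
    return cleaned
```

The result can be passed to `model.load_state_dict(...)` in place of `state`. Calling `load_state_dict` with `strict=False` returns the missing and unexpected keys, which is useful for logging in a smoke test.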
Status
This checkpoint is for MLOps validation only. Train or continue training on substantially more Thai data before using it for evaluation, demos, or downstream applications.