F1-1B-TH-Base MLOps Test

This repository contains an early MLOps checkpoint for F1-1B-TH-Base, a Thai decoder-only causal language model trained from scratch with a Qwen/Llama-style architecture.

Important

This is an infrastructure checkpoint, not a production-quality language model.

  • Parameters: 1,244,760,064
  • Architecture: Qwen/Llama-style decoder-only Transformer
  • Training type: from scratch, random initialization
  • Pretrained external weights: none
  • Intended use: MLOps validation of loading, checkpointing, training continuation, and smoke-test generation
  • Expected output quality: poor/unstable because the checkpoint was trained on a tiny token budget

Architecture

  • RMSNorm
  • RoPE (rotary positional embeddings)
  • grouped-query causal self-attention
  • SwiGLU feedforward
  • 24 layers
  • hidden size 2048
  • 16 attention heads
  • 8 KV heads
  • context length 512
  • vocab size 32000
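
The stated parameter count can be cross-checked against the hyperparameters above. The sketch below is only a consistency check, not taken from f1_model.py. It assumes three things this README does not state: a SwiGLU intermediate size of 5504, untied input and output embeddings, and no bias terms. Under those assumptions the arithmetic reproduces the 1,244,760,064 figure exactly.

```python
# Values stated in the Architecture list.
vocab_size = 32_000
hidden = 2_048
layers = 24
heads = 16
kv_heads = 8

# Assumptions (NOT stated in this README): SwiGLU intermediate size,
# untied input/output embeddings, and no bias terms anywhere.
intermediate = 5_504

head_dim = hidden // heads      # 128
kv_dim = kv_heads * head_dim    # 1024, width of the K and V projections

attn = 2 * hidden * hidden + 2 * hidden * kv_dim  # Q and O, plus K and V
mlp = 3 * hidden * intermediate                   # gate, up, and down projections
norms = 2 * hidden                                # two RMSNorm weights per layer
per_layer = attn + mlp + norms

embeddings = 2 * vocab_size * hidden  # token embedding + separate LM head
final_norm = hidden

total = embeddings + layers * per_layer + final_norm
print(f"{total:,}")  # 1,244,760,064
```

If config.json reports a different intermediate size or tied embeddings, the breakdown above does not apply as written and should be adjusted to match.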

Files

  • f1_1b_th_base_final.pt: PyTorch checkpoint
  • config.json: architecture and training metadata
  • f1_model.py: minimal model definition for loading
  • load_model.py: example loader
  • tokenizer.model and tokenizer.vocab: tokenizer files (optional; present only if uploaded)

Load Example

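import torch
from f1_model import F1ForCausalLM, ModelConfig

# Build the model from the architecture metadata shipped with the checkpoint.
cfg = ModelConfig.from_json("config.json")
model = F1ForCausalLM(cfg)

# Load on CPU first; move the model to an accelerator afterwards if needed.
payload = torch.load("f1_1b_th_base_final.pt", map_location="cpu")
# The file may hold a bare state dict or a dict that wraps it under "model".
state = payload["model"] if isinstance(payload, dict) and "model" in payload else payload
model.load_state_dict(state)
model.eval()  # inference mode: disables dropout and similar training-only behavior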
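
After loading, a quick smoke check can confirm that the parameter count and a single forward pass look sane. The following is a minimal sketch, not part of this repository: count_parameters, smoke_forward, and TinyLM are hypothetical names, and the sketch assumes that calling the model on a tensor of token ids returns a logits tensor of shape (batch, seq_len, vocab_size). If F1ForCausalLM returns a tuple or dict instead, the shape check needs adapting.

```python
import torch
from torch import nn


def count_parameters(model: nn.Module) -> int:
    """Total number of elements across all parameters of the module."""
    return sum(p.numel() for p in model.parameters())


@torch.no_grad()
def smoke_forward(model: nn.Module, vocab_size: int, seq_len: int = 8) -> torch.Tensor:
    """Run one forward pass on random token ids and return the logits.

    Assumption: model(input_ids) returns a tensor of shape
    (batch, seq_len, vocab_size).
    """
    input_ids = torch.randint(0, vocab_size, (1, seq_len))
    logits = model(input_ids)
    assert logits.shape == (1, seq_len, vocab_size), logits.shape
    assert torch.isfinite(logits).all(), "non-finite logits"
    return logits


# Stand-in model so the sketch runs on its own; swap in the loaded F1ForCausalLM.
class TinyLM(nn.Module):
    def __init__(self, vocab_size: int = 100, hidden: int = 16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.head = nn.Linear(hidden, vocab_size, bias=False)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        return self.head(self.embed(input_ids))


tiny = TinyLM()
n_params = count_parameters(tiny)             # 100*16 + 16*100 = 3200
logits = smoke_forward(tiny, vocab_size=100)  # shape (1, 8, 100)
```

For the real checkpoint, count_parameters(model) would be expected to equal 1,244,760,064, and seq_len should stay at or below the 512-token context length, with vocab_size set to 32000.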

Status

This checkpoint is for MLOps validation only. Train from scratch or continue training on substantially more Thai data before using it for evaluation, demos, or downstream applications.
