Barbet 1B Agent SFT TW Full Fine-Tune

This is a public full-weight fine-tune of OpenFormosa/barbet-1b-base on voidful/agent-sft. It does not use LoRA, QLoRA, adapter tuning, or quantized training. The released checkpoint is a cleaned export of the wave003a checkpoint-100 FSDP training artifact, so it loads through the normal inference path.

The tokenizer assets are from OpenFormosa/PangolinTokenizer. The model uses custom Barbet modeling code and should be loaded with trust_remote_code=True.

Training Summary

  • Base model: OpenFormosa/barbet-1b-base
  • Tokenizer: OpenFormosa/PangolinTokenizer
  • Dataset: voidful/agent-sft
  • Final selected checkpoint: wave003a checkpoint-100
  • Training method: full-parameter supervised fine-tuning
  • Main framework: Axolotl + Transformers + FSDP
  • Hardware used for main runs: one 8-GPU H200 Slurm node on the dev partition
  • Judge model for core evaluation: google/gemma-4-31B-it

The training set was filtered to examples that have at least one trainable assistant turn, end with an assistant turn, and stay within a bounded conversation length. The length bound removed very long outlier conversations that had stalled preprocessing before FSDP training. The exact manifests are included in data_filter_manifests/.
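
The three filter conditions above can be sketched as a single predicate. This is a minimal illustration, not the pipeline used for this release: the `messages`, `role`, `content`, and `train` field names and the `max_chars` budget are assumptions made for the example, and the real criteria are the ones recorded in data_filter_manifests/.

```python
from typing import Any


def keep_example(example: dict[str, Any], max_chars: int = 32_000) -> bool:
    """Return True if a conversation passes the three filters described above.

    Field names and the character budget are hypothetical, chosen for illustration.
    """
    messages = example.get("messages", [])
    if not messages:
        return False

    # Bounded conversation length: total characters across all turns.
    total_chars = sum(len(m.get("content") or "") for m in messages)
    if total_chars > max_chars:
        return False

    # The conversation must end on an assistant turn.
    if messages[-1].get("role") != "assistant":
        return False

    # At least one assistant turn must be trainable: non-empty and not masked out.
    return any(
        m.get("role") == "assistant" and m.get("train", True) and bool(m.get("content"))
        for m in messages
    )
```

A predicate of this shape can be passed to a dataset `filter` step before tokenization, so that outlier conversations are dropped before they reach preprocessing.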

Evaluation

Primary target:

voidful/claw-eval-zh --language tw

Scores below are sums of per-task grading.mean from the exported evaluation JSON files.

| Candidate | Eval suite | Score |
| --- | --- | --- |
| OpenFormosa/barbet-1b-base | automated TW quick eval | 2.755 / 25 |
| wave002a checkpoint-100 | automated TW quick eval | 2.790 / 25 |
| wave003a checkpoint-50 | automated TW quick eval | 2.790 / 25 |
| wave003a checkpoint-100 | automated TW quick eval | 2.790 / 25 |
| wave003a checkpoint-150 | automated TW quick eval | 2.790 / 25 |
| wave003a checkpoint-200 | automated TW quick eval | 2.790 / 25 |
| wave003b checkpoint-50 | automated TW quick eval | 2.790 / 25 |
| wave003b checkpoint-100 | automated TW quick eval | 2.790 / 25 |
| wave003b checkpoint-150 | automated TW quick eval | 2.790 / 25 |
| wave002a checkpoint-100 | all/core TW judge eval | 1.267 / 20 |
| wave003a checkpoint-100 | all/core TW judge eval | 1.267 / 20 |
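
Each score is a sum of per-task grading.mean values, which can be reproduced from an exported JSON file roughly as follows. This is a sketch only: the top-level `tasks` list and the nested `grading` / `mean` keys are an assumed layout, so check the files in eval_results/ for the actual structure before relying on it.

```python
import json
from pathlib import Path


def suite_score(results: dict) -> tuple[float, int]:
    """Sum per-task grading.mean values and return (total, number_of_tasks).

    Assumes a hypothetical layout: {"tasks": [{"grading": {"mean": 0.1}}, ...]}.
    """
    tasks = results["tasks"]
    total = sum(task["grading"]["mean"] for task in tasks)
    return total, len(tasks)


def load_suite_score(path: str) -> tuple[float, int]:
    """Read one exported evaluation JSON file and compute its summed score."""
    with Path(path).open(encoding="utf-8") as handle:
        return suite_score(json.load(handle))
```

If each task's grading.mean is bounded by 1, the task count returned here corresponds to the denominator shown in the Score column.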

The benchmark plateaued after the initial wave002a improvement (2.755 to 2.790 on the quick eval). Neither the longer-context continuation (wave003a) nor the higher-GPU-utilization continuation (wave003b) improved claw-eval-zh --language tw further. wave003a checkpoint-100 was selected because it tied the best quick eval score, tied the all/core judge eval score, and includes the seq2048 continuation pass.

Usage

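from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "voidful/barbet-1b-base-agent-sft-tw-fullft"

# The repository ships custom Barbet modeling code, so trust_remote_code=True is needed.
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    trust_remote_code=True,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # requires the accelerate package
)

# System: "You are an assistant that can use tools to complete tasks."
# User: "Please briefly introduce yourself in Traditional Chinese."
messages = [
    {"role": "system", "content": "你是一個能使用工具完成任務的助理。"},
    {"role": "user", "content": "請用繁體中文簡短介紹你自己。"},
]
# Render the conversation with the bundled chat template and append the assistant prompt.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
# skip_special_tokens=False keeps the chat-template special tokens visible in the output.
print(tokenizer.decode(outputs[0], skip_special_tokens=False))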
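
The final print call decodes the whole sequence, prompt included. If only the reply is wanted, one option is to slice off the prompt tokens before decoding. The sketch below assumes the usual decoder-only behavior in which generate returns the prompt ids followed by the new ids; `new_token_ids` is a hypothetical helper name, not part of this repository.

```python
def new_token_ids(output_ids, prompt_length: int):
    """Return only the ids generated after the prompt.

    Works on any sliceable sequence, including a 1-D tensor such as outputs[0].
    """
    return output_ids[prompt_length:]


# Usage with the objects from the snippet above (not executed here):
# prompt_length = inputs["input_ids"].shape[1]
# reply = tokenizer.decode(new_token_ids(outputs[0], prompt_length), skip_special_tokens=True)
```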

Included Artifacts

  • PLAYBOOK.md: full training and exploration playbook
  • training_configs/: selected Axolotl configs for the main waves
  • eval_results/: raw exported claw-eval-zh JSON files
  • data_filter_manifests/: exact filtering manifests for prepared datasets
  • chat_template.jinja: chat template used during training/evaluation

Limitations

The benchmark gain is small (0.035 points out of 25 on the quick eval over the base model), and the absolute claw-eval-zh --language tw scores remain low. Treat this as a reproducible full fine-tuned Barbet agent SFT checkpoint and exploration artifact, not as a strong production-ready agent.

License

The upstream base model, tokenizer, and dataset currently declare license: other. This repository follows that metadata. Check the upstream repositories for the applicable terms before redistribution or commercial use.
