Libraries

How to use voidful/barbet-1b-base-agent-sft-tw-fullft with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="voidful/barbet-1b-base-agent-sft-tw-fullft", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("voidful/barbet-1b-base-agent-sft-tw-fullft", trust_remote_code=True, dtype="auto")
```

vLLM

How to use voidful/barbet-1b-base-agent-sft-tw-fullft with vLLM:

Install from pip and serve model

```shell
# Install vLLM from pip:
pip install vllm
# Start the vLLM server (--trust-remote-code because the model ships custom modeling code):
vllm serve "voidful/barbet-1b-base-agent-sft-tw-fullft" --trust-remote-code
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "voidful/barbet-1b-base-agent-sft-tw-fullft",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
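The same OpenAI-compatible call can be made from Python. The sketch below uses only the standard library and assumes the server started above is reachable at `http://localhost:8000`; the helper names `build_chat_request` and `chat` are illustrative and not part of vLLM or any other library.

```python
import json
import urllib.request


def build_chat_request(base_url, model, user_message):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url, model, user_message, timeout=60):
    """Send the request and return the text of the first returned choice."""
    request = build_chat_request(base_url, model, user_message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example call (requires a running server):
# chat("http://localhost:8000",
#      "voidful/barbet-1b-base-agent-sft-tw-fullft",
#      "What is the capital of France?")
```

Because the base URL is a parameter, the same helper should also work against an SGLang server by passing `http://localhost:30000` instead.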

Use Docker

```shell
docker model run hf.co/voidful/barbet-1b-base-agent-sft-tw-fullft
```

SGLang

How to use voidful/barbet-1b-base-agent-sft-tw-fullft with SGLang:

Install from pip and serve model

```shell
# Install SGLang from pip:
pip install sglang
# Start the SGLang server (--trust-remote-code because the model ships custom modeling code):
python3 -m sglang.launch_server \
    --model-path "voidful/barbet-1b-base-agent-sft-tw-fullft" \
    --trust-remote-code \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "voidful/barbet-1b-base-agent-sft-tw-fullft",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```

Use Docker images

```shell
# Start the SGLang server in Docker (--trust-remote-code because the model ships custom modeling code):
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "voidful/barbet-1b-base-agent-sft-tw-fullft" \
        --trust-remote-code \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "voidful/barbet-1b-base-agent-sft-tw-fullft",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```

Docker Model Runner
How to use voidful/barbet-1b-base-agent-sft-tw-fullft with Docker Model Runner:
```
docker model run hf.co/voidful/barbet-1b-base-agent-sft-tw-fullft
```

Barbet 1B Agent SFT TW Full Fine-Tune

This is a public full-weight fine-tune of OpenFormosa/barbet-1b-base on voidful/agent-sft. It does not use LoRA, QLoRA, adapter tuning, or quantized training. The released weights are a cleaned-up export of the wave003a checkpoint-100 FSDP training artifact, prepared for normal inference loading.

The tokenizer assets are from OpenFormosa/PangolinTokenizer. The model uses custom Barbet modeling code and should be loaded with trust_remote_code=True.

Training Summary

Base model: OpenFormosa/barbet-1b-base
Tokenizer: OpenFormosa/PangolinTokenizer
Dataset: voidful/agent-sft
Final selected checkpoint: wave003a checkpoint-100
Training method: full-parameter supervised fine-tuning
Main framework: Axolotl + Transformers + FSDP
Hardware used for main runs: one 8-GPU H200 Slurm node on the dev partition
Judge model for core evaluation: google/gemma-4-31B-it

The training set was filtered to examples with at least one trainable assistant turn, an assistant final turn, and bounded conversation length. This avoided very long outlier conversations that stalled preprocessing before FSDP training. The exact manifests are included in data_filter_manifests/.
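The three filter conditions can be sketched as a single predicate. This is an illustration only, not the filtering code used for the release: the message schema (`role` and `content` keys), the hypothetical per-message `trainable` flag, and the `max_turns` / `max_chars` bounds are all assumptions, and the real criteria are the ones recorded in data_filter_manifests/.

```python
def keep_conversation(messages, max_turns=64, max_chars=32_000):
    """Return True if a conversation passes the three illustrative filters."""
    # Bounded conversation length: cap both the turn count and the total text size.
    if not messages or len(messages) > max_turns:
        return False
    if sum(len(m.get("content") or "") for m in messages) > max_chars:
        return False
    # The conversation must end on an assistant turn.
    if messages[-1].get("role") != "assistant":
        return False
    # At least one assistant turn must contribute to the loss.
    # "trainable" is a hypothetical flag; turns without it count as trainable.
    return any(
        m.get("role") == "assistant" and m.get("trainable", True)
        for m in messages
    )
```

Applied to a list of conversations, `[c for c in dataset if keep_conversation(c)]` would drop the long outliers before tokenization, which is the stage the stalls were observed in.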

Evaluation

Primary target:

voidful/claw-eval-zh --language tw

Scores below are sums of per-task grading.mean from the exported evaluation JSON files.

| Candidate | Eval suite | Score |
| --- | --- | --- |
| `OpenFormosa/barbet-1b-base` | automated TW quick eval | `2.755 / 25` |
| `wave002a` checkpoint-100 | automated TW quick eval | `2.790 / 25` |
| `wave003a` checkpoint-50 | automated TW quick eval | `2.790 / 25` |
| `wave003a` checkpoint-100 | automated TW quick eval | `2.790 / 25` |
| `wave003a` checkpoint-150 | automated TW quick eval | `2.790 / 25` |
| `wave003a` checkpoint-200 | automated TW quick eval | `2.790 / 25` |
| `wave003b` checkpoint-50 | automated TW quick eval | `2.790 / 25` |
| `wave003b` checkpoint-100 | automated TW quick eval | `2.790 / 25` |
| `wave003b` checkpoint-150 | automated TW quick eval | `2.790 / 25` |
| `wave002a` checkpoint-100 | all/core TW judge eval | `1.267 / 20` |
| `wave003a` checkpoint-100 | all/core TW judge eval | `1.267 / 20` |
The benchmark plateaued after the initial wave002a improvement (2.755 to 2.790 on the quick eval). Longer-context continuation (wave003a) and higher-GPU-utilization continuation (wave003b) did not improve claw-eval-zh --language tw further. wave003a checkpoint-100 was selected because it tied the best quick eval score, tied the all/core judge eval score, and includes the seq2048 continuation pass.
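The score aggregation used in the table (a sum of per-task grading.mean values) can be sketched as follows. The JSON layout is an assumption: the sketch treats each exported file as a list of task records with a nested `grading.mean` field, which may not match the actual files in eval_results/, and the function names are illustrative.

```python
import json


def suite_score(task_records):
    """Sum the per-task grading.mean values of one evaluation run."""
    return sum(task["grading"]["mean"] for task in task_records)


def load_suite_score(path):
    """Read one exported evaluation JSON file and return its summed score."""
    with open(path, encoding="utf-8") as handle:
        return suite_score(json.load(handle))
```

If each task's mean is bounded by 1, the denominators in the table (25 and 20) would correspond to the number of tasks in each suite; that reading is an inference from the table, not something the exported files are known to state.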

Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "voidful/barbet-1b-base-agent-sft-tw-fullft"

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    trust_remote_code=True,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "system", "content": "你是一個能使用工具完成任務的助理。"},
    {"role": "user", "content": "請用繁體中文簡短介紹你自己。"},
]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```
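For a causal language model, `generate` returns the prompt tokens followed by the newly generated ones, so the decoded text above includes the full chat prompt. A minimal sketch of keeping only the completion is shown below; `completion_ids` is a hypothetical helper name, and plain lists stand in for tensors so the snippet runs without loading the model.

```python
def completion_ids(output_ids, prompt_length):
    """Drop the first prompt_length token ids and keep the generated tail."""
    return list(output_ids[prompt_length:])


# With the objects from the usage example, the equivalent would be roughly:
#   prompt_length = inputs["input_ids"].shape[1]
#   reply = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(completion_ids([101, 102, 103, 7, 8, 9], prompt_length=3))
```

Passing `skip_special_tokens=True` additionally hides the chat-template markers, which is usually what an application wants to display; keeping them, as the example above does, is more useful when debugging the template.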

Included Artifacts

PLAYBOOK.md: full training and exploration playbook
training_configs/: selected Axolotl configs for the main waves
eval_results/: raw exported claw-eval-zh JSON files
data_filter_manifests/: exact filtering manifests for prepared datasets
chat_template.jinja: chat template used during training/evaluation

Limitations

The benchmark gain is small, and the absolute claw-eval-zh --language tw scores remain low. Treat this as a reproducible, fully fine-tuned Barbet agent SFT checkpoint and exploration artifact, not as a strong production-ready agent.

License

The upstream base model, tokenizer, and dataset currently declare license: other. This repository follows that metadata. Check the upstream repositories for the applicable terms before redistribution or commercial use.

Safetensors

Model size: 1B params
Tensor type: BF16
