Gemma 4 12B IT Agent SFT TW Full Fine-Tune

This model is a full-weight fine-tune of google/gemma-4-12B-it on voidful/agent-sft.

This is not a LoRA, QLoRA, or merged-adapter model. The SFT run updated the language model weights and kept the vision and audio embedding stacks frozen so that the multimodal interface remains loadable. The selected checkpoint passed red-square image smoke tests.
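As an illustration of the freezing scheme described above, here is a minimal sketch that disables gradients for parameters selected by name. It is not the training code used for this run: the helper `freeze_by_name`, the keywords `"vision"` and `"audio"`, and the toy parameter names are all assumptions, and the real parameter names in the checkpoint may differ. The helper only relies on `named_parameters()` and `requires_grad`, so it should also work on a `torch.nn.Module`.

```python
from typing import Iterable, List, Tuple


def freeze_by_name(model, frozen_keywords: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Freeze parameters whose name contains any keyword; train the rest.

    `model` is any object exposing `named_parameters()` that yields
    (name, param) pairs where `param` has a `requires_grad` attribute.
    Returns (frozen_names, trainable_names).
    """
    keywords = tuple(frozen_keywords)
    frozen, trainable = [], []
    for name, param in model.named_parameters():
        if any(keyword in name for keyword in keywords):
            param.requires_grad = False  # excluded from gradient updates
            frozen.append(name)
        else:
            param.requires_grad = True   # updated by the SFT run
            trainable.append(name)
    return frozen, trainable


class ToyParam:
    """Stand-in for a tensor parameter (hypothetical, for the demo only)."""

    def __init__(self) -> None:
        self.requires_grad = True


class ToyModel:
    """Stand-in for a multimodal model (hypothetical parameter names)."""

    def __init__(self, names: Iterable[str]) -> None:
        self._params = {name: ToyParam() for name in names}

    def named_parameters(self):
        return iter(self._params.items())


if __name__ == "__main__":
    toy = ToyModel([
        "language_model.layers.0.weight",
        "vision_tower.embed.weight",
        "audio_tower.embed.weight",
    ])
    # The keywords are an assumption about how the multimodal stacks are named.
    frozen, trainable = freeze_by_name(toy, ("vision", "audio"))
    print("frozen:", frozen)
    print("trainable:", trainable)
```

Matching on name substrings is a deliberately crude choice; for a real model it is worth printing the frozen and trainable lists and checking them before training.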

Selected Checkpoint

  • Checkpoint: wave002-c400-checkpoint-400
  • Source path: /work/voidful2nlp/gemma-agent-sft/preserved_checkpoints_12b/wave002-c400-checkpoint-400
  • Training branch: full-data continuation from wave001-checkpoint-400
  • Sequence length: 8192
  • Learning rate: 5e-7
  • Selected by: voidful/claw-eval-zh --language tw --suite all --core
  • Judge: google/gemma-4-31B-it

Evaluation

Best complete CLAW TW core score:

9.567 / 20 = 47.83%

Baseline google/gemma-4-12B-it score from the same evaluation setup:

8.106 / 20 = 40.53%
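To put the two scores side by side, the small sketch below computes the gain over the baseline from the figures reported above. The function name `summarize` and the dictionary keys are illustrative only. The difference works out to roughly 1.46 points out of 20, which is about 7.3 percentage points, or about 18% relative to the baseline.

```python
MAX_SCORE = 20.0  # maximum CLAW TW core score, per the "/ 20" above


def summarize(score: float, baseline: float, max_score: float = MAX_SCORE) -> dict:
    """Return percentage scores and the gain of `score` over `baseline`."""
    gain = score - baseline
    return {
        "score_pct": 100.0 * score / max_score,
        "baseline_pct": 100.0 * baseline / max_score,
        "abs_gain": gain,                                # in raw score points
        "pct_point_gain": 100.0 * gain / max_score,      # in percentage points
        "relative_gain_pct": 100.0 * gain / baseline,    # relative to baseline
    }


if __name__ == "__main__":
    # 9.567 is the fine-tuned score, 8.106 the baseline, both as reported.
    for key, value in summarize(9.567, 8.106).items():
        print(f"{key}: {value:.3f}")
```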

See PLAYBOOK.md for the training, Slurm, checkpoint, smoke-test, and evaluation exploration log. The selected checkpoint config is in training_config.yml; selected evaluation artifacts are under eval_results/.

Weights: Safetensors, 13B params, BF16
Model repository: voidful/gemma-4-12b-it-agent-sft-tw-fullft