Gemma 4 12B IT Agent SFT TW Full Fine-Tune

This model is a full-weight fine-tune of google/gemma-4-12B-it on voidful/agent-sft.

It is not LoRA, QLoRA, or a merged adapter. The SFT run updated the language model weights and kept the vision/audio embedding stacks frozen so the multimodal interface remained loadable. Red-square image smoke tests passed for the selected checkpoint.
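
A common way to keep part of a model frozen during full fine-tuning in PyTorch is to clear `requires_grad` on the matching parameters before the optimizer is built. The sketch below illustrates that idea only and is not the training code used for this run. The name fragments `vision` and `audio`, the example parameter names, and the `FakeParam` stand-in are all assumptions made for illustration. With a real model you would pass `model.named_parameters()` instead of the toy dictionary.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# Hypothetical name fragments; the real parameter names depend on the
# model implementation and should be checked before relying on this.
FROZEN_MARKERS = ("vision", "audio")


@dataclass
class FakeParam:
    """Stand-in for torch.nn.Parameter; only carries the requires_grad flag."""
    requires_grad: bool = True


def freeze_multimodal(
    named_params: Iterable[Tuple[str, object]],
    markers: Tuple[str, ...] = FROZEN_MARKERS,
) -> List[str]:
    """Disable gradients for parameters whose name contains any marker.

    Returns the names that were frozen so the caller can log or verify them.
    """
    frozen = []
    for name, param in named_params:
        if any(marker in name for marker in markers):
            param.requires_grad = False
            frozen.append(name)
    return frozen


# Toy parameter table with made-up names, standing in for model.named_parameters().
params = {
    "language_model.layers.0.mlp.weight": FakeParam(),
    "vision_tower.patch_embed.weight": FakeParam(),
    "audio_tower.conv.weight": FakeParam(),
}

frozen = freeze_multimodal(params.items())
trainable = [name for name, p in params.items() if p.requires_grad]
print("frozen:", frozen)
print("trainable:", trainable)
```

Returning the list of frozen names makes it easy to confirm, before a long run starts, that only the intended stacks were excluded from training.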

Selected Checkpoint

Checkpoint: wave002-c400-checkpoint-400
Source path: /work/voidful2nlp/gemma-agent-sft/preserved_checkpoints_12b/wave002-c400-checkpoint-400
Training branch: full-data continuation from wave001-checkpoint-400
Sequence length: 8192
Learning rate: 5e-7
Selected by: voidful/claw-eval-zh --language tw --suite all --core
Judge: google/gemma-4-31B-it

Evaluation

Best complete CLAW TW core score:

9.567 / 20 = 47.83%

Baseline google/gemma-4-12B-it score from the same evaluation setup:

8.106 / 20 = 40.53%
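
The two scores above correspond to a gain of about 1.46 points on the 20-point scale, roughly 7.3 percentage points, or about 18% relative to the baseline. The short script below reproduces that arithmetic from the reported numbers. The variable names are illustrative only.

```python
MAX_SCORE = 20.0   # CLAW TW core scale used in the scores above
finetuned = 9.567  # selected checkpoint
baseline = 8.106   # google/gemma-4-12B-it, same evaluation setup


def pct(score: float, max_score: float = MAX_SCORE) -> float:
    """Convert a raw score to a percentage of the maximum."""
    return 100.0 * score / max_score


abs_gain = finetuned - baseline            # gain in raw points
pp_gain = pct(finetuned) - pct(baseline)   # gain in percentage points
rel_gain = 100.0 * abs_gain / baseline     # gain relative to the baseline, in %

print(f"absolute gain: {abs_gain:.3f} points")
print(f"percentage-point gain: {pp_gain:.2f}")
print(f"relative gain: {rel_gain:.1f}%")
```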

See PLAYBOOK.md for the training, Slurm, checkpoint, smoke-test, and evaluation exploration log. The selected checkpoint config is in training_config.yml; selected evaluation artifacts are under eval_results/.


Safetensors

Model size: 13B params
Tensor type: BF16

Model tree for voidful/gemma-4-12b-it-agent-sft-tw-fullft

Base model: google/gemma-4-12B
Finetuned: google/gemma-4-12B-it
Finetuned: this model
