YAML Metadata Warning:empty or missing yaml metadata in repo card

Check out the documentation for more information.

ttthyme-ckpts

Thyme SFT checkpoints from reproducing Table 5 "+Only Last Round" of Thyme paper.

See full repro notes at Irisicy4/Thyme-projects @ refactor-reproduce-notes.

Checkpoints

Folder Config Train epochs train_loss Notes
SFT-baseline-bs128-seed42/checkpoint-4251 effective batch 128 (matches paper) 3 0.274 main result
SFT-baseline-bs64-DEPRECATED-seed42/checkpoint-8499 effective batch 64 (half-paper, setup error) 3 0.206 partially eval'd
SFT-baseline-bs64-partial-seed42/checkpoint-1500 bs64 — partial 0.5 n/a killed mid-training
SFT-no123-bs64-partial-seed42/checkpoint-2000 bs64 — partial — no123 ablation 0.7 n/a killed mid-training

Base model

All checkpoints are fine-tunes of Qwen/Qwen2.5-VL-7B-Instruct. Load with:

from transformers import AutoModelForCausalLM, AutoProcessor
mp = "Icey444/ttthyme-ckpts/SFT-baseline-bs128-seed42/checkpoint-4251"
model = AutoModelForCausalLM.from_pretrained(mp, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(mp)

Or with VLMEvalKit + the Thyme code-execution sandbox loop (see repro notes §13 in the Irisicy4 repo branch above).

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Paper for Icey444/ttthyme-ckpts