How to use onda/ligature-seam-gemma4 with PEFT:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model, then attach the adapter from this repository.
base_model = AutoModelForCausalLM.from_pretrained("google/gemma-4-12B-it")
model = PeftModel.from_pretrained(base_model, "onda/ligature-seam-gemma4")
```
Ligature seam Gemma 4 12B LoRA - hillclimb exp10 lower-LR continuation
This repository contains LoRA adapter checkpoints from hillclimb_exp10, a
longer continuation run at a lower learning rate, using the exp5-style
ligature seam classifier data and prompt setup.
Canonical result
The canonical paper result is exp5 checkpoint-500, not this exp10 run.
Use exp5 checkpoint-500 for the paper's direct seam-classifier result:
outputs/hillclimb_exp5/lora/checkpoint-500
That checkpoint has the best direct human-gold heldout result found so far:
- exp5 checkpoint-500 macro F1 on selnoligGT2.txt: 0.945921653410345
- exp5 checkpoint-500 accuracy on selnoligGT2.txt: 0.9527439024390244
This exp10 repository is kept as an experiment artifact and negative result: lower LR plus longer training did not improve the direct classifier. The exp5 checkpoint is not uploaded under this exp10 repository prefix.
Which adapter to load
Use the explicit checkpoint subdirectories:
- hillclimb_exp10/adapters/checkpoint-500
- hillclimb_exp10/adapters/checkpoint-1000
- hillclimb_exp10/adapters/checkpoint-1500
- hillclimb_exp10/adapters/checkpoint-2000
- hillclimb_exp10/adapters/checkpoint-2500
- hillclimb_exp10/adapters/checkpoint-3000
The root-level adapter files, if present in repository history/snapshots, are
legacy suppress-only rank16 artifacts from an earlier upload and should not be
treated as the exp10 result. The exp10 run is represented by the
hillclimb_exp10/adapters/... paths above.
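A minimal sketch of loading one of these checkpoint subdirectories rather than the root-level files. It assumes the `subfolder` keyword of `PeftModel.from_pretrained` resolves the paths listed above; the names `checkpoint_subfolder`, `load_exp10_adapter`, and `EXP10_STEPS` are illustrative helpers, not part of this repository.

```python
REPO_ID = "onda/ligature-seam-gemma4"
BASE_MODEL = "google/gemma-4-12B-it"
# Steps of the exp10 checkpoints listed as uploaded in this repository.
EXP10_STEPS = (500, 1000, 1500, 2000, 2500, 3000)


def checkpoint_subfolder(step: int) -> str:
    """Return the repository subfolder for an uploaded exp10 checkpoint."""
    if step not in EXP10_STEPS:
        raise ValueError(f"no exp10 checkpoint uploaded for step {step}")
    return f"hillclimb_exp10/adapters/checkpoint-{step}"


def load_exp10_adapter(step: int):
    """Load the base model and attach one exp10 adapter (downloads weights)."""
    # Imported lazily so the path helper works without these packages installed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM

    base_model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)
    # `subfolder` points PEFT at the checkpoint directory instead of the
    # legacy root-level adapter files.
    return PeftModel.from_pretrained(
        base_model, REPO_ID, subfolder=checkpoint_subfolder(step)
    )


if __name__ == "__main__":
    print(checkpoint_subfolder(2500))
```

Calling `load_exp10_adapter(500)` would fetch the base model and the checkpoint-500 adapter. Rejecting unknown steps up front avoids silently falling back to a different adapter.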
Training run
- Config: hillclimb_exp10.yaml in this repository snapshot / upload.
- Base model: google/gemma-4-12B-it.
- Training hardware: 2x A100 80GB with torch DDP.
- Train output: outputs/hillclimb_exp10/lora/checkpoint-*.
- Checkpoints uploaded here: checkpoint-500, checkpoint-1000, checkpoint-1500, checkpoint-2000, checkpoint-2500, checkpoint-3000.
Heldout classifier result for this exp10 run
Direct evaluation on selnoligGT2.txt showed that exp10 did not improve over the previous exp5 checkpoint-500 classifier.
- Previous exp5 checkpoint-500 gold macro F1: 0.945921653.
- exp10 checkpoint-500 gold macro F1: 0.925104828.
- exp10 checkpoint-1000 gold macro F1: 0.906452763.
- exp10 checkpoint-1500 gold macro F1: 0.892553049.
Interpretation: lower learning rate plus longer training degraded direct human-gold seam-classifier performance. This suggests the useful signal in this dataset is reached early, and additional optimization starts fitting idiosyncrasies of the train distribution rather than improving the heldout psycholinguistic boundary decision.
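To make the size of the regression concrete, the following sketch computes each exp10 checkpoint's gap to the exp5 checkpoint-500 baseline, using only the gold macro F1 values reported above. The function and variable names are illustrative.

```python
# Gold macro F1 values on selnoligGT2.txt, as reported in this README.
BASELINE_EXP5_500 = 0.945921653
EXP10_GOLD_MACRO_F1 = {500: 0.925104828, 1000: 0.906452763, 1500: 0.892553049}


def deltas_vs_baseline(scores: dict, baseline: float) -> dict:
    """Return score minus baseline for each training step, in step order."""
    return {step: scores[step] - baseline for step in sorted(scores)}


def is_strictly_decreasing(scores: dict) -> bool:
    """True if the score drops at every successive checkpoint."""
    ordered = [scores[step] for step in sorted(scores)]
    return all(a > b for a, b in zip(ordered, ordered[1:]))


if __name__ == "__main__":
    for step, delta in deltas_vs_baseline(
        EXP10_GOLD_MACRO_F1, BASELINE_EXP5_500
    ).items():
        print(f"checkpoint-{step}: {delta:+.4f} macro F1 vs exp5 checkpoint-500")
    print("strictly decreasing:", is_strictly_decreasing(EXP10_GOLD_MACRO_F1))
```

The gap grows from roughly 0.021 at checkpoint-500 to roughly 0.053 at checkpoint-1500, which is the monotone degradation the interpretation refers to.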
Downstream patgen note
For en-wiki patgen experiments generated from exp10 checkpoints, checkpoint-2500 produced the best pattern-level heldout macro F1 among exp10 checkpoints:
```json
{
  "500": { "macro_f1": 0.8458406600483104, "accuracy": 0.8649468892261002 },
  "1000": { "macro_f1": 0.8381156147232458, "accuracy": 0.858877086494689 },
  "1500": { "macro_f1": 0.8457685826624228, "accuracy": 0.8634294385432474 },
  "2000": { "macro_f1": 0.834056023974697, "accuracy": 0.8619119878603946 },
  "2500": { "macro_f1": 0.860016339869281, "accuracy": 0.881638846737481 }
}
```
This downstream result is not the same as direct classifier quality. As of this upload, exp5 checkpoint-500 has not yet been run through the same en-wiki/3M patgen pipeline, so paper checkpoint selection should keep classifier and pattern-generation evidence separate.
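As a small sketch of how the pattern-level comparison can be reproduced from the reported numbers, the snippet below parses the patgen results and picks the checkpoint with the highest value of a given metric. The helper name `best_checkpoint` is illustrative, and the data is copied from the results reported above.

```python
import json

# Pattern-level heldout results for exp10 checkpoints, as reported above.
PATGEN_RESULTS_JSON = """
{
  "500": {"macro_f1": 0.8458406600483104, "accuracy": 0.8649468892261002},
  "1000": {"macro_f1": 0.8381156147232458, "accuracy": 0.858877086494689},
  "1500": {"macro_f1": 0.8457685826624228, "accuracy": 0.8634294385432474},
  "2000": {"macro_f1": 0.834056023974697, "accuracy": 0.8619119878603946},
  "2500": {"macro_f1": 0.860016339869281, "accuracy": 0.881638846737481}
}
"""


def best_checkpoint(results: dict, metric: str = "macro_f1") -> str:
    """Return the checkpoint key with the highest value for `metric`."""
    return max(results, key=lambda step: results[step][metric])


if __name__ == "__main__":
    results = json.loads(PATGEN_RESULTS_JSON)
    print("best by macro F1:", best_checkpoint(results))
    print("best by accuracy:", best_checkpoint(results, "accuracy"))
```

Both metrics select checkpoint-2500 here. This ranking covers exp10 checkpoints only and says nothing about exp5 checkpoint-500, which has no entry in these results.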
Uploaded contents
Each checkpoint directory is adapter-only: PEFT adapter weights/config, tokenizer/chat template files, task config, and README. Trainer-only files such as optimizer.pt, scheduler.pt, RNG state, and trainer state are intentionally excluded.
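The exclusion rule can be sketched as a filename filter. The patterns `rng_state*.pth` and `trainer_state.json` are assumptions based on the usual Hugging Face Trainer checkpoint layout (the text above names only the RNG and trainer state, not their filenames), and the example file list is hypothetical.

```python
from fnmatch import fnmatch

# Trainer-only files to leave out of an adapter-only upload.
# optimizer.pt and scheduler.pt are named in this README; the last two
# patterns are assumed filenames for the RNG state and trainer state.
TRAINER_ONLY_PATTERNS = (
    "optimizer.pt",
    "scheduler.pt",
    "rng_state*.pth",
    "trainer_state.json",
)


def adapter_only_files(filenames):
    """Keep only files that match none of the trainer-only patterns."""
    return [
        name
        for name in filenames
        if not any(fnmatch(name, pattern) for pattern in TRAINER_ONLY_PATTERNS)
    ]


if __name__ == "__main__":
    # Hypothetical checkpoint directory listing, for illustration only.
    listing = [
        "adapter_config.json",
        "adapter_model.safetensors",
        "optimizer.pt",
        "scheduler.pt",
        "rng_state_0.pth",
        "trainer_state.json",
        "README.md",
    ]
    print(adapter_only_files(listing))
```

The same patterns could plausibly be passed as `ignore_patterns` to `huggingface_hub`'s `upload_folder`, though that is not necessarily how this repository was uploaded.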