kasuboski/gemma-4-e4b-gleam-sft

A LoRA adapter for google/gemma-4-e4b-it, fine-tuned for Gleam code generation.

This is the adapter-only repo. For a ready-to-run GGUF (Ollama / llama.cpp), see kasuboski/gemma-4-e4b-gleam-sft-GGUF.

Dataset

Trained on a mixed dataset of 13,538 instruction-response pairs:

| Source | Count | % | Purpose |
|---|---|---|---|
| gleam-instruct | 9,355 | 69.1% | Gleam code generation (translate, refactor, debug, from-scratch) |
| ultrachat_200k | 3,000 | 22.2% | General instruction (anti-forgetting) |
| Code-290k-ShareGPT-MarkedLanguage | 629 | 4.6% | Functional code: Scala, Elixir, Clojure, Haskell, etc. (anti-forgetting) |
| OpenCodeInstruct | 554 | 4.1% | Python code (anti-forgetting) |
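
The percentage column follows directly from the counts. A minimal check, using only the figures from the table above:

```python
# Sample counts per source, copied from the table above.
counts = {
    "gleam-instruct": 9355,
    "ultrachat_200k": 3000,
    "Code-290k-ShareGPT-MarkedLanguage": 629,
    "OpenCodeInstruct": 554,
}

total = sum(counts.values())
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

# Everything except gleam-instruct belongs to the anti-forgetting mix.
anti_forgetting = total - counts["gleam-instruct"]
anti_forgetting_share = round(100 * anti_forgetting / total, 1)

print(total)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
print(f"anti-forgetting: {anti_forgetting_share}%")
```

The total comes to 13,538 pairs, and the three auxiliary sources together make up 30.9% of the mix.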

gleam-instruct (69.1%)

The core SFT dataset. 14,053 Gleam instruction-response pairs covering:

  • translate: port code from other languages to Gleam
  • refactor: improve existing Gleam code
  • debug: fix broken Gleam code
  • from-scratch: write new Gleam modules from requirements

The dataset was filtered to 9,355 pairs that fit within 4,096 tokens (93.6% coverage). The explanation task type (4,698 samples that only explain existing code) was excluded because it was a length bottleneck and offered little learning signal.
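
The filtering step can be sketched roughly as follows. The record fields (`task`, `prompt`, `response`), the task label `"explanation"`, and the `count_tokens` helper are all hypothetical. The whitespace split is a stand-in so the example runs without downloading anything; the real filter would presumably count tokens with the model's own tokenizer.

```python
MAX_TOKENS = 4096  # sequence-length budget stated above


def count_tokens(text):
    # Stand-in tokenizer (assumption): one token per whitespace-separated word.
    return len(text.split())


def keep(pair, max_tokens=MAX_TOKENS):
    """Return True if a pair should stay in the training set."""
    if pair["task"] == "explanation":
        return False  # task type excluded entirely
    total = count_tokens(pair["prompt"]) + count_tokens(pair["response"])
    return total <= max_tokens


# Tiny made-up records, only to exercise the filter.
pairs = [
    {"task": "translate", "prompt": "Port this to Gleam", "response": "pub fn main() { Nil }"},
    {"task": "explanation", "prompt": "Explain this code", "response": "It returns Nil."},
    {"task": "debug", "prompt": "Fix this", "response": "word " * 5000},
]

filtered = [p for p in pairs if keep(p)]
print([p["task"] for p in filtered])
```

Only the first record survives here: the second is dropped by task type and the third by length.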

Anti-forgetting mix (30.9%)

To preserve general capabilities while specializing on Gleam:

  • General instruction (ultrachat_200k, MIT): 3,000 random samples to maintain conversational ability
  • Functional code (Code-290k, Apache-2.0): 629 samples across Scala (364), Haskell (156), Clojure (71), Elixir (34), F# (23), Lisp (28), Erlang (5), OCaml (2), to keep the model's functional programming knowledge
  • Python code (OpenCodeInstruct, CC-BY-4.0): 554 samples to maintain general code ability
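
A mix like this can be assembled by drawing a fixed number of samples from each auxiliary source and shuffling them in with the Gleam pairs. The sketch below uses placeholder records and a hypothetical `build_mix` helper. Only the sample sizes come from the list above; the seed, the pool sizes, and random sampling for the two code sources are assumptions (the list only says the ultrachat samples were random).

```python
import random


def build_mix(core, auxiliary, sizes, seed=0):
    """Combine all core examples with a random subset of each auxiliary source.

    core:      list of core examples (kept in full)
    auxiliary: dict mapping source name -> list of candidate examples
    sizes:     dict mapping source name -> number of examples to draw
    """
    rng = random.Random(seed)  # fixed seed so the mix is reproducible
    mixed = list(core)
    for name, pool in auxiliary.items():
        mixed.extend(rng.sample(pool, sizes[name]))
    rng.shuffle(mixed)
    return mixed


# Placeholder records stand in for the real datasets.
core = [("gleam-instruct", i) for i in range(9355)]
auxiliary = {
    "ultrachat_200k": [("ultrachat_200k", i) for i in range(10_000)],
    "Code-290k": [("Code-290k", i) for i in range(2_000)],
    "OpenCodeInstruct": [("OpenCodeInstruct", i) for i in range(2_000)],
}
sizes = {"ultrachat_200k": 3000, "Code-290k": 629, "OpenCodeInstruct": 554}

mix = build_mix(core, auxiliary, sizes)
print(len(mix))
```

Keeping the core set whole and sampling only the auxiliary sources fixes the ratio of Gleam to non-Gleam data through the `sizes` dictionary alone.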

Usage

With Unsloth (recommended)

from unsloth import FastModel
from unsloth.chat_templates import get_chat_template

model, tokenizer = FastModel.from_pretrained(
    model_name="unsloth/gemma-4-e4b-it-unsloth-bnb-4bit",
    max_seq_length=4096,
    load_in_4bit=True,
)
model.load_adapter("kasuboski/gemma-4-e4b-gleam-sft")
tokenizer = get_chat_template(tokenizer, chat_template="gemma-4")

messages = [
    {"role": "system", "content": "You are a helpful Gleam programming assistant."},
    {"role": "user", "content": "Write a Gleam function that reverses a list."},
]
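# Render the conversation with the chat template and move the token IDs to the GPU.
inputs = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to("cuda")
# temperature only takes effect when sampling is enabled, so set do_sample explicitly.
outputs = model.generate(input_ids=inputs, max_new_tokens=512, do_sample=True, temperature=0.3)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))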

With PEFT + Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("google/gemma-4-e4b-it", torch_dtype="auto")
model = PeftModel.from_pretrained(base, "kasuboski/gemma-4-e4b-gleam-sft")
tokenizer = AutoTokenizer.from_pretrained("kasuboski/gemma-4-e4b-gleam-sft")

Recommended System Prompt

For best results, use this system prompt:

You are a Gleam programming expert. You ONLY write valid Gleam code. Gleam syntax rules: comments use //, function signatures use -> not ::, no where clauses, no do notation, imports use gleam/module style, use pub fn for public functions, pipe operator |> for chaining. Do NOT use Haskell syntax.
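
One way to apply this prompt is to prepend it as the system message of every conversation. The helper name `build_messages` is hypothetical; the list it returns has the same shape as the `messages` list in the Unsloth example.

```python
# The recommended system prompt, copied verbatim from above.
GLEAM_SYSTEM_PROMPT = (
    "You are a Gleam programming expert. You ONLY write valid Gleam code. "
    "Gleam syntax rules: comments use //, function signatures use -> not ::, "
    "no where clauses, no do notation, imports use gleam/module style, "
    "use pub fn for public functions, pipe operator |> for chaining. "
    "Do NOT use Haskell syntax."
)


def build_messages(user_prompt, system_prompt=GLEAM_SYSTEM_PROMPT):
    """Return a chat-template-ready message list with the system prompt first."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


messages = build_messages("Write a Gleam function that reverses a list.")
print(messages[0]["role"], "->", messages[1]["content"])
```

The result can be passed to `tokenizer.apply_chat_template` in place of the hand-written `messages` list.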

Known Issues

  • Haskell syntax bleed: The model may occasionally output Haskell syntax (:: type signatures, -- comments, Data.* imports). This is caused by 156 Haskell samples in the anti-forgetting mix. A strong system prompt (above) mitigates this. A future training run should exclude Haskell.
  • Unsloth GGUF export is broken for Gemma 4: Do NOT use model.save_pretrained_gguf(). Use llama.cpp's convert_hf_to_gguf.py instead (see the GGUF repo for details).
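
For the Haskell syntax bleed, a cheap post-generation check can flag suspicious output before it is used. The sketch below is a heuristic only, not part of the model or its tooling: the `haskell_markers` helper and its patterns are assumptions built from the three markers named above, and they can misfire on text inside string literals.

```python
import re

# Heuristic patterns for the three Haskell markers listed above.
HASKELL_PATTERNS = {
    "type signature (::)": re.compile(r"::"),
    "line comment (--)": re.compile(r"^\s*--", re.MULTILINE),
    "Data.* import": re.compile(r"^\s*import\s+(qualified\s+)?Data\.", re.MULTILINE),
}


def haskell_markers(code):
    """Return the names of Haskell-style markers found in generated code."""
    return [name for name, pattern in HASKELL_PATTERNS.items() if pattern.search(code)]


gleam_ok = """import gleam/list

// Reverse a list.
pub fn reverse(items: List(a)) -> List(a) {
  list.reverse(items)
}
"""

haskell_bleed = """import Data.List (foldl')

-- Reverse a list.
reverse' :: [a] -> [a]
reverse' = foldl' (flip (:)) []
"""

print(haskell_markers(gleam_ok))
print(haskell_markers(haskell_bleed))
```

An empty list means none of the markers were found; a non-empty list is a signal to regenerate, ideally with the recommended system prompt in place.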

License

This model is subject to the Gemma Terms of Use.
