Instructions to use albertobarnabo/scenesmith-qwen3-4b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use albertobarnabo/scenesmith-qwen3-4b with MLX:
# Make sure mlx-lm is installed # pip install --upgrade mlx-lm # if on a CUDA device, also pip install mlx[cuda] # Generate text with mlx-lm from mlx_lm import load, generate model, tokenizer = load("albertobarnabo/scenesmith-qwen3-4b") prompt = "Once upon a time in" text = generate(model, tokenizer, prompt=prompt, verbose=True) - PEFT
How to use albertobarnabo/scenesmith-qwen3-4b with PEFT:
Task type is invalid.
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- LM Studio
- MLX LM
How to use albertobarnabo/scenesmith-qwen3-4b with MLX LM:
Generate or start a chat session
# Install MLX LM uv tool install mlx-lm # Generate some text mlx_lm.generate --model "albertobarnabo/scenesmith-qwen3-4b" --prompt "Once upon a time"
Scenesmith โ Qwen3-4B LoRA Adapter (Manim CE Animation Generator)
A LoRA adapter that turns Qwen3-4B-Instruct-2507 into Scenesmith: give it a natural-language description of a concept ("animate binary search narrowing on a sorted array") and it replies with one complete, runnable Manim Community Edition Python file in a consistent dark house style โ no markdown fences, no commentary, no LaTeX dependency. Trained and evaluated entirely on a 16GB Apple Silicon machine via MLX.
The defining property of the project: outputs are mechanically verifiable.
Manim code either renders to an MP4 or it doesn't, so a single render gate
(manim -ql or reject) guards the training data, the harvested data, and the
eval. Every example this adapter was trained on actually rendered.
Code, data pipeline, and eval harness: https://github.com/albertobarnabo/scenesmith
Results
43 eval prompts (31 held-out in-distribution, 12 novel concepts), greedy decoding. "Render pass" means the generated file produced a real MP4 through the same gate as the training data.
| metric | base Qwen3-4B | + this adapter |
|---|---|---|
| render pass, overall | 6/43 (14%) | 27/43 (63%) |
| render pass, held-out in-distribution | 3/31 (10%) | 25/31 (81%) |
| render pass, novel concepts | 3/12 (25%) | 2/12 (17%) |
| house style adherence (palette/bg/font/subtitles) | 0/43 | 39โ43/43 |
| markdown fences around output | 43/43 | 0/43 |
Usage
pip install mlx-lm
python -m mlx_lm generate \
--model mlx-community/Qwen3-4B-Instruct-2507-4bit \
--adapter-path <this repo> \
--system-prompt "$(cat system.txt)" \
--max-tokens 3072 \
--prompt "Animate two pointers finding a pair that sums to 20 in [2, 5, 8, 11, 14, 19]"
Save the reply to scene.py and render with manim -ql scene.py.
Two integration notes:
- Strip the think prefix. The Qwen3 chat template wraps assistant turns in an
(empty)
<think>\n\n</think>block and the adapter reproduces it. Remove it before compiling:re.sub(r"<think>.*?</think>\s*", "", out, flags=re.S). - Use the bundled
system.txtโ it is the system prompt the adapter was trained against; the house style is conditioned on it.
House style
Dark slate background (#0e1116), GitHub-dark accent palette declared as
constants, Menlo, Pango Text only (no LaTeX required on the render machine),
subtitles via add_subcaption (Manim emits an .srt), one descriptive Scene
class per file, fade-out ending.
Training
- Base: mlx-community/Qwen3-4B-Instruct-2507-4bit, LoRA rank 16 on 16 layers
(14.7M trainable params, 0.365%),
mask_prompt: true, 650 iterations, effective batch 4, cosine decay 5e-5 โ 5e-6. Peak memory 5.5GB. - Data: 851 train / 74 valid examples, all render-verified.
- 60% synthetic: 12 parameterized algorithm/math scene families in the house style (binary search, two pointers, sliding window, bubble sort, stack bracket-matching, BFS grid, linked-list reversal, hash-map two-sum, prefix sums, Kadane, decimalโbinary, function plots).
- 40% wild, render-gate-filtered: bespokelabs/bespoke-manim, the official ManimCE docs examples (back-captioned), and SuienR/ManimBench-v1. Wild examples were trained under a separate plain system prompt so the house prompt stays bound to house-style outputs.
- Loss: val 1.117 โ ~0.22 (
loss.csvin this repo).
Limitations
- Novel concept types outside the trained families can hallucinate plausible-but-
fake API (
Text.add_cell) โ always render-check generated code (it's cheap). - No
Tex/MathTex: the house style is deliberately LaTeX-free. - Tuned for ~10โ25 s explainer scenes at 480p/1080p, not long-form videos.
Quantized
Model tree for albertobarnabo/scenesmith-qwen3-4b
Base model
Qwen/Qwen3-4B-Instruct-2507