🔄 In a Training Loop

Asankhaya Sharma

codelion

hugging-science

http://asankhaya.github.io/

AI & ML interests

Creator of OptiLLM, OpenEvolve, Adaptive Classifier, and Ellora. Pioneering a new category in AI infrastructure: inference-time compute for LLMs.

Recent Activity

updated a dataset about 5 hours ago

adaptive-classifier/ai-detector-data

new activity about 9 hours ago

mlx-community/gemma-4-26B-A4B-it-OptiQ-4bit: optiq_vision.safetensors exists, but the error "this model has no vision sidecar / frontend; cannot take images." occurred.

updated a model about 18 hours ago

mlx-community/diffusiongemma-26B-A4B-it-OptiQ-4bit

Organizations

Posts 48

Post

SPROG-9M is a 9.37M-parameter model trained from scratch to solve GSM8K-style math without using an LLM at inference.

The model, codelion/sprog-9m, predicts symbolic programs over number slots, then a deterministic executor does the arithmetic. With a simple verifier, it reaches ~11.8% on GSM8K test.

We also released the dataset: codelion/gsm8k-synth, 117K validated synthetic GSM8K-style problems.

Tiny model, no pretraining, no LLM at inference, runs on a laptop.
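The predict-then-execute split described in this post can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the step format `(op, left, right)`, the operand tags `"n"` and `"r"`, the helper names, and the verifier rule (accept the first candidate that runs cleanly and gives a non-negative answer) are hypothetical and are not taken from codelion/sprog-9m. The candidate programs are written by hand here; in the real system a model would predict them.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical program format (an assumption, not SPROG-9M's actual format):
# each step is (op, left, right). An operand is ("n", i) for the i-th number
# found in the problem text, or ("r", j) for the result of step j.
Operand = Tuple[str, int]
Step = Tuple[str, Operand, Operand]

OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "div": lambda a, b: a / b,
}


def extract_slots(text: str) -> List[float]:
    """Collect the numbers in the problem text, in order, as slots."""
    return [float(m) for m in re.findall(r"\d+(?:\.\d+)?", text)]


def execute(program: List[Step], slots: List[float]) -> float:
    """Run the steps deterministically; the last step's result is the answer."""
    results: List[float] = []

    def value(operand: Operand) -> float:
        kind, index = operand
        return slots[index] if kind == "n" else results[index]

    for op, left, right in program:
        results.append(OPS[op](value(left), value(right)))
    return results[-1]


def first_valid(candidates: List[List[Step]], slots: List[float]) -> Optional[float]:
    """Hypothetical verifier: return the first answer that executes cleanly
    and is non-negative, or None if no candidate qualifies."""
    for program in candidates:
        try:
            answer = execute(program, slots)
        except (KeyError, IndexError, ZeroDivisionError):
            continue  # malformed program, try the next candidate
        if answer >= 0:
            return answer
    return None


problem = "Tom has 3 boxes with 12 apples each and gives away 5 apples."
slots = extract_slots(problem)  # [3.0, 12.0, 5.0]
bad = [("mul", ("n", 0), ("n", 7))]  # refers to a slot that does not exist
good = [("mul", ("n", 0), ("n", 1)), ("sub", ("r", 0), ("n", 2))]
print(first_valid([bad, good], slots))  # prints 31.0
```

Because the program refers to number slots rather than literal values, the part that would be learned never has to do arithmetic itself; the executor handles that exactly.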

Articles 16

Article

SPROG-9M: how far a 9-million-parameter, LLM-free model gets on grade-school math

Collections 8

Papers 5

arxiv:2506.08060

arxiv:2501.14249

arxiv:2407.18521

arxiv:2407.16557

Spaces 11

dhara-chat

ZeroGPU demo of dhara-250M tri-mode (AR/diffusion/self-spec)

PTS Visualizer

Visualize pivotal tokens and thought anchors in language models

Safety Copilot

Ask about any health & safety related queries

Svg2png

Convert SVG to PNG with specified dimensions

MLX My Repo

Convert and upload Hugging Face models to MLX format

LLMSearchEngine

Search for information using LLM

Models 33

codelion/dhara-250m

Text Generation • 0.2B • Updated 4 days ago • 1.76k • 4

codelion/sprog-9m

Question Answering • Updated Jun 12 • 39 • 4

codelion/dhara-250m-ar-base

Text Generation • 0.2B • Updated Jun 12 • 46 • 1

codelion/SmolLM2-70M

Text Generation • 69.2M • Updated Mar 8 • 48 • 3

codelion/malm-165m

Text Generation • Updated Jan 23 • 26 • 4

codelion/dhara-70m

Text Generation • 71.3M • Updated Dec 30, 2025 • 188 • 49

codelion/gpt-2-70m

Text Generation • 64.1M • Updated Nov 2, 2025 • 22 • 21

codelion/Qwen3-4B-execution-world-model-lora

Text Generation • Updated Oct 20, 2025 • 11 • 6

codelion/Qwen2.5-Coder-0.5B-Instruct-security-grpo-lora

Text Generation • Updated Aug 2, 2025 • 24

codelion/qwen2-5-coder-0-5b-instruct-progressive-2000k-lora

Text Generation • Updated Jul 20, 2025 • 5 • 2

Datasets 47

codelion/logical-puzzles-cot

Viewer • Updated 28 days ago • 22.2k • 100 • 3

codelion/gsm8k-synth

Viewer • Updated Jun 4 • 118k • 326 • 2

codelion/sutra-improved-100M

Viewer • Updated Mar 29 • 414k • 30 • 2

codelion/sutra-magpie-sft

Viewer • Updated Mar 8 • 20.7k • 89 • 2

codelion/sutra-30k-seeds

Viewer • Updated Mar 8 • 30.3k • 43 • 2

codelion/sutra-10M

Viewer • Updated Mar 8 • 7.25k • 82 • 3

codelion/sutra-100M

Viewer • Updated Mar 8 • 70.4k • 67 • 2

codelion/sutra-1B

Viewer • Updated Mar 8 • 429k • 1.03k • 2

codelion/sutra-10B

Viewer • Updated Mar 8 • 5M • 519 • 8

codelion/synth-1B

Viewer • Updated Nov 11, 2025 • 822k • 92 • 2
