Wind Edge 1.6 — Geode (0.4B)

A 0.4B parameter causal language model built for edge deployment. Fast, small, and honest about what it can do.

North ML · Wind Arc 1.5 Preview


Overview

Wind Edge 1.6 (Geode) is a compact LLM trained for real-time, on-device inference. At 0.4B parameters it sits in the ultra-small tier: expect strong common-sense and classification performance, but limited multi-step reasoning.

Best use cases:

  • Instruction-following dialogue (short to medium turns)
  • Text classification and sentiment
  • Light code completion
  • Summarization of short passages

Not recommended for: multi-step math, complex logical chains, long-context tasks.


Changes vs 1.5

  • Improved instruction adherence on structured output formats
  • More stable multi-sentence generation (fewer mid-sequence repetitions)
  • Reduced hallucination rate on short factual queries (internal held-out eval)

Honest Benchmark Estimates

Realistic ranges for a well-trained 0.4B model — not cherry-picked numbers.

| Task | Expected Range | Notes |
|---|---|---|
| Common Sense (0-shot) | 0.60 – 0.68 | Reliable strength |
| Sentiment Analysis | 0.70 – 0.80 | Reliable strength |
| Text Classification | 0.68 – 0.78 | Reliable strength |
| Reading Comprehension | 0.52 – 0.63 | Context-dependent |
| Summarization | 0.58 – 0.68 | Short docs only |
| Code Generation | 0.45 – 0.58 | Simple tasks only |
| Math Reasoning | 0.15 – 0.28 | Known weak point at this scale |
| Logical Reasoning | 0.18 – 0.28 | Known weak point at this scale |

A 0.4B model cannot compete with 7B+ on reasoning — Geode doesn't pretend to.


Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer from the Hub.
model = AutoModelForCausalLM.from_pretrained("north-ml1/wind-edge-1.6")
tokenizer = AutoTokenizer.from_pretrained("north-ml1/wind-edge-1.6")

inputs = tokenizer("You are Wind Edge, a helpful AI assistant.\nUser: ", return_tensors="pt")
# do_sample=True is required for temperature and top_p to take effect.
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.6, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))

Recommended Settings

| Parameter | Value |
|---|---|
| temperature | 0.0 |
| top_p | 0.95 |
| min_p | 0.05 |
| max_new_tokens | 256–512 |
| repetition_penalty | 1.1 |
| context_limit | 1024–4096 |
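
The table pairs a temperature of 0.0 with sampling parameters, so it can help to see how the values translate into arguments for `generate`. The following is a minimal sketch, assuming Hugging Face `generate` semantics (sampling parameters only apply when `do_sample=True`) and a transformers version recent enough to accept `min_p`. The helper `build_generate_kwargs` is hypothetical and not part of any library.

```python
def build_generate_kwargs(temperature=0.0, top_p=0.95, min_p=0.05,
                          max_new_tokens=256, repetition_penalty=1.1):
    """Translate the recommended settings into keyword arguments for model.generate().

    Hypothetical helper for illustration; defaults are copied from the settings table.
    """
    kwargs = {
        "max_new_tokens": max_new_tokens,
        "repetition_penalty": repetition_penalty,
    }
    if temperature <= 0.0:
        # Temperature 0 means greedy decoding: sampling is off, so top_p and
        # min_p would have no effect and are left out.
        kwargs["do_sample"] = False
    else:
        kwargs.update(do_sample=True, temperature=temperature,
                      top_p=top_p, min_p=min_p)
    return kwargs


greedy = build_generate_kwargs()
sampled = build_generate_kwargs(temperature=0.6, top_p=0.9)
print(greedy)
print(sampled)
```

With a loaded model, the result would be passed as `model.generate(**inputs, **build_generate_kwargs())`.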

GGUF Quantizations

GGUF quants converted from arthu1/Wind-Edge-1.6-Instruct using a Qwen3-compatible tensor layout. The Transformers repo remains canonical — use these for llama.cpp, LM Studio, Ollama-style runtimes, and any other GGUF-compatible inference stack.

Files

| File | bpw | Use |
|---|---|---|
| Wind-Edge-1.6-TQ1_0.gguf | ~1.7 | Experimental 1-bit/ternary. Lowest quality, smallest size. |
| Wind-Edge-1.6-TQ2_0.gguf | ~2.1 | Very small 2-bit/ternary option. |
| Wind-Edge-1.6-IQ3_M.gguf | ~3.7 | Good balance for tiny devices. |
| Wind-Edge-1.6-Q4_K_M.gguf | ~4.6 | Recommended default. |
| Wind-Edge-1.6-Q6_K.gguf | ~6.1 | Higher quality, still compact. |
| Wind-Edge-1.6-Q8_0.gguf | ~8.5 | Near-lossless practical quant. |
| Wind-Edge-1.6-F16.gguf | 16 | Unquantized FP16 GGUF export. |

Q4_K_M, Q6_K, and Q8_0 are the recommended daily drivers. TQ1_0 and TQ2_0 are included for constrained edge hardware but will measurably lose reasoning and factual accuracy.
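
To pick a quant for a given device, the bits-per-weight figures can be turned into rough file sizes with size ≈ parameters × bpw / 8 bytes. The sketch below assumes exactly 0.4e9 parameters (the card only says ~0.4B) and ignores metadata and tokenizer overhead, so actual files will differ somewhat. `approx_size_mb` is a hypothetical helper.

```python
# Approximate GGUF file size from bits per weight (bpw).
# Assumption: exactly 0.4e9 parameters; metadata overhead is ignored.
PARAMS = 400_000_000

# bpw values copied from the file list above.
BPW = {
    "TQ1_0": 1.7,
    "TQ2_0": 2.1,
    "IQ3_M": 3.7,
    "Q4_K_M": 4.6,
    "Q6_K": 6.1,
    "Q8_0": 8.5,
    "F16": 16.0,
}


def approx_size_mb(bpw, params=PARAMS):
    """Return the approximate file size in megabytes (10^6 bytes)."""
    return params * bpw / 8 / 1e6


for name, bpw in BPW.items():
    print(f"{name:7s} ~{approx_size_mb(bpw):.0f} MB")
```

Under these assumptions Q4_K_M comes out at roughly 230 MB and F16 at roughly 800 MB.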

llama.cpp

llama-cli \
  -m Wind-Edge-1.6-Q4_K_M.gguf \
  -cnv \
  --temp 0.6 \
  --top-p 0.9 \
  --repeat-penalty 1.06 \
  -n 512

For deterministic output, use --temp 0 and keep prompts short.
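
When launching llama.cpp from a script, the same flags can be assembled in Python. This is a minimal sketch: `llama_cli_args` is a hypothetical helper that only builds the argument list from the flags shown above and does not run anything.

```python
import shlex


def llama_cli_args(model_path, deterministic=False, n_predict=512):
    """Build the llama-cli argument list from the flags shown above (hypothetical helper)."""
    args = ["llama-cli", "-m", model_path, "-cnv"]
    if deterministic:
        # Greedy decoding: temperature 0, top-p omitted.
        args += ["--temp", "0"]
    else:
        args += ["--temp", "0.6", "--top-p", "0.9"]
    args += ["--repeat-penalty", "1.06", "-n", str(n_predict)]
    return args


cmd = llama_cli_args("Wind-Edge-1.6-Q4_K_M.gguf", deterministic=True)
print(shlex.join(cmd))
```

The list can then be handed to `subprocess.run(cmd, check=True)`, assuming `llama-cli` is on the PATH.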

Chat Template

The GGUF metadata includes the chat template. If your runtime doesn't apply it automatically, format prompts as follows:

<|im_start|>system
You are Wind-Edge-1.6, a compact AI assistant model. You are not a human.<|im_end|>
<|im_start|>user
Who are you?<|im_end|>
<|im_start|>assistant
<think>
</think>
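
For runtimes that take a raw prompt string, the layout above can be produced with a small formatter. This is a sketch, not the template shipped in the GGUF metadata: `format_chatml` is a hypothetical helper, and the whitespace around the empty think block is copied from the example above, so it is worth checking against the embedded template before relying on it.

```python
def format_chatml(messages, add_generation_prompt=True, empty_think=True):
    """Render messages in the ChatML-style layout shown above.

    Hypothetical helper; `messages` is a list of {"role": ..., "content": ...} dicts.
    """
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
        if empty_think:
            # Empty think block, as in the example above.
            parts.append("<think>\n</think>\n")
    return "".join(parts)


prompt = format_chatml([
    {"role": "system",
     "content": "You are Wind-Edge-1.6, a compact AI assistant model. You are not a human."},
    {"role": "user", "content": "Who are you?"},
])
print(prompt)
```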

Model Details

| Property | Value |
|---|---|
| Parameters | ~0.4B |
| Architecture | Causal LM (decoder-only) |
| Context Length | 8192 tokens |
| Quantization | 1–16 bit (GGUF) |
| Org | north-ml1 |

License

MIT
