KodaLite-1.3B — MLX 8-bit

8-bit MLX quantization of YoAbriel/KodaLite-1.3B, optimized for Apple Silicon.

Size: ~1.4 GB | Precision: 8-bit (8.5 bpw)
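The size and bits-per-weight figures above can be checked with a quick back-of-the-envelope calculation. The sketch below assumes group-wise quantization with 64 weights per group and a 16-bit scale plus a 16-bit bias stored per group (which would account for the extra 0.5 bpw), and it uses the 1.27B parameter count quoted in the note on this card. The helper names are illustrative and not part of mlx-lm. The estimate is rough, since it treats every parameter as quantized.

```python
def bits_per_weight(bits: int, group_size: int = 64, meta_bits: int = 16) -> float:
    """Effective bits per weight: payload bits plus one scale and one bias per group."""
    return bits + 2 * meta_bits / group_size


def approx_size_gb(n_params: float, bpw: float) -> float:
    """Approximate weight storage in GB (1 GB = 10^9 bytes)."""
    return n_params * bpw / 8 / 1e9


N_PARAMS = 1.27e9  # assumption: parameter count taken from the note on this card

bpw_8bit = bits_per_weight(8)  # 8 + 32/64 = 8.5
print(f"8-bit: {bpw_8bit} bpw, ~{approx_size_gb(N_PARAMS, bpw_8bit):.2f} GB")
print(f"fp16:  16 bpw, ~{approx_size_gb(N_PARAMS, 16):.2f} GB")
```

Under these assumptions the estimates come out at roughly 1.35 GB for 8-bit and 2.54 GB for fp16, in line with the ~1.4 GB and ~2.5 GB figures on this card.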

Usage

pip install mlx-lm

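from mlx_lm import load, generate

# Download (or load from the local cache) the 8-bit weights and tokenizer.
model, tok = load("YoAbriel/KodaLite-1.3B-mlx-8bit")

# Format the conversation with the model's chat template and
# append the marker that starts the assistant's turn.
prompt = tok.apply_chat_template(
    [{"role": "user", "content": "What is the capital of France?"}],
    tokenize=False,
    add_generation_prompt=True,
)

# Generate up to 80 new tokens and print the completion.
print(generate(model, tok, prompt=prompt, max_tokens=80))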

Related

YoAbriel/KodaLite-1.3B-mlx: full-precision fp16 (~2.5 GB)

Note: 4-bit quantization of this model produces degraded output because the SFT signal is weak at 1.27B params with only 1.64B training tokens. 8-bit works fine.
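To give a feel for how much more precision is lost at 4 bits than at 8, here is a minimal pure-Python sketch of group-wise affine quantization applied to synthetic Gaussian values. It is an illustration under stated assumptions only: the weights are random rather than taken from this model, the group size of 64 is assumed, the function names are hypothetical, and the code does not reproduce MLX's actual kernels or the degraded output described above.

```python
import random


def quantize_roundtrip(weights, bits, group_size=64):
    """Quantize to `bits` with a per-group scale and offset, then dequantize."""
    levels = (1 << bits) - 1  # highest integer code, e.g. 15 for 4-bit
    out = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        lo, hi = min(group), max(group)
        scale = (hi - lo) / levels or 1.0  # fall back to 1.0 for constant groups
        for w in group:
            q = round((w - lo) / scale)  # integer code in [0, levels]
            out.append(lo + q * scale)   # dequantized value
    return out


def rms_error(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5


random.seed(0)
# Synthetic stand-in for a weight tensor; NOT real model weights.
weights = [random.gauss(0.0, 0.02) for _ in range(4096)]

for bits in (4, 8):
    err = rms_error(weights, quantize_roundtrip(weights, bits))
    print(f"{bits}-bit RMS round-trip error: {err:.2e}")
```

With the same group size, the 4-bit grid has 15 steps per group against 255 for 8-bit, so each step is 17 times coarser and the round-trip error grows accordingly. Whether that extra error is tolerable depends on the model, which is consistent with the note above that this model tolerates 8-bit but not 4-bit.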

License

Apache 2.0
