3bit-bf16 #1
opened by ehartford
Wait, is it 3-bit or bf16? It can't be both, right?
MLX supports mixed precision.
Now you can define the precision of each layer.
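For instance, here's a minimal sketch of per-layer precision via `mlx.nn.quantize`'s `class_predicate` hook; the tiny model and the bit-width policy are made up purely for illustration:

```python
import mlx.nn as nn

class Tiny(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(128, 64)
        self.proj = nn.Linear(64, 64)
        self.norm = nn.LayerNorm(64)
        self.lm_head = nn.Linear(64, 128)

def per_layer_policy(path: str, module: nn.Module):
    # Modules with no to_quantized (e.g. LayerNorm) stay in full precision.
    if not hasattr(module, "to_quantized"):
        return False
    # Made-up policy: keep the embedding and head at 6 bits, the rest at 3.
    if "embed" in path or "lm_head" in path:
        return {"bits": 6, "group_size": 64}
    return {"bits": 3, "group_size": 64}

model = Tiny()
nn.quantize(model, class_predicate=per_layer_policy)
print(model)  # proj is a 3-bit quantized layer, lm_head a 6-bit one
```

Returning a dict from the predicate passes those settings to that layer's `to_quantized`; returning `False` leaves the layer unquantized.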
And is that what this repo is?
In fact, to be precise you always need to specify both. Usually quantizations are n-bit with activations in fp16, but sometimes bf16 or fp32 activations are better if you have a large range of values.
- 3-bit means the weight matrices are quantized to 3 bits of precision.
- bf16 means the model's activations will be in bf16, since the quantization scales, biases, and all the non-quantized parameters (e.g. the layer norm params) are stored in bf16; see the sketch below.
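And for reproducing it, here is a hedged sketch of the usual mlx-lm convert route; the paths are placeholders and I can't confirm this is the exact recipe used for this repo:

```python
from mlx_lm import convert

convert(
    hf_path="some-org/some-model",    # placeholder source repo
    mlx_path="some-model-3bit-bf16",  # placeholder output dir
    quantize=True,
    q_bits=3,           # 3-bit weight quantization
    q_group_size=64,
    dtype="bfloat16",   # scales, biases, norms etc. are stored in bf16,
                        # which is what makes the activations run in bf16
)
```

The `dtype` argument is also what sets the activation dtype here: everything that isn't quantized gets cast to it.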
Interesting!
How would I specify the activation dtype?
Even better, how did you make this model?