Falcon-E-1.2-3B-Exp-prequantized

This is the model card of Falcon-E-1.2-3B-Exp, a ternary (1.58bits) language model trained on SFT agentic, and STEM data using axolotl framework combined with onebitllm library.

The model has been trained starting from tiiuae/Falcon-E-3B-Base-prequantized checkpoint using full-finetuning for 3 epochs. Below are the hyper-parameters used for fine-tuning:

This model is the prequantized version of axolotl-ai-co/Falcon-E-1.2-3B-Exp that can be used for further fine-tuning.

micro_batch_size: 1
num_epochs: 3
optimizer: adamw_torch
lr_scheduler: cosine
learning_rate: 8.0e-4
# adamw hyperparams
adam_beta1: 0.9
adam_beta2: 0.95
warmup_steps: 128

And we used a context parallel size of 8.

Usage

The model uses think mode by default, this can be disabled and switched to non-thiking mode. You can use the model with different frameworks such as HF transformers, llama.cpp or mlx-lm

transformers

transformers chat axolotl-ai-co/Falcon-E-1.2-3B-Exp

llama.cpp

# thinking mode
llama-cli -m axolotl-ai-co/Falcon-E-1.2-3B-Exp-GGUF:TQ2_0 --reasoning-format auto --temp 0.2 -cnv

# non thinking mode
llama-cli -m axolotl-ai-co/Falcon-E-1.2-3B-Exp-GGUF:TQ2_0 --reasoning-format auto --temp 0.2 -cnv --reasoning-budget 0.0

mlx-lm

mlx_lm.chat axolotl-ai-co/Falcon-E-1.2-3B-Exp --temperature 0.2

Further fine-tuning the model

You can further fine-tune this model, or the base model using their prequantized version. Refer to the axolotl config to get started on fine-tuning these models:

Aknowledgement

Falcon-E-3B-Chat-Exp models are built using Falcon LLM technology from the Technology Innovation Institute.

Downloads last month
228
Safetensors
Model size
3B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for axolotl-ai-co/Falcon-E-1.2-3B-Exp-prequantized

Quantizations
1 model

Collection including axolotl-ai-co/Falcon-E-1.2-3B-Exp-prequantized

Article mentioning axolotl-ai-co/Falcon-E-1.2-3B-Exp-prequantized