Fikra 1B Nano (GGUF Quantized) 🧠

"The Intel Inside for Edge AI."

This is the quantized (compressed) version of Fikra 1B Nano, optimized for edge devices, consumer smartphones, and offline environments. It uses the GGUF format for high-speed inference on CPU.

  • Developer: Lacesse Ventures
  • Architecture: Llama
  • Format: GGUF (Q4_K_M)
  • Size: ~700 MB
  • Fine-tuning: GSM8K (math reasoning) and Dolly 15k (instruction following)

🚀 The Easiest Way to Run (Python SDK)

We have built a dedicated SDK to handle the complexity of GGUF for you. It runs 100% offline.

```shell
pip install fikra
```

```python
from fikra import Fikra

# 1. Initialize (automatically downloads this model to your machine)
brain = Fikra()

# 2. Reason (offline)
answer = brain.reason("If I have 3 apples and eat one, how many are left?")
print(answer)
# Output: "You have 2 apples."
```

πŸ› οΈ Manual Usage (llama.cpp)

If you prefer using llama.cpp directly without our SDK:

```shell
./main -m fikra-1b-nano-v0.2-q4_k_m.gguf -n 128 -p "User: Why is the sky blue?\nAnswer:"
```

(In recent llama.cpp builds, the `main` binary has been renamed `llama-cli`.)
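If you are scripting around the model, the same prompt template can be built programmatically. A minimal sketch, where `build_prompt` is our illustrative helper (not part of the Fikra SDK or llama.cpp):

```python
def build_prompt(question: str) -> str:
    """Reproduce the User/Answer prompt template from the llama.cpp example above."""
    return f"User: {question}\nAnswer:"

print(build_prompt("Why is the sky blue?"))
# → User: Why is the sky blue?
#   Answer:
```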

📦 About this Quantization

| Quantization | Size    | Perplexity Loss | Recommended       |
|--------------|---------|-----------------|-------------------|
| Q4_K_M       | ~700 MB | Negligible      | ✅ Yes (balanced) |
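As a back-of-envelope check on the ~700 MB figure: Q4_K_M averages roughly 4.5–5 bits per weight (it mixes 4- and 6-bit blocks), so a model of around a billion parameters lands in this size range once metadata and higher-precision tensors are included. The function below is an illustrative sketch; the parameter count, bits-per-weight average, and overhead constant are assumptions, not measured values:

```python
def gguf_size_mb(n_params: float, bits_per_weight: float, overhead_mb: float = 50.0) -> float:
    """Rough GGUF file size: weights stored at bits_per_weight, plus a flat
    allowance for metadata and tensors kept at higher precision (assumed)."""
    return n_params * bits_per_weight / 8 / 1e6 + overhead_mb

# ~1.1B weights at ~4.8 bits/weight (a typical Q4_K_M average)
print(f"{gguf_size_mb(1.1e9, 4.8):.0f} MB")
# → 710 MB
```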

License

Apache 2.0. You are free to use this for commercial applications, including embedded hardware and proprietary software.

Built by James Miano / Lacesse Ventures.
