# Felladrin/Smol-Llama-101M-Chat-v1-GGUF

Quantized GGUF model files for Smol-Llama-101M-Chat-v1 from Felladrin.
| Name | Quant method | Size |
| --- | --- | --- |
| smol-llama-101m-chat-v1.fp16.gguf | fp16 | 204.25 MB |
| smol-llama-101m-chat-v1.q2_k.gguf | q2_k | 51.90 MB |
| smol-llama-101m-chat-v1.q3_k_m.gguf | q3_k_m | 58.04 MB |
| smol-llama-101m-chat-v1.q4_k_m.gguf | q4_k_m | 66.38 MB |
| smol-llama-101m-chat-v1.q5_k_m.gguf | q5_k_m | 75.31 MB |
| smol-llama-101m-chat-v1.q6_k.gguf | q6_k | 84.80 MB |
| smol-llama-101m-chat-v1.q8_0.gguf | q8_0 | 109.33 MB |
## Original Model Card

### A Llama Chat Model of 101M Parameters
- Base model: BEE-spoke-data/smol_llama-101M-GQA
- Datasets:
- Availability in other ML formats:
### Recommended Prompt Format
The recommended prompt format is as follows:
```
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{user_message}<|im_end|>
<|im_start|>assistant
```
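As an illustration, the template above can be filled in with a small helper (a hedged sketch; the `build_prompt` name and the example messages are made up for this snippet):

```python
def build_prompt(system_message: str, user_message: str) -> str:
    """Assemble a ChatML-style prompt matching the template above."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# Example usage (illustrative messages):
prompt = build_prompt("You are a helpful assistant.", "What is GGUF?")
```

The generated text that follows the final `<|im_start|>assistant` line is the model's reply.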
### Recommended Inference Parameters
To get the best results, add special tokens and prefer using contrastive search for inference:
```yaml
add_special_tokens: true
penalty_alpha: 0.5
top_k: 5
```
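These settings map onto contrastive search as exposed by Hugging Face transformers' `generate()` (`penalty_alpha` plus a small `top_k`). Below is a hedged sketch assuming the original fp16 checkpoint on the Hub; the GGUF files in this repo instead target llama.cpp-compatible runtimes, and the `chat` helper is illustrative, not part of the model card:

```python
# Contrastive-search settings from the card (penalty_alpha + small top_k).
gen_kwargs = {"penalty_alpha": 0.5, "top_k": 5, "max_new_tokens": 128}

def chat(prompt: str) -> str:
    """Illustrative helper: load the original checkpoint and generate a reply."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # lazy import
    repo = "Felladrin/Smol-Llama-101M-Chat-v1"
    tok = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo)
    # add_special_tokens=True, as recommended above.
    inputs = tok(prompt, return_tensors="pt", add_special_tokens=True)
    out = model.generate(**inputs, **gen_kwargs)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
```

A `penalty_alpha` of 0.5 discourages degenerate repetition while the small `top_k` keeps the candidate set tight, which tends to suit very small chat models.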
Model tree for afrideva/Smol-Llama-101M-Chat-v1-GGUF:
- Base model: BEE-spoke-data/smol_llama-101M-GQA
- Finetuned: Felladrin/Smol-Llama-101M-Chat-v1