---
license: llama2
---

# Llama2-7B 4-bit quantized model for llama2.c (experimental)

Source model: llama2-7b-chat + [AWQ](https://github.com/mit-han-lab/llm-awq) quantized with precomputed scales from:
https://huggingface.co/datasets/mit-han-lab/awq-model-zoo/blob/main/llama-2-7b-chat-w4-g128.pt

Script used to export the model:
https://github.com/atamurad/llama2.c/blob/int4-avx2/export_awq.py

Inference code:
https://github.com/atamurad/llama2.c/tree/int4-avx2

## Sample usage/prompt format:

### Hello World

Command / Prompt:
```
./run llama2-7b-4bit-awq/llama2-7b-chat.awq -p 0.0 -i "[INST] say hi [/INST]"
```

Output:
```
Sure, I'd be happy to say hi to you! *smiling face* How are you today? Is there anything you'd like to chat about or ask me? I'm here to help with any questions you may have. Feel free to start a conversation or ask me anything, I'm here to assist you! *hi five*
```

### Sample #2:

Command / Prompt:
```
./run llama2-7b-4bit-awq/llama2-7b-chat.awq -p 0.9 -t 0.0 -i "[INST] write a poem about math [/INST]"
```

Output:
```
Sure! Here's a poem about math:

nobody knows the secrets I hold
In numbers and formulas, I'm told
From pi to infinity, I'm the key
To unlocking the mysteries of the universe

I'm the language of logic, the voice of reason
The rhythm of numbers, the beat of creation
From geometry to calculus, I'm the way
To unravel the mysteries of the universe today

I'm the bridge between the known and unknown
The bridge that connects the infinite and the one
I'm the math that makes the world go round
The math that keeps the universe profound

So here's to math, the language of the mind
The language that makes the universe design
I'll keep on solving, keep on exploring
For math is the key to unlocking forever.
```
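
### Scripting the prompt format (sketch)

Both samples wrap the user message in `[INST] ... [/INST]` and pass it to `./run` with `-i`. The following is a minimal sketch of doing that from Python, not part of the linked repository: the helper name `build_command` and its parameters are hypothetical, and only the binary name, model path, and flags that appear in the commands above are used. It requires Python 3.8+ for `shlex.join`.

```python
import shlex

# Model path as used in the sample commands above.
DEFAULT_MODEL = "llama2-7b-4bit-awq/llama2-7b-chat.awq"


def build_command(message, model=DEFAULT_MODEL, extra_args=()):
    """Hypothetical helper: wrap a message in [INST] tags and build the ./run argv."""
    prompt = f"[INST] {message.strip()} [/INST]"
    # Extra flags (for example "-p 0.0") go before "-i", mirroring the samples.
    return ["./run", model, *extra_args, "-i", prompt]


cmd = build_command("say hi", extra_args=["-p", "0.0"])

# Print a shell-quoted version that can be pasted into a terminal.
print(shlex.join(cmd))
```

The resulting list can also be handed directly to `subprocess.run(cmd)`, which avoids shell quoting issues with the square brackets in the prompt.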