catid committed
Commit 1039b19
2 parents: f645d8c 588ef64

Merge branch 'main' of hf.co:catid/cat-llama-3-8b-awq-q128-w4-gemm

Files changed (1)
  1. README.md +13 -0
README.md CHANGED
@@ -1,3 +1,16 @@
 AI Model Name: Llama 3 8B "Built with Meta Llama 3" https://llama.meta.com/llama3/license/
 
 This is the result of running AutoAWQ to quantize the LLaMA-3 8B model to ~4 bits/parameter.
+
+To launch an OpenAI-compatible API endpoint on your Linux server:
+
+```
+git lfs install
+git clone https://huggingface.co/catid/cat-llama-3-8b-awq-q128-w4-gemm
+
+conda create -n vllm8 python=3.10 -y && conda activate vllm8
+
+pip install git+https://github.com/vllm-project/vllm.git
+
+python -m vllm.entrypoints.openai.api_server --model cat-llama-3-8b-awq-q128-w4-gemm
+```
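
Once the server started by the README commands is running, it can be queried like any OpenAI-style completions endpoint. The sketch below is not part of the commit. It assumes the server listens on vLLM's default port 8000 on localhost and that the served model name equals the `--model` value above. The helper names `completion_payload` and `ask` are hypothetical, chosen only for illustration.

```shell
# Assumption: vLLM's default address; override BASE_URL if the server runs elsewhere.
BASE_URL="${BASE_URL:-http://localhost:8000}"
# Assumption: the served model name matches the --model argument from the README.
MODEL="cat-llama-3-8b-awq-q128-w4-gemm"

# Build the JSON body for a /v1/completions request.
# Note: the prompt is inserted verbatim, so it must not contain
# double quotes or backslashes (no JSON escaping is done here).
completion_payload() {
  printf '{"model": "%s", "prompt": "%s", "max_tokens": 64}' "$MODEL" "$1"
}

# Send a prompt to the server and print the raw JSON response.
ask() {
  curl -s "$BASE_URL/v1/completions" \
    -H "Content-Type: application/json" \
    -d "$(completion_payload "$1")"
}
```

With the server up, `ask "The capital of France is"` should print a JSON response whose `choices` array holds the generated text.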