Copycats commited on
Commit
b0bb228
1 Parent(s): 3a82246

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +4 -1
README.md CHANGED
@@ -44,8 +44,11 @@ Documentation on installing and using vLLM [can be found here](https://vllm.read
44
  - vLLM can be deployed as a server that implements the OpenAI API protocol. This allows vLLM to be used as a drop-in replacement for applications using OpenAI API
45
 
46
  ```shell
47
- python3 -m vllm.entrypoints.openai.api_server --model Copycats/EEVE-Korean-Instruct-10.8B-v1.0-AWQ --quantization awq --dtype float16
48
  ```
 
 
 
49
 
50
  #### Querying the model using OpenAI Chat API:
51
  - You can use the create chat completion endpoint to communicate with the model in a chat-like interface:
 
44
  - vLLM can be deployed as a server that implements the OpenAI API protocol. This allows vLLM to be used as a drop-in replacement for applications using OpenAI API
45
 
46
  ```shell
47
+ python3 -m vllm.entrypoints.openai.api_server --model Copycats/EEVE-Korean-Instruct-10.8B-v1.0-AWQ --quantization awq --dtype half
48
  ```
49
+ - --model: huggingface model path
50
+ - --quantization: ”awq”
51
+ - --dtype: “half” for FP16. Recommended for AWQ quantization.
52
 
53
  #### Querying the model using OpenAI Chat API:
54
  - You can use the create chat completion endpoint to communicate with the model in a chat-like interface: