Update vllm_plugin/SERVING.md
Browse files
vllm_plugin/SERVING.md +3 -3
vllm_plugin/SERVING.md
CHANGED
|
@@ -38,14 +38,14 @@ uv pip install \
|
|
| 38 |
### Offline inference (quick test)
|
| 39 |
|
| 40 |
```bash
|
| 41 |
- cd
|
| 42 |
python serve.py
|
| 43 |
```
|
| 44 |
|
| 45 |
### OpenAI-compatible API server
|
| 46 |
|
| 47 |
```bash
|
| 48 |
- cd
|
| 49 |
python serve.py --api --port 8000
|
| 50 |
```
|
| 51 |
|
|
@@ -55,7 +55,7 @@ Then query:
|
|
| 55 |
curl http://localhost:8000/v1/completions \
|
| 56 |
-H "Content-Type: application/json" \
|
| 57 |
-d '{
|
| 58 |
- "model": "/
|
| 59 |
"prompt": "The capital of France is",
|
| 60 |
"max_tokens": 64,
|
| 61 |
"temperature": 0.8
|
|
|
|
| 38 |
### Offline inference (quick test)
|
| 39 |
|
| 40 |
```bash
|
| 41 |
+ cd CloverLM/vllm_plugin
|
| 42 |
python serve.py
|
| 43 |
```
|
| 44 |
|
| 45 |
### OpenAI-compatible API server
|
| 46 |
|
| 47 |
```bash
|
| 48 |
+ cd CloverLM/vllm_plugin
|
| 49 |
python serve.py --api --port 8000
|
| 50 |
```
|
| 51 |
|
|
|
|
| 55 |
curl http://localhost:8000/v1/completions \
|
| 56 |
-H "Content-Type: application/json" \
|
| 57 |
-d '{
|
| 58 |
+ "model": "path/to/CloverLM",
|
| 59 |
"prompt": "The capital of France is",
|
| 60 |
"max_tokens": 64,
|
| 61 |
"temperature": 0.8
|