Commit b2a6536
1 Parent(s): 4505133
Update README.md
README.md CHANGED
@@ -28,10 +28,20 @@ tags:
 
 ## Run with LlamaEdge
 
-- LlamaEdge version:
+- LlamaEdge version: [v0.8.2](https://github.com/LlamaEdge/LlamaEdge/releases/tag/0.8.2) and above
 
 - Context size: `384`
 
+- Run as LlamaEdge service
+
+```bash
+wasmedge --dir .:. --nn-preload default:GGML:AUTO:all-MiniLM-L6-v2-ggml-model-f16.gguf \
+llama-api-server.wasm \
+--prompt-template llama-2-chat \
+--ctx-size 384 \
+--model-name all-MiniLM-L6-v2
+```
+
 ## Quantized GGUF Models
 
 | Name | Quant method | Bits | Size | Use case |
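
Once the service from the added README command is running, a client could request an embedding from it. The following is a minimal sketch, not part of the commit. It assumes the server listens on `http://localhost:8080` and exposes an OpenAI-style `/v1/embeddings` endpoint; neither is stated in this diff. `BASE_URL`, `SEND`, and `build_payload` are hypothetical names introduced for illustration. Only `all-MiniLM-L6-v2` comes from the `--model-name` flag above.

```bash
#!/usr/bin/env bash
# Sketch: build (and optionally send) an embeddings request to a running
# llama-api-server. Host, port and endpoint path are ASSUMPTIONS, not taken
# from the commit; adjust them to match your deployment.
BASE_URL="${BASE_URL:-http://localhost:8080}"   # assumed listen address
MODEL_NAME="all-MiniLM-L6-v2"                   # value of --model-name in the README command

# build_payload TEXT: print a JSON request body for TEXT.
# Only backslashes and double quotes are escaped; control characters are not handled.
build_payload() {
  local text=${1//\\/\\\\}
  text=${text//\"/\\\"}
  printf '{"model": "%s", "input": ["%s"]}' "$MODEL_NAME" "$text"
}

payload="$(build_payload "LlamaEdge runs GGUF models on WasmEdge.")"
printf '%s\n' "$payload"

# Dry run by default: the request is sent only when SEND=1 is set.
if [ "${SEND:-0}" = "1" ]; then
  curl -sS -X POST "$BASE_URL/v1/embeddings" \
    -H 'Content-Type: application/json' \
    -d "$payload"
fi
```

By default the script only prints the request body, so it can be checked without a server. Running it with `SEND=1` posts the body to the assumed endpoint.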