Update README.md
Browse files
README.md
CHANGED
@@ -27,9 +27,9 @@ The following table shows the performance degradation due to quantization:
|
|
27 |
|
28 |
| Model | ELYZA-tasks-100 GPT4 score |
|
29 |
| :-------------------------------- | ---: |
|
30 |
-
| Llama-3-ELYZA-JP-8B | 3.655 |
|
31 |
-
| Llama-3-ELYZA-JP-8B-GGUF (Q4_K_M) | 3.57 |
|
32 |
-
| Llama-3-ELYZA-JP-8B-AWQ | 3.39 |
|
33 |
|
34 |
|
35 |
## Use with llama.cpp
|
@@ -90,6 +90,10 @@ There are various desktop applications that can handle GGUF models, but here we
|
|
90 |
- **Setting Options**: You can set options from the sidebar on the right. Faster inference can be achieved by setting Quick GPU Offload to Max in the GPU Settings.
|
91 |
- **(For Developers) Starting an API Server**: Click `<->` in the left sidebar and move to the Local Server tab. Select the model and click Start Server to launch an OpenAI API-compatible API server.
|
92 |
|
|
|
|
|
|
|
|
|
93 |
## Developers
|
94 |
|
95 |
Listed in alphabetical order.
|
|
|
27 |
|
28 |
| Model | ELYZA-tasks-100 GPT4 score |
|
29 |
| :-------------------------------- | ---: |
|
30 |
+
| [Llama-3-ELYZA-JP-8B](https://huggingface.co/elyza/Llama-3-ELYZA-JP-8B) | 3.655 |
|
31 |
+
| [Llama-3-ELYZA-JP-8B-GGUF (Q4_K_M)](https://huggingface.co/elyza/Llama-3-ELYZA-JP-8B-GGUF) | 3.57 |
|
32 |
+
| [Llama-3-ELYZA-JP-8B-AWQ](https://huggingface.co/elyza/Llama-3-ELYZA-JP-8B-AWQ) | 3.39 |
|
33 |
|
34 |
|
35 |
## Use with llama.cpp
|
|
|
90 |
- **Setting Options**: You can set options from the sidebar on the right. Faster inference can be achieved by setting Quick GPU Offload to Max in the GPU Settings.
|
91 |
- **(For Developers) Starting an API Server**: Click `<->` in the left sidebar and move to the Local Server tab. Select the model and click Start Server to launch an OpenAI API-compatible API server.
|
92 |
|
93 |
+
![lmstudio-demo](./lmstudio-demo.gif)
|
94 |
+
|
95 |
+
This demo showcases Llama-3-ELYZA-JP-8B-GGUF running smoothly on a MacBook Pro (M1 Pro), achieving an inference speed of approximately 20 tokens per second.
|
96 |
+
|
97 |
## Developers
|
98 |
|
99 |
Listed in alphabetical order.
|