---
license: apache-2.0
---
This is a llamafile for [Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1).

I'm adding both q4-k-m and q5-k-m this time since it's a big model. On my 4090, q4-k-m is about twice as fast as q5-k-m with no noticeable difference in chat or answer quality. q5-k-m is unusably slow on my desktop, so I recommend q4-k-m.

The quantized GGUF was downloaded straight from [TheBloke](https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF) this time,
and then zipped into a llamafile using [Mozilla's awesome project](https://github.com/Mozilla-Ocho/llamafile).

It's over 4 GB (Windows can't execute files that large), so if you want to use it on Windows you'll have to run it from WSL.

WSL note: if you get the error about APE, and the recommended command

`sudo sh -c 'echo -1 > /proc/sys/fs/binfmt_misc/WSLInterop'`

doesn't work, the file might be named something else. I had success with

`sudo sh -c 'echo -1 > /proc/sys/fs/binfmt_misc/WSLInterop-late'`

If that fails too, just look in `/proc/sys/fs/binfmt_misc`, see what files look like `WSLInterop`, and echo a -1 to whichever one is there by changing that part of the recommended command.
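The manual search above can also be scripted. Here's a minimal sketch that loops over every `WSLInterop*` entry, whatever it happens to be named, and disables each one; it assumes the standard `binfmt_misc` mount point (override `BINFMT_DIR` to dry-run elsewhere) and needs root on a real WSL system:

```shell
# Disable every WSLInterop* handler, whatever it is named.
# Assumes the standard binfmt_misc mount; needs root on real WSL.
BINFMT_DIR=${BINFMT_DIR:-/proc/sys/fs/binfmt_misc}

for f in "$BINFMT_DIR"/WSLInterop*; do
  if [ -e "$f" ]; then
    echo -1 > "$f"    # writing -1 disables that interop handler
  fi
done
```

The `if [ -e … ]` guard skips the unexpanded glob when no matching entry exists, so the loop is a safe no-op on systems without WSL interop.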

Llamafiles are standalone executables that run an LLM server locally on a variety of operating systems.
You just run it, open the chat interface in a browser, and interact.
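Concretely, that looks something like the following. The filename here is hypothetical; substitute whichever quant you actually downloaded:

```shell
# Hypothetical filename; substitute the quant you downloaded.
FILE=mixtral-8x7b-instruct-v0.1.Q4_K_M.llamafile

if [ -f "$FILE" ]; then
  chmod +x "$FILE"   # the llamafile itself is the executable
  ./"$FILE"          # starts the local server and chat UI
else
  echo "Place $FILE in this directory first."
fi
```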
Options can be passed in to expose the API, change the port, and so on.
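As a sketch of what that can look like, assuming llamafile's llama.cpp-style server flags (`--host`, `--port`, `--nobrowser` — run the llamafile with `--help` to confirm on your version) and its OpenAI-compatible chat endpoint:

```shell
# Sketch, assuming llama.cpp-style server flags; filename is hypothetical.
BIN=./mixtral-8x7b-instruct-v0.1.Q4_K_M.llamafile

if [ -x "$BIN" ]; then
  # Bind to all interfaces so other machines can reach the API.
  "$BIN" --host 0.0.0.0 --port 8080 --nobrowser &

  # Query the OpenAI-compatible chat endpoint.
  curl http://127.0.0.1:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"messages":[{"role":"user","content":"Hello"}]}'
else
  echo "Download the llamafile and chmod +x it first."
fi
```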