gobean committed on
Commit 477ec28
1 Parent(s): 73a5ec1

Update README.md

Files changed (1)
  1. README.md +23 -0
README.md CHANGED
@@ -1,3 +1,26 @@
  ---
  license: apache-2.0
  ---
+ This is a llamafile for [Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1).
+
+ I'm adding both q4-k-m and q5-k-m this time since it's a big model. On my 4090, q4-k-m is twice as fast as q5-k-m with no noticeable difference in chat or information quality. q5-k-m is unusably slow on my desktop computer, so q4-k-m is recommended.
+
+ The quantized gguf was downloaded straight from [TheBloke](https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF) this time,
+ and then zipped into a llamafile using [Mozilla's awesome project](https://github.com/Mozilla-Ocho/llamafile).
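+
+ For anyone who wants to repackage their own gguf, here's a rough sketch of that step, assuming the `zipalign` tool and `.args` convention described in the llamafile repo (filenames are placeholders, so double-check the exact commands against Mozilla's README):
+
+ ```sh
+ # Start from the stock llamafile binary shipped in Mozilla's release zip and
+ # rename it for the model being packaged (placeholder filenames throughout).
+ cp llamafile mixtral-8x7b-instruct-v0.1.Q4_K_M.llamafile
+
+ # Default arguments baked into the executable, one per line; the llamafile
+ # docs describe a trailing "..." line as the slot for user-supplied CLI args.
+ cat > .args <<'EOF'
+ -m
+ mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf
+ ...
+ EOF
+
+ # Embed the gguf weights and the .args file into the executable's zip payload.
+ zipalign -j0 mixtral-8x7b-instruct-v0.1.Q4_K_M.llamafile \
+   mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf .args
+ ```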
+
+ It's over 4 GB, so if you want to use it on Windows you'll have to run it from WSL (Windows can't run executables that large).
+
+ WSL note: If you get the error about APE, and the recommended command
+
+ `sudo sh -c 'echo -1 > /proc/sys/fs/binfmt_misc/WSLInterop'`
+
+ doesn't work, the file might be named something else, so I had success with
+
+ `sudo sh -c 'echo -1 > /proc/sys/fs/binfmt_misc/WSLInterop-late'`
+
+ If that fails too, just navigate to `/proc/sys/fs/binfmt_misc`, see which file looks like `WSLInterop`, and echo a -1 to whatever it's called by changing that part of the recommended command.
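+
+ The same workaround as a short shell sketch (the exact entry name varies between WSL builds, so list the directory first):
+
+ ```sh
+ # Find the WSL interop entry under binfmt_misc; its name differs by build
+ # (e.g. WSLInterop or WSLInterop-late).
+ ls /proc/sys/fs/binfmt_misc/ | grep -i wslinterop
+
+ # Disable whichever entry turned up (WSLInterop-late is just an example name).
+ sudo sh -c 'echo -1 > /proc/sys/fs/binfmt_misc/WSLInterop-late'
+ ```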
+
+
+ A llamafile is a standalone executable that runs an LLM server locally on a variety of operating systems.
+ You just run it, open the chat interface in a browser, and interact.
+ Options can be passed in to expose the API, etc.
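+
+ As a minimal usage sketch (the filename is a placeholder for whichever quant you grabbed, and the flags shown are standard llama.cpp server options that llamafile passes through; run the file with `--help` to see what your build accepts):
+
+ ```sh
+ # Mark the downloaded llamafile as executable, then run it; it starts a local
+ # server and opens the chat UI in your browser (default http://localhost:8080).
+ chmod +x mixtral-8x7b-instruct-v0.1.Q4_K_M.llamafile
+ ./mixtral-8x7b-instruct-v0.1.Q4_K_M.llamafile
+
+ # With options: offload layers to the GPU and expose the API on the local network.
+ ./mixtral-8x7b-instruct-v0.1.Q4_K_M.llamafile -ngl 35 --host 0.0.0.0 --port 8080
+ ```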