update readme
README.md
CHANGED
@@ -56,42 +56,10 @@ llamafile is a new format introduced by Mozilla Ocho on Nov 20th 2023. It uses C
 
 ## Replication Steps Assumption
 
-* You have already installed llamafile `/usr/local/bin/llamafile`
 * You have already pulled in all the submodules including Maykeye's model in safe.tensor format
 * Your git has LFS configured correctly or you get this issue https://github.com/ggerganov/llama.cpp/issues/1994 where `safe.tensor` doesn't download properly (and only a small pointer file is downloaded)
-* You are using llama.cpp repo that has some extra changes to convert.py to support metadata import (for now it's pointed to my repo)
+* You are using llama.cpp repo that has some extra changes to convert.py to support metadata import (for now it's pointed to my repo. A [Pull Request is Pending at the main llama.cpp for this feature](https://github.com/ggerganov/llama.cpp/pull/4858))
 
 ## Replication Steps
 
-```
-#!/bin/sh
-
-# Pull both the model folder and llama.cpp (for the conversion script)
-git submodule update --init
-
-# Convert from safetensor to gguf
-# (Assuming llama.cpp is in the next folder)
-./llama.cpp/convert.py maykeye_tinyllama --metadata maykeye_tinyllama-metadata.json
-
-# Copy the generated gguf to this folder
-cp maykeye_tinyllama/TinyLLama-v0-5M-F16.gguf TinyLLama-v0-5M-F16.gguf
-
-# Get the llamafile engine
-cp /usr/local/bin/llamafile TinyLLama-v0-5M-F16.llamafile
-
-# Create an .args file with settings defaults
-cat >.args <<EOF
--m
-TinyLLama-v0-5M-F16.gguf
-...
-EOF
-
-# Combine
-zipalign -j0 \
-TinyLLama-v0-5M-F16.llamafile \
-TinyLLama-v0-5M-F16.gguf \
-.args
-
-# Test
-./TinyLLama-v0-5M-F16.llamafile --cli -p "hello world the gruff man said"
-```
+For the most current replication steps, refer to the bash script `llamafile-creation.sh` in this repo
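
The `.args` step in the removed script is the most llamafile-specific piece: the `.args` member zipped into the executable supplies default command-line arguments, one argument per line. A minimal sketch of just that step, reusing the file name from the script above (the `...` in the original heredoc stands for further flags, which are omitted here rather than guessed):

```shell
#!/bin/sh
# Write an .args file: llamafile reads this member from its embedded zip
# and treats each line as one default command-line argument at startup.
cat > .args <<EOF
-m
TinyLLama-v0-5M-F16.gguf
EOF

# Inspect the result: two lines, i.e. the pair "-m TinyLLama-v0-5M-F16.gguf".
cat .args
```

Because arguments are split by line rather than by whitespace, the flag and its value go on separate lines; a value containing spaces needs no quoting.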