update readme
README.md
CHANGED
@@ -56,42 +56,10 @@ llamafile is a new format introduced by Mozilla Ocho on Nov 20th 2023. It uses C
 
 ## Replication Steps Assumption
 
-* You have already installed llamafile `/usr/local/bin/llamafile`
 * You have already pulled in all the submodules including Maykeye's model in safe.tensor format
 * Your git has LFS configured correctly or you get this issue https://github.com/ggerganov/llama.cpp/issues/1994 where `safe.tensor` doesn't download properly (and only a small pointer file is downloaded)
-* You are using llama.cpp repo that has some extra changes to convert.py to support metadata import (for now it's pointed to my repo)
+* You are using llama.cpp repo that has some extra changes to convert.py to support metadata import (for now it's pointed to my repo. A [Pull Request is Pending at the main llama.cpp for this feature](https://github.com/ggerganov/llama.cpp/pull/4858))
 
 ## Replication Steps
 
-```
-#!/bin/sh
-
-# Pull both the model folder and llama.cpp (for the conversion script)
-git submodule update --init
-
-# Convert from safetensor to gguf
-# (Assuming llama.cpp is in the next folder)
-./llama.cpp/convert.py maykeye_tinyllama --metadata maykeye_tinyllama-metadata.json
-
-# Copy the generated gguf to this folder
-cp maykeye_tinyllama/TinyLLama-v0-5M-F16.gguf TinyLLama-v0-5M-F16.gguf
-
-# Get the llamafile engine
-cp /usr/local/bin/llamafile TinyLLama-v0-5M-F16.llamafile
-
-# Create an .args file with settings defaults
-cat >.args <<EOF
--m
-TinyLLama-v0-5M-F16.gguf
-...
-EOF
-
-# Combine
-zipalign -j0 \
-TinyLLama-v0-5M-F16.llamafile \
-TinyLLama-v0-5M-F16.gguf \
-.args
-
-# Test
-./TinyLLama-v0-5M-F16.llamafile --cli -p "hello world the gruff man said"
-```
+For the most current replication steps, refer to the bash script `llamafile-creation.sh` in this repo
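
The `.args` step in the removed script is the most llamafile-specific piece: the `.args` member zipped into the executable supplies default command-line arguments, one argument per line. A minimal sketch of just that step, reusing the file name from the script above (the `...` in the original heredoc stands for further flags, which are omitted here rather than guessed):

```shell
#!/bin/sh
# Write an .args file: llamafile reads this member from its embedded zip
# and treats each line as one default command-line argument at startup.
cat > .args <<EOF
-m
TinyLLama-v0-5M-F16.gguf
EOF

# Inspect the result: two lines, i.e. the pair "-m TinyLLama-v0-5M-F16.gguf".
cat .args
```

Because arguments are split by line rather than by whitespace, the flag and its value go on separate lines; a value containing spaces needs no quoting.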