reach-vb HF staff committed on
Commit b884564
1 Parent(s): 9d4ffaf

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +38 -9
README.md CHANGED
@@ -1,14 +1,16 @@
  ---
- language:
- - en
- license: apache-2.0
- tags:
- - llama-cpp
+ base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
  datasets:
  - cerebras/SlimPajama-627B
  - bigcode/starcoderdata
  - HuggingFaceH4/ultrachat_200k
  - HuggingFaceH4/ultrafeedback_binarized
+ language:
+ - en
+ license: apache-2.0
+ tags:
+ - llama-cpp
+ - gguf-my-repo
  widget:
  - example_title: Fibonacci (Python)
  messages:
@@ -20,18 +22,45 @@ widget:
  ---

  # reach-vb/TinyLlama-1.1B-Chat-v1.0-Q2_K-GGUF
- This model was converted to GGUF format from [`TinyLlama/TinyLlama-1.1B-Chat-v1.0`](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) using llama.cpp.
+ This model was converted to GGUF format from [`TinyLlama/TinyLlama-1.1B-Chat-v1.0`](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) for more details on the model.
+
  ## Use with llama.cpp
+ Install llama.cpp through brew (works on Mac and Linux)

  ```bash
- brew install ggerganov/ggerganov/llama.cpp
+ brew install llama.cpp
+
  ```
+ Invoke the llama.cpp server or the CLI.

+ ### CLI:
  ```bash
- llama-cli --hf-repo reach-vb/TinyLlama-1.1B-Chat-v1.0-Q2_K-GGUF --model tinyllama-1.1b-chat-v1.0.Q2_K.gguf -p "The meaning to life and the universe is "
+ llama-cli --hf-repo reach-vb/TinyLlama-1.1B-Chat-v1.0-Q2_K-GGUF --hf-file tinyllama-1.1b-chat-v1.0-q2_k.gguf -p "The meaning to life and the universe is"
  ```

+ ### Server:
  ```bash
- llama-server --hf-repo reach-vb/TinyLlama-1.1B-Chat-v1.0-Q2_K-GGUF --model tinyllama-1.1b-chat-v1.0.Q2_K.gguf -c 2048
+ llama-server --hf-repo reach-vb/TinyLlama-1.1B-Chat-v1.0-Q2_K-GGUF --hf-file tinyllama-1.1b-chat-v1.0-q2_k.gguf -c 2048
+ ```
+
+ Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.
+
+ Step 1: Clone llama.cpp from GitHub.
+ ```
+ git clone https://github.com/ggerganov/llama.cpp
+ ```
+
+ Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (for example, `LLAMA_CUDA=1` for NVIDIA GPUs on Linux).
+ ```
+ cd llama.cpp && LLAMA_CURL=1 make
+ ```
+
+ Step 3: Run inference through the main binary.
+ ```
+ ./llama-cli --hf-repo reach-vb/TinyLlama-1.1B-Chat-v1.0-Q2_K-GGUF --hf-file tinyllama-1.1b-chat-v1.0-q2_k.gguf -p "The meaning to life and the universe is"
+ ```
+ or
+ ```
+ ./llama-server --hf-repo reach-vb/TinyLlama-1.1B-Chat-v1.0-Q2_K-GGUF --hf-file tinyllama-1.1b-chat-v1.0-q2_k.gguf -c 2048
  ```
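
The README stops at starting `llama-server`; it does not show how to send it a prompt. The following is a minimal sketch of one way to do that, not part of the commit. It assumes the server is reachable at `http://localhost:8080` and accepts a POST to `/completion` with `prompt` and `n_predict` fields. The helper names `build_payload` and `query_server` and the `LLAMA_SERVER_URL` variable are hypothetical, introduced only for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: query a llama-server instance started with the command in the diff.
# Assumptions (not stated in the README): the server listens on localhost:8080
# and exposes POST /completion taking "prompt" and "n_predict".

# Hypothetical helper: build the JSON request body.
# Escapes only backslashes and double quotes, so keep prompts on a single line.
build_payload() {
  local prompt=$1
  local n_predict=${2:-64}   # number of tokens to generate (default 64)
  local escaped
  escaped=$(printf '%s' "$prompt" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"prompt": "%s", "n_predict": %d}' "$escaped" "$n_predict"
}

# Hypothetical helper: POST the payload and print the server's raw JSON reply.
# LLAMA_SERVER_URL is an illustrative override, not a llama.cpp setting.
query_server() {
  local base_url=${LLAMA_SERVER_URL:-http://localhost:8080}
  curl --silent --fail \
    --header 'Content-Type: application/json' \
    --data "$(build_payload "$@")" \
    "$base_url/completion"
}
```

With the server running in another terminal, `query_server "The meaning to life and the universe is" 32` would print the JSON response. Keeping payload construction in its own function makes it possible to inspect the request body without a running server.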