cobrakenji committed
Commit c0e64b1
1 Parent(s): a31caaa

Upload README.md with huggingface_hub

Files changed (1): README.md (+25, -11)
README.md CHANGED
@@ -6,7 +6,7 @@ tags:
 - granite
 - llama-cpp
 - gguf-my-repo
-base_model: ibm-granite/granite-34b-code-base
+base_model: ibm-granite/granite-34b-code-instruct
 datasets:
 - bigcode/commitpackft
 - TIGER-Lab/MathInstruct
@@ -88,29 +88,43 @@ model-index:
 # cobrakenji/granite-34b-code-instruct-Q4_K_M-GGUF
 This model was converted to GGUF format from [`ibm-granite/granite-34b-code-instruct`](https://huggingface.co/ibm-granite/granite-34b-code-instruct) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
 Refer to the [original model card](https://huggingface.co/ibm-granite/granite-34b-code-instruct) for more details on the model.
-## Use with llama.cpp
 
-Install llama.cpp through brew.
+## Use with llama.cpp
+Install llama.cpp through brew (works on Mac and Linux).
 
 ```bash
-brew install ggerganov/ggerganov/llama.cpp
+brew install llama.cpp
 ```
 Invoke the llama.cpp server or the CLI.
 
-CLI:
-
+### CLI:
 ```bash
-llama-cli --hf-repo cobrakenji/granite-34b-code-instruct-Q4_K_M-GGUF --model granite-34b-code-instruct.Q4_K_M.gguf -p "The meaning to life and the universe is"
+llama --hf-repo cobrakenji/granite-34b-code-instruct-Q4_K_M-GGUF --hf-file granite-34b-code-instruct-q4_k_m.gguf -p "The meaning to life and the universe is"
 ```
 
-Server:
-
+### Server:
 ```bash
-llama-server --hf-repo cobrakenji/granite-34b-code-instruct-Q4_K_M-GGUF --model granite-34b-code-instruct.Q4_K_M.gguf -c 2048
+llama-server --hf-repo cobrakenji/granite-34b-code-instruct-Q4_K_M-GGUF --hf-file granite-34b-code-instruct-q4_k_m.gguf -c 2048
 ```
 
 Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the Llama.cpp repo.
 
+Step 1: Clone llama.cpp from GitHub.
+```
+git clone https://github.com/ggerganov/llama.cpp
+```
+
+Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with any other hardware-specific flags (for example, `LLAMA_CUDA=1` for NVIDIA GPUs on Linux).
+```
+cd llama.cpp && LLAMA_CURL=1 make
+```
+
+Step 3: Run inference through the main binary.
+```
+./main --hf-repo cobrakenji/granite-34b-code-instruct-Q4_K_M-GGUF --hf-file granite-34b-code-instruct-q4_k_m.gguf -p "The meaning to life and the universe is"
+```
+or
+```
+./server --hf-repo cobrakenji/granite-34b-code-instruct-Q4_K_M-GGUF --hf-file granite-34b-code-instruct-q4_k_m.gguf -c 2048
+```
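
As a sketch of what Step 2 looks like with a hardware-specific flag enabled, the following assumes a Linux host with an NVIDIA GPU and the CUDA toolkit installed, and that the checked-out llama.cpp Makefile accepts the `LLAMA_CURL`/`LLAMA_CUDA` flags referenced in the README:

```bash
# Hypothetical CUDA-enabled build of the steps above; assumes a Linux host with
# an NVIDIA GPU, the CUDA toolkit on PATH, and a llama.cpp checkout whose
# Makefile accepts the LLAMA_CURL / LLAMA_CUDA flags mentioned in the README.
cd llama.cpp && LLAMA_CURL=1 LLAMA_CUDA=1 make -j"$(nproc)"
```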
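
Once a server is running (via `llama-server` from the brew install, or `./server` from the source build), it can be queried over HTTP. A minimal sketch, assuming the server is listening on its default address of 127.0.0.1:8080 and using its native `/completion` endpoint:

```bash
# Send a completion request to the running llama.cpp server; 127.0.0.1:8080 is
# the default bind address, and n_predict caps the number of generated tokens.
curl http://127.0.0.1:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "The meaning to life and the universe is", "n_predict": 128}'
```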