---
pipeline_tag: text-generation
inference: false
license: apache-2.0
library_name: transformers
tags:
- language
- granite-3.0
- llama-cpp
- gguf-my-repo
base_model: ibm-granite/granite-3.0-8b-instruct
model-index:
- name: granite-3.0-8b-instruct
  results:
  - task:
      type: text-generation
    dataset:
      name: IFEval
      type: instruction-following
    metrics:
    - type: pass@1
      value: 52.27
      name: pass@1
    - type: pass@1
      value: 8.22
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: AGI-Eval
      type: human-exams
    metrics:
    - type: pass@1
      value: 40.52
      name: pass@1
    - type: pass@1
      value: 65.82
      name: pass@1
    - type: pass@1
      value: 34.45
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: OBQA
      type: commonsense
    metrics:
    - type: pass@1
      value: 46.6
      name: pass@1
    - type: pass@1
      value: 71.21
      name: pass@1
    - type: pass@1
      value: 82.61
      name: pass@1
    - type: pass@1
      value: 77.51
      name: pass@1
    - type: pass@1
      value: 60.32
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: BoolQ
      type: reading-comprehension
    metrics:
    - type: pass@1
      value: 88.65
      name: pass@1
    - type: pass@1
      value: 21.58
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: ARC-C
      type: reasoning
    metrics:
    - type: pass@1
      value: 64.16
      name: pass@1
    - type: pass@1
      value: 33.81
      name: pass@1
    - type: pass@1
      value: 51.55
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: HumanEvalSynthesis
      type: code
    metrics:
    - type: pass@1
      value: 64.63
      name: pass@1
    - type: pass@1
      value: 57.16
      name: pass@1
    - type: pass@1
      value: 65.85
      name: pass@1
    - type: pass@1
      value: 49.6
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: GSM8K
      type: math
    metrics:
    - type: pass@1
      value: 68.99
      name: pass@1
    - type: pass@1
      value: 30.94
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: PAWS-X (7 langs)
      type: multilingual
    metrics:
    - type: pass@1
      value: 64.94
      name: pass@1
    - type: pass@1
      value: 48.2
      name: pass@1
---

# NikolayKozloff/granite-3.0-8b-instruct-Q8_0-GGUF
This model was converted to GGUF format from [`ibm-granite/granite-3.0-8b-instruct`](https://huggingface.co/ibm-granite/granite-3.0-8b-instruct) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/ibm-granite/granite-3.0-8b-instruct) for more details on the model.

## Use with llama.cpp
Install llama.cpp through brew (works on macOS and Linux):

```bash
brew install llama.cpp
```

Invoke the llama.cpp server or the CLI.

### CLI:
```bash
llama-cli --hf-repo NikolayKozloff/granite-3.0-8b-instruct-Q8_0-GGUF --hf-file granite-3.0-8b-instruct-q8_0.gguf -p "The meaning to life and the universe is"
```

### Server:
```bash
llama-server --hf-repo NikolayKozloff/granite-3.0-8b-instruct-Q8_0-GGUF --hf-file granite-3.0-8b-instruct-q8_0.gguf -c 2048
```
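
Once the server is up, you can send it a request over its OpenAI-compatible HTTP API. The snippet below is a quick sanity check, assuming the server above is running with its default bind address and port (`localhost:8080`); the prompt text is just an example.

```shell
# Query the OpenAI-compatible chat endpoint of the running llama-server.
# Assumes the default host/port (localhost:8080).
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [{"role": "user", "content": "What is IBM Granite?"}],
        "temperature": 0.7,
        "max_tokens": 128
      }'
```

The response comes back as a JSON object in the same shape as the OpenAI chat completions API, so existing OpenAI client libraries can usually be pointed at this endpoint by overriding the base URL.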

Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.
```bash
git clone https://github.com/ggerganov/llama.cpp
```

Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag, along with any hardware-specific flags (e.g., `LLAMA_CUDA=1` for Nvidia GPUs on Linux).
```bash
cd llama.cpp && LLAMA_CURL=1 make
```

Step 3: Run inference through the main binary.
```bash
./llama-cli --hf-repo NikolayKozloff/granite-3.0-8b-instruct-Q8_0-GGUF --hf-file granite-3.0-8b-instruct-q8_0.gguf -p "The meaning to life and the universe is"
```
or
```bash
./llama-server --hf-repo NikolayKozloff/granite-3.0-8b-instruct-Q8_0-GGUF --hf-file granite-3.0-8b-instruct-q8_0.gguf -c 2048
```
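
If you prefer not to rely on the `--hf-repo` download-on-first-run behavior, you can fetch the GGUF file once and pass a local path instead. This is a sketch that assumes the Hugging Face Hub CLI is installed (`pip install -U "huggingface_hub[cli]"`); note the Q8_0 file is roughly 8 GB.

```shell
# Download the quantized model file into the current directory.
huggingface-cli download NikolayKozloff/granite-3.0-8b-instruct-Q8_0-GGUF \
  granite-3.0-8b-instruct-q8_0.gguf --local-dir .

# Point llama-cli at the local file with -m instead of --hf-repo/--hf-file.
./llama-cli -m ./granite-3.0-8b-instruct-q8_0.gguf -p "The meaning to life and the universe is"
```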
| | |