Text Generation
Transformers
GGUF
code
granite
llama-cpp
gguf-my-repo
Eval Results
MoMonir's picture
Update README.md
8877244 verified
metadata
license: apache-2.0
library_name: transformers
tags:
  - code
  - granite
  - llama-cpp
  - gguf-my-repo
base_model: ibm-granite/granite-8b-code-base
datasets:
  - bigcode/commitpackft
  - TIGER-Lab/MathInstruct
  - meta-math/MetaMathQA
  - glaiveai/glaive-code-assistant-v3
  - glaive-function-calling-v2
  - bugdaryan/sql-create-context-instruction
  - garage-bAInd/Open-Platypus
  - nvidia/HelpSteer
metrics:
  - code_eval
pipeline_tag: text-generation
inference: false
model-index:
  - name: granite-8b-code-instruct
    results:
      - task:
          type: text-generation
        dataset:
          name: HumanEvalSynthesis(Python)
          type: bigcode/humanevalpack
        metrics:
          - type: pass@1
            value: 57.9
            name: pass@1
          - type: pass@1
            value: 52.4
            name: pass@1
          - type: pass@1
            value: 58.5
            name: pass@1
          - type: pass@1
            value: 43.3
            name: pass@1
          - type: pass@1
            value: 48.2
            name: pass@1
          - type: pass@1
            value: 37.2
            name: pass@1
          - type: pass@1
            value: 53
            name: pass@1
          - type: pass@1
            value: 42.7
            name: pass@1
          - type: pass@1
            value: 52.4
            name: pass@1
          - type: pass@1
            value: 36.6
            name: pass@1
          - type: pass@1
            value: 43.9
            name: pass@1
          - type: pass@1
            value: 16.5
            name: pass@1
          - type: pass@1
            value: 39.6
            name: pass@1
          - type: pass@1
            value: 40.9
            name: pass@1
          - type: pass@1
            value: 48.2
            name: pass@1
          - type: pass@1
            value: 41.5
            name: pass@1
          - type: pass@1
            value: 39
            name: pass@1
          - type: pass@1
            value: 32.9
            name: pass@1

MoMonir/granite-8b-code-instruct-GGUF

This model was converted to GGUF format from ibm-granite/granite-8b-code-instruct using llama.cpp via the ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.

About GGUF (TheBloke Description)

GGUF is a new format introduced by the llama.cpp team on August 21st 2023. It is a replacement for GGML, which is no longer supported by llama.cpp.

Here is an incomplete list of clients and libraries that are known to support GGUF:

  • llama.cpp. The source project for GGUF. Offers a CLI and a server option.
  • text-generation-webui, the most widely used web UI, with many features and powerful extensions. Supports GPU acceleration.
  • KoboldCpp, a fully featured web UI, with GPU accel across all platforms and GPU architectures. Especially good for story telling.
  • GPT4All, a free and open source local running GUI, supporting Windows, Linux and macOS with full GPU accel.
  • LM Studio, an easy-to-use and powerful local GUI for Windows and macOS (Silicon), with GPU acceleration. Linux available, in beta as of 27/11/2023.
  • LoLLMS Web UI, a great web UI with many interesting and unique features, including a full model library for easy model selection.
  • Faraday.dev, an attractive and easy to use character-based chat GUI for Windows and macOS (both Silicon and Intel), with GPU acceleration.
  • llama-cpp-python, a Python library with GPU accel, LangChain support, and OpenAI-compatible API server.
  • candle, a Rust ML framework with a focus on performance, including GPU support, and ease of use.
  • ctransformers, a Python library with GPU accel, LangChain support, and OpenAI-compatible AI server. Note, as of time of writing (November 27th 2023), ctransformers has not been updated in a long time and does not support many recent models.

Use with llama.cpp

Install llama.cpp through brew.

brew install ggerganov/ggerganov/llama.cpp

Invoke the llama.cpp server or the CLI.

CLI:

llama-cli --hf-repo MoMonir/granite-8b-code-instruct-GGUF --model granite-8b-code-instruct.Q4_K_M.gguf -p "The meaning to life and the universe is"

Server:

llama-server --hf-repo MoMonir/granite-8b-code-instruct-GGUF --model granite-8b-code-instruct.Q4_K_M.gguf -c 2048

Note: You can also use this checkpoint directly through the usage steps listed in the Llama.cpp repo as well.

git clone https://github.com/ggerganov/llama.cpp &&             cd llama.cpp &&             make &&             ./main -m granite-8b-code-instruct.Q4_K_M.gguf -n 128