lsetiawan committed
Commit
f1dc21d
1 Parent(s): c1e2ef8

Update README.md

Files changed (1)
  1. README.md +51 -3
README.md CHANGED
@@ -5,7 +5,55 @@ language:
  pipeline_tag: text-generation
  ---

- # OLMo-7B-Instruct-GGUF

- This repo contains GGUF files of the [ssec-uw/OLMo-7B-Instruct-hf](https://huggingface.co/ssec-uw/OLMo-7B-Instruct-hf) which is derived from [allenai/OLMo-7B-Instruct](https://huggingface.co/allenai/OLMo-7B-Instruct) model.
- These files can be used with llama.cpp or other software like ollama and LM Studio.
+ # OLMo 7B-Instruct-GGUF

+ > For more details on OLMo-7B-Instruct, refer to [Allen AI's OLMo-7B-Instruct model card](https://huggingface.co/allenai/OLMo-7B-Instruct).
+
+ OLMo is a series of **O**pen **L**anguage **Mo**dels designed to enable the science of language models.
+ The OLMo base models are trained on the [Dolma](https://huggingface.co/datasets/allenai/dolma) dataset.
+ The Instruct version is trained on the [cleaned version of the UltraFeedback dataset](https://huggingface.co/datasets/allenai/ultrafeedback_binarized_cleaned).
+
+ OLMo 7B Instruct is trained for better question answering and demonstrates the performance gains that OLMo base models can achieve with existing fine-tuning techniques.
+
+ This version of the model is derived from [ssec-uw/OLMo-7B-Instruct-hf](https://huggingface.co/ssec-uw/OLMo-7B-Instruct-hf) and converted to the [GGUF format](https://huggingface.co/docs/hub/en/gguf),
+ a binary format that is optimized for quick loading and saving of models, making it highly efficient for inference.
+
+ In addition to being in GGUF format, the model has been [quantized](https://huggingface.co/docs/optimum/en/concept_guides/quantization)
+ to reduce the computational and memory costs of running inference. *We are currently working on adding all of the [Quantization Types](https://huggingface.co/docs/hub/en/gguf#quantization-types).*
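+
+ For instance, you can list the GGUF files currently available in this repo, and hence the quantization types already uploaded, with the `huggingface_hub` client. This is a minimal sketch, assuming `huggingface_hub` is installed (`pip install huggingface_hub`):
+
+ ```python
+ from huggingface_hub import list_repo_files
+
+ # List every file in the repo and keep only the GGUF model files;
+ # the quantization type (e.g. Q4_K_M) is encoded in each filename.
+ gguf_files = [f for f in list_repo_files("ssec-uw/OLMo-7B-Instruct-GGUF") if f.endswith(".gguf")]
+ print("\n".join(gguf_files))
+ ```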
+
+ These files are designed for use with [GGML](https://ggml.ai/) and executors based on GGML such as [llama.cpp](https://github.com/ggerganov/llama.cpp).
+
+ ## Get Started
+
+ To get started with one of the GGUF files, you can simply use [llama-cpp-python](https://github.com/abetlen/llama-cpp-python),
+ a Python binding for `llama.cpp`.
+
+ 1. Install `llama-cpp-python` with pip.
+
+ ```bash
+ pip install llama-cpp-python
+ ```
+
+ 2. Download one of the GGUF files. In this example,
+ we will download [OLMo-7B-Instruct-Q4_K_M.gguf](https://huggingface.co/ssec-uw/OLMo-7B-Instruct-GGUF/resolve/main/OLMo-7B-Instruct-Q4_K_M.gguf?download=true),
+ which starts downloading when the link is clicked; a programmatic alternative is sketched below.
+
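+ If you prefer not to download through the browser, here is a minimal sketch of fetching the same file with the `huggingface_hub` client (again assuming `huggingface_hub` is installed):
+
+ ```python
+ from huggingface_hub import hf_hub_download
+
+ # Download the Q4_K_M quantized GGUF file from this repo;
+ # the returned value is the local path to the cached file.
+ model_path = hf_hub_download(
+     repo_id="ssec-uw/OLMo-7B-Instruct-GGUF",
+     filename="OLMo-7B-Instruct-Q4_K_M.gguf",
+ )
+ print(model_path)
+ ```
+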
+ 3. Open up a Python interpreter and run the following commands.
+ For example, we can ask it: `What is a solar system?`
+ *You will need to modify the `model_path` argument to point to where
+ the GGUF model is saved on your system.*
+
+ ```python
+ from llama_cpp import Llama
+
+ # Load the quantized GGUF model;
+ # update model_path to the location of the file on your system
+ llm = Llama(
+     model_path="path/to/OLMo-7B-Instruct-Q4_K_M.gguf"
+ )
+
+ # Run a text completion; echo=True includes the prompt in the output
+ result_dict = llm(prompt="What is a solar system?", echo=True, max_tokens=500)
+ print(result_dict['choices'][0]['text'])
+ ```
+
+ 4. That's it, you should see the result fairly quickly! Have fun! 🤖
+
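+ If you prefer a chat-style interface, `llama-cpp-python` also exposes `create_chat_completion`. The sketch below reuses the `llm` object from step 3; how the chat template is applied depends on the metadata stored in the GGUF file, so treat this as an illustration rather than the definitive usage:
+
+ ```python
+ # Chat-style completion with a single user message,
+ # reusing the Llama object created in step 3
+ response = llm.create_chat_completion(
+     messages=[{"role": "user", "content": "What is a solar system?"}],
+     max_tokens=500,
+ )
+ print(response["choices"][0]["message"]["content"])
+ ```
+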
+ ## Contact
+
+ For errors in this model card, contact Don or Anant, {landungs, anmittal} at uw dot edu.