---
license: cc-by-nc-4.0
language:
- nl
tags:
- gguf
- llamacpp
- dpo
- geitje
- conversational
datasets:
- BramVanroy/ultra_feedback_dutch
---

<img src="https://huggingface.co/BramVanroy/GEITje-7B-ultra/resolve/main/geitje-ultra-banner.png" alt="GEITje Ultra banner" width="800" style="margin-left:'auto' margin-right:'auto' display:'block'"/>

# GEITje 7B ultra (GGUF version)

This is a `Q5_K_M` GGUF version of [BramVanroy/GEITje-7B-ultra](https://huggingface.co/BramVanroy/GEITje-7B-ultra). For more information on the model, data, licensing, and usage, see the main model's README.

## Repro

Assuming you have installed and built llama.cpp: the download below is run from the `build` directory; the conversion commands are then run from the llama.cpp root.

Download the initial model (a `huggingface-cli` alternative probably exists, too):
```python
from huggingface_hub import snapshot_download

model_id = "BramVanroy/GEITje-7B-ultra"
snapshot_download(repo_id=model_id, local_dir="geitje-ultra-hf", local_dir_use_symlinks=False)
```

Convert to GGUF format and quantize to `Q5_K_M`:
```shell
# Convert the HF checkpoint to a GGUF file (run from the llama.cpp root)
python convert.py build/geitje-ultra-hf/

cd build

# Quantize to Q5_K_M
bin/quantize geitje-ultra-hf/ggml-model-f32.gguf geitje-ultra-hf/GEITje-7B-ultra-Q5_K_M.gguf Q5_K_M
```