osmanorhan committed • Commit 7a7ce0e • 1 Parent(s): 10a3311
Update README.md

README.md CHANGED
---
license: llama3.1
language:
- tr
---

This is a quantized version of the BrewInteractive/fikri-3.1-8B-Instruct model.

* Original model: fikri-3.1-8B-Instruct
* Base model: LLaMA-3.1-8B
* Quantization: Q4_K_M

* Optimized for faster inference and reduced memory usage while maintaining performance
* Built on the LLaMA 3.1 architecture (8B)
* Fine-tuned for Turkish language tasks
* Quantized for improved efficiency

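The metadata listed above is also embedded in the GGUF file itself. As an optional check (not part of the original card), the `gguf-dump` tool from the `gguf` Python package (llama.cpp's gguf-py) can print it; the filename below matches the one used in the run command further down:

```
# Optional: inspect the GGUF metadata (architecture, quantization type, etc.).
pip install gguf
gguf-dump ./fikri-3.1-8B-Instruct-Q4_K_M.gguf
```
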
# How to use

1. Install llama.cpp:
   * For macOS, use Homebrew:
     ```
     brew install llama.cpp
     ```
   * For other operating systems, follow the installation instructions on the [llama.cpp GitHub repository](https://github.com/ggerganov/llama.cpp).

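A quick sanity check, not part of the original steps: once installed, the CLI can report its build.

```
# Prints llama.cpp version/build information if the install succeeded.
llama-cli --version
```
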
2. Download the quantized GGUF file from this repository's Files section.

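Alternatively, here is a sketch for downloading from the command line with `huggingface-cli` (from the `huggingface_hub` package); `<this-repo-id>` is a placeholder for this repository's actual ID, which the card does not spell out:

```
pip install -U huggingface_hub
# Replace <this-repo-id> with the repository ID shown at the top of the model page.
huggingface-cli download <this-repo-id> fikri-3.1-8B-Instruct-Q4_K_M.gguf --local-dir .
```
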
3. Run the following command for conversation mode:

   ```
   llama-cli -m ./fikri-3.1-8B-Instruct-Q4_K_M.gguf --no-mmap -fa -c 4096 --temp 0.8 -if --in-prefix "<|start_header_id|>user<|end_header_id|>\n\n" --in-suffix "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
   ```
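
For reference, the flags above follow standard llama.cpp options: `-m` selects the model file, `--no-mmap` loads it fully into RAM instead of memory-mapping, `-fa` enables flash attention, `-c 4096` sets the context length, `--temp 0.8` sets the sampling temperature, `-if` starts interactive (conversation) mode, and `--in-prefix`/`--in-suffix` wrap each user turn in the Llama 3.1 chat template tokens.

If you would rather query the model over HTTP, here is a minimal sketch (not from the original card) using `llama-server`, assuming a llama.cpp build recent enough to expose the OpenAI-compatible `/v1/chat/completions` endpoint:

```
# Serve the same model with a 4096-token context on port 8080.
llama-server -m ./fikri-3.1-8B-Instruct-Q4_K_M.gguf -c 4096 --port 8080

# From another terminal: send a chat request (Turkish prompt as an example).
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Merhaba, kendini kısaca tanıtır mısın?"}], "temperature": 0.8}'
```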