Geraldine/FineLlama-3.2-3B-Instruct-ead-GGUF

Geraldine committed on Jan 1

Commit f5b5c97 · verified · 1 Parent(s): d2725dd

Update README.md

Files changed (1)

README.md +7 -1

@@ -41,8 +41,9 @@ The following quantized versions are available:
 ./server -m models/FineLlama-3.2-3B-Instruct-ead-Q4_K_M.gguf -c 4096
 ```
+### Example using llama-cpp-python library
 ```python
-# Example using llama-cpp-python library
 from llama_cpp import Llama
 query = "..."
 llm = Llama.from_pretrained(
@@ -62,6 +63,11 @@ output = llm.create_chat_completion(
 )
 print(output["choices"][0]["message"]["content"])
 ```
+### Example using Ollama
+```bash
+ollama run hf.co/Geraldine/FineLlama-3.2-3B-Instruct-ead-GGUF:Q4_K_M
+```
 ## Quantization Details