Instructions to use ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF with llama-cpp-python:
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF", filename="Home-Cook-Mistral-Small-Omni-2507-Q4_K_M.gguf", )
llm.create_chat_completion( messages = "No input example has been defined for this model task." )
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- llama.cpp
How to use ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF with llama.cpp:
Install from brew
brew install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF:Q4_K_M # Run inference directly in the terminal: llama-cli -hf ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF:Q4_K_M # Run inference directly in the terminal: llama-cli -hf ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF:Q4_K_M
Use pre-built binary
# Download pre-built binary from: # https://github.com/ggerganov/llama.cpp/releases # Start a local OpenAI-compatible server with a web UI: ./llama-server -hf ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF:Q4_K_M # Run inference directly in the terminal: ./llama-cli -hf ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git cd llama.cpp cmake -B build cmake --build build -j --target llama-server llama-cli # Start a local OpenAI-compatible server with a web UI: ./build/bin/llama-server -hf ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF:Q4_K_M # Run inference directly in the terminal: ./build/bin/llama-cli -hf ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF:Q4_K_M
Use Docker
docker model run hf.co/ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF:Q4_K_M
- LM Studio
- Jan
- Ollama
How to use ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF with Ollama:
ollama run hf.co/ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF:Q4_K_M
- Unsloth Studio
How to use ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF to start chatting
- Pi
How to use ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF with Pi:
Start the llama.cpp server
# Install llama.cpp: brew install llama.cpp # Start a local OpenAI-compatible server: llama-server -hf ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF:Q4_K_M
Configure the model in Pi
# Install Pi: npm install -g @mariozechner/pi-coding-agent # Add to ~/.pi/agent/models.json: { "providers": { "llama-cpp": { "baseUrl": "http://localhost:8080/v1", "api": "openai-completions", "apiKey": "none", "models": [ { "id": "ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF:Q4_K_M" } ] } } }Run Pi
# Start Pi in your project directory: pi
- Hermes Agent new
How to use ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp: brew install llama.cpp # Start a local OpenAI-compatible server: llama-server -hf ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF:Q4_K_M
Configure Hermes
# Install Hermes: curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash hermes setup # Point Hermes at the local server: hermes config set model.provider custom hermes config set model.base_url http://127.0.0.1:8080/v1 hermes config set model.default ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF:Q4_K_M
Run Hermes
hermes
- Atomic Chat new
- Docker Model Runner
How to use ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF with Docker Model Runner:
docker model run hf.co/ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF:Q4_K_M
- Lemonade
How to use ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/ lemonade pull ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF:Q4_K_M
Run and chat with the model
lemonade run user.Home-Cook-Mistral-Small-Omni-24B-2507-GGUF-Q4_K_M
List all available models
lemonade list
| #!/usr/bin/env python3 | |
| from __future__ import annotations | |
| import logging | |
| import argparse | |
| import os | |
| import sys | |
| import json | |
| from pathlib import Path | |
| from tqdm import tqdm | |
| from typing import Any, Sequence, NamedTuple | |
| # Necessary to load the local gguf package | |
| if "NO_LOCAL_GGUF" not in os.environ and (Path(__file__).parent.parent.parent.parent / 'gguf-py').exists(): | |
| sys.path.insert(0, str(Path(__file__).parent.parent.parent)) | |
| import gguf | |
| logger = logging.getLogger("gguf-mmproj-merge") | |
| class MetadataDetails(NamedTuple): | |
| type: gguf.GGUFValueType | |
| value: Any | |
| description: str = '' | |
| sub_type: gguf.GGUFValueType | None = None | |
| def get_field_data(reader: gguf.GGUFReader, key: str) -> Any: | |
| field = reader.get_field(key) | |
| return field.contents() if field else None | |
| def merge_multiple_ggufs(readers: Sequence[gguf.GGUFReader], writer: gguf.GGUFWriter) -> None: | |
| total_bytes = 0 | |
| seen_fields = set() | |
| for reader in readers: | |
| for field in reader.fields.values(): | |
| # Suppress virtual fields and fields written by GGUFWriter | |
| if field.name == gguf.Keys.General.ARCHITECTURE or field.name.startswith('GGUF.') or "projector_type" in field.name: | |
| logger.debug(f'Suppressing {field.name}') | |
| continue | |
| if field.name in seen_fields: | |
| logger.debug(f'Skipping duplicate field {field.name}') | |
| continue | |
| seen_fields.add(field.name) | |
| val_type = field.types[0] | |
| sub_type = field.types[-1] if val_type == gguf.GGUFValueType.ARRAY else None | |
| old_val = MetadataDetails(val_type, field.contents(), sub_type=sub_type) | |
| val = old_val | |
| assert val.value is not None | |
| logger.debug(f'Copying {field.name}') | |
| writer.add_key_value(field.name, val.value, val.type, sub_type=sub_type if val.sub_type is None else val.sub_type) | |
| for tensor in reader.tensors: | |
| total_bytes += tensor.n_bytes | |
| writer.add_tensor_info(tensor.name, tensor.data.shape, tensor.data.dtype, tensor.data.nbytes, tensor.tensor_type) | |
| bar = tqdm(desc="Writing", total=total_bytes, unit="byte", unit_scale=True) | |
| writer.add_string("clip.vision.projector_type", "pixtral") | |
| writer.add_string("clip.audio.projector_type", "voxtral") | |
| writer.write_header_to_file() | |
| writer.write_kv_data_to_file() | |
| writer.write_ti_data_to_file() | |
| for reader in readers: | |
| for tensor in reader.tensors: | |
| writer.write_tensor_data(tensor.data) | |
| bar.update(tensor.n_bytes) | |
| writer.close() | |
| def main() -> None: | |
| reader0 = gguf.GGUFReader('audio.gguf', 'r') | |
| reader1 = gguf.GGUFReader('vision.gguf', 'r') | |
| output_path = 'mmproj-model.gguf' | |
| logger.info(f'* Writing: {output_path}') | |
| writer = gguf.GGUFWriter(output_path, arch='clip', endianess=reader0.endianess) | |
| merge_multiple_ggufs([reader0, reader1], writer) | |
| if __name__ == '__main__': | |
| main() | |