Image-Text-to-Text
Transformers
Safetensors
MLX
English
gemma4
creative
creative writing
fiction writing
plot generation
sub-plot generation
story generation
scene continue
storytelling
fiction story
science fiction
romance
all genres
story
writing
vivid prosing
vivid writing
fiction
roleplaying
float32
swearing
rp
horror
della
mistral
Merge
mergekit
mlx-my-repo
conversational
8-bit precision
Instructions to use McG-221/Goetia-31B-v1-mlx-8Bit with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use McG-221/Goetia-31B-v1-mlx-8Bit with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="McG-221/Goetia-31B-v1-mlx-8Bit")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("McG-221/Goetia-31B-v1-mlx-8Bit")
model = AutoModelForMultimodalLM.from_pretrained("McG-221/Goetia-31B-v1-mlx-8Bit")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- MLX
How to use McG-221/Goetia-31B-v1-mlx-8Bit with MLX:
# Make sure mlx-vlm is installed
# pip install --upgrade mlx-vlm
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

# Load the model
model, processor = load("McG-221/Goetia-31B-v1-mlx-8Bit")
config = load_config("McG-221/Goetia-31B-v1-mlx-8Bit")

# Prepare input
image = ["http://images.cocodataset.org/val2017/000000039769.jpg"]
prompt = "Describe this image."

# Apply chat template
formatted_prompt = apply_chat_template(
    processor, config, prompt, num_images=1
)

# Generate output
output = generate(model, processor, formatted_prompt, image)
print(output)
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- LM Studio
- vLLM
How to use McG-221/Goetia-31B-v1-mlx-8Bit with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "McG-221/Goetia-31B-v1-mlx-8Bit"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "McG-221/Goetia-31B-v1-mlx-8Bit",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "Describe this image in one sentence."
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                        }
                    }
                ]
            }
        ]
    }'
Use Docker
docker model run hf.co/McG-221/Goetia-31B-v1-mlx-8Bit
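The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library; it assumes the vLLM server is already running on localhost:8000 as in the pip instructions, and the helper names `build_chat_payload` and `chat` are hypothetical, chosen here for illustration.

```python
import json
import urllib.request

# Assumption: the vLLM server started above is listening on localhost:8000.
BASE_URL = "http://localhost:8000/v1"
MODEL_ID = "McG-221/Goetia-31B-v1-mlx-8Bit"


def build_chat_payload(text, image_url=None, model=MODEL_ID):
    """Build an OpenAI-style chat completion body with an optional image."""
    content = [{"type": "text", "text": text}]
    if image_url is not None:
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    return {"model": model, "messages": [{"role": "user", "content": content}]}


def chat(text, image_url=None, base_url=BASE_URL, timeout=120):
    """POST the payload to /chat/completions and return the reply text."""
    body = json.dumps(build_chat_payload(text, image_url)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.load(response)
    return reply["choices"][0]["message"]["content"]


# Example call (requires the running server):
# print(chat("Describe this image in one sentence.",
#            image_url="https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"))
```

Keeping the payload builder separate from the network call makes it possible to inspect or log the request body without a server running.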
- SGLang
How to use McG-221/Goetia-31B-v1-mlx-8Bit with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "McG-221/Goetia-31B-v1-mlx-8Bit" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "McG-221/Goetia-31B-v1-mlx-8Bit",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "Describe this image in one sentence."
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                        }
                    }
                ]
            }
        ]
    }'
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "McG-221/Goetia-31B-v1-mlx-8Bit" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "McG-221/Goetia-31B-v1-mlx-8Bit",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "Describe this image in one sentence."
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                        }
                    }
                ]
            }
        ]
    }'
- Pi
How to use McG-221/Goetia-31B-v1-mlx-8Bit with Pi:
Start the MLX server
# Install MLX LM:
uv tool install mlx-lm

# Start a local OpenAI-compatible server:
mlx_lm.server --model "McG-221/Goetia-31B-v1-mlx-8Bit"
Configure the model in Pi
# Install Pi:
npm install -g @mariozechner/pi-coding-agent

# Add to ~/.pi/agent/models.json:
{
    "providers": {
        "mlx-lm": {
            "baseUrl": "http://localhost:8080/v1",
            "api": "openai-completions",
            "apiKey": "none",
            "models": [
                {
                    "id": "McG-221/Goetia-31B-v1-mlx-8Bit"
                }
            ]
        }
    }
}
Run Pi
# Start Pi in your project directory:
pi
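If ~/.pi/agent/models.json already contains other providers, the entry can be merged in rather than pasted by hand. This is a rough sketch, not part of Pi itself: the `add_provider` helper is hypothetical, and it assumes the file is plain JSON with a top-level "providers" object as shown in the configuration step.

```python
import json
from pathlib import Path

# Provider entry copied from the configuration step above.
PROVIDER_NAME = "mlx-lm"
PROVIDER_ENTRY = {
    "baseUrl": "http://localhost:8080/v1",
    "api": "openai-completions",
    "apiKey": "none",
    "models": [{"id": "McG-221/Goetia-31B-v1-mlx-8Bit"}],
}


def add_provider(config_path, name=PROVIDER_NAME, entry=PROVIDER_ENTRY):
    """Insert or overwrite one provider entry, keeping any other providers."""
    path = Path(config_path).expanduser()
    text = path.read_text() if path.exists() else ""
    # Treat a missing or empty file as an empty configuration.
    config = json.loads(text) if text.strip() else {}
    config.setdefault("providers", {})[name] = entry
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Example call (writes to the path named in the step above):
# add_provider("~/.pi/agent/models.json")
```

An existing provider with the same name is replaced; all other keys in the file are written back unchanged.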
- Hermes Agent
How to use McG-221/Goetia-31B-v1-mlx-8Bit with Hermes Agent:
Start the MLX server
# Install MLX LM:
uv tool install mlx-lm

# Start a local OpenAI-compatible server:
mlx_lm.server --model "McG-221/Goetia-31B-v1-mlx-8Bit"
Configure Hermes
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup

# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default McG-221/Goetia-31B-v1-mlx-8Bit
Run Hermes
hermes
- Docker Model Runner
How to use McG-221/Goetia-31B-v1-mlx-8Bit with Docker Model Runner:
docker model run hf.co/McG-221/Goetia-31B-v1-mlx-8Bit
McG-221/Goetia-31B-v1-mlx-8Bit
The model McG-221/Goetia-31B-v1-mlx-8Bit was converted to MLX format from Naphula/Goetia-31B-v1 using mlx-lm version 0.31.2.
Use with mlx
pip install mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("McG-221/Goetia-31B-v1-mlx-8Bit")

prompt = "hello"

if hasattr(tokenizer, "apply_chat_template") and tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
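For repeated calls, the chat-template check above can be wrapped in a small helper. The sketch below is illustrative only: `build_prompt` is a hypothetical name, and whether this model's chat template accepts a separate system role is an assumption that should be checked against the tokenizer before relying on it.

```python
def build_prompt(tokenizer, user_text, system_text=None):
    """Return a prompt string, applying the chat template when one exists."""
    template = getattr(tokenizer, "chat_template", None)
    if not hasattr(tokenizer, "apply_chat_template") or template is None:
        # No chat template: fall back to plain text.
        if system_text is None:
            return user_text
        return f"{system_text}\n\n{user_text}"

    messages = []
    if system_text is not None:
        # Assumption: the template supports a "system" role.
        messages.append({"role": "system", "content": system_text})
    messages.append({"role": "user", "content": user_text})
    return tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )


# Example use with mlx-lm (requires the model to be downloaded):
# from mlx_lm import load, generate
# model, tokenizer = load("McG-221/Goetia-31B-v1-mlx-8Bit")
# prompt = build_prompt(tokenizer, "Continue the scene.")
# response = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
```

The helper takes any object with the same two attributes as the tokenizer, so it can be exercised with a stub before loading the full model.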
Downloads last month
32
Model size
31B params
Tensor type
BF16 · U32
Model tree for McG-221/Goetia-31B-v1-mlx-8Bit
Base model
Naphula/Goetia-31B-v1
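The figures listed on this page (31B parameters, 8-bit precision) allow a rough estimate of how much memory the weights alone need. The sketch below is back-of-the-envelope arithmetic, not a measurement: the group size of 64 and the 32 bits of per-group scale and bias overhead are assumptions about the quantization layout, and the KV cache, activations, and any tensors left in BF16 are not counted.

```python
def weight_memory_gb(num_params, bits_per_weight, group_size=None,
                     overhead_bits_per_group=32):
    """Estimate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    effective_bits = bits_per_weight
    if group_size:
        # Assumption: each group of weights carries extra scale/bias bits.
        effective_bits += overhead_bits_per_group / group_size
    return num_params * effective_bits / 8 / 1e9


# 31B parameters at exactly 8 bits each.
print(weight_memory_gb(31e9, 8))                 # 31.0
# Same, with the assumed per-group overhead (8.5 effective bits per weight).
print(weight_memory_gb(31e9, 8, group_size=64))  # 32.9375
```

Under these assumptions the weights occupy roughly 31 to 33 GB, so the machine's unified memory needs comfortable headroom beyond that for the context cache.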