Instructions to use mlx-community/Ouro-2.6B-Thinking-4bit with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use mlx-community/Ouro-2.6B-Thinking-4bit with MLX:
    # Make sure mlx-lm is installed
    # pip install --upgrade mlx-lm

    # Generate text with mlx-lm
    from mlx_lm import load, generate

    model, tokenizer = load("mlx-community/Ouro-2.6B-Thinking-4bit")

    prompt = "Write a story about Einstein"
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

    text = generate(model, tokenizer, prompt=prompt, verbose=True)
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- LM Studio
- MLX LM
How to use mlx-community/Ouro-2.6B-Thinking-4bit with MLX LM:
Generate or start a chat session
    # Install MLX LM
    uv tool install mlx-lm

    # Interactive chat REPL
    mlx_lm.chat --model "mlx-community/Ouro-2.6B-Thinking-4bit"
Run an OpenAI-compatible server
    # Install MLX LM
    uv tool install mlx-lm

    # Start the server (the port is set explicitly so it matches the curl call below)
    mlx_lm.server --model "mlx-community/Ouro-2.6B-Thinking-4bit" --port 8000

    # Calling the OpenAI-compatible server with curl
    curl -X POST "http://localhost:8000/v1/chat/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "mlx-community/Ouro-2.6B-Thinking-4bit",
        "messages": [
          {"role": "user", "content": "Hello"}
        ]
      }'
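The same request can be sent from Python. This is a minimal sketch using only the standard library. The helper names `build_chat_request` and `chat` are hypothetical, the base URL is taken from the curl example and has to match the port the server actually listens on, and the reply is assumed to follow the usual OpenAI chat-completions shape (`choices[0].message.content`).

```python
import json
import urllib.request


def build_chat_request(prompt,
                       model="mlx-community/Ouro-2.6B-Thinking-4bit",
                       base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text.

    Assumes an OpenAI-style response body; adjust the lookup if the
    server returns something different.
    """
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With a server running, `print(chat("Hello"))` would send the same message as the curl call. Building the request is kept separate from sending it so the payload can be inspected without a live server.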
No MLX support of model type ouro - How to run?
#1
by pculebras - opened
I was trying to run this model locally and mlx_lm keeps throwing the error: ValueError: Model type ouro not supported.
Is it safe to assume that it was quantized with MLX, but we have to wait for MLX to support the model type?
Try upgrading the mlx-lm version you're using.
Hey! Support for this model was never merged, but there's a PR here: https://github.com/ml-explore/mlx-lm/pull/599
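One way to tell whether an installed mlx-lm knows about a model type before downloading weights is to look for a matching module. The sketch below assumes, based on how the loader appears to work, that mlx-lm resolves the `model_type` from `config.json` to a module named `mlx_lm.models.<model_type>`. The helper `has_submodule` is hypothetical, and the check can report a false negative for types that mlx-lm remaps internally to another architecture.

```python
import importlib.util


def has_submodule(package, name):
    """Return True if `package.name` can be located, without importing `name`.

    find_spec() imports the parent package, so a missing parent is
    reported as "not available" rather than raised.
    """
    try:
        return importlib.util.find_spec(f"{package}.{name}") is not None
    except ModuleNotFoundError:
        return False


def explain(model_type, package="mlx_lm.models"):
    """Return a one-line diagnosis for a model_type string."""
    if has_submodule(package, model_type):
        return f"{package}.{model_type} found: this install should load it."
    return (f"{package}.{model_type} not found: expect "
            f"'Model type {model_type} not supported' from this install.")
```

Calling `explain("ouro")` on an install without the changes from the PR above should take the "not found" branch, which matches the `ValueError` reported in this thread.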