Instructions to use NotHereNorThere/Qwemini-1.7b-Beta with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use NotHereNorThere/Qwemini-1.7b-Beta with Transformers:

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("NotHereNorThere/Qwemini-1.7b-Beta", dtype="auto")

llama-cpp-python

How to use NotHereNorThere/Qwemini-1.7b-Beta with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="NotHereNorThere/Qwemini-1.7b-Beta",
	filename="model-Q5_K_M.gguf",
)

llm.create_chat_completion(
	messages = "No input example has been defined for this model task."
)

Notebooks
Google Colab
Kaggle
Local Apps Settings

llama.cpp

How to use NotHereNorThere/Qwemini-1.7b-Beta with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf NotHereNorThere/Qwemini-1.7b-Beta:Q5_K_M
# Run inference directly in the terminal:
llama-cli -hf NotHereNorThere/Qwemini-1.7b-Beta:Q5_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf NotHereNorThere/Qwemini-1.7b-Beta:Q5_K_M
# Run inference directly in the terminal:
llama-cli -hf NotHereNorThere/Qwemini-1.7b-Beta:Q5_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf NotHereNorThere/Qwemini-1.7b-Beta:Q5_K_M
# Run inference directly in the terminal:
./llama-cli -hf NotHereNorThere/Qwemini-1.7b-Beta:Q5_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf NotHereNorThere/Qwemini-1.7b-Beta:Q5_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf NotHereNorThere/Qwemini-1.7b-Beta:Q5_K_M

Use Docker

docker model run hf.co/NotHereNorThere/Qwemini-1.7b-Beta:Q5_K_M

LM Studio
Jan
Ollama
How to use NotHereNorThere/Qwemini-1.7b-Beta with Ollama:
```
ollama run hf.co/NotHereNorThere/Qwemini-1.7b-Beta:Q5_K_M
```

Unsloth Studio

How to use NotHereNorThere/Qwemini-1.7b-Beta with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for NotHereNorThere/Qwemini-1.7b-Beta to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for NotHereNorThere/Qwemini-1.7b-Beta to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for NotHereNorThere/Qwemini-1.7b-Beta to start chatting

How to use NotHereNorThere/Qwemini-1.7b-Beta with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf NotHereNorThere/Qwemini-1.7b-Beta:Q5_K_M

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "NotHereNorThere/Qwemini-1.7b-Beta:Q5_K_M"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent new

How to use NotHereNorThere/Qwemini-1.7b-Beta with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf NotHereNorThere/Qwemini-1.7b-Beta:Q5_K_M

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default NotHereNorThere/Qwemini-1.7b-Beta:Q5_K_M

Run Hermes

hermes

Docker Model Runner
How to use NotHereNorThere/Qwemini-1.7b-Beta with Docker Model Runner:
```
docker model run hf.co/NotHereNorThere/Qwemini-1.7b-Beta:Q5_K_M
```

Lemonade

How to use NotHereNorThere/Qwemini-1.7b-Beta with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull NotHereNorThere/Qwemini-1.7b-Beta:Q5_K_M

Run and chat with the model

lemonade run user.Qwemini-1.7b-Beta-Q5_K_M

List all available models

lemonade list

Qwemini-1.7B-Beta

Qwen3-1.7B fine-tuned on 250 Gemini 3 Pro chain-of-thought traces.

The grown-up version of Qwemini-0.5B-Alpha. Same teacher, same approach, a model that actually has the architecture to use it.

What it is

QLoRA fine-tune of Qwen3-1.7B on 250 Gemini 3 Pro structured reasoning traces. Goal was pure style transfer, Qwen3 already knows how to reason, this teaches it how we want it to reason. The native <think> token support changes everything compared to the 0.5B predecessor.

Training

Setting	Value
Base model	`Qwen/Qwen3-1.7B`
Method	QLoRA (4-bit NF4, LoRA r=16)
Dataset	250 Gemini 3 Pro CoT traces
Hardware	RTX 4060 8GB
Attention	FlashAttention 2
Packing	Enabled

Eval results

Prompt	Result	Notes
Bat & ball ($1.10 problem)	⚠️ Wrong answer, right process	Got $0.10, but thinking block caught its own error and rationalized past it anyway
1/2 of 12 Fish drowning	⚠️ Near miss	Noted "ambiguity in the question's phrasing" inside think block, answered 6 anyway — closest any model got to catching the false premise
Jug problem (3gal + 5gal = 4gal)	✅ Correct strategy	Thinking block described the correct solution perfectly, written steps got slightly garbled
Pills trick (3 pills, every 30 min)	⚠️ Contradicted itself	Produced two different answers (60 min and 90 min) in the same response without resolving the conflict

The big finding

Thinking tags activated unprompted.

Qwen3's native thinking architecture survived the fine-tune intact. The model genuinely uses an internal scratchpad before answering rather than just formatting its output to look like reasoning. This is qualitatively different from every other model in the Qwemini/YapLlama/AtomCoT family — those learned the costume of reasoning. This one is actually thinking, just not always correctly.

Honest assessment

The failure modes are completely different from the smaller model, instead of confident wrong answers or structured nonsense, you get a model that notices problems, almost catches false premises, and occasionally argues with itself.

The bat and ball error is the most interesting result: the thinking block explicitly computed $1.20 ≠ $1.10 and then declared the solution valid anyway. It's not that it can't detect errors — it's that it doesn't always act on them. More data and more epochs would likely close this gap.

Compared to Qwemini-0.5B-Alpha

	0.5B-Alpha	1.7B-Beta
Native thinking tags	❌	✅
Bat & ball	✅ Correct	⚠️ Wrong but self-aware
Premise checking	❌	⚠️ Almost
Jug problem	❌ Hallucinated	✅ Correct strategy
Reasoning quality	Structured correct	Genuinely thinking

What would improve it

More epochs, loss was still healthy at checkpoint, room to keep learning
Premise-checking traces, it almost caught the fish problem, 50 targeted examples would probably close it
More data and more varietey (eg 6000 rows, Gemini 3.1 + Opus 4.6) is the natural next step

Part of

The Qwemini model family, Qwen models fine-tuned for structured reasoning.

Model	Params	Thinking tags	Actually reasons
Qwemini-0.5B-Alpha	500M	❌	✅ simple problems
Qwemini-1.7B-Beta	1.7B	✅	✅ with self-correction attempts

Downloads last month: 3

GGUF

Model size

2B params

Architecture

qwen3

Hardware compatibility

5-bit

16-bit

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for NotHereNorThere/Qwemini-1.7b-Beta

Base model

Qwen/Qwen3-1.7B-Base

Finetuned

Qwen/Qwen3-1.7B

Quantized

(281)

this model

Dataset used to train NotHereNorThere/Qwemini-1.7b-Beta

Collection including NotHereNorThere/Qwemini-1.7b-Beta

Qwemini

Collection

Qwen models fine tuned to use structured and self-correcting reasoning, primarily from Gemini 3.x models. • 2 items • Updated about 8 hours ago