How to use from the
Use from the
llama-cpp-python library
# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="tuxqeq/tux-ai-chat",
	filename="",
)
llm.create_chat_completion(
	messages = "No input example has been defined for this model task."
)

tux-ai-chat

A Qwen3-8B chatbot fine-tuned on tokenized PII records via QLoRA.

What it does

All personally identifiable information is replaced with [TYPE_hash] placeholders (e.g. [PERSON_a1b2c3d4], [SSN_e5f6g7h8]). The model can:

  • Generate synthetic tokenized records
  • Answer questions about specific fields in a tokenized record
  • Summarize tokenized records while preserving all placeholders
  • Extract and reformat sections of tokenized records
  • Hold multi-turn conversations about records

It never emits raw PII and never attempts to decode placeholders.

Quickstart (Ollama)

ollama create tux-ai-chat -f Modelfile
ollama run tux-ai-chat

Example prompts:

Generate a customer record for a healthcare professional.
What is the SSN token in this record?
[paste tokenized record]

Training details

Setting Value
Base model Qwen/Qwen3-8B
Method QLoRA (4-bit, r=16, alpha=32)
Training data 1 000 synthetic tokenized records, ~6 000 chat examples
Epochs 3
Thinking mode Disabled (enable_thinking=False)
Quantization Q8_0 GGUF

Full project

github.com/tuxqeq/tux.ai

Downloads last month
227
GGUF
Model size
8B params
Architecture
qwen3
Hardware compatibility
Log In to add your hardware

We're not able to determine the quantization variants.

Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for tuxqeq/tux-ai-chat

Finetuned
Qwen/Qwen3-8B
Quantized
(317)
this model