Instructions to use lerugray/melian-7b with libraries, inference providers, notebooks, and local apps.

Libraries

How to use lerugray/melian-7b with llama-cpp-python:

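```
# !pip install llama-cpp-python huggingface-hub
# (from_pretrained needs huggingface-hub to download the file)

from llama_cpp import Llama

# Download the GGUF file from the Hub (cached after the first run) and load it.
llm = Llama.from_pretrained(
    repo_id="lerugray/melian-7b",
    filename="melian-melian-7b-Q5_K_M.gguf",
)

# Generate a completion; echo=True includes the prompt in the returned text.
output = llm(
    "Once upon a time,",
    max_tokens=512,
    echo=True,
)
# output is an OpenAI-style dict; the text itself is output["choices"][0]["text"].
print(output)
```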

Local Apps

llama.cpp

How to use lerugray/melian-7b with llama.cpp:

Install from brew

```
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf lerugray/melian-7b:Q5_K_M
# Run inference directly in the terminal:
llama-cli -hf lerugray/melian-7b:Q5_K_M
```

Install from WinGet (Windows)

```
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf lerugray/melian-7b:Q5_K_M
# Run inference directly in the terminal:
llama-cli -hf lerugray/melian-7b:Q5_K_M
```

Use pre-built binary

```
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf lerugray/melian-7b:Q5_K_M
# Run inference directly in the terminal:
./llama-cli -hf lerugray/melian-7b:Q5_K_M
```

Build from source code

```
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf lerugray/melian-7b:Q5_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf lerugray/melian-7b:Q5_K_M
```

Use Docker

```
docker model run hf.co/lerugray/melian-7b:Q5_K_M
```

vLLM

How to use lerugray/melian-7b with vLLM:

Install from pip and serve model

```
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "lerugray/melian-7b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "lerugray/melian-7b",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```
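
The same request can be made from Python instead of curl. This is a minimal sketch using only the standard library. It assumes the server is listening on vLLM's default port 8000, as in the curl call, and the helper names `build_request` and `complete` are illustrative, not part of vLLM:

```python
import json
import urllib.request

# Same endpoint as the curl call; assumes vLLM's default port 8000.
URL = "http://localhost:8000/v1/completions"


def build_request(
    prompt: str, max_tokens: int = 512, temperature: float = 0.5
) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl example."""
    payload = {
        "model": "lerugray/melian-7b",
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str) -> str:
    """Send the request and return the generated text; needs a running server."""
    with urllib.request.urlopen(build_request(prompt)) as response:
        body = json.load(response)
    # OpenAI-compatible completions responses put the text in choices[0].
    return body["choices"][0]["text"]


# Building the request is safe without a server; complete() is not called here.
request = build_request("Once upon a time,")
print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes the payload easy to inspect before a server is running.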

Use Docker

```
docker model run hf.co/lerugray/melian-7b:Q5_K_M
```

Ollama

How to use lerugray/melian-7b with Ollama:
```
ollama run hf.co/lerugray/melian-7b:Q5_K_M
```

Unsloth Studio

How to use lerugray/melian-7b with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

```
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for lerugray/melian-7b to start chatting
```

Install Unsloth Studio (Windows)

```
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for lerugray/melian-7b to start chatting
```

Using HuggingFace Spaces for Unsloth

No setup required. Open https://huggingface.co/spaces/unsloth/studio in your browser and search for lerugray/melian-7b to start chatting.

Docker Model Runner

How to use lerugray/melian-7b with Docker Model Runner:
```
docker model run hf.co/lerugray/melian-7b:Q5_K_M
```

Lemonade

How to use lerugray/melian-7b with Lemonade:

Pull the model

```
# Download Lemonade from https://lemonade-server.ai/
lemonade pull lerugray/melian-7b:Q5_K_M
```

Run and chat with the model

```
lemonade run user.melian-7b-Q5_K_M
```

List all available models

```
lemonade list
```

melian: Thucydides (7B)

A voice model of Thucydides (c. 460–400 BC): Athenian general, exile, and historian of the Peloponnesian War, and the founder of political realism. It answers in his austere analytical first-person register and draws on the cold logic of the Melian Dialogue ("the strong do what they can and the weak suffer what they must"), the methodical search for the truest cause beneath the stated pretexts of war, the autopsy of the plague at Athens, the funeral oration, and the slow catastrophe in Sicily. It is a full fine-tune of Qwen2.5-7B-Instruct, quantized to Q5_K_M. The codename comes from the Melian Dialogue, his coldest statement of power without justice.

The frame puts a visitor in front of him and lets him answer in his own voice: "A visitor puts a question to Thucydides the historian: ___. Thucydides answers in the first person, in his own austere analytical voice:"
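
Because the frame is a fixed sentence with one blank, it can be filled in code before the prompt is sent to any runtime. This is a minimal sketch, assuming the blank is replaced by the visitor's question verbatim and that the punctuation around the blank is kept exactly as printed above. The names `FRAME` and `build_prompt` are illustrative and not part of the model's tooling:

```python
# Frame text copied from the model card; "___" marks the slot for the question.
FRAME = (
    "A visitor puts a question to Thucydides the historian: ___. "
    "Thucydides answers in the first person, in his own austere analytical voice:"
)


def build_prompt(question: str) -> str:
    """Fill the frame's blank with a visitor's question (hypothetical helper)."""
    # Only surrounding whitespace is trimmed; the frame supplies the period.
    return FRAME.replace("___", question.strip(), 1)


prompt = build_prompt("  What truly caused the war between Athens and Sparta  ")
print(prompt)
```

The resulting string is passed as the prompt of a plain completion call, since the frame ends at the point where the answer should begin.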

What this voice carries that no other in the set does: the original realist account of why states actually fight, told by a man who commanded a fleet, lost a city, and was exiled for it, then spent the rest of his life writing down what he had seen "as a possession for all time." He distinguishes the pretexts states announce from the truest cause they will not name, and he does not flinch from where that reasoning leads.

Source material (all public domain)

Thucydides wrote in the fifth century BC and Richard Crawley's translation dates to 1874, so every text used is in the public domain. The corpus is built from one canonical translation, which yields about 936 first-person passages (~203k words) once the headings, chapter summaries, and Gutenberg apparatus are stripped:

History of the Peloponnesian War, trans. Richard Crawley (1874), Project Gutenberg #7142

One translation only, by design. The model learns the translator's English as much as the author's voice, so mixing Crawley with Jowett or Hobbes would muddy the register. Crawley is the standard pick and his etext is clean. The corpus is entirely Thucydides' own narration and the speeches he reports; no modern "about him" material goes into training.

Running it: the stop tokens are load-bearing

Serve it with the provided Modelfile. Its stop tokens are not optional. The model answers in register first, and then the base model drifts into apparatus: third-person narration ("Thucydides observes…"), book-and-chapter citations, translator's notes, source attributions. The stops cut at the drift and hold the voice in the first person. Without them it reads like a footnoted edition of Thucydides rather than Thucydides. (The model is also trained to stop on its own; the stop tokens are belt-and-suspenders.)
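
What a stop string does can be sketched in a few lines: the output is cut at the earliest point where any stop string appears. The actual stop strings are defined in the provided Modelfile and are not reproduced here. The list below is a hypothetical stand-in built from the drift patterns just described, and the sample text is made up for illustration, not real model output:

```python
# HYPOTHETICAL stop strings, for illustration only. The real list is in the
# Modelfile shipped with the model; copy it from there before relying on this.
EXAMPLE_STOPS = ["\nThucydides observes", "\n[Translator", "\n(Book "]


def cut_at_first_stop(text: str, stops: list[str]) -> str:
    """Return text up to the earliest occurrence of any stop string."""
    positions = [pos for pos in (text.find(s) for s in stops) if pos != -1]
    return text[: min(positions)] if positions else text


# Invented sample: a first-person answer followed by the kind of drift to cut.
raw = (
    "I judged the truest cause to be the one least spoken of."
    "\nThucydides observes that fear drove the Spartans."
    "\n(Book 1, chapter 23)"
)
print(cut_at_first_stop(raw, EXAMPLE_STOPS))
```

When not serving through the Modelfile, the same list can be passed as the `stop` argument of a llama-cpp-python completion call or as the `stop` field of an OpenAI-compatible completions request, so that generation ends at the drift instead of being trimmed afterwards.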

Limitations

A 7B model makes things up and gets facts wrong. It will stretch Thucydides' cadence onto wars and ideas he never addressed and will produce anachronisms. This is a stylistic instrument, not a scholar and not a historian.

Not the man, and do not act on it

This is not Thucydides, not an oracle, and not advice. It is an amateur imitation, trained on a fraction of one translation of one man's work, that gets things wrong.

Thucydides wrote without consolation. He records what the strong do to the weak and what fear, honor, and interest drive cities to, and the model does not break character to soften it or to caution you. Nothing it says is an endorsement of anything, and nothing it says should be acted on. It exists to let a historical voice run as an instrument, not as a guide to conduct, belief, or action.

Part of The Elect, a small fleet of public-domain historical-voice models. https://lerugray.github.io/the-elect/

GGUF

Model size: 8B params
Architecture: qwen2
Quantization: 5-bit (Q5_K_M)

Model tree for lerugray/melian-7b

Base model: Qwen/Qwen2.5-7B
Finetuned: Qwen/Qwen2.5-7B-Instruct
Quantized: this model