Instructions to use lerugray/melian-7b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use lerugray/melian-7b with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="lerugray/melian-7b",
    filename="melian-melian-7b-Q5_K_M.gguf",
)
output = llm(
    "Once upon a time,",
    max_tokens=512,
    echo=True,
)
print(output)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use lerugray/melian-7b with llama.cpp:
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf lerugray/melian-7b:Q5_K_M
# Run inference directly in the terminal:
llama-cli -hf lerugray/melian-7b:Q5_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf lerugray/melian-7b:Q5_K_M
# Run inference directly in the terminal:
llama-cli -hf lerugray/melian-7b:Q5_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf lerugray/melian-7b:Q5_K_M
# Run inference directly in the terminal:
./llama-cli -hf lerugray/melian-7b:Q5_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf lerugray/melian-7b:Q5_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf lerugray/melian-7b:Q5_K_M
Use Docker
docker model run hf.co/lerugray/melian-7b:Q5_K_M
- LM Studio
- Jan
- vLLM
How to use lerugray/melian-7b with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "lerugray/melian-7b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "lerugray/melian-7b",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker
docker model run hf.co/lerugray/melian-7b:Q5_K_M
- Ollama
How to use lerugray/melian-7b with Ollama:
ollama run hf.co/lerugray/melian-7b:Q5_K_M
- Unsloth Studio
How to use lerugray/melian-7b with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for lerugray/melian-7b to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for lerugray/melian-7b to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for lerugray/melian-7b to start chatting
- Atomic Chat
- Docker Model Runner
How to use lerugray/melian-7b with Docker Model Runner:
docker model run hf.co/lerugray/melian-7b:Q5_K_M
- Lemonade
How to use lerugray/melian-7b with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull lerugray/melian-7b:Q5_K_M
Run and chat with the model
lemonade run user.melian-7b-Q5_K_M
List all available models
lemonade list
melian: Thucydides (7B)
A voice model of Thucydides (c. 460–400 BC): Athenian general, exile, historian of the Peloponnesian War, and the founder of political realism. It answers in his austere, analytical first-person register and draws on the cold logic of the Melian Dialogue ("the strong do what they can and the weak suffer what they must"), the methodical search for the truest cause beneath the stated pretexts of war, the autopsy of the plague at Athens, the funeral oration, and the slow catastrophe in Sicily. The model is a full fine-tune of Qwen2.5-7B-Instruct, quantized to Q5_K_M. The codename comes from the Melian Dialogue, his coldest statement of power without justice.
The frame puts a visitor in front of him and lets him answer in his own voice: "A visitor puts a question to Thucydides the historian: ___. Thucydides answers in the first person, in his own austere analytical voice:"
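As a minimal sketch, the frame can be filled in with a small helper before the text is handed to the model. Only the frame wording comes from this card; the helper name `build_frame`, its handling of a trailing period, and the sample question are illustrative assumptions.

```python
# Frame wording copied from the card; {question} stands in for the blank.
FRAME = (
    "A visitor puts a question to Thucydides the historian: {question}. "
    "Thucydides answers in the first person, in his own austere analytical voice:"
)


def build_frame(question: str) -> str:
    """Return the prompt frame with the visitor's question filled in.

    Hypothetical helper: surrounding whitespace and a trailing period are
    trimmed so the frame's own period is not doubled.
    """
    cleaned = question.strip().rstrip(".").strip()
    if not cleaned:
        raise ValueError("question must not be empty")
    return FRAME.format(question=cleaned)


# Sample question is invented for illustration.
prompt = build_frame("what is the truest cause of a war")
print(prompt)
```

The resulting string is what would be passed as the prompt in the llama-cpp-python call, in place of a generic opener such as "Once upon a time,".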
What this voice carries that no other in the set does: the original realist account of why states actually fight, told by a man who commanded a fleet, lost a city, and was exiled for it, then spent the rest of his life writing down what he had seen "as a possession for all time." He distinguishes the pretexts states announce from the truest cause they will not name, and he does not flinch from where that reasoning leads.
Source material (all public domain)
Thucydides wrote in the fifth century BC and Richard Crawley's translation dates to 1874, so every text used is in the public domain. The corpus is built from one canonical translation and comes to about 936 first-person passages (~203k words) once the headings, chapter summaries, and Gutenberg apparatus are stripped:
- History of the Peloponnesian War, trans. Richard Crawley (1874), Project Gutenberg #7142
One translation only, by design. The model learns the translator's English as much as the author's voice, so mixing Crawley with Jowett or Hobbes would muddy the register. Crawley is the standard pick and his etext is clean. The corpus is entirely Thucydides' own narration and the speeches he reports; no modern "about him" material goes into training.
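For readers who want to inspect the source text themselves, the following is a rough sketch of one way to drop the Project Gutenberg header and footer and count the remaining words. It is not the pipeline used to build this corpus: the function names are hypothetical, the marker patterns assume the usual `*** START OF ... ***` and `*** END OF ... ***` lines, and the removal of headings and chapter summaries is not shown because it depends on the layout of the etext.

```python
import re

# Assumed marker lines; Gutenberg etexts use "THE" or "THIS" depending on age.
START = re.compile(r"\*\*\* ?START OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*?\*\*\*")
END = re.compile(r"\*\*\* ?END OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*?\*\*\*")


def strip_gutenberg(text: str) -> str:
    """Return the text between the start and end markers, if they are present."""
    start = START.search(text)
    end = END.search(text)
    body_start = start.end() if start else 0
    # Ignore an end marker that appears before the start marker.
    body_end = end.start() if end and end.start() >= body_start else len(text)
    return text[body_start:body_end].strip()


def count_words(text: str) -> int:
    """Whitespace-delimited word count, a rough measure only."""
    return len(text.split())


# Tiny invented sample standing in for a downloaded etext.
sample = (
    "License header\n"
    "*** START OF THE PROJECT GUTENBERG EBOOK SAMPLE ***\n"
    "Body text here.\n"
    "*** END OF THE PROJECT GUTENBERG EBOOK SAMPLE ***\n"
    "License footer"
)
body = strip_gutenberg(sample)
print(body, count_words(body))
```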
Running it: the stop tokens are load-bearing
Serve it with the provided Modelfile. Its stop tokens are not optional. The model answers in register first, then drifts into apparatus: third-person narration ("Thucydides observes…"), book-and-chapter citations, translator's notes, source attributions. The stops cut at the drift and hold the voice in the first person. Without them it reads like a footnoted edition of Thucydides rather than Thucydides. (The model is also trained to stop on its own; the stop tokens are belt-and-suspenders.)
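To illustrate what the stops do, here is a small sketch that cuts a completion at the earliest stop sequence, which is the effect a runtime produces when it is given stop strings. The stop strings below are hypothetical stand-ins chosen to match the drift described above; the real ones live in the provided Modelfile and are not reproduced here. The sample completion is invented as well.

```python
from typing import Sequence

# Hypothetical stand-ins for the Modelfile's stops (assumed, not the real values).
EXAMPLE_STOPS = ["Thucydides observes", "Translator's note", "(Book "]


def truncate_at_stops(text: str, stops: Sequence[str]) -> str:
    """Cut text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stops:
        if not stop:
            continue  # an empty stop would match at index 0 and erase everything
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()


# Invented sample completion: first-person voice, then third-person drift.
raw = "I set down only what I saw or could verify. Thucydides observes in Book 1 that..."
print(truncate_at_stops(raw, EXAMPLE_STOPS))
```

With llama-cpp-python, the same strings can be passed through the `stop` argument of the completion call, so generation halts at the drift instead of being trimmed afterwards.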
Limitations
A 7B model makes things up and gets facts wrong. It will stretch Thucydides' cadence onto wars and ideas he never addressed and will produce anachronisms. This is a stylistic instrument, not a scholar and not a historian.
Not the man, and do not act on it
This is not Thucydides, not an oracle, and not advice. It is an amateur imitation, trained on a fraction of one translation of one man's work, that gets things wrong.
Thucydides wrote without consolation. He records what the strong do to the weak and what fear, honor, and interest drive cities to, and the model does not break character to soften it or to caution you. Nothing it says is an endorsement of anything, and nothing it says should be acted on. It exists to let a historical voice run as an instrument, not as a guide to conduct, belief, or action.
Part of The Elect, a small fleet of public-domain historical-voice models. https://lerugray.github.io/the-elect/