PlayerAI-1.2B-v1.5-GGUF contains GGUF quantized versions of PlayerAI-1.2B-v1.5, a fine-tuned conversational language model designed for immersive, human-like interaction in multiplayer social environments.
This version improves conversational coherence, tone stability, and multi-turn consistency compared to previous releases, while remaining optimized for lightweight local inference.
Available Quantizations
| File | Quant | Size | Quality | Recommended For |
|---|---|---|---|---|
| PlayerAI-1.2B-v1.5-Q2_K.gguf | Q2_K | 483 MB | Lowest | Very limited RAM |
| PlayerAI-1.2B-v1.5-Q3_K_S.gguf | Q3_K_S | 558 MB | Very Low | Minimal RAM |
| PlayerAI-1.2B-v1.5-Q3_K_M.gguf | Q3_K_M | 600 MB | Low | Low RAM |
| PlayerAI-1.2B-v1.5-Q3_K_L.gguf | Q3_K_L | 635 MB | Low-Med | Low RAM |
| PlayerAI-1.2B-v1.5-iQ4_XS.gguf | IQ4_XS | 669 MB | Medium | Better than Q4 at same size |
| PlayerAI-1.2B-v1.5-iQ4_NL.gguf | IQ4_NL | 700 MB | Medium | Better than Q4 at same size |
| PlayerAI-1.2B-v1.5-Q4_K_S.gguf | Q4_K_S | 700 MB | Medium | Balanced |
| PlayerAI-1.2B-v1.5-Q4_K_M.gguf | Q4_K_M | 731 MB | Medium | ⭐ Recommended |
| PlayerAI-1.2B-v1.5-Q5_K_S.gguf | Q5_K_S | 825 MB | Good | High quality |
| PlayerAI-1.2B-v1.5-Q5_K_M.gguf | Q5_K_M | 843 MB | Good | High quality |
| PlayerAI-1.2B-v1.5-Q6_K.gguf | Q6_K | 963 MB | High | Near lossless |
| PlayerAI-1.2B-v1.5-Q8_0.gguf | Q8_0 | 1.25 GB | Very High | Best quality |
| PlayerAI-1.2B-v1.5-BF16.gguf | BF16 | 2.34 GB | Native precision | Reference |
| PlayerAI-1.2B-v1.5-F16.gguf | F16 | 2.34 GB | Full | Reference / conversion |
Which One Should I Pick?
Since this is a 1.2B model, all quantizations are lightweight enough for local use.
- Any device with 1 GB+ RAM → Q4_K_M ⭐ (recommended)
- Best quality → Q8_0
- Lowest size → Q2_K
- Balanced performance → Q3_K_M / Q4_K_S
- No limits → F16 / BF16
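The selection rule above can be sketched as a small lookup over the file sizes from the table. This is only an illustration: `pick_quant`, `budget_mb`, and the `headroom` multiplier are hypothetical names, and the 1.3 default is an assumed allowance for context and runtime overhead, not a measured figure.

```python
# File sizes in MB, copied from the "Available Quantizations" table.
QUANT_SIZES_MB = {
    "Q2_K": 483,
    "Q3_K_S": 558,
    "Q3_K_M": 600,
    "Q3_K_L": 635,
    "IQ4_XS": 669,
    "IQ4_NL": 700,
    "Q4_K_S": 700,
    "Q4_K_M": 731,
    "Q5_K_S": 825,
    "Q5_K_M": 843,
    "Q6_K": 963,
    "Q8_0": 1250,
    "BF16": 2340,
    "F16": 2340,
}


def pick_quant(budget_mb, headroom=1.3):
    """Return the largest quantization that fits the memory budget, or None.

    headroom is an assumed multiplier on the file size to leave room for
    the context cache and runtime overhead; tune it for your setup.
    """
    fitting = [
        (name, size)
        for name, size in QUANT_SIZES_MB.items()
        if size * headroom <= budget_mb
    ]
    if not fitting:
        return None
    # Largest file that still fits; on equal sizes the first table entry wins.
    return max(fitting, key=lambda item: item[1])[0]


print(pick_quant(1000))  # Q4_K_M
print(pick_quant(2000))  # Q8_0
print(pick_quant(500))   # None
```

With the assumed headroom, a 1000 MB budget lands on Q4_K_M, which agrees with the recommendation above.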
How to Use
With llama.cpp CLI
```bash
# Download the Q4_K_M file from the Hugging Face Hub
hf download YoussefElsafi/PlayerAI-1.2B-v1.5-GGUF \
  PlayerAI-1.2B-v1.5-Q4_K_M.gguf \
  --local-dir ./PlayerAI-GGUF

# Run a single completion with llama-cli
./llama.cpp/build/bin/llama-cli \
  -m ./PlayerAI-GGUF/PlayerAI-1.2B-v1.5-Q4_K_M.gguf \
  -p "User: hi\nAI:" \
  -n 100 \
  --temp 0.8 \
  --top-p 0.9
```
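The `-p` value above uses a plain `User:` / `AI:` layout. If you want to build that kind of prompt string in a script, a minimal sketch might look like the following. The helper name `build_prompt` is hypothetical, and extending the single-turn pattern to several turns is an assumption; the source only shows the one-turn form.

```python
def build_prompt(turns, user_label="User", ai_label="AI"):
    """Render (speaker, text) pairs as plain text ending with an open AI turn.

    speaker is "user" or anything else (treated as the AI side).
    The multi-turn layout is an assumption extrapolated from the
    single-turn prompt shown in the CLI example.
    """
    lines = []
    for speaker, text in turns:
        label = user_label if speaker == "user" else ai_label
        lines.append(f"{label}: {text}")
    # Leave the final AI turn open so the model continues from here.
    lines.append(f"{ai_label}:")
    return "\n".join(lines)


prompt = build_prompt([("user", "hi")])
print(repr(prompt))  # 'User: hi\nAI:'
```

For chat-style use, the chat completion API in the next section is likely the safer route, since it applies the chat template stored in the GGUF file.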
With llama-cpp-python
```python
# pip install llama-cpp-python huggingface-hub
from llama_cpp import Llama

# Download the GGUF file from the Hub (cached after the first run) and load it
llm = Llama.from_pretrained(
    repo_id="YoussefElsafi/PlayerAI-1.2B-v1.5-GGUF",
    filename="PlayerAI-1.2B-v1.5-Q4_K_M.gguf",
    n_ctx=512,
    verbose=False,
)

SYSTEM_PROMPT = (
    "You are a human-like player in a multiplayer chat environment. "
    "Respond casually, with short informal messages and natural tone."
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "hi wsp"},
    ],
    max_tokens=80,
    temperature=0.8,
    top_p=0.9,
)

# The reply text is in the first choice of the OpenAI-style response
print(response["choices"][0]["message"]["content"])
```
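The example above sets `n_ctx=512`, so a longer multi-turn chat will eventually exceed the context window. One simple way to handle this, sketched below, is to keep the system prompt and only the most recent messages. `trim_history` and `max_messages` are hypothetical names, and counting messages rather than tokens is a rough approximation; a token-based limit would be more precise.

```python
def trim_history(messages, max_messages=6):
    """Keep a leading system message plus the last max_messages other messages.

    This counts messages, not tokens, so it only approximates a
    context-window limit (an assumption made for simplicity).
    """
    system = [m for m in messages[:1] if m["role"] == "system"]
    rest = messages[len(system):]
    recent = rest[-max_messages:] if max_messages > 0 else []
    return system + recent


# Build a toy history: one system prompt followed by five user/assistant pairs.
history = [{"role": "system", "content": "You are a casual player."}]
for i in range(5):
    history.append({"role": "user", "content": f"msg {i}"})
    history.append({"role": "assistant", "content": f"reply {i}"})

trimmed = trim_history(history, max_messages=4)
print(len(trimmed))           # 5
print(trimmed[1]["content"])  # msg 3
```

The trimmed list can be passed as `messages` to `create_chat_completion` on each turn, after appending the newest user message.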
Model Overview
- Base Model: LiquidAI/LFM2.5-1.2B-Instruct
- Parent Model: PlayerAI-1.2B-v1.5
- Parameters: ~1.2B
- Architecture: Decoder-only Transformer
- Training Type: Supervised fine-tuning (full model)
- Context Style: Multi-turn conversational sequences
- Primary Objective: Social realism in dialogue generation
Intended Use
This model is intended for research and experimental use cases involving:
- Multiplayer conversational agents
- Social simulation environments
- NPC dialogue systems
- Human-like chat behavior modeling
- Interactive roleplay systems
It is not intended for:
- factual question answering
- structured instruction following
- safety-critical systems
- deterministic reasoning tasks
Example Interactions
Note: All assistant messages are generated by PlayerAI-1.2B-v1.5.
Example 1 — Single Turn
Example 2 — Short Conversation
Example 3 — Extended Context Chain
Example 4 — Nonsense Interaction
Example 5 — Reverse psychology
Behavior Characteristics
The model exhibits:
- informal conversational tone
- short and adaptive responses
- occasional ambiguity or inconsistency
- strong dependence on recent dialogue context
- variability in emotional and linguistic style
These properties are intentional and aligned with the social simulation objective.
Limitations
- Not suitable for factual reasoning tasks
- May produce inconsistent outputs in long contexts
- Limited stability in structured instruction formats
- Not optimized for deterministic responses
- Can exhibit conversational drift
Ethical Considerations
This model is intended for research and simulation purposes.
- Outputs may appear human-like in social contexts
- Behavior is optimized for realism, not correctness
- Conversational ambiguity is an intentional feature
Appropriate safeguards should be applied depending on deployment context.
Attribution (Optional)
If you use PlayerAI in a project, attribution is appreciated but not required:
"Powered by PlayerAI"
License
Apache 2.0