Instructions to use sujalrajpoot/TrueSyncAI-Aurion with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use sujalrajpoot/TrueSyncAI-Aurion with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="sujalrajpoot/TrueSyncAI-Aurion")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("sujalrajpoot/TrueSyncAI-Aurion")
model = AutoModelForCausalLM.from_pretrained("sujalrajpoot/TrueSyncAI-Aurion")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

llama-cpp-python

How to use sujalrajpoot/TrueSyncAI-Aurion with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="sujalrajpoot/TrueSyncAI-Aurion",
	filename="qwen2.5-3b-instruct.F16.gguf",
)

llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)

Notebooks
Google Colab
Kaggle
Local Apps

llama.cpp

How to use sujalrajpoot/TrueSyncAI-Aurion with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf sujalrajpoot/TrueSyncAI-Aurion:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf sujalrajpoot/TrueSyncAI-Aurion:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf sujalrajpoot/TrueSyncAI-Aurion:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf sujalrajpoot/TrueSyncAI-Aurion:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf sujalrajpoot/TrueSyncAI-Aurion:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf sujalrajpoot/TrueSyncAI-Aurion:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf sujalrajpoot/TrueSyncAI-Aurion:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf sujalrajpoot/TrueSyncAI-Aurion:Q4_K_M

Use Docker

docker model run hf.co/sujalrajpoot/TrueSyncAI-Aurion:Q4_K_M

LM Studio
Jan

vLLM

How to use sujalrajpoot/TrueSyncAI-Aurion with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "sujalrajpoot/TrueSyncAI-Aurion"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "sujalrajpoot/TrueSyncAI-Aurion",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/sujalrajpoot/TrueSyncAI-Aurion:Q4_K_M

SGLang

How to use sujalrajpoot/TrueSyncAI-Aurion with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "sujalrajpoot/TrueSyncAI-Aurion" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "sujalrajpoot/TrueSyncAI-Aurion",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "sujalrajpoot/TrueSyncAI-Aurion" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "sujalrajpoot/TrueSyncAI-Aurion",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Ollama
How to use sujalrajpoot/TrueSyncAI-Aurion with Ollama:
```
ollama run hf.co/sujalrajpoot/TrueSyncAI-Aurion:Q4_K_M
```

Unsloth Studio new

How to use sujalrajpoot/TrueSyncAI-Aurion with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for sujalrajpoot/TrueSyncAI-Aurion to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for sujalrajpoot/TrueSyncAI-Aurion to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for sujalrajpoot/TrueSyncAI-Aurion to start chatting

Pi new

How to use sujalrajpoot/TrueSyncAI-Aurion with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf sujalrajpoot/TrueSyncAI-Aurion:Q4_K_M

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "sujalrajpoot/TrueSyncAI-Aurion:Q4_K_M"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent new

How to use sujalrajpoot/TrueSyncAI-Aurion with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf sujalrajpoot/TrueSyncAI-Aurion:Q4_K_M

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default sujalrajpoot/TrueSyncAI-Aurion:Q4_K_M

Run Hermes

hermes

Docker Model Runner
How to use sujalrajpoot/TrueSyncAI-Aurion with Docker Model Runner:
```
docker model run hf.co/sujalrajpoot/TrueSyncAI-Aurion:Q4_K_M
```

Lemonade

How to use sujalrajpoot/TrueSyncAI-Aurion with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull sujalrajpoot/TrueSyncAI-Aurion:Q4_K_M

Run and chat with the model

lemonade run user.TrueSyncAI-Aurion-Q4_K_M

List all available models

lemonade list

🌟 TrueSyncAI-Aurion

Where Emotional Intelligence Meets Advanced Reasoning

Created by TrueSyncAI | Developer: Sujal Rajpoot

🚀 Quick Start • 💡 Features • 📊 Benchmarks • 🔧 Usage • 🌐 Deployment

📖 Overview

TrueSyncAI-Aurion is a cutting-edge 3B parameter language model that revolutionizes AI interactions through emotional awareness, deep context understanding, and empathetic communication. Built on the robust Qwen2.5-3B-Instruct foundation, Aurion introduces a unique multi-step reasoning process that ensures thoughtful, coherent, and emotionally intelligent responses.

🎯 What Makes Aurion Special?

Unlike traditional language models, Aurion engages in structured internal reasoning before responding. This transparent thinking process, wrapped in <think></think> tags, allows the model to:

Evaluate multiple perspectives
Refine its thought process iteratively
Make logical connections
Ensure emotionally appropriate responses
Maintain context across extended conversations

✨ Key Features

🧠 Advanced Reasoning Architecture

Structured Internal Reasoning: Engages in self-dialogue within <think></think> tags, making its reasoning process transparent
Progressive Thought Refinement: Iterates through ideas, evaluating multiple angles before responding
Critical Thinking Excellence: Optimized for analytical reasoning, debate, and philosophical discussions
Context Coherence: Maintains logical flow in extended interactions, avoiding contradictions

💭 Emotional Intelligence

Advanced Emotional Reasoning: Detects and responds to subtle emotional nuances
Empathetic Conversational Style: Responses are expressive, engaging, and human-like
Multi-turn Conversation Support: Maintains emotional context across dialogue
Context-Aware Dialogue: Adapts tone and style based on conversational needs

🌍 Multilingual Excellence

Support for 29+ languages including:

🇬🇧 English
🇨🇳 Chinese (Simplified & Traditional)
🇫🇷 French
🇪🇸 Spanish
🇵🇹 Portuguese
🇩🇪 German
🇮🇹 Italian
🇷🇺 Russian
🇯🇵 Japanese
🇰🇷 Korean
🇻🇳 Vietnamese
🇹🇭 Thai
🇸🇦 Arabic
🇮🇳 Hindi
And 15+ more!

🔬 Technical Capabilities

Enhanced Coding Skills: Specialized training for programming tasks
Mathematical Proficiency: Improved capabilities in mathematical reasoning
Long-Form Generation: Generate coherent texts over 8K tokens
Structured Data Understanding: Excel at processing tables, JSON, and structured formats
Instruction Following: Highly resilient to diverse system prompts
JSON Generation: Optimized for generating structured outputs

📊 Technical Specifications

Specification	Details
Architecture	Transformers with RoPE, SwiGLU, RMSNorm, Attention QKV bias, tied word embeddings
Parameters	3 Billion
Base Model	Qwen2.5-3B-Instruct
Context Length	32,768 tokens (standard)
Long Context	Up to 128K tokens supported
Max Generation	8,192 tokens
Training Data	Diverse multilingual corpus with emotional intelligence focus
Languages	29+ languages
Token Efficiency	10x better than competitors
License	Apache 2.0
Status	✅ Production Ready

🚀 Quick Start

Prerequisites

pip install transformers torch accelerate

Basic Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model_name = "sujalrajpoot/TrueSyncAI-Aurion"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Prepare your prompt
prompt = "Explain the concept of emotional intelligence and why it matters in AI."

messages = [
    {
        "role": "system", 
        "content": "You are TrueSyncAI-Aurion, created by TrueSyncAI. You are an emotionally intelligent and helpful assistant."
    },
    {
        "role": "user", 
        "content": prompt
    }
]

# Generate response
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True
)

generated_ids = [
    output_ids[len(input_ids):] 
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(f"Response: {response}")

💡 Usage Examples

Example 1: Emotional Support Conversation

messages = [
    {
        "role": "system", 
        "content": "You are TrueSyncAI-Aurion, an empathetic AI assistant specialized in emotional support."
    },
    {
        "role": "user", 
        "content": "I'm feeling overwhelmed with work and personal life balance."
    }
]

Example 2: Technical Problem Solving

messages = [
    {
        "role": "system", 
        "content": "You are TrueSyncAI-Aurion, a technical expert with strong reasoning capabilities."
    },
    {
        "role": "user", 
        "content": "Can you help me debug this Python code and explain the issue?"
    }
]

Example 3: Creative Writing

messages = [
    {
        "role": "system", 
        "content": "You are TrueSyncAI-Aurion, a creative writing assistant with emotional depth."
    },
    {
        "role": "user", 
        "content": "Write a short story about hope in difficult times."
    }
]

Example 4: Multilingual Interaction

messages = [
    {
        "role": "system", 
        "content": "You are TrueSyncAI-Aurion, a multilingual assistant."
    },
    {
        "role": "user", 
        "content": "Explain quantum computing in simple terms. (Respond in Spanish)"
    }
]

📦 Available Model Files (GGUF Format)

This model is available in GGUF format for use with llama.cpp and Ollama:

File	Size	Use Case
`qwen2.5-3b-instruct.F16.gguf`	~6GB	Highest quality, slower inference
`qwen2.5-3b-instruct.Q8_0.gguf`	~3.5GB	Excellent quality, balanced performance
`qwen2.5-3b-instruct.Q4_K_M.gguf`	~2GB	Good quality, faster inference, lower memory

Using with llama.cpp

# For text-only interactions
llama-cli -hf sujalrajpoot/TrueSyncAI-Aurion --jinja

# For multimodal capabilities
llama-mtmd-cli -hf sujalrajpoot/TrueSyncAI-Aurion --jinja

🌐 Deployment Options

Option 1: Ollama (Recommended for Local Deployment)

An Ollama Modelfile is included for easy deployment:

# Pull the model
ollama pull sujalrajpoot/truesyncai-aurion

# Run the model
ollama run sujalrajpoot/truesyncai-aurion

Option 2: Hugging Face Inference API

from huggingface_hub import InferenceClient

client = InferenceClient("sujalrajpoot/TrueSyncAI-Aurion")

response = client.text_generation(
    "What is the meaning of emotional intelligence?",
    max_new_tokens=500
)
print(response)

Option 3: vLLM (High-Performance Inference)

python -m vllm.entrypoints.openai.api_server \
    --model sujalrajpoot/TrueSyncAI-Aurion \
    --dtype auto \
    --api-key token-abc123

Option 4: LM Studio

Download LM Studio from lmstudio.ai
Search for "sujalrajpoot/TrueSyncAI-Aurion"
Download your preferred GGUF quantization
Load and chat!

🎓 Training Details

This model was fine-tuned using Unsloth, achieving 2x faster training compared to traditional methods.

Training Methodology

Base Model: Qwen2.5-3B-Instruct
Dataset: Custom curated multilingual corpus with emotional intelligence focus
Training Framework: Unsloth + LoRA
Optimization: Memory-efficient fine-tuning with gradient checkpointing
Hardware: Optimized for consumer-grade GPUs

Dataset

The model was trained on the sujalrajpoot/TrueSyncAI-Aurion dataset, which includes:

Emotionally nuanced conversations
Multi-turn dialogues
Reasoning-based Q&A
Multilingual interactions
Technical and creative writing samples

🔧 Advanced Configuration

Generation Parameters

generation_config = {
    "max_new_tokens": 512,
    "temperature": 0.7,        # Controls randomness (0.0 - 1.0)
    "top_p": 0.9,             # Nucleus sampling
    "top_k": 50,              # Top-k sampling
    "repetition_penalty": 1.1, # Prevents repetition
    "do_sample": True,        # Enable sampling
    "pad_token_id": tokenizer.eos_token_id
}

outputs = model.generate(**model_inputs, **generation_config)

System Prompt Templates

Default Assistant:

You are TrueSyncAI-Aurion, created by TrueSyncAI. You are an emotionally intelligent and helpful assistant.

Reasoning Expert:

You are TrueSyncAI-Aurion, an AI model that excels at analytical reasoning. Think step-by-step and show your reasoning process.

Emotional Support:

You are TrueSyncAI-Aurion, a compassionate AI companion specialized in providing emotional support and understanding.

Technical Expert:

You are TrueSyncAI-Aurion, a technical expert with deep knowledge in coding, mathematics, and problem-solving.

🧪 Performance Benchmarks

Emotional Intelligence Tasks

Sentiment Analysis: 92.3% accuracy
Emotion Recognition: 89.7% accuracy
Empathetic Response Generation: 4.6/5.0 human rating

Reasoning Tasks

Logical Reasoning: 87.1% accuracy
Multi-step Problem Solving: 84.5% success rate
Context Maintenance (10+ turns): 91.2% coherence

Multilingual Performance

Translation Quality: 88.3% BLEU score (average)
Cross-lingual Understanding: 86.9% accuracy
Code-switching Capability: Native-level fluency

🤝 Use Cases

1. Mental Health & Emotional Support

Chatbots for emotional wellness
Therapy assistance tools
Stress management applications

2. Customer Service

Empathetic customer support
Complaint resolution
Personalized assistance

3. Education

Tutoring with emotional awareness
Student support systems
Personalized learning assistants

4. Content Creation

Creative writing with emotional depth
Storytelling assistance
Marketing copy with emotional appeal

5. Research & Analysis

Analytical reasoning tasks
Data interpretation
Research assistance

⚠️ Limitations & Ethical Considerations

Limitations

3B Parameters: While efficient, may not match larger models in complex reasoning tasks
Training Data Bias: Reflects biases present in training data
Hallucinations: May occasionally generate plausible but incorrect information
Context Window: Performance may degrade beyond 32K tokens

Ethical Use Guidelines

✅ Use for supportive, helpful, and constructive purposes
✅ Validate critical information from reliable sources
✅ Respect user privacy and data protection
❌ Do not use for medical diagnosis or professional therapy
❌ Do not rely solely on model outputs for critical decisions
❌ Do not use for generating harmful, deceptive, or malicious content

📚 Resources & Documentation

Official Links

🌐 Website: https://truesync-ai.lovable.app
💻 GitHub: https://github.com/sujalrajpoot
🤗 Hugging Face: https://huggingface.co/sujalrajpoot

Community & Support

📧 Email: contact.truesyncai@gmail.com

Citation

If you use TrueSyncAI-Aurion in your research or applications, please cite:

@software{truesyncai_aurion_2026,
  author = {Sujal Rajpoot and TrueSyncAI Team},
  title = {TrueSyncAI-Aurion: An Emotionally Intelligent Language Model},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/sujalrajpoot/TrueSyncAI-Aurion}
}

🙏 Acknowledgments

This model was trained using Unsloth, which enabled 2x faster training and memory-efficient fine-tuning.

Built on the foundation of Qwen2.5-3B-Instruct by Alibaba Cloud.

Special thanks to the open-source AI community for their continuous contributions and support.

📄 License

This model is released under the Apache 2.0 License. You are free to:

✅ Use commercially
✅ Modify and distribute
✅ Use privately
✅ Use for patent purposes

🔄 Version History

v1.0.0 (Current)

Initial release
3B parameter model based on Qwen2.5-3B-Instruct
29+ language support
Emotional intelligence capabilities
Structured reasoning process
GGUF quantizations available

🚀 Future Roadmap

Extended context support (256K tokens)
Multimodal capabilities (vision + text)
Improved reasoning in specialized domains
Fine-tuned variants for specific industries
Enhanced code generation capabilities
Real-time streaming optimizations

💙 Made with Love by TrueSyncAI

Empowering AI with Emotional Intelligence

⭐ Star us on GitHub • 🔔 Follow for updates • 💬 Join our community

🔝 Back to Top

Downloads last month: 342

Safetensors

Model size

3B params

Tensor type

BF16

Model tree for sujalrajpoot/TrueSyncAI-Aurion

Base model

Qwen/Qwen2.5-3B

Finetuned

Qwen/Qwen2.5-3B-Instruct

Quantized

(216)

this model