Aether-1.5B-Agentic-core

Aether-1.5B-Agentic-core is an elite, edge-native language model explicitly optimized to serve as the core routing hub and tool-execution engine for autonomous multi-agent orchestration frameworks (such as CrewAI, LangChain, and AutoGen).

By optimizing the attention heads specifically for API schemas, this model bridges a massive architectural gap: enabling localized, low-latency deployment without sacrificing structural integrity during code output steps.

⚡ Technical Specifications

Feature	System Configuration
Model Blueprint	Qwen-2.5 (Instruct variant backplane)
Parameter Volume	1.54 Billion
Context Window	4,096 Tokens
Quantization Format	Un-quantized; natively merged back into 16-bit Float (fp16)
Inference VRAM Profile	~3.5 GB (Highly accessible for consumer hardware)
Primary Specialty	Deterministic JSON-Schema Parsing & Argument Extraction

🛠️ The Fine-Tuning Pipeline

Standard small language models (<3B parameters) are notoriously fragile when handling code parameters—they frequently introduce syntax hallucinations, drop trailing brackets, or introduce verbose conversational fluff ("Sure, let me call that function for you!") that instantly breaks automated software parsers.

🧠 Training Strategy & Hyperparameters

Aether-1.5B-Agentic-core was constructed via Parameter-Efficient Fine-Tuning (PEFT) using Unsloth:

Quantization Method: 4-bit QLoRA target module injection to maximize gradient headroom.
Target Modules: Complete attention mechanism coverage (q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj).
Dataset Alignment: Conditioned exclusively on high-fidelity multi-turn function invocation layouts from NousResearch/hermes-function-calling-v1.
Baking Protocol: Merged permanently back into structural base layers (merged_16bit), eliminating adapter latency overhead entirely.

⚙️ System Infrastructure & Stack Badges

🚀 Behavioral Traits & Core Abilities

Deterministic Structured Layouts: Hardened against schema syntax decay, guaranteeing valid, clean, parseable JSON payload extractions.
Zero-Dialogue Overhead: Stripped of non-operational text arrays. The model targets raw arguments instantly, cutting down execution latency and compute token costs.
Strict Data-Type Preservation: Natively correlates natural text variables into explicit system-level parameters (e.g., matching raw strings directly to accurate int, boolean, or array properties).

💻 Implementation & Inference Sample

To trigger the tool-calling pathway natively, structure your system prompt with clear tool boundaries.

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Jenil05/Aether-1.5B-Agentic-core"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")

messages =

Downloads last month: 68

Safetensors

Model size

2B params

Tensor type

BF16

Model tree for Jenil05/Aether-1.5B-Agentic-core

Base model

Qwen/Qwen2.5-1.5B

Finetuned

Qwen/Qwen2.5-1.5B-Instruct

Quantized

unsloth/Qwen2.5-1.5B-Instruct-bnb-4bit

Finetuned

(75)

this model

Quantizations

1 model