Instructions for using orhanaydinn/OAX-1B-Humanoid-Merged with libraries and local apps.

Libraries

How to use orhanaydinn/OAX-1B-Humanoid-Merged with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="orhanaydinn/OAX-1B-Humanoid-Merged")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("orhanaydinn/OAX-1B-Humanoid-Merged")
model = AutoModelForCausalLM.from_pretrained("orhanaydinn/OAX-1B-Humanoid-Merged")

Local Apps

vLLM

How to use orhanaydinn/OAX-1B-Humanoid-Merged with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "orhanaydinn/OAX-1B-Humanoid-Merged"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "orhanaydinn/OAX-1B-Humanoid-Merged",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

SGLang

How to use orhanaydinn/OAX-1B-Humanoid-Merged with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "orhanaydinn/OAX-1B-Humanoid-Merged" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "orhanaydinn/OAX-1B-Humanoid-Merged",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "orhanaydinn/OAX-1B-Humanoid-Merged" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "orhanaydinn/OAX-1B-Humanoid-Merged",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use orhanaydinn/OAX-1B-Humanoid-Merged with Docker Model Runner:
```
docker model run hf.co/orhanaydinn/OAX-1B-Humanoid-Merged
```

OAX-1B-Humanoid

OAX-1B-Humanoid is a custom 1B-parameter LLaMA-style language model developed for humanoid robot interaction, high-level reasoning, and structured JSON tool calling.

The model was first pre-trained as a base language model and then adapted with supervised fine-tuning (SFT) to produce JSON-formatted robot-assistant responses. It is designed for the OAX humanoid robot project, where it acts as the high-level reasoning layer between natural-language user commands and executable robot tools.

The model does not directly control motors, servos, or low-level hardware. Instead, it generates structured JSON outputs that can be passed to a Planner, Controller, or tool-execution layer for validation before any real or simulated action is executed.

Intended Use

This model is intended for research and prototyping around LLM-driven humanoid robot assistants.

It is designed to convert user commands into structured JSON responses such as:

{
  "type": "tool_call",
  "response": "Searching for the cup.",
  "tool": "search_object",
  "arguments": {
    "object": "cup"
  }
}

The expected use case is a controlled robot-agent pipeline:

User command
→ LLM JSON response
→ Planner / Controller validation
→ Tool execution
→ Robot or environment state update

Output Format

The model is fine-tuned to return a valid JSON object with exactly these fields:

{
  "type": "tool_call",
  "response": "short natural language explanation",
  "tool": "tool_name",
  "arguments": {}
}

Valid type values are:

chat
tool_call
clarify
refuse

For chat, clarify, and refuse, the tool field should be null.

When no arguments are required, arguments should be an empty object:

{
  "type": "tool_call",
  "response": "Checking visible objects.",
  "tool": "get_visible_objects",
  "arguments": {}
}
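The schema rules above can be checked mechanically before a reply is passed on. The following is a minimal sketch, not part of the released code: the function name `validate_response` is hypothetical, and the requirement that `tool` be a non-empty string for `tool_call` is an assumption inferred from the examples.

```python
import json

VALID_TYPES = {"chat", "tool_call", "clarify", "refuse"}
REQUIRED_FIELDS = {"type", "response", "tool", "arguments"}


def validate_response(raw: str) -> dict:
    """Parse a model reply and check it against the schema described above.

    Raises ValueError with a short reason when the reply does not conform.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc

    if not isinstance(data, dict):
        raise ValueError("top-level value must be a JSON object")
    if set(data) != REQUIRED_FIELDS:
        raise ValueError(f"expected exactly {sorted(REQUIRED_FIELDS)}, got {sorted(data)}")
    if not isinstance(data["type"], str) or data["type"] not in VALID_TYPES:
        raise ValueError(f"unknown type: {data['type']!r}")
    if not isinstance(data["response"], str):
        raise ValueError("response must be a string")
    if not isinstance(data["arguments"], dict):
        raise ValueError("arguments must be an object")

    if data["type"] == "tool_call":
        # Assumption: a tool_call must name a tool as a non-empty string.
        if not isinstance(data["tool"], str) or not data["tool"]:
            raise ValueError("tool_call requires a tool name")
    elif data["tool"] is not None:
        raise ValueError("tool must be null for chat, clarify, and refuse")
    return data
```

A caller would typically wrap `validate_response` in a `try`/`except ValueError` block and treat any failure as a rejected reply.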

Tool-Calling Behaviour

The model was fine-tuned around robot-assistant tools such as:

get_visible_objects
get_robot_status
search_object
pick_object
place_object
stop

Example outputs:

{
  "type": "tool_call",
  "response": "Checking robot status.",
  "tool": "get_robot_status",
  "arguments": {}
}

{
  "type": "tool_call",
  "response": "Attempting to pick up the bottle.",
  "tool": "pick_object",
  "arguments": {
    "object": "bottle"
  }
}

{
  "type": "tool_call",
  "response": "Placing the bottle on the table.",
  "tool": "place_object",
  "arguments": {
    "object": "bottle",
    "destination": "table"
  }
}
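One way a Controller might check tool names and argument keys is with a small lookup table. This is a sketch under stated assumptions: the argument sets are taken from the example outputs on this card, `stop` is assumed to take no arguments because no example is shown, and `TOOL_ARGUMENTS` and `check_tool_call` are hypothetical names.

```python
from typing import List

# Expected argument names per tool, inferred from the examples on this card.
TOOL_ARGUMENTS = {
    "get_visible_objects": set(),
    "get_robot_status": set(),
    "search_object": {"object"},
    "pick_object": {"object"},
    "place_object": {"object", "destination"},
    "stop": set(),  # assumption: the card shows no example for stop
}


def check_tool_call(tool: str, arguments: dict) -> List[str]:
    """Return a list of problems; an empty list means the call looks well-formed."""
    if tool not in TOOL_ARGUMENTS:
        return [f"unknown tool: {tool}"]

    expected = TOOL_ARGUMENTS[tool]
    given = set(arguments)
    problems = []
    for name in sorted(expected - given):
        problems.append(f"missing argument: {name}")
    for name in sorted(given - expected):
        problems.append(f"unexpected argument: {name}")
    for name in sorted(expected & given):
        value = arguments[name]
        if not isinstance(value, str) or not value.strip():
            problems.append(f"argument {name} must be a non-empty string")
    return problems
```

Returning a list of problems rather than raising on the first one lets a Controller decide whether a call can be repaired or must be rejected.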

Prompt Format

The model was trained with explicit role tags:

SYSTEM_TAG = "<|system|>"
USER_TAG = "<|user|>"
ASSISTANT_TAG = "<|assistant|>"

A typical prompt follows this structure:

<|system|>
You are OAX, a humanoid robot assistant. Always return a valid JSON object with exactly these fields: type, response, tool, arguments.

<|user|>
Find the cup.

<|assistant|>
{"type":"tool_call","response":"Searching for the cup.","tool":"search_object","arguments":{"object":"cup"}}

During inference, the prompt should end with:

<|assistant|>

so that the model generates the next JSON response.
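A prompt in this layout could be assembled with a small helper. This is a minimal sketch: `build_prompt` is a hypothetical name, and the exact whitespace (one blank line between turns, a newline after the final tag) is an assumption based on the example above rather than a documented requirement.

```python
SYSTEM_TAG = "<|system|>"
USER_TAG = "<|user|>"
ASSISTANT_TAG = "<|assistant|>"


def build_prompt(system_prompt: str, user_message: str) -> str:
    """Build a single-turn prompt that ends with the assistant tag."""
    return (
        f"{SYSTEM_TAG}\n{system_prompt.strip()}\n\n"
        f"{USER_TAG}\n{user_message.strip()}\n\n"
        f"{ASSISTANT_TAG}\n"  # assumption: trailing newline, as in the training example
    )
```

The returned string can be passed to the tokenizer as-is; generation then continues from the assistant tag.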

Model Structure

This repository contains the model in two parts:

base_model/
lora_adapter/

The base_model folder contains the pre-trained 1B LLaMA-style model.

The lora_adapter folder contains the supervised fine-tuned adapter used for JSON tool-calling behaviour.

A typical loading flow is:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

repo_id = "orhanaydinn/OAX-1B-Humanoid"

# Load the tokenizer from the base_model/ subfolder.
tokenizer = AutoTokenizer.from_pretrained(
    repo_id,
    subfolder="base_model",
    trust_remote_code=True,
    use_fast=False
)

# Load the pre-trained base model; use half precision only when a GPU is available.
base_model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    subfolder="base_model",
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    trust_remote_code=True,
    low_cpu_mem_usage=True
)

# Attach the SFT LoRA adapter from lora_adapter/ for inference only.
model = PeftModel.from_pretrained(
    base_model,
    repo_id,
    subfolder="lora_adapter",
    is_trainable=False
)

# Switch to evaluation mode (disables dropout).
model.eval()

Depending on the local setup, it may be more reliable to download the repository first and then load the base_model/ and lora_adapter/ folders as local paths.
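The local-path approach might look like the sketch below. It assumes the `huggingface_hub` package is installed for the download step; `download_repo` and `resolve_model_dirs` are hypothetical helper names, and the folder names are the ones listed above.

```python
from pathlib import Path


def download_repo(repo_id: str = "orhanaydinn/OAX-1B-Humanoid") -> str:
    """Download the whole repository and return the local snapshot path."""
    # Imported lazily so the path helper below works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)


def resolve_model_dirs(repo_root):
    """Return (base_model_dir, lora_adapter_dir) inside a downloaded copy of the repo."""
    root = Path(repo_root)
    base_dir = root / "base_model"
    adapter_dir = root / "lora_adapter"
    missing = [str(p) for p in (base_dir, adapter_dir) if not p.is_dir()]
    if missing:
        raise FileNotFoundError(f"missing folders: {', '.join(missing)}")
    return base_dir, adapter_dir
```

With these paths, `str(base_dir)` replaces `repo_id` plus `subfolder="base_model"` in the tokenizer and base-model calls, and `str(adapter_dir)` replaces `repo_id` plus `subfolder="lora_adapter"` in `PeftModel.from_pretrained`.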

Example System Prompt

The model works best when the system prompt clearly defines the JSON schema and tool-calling rules:

You are OAX, a humanoid robot assistant.
Respond briefly and clearly.
When replying, always output a valid JSON object with exactly these fields: type, response, tool, arguments.
Valid type values are: chat, tool_call, clarify, refuse.
Use tool=null for chat, clarify, and refuse.
Use an empty object for arguments when no arguments are needed.
Do not add extra fields.
Do not use low-level motor or servo commands.
Do not hallucinate perception results.
If the request is incomplete, ask for clarification.
If the request is unsafe or unsupported, refuse.

Notes on Safety and Validation

This model is intended to act as a high-level reasoning layer, not as a direct actuator controller.

The model may occasionally produce imperfect, premature, or inconsistent tool calls. For this reason, it should be used with an external validation layer such as a Planner or Controller before any action is executed.

A recommended architecture is:

LLM output
→ JSON parsing
→ Controller validation
→ Action repair or rejection
→ Tool execution
→ State update

This separation matters because model output should never change the robot or environment state without deterministic validation.
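The recommended architecture can be sketched as a single function that only touches the state after every check has passed. This is an illustration under assumptions, not the project's Controller: `run_step`, `extract_json_object`, the result dictionaries, and the convention that each tool is a Python callable taking the state first and returning a state update are all hypothetical.

```python
import inspect
import json
from typing import Callable, Dict


def extract_json_object(text: str) -> dict:
    """Parse the first JSON object found in the raw model output."""
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object in model output")
    # raw_decode tolerates trailing text after the object.
    obj, _ = json.JSONDecoder().raw_decode(text[start:])
    return obj


def run_step(llm_output: str, tools: Dict[str, Callable[..., dict]], state: dict) -> dict:
    """Run one pass of parse -> validate -> repair or reject -> execute -> update."""
    try:
        message = extract_json_object(llm_output)
    except ValueError as exc:  # includes json.JSONDecodeError
        return {"status": "rejected", "reason": str(exc)}

    if message.get("type") != "tool_call":
        # chat, clarify, and refuse never touch the state.
        return {"status": "no_action", "response": message.get("response", "")}

    name = message.get("tool")
    if not isinstance(name, str) or name not in tools:
        return {"status": "rejected", "reason": f"unknown tool: {name!r}"}

    arguments = message.get("arguments")
    if arguments is None:
        arguments = {}  # repair: a missing arguments field is treated as empty
    if not isinstance(arguments, dict):
        return {"status": "rejected", "reason": "arguments must be an object"}

    try:
        # Check the call against the tool's signature before running it.
        inspect.signature(tools[name]).bind(state, **arguments)
    except TypeError as exc:
        return {"status": "rejected", "reason": str(exc)}

    update = tools[name](state, **arguments)
    state.update(update)
    return {"status": "executed", "tool": name, "update": update}
```

Because every rejection path returns before the tool is called, a malformed or unexpected reply leaves `state` unchanged.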

Limitations

This model is experimental and was developed for a research prototype.

Known limitations include:

It may occasionally call a tool too early.
It may produce an incorrect object or destination name.
It may require a Controller to normalise or repair tool arguments.
It is not designed for direct low-level robot control.
It should not be used for safety-critical robotic control without additional verification, safety constraints, and human supervision.
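Two of these limitations (incorrect object names and premature tool calls) can be partly mitigated in the Controller. The sketch below uses fuzzy matching from Python's standard `difflib` module to map a name onto the list of currently visible objects and rejects the call when nothing matches. The function names, the `0.8` cutoff, and the idea of matching against visible objects are assumptions for illustration.

```python
import difflib
from typing import List, Optional


def normalise_name(name: str, known_names: List[str], cutoff: float = 0.8) -> Optional[str]:
    """Map a possibly misspelled name onto a known name, or return None."""
    cleaned = name.strip().lower()
    lookup = {known.lower(): known for known in known_names}
    if cleaned in lookup:
        return lookup[cleaned]
    matches = difflib.get_close_matches(cleaned, list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None


def repair_pick(arguments: dict, visible_objects: List[str]) -> Optional[dict]:
    """Return repaired pick_object arguments, or None if the call should be rejected.

    A None result covers the premature case where the object has not been seen yet.
    """
    name = arguments.get("object")
    if not isinstance(name, str):
        return None
    resolved = normalise_name(name, visible_objects)
    if resolved is None:
        return None
    return {**arguments, "object": resolved}
```

For example, `repair_pick({"object": "botle"}, ["bottle", "cup"])` resolves the name to `"bottle"`, while the same call with an empty list of visible objects returns `None`.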

Research Context

OAX-1B-Humanoid was developed as part of a humanoid robot assistant project involving:

Natural language interaction
Structured JSON tool calling
Vision-aware robot commands
Planner and Controller validation
Pick, place, search, status, and visible-object behaviours

The model is intended for experimentation with LLM-based robot-agent interfaces and high-level humanoid robot decision-making.

Model size: 0.9B params (Safetensors)
Tensor type: F16
