Octo-planner: On-device Language Model for Planner-Action Agents Framework

We're thrilled to introduce Octo-planner, the latest on-device language model from Nexa AI. Developed for the Planner-Action Agents Framework, Octo-planner enables rapid and efficient planning without the need for cloud connectivity. Together with Octopus-V2, it runs locally on edge devices to support AI agent use cases.

Key Features of Octo-planner:

  • Efficient Planning: Uses a fine-tuned planning model based on Gemma-2b (2.51 billion parameters) for high efficiency and low power consumption.
  • Agent Framework: Separates planning and action, allowing for specialized optimization and improved scalability.
  • Enhanced Accuracy: Achieves a planning success rate of 98.1% on the benchmark dataset, providing reliable and effective performance.
  • On-device Operation: Designed for edge devices, ensuring fast response times and enhanced privacy by processing data locally.
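The separation of planning and action described above can be pictured as a small control loop. The following is a minimal sketch, not Nexa AI's implementation: `run_agent`, `toy_planner`, and `toy_action` are hypothetical names, and the two toy functions are stand-ins for the planner model and the action model (such as Octopus-V2).

```python
from typing import Callable, List


def run_agent(query: str,
              plan: Callable[[str], List[str]],
              act: Callable[[str], str]) -> List[str]:
    """Plan once, then execute each sub-step in order."""
    steps = plan(query)                   # planner: query -> ordered sub-steps
    return [act(step) for step in steps]  # action model: one call per sub-step


# Hypothetical stand-ins for the two models, for illustration only.
def toy_planner(query: str) -> List[str]:
    # A real planner would call the language model; this just splits on commas.
    return [part.strip() for part in query.split(",") if part.strip()]


def toy_action(step: str) -> str:
    # A real action model would map the step to a function call and run it.
    return f"executed: {step}"


results = run_agent("open the camera, take a photo", toy_planner, toy_action)
print(results)
```

Because the planner and the action model only meet at the list of sub-steps, either one can be swapped or fine-tuned without touching the other.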

Example Usage

Below is a demo of Octo-planner:

[Figure: on-device demo of Octo-planner]

Run the code below to use Octo-planner for a given question:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the planner model and its tokenizer from the Hugging Face Hub.
model_id = "NexaAIDev/octo-planner-2b"
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

question = "Find my presentation for tomorrow's meeting, connect to the conference room projector via Bluetooth, increase the screen brightness, take a screenshot of the final summary slide, and email it to all participants"

# Wrap the question in the prompt format used by the model.
prompt = f"<|user|>{question}<|end|><|assistant|>"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding; max_length counts prompt tokens plus generated tokens.
outputs = model.generate(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    max_length=1024,
    do_sample=False,
)

# The decoded string contains the prompt followed by the generated plan.
res = tokenizer.decode(outputs[0])
print(f"=== inference result ===\n{res}")
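Since the decoded string `res` includes the prompt as well as the generated text, it can help to isolate the plan before passing it on. The helper below is a minimal sketch: `extract_plan` is a hypothetical name, and it assumes the model closes its answer with the same `<|end|>` tag used in the prompt. Any other trailing special tokens would need to be stripped separately.

```python
def extract_plan(decoded: str,
                 assistant_tag: str = "<|assistant|>",
                 end_tag: str = "<|end|>") -> str:
    """Return the text generated after the assistant tag."""
    # Keep only what follows the last assistant tag (the whole string if absent).
    _, _, tail = decoded.rpartition(assistant_tag)
    # Drop everything from the first end tag onward (assumed terminator).
    plan, _, _ = tail.partition(end_tag)
    return plan.strip()


# Hand-written sample string, not real model output.
sample = "<|user|>Take a photo<|end|><|assistant|> Open the camera and take a photo<|end|>"
print(extract_plan(sample))
```

With the snippet above, this would be called as `extract_plan(res)`.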

Training Data

We wrote 10 Android API descriptions to train the models; see this file for details. Below is one example of an Android API description:

def send_email(recipient, title, content):
    """
    Sends an email to a specified recipient with a given title and content.
    Parameters:
    - recipient (str): The email address of the recipient.
    - title (str): The subject line of the email. This is a brief summary or title of the email's purpose or content.
    - content (str): The main body text of the email. It contains the primary message, information, or content that is intended to be communicated to the recipient.
    """

Contact Us

For support or to provide feedback, please contact us.

License and Citation

Refer to our license page for usage details. Please cite our work using the reference below for any academic or research purposes.

@article{chen2024octoplannerondevicelanguagemodel,
      title={Octo-planner: On-device Language Model for Planner-Action Agents}, 
      author={Wei Chen and Zhiyuan Li and Zhen Guo and Yikang Shen},
      year={2024},
      eprint={2406.18082},
      url={https://arxiv.org/abs/2406.18082}, 
}

We thank the Google Gemma team for their amazing models!

@misc{gemma-2023-open-models,
  author = {{Gemma Team, Google DeepMind}},
  title = {Gemma: Open Models Based on Gemini Research and Technology},
  url = {https://goo.gle/GemmaReport},  
  year = {2023},
}