BTAgent/BTAgent-v0.1 · Hugging Face

BTAgent: Large Language Model for Behavior Tree Generation with Constrained DPO

This paper introduces BTAgent, an autonomous robot control method based on large language models (LLMs) that generates robot behavior trees from an operator's instructions. The main contribution is a novel approach that combines LLMs with robot agents, leveraging the parsing capabilities of LLMs to generate structured behavior trees and enable task execution. First, we propose a self-instruct-style prompting method that requires no additional human expert annotations and uses stage-based and self-reflection prompts to automatically generate behavior tree preference instruction-following datasets. Then, we introduce a constrained DPO (Direct Preference Optimization) method to fine-tune the LLM and enhance its performance. To study the method in depth, we evaluate the generated behavior trees in the StarCraft II simulation environment, where they achieve an average win rate of over 95% across heterogeneous environments. To the best of our knowledge, this is the first study to generate structured behavior trees using LLMs for intelligent agent control in the StarCraft II environment. Furthermore, this work explores the feasibility of LLMs with up to 7B parameters in understanding complex instructions and generating tasks. Code and dataset download links are available at https://github.com/BTAgent/BTAgent.
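For readers unfamiliar with the term, the following is a minimal sketch of what a structured behavior tree looks like and how it is evaluated. It is not BTAgent's actual output format or node set, which this page does not specify. The node classes, the leaf names (`enemy_visible`, `health_ok`, `attack`, `retreat`), and the health threshold of 30 are all hypothetical and chosen only for illustration. The sketch also omits the `RUNNING` status that full behavior tree implementations usually include.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List


class Status(Enum):
    SUCCESS = "success"
    FAILURE = "failure"


@dataclass
class Leaf:
    """A condition or action leaf; `fn` reads or updates a shared state dict."""
    name: str
    fn: Callable[[Dict], bool]

    def tick(self, state: Dict) -> Status:
        return Status.SUCCESS if self.fn(state) else Status.FAILURE


@dataclass
class Sequence:
    """Succeeds only if every child succeeds, evaluated left to right."""
    children: List = field(default_factory=list)

    def tick(self, state: Dict) -> Status:
        for child in self.children:
            if child.tick(state) is Status.FAILURE:
                return Status.FAILURE
        return Status.SUCCESS


@dataclass
class Selector:
    """Fallback node: succeeds as soon as one child succeeds."""
    children: List = field(default_factory=list)

    def tick(self, state: Dict) -> Status:
        for child in self.children:
            if child.tick(state) is Status.SUCCESS:
                return Status.SUCCESS
        return Status.FAILURE


def act(name: str) -> Callable[[Dict], bool]:
    """Build a hypothetical action that records its name and always succeeds."""
    def fn(state: Dict) -> bool:
        state["log"].append(name)
        return True
    return fn


# Hypothetical tree: attack if an enemy is visible and health is fine,
# otherwise fall back to retreating.
tree = Selector([
    Sequence([
        Leaf("enemy_visible", lambda s: s["enemy_visible"]),
        Leaf("health_ok", lambda s: s["health"] > 30),
        Leaf("attack", act("attack")),
    ]),
    Leaf("retreat", act("retreat")),
])


def decide(enemy_visible: bool, health: int) -> List[str]:
    """Tick the tree once and return the actions it executed."""
    state = {"enemy_visible": enemy_visible, "health": health, "log": []}
    tree.tick(state)
    return state["log"]


print(decide(True, 80))
print(decide(True, 20))
```

Because the tree is an explicit composition of a few node types, an LLM can emit it as structured text and a controller can execute it deterministically, which is the general appeal of behavior trees as a generation target.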