arXiv:2410.13757

MobA: A Two-Level Agent System for Efficient Mobile Task Automation

Published on Oct 17 · Submitted by JamesZhutheThird on Oct 18

Abstract

Current mobile assistants are either limited by their dependence on system APIs or struggle with complex user instructions and diverse interfaces due to restricted comprehension and decision-making abilities. To address these challenges, we propose MobA, a novel Mobile phone Agent powered by multimodal large language models (MLLMs) that enhances comprehension and planning capabilities through a two-level agent architecture. The high-level Global Agent (GA) is responsible for understanding user commands, tracking task history and memory, and planning tasks. The low-level Local Agent (LA) predicts detailed actions in the form of function calls, guided by the sub-tasks and memory provided by the GA. An integrated Reflection Module enables efficient task completion and allows the system to handle previously unseen complex tasks. MobA demonstrates significant improvements in task execution efficiency and completion rate in real-world evaluations, underscoring the potential of MLLM-empowered mobile assistants.

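The two-level loop described in the abstract (a Global Agent that plans sub-tasks over shared memory, a Local Agent that turns each sub-task into a concrete function-call action, and a Reflection Module that checks progress) can be sketched in a few lines of Python. This is a minimal illustration only: the class and function names below (GlobalAgent, LocalAgent, ReflectionModule, Memory, call_mllm, run_task) are hypothetical stand-ins, not the paper's actual code or API.

```python
# Minimal sketch of a two-level mobile agent loop in the spirit of MobA.
# All names here are hypothetical illustrations, not the paper's implementation.
from dataclasses import dataclass, field


@dataclass
class Memory:
    """Shared task history passed from the Global Agent to the Local Agent."""
    steps: list = field(default_factory=list)

    def record(self, entry: str) -> None:
        self.steps.append(entry)


def call_mllm(prompt: str) -> str:
    """Placeholder for a multimodal LLM call; returns a canned response here."""
    return f"response to: {prompt[:40]}"


class GlobalAgent:
    """High-level 'cerebrum': understands the command and plans sub-tasks."""

    def plan(self, instruction: str, memory: Memory) -> list[str]:
        _ = call_mllm(f"Plan sub-tasks for: {instruction}\nHistory: {memory.steps}")
        # A real system would parse sub-tasks from the MLLM output.
        return ["open the target app", "perform the requested operation"]


class LocalAgent:
    """Low-level 'cerebellum': turns one sub-task into a concrete function call."""

    def act(self, sub_task: str, screen_state: str, memory: Memory) -> dict:
        _ = call_mllm(f"Sub-task: {sub_task}\nScreen: {screen_state}")
        # A real agent would emit e.g. {"name": "tap", "args": {"x": ..., "y": ...}}.
        return {"name": "tap", "args": {"target": sub_task}}


class ReflectionModule:
    """Checks whether the executed action advanced the current sub-task."""

    def succeeded(self, sub_task: str, screen_state: str) -> bool:
        _ = call_mllm(f"Did '{sub_task}' succeed given screen: {screen_state}?")
        return True  # stubbed out for the sketch


def run_task(instruction: str) -> None:
    ga, la, reflect = GlobalAgent(), LocalAgent(), ReflectionModule()
    memory = Memory()
    for sub_task in ga.plan(instruction, memory):
        screen = "current screenshot / view hierarchy (stub)"
        action = la.act(sub_task, screen, memory)
        memory.record(f"{sub_task} -> {action}")
        if not reflect.succeeded(sub_task, screen):
            # On failure the Global Agent could re-plan; omitted in this sketch.
            break


if __name__ == "__main__":
    run_task("Set an alarm for 7 a.m. tomorrow")
```
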
Community

Paper author · Paper submitter · edited Oct 18

🎮 MobA manipulates mobile phones just as you would, with a two-level agent system that mimics brain functions. The "cerebrum" (Global Agent) comprehends, plans, and reflects 🎯, while the "cerebellum" (Local Agent) predicts actions based on current information 🕹️. It achieves a superior scoring rate of 66.2% across 50 real-world scenarios, with execution efficiency comparable to that of human experts.

Paper author · Paper submitter

🎉We have open-sourced MobA on GitHub.

