arXiv:2404.03648

AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent

Published on Apr 4, 2024
· Featured in Daily Papers on Apr 8, 2024

Abstract

Large language models (LLMs) have fueled many intelligent agent tasks such as web navigation, but most existing agents perform far from satisfactorily on real-world webpages, due to three factors: (1) the versatility of actions on webpages, (2) HTML text exceeding the model's processing capacity, and (3) the complexity of decision-making stemming from the open-domain nature of the web. In light of these challenges, we develop AutoWebGLM, an automated web navigation agent built upon ChatGLM3-6B that outperforms GPT-4. Inspired by human browsing patterns, we design an HTML simplification algorithm that represents webpages succinctly while preserving vital information. We employ a hybrid human-AI method to build web browsing data for curriculum training. We then bootstrap the model with reinforcement learning and rejection sampling to further improve its webpage comprehension, browser operations, and efficient task decomposition. For testing, we establish a bilingual benchmark, AutoWebBench, for real-world web browsing tasks. We evaluate AutoWebGLM across diverse web navigation benchmarks, revealing its improvements as well as the remaining challenges in tackling real environments. Related code, model, and data will be released at https://github.com/THUDM/AutoWebGLM.
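The abstract's key ingredient is the HTML simplification step. The paper details the actual algorithm; purely as a hedged illustration of what such pruning can look like, here is a minimal Python sketch (assuming BeautifulSoup; the tag lists, truncation limits, and id scheme are hypothetical choices, not the authors' method). It drops content the user never sees, keeps headings and body text, and tags interactive elements with numeric ids an agent could reference in its actions.

```python
# Minimal sketch of HTML simplification for an LLM web agent.
# NOT the AutoWebGLM algorithm -- an illustrative stand-in only.
from bs4 import BeautifulSoup

DROP_TAGS = ["script", "style", "noscript", "svg", "iframe"]  # never user-visible
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}
TEXT_TAGS = {"h1", "h2", "h3", "p", "li"}

def simplify_html(html: str, max_elements: int = 200) -> str:
    """Reduce a page to a short, id-tagged text form that fits in a model context."""
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup.find_all(DROP_TAGS):
        tag.decompose()  # remove invisible machinery outright

    lines = []
    for i, el in enumerate(soup.find_all(True)):  # all remaining tags, document order
        if len(lines) >= max_elements:
            break  # hard cap so the result stays within the context budget
        text = el.get_text(" ", strip=True)[:80]
        if el.name in INTERACTIVE_TAGS:
            # Interactive elements get a numeric id the agent can act on,
            # e.g. click(3) or type(3, "query").
            label = text or el.get("placeholder") or el.get("aria-label") or ""
            lines.append(f"[{i}] <{el.name}> {label}")
        elif el.name in TEXT_TAGS and text:
            lines.append(text)  # keep visible text content as-is
    return "\n".join(lines)

if __name__ == "__main__":
    page = "<html><body><h1>Shop</h1><a href='/cart'>Cart</a><script>track()</script></body></html>"
    print(simplify_html(page))
    # Shop
    # [3] <a> Cart
```

A real pipeline would go further (CSS visibility, iframes, deduplicating nested text), but the core trade-off is the one the abstract names: keep the information needed to act while cutting the HTML down to what the model can process.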

Community

I feel like the HTML pruning algorithm deserves a paper of its own.

Amazing work, a very comprehensive and in-depth approach. I'm not surprised this blows GPT-4 out of the water.

Thank you, very interesting work! Do you have a link to an HF page to get the model?


Models citing this paper: 0

Datasets citing this paper: 0

Spaces citing this paper: 0

Collections including this paper: 14