Papers
arxiv:2310.06830

Lemur: Harmonizing Natural Language and Code for Language Agents

Published on Oct 10, 2023
ยท Featured in Daily Papers on Oct 13, 2023

Abstract

We introduce Lemur and Lemur-Chat, openly accessible language models optimized for both natural language and coding capabilities to serve as the backbone of versatile language agents. The evolution from language chat models to functional language agents demands that models not only master human interaction, reasoning, and planning but also ensure grounding in the relevant environments. This calls for a harmonious blend of language and coding capabilities in the models. Lemur and Lemur-Chat are proposed to address this necessity, demonstrating balanced proficiencies in both domains, unlike existing open-source models that tend to specialize in either. Through meticulous pre-training using a code-intensive corpus and instruction fine-tuning on text and code data, our models achieve state-of-the-art averaged performance across diverse text and coding benchmarks among open-source models. Comprehensive experiments demonstrate Lemur's superiority over existing open-source models and its proficiency across various agent tasks involving human communication, tool usage, and interaction under fully- and partially- observable environments. The harmonization between natural and programming languages enables Lemur-Chat to significantly narrow the gap with proprietary models on agent abilities, providing key insights into developing advanced open-source agents adept at reasoning, planning, and operating seamlessly across environments. https://github.com/OpenLemur/Lemur

Community

Yay, papers are back! Here's my summary of this article.

Today's conversational bots like Claude and GPT can chat impressively but aren't great at complex planning or executing technical tasks. To overcome this, new research from HKU builds open-source AI agents that blend natural language and coding skills. They're called Lemur and Lemur-Chat.

The researchers think achieving versatile real-world agents requires models that integrate both fluid natural language abilities and precise programming language control. Humans combine plain speech for higher-level goals with languages like Python when we need to plan intricately and execute exactly. AI needs both capacities too.

But most existing models specialize in pure language or pure code. There's a separation that is limiting.

The team created Lemur by pretraining the open-source Llama-2 on a massive mixed corpus with 10x more natural language than code. This improved its programming abilities while retaining conversational strength. Further instruction tuning optimized Lemur-Chat for following free-form directions in language.

Experiments found Lemur surpassed specialized coding-only models like Codex in overall benchmarks. Lemur-Chat then exceeded Lemur by 15% after instruction tuning.

More importantly, Lemur-Chat won 12/13 new "agent tests" designed to mimic real-world challenges needing both language and programming prowess.

It beat alternatives at:

  • Using tools like Python and Wikipedia to enhance reasoning
  • Debugging code by leveraging error messages
  • Improving the most from natural language feedback
  • Exploring partially observable environments like cybersecurity and web browsing simulations.

Lemur-Chat matched GPT-3.5 in many tests, closing the gap between commercial and open-source agents.

TLDR: New open-source AI agents combine coding and language skills. Experiments show the combo unlocks more performance across technical challenges.

Full summary is here. Paper is here.

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any Paper on Hugging Face checkout this Space

Sign up or log in to comment

Models citing this paper 2

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2310.06830 in a dataset README.md to link it from this page.

Spaces citing this paper 24

Collections including this paper 10