DaYin committed on
Commit
53c72db
1 Parent(s): cb58e2c

Update README.md

Files changed (1)
  1. README.md +14 -9
README.md CHANGED
@@ -13,28 +13,32 @@ tags:
   - planning
  ---
 
- # 🪄 Lumos: Language Agents with Unified Formats, Modular Design, and Open-Source LLMs
+ # 🪄 Agent Lumos: Unified and Modular Training for Open-Source Language Agents
  <p align="center">
  🌐<a href="https://allenai.github.io/lumos">[Website]</a> &nbsp;
- 📝<a href="">[Paper]</a> &nbsp;
+ 📝<a href="https://arxiv.org/abs/2311.05657">[Paper]</a> &nbsp;
  🤗<a href="https://huggingface.co/datasets?sort=trending&search=ai2lumos">[Data]</a> &nbsp;
  🤗<a href="https://huggingface.co/models?sort=trending&search=ai2lumos">[Model]</a> &nbsp;
+ 🤗<a href="https://huggingface.co/spaces/ai2lumos/lumos_data_demo">[Demo]</a> &nbsp;
  </p>
 
  We introduce 🪄**Lumos**, Language Agents with **Unified** Formats, **Modular** Design, and **Open-Source** LLMs. **Lumos** unifies a suite of complex interactive tasks and achieves competitive performance with GPT-4/3.5-based and larger open-source agents.
 
  **Lumos** has the following features:
  * 🧩 **Modular Architecture**:
-   - **Lumos** consists of planning, grounding, and execution modules built based on LLAMA-2-7B.
+   - 🧩 **Lumos** consists of planning, grounding, and execution modules built on LLAMA-2-7B/13B and off-the-shelf APIs.
+   - 🤗 **Lumos** utilizes a unified data format that encompasses multiple task types, so the agent framework conveniently supports a range of interactive tasks.
  * 🌍 **Diverse Training Data**:
-   - **Lumos** is trained with ~40K high-quality annotations from ground-truth reasoning steps in existing benchmarks with GPT-4.
+   - 🌍 **Lumos** is trained on ~56K diverse, high-quality subgoal/action annotations derived from ground-truth reasoning steps in existing benchmarks with GPT-4.
+   - ⚒️ **Lumos** data can be instrumental for future research on developing open-source agents for complex interactive tasks.
  * 🚀 **Competitive Performance**:
-   - 🚀 **Lumos** outperforms **GPT-4/3.5-based** agents on complex QA and web agent tasks, and **larger open agents** on maths tasks.
-   - 🚀 **Lumos** performs better than open agent baseline formulations including **chain-of-thoughts** and **unmodularized** training.
-   - 🚀 **Lumos** surpasses larger open LLM agents and domain-specific agents on an unseen task, WebShop.
+   - 🚀 **Lumos** matches or even beats **GPT-series** agents on the web and complex QA tasks Mind2Web and HotpotQA, and **larger open agents** on math and multimodal tasks.
+   - 🚀 **Lumos** exceeds contemporaneous agents fine-tuned with in-domain HotpotQA, Mind2Web, and ScienceQA annotations, such as **FireAct**, **AgentLM**, and **AutoAct**.
+   - 🚀 **Lumos** performs better than open agent baseline formulations, including **chain-of-thoughts** and **integrated** training.
+   - 🚀 **Lumos** surpasses larger open LLM agents and domain-specific agents on the unseen tasks WebShop and InterCode_SQL.
 
  ## Model Overview
- `lumos_unified_plan_iterative` is a **planning** module checkpoint finetuned on **complex QA**, **web agent** and **maths** tasks in **Lumos-Iterative (Lumos-I)** formulation.
+ `lumos_unified_plan_iterative` is a **planning** module checkpoint finetuned on **complex QA**, **web agent**, **multimodal**, and **maths** tasks in the **Lumos-Iterative (Lumos-I)** formulation.
 
  The training annotation is shown below:
 
@@ -48,8 +52,9 @@ The training annotation is shown below:
  If you find this work relevant to your research, please feel free to cite it!
  ```
  @article{yin2023lumos,
- title={Lumos: Towards Language Agents that are Unified, Modular, and Open Source},
+ title={Agent Lumos: Unified and Modular Training for Open-Source Language Agents},
  author={Yin, Da and Brahman, Faeze and Ravichander, Abhilasha and Chandu, Khyathi and Chang, Kai-Wei and Choi, Yejin and Lin, Bill Yuchen},
+ journal={https://arxiv.org/abs/2311.05657},
  year={2023}
  }
  ```
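
For readers of the updated card, a minimal usage sketch of the planning checkpoint may help. The repository id `ai2lumos/lumos_unified_plan_iterative`, the prompt wording, and the generation settings below are assumptions for illustration only; the exact input format is defined by the Lumos training data and codebase.

```python
# A minimal sketch, not an official snippet from this card: load the planning
# module with Hugging Face Transformers and ask it for a subgoal-based plan.
# The repo id and prompt wording are assumptions; consult the Lumos repository
# for the exact prompt format used during finetuning.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai2lumos/lumos_unified_plan_iterative"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Placeholder planning-style instruction: the planning module decomposes a task
# into subgoals, which the grounding module later turns into executable actions.
prompt = (
    "Please provide a reasonable subgoal-based plan to solve the given task.\n"
    "Task: Which country hosted the 2014 FIFA World Cup?; "
    "Initial Environment Description: None."
)

inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)
new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

In the full Lumos-I pipeline, the generated subgoals would be handed to the grounding module, which converts them into tool calls for the execution module.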
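
To make the modular, iterative formulation concrete, here is a hypothetical sketch of the Lumos-I loop: plan one subgoal, ground it to low-level actions, execute them, and feed the observation back to the planner. The function names, signatures, and stop condition are placeholders, not APIs from the Lumos codebase; each stub would wrap the corresponding planning, grounding, or execution component.

```python
# Hypothetical Lumos-I control loop; all functions below are placeholders.
from typing import List


def plan_next_subgoal(task: str, history: List[str]) -> str:
    """Query the planning checkpoint (e.g. with the snippet above) for one subgoal."""
    raise NotImplementedError


def ground_subgoal(subgoal: str) -> List[str]:
    """Query the grounding module to turn the subgoal into executable actions."""
    raise NotImplementedError


def execute_actions(actions: List[str]) -> str:
    """Run the actions with off-the-shelf tools/APIs and return an observation."""
    raise NotImplementedError


def run_lumos_iterative(task: str, max_steps: int = 5) -> List[str]:
    history: List[str] = []
    for _ in range(max_steps):
        subgoal = plan_next_subgoal(task, history)
        if not subgoal:  # placeholder stop signal from the planner
            break
        observation = execute_actions(ground_subgoal(subgoal))
        history.append(f"{subgoal} -> {observation}")
    return history
```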