Collections

Reinforcement Learning on Web Interfaces Using Workflow-Guided Exploration

Paper • 1802.08802 • Published Feb 24, 2018
Mapping Natural Language Commands to Web Elements

Paper • 1808.09132 • Published Aug 28, 2018
Learning to Navigate the Web

Paper • 1812.09195 • Published Dec 21, 2018
Interactive Task and Concept Learning from Natural Language Instructions and GUI Demonstrations

Paper • 1909.00031 • Published Aug 30, 2019

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6 • 25
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6 • 12
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7 • 38
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7 • 19

OS-Copilot: Towards Generalist Computer Agents with Self-Improvement

Paper • 2402.07456 • Published Feb 12 • 41

Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration

Symbol-LLM: Towards Foundational Symbol-centric Interface For Large Language Models

SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents

More Agents Is All You Need

Generative Agents: Interactive Simulacra of Human Behavior

Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models

Self-Rewarding Language Models

Self-Discover: Large Language Models Self-Compose Reasoning Structures

Learning From Mistakes Makes LLM Better Reasoner

Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP

DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines

DSPy Assertions: Computational Constraints for Self-Refining Language Model Pipelines

ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent

Communicative Agents for Software Development

Self-Refine: Iterative Refinement with Self-Feedback

ReAct: Synergizing Reasoning and Acting in Language Models

LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement

Design2Code: How Far Are We From Automating Front-End Engineering?

Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
