arxiv:2312.09187

Vision-Language Models as a Source of Rewards

Published on Dec 14, 2023 · Submitted by akhaliq on Dec 15, 2023

Abstract

Building generalist agents that can accomplish many goals in rich open-ended environments is one of the research frontiers for reinforcement learning. A key limiting factor for building generalist agents with RL has been the need for a large number of reward functions for achieving different goals. We investigate the feasibility of using off-the-shelf vision-language models, or VLMs, as sources of rewards for reinforcement learning agents. We show how rewards for visual achievement of a variety of language goals can be derived from the CLIP family of models, and used to train RL agents that can achieve a variety of language goals. We showcase this approach in two distinct visual domains and present a scaling trend showing how larger VLMs lead to more accurate rewards for visual goal achievement, which in turn produces more capable RL agents.
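
The abstract does not spell out the reward formula, so the following is only a minimal sketch of one way a reward could be derived from a CLIP-style contrastive model. It assumes the image and text embeddings have already been computed by some image and text encoder (no real CLIP API is called here), and that the goal prompt is scored against a set of negative prompts. The function name `vlm_reward`, the `temperature` and `threshold` parameters, and the use of negative prompts are illustrative assumptions, not details confirmed by this page.

```python
import numpy as np


def vlm_reward(image_emb, goal_emb, negative_embs, temperature=0.1, threshold=0.5):
    """Hypothetical helper: binary reward from contrastive embeddings.

    image_emb:     (d,) embedding of the current observation (assumed precomputed).
    goal_emb:      (d,) embedding of the language goal (assumed precomputed).
    negative_embs: (k, d) embeddings of distractor prompts (an assumption).
    """
    def normalize(x):
        # Scale to unit length so dot products become cosine similarities.
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    image = normalize(np.asarray(image_emb, dtype=float))
    # Row 0 is the goal; the remaining rows are the negatives.
    texts = normalize(np.vstack([goal_emb, negative_embs]).astype(float))

    # Temperature-scaled cosine similarities, then a numerically stable softmax.
    logits = texts @ image / temperature
    logits = logits - logits.max()
    probs = np.exp(logits) / np.exp(logits).sum()

    # Reward 1.0 only if the goal prompt wins with enough probability mass.
    return float(probs[0] > threshold)


# Toy 2-D embeddings standing in for real encoder outputs.
reward_match = vlm_reward([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
reward_miss = vlm_reward([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

Thresholding the softmax probability gives a sparse binary signal, which is one simple way to keep noisy similarity scores from leaking into the reward. A dense variant could return `probs[0]` directly instead.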

Community

IMG_20211218_132613.jpg
What colours are in this image?

What is this image?

What breeds are these dogs?

Can you tell me what breed these dogs are?

This is not a page where you can try the model, guys; this is a research paper. 😅

@akhaliq Sorry to bother you, but I have noticed quite a few papers getting comments from people who think this is a place to try models. Unfortunately, this confusion seems to be growing, and I suggest adding a short disclaimer to make clear that this is not the place to test the models.
For example, two pages on 15-Dec-2023 had these types of comments: one of them is this page and the other is here.
