---
license: apache-2.0
datasets:
- Zhoues/Goal-Drift-Dataset
language:
- en
- zh
---
# Model Card for *MineDreamer* 🔥


[![arXiv](https://img.shields.io/badge/arXiv%20paper-2403.12037-b31b1b.svg)](https://arxiv.org/abs/2403.12037)

[![project page](https://img.shields.io/badge/Play%20with%20MineDreamer%21-MineDreamer%20project%20page-lightblue)](https://sites.google.com/view/minedreamer/main)

*MineDreamer* is an instructable embodied agent for simulated control, built on recent advances in Multimodal Large Language Models (MLLMs) and diffusion models.



<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/63f08dc79cf89c9ed1bb89cd/S62I1Tn5qz5qJ3IkgMHH8.png" width=93%>
</p>
  
*MineDreamer* follows instructions steadily by employing a Chain-of-Imagination (CoI) mechanism: it envisions the step-by-step process of executing an instruction, translates each imagination into a more precise visual prompt tailored to the current state, and then generates keyboard-and-mouse actions to efficiently achieve that imagination.


<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/63f08dc79cf89c9ed1bb89cd/LJxBMChCFng_RkXwUotfk.png" width=93%>
</p>
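
The Chain-of-Imagination loop described above can be sketched roughly as follows. This is a minimal illustration, not MineDreamer's actual code: the function `chain_of_imagination`, its callables `imagine`, `to_prompt`, `act`, and `step`, and the `reimagine_every` interval are all hypothetical names and assumptions, and the toy one-dimensional environment merely stands in for the simulator.

```python
from typing import Any, Callable, List, Tuple


def chain_of_imagination(
    obs: Any,
    instruction: str,
    imagine: Callable[[Any, str], Any],       # hypothetical: (observation, instruction) -> imagined next state
    to_prompt: Callable[[Any, Any], Any],     # hypothetical: (observation, imagination) -> visual prompt
    act: Callable[[Any, Any], Any],           # hypothetical: (observation, visual prompt) -> action
    step: Callable[[Any, Any], Tuple[Any, bool]],  # hypothetical environment step -> (new observation, done)
    max_steps: int,
    reimagine_every: int = 1,                 # assumption: how often a new imagination is produced
) -> List[Any]:
    """Imagine the next sub-goal, turn it into a prompt, act on it, and repeat."""
    actions: List[Any] = []
    goal = None
    for t in range(max_steps):
        # Refresh the imagination from the current observation at a fixed interval.
        if t % reimagine_every == 0:
            goal = imagine(obs, instruction)
        # Tailor the imagination to the current state, then pick an action.
        prompt = to_prompt(obs, goal)
        action = act(obs, prompt)
        obs, done = step(obs, action)
        actions.append(action)
        if done:
            break
    return actions


# Toy stand-ins: the "world" is an integer position and the task is to reach TARGET.
TARGET = 3


def toy_imagine(obs: int, instruction: str) -> int:
    return min(obs + 1, TARGET)  # imagine being one step closer


def toy_to_prompt(obs: int, goal: int) -> int:
    return goal - obs  # remaining offset from the current state


def toy_act(obs: int, prompt: int) -> str:
    return "forward" if prompt > 0 else "noop"


def toy_step(obs: int, action: str) -> Tuple[int, bool]:
    new_obs = obs + 1 if action == "forward" else obs
    return new_obs, new_obs >= TARGET


actions = chain_of_imagination(
    0, "walk to position 3", toy_imagine, toy_to_prompt, toy_act, toy_step, max_steps=10
)
print(actions)
```

In a real agent, the three callables would presumably be backed by learned models (for example, a diffusion-based image editor for `imagine`), but the wiring here is only an assumption about how the stages connect.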


**This repo hosts MineDreamer's InstructPix2Pix checkpoints, which serve as both the baseline checkpoints and the training stage 2 checkpoints for the Imaginator.**

For more details and tutorials, see https://github.com/Zhoues/MineDreamer.