metadata
title: Matrix
emoji: 🐟
colorFrom: blue
colorTo: blue
sdk: docker
app_file: server.py
pinned: true
short_description: AI Gaming server
app_port: 8080
disable_embedding: false

Matrix-Game: Interactive World Foundation Model


πŸ“ Overview

Matrix-Game is a 17B-parameter interactive world foundation model for controllable game world generation.

✨ Key Features

  • 🎯 Feature 1: Interactive Generation. A diffusion-based image-to-world model that generates high-quality videos conditioned on keyboard and mouse inputs, enabling fine-grained control and dynamic scene evolution.
  • 🚀 Feature 2: GameWorld Score. A comprehensive benchmark for evaluating Minecraft world models across four key dimensions: visual quality, temporal quality, action controllability, and physical rule understanding.
  • 💡 Feature 3: Matrix-Game Dataset. A large-scale Minecraft dataset with fine-grained action annotations, supporting scalable training for interactive and physically grounded world modeling.

🔥 Latest Updates

  • [2025-05] 🎉 Initial release of Matrix-Game Model

🚀 Performance Comparison

GameWorld Score Benchmark Comparison

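| Model | Image Quality ↑ | Aesthetic Quality ↑ | Temporal Cons. ↑ | Motion Smooth. ↑ | Keyboard Acc. ↑ | Mouse Acc. ↑ | 3D Cons. ↑ |
|-----------|------|------|------|------|------|------|------|
| Oasis | 0.65 | 0.48 | 0.94 | 0.98 | 0.77 | 0.56 | 0.56 |
| MineWorld | 0.69 | 0.47 | 0.95 | 0.98 | 0.86 | 0.64 | 0.51 |
| Ours | 0.72 | 0.49 | 0.97 | 0.98 | 0.95 | 0.95 | 0.76 |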

Metric Descriptions:

  • Image Quality / Aesthetic: Visual fidelity and perceptual appeal of generated frames

  • Temporal Consistency / Motion Smoothness: Temporal coherence and smoothness between frames

  • Keyboard Accuracy / Mouse Accuracy: Accuracy in following user control signals

  • 3D Consistency: Geometric stability and physical plausibility over time

Please see our GameWorld benchmark for implementation details.
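
To make the margins in the comparison table concrete, the sketch below recomputes, for each metric, the gap between Matrix-Game ("Ours") and the best-scoring baseline. The numbers are copied from the table; the dictionary layout and the function name `margins_over_best_baseline` are illustrative choices and not part of the GameWorld Score tooling.

```python
# Scores copied from the GameWorld Score table (higher is better).
METRICS = ["Image Quality", "Aesthetic Quality", "Temporal Cons.",
           "Motion Smooth.", "Keyboard Acc.", "Mouse Acc.", "3D Cons."]

SCORES = {
    "Oasis":     [0.65, 0.48, 0.94, 0.98, 0.77, 0.56, 0.56],
    "MineWorld": [0.69, 0.47, 0.95, 0.98, 0.86, 0.64, 0.51],
    "Ours":      [0.72, 0.49, 0.97, 0.98, 0.95, 0.95, 0.76],
}

def margins_over_best_baseline(scores, ours="Ours"):
    """Return {metric: ours - best baseline score}, rounded to 2 decimals."""
    baselines = [name for name in scores if name != ours]
    result = {}
    for i, metric in enumerate(METRICS):
        best = max(scores[name][i] for name in baselines)
        result[metric] = round(scores[ours][i] - best, 2)
    return result

if __name__ == "__main__":
    for metric, gap in margins_over_best_baseline(SCORES).items():
        print(f"{metric:<18} {gap:+.2f}")
```

Under this reading of the table, the largest margins are in Mouse Acc. (+0.31) and 3D Cons. (+0.20), while Motion Smooth. is tied at 0.98 across all three models.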

Human Evaluation

Human Win Rate

Double-blind human evaluation by two independent groups across four key dimensions: Overall Quality, Controllability, Visual Quality, and Temporal Consistency.
Scores represent the percentage of pairwise comparisons in which each method was preferred. Matrix-Game consistently outperforms prior models across all metrics and both groups.
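
As an illustration of the win-rate definition above, here is a minimal sketch of one possible reading: each method's score is the share of the pairwise comparisons it took part in that it won. The vote records are made up for illustration and are not the study's data, and `win_rates` is a hypothetical helper name.

```python
from collections import Counter

def win_rates(votes):
    """Percentage of pairwise comparisons each method won.

    `votes` is a list of (method_a, method_b, winner) tuples, where
    `winner` must be either method_a or method_b.
    """
    wins = Counter()
    appearances = Counter()
    for a, b, winner in votes:
        if winner not in (a, b):
            raise ValueError(f"winner {winner!r} is not one of the pair")
        appearances[a] += 1
        appearances[b] += 1
        wins[winner] += 1
    return {m: 100.0 * wins[m] / appearances[m] for m in appearances}

# Hypothetical votes for illustration only, NOT the study's data.
example_votes = [
    ("Matrix-Game", "Oasis", "Matrix-Game"),
    ("Matrix-Game", "MineWorld", "Matrix-Game"),
    ("Oasis", "MineWorld", "MineWorld"),
    ("Matrix-Game", "Oasis", "Oasis"),
]

if __name__ == "__main__":
    for method, rate in win_rates(example_votes).items():
        print(f"{method}: {rate:.1f}%")
```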

🚀 Quick Start

# clone the repository:
git clone https://github.com/SkyworkAI/Matrix-Game.git
cd Matrix-Game

# install dependencies:
pip install -r requirements.txt

# install apex and FlashAttention-3 (both are required; follow each project's own build instructions):
#   apex: https://github.com/NVIDIA/apex
#   FlashAttention-3: https://github.com/Dao-AILab/flash-attention

# inference
bash run_inference.sh
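
Because apex and FlashAttention-3 are installed separately from `requirements.txt`, it can help to confirm they are importable before running inference. The following is a minimal sketch using only the standard library. The import names are assumptions: apex is normally imported as `apex`, but the module name for FlashAttention depends on which build is installed, so adjust `DEFAULT_MODULES` to match your environment.

```python
import importlib.util

# Import names are assumptions; the FlashAttention module name in
# particular varies between builds, so edit this tuple as needed.
DEFAULT_MODULES = ("apex", "flash_attn")

def missing_modules(modules=DEFAULT_MODULES):
    """Return the names in `modules` that cannot be located for import."""
    return [name for name in modules
            if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All listed dependencies were found.")
```

This only checks that the packages can be located; it does not verify that they were built against the CUDA and PyTorch versions in use.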

🔧 Hardware Requirements

  • GPU:
    • NVIDIA A100/H100
  • VRAM:
    • Requires ≥80GB of GPU memory for a single 65-frame video inference.
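
A quick way to check a machine against the VRAM requirement above is sketched below. It assumes PyTorch is installed and reads "80GB" as 80 × 10⁹ bytes; the memory a driver reports can differ slightly from the advertised size, so treat the threshold as approximate. The function names are illustrative.

```python
# Assumption: the "80GB" requirement is read as decimal gigabytes.
REQUIRED_BYTES = 80 * 10**9

def has_enough_vram(total_bytes, required_bytes=REQUIRED_BYTES):
    """True if a GPU with `total_bytes` of memory meets the requirement."""
    return total_bytes >= required_bytes

def check_local_gpus():
    """Return [(device_name, total_bytes, ok)] for each visible CUDA GPU.

    Returns an empty list when PyTorch is not installed or no GPU is visible.
    """
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    report = []
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        report.append((props.name, props.total_memory,
                       has_enough_vram(props.total_memory)))
    return report

if __name__ == "__main__":
    for name, total, ok in check_local_gpus():
        status = "OK" if ok else "below requirement"
        print(f"{name}: {total / 10**9:.1f} GB ({status})")
```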

⭐ Acknowledgements

We would like to express our gratitude to:

We are grateful to the broader research community for their open exploration and contributions to the field of interactive world generation.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

📎 Citation

If you find this project useful, please cite our paper:

@article{zhang2025matrixgame,
  title     = {Matrix-Game: Interactive World Foundation Model},
  author    = {Yifan Zhang and Chunli Peng and Boyang Wang and Puyi Wang and Qingcheng Zhu and Zedong Gao and Eric Li and Yang Liu and Yahui Zhou},
  journal   = {arXiv},
  year      = {2025}
}