---
license: gpl-3.0
language:
  - en
tags:
  - magicdrive
  - video-generation
  - autonomous-driving
---

# MagicDriveDiT

<p align="center">
  <a href="https://arxiv.org/abs/2411.13807">📄 Paper</a> |
  <a href="https://gaoruiyuan.com/magicdrivedit/">🌐 Website</a> |
  <a href="https://github.com/flymin/MagicDriveDiT/blob/main/LICENSE">📖 LICENSE</a> |
  <a href="https://github.com/flymin/MagicDriveDiT">🤖 GitHub</a>
</p>

This repository contains the model checkpoint for the MagicDriveDiT paper.

This checkpoint was fine-tuned for 10k steps with lr=1e-5, following 40k steps of stage-3 training.

Note:
- For inference/testing, we recommend the EMA model, i.e., `ema.pt`.
- For fine-tuning, consider using the training checkpoint, i.e., `model/*`.
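The note above can be sketched as a small download helper that fetches only the files needed for a given use case. This is a minimal sketch, not part of the official MagicDriveDiT code: `checkpoint_patterns` and `download_checkpoint` are hypothetical names, the `"inference"`/`"finetune"` labels are our own, and `repo_id` is a placeholder you must fill in with this model's Hugging Face repo ID. It assumes the `huggingface_hub` package is installed and uses the `allow_patterns` argument of `snapshot_download` to filter files.

```python
from typing import List


def checkpoint_patterns(purpose: str) -> List[str]:
    """Map a use case to the file patterns recommended in the note above.

    Hypothetical helper; the purpose labels are illustrative assumptions.
    """
    if purpose == "inference":
        # EMA weights, recommended for inference/testing.
        return ["ema.pt"]
    if purpose == "finetune":
        # Training checkpoint, suggested as a starting point for fine-tuning.
        return ["model/*"]
    raise ValueError(f"unknown purpose: {purpose!r}")


def download_checkpoint(repo_id: str, purpose: str, local_dir: str) -> str:
    """Download only the files matching the chosen use case.

    `repo_id` is a placeholder: pass the ID of this model repository.
    """
    # Imported lazily so checkpoint_patterns works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        allow_patterns=checkpoint_patterns(purpose),
        local_dir=local_dir,
    )
```

For example, `download_checkpoint(repo_id, "inference", "./ckpt")` would fetch only `ema.pt` into `./ckpt`, avoiding the larger training checkpoint when it is not needed. How the downloaded files are then loaded depends on the MagicDriveDiT codebase; see the GitHub repository for the supported workflow.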

> MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control <br>
> [Ruiyuan Gao](https://gaoruiyuan.com/)<sup>1</sup>, [Kai Chen](https://kaichen1998.github.io/)<sup>2</sup>, [Bo Xiao](https://www.linkedin.com/in/bo-xiao-19909955/?originalSubdomain=ie)<sup>3</sup>, [Lanqing Hong](https://scholar.google.com.sg/citations?user=2p7x6OUAAAAJ&hl=en)<sup>4</sup>, [Zhenguo Li](https://scholar.google.com/citations?user=XboZC1AAAAAJ&hl=en)<sup>4</sup>, [Qiang Xu](https://cure-lab.github.io/)<sup>1</sup><br>
> <sup>1</sup>CUHK <sup>2</sup>HKUST <sup>3</sup>Huawei Cloud <sup>4</sup>Huawei Noah's Ark Lab

<div style="text-align: center;">
    <video width="100%" controls style="margin: auto; max-width:1080px">
        <source src="https://github.com/user-attachments/assets/f43812ea-087b-4b70-883b-1e2f1c0df8d7" type="video/mp4">
    </video>
</div>

Please find more information on our GitHub: https://github.com/flymin/MagicDriveDiT