Papers
arxiv:2412.16410

Application of Multimodal Large Language Models in Autonomous Driving

Published on Dec 21, 2024
Authors:

Abstract

In this era of technological advancements, several cutting-edge techniques are being implemented to enhance Autonomous Driving (AD) systems, focusing on improving safety, efficiency, and adaptability in complex driving environments. However, AD still faces some problems including performance limitations. To address this problem, we conducted an in-depth study on implementing the Multi-modal Large Language Model. We constructed a Virtual Question Answering (VQA) dataset to fine-tune the model and address problems with the poor performance of MLLM on AD. We then break down the AD decision-making process by scene understanding, prediction, and decision-making. Chain of Thought has been used to make the decision more perfectly. Our experiments and detailed analysis of Autonomous Driving give an idea of how important MLLM is for AD.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2412.16410 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2412.16410 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2412.16410 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.