nuScenes Waypoints Model (Trajectory-Only)

Part of the RnB-EnCoRe-SelfDriving collection from the Stanford Autonomous Systems Lab.

This is a Qwen3-VL-4B vision-language model fine-tuned on nuScenes driving data to directly predict a future waypoint trajectory from multi-camera observations, without producing intermediate natural-language reasoning.

For the reasoning + waypoints variant, see stanfordasl/nuscenes-rnbencore-reasoning-waypoints.

Model details

  • Base model: Qwen/Qwen3-VL-4B-Instruct (Qwen3VLForConditionalGeneration)
  • Architecture: hidden size 2560, 36 layers
  • Modality: image/video + text → text
  • Task: future trajectory (waypoint) prediction on nuScenes
  • Output: predicted waypoints (no chain-of-thought)

Training

  • Fine-tuned on a nuScenes VQA trajectory-only dataset
  • Epochs: 30 (10,980 optimizer steps)
  • Max sequence length: 2048

Usage

For dataset preparation, prompting, inference, and evaluation, follow the instructions in the project repository: https://github.com/rnb-encore/RnB-EnCoRe-SelfDriving

from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "stanfordasl/nuscenes-waypoints-model"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Build a chat message with the driving camera image(s) + prompt,
# then processor.apply_chat_template(...) and model.generate(...).
# See the GitHub repo for the exact prompt format and post-processing.

Intended use & limitations

This model is a research artifact for autonomous-driving planning experiments. It was trained on nuScenes and is not intended for deployment in real vehicles or safety-critical settings. Outputs may be inaccurate or unsafe; always validate in simulation before any downstream use.

Citation

If you use this model, please cite the RnB-EnCoRe self-driving work:

Downloads last month
30
Safetensors
Model size
4B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for stanfordasl/nuscenes-waypoints-model

Finetuned
(305)
this model

Collection including stanfordasl/nuscenes-waypoints-model

Paper for stanfordasl/nuscenes-waypoints-model