license: cc-by-nc-4.0

Pretrained Weights of NaVid: Video-based VLM Plans the Next Step for Vision-and-Language Navigation (RSS 2024)

The model is trained on samples collected from the training splits of VLN-CE R2R and RxR.

| Evaluation Benchmark | TL | NE | OS | SR | SPL |
|---|---|---|---|---|---|
| VLN-CE R2R Val. | 10.7 | 5.65 | 49.2 | 41.9 | 36.5 |
| VLN-CE R2R Test | 11.3 | 5.39 | 52 | 45 | 39 |
| VLN-CE RxR Val. | 15.4 | 5.72 | 55.6 | 45.7 | 38.2 |
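
The column abbreviations are not expanded in this card. Assuming they follow the usual VLN-CE conventions (TL: trajectory length, NE: navigation error, OS: oracle success rate, SR: success rate, SPL: success weighted by path length), the sketch below shows how SR and SPL are commonly computed from per-episode results. The `Episode` class and its field names are hypothetical and are not taken from the NaVid code; the results above appear to report OS, SR, and SPL as percentages, so multiply by 100 to compare.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Episode:
    """Hypothetical per-episode record (not from the NaVid code base)."""
    success: bool         # agent stopped within the success radius of the goal
    shortest_path: float  # shortest start-to-goal distance in metres, assumed > 0
    agent_path: float     # length of the path the agent actually travelled, in metres


def success_rate(episodes: List[Episode]) -> float:
    """Fraction of episodes that ended in success (SR)."""
    return sum(e.success for e in episodes) / len(episodes)


def spl(episodes: List[Episode]) -> float:
    """Success weighted by path length: mean of S_i * l_i / max(p_i, l_i)."""
    total = 0.0
    for e in episodes:
        if e.success:
            # Failed episodes contribute 0; successful ones are discounted
            # when the agent's path is longer than the shortest path.
            total += e.shortest_path / max(e.agent_path, e.shortest_path)
    return total / len(episodes)


if __name__ == "__main__":
    demo = [
        Episode(success=True, shortest_path=10.0, agent_path=10.0),
        Episode(success=True, shortest_path=10.0, agent_path=20.0),
        Episode(success=False, shortest_path=10.0, agent_path=5.0),
    ]
    print(f"SR  = {100 * success_rate(demo):.1f}")
    print(f"SPL = {100 * spl(demo):.1f}")
```

Because SPL discounts each success by path efficiency, it can never exceed SR, which matches the ordering of the two columns in the results above.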

The related inference code can be found here.