---
license: cc-by-nc-4.0
---
Pretrained weights for [NaVid](https://pku-epic.github.io/NaVid/): Video-based VLM Plans the Next Step for Vision-and-Language Navigation (RSS 2024).
The model is trained on samples collected from the training splits of [VLN-CE](https://github.com/jacobkrantz/VLN-CE) R2R and RxR.
| Evaluation Benchmark | TL | NE | OS | SR | SPL |
|----------------------|:----:|:----:|:----:|:----:|:----:|
| VLN-CE R2R Val. | 10.7 | 5.65 | 49.2 | 41.9 | 36.5 |
| [VLN-CE R2R Test](https://eval.ai/web/challenges/challenge-page/719/leaderboard/1966) | 11.3 | 5.39 | 52 | 45 | 39 |
| VLN-CE RxR Val. | 15.4 | 5.72 | 55.6 | 45.7 | 38.2 |
The related inference code can be found [here](https://github.com/jzhzhang/NaVid-VLN-CE).
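
For readers unfamiliar with the column abbreviations in the table above, the following is a minimal sketch of how such metrics are commonly computed in vision-and-language navigation: trajectory length (TL), navigation error (NE), oracle success rate (OS), success rate (SR), and success weighted by path length (SPL). It is not taken from the NaVid evaluation code. The `Episode` fields, the `summarize` helper, and the 3 m success radius are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Episode:
    """Hypothetical per-episode record; field names are illustrative."""
    path_length: float             # metres actually travelled by the agent
    shortest_path_length: float    # geodesic start-to-goal distance, metres
    final_distance_to_goal: float  # distance to goal where the agent stopped
    min_distance_to_goal: float    # closest the agent ever came to the goal


def summarize(episodes: List[Episode], success_radius: float = 3.0) -> Dict[str, float]:
    """Average the per-episode metrics; the 3 m radius is an assumption."""
    if not episodes:
        raise ValueError("need at least one episode")
    n = len(episodes)
    tl = ne = os_ = sr = spl = 0.0
    for ep in episodes:
        success = ep.final_distance_to_goal < success_radius
        oracle_success = ep.min_distance_to_goal < success_radius
        tl += ep.path_length
        ne += ep.final_distance_to_goal
        os_ += float(oracle_success)
        sr += float(success)
        # SPL term: success * shortest / max(travelled, shortest)
        denom = max(ep.path_length, ep.shortest_path_length)
        if success and denom > 0:
            spl += ep.shortest_path_length / denom
    return {
        "TL": tl / n,            # metres
        "NE": ne / n,            # metres
        "OS": 100.0 * os_ / n,   # percent
        "SR": 100.0 * sr / n,    # percent
        "SPL": 100.0 * spl / n,  # percent
    }
```

As a usage example, calling `summarize` on one successful episode (10 m travelled against an 8 m shortest path) and one failed episode that passed within the radius but stopped outside it gives an SR of 50, an OS of 100, and an SPL of 40.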