
InternVLA-N1: An Open Dual-System Navigation Foundation Model with Learned Latent Plans
The technical report will be released during the upcoming open-source week. Please stay tuned!
Highlights
- Dual-System Framework
The first navigation foundation model to achieve joint tuning and asynchronous inference of System-2 reasoning and System-1 action, resulting in smooth and efficient execution during instruction-following navigation.
- State-of-the-art
The full navigation foundation model and each of its two systems achieve state-of-the-art performance on both mainstream benchmarks and our newly established challenging benchmarks, including VLN-CE R2R & RxR, GRScenes-100, VLN-PE, etc.
- Sim2Real Zero-shot Generalization
The model is trained only on the simulation dataset InternData-N1, which covers diverse scenes, embodiments, and other randomization, yet it achieves strong zero-shot generalization in the real world.
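To make the asynchronous dual-system idea from the first highlight concrete, here is a minimal, purely illustrative sketch on a discrete clock. It is not the InternVLA-N1 implementation: `LatentPlan`, `system2_plan`, `system1_act`, `run`, the heading-based "plan", and the 5-degree step limit are all hypothetical stand-ins. The only point it illustrates is that the fast controller acts on every tick with the most recent finished plan and never blocks on the slow planner.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class LatentPlan:
    """Hypothetical stand-in for a System-2 output (not the model's real format)."""
    issued_at: int   # tick at which the slow planner started computing
    heading: float   # toy "plan": a desired heading in degrees


def system2_plan(tick: int) -> LatentPlan:
    """Toy slow planner: the target heading depends on when planning started."""
    return LatentPlan(issued_at=tick, heading=30.0 + 10.0 * tick)


def system1_act(plan: Optional[LatentPlan], heading: float) -> float:
    """Toy fast controller: turn at most 5 degrees per tick toward the plan."""
    if plan is None:                 # no plan yet: hold the current heading
        return heading
    error = plan.heading - heading
    return heading + max(-5.0, min(5.0, error))


def run(ticks: int, plan_latency: int) -> List[Tuple[int, Optional[int], float]]:
    """Simulate asynchronous execution of the two systems.

    The planner needs `plan_latency` ticks per plan. The controller acts on
    every tick using the latest finished plan, without waiting for the planner.
    Returns one (tick, issued_at of the plan used, heading) entry per tick.
    """
    active: Optional[LatentPlan] = None
    pending: Optional[LatentPlan] = None
    ready_at = 0
    heading = 0.0
    log = []
    for t in range(ticks):
        if pending is not None and t >= ready_at:
            active, pending = pending, None     # swap in the finished plan
        if pending is None:
            pending = system2_plan(t)           # start the next slow plan
            ready_at = t + plan_latency
        heading = system1_act(active, heading)  # fast step on every tick
        log.append((t, active.issued_at if active else None, heading))
    return log


if __name__ == "__main__":
    for entry in run(ticks=8, plan_latency=3):
        print(entry)
```

In this toy setup the controller produces an action on all eight ticks while the planner completes only two plans, which is the behavior the asynchronous design is meant to allow. A real system would run the two models in separate processes or threads rather than on a simulated clock.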
Usage
Please refer to InternNav for its inference, evaluation and gradio demo.
Citation
If you find our work helpful, please consider starring this repo and citing:
@misc{internvla-n1,
title = {{InternVLA-N1: An} Open Dual-System Navigation Foundation Model with Learned Latent Plans},
author = {InternVLA-N1 Team},
year = {2025},
booktitle={arXiv},
}
License
This work is under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Acknowledgements
This repository is based on Qwen2.5-VL.