Yan Wang - Research Scientist and Tech Lead at NVIDIA Research

Yan Wang

I am a Research Scientist and Tech Lead at NVIDIA Research, specializing in autonomous driving, VLMs, VLA models, and reasoning. I lead the Alpamayo VLA project, with Alpamayo 1 announced at CES 2026 by Jensen Huang. I received my Ph.D. in Computer Science from Cornell University, where I was advised by Kilian Q. Weinberger and Bharath Hariharan. Previously, I was a Research Scientist at Waymo.

Email / GitHub / LinkedIn / X / Google Scholar

Research

I'm interested in multimodal foundation models, 3D computer vision, and autonomous driving. My research focuses on developing end-to-end learning systems for autonomous vehicles and on enabling driving models to reason on their own. I'm particularly interested in vision-language-action (VLA) models with reasoning capabilities and world models for embodied AI.

News

[Jan 2026] Alpamayo 1 announced by Jensen Huang at CES 2026! 🚗🤖
[Jan 2026] Organizing End-to-End 3D Learning Workshop at CVPR 2026.
[Dec 2025] Keynote talk on reasoning VLA for autonomous vehicles at the NeurIPS 2025 Workshop on VLM4RWD.
[Oct 2025] Released Alpamayo-R1, a reasoning VLA model for autonomous driving.
[Oct 2025] Keynote at ICCV 2025 DriveX Workshop on reasoning models for physical AI.
[Oct 2025] 1 paper presented at ICCV 2025.
[Jul 2025] 1 paper accepted by COLM 2025.
[Feb 2025] 1 paper accepted by ICRA 2025.
[Jan 2025] 2 papers accepted by ICLR 2025.
[Sep 2024] 2 papers accepted by NeurIPS 2024.

Selected Publications

For a complete list of publications, please visit my Google Scholar profile.

	Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail Project Lead NVIDIA Technical Report, 2025 PDF A vision-language-action model that integrates Chain of Causation reasoning with trajectory planning, demonstrating improvements in open-loop evaluation, closed-loop evaluation, and real-world road tests. Announced as Alpamayo 1 by Jensen Huang at CES 2026.
	Counterfactual VLA: Self-Reflective Vision-Language-Action Model with Adaptive Reasoning Zhenghao "Mark" Peng, Wenhao Ding, Yurong You, Yuxiao Chen, Wenjie Luo, Thomas Tian, Yulong Cao, Apoorva Sharma, Danfei Xu, Boris Ivanovic, Boyi Li, Bolei Zhou, Yan Wang, Marco Pavone * Co-advised arXiv, 2025 arXiv A self-reflective VLA model with adaptive reasoning capabilities that learns from counterfactual feedback to improve decision-making in autonomous driving.
	Language-Image Models with 3D Understanding Jang Hyun Cho, Boris Ivanovic, Yulong Cao, Edward Schmerling, Yue Wang, Xinshuo Weng, Boyi Li, Yurong You, Philipp Krähenbühl, Yan Wang, Marco Pavone* * Co-advised ICLR, 2025 OpenReview Bridging vision-language models (VLMs) with 3D understanding.
	STORM: Spatio-Temporal Reconstruction Model for Large-Scale Outdoor Scenes Jiawei Yang, Jiahui Huang, Yuxiao Chen, Yan Wang, Boyi Li, Yurong You, Maximilian Igl, Apoorva Sharma, Peter Karkus, Danfei Xu, Boris Ivanovic, Yue Wang, Marco Pavone ICLR, 2025 OpenReview A feed-forward, self-supervised method for fast and accurate reconstruction of dynamic 3D scenes from sparse, multi-timestep, posed camera images.
	Can Test-Time Scaling Improve World Foundation Model? Wenyan Cong, Hanqing Zhu, Peihao Wang, Bangya Liu, Dejia Xu, Kevin Wang, David Z. Pan, Yan Wang, Zhiwen Fan, Zhangyang Wang COLM, 2025 arXiv Exploring test-time scaling strategies for world foundation models.
	Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving Yan Wang, Wei-Lun Chao, Div Garg, Bharath Hariharan, Mark Campbell, Kilian Q. Weinberger CVPR, 2019 arXiv Proposing the Pseudo-LiDAR representation that bridges the gap between image-based and LiDAR-based 3D object detection.
	Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research Cole Gulino, Justin Fu, Wenjie Luo, George Tucker, Eli Bronstein, Yiren Lu, Jean Harb, Xinlei Pan, Yan Wang, et al. NeurIPS, 2023 arXiv / code A high-performance, JAX-based simulator for autonomous driving research enabling large-scale RL training.

Academic Service

Workshop Organizer:
- CVPR 2026: End-to-End 3D Learning
- ICCV 2025: End-to-End 3D Learning
- ECCV 2024: Autonomous Vehicles meet Multimodal Foundation Models
- CVPR 2023: Autonomous Driving Workshop
- CVPR 2022: Autonomous Driving Workshop
- CVPR 2021: Autonomous Driving Workshop

Conference Reviewer: CVPR, ICCV, ECCV, NeurIPS, ICLR, ICML, ICRA, AAAI, IJCAI, AISTATS

Journal Reviewer: IEEE TPAMI, TKDE

Website template from Jon Barron