
Survey of end-to-end autonomous driving systems

Information Technology | 2024-01-12 | Chen Yanyan, Tian Daxin, Lin Chunmian | Beihang University | 冷***

AI Summary

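This paper first introduces three technical paradigms for autonomous driving: the modular system architecture, the multi-task learning architecture, and the end-to-end system architecture. The modular architecture decouples perception, prediction, planning, and control into independent submodules. This makes the system easy to debug, verify, and reuse, but it suffers from information loss and compounding errors. The multi-task learning architecture attaches several decoder heads to a shared backbone network to reduce computational cost, but the tasks have different optimization objectives, so their representations may conflict. The end-to-end architecture simplifies the network by removing the explicit intermediate interfaces, so that the whole system, including its intermediate representations, can be optimized toward the final objective. Early end-to-end methods, however, lack an explicit representation of the driving scene and are therefore weakly interpretable.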
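To make the multi-task idea concrete, here is a minimal sketch, not the surveyed authors' implementation. It assumes a toy linear backbone and two hypothetical heads named `detection` and `lane`; the class name `MultiTaskModel` and all weights are illustrative. It shows the shared feature being computed once and reused by every head, which is where the computational saving comes from.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[Vector]


def linear(weights: Matrix, x: Vector) -> Vector:
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x: Vector) -> Vector:
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


class MultiTaskModel:
    """A shared backbone feeding several task-specific decoder heads."""

    def __init__(self, backbone_w: Matrix, heads: Dict[str, Matrix]) -> None:
        self.backbone_w = backbone_w
        self.heads = heads
        self.backbone_calls = 0  # counts how often the backbone is evaluated

    def forward(self, x: Vector) -> Dict[str, Vector]:
        # The backbone runs once per input ...
        self.backbone_calls += 1
        feature = relu(linear(self.backbone_w, x))
        # ... and every head decodes from the same shared feature.
        return {name: linear(w, feature) for name, w in self.heads.items()}


# Hypothetical weights chosen only so the arithmetic is easy to follow.
model = MultiTaskModel(
    backbone_w=[[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]],
    heads={"detection": [[1.0, 0.0]], "lane": [[0.0, 1.0]]},
)
outputs = model.forward([2.0, 1.0, 1.0])
print(outputs)                # {'detection': [1.0], 'lane': [2.0]}
print(model.backbone_calls)   # 1
```

Because both heads read the same feature vector, training would send gradients from both task losses into the same backbone weights. That shared update is one way the representation conflict described above can arise.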

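Second, the paper surveys emerging end-to-end autonomous driving frameworks in detail, from input-output modalities to system architecture. The main input modalities are camera images, vehicle state, and heterogeneous multimodal data; the main output modalities include steering and speed, waypoints, and navigation inputs. Existing end-to-end architectures fall into two mainstream paradigms: weakly interpretable end-to-end and modular joint end-to-end. Weakly interpretable methods rely mainly on imitation learning and reinforcement learning, but they lack an explicit representation of the driving scene and are hard to interpret. Modular joint end-to-end methods keep the modular components while training the whole pipeline end to end, combining the strengths of traditional modular and end-to-end approaches and retaining a degree of interpretability.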
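Imitation learning in its simplest form (behavior cloning) can be sketched as supervised regression from observations to expert actions. The example below is a toy illustration under stated assumptions: a one-dimensional observation (lateral offset from the lane center), a linear policy, and synthetic expert data generated by a hypothetical rule `steer = -0.5 * offset`. None of these choices come from the survey.

```python
from typing import List, Tuple


def fit_linear_policy(
    observations: List[float],
    expert_actions: List[float],
    lr: float = 0.1,
    steps: int = 500,
) -> Tuple[float, float]:
    """Fit steer = w * offset + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(observations)
    for _ in range(steps):
        # Difference between the policy's action and the expert's action.
        errors = [w * x + b - a for x, a in zip(observations, expert_actions)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, observations))
        grad_b = 2.0 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Synthetic "expert" demonstrations (assumption): steer back toward the center.
offsets = [-2.0, -1.0, 0.0, 1.0, 2.0]
expert_steer = [-0.5 * x for x in offsets]

w, b = fit_linear_policy(offsets, expert_steer)
print(f"learned policy: steer = {w:.3f} * offset + {b:.3f}")
```

The learned policy maps input straight to a control command with no intermediate scene representation, which is the property the summary calls weak interpretability.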

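Next, the paper briefly introduces open-loop and closed-loop evaluation of end-to-end autonomous driving systems and the scenarios each suits. Closed-loop evaluation gives the system control of the vehicle in a controlled environment and assesses driving performance in a real or simulated scenario in terms of route completion, safety, and comfort. Open-loop evaluation is mainly an offline evaluation on datasets of real-world human driving: it measures how closely the agent's driving behavior (actions or trajectories) matches the expert ground truth, and the extent of its collisions with other agents.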
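The two open-loop quantities mentioned above, deviation from the expert trajectory and collisions with other agents, can be sketched as follows. This is a minimal illustration, assuming 2D waypoints sampled at matching timesteps and other agents modeled as points with a fixed collision radius; the function names and the radius are hypothetical, and real benchmarks define these metrics in more detail.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def average_l2_error(predicted: Sequence[Point], expert: Sequence[Point]) -> float:
    """Mean Euclidean distance between predicted and expert waypoints."""
    if len(predicted) != len(expert) or not predicted:
        raise ValueError("trajectories must be non-empty and equally long")
    return sum(math.dist(p, e) for p, e in zip(predicted, expert)) / len(predicted)


def collision_rate(
    predicted: Sequence[Point],
    agents_per_step: Sequence[List[Point]],
    radius: float,
) -> float:
    """Fraction of timesteps where the ego waypoint is within `radius` of any agent."""
    if len(predicted) != len(agents_per_step) or not predicted:
        raise ValueError("need one agent list per predicted waypoint")
    hits = sum(
        1
        for ego, agents in zip(predicted, agents_per_step)
        if any(math.dist(ego, agent) < radius for agent in agents)
    )
    return hits / len(predicted)


# Toy data (assumption): three timesteps, one other agent per step.
predicted_traj = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
expert_traj = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
other_agents = [[(5.0, 5.0)], [(1.0, 0.5)], [(5.0, 5.0)]]

l2 = average_l2_error(predicted_traj, expert_traj)
rate = collision_rate(predicted_traj, other_agents, radius=1.0)
print(f"average L2 error: {l2:.3f} m, collision rate: {rate:.3f}")
```

Both functions score a fixed, logged trajectory and never feed the agent's actions back into the scene. That is what separates open-loop evaluation from the closed-loop setting.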

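Finally, the paper summarizes research on end-to-end autonomous driving systems and, from the perspectives of data mining and architecture design, outlines the field's open challenges and key problems. On the data side, end-to-end methods need massive, large-scale data together with advanced data cleaning and automatic labeling techniques. On the architecture side, explicit end-to-end methods need further refinement to improve interpretability, and must also address multimodal fusion, abstract visual representation, and the design of model structures and implicit interfaces. In addition, large models that take natural language prompts or directly relate vision and language deserve further attention.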

Survey of end-to-end autonomous driving systems
Chen Yanyan, Tian Daxin*, Lin Chunmian, Yin Hongbo
School of Transportation Science and Engineering, Beihang University, Beijing 102200, China

Abstract (translated from the Chinese): In recent years, deep learning has driven the development of end-to-end autonomous driving frameworks, giving rise to a series of new research topics and deployment solutions. This paper starts from the classic modular system, briefly reviews the four functional modules of autonomous driving (perception, prediction, planning, and decision-making), and analyzes the limitations of traditional modular and multi-task methods. It then surveys emerging end-to-end autonomous driving frameworks from input-output modalities to system architecture, describes in detail the two mainstream paradigms of weakly interpretable end-to-end and modular joint end-to-end, and examines the shortcomings of existing work. Next, it briefly introduces open-loop and closed-loop evaluation methods for end-to-end autonomous driving systems and the scenarios each suits. Finally, it summarizes research on end-to-end autonomous driving systems and outlines open challenges and key problems from the perspectives of data mining and architecture design.

Keywords: artificial intelligence (AI); autonomous driving; modular systems; end-to-end systems; data-driven; interpretability

Abstract: Deep learning technologies have accelerated the development and advancement of end-to-end autonomous driving frameworks in recent years, sparking the emergence of numerous cutting-edge research topics and application deployment solutions. The "divide and conquer" architecture design concept, which aims to construct multiple independent but related module components, integrate them into the developed software system in a specific semantic or geometric order, and ultimately deploy these components to the actual vehicle, is the foundation for the majority of the autonomous driving systems currently in use, also known as modular systems. However, a well-developed modular design typically comprises thousands of components, placing a considerable burden on the graphics memory and processing capacity of automotive CPUs. Furthermore, the intrinsic errors of the stacked modules accumulate as the number of modules grows, and upstream flaws cannot be fixed in downstream modules, presenting a major risk to vehicle safety. A multitask architecture based on the "task parallelism" principle aims to efficiently infer multiple tasks in parallel by designing various decoder heads with a shared backbone network to reduce computational consumption. However, the optimization goals for various tasks may not be consistent, and sharing features indiscriminately can even degrade the overall performance of the system. In contrast to the previous two system architectures, the end-to-end technology paradigm eliminates information bottlenecks and cumulative errors due to the integration of numerous intermediate components based on rule interfaces, allowing the network to continually optimize toward a unified objective. A large model can be used to generate low-level control signals or vehicle motion planning based on inputs such as sensor data and vehicle status. With sensors serving as inputs, the early end-to-end design based on imitation and reinforcement learning directly outputs the final control commands for steering, braking, and acceleration. However, this completely "black box" network, also referred to as a weakly interpretable end-to-end method, has no explicit representation of driving scenarios. Thus, understanding the reasoning behind the decision or prediction of a vehicle is difficult for humans, making debugging, validation, and optimization challenging. Even worse, once the model malfunctions or unexpected situations occur, accurately detecting, avoiding, and repairing problems in a timely manner becomes difficult, all of which are crucial for maintaining the safe operation of intelligent vehicles. The component decoupling approach facilitates the development and optimization of individual modules in the conventional modular system, thereby guaranteeing steady representation performance and strong interpretability for each submodule. Unfortunately, this method falls short of achieving unified goals at the optimization level, that is, integrating optimization and learning toward the ultimate planning goal. A modular joint end-to-end autonomous driving architecture, which preserves the modular driving system while allowing the differentiability of each module, is a workable solution to ensure that every module has sufficient interpretability and overall automatic optimization capabilities. The basic idea behind this technology lies in the creation of a unique neural network that connects all independent modules and enables the gradients from the planning modules to be fed back down to the initial sensor input for end-to-end execution. In other words, this kind of approach merely modifies the submodule connection mechanism while maintaining the classic modular technology stack; that is, this approach substitutes a new implicit interface for the previous explicit interfaces, which were rule-based and required manual creation. Modular joint end-to-end procedures offer a certain interpretability because of the distinct separation between modules. The explicit end-to-end system is a relative decoupling based on overall design and exhibits some degree of logic in its sequential functioning from perception to prediction, and then to planning modules during decision inference. The model can be intentionally adjusted when it encounters unknown and uncontrollable results by understanding the operational logic underlying the explicit solution. Furthermore, visualization methods, such as internal features

