MASTER THESIS

Research on Intelligent Task Scheduling Algorithms for Unmanned Aerial Vehicles in Space Air Ground Integrated Network

A Master Thesis Submitted to University of Electronic Science and Technology of China

Discipline: Information and Communication Engineering
Student ID: 202121010740
Author: Wang Yuhui (王宇辉)
Supervisor: Prof. Sun Gang (孙罡)
School: School of Information and Communication Engineering

ABSTRACT

The space-air-ground integrated network (SAGIN) is regarded as a key architecture for next-generation networks, in which the space-based and air-based networks are potential candidates for assisting with and offloading computing tasks. In SAGIN task scheduling, unmanned aerial vehicles (UAVs), as the main nodes of the air-based network, are responsible for collecting and processing computing tasks from ground devices. According to the characteristics of each task and its own state, a UAV can choose to compute a task locally or offload it to a ground base station or the space-based network.
This thesis targets performance optimization and designs intelligent UAV task scheduling algorithms for SAGIN in two scenarios, resource competition and multi-agent cooperation, so that UAVs can reasonably assign computing tasks to different target devices and maximize network utility and system benefit.

To address load imbalance and long queuing delay in the resource competition scenario, this thesis designs a task scheduling algorithm based on proportional fairness-aware auction and proximal policy optimization (PPO), which decomposes task scheduling into two parts: resource allocation and intelligent task offloading decision. The algorithm first designs a proportionally fair auction mechanism that pre-allocates computing resources by auction. This resolves the resource competition caused by mutually dependent offloading decisions, balances the task load, and ensures proportional fairness of the allocation. The offloading decision is then made by the deep reinforcement learning algorithm PPO, which combines environmental information, the heterogeneous characteristics of tasks, and the pre-allocated resource amount to schedule tasks intelligently, improving the task completion rate and system benefit.

To address high system decision cost and insufficient cooperation among agents in the multi-agent cooperation scenario, a cluster-based multi-agent cooperative task scheduling algorithm is proposed. The algorithm consists of two parts: a dynamic UAV clustering algorithm and a cooperative offloading algorithm. The dynamic UAV clustering algorithm uses the coordination ability of satellites and the autonomy of UAVs to divide UAVs into clusters, each managed by a cluster-head UAV that makes decisions for it, realizing a distributed-centralized control mode.
The cooperative offloading algorithm uses the agents deployed on the cluster-head UAVs and adopts a multi-agent reinforcement learning framework based on multi-agent deep deterministic policy gradient (MADDPG). Through information propagation and parameter sharing via satellites, it realizes centralized training and distributed execution of the agents and safeguards the overall system benefit.

Through theoretical analysis and simulation experiments, this thesis analyzes the performance and characteristics of the proposed algorithms in terms of system benefit, load balance, fairness, task completion, and environmental adaptability, and shows their behavior and advantages under different scenarios and parameters. The simulation results show that the proposed algorithms effectively improve task scheduling performance in SAGIN and, compared with the baseline algorithms, achieve higher system benefit and better load balance, fairness, convergence, and environmental adaptability.

Keywords: Space-Air-Ground Integrated Networks, Task Scheduling, Deep Reinforcement Learning, Proportional Fairness Auction, UAV Clustering

Contents

Chapter 1 Introduction
1.