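NVIDIA AI Enterprise Software Suite: Accelerating Enterprise AI Adoption
March 2023

The NVIDIA Stack
AI application frameworks
Platforms: NVIDIA HPC, NVIDIA AI, NVIDIA Omniverse
Accelerated computing libraries: cuNumeric, CV-CUDA, cuQuantum, Parabricks, Sionna, JetPack, RAPIDS, Spark, cuDNN, cuGraph, TensorRT, Triton, DeepStream, Flare, DOCA, MagIO, Aerial
From the cloud to the edge, from the data center to robots: RTX, DGX, HGX, EGX, OVX, SuperPOD, AGX
Three chips: GPU, CPU, DPU

NVIDIA AI End-to-End Platform
Layers: AI / Business App, Models, Frameworks, AI Workflows, NVIDIA AI Enterprise, Deployment Platform, Cloud Native Management, Infra Optimization
Example workflows: Intelligent Virtual Assistants, Automatic Speech Recognition and speech-to-text, Digital Fingerprinting, Real-Time Threat Detection, …
#1
• Integrated and validated for enterprise AI
• Faster time to production through AI workflows and pretrained models
• Performance that improves efficiency and saves cost
• Optimized and certified to deploy anywhere: cloud, data center, and edge
• Enterprise-grade support, security, and API stability
NVIDIA AI Enterprise: V100 – T4 – A800/A30 – H800/H800 – DGX …

NVIDIA Data Center Product Portfolio

A800
  Power and memory: 300 W | 80 GB
  Form factor: 2-slot FHFL | Liquid | NVLink
  Highlights: Fastest compute, FP64, up to 7 MIG instances
  Workloads: Data Analytics, Scientific Research, DL Training
  Positioning: Highest compute performance for AI, HPC, and data processing

A30
  Power and memory: 165 W | 24 GB
  Form factor: 2-slot FHFL | NVLink
  Highlights: Versatile mainstream compute, FP64, up to 4 MIG instances
  Workloads: Recommender Systems, Conversational AI, Language Processing
  Positioning: AI inference and mainstream compute

A2
  Power and memory: 40-60 W | 16 GB
  Form factor: 1-slot low profile
  Highlights: Entry-level inference, video and graphics, compact and versatile
  Workloads: Mobile Cloud Gaming, Edge Video, Edge AI & Small Inference
  Positioning: Small-footprint data center and edge AI

A40
  Power and memory: 300 W | 48 GB
  Form factor: 2-slot FHFL | NVLink | 3x DP
  Highlights: Fastest RT graphics, largest render models
  Workloads: Omniverse, CloudXR and vWS, Cloud Rendering
  Positioning: Highest graphics performance for visual computing

A10
  Power and memory: 150 W | 24 GB
  Form factor: 1-slot FHFL
  Highlights: 4K cloud gaming, graphics and video with AI
  Workloads: Cloud Gaming, Virtual Workstation, Virtual Desktop
  Positioning: High-performance graphics with AI

A16
  Power and memory: 250 W | 4x 16 GB
  Form factor: 2-slot FHFL
  Highlights: 4K resolution, maximum number of encode/decode streams
  Workloads: Transcoding, Virtual Workstation, Virtual Desktop
  Positioning: Highest-density virtual desktop

Categories: Compute; Compute & Graphics; Graphics & Compute

NVIDIA A30
Versatile compute acceleration for mainstream enterprise servers
• 20 T4 AI A30 TF32 FLOPS T4 FP32 GPU GPU 4 (QoS)
• Tensor Core FP64 2

3 Reasons to Move from the Previous Generation to A30
Superior value and performance from the NVIDIA Ampere generation
• Superior ROI: higher performance per $
• Higher Performance & Utilization with Ampere MIG: MIG partitioning, 4 instances for QoS
• Easy Portability: no changes in the application SW stack

A30 FP64 Tensor Core Accelerates HPC
30% faster than Volta
Chart: FP64 TFLOPS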
V100 peak: 7 | A30 peak: 10.3 | A800 peak: 19.5

Open-Source AI Software Stack
app × 8
Over 2,000 open-source libraries and tools
100s of sources

AI-Ready Enterprise Platform
Text Recognition, Process Automation, Conversational AI, Image Analytics, AI/ML, Existing Applications
CPU-only, NVIDIA DPU, NVIDIA GPU
Container Orchestration and Management Integration
NVIDIA AI Enterprise: Infrastructure Optimization, Cloud-Native Deployment, AI and Data Science Tools and Frameworks
Data Scientist / Developer / AI Researcher
Mainstream Servers, Public Cloud
IT Administrator, MLOps
Multi-Cloud, Hybrid Cloud, Private Cloud
vSphere + NVIDIA AI Enterprise

Delivering AI Workloads with NVIDIA AI Enterprise
• Run AI/ML containers in VMs
• Run Kubernetes in VMs
• Run AI/ML containers with OpenShift on vSphere
• Run AI/ML in vSphere with Tanzu

NVIDIA AI Enterprise with Red Hat OpenShift
Text Recognition, Process Automation, Conversational AI, Image Analytics, AI/ML, Existing Applications
CPU-only, NVIDIA DPU, NVIDIA GPU
Kubernetes-Powered Application Development
Securely Automate MLOps Pipelines
Self-Service Access to Infrastructure Resources
NVIDIA AI Enterprise: Infrastructure Optimization, Cloud-Native Deployment, AI and Data Science Tools and Frameworks
Data Scientist / Developer / AI Researcher
Mainstream Servers, Public Cloud
IT Administrator, MLOps

NVIDIA End-to-End AI Software Suite
Deploy at Scale, Optimized for Inference, TAO Toolkit, Train at Scale, Data Prep

Accelerating the AI Application Development Cycle
• Video Analytics (Metropolis): Inventory Management & Traffic Engineering
• Conversational AI (NeMo): Text Classification & Speech Recognition
• Cybersecurity (Morpheus): Real-Time Threat Detection
• Speech AI (Riva): Automatic Speech Recognition and speech to text
• Recommender (Merlin): Personalization & Cross Sell/Upsell
• Physics ML (Modulus): Simulation, Prediction & Analysis
• Logistics (cuOpt): Route Optimization and Analysis
NVIDIA AI Workflows
NVIDIA AI Enterprise

NVIDIA TAO Toolkit
Quickly create custom, production-ready AI models
• TRAIN EASILY: Fine-tune NVIDIA pretrained models with a fraction of the data.
• CUSTOMIZE FASTER: Built on TensorFlow and PyTorch, abstracting away the complexity of the AI frameworks.
• OPTIMIZE FOR DEPLOYMENT: Optimize for inference and integrate with Riva or DeepStream.
• SUPPORTED BY EXPERTS*: Backed by NVIDIA experts who help resolve issues from development through deployment.
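*Requires NVIDIA AI Enterprise subscription. Learn more here: https://www.nvidia.com/en-us/data-center/products/ai-enterprise/

NeMo Megatron
An end-to-end framework for training and deploying large language models with trillions of parameters
Now in Open Beta
• Quickly create and tune state-of-the-art custom language models
• Scale linearly to 1,000 GPUs for language models with up to a trillion parameters
• Train 30% faster using new sequence parallelism and selective activation recomputation techniques
• Run distributed inference with Triton Inference Server
Find out more: NVIDIA NeMo Megatron https://developer.nvidia.com›nemo›megatron

Full Stack Solution
Verified Convergence Recipes, Evaluation Harness and Sample Chatbot Application
Distributed Data Pre-processing, Hyper Parameter Tuning, Distributed Training, Accelerated Inference
NVIDIA Base Command Platform
Azure, AWS, OCI, DGX SuperPODs, DGX Foundry

NVIDIA TensorRT
SDK for high-performance deep learning inference
Diagram: Trained DNN, TensorRT Optimizer, TensorRT Runtime
Targets: Embedded (Jetson), Automotive (Drive), Data Center (Data Center GPUs)
Optimize and deploy neural networks in production.
Use the compiler and runtime to maximize throughput for latency-critical applications. Optimize every network, including CNNs, RNNs, and Transformers.
1. Reduced mixed precision: FP32, TF32, FP16, and INT8.
2. Layer and tensor fusion: optimizes the use of GPU memory bandwidth.
3. Kernel auto-tuning: on the target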