Conference: 全球软件开发大会 (Global Software Development Conference)

The Evolution and Practice of WebNN in the AIGC Wave


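Speaker: Junwei Fu (付俊伟)

Ningxin Hu (胡宁馨): Principal Engineer at Intel, initiator and lead editor of the W3C Web Neural Network (WebNN) standard, Chromium committer, and primary owner of the Chromium WebNN component.
Min Zhang (张敏): technical manager of the Intel WebNN team, developer of the Chromium and ONNX Runtime WebNN EP, and author of the WebNN developer preview.
Junwei Fu (付俊伟): Senior Software Engineer at Intel, Chromium committer, designer of the Chromium WebNN infrastructure, and primary developer of the Chromium Shape Detection API.

Contents
01 Background of WebNN
02 Architecture of WebNN
03 How to use WebNN
04 WebNN performance comparison

Demo: Stable Diffusion (https://microsoft.github.io/webnn-developer-preview/)
WebNN Execution Provider of ONNX Runtime Web with GPU acceleration from DirectML. Running on Intel® Core Ultra 7 processor 155H with integrated Arc GPU.
Prompt: "A cat under the snow"
Image generation pipeline: Text Encoder, Unet steps 1 to 4, Image Decoder.

How WebNN operations map to native ML APIs:

WebNN operation   DirectML             TFLite         CoreML
matMul            GEMM                 BATCH_MATMUL   matmul
gather            GATHER               GATHER         gather_along_axis
sigmoid           ACTIVATION_SIGMOID   LOGISTIC       sigmoid
softmax           ACTIVATION_SOFTMAX   SOFTMAX        softmax

Where WebNN sits in the stack:

• Use cases: Image Classification, Object Detection, Noise Suppression, Natural Language, Background Segmentation
• Frameworks: Transformers.js, MediaPipe Web, ONNX Runtime Web, TensorFlow.js
• Web APIs: WebAssembly, WebGPU, WebNN, API extensions
• Web engines: JavaScript runtime (e.g., Electron/Node.js), web browser (e.g., Chrome/Edge)
• System ML APIs: CoreML, DirectML, TFLite, Windows Studio Effects, other ML OS APIs
• Hardware: NPU, GPU, CPU

WebNN programming model (figure):

• MLContext: selects the device type (cpu/gpu/npu) and the power preference (high-performance/low-power).
• MLGraphBuilder: creates and builds the computational graph on the Web side, for example conv2d, add, relu, with input, filter, and bias as operands.
• Compile: the Web-side graph becomes a native compiled graph, in which conv2d, add, and relu can become a single fused conv2d.
• MLGraph compute: reads input buffers (CPU/GPU) and writes output buffers (CPU/GPU).
• Legend: WebNN API, other Web API, call flow, data flow.

WebNN brings a unified abstraction of neural networks to the Web.

WebNN implementation in Chromium (figure):

• Apps/Frameworks: web application, JS ML frameworks
• Chromium renderer process: MLContext, MLGraphBuilder, MLGraph, WebNN Mojo client
• IPC between the renderer process and the GPU/utility process
• Chromium GPU/utility process: WebNN Mojo server with CoreML, DirectML, and TFLite backends
• Native ML APIs, OS, drivers: macOS (CoreML, BNNS/MPS), Windows (DirectML, MCDM), Android/ChromeOS/Linux (TFLite, XNNPACK/Delegate)
• Hardware: GPU, NPU, CPU

Figure ("1.18 release"): a model split between Wasm kernels and a WebNN graph, with input, weights, bias, and intermediate tensors passed between them.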
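The build-then-compile flow in the programming-model figure (conv2d, add, and relu recorded on the Web side, then a fused conv2d in the native graph) can be illustrated with a toy model. This is only a sketch in Python, not the WebNN API and not Chromium's implementation: `ToyGraphBuilder`, `compile_graph`, and the `fused_conv2d` op name are hypothetical, and the fusion rule (fuse only when the intermediate tensors have a single consumer and are not graph outputs) is an assumption chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    kind: str                 # e.g. "conv2d", "add", "relu"
    inputs: list              # names of the operand tensors
    output: str               # name of the tensor this op produces
    attrs: dict = field(default_factory=dict)


class ToyGraphBuilder:
    """Hypothetical stand-in for a graph builder: records ops in call order."""

    def __init__(self):
        self.ops = []
        self._counter = 0

    def _emit(self, kind, inputs):
        self._counter += 1
        out = f"tmp{self._counter}"
        self.ops.append(Op(kind, list(inputs), out))
        return out

    def conv2d(self, x, filter_):
        return self._emit("conv2d", [x, filter_])

    def add(self, a, b):
        return self._emit("add", [a, b])

    def relu(self, x):
        return self._emit("relu", [x])


def use_count(ops, name):
    """Number of times a tensor name is consumed as an input."""
    return sum(op.inputs.count(name) for op in ops)


def is_conv_add_relu(window, ops, outputs):
    """True if three consecutive ops form a fusable conv2d -> add -> relu chain."""
    if [op.kind for op in window] != ["conv2d", "add", "relu"]:
        return False
    conv, add, relu = window
    chained = conv.output in add.inputs and relu.inputs == [add.output]
    # Intermediates must be private: one consumer each, not a graph output.
    private = all(
        use_count(ops, t) == 1 and t not in outputs
        for t in (conv.output, add.output)
    )
    return chained and private


def compile_graph(ops, outputs):
    """Toy 'compile' step: fuse conv2d + add(bias) + relu into one op."""
    compiled, i = [], 0
    while i < len(ops):
        window = ops[i:i + 3]
        if is_conv_add_relu(window, ops, outputs):
            conv, add, relu = window
            bias = next(n for n in add.inputs if n != conv.output)
            compiled.append(
                Op("fused_conv2d", conv.inputs + [bias], relu.output,
                   {"activation": "relu"})
            )
            i += 3
        else:
            compiled.append(ops[i])
            i += 1
    return compiled


builder = ToyGraphBuilder()
y = builder.relu(builder.add(builder.conv2d("input", "filter"), "bias"))
compiled = compile_graph(builder.ops, outputs={y})

print([op.kind for op in builder.ops])   # ops as recorded by the builder
print([op.kind for op in compiled])      # ops after the toy compile step
```

The point of the sketch is the separation of concerns: the builder only records what the application asked for, while the compile step is free to rewrite the graph for the target backend without the application noticing.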
Integration status (figure): a web application (pre-processing, Conv2d, MatMul, post-processing) runs on ONNX Runtime Web or TensorFlow Lite Web, which execute through Wasm kernels, WebGL kernels, WebGPU kernels, or a WebNN graph. In browsers with WebNN support, the WebNN graph is served by native CPU, GPU, and NPU kernels. Legend: Prototype, Available.

Demos:

• https://microsoft.github.io/webnn-developer-preview/ : WebNN Execution Provider of ONNX Runtime Web with GPU acceleration from DirectML. Running on Intel® Core Ultra 7 processor 155H with integrated Arc GPU.
• Vanilla JS (plain JavaScript) use of the WebNN API, with NPU acceleration from DirectML. Running on Intel® Core Ultra 7 processor 155H with integrated Intel® AI Boost NPU.

Chart: MediaPipe Models Inference Performance (Normalized / Higher is Better)
Axes: Inference Speedup (0.0 to 5.0); WebNN vs. Native Ratio (0% to 100%).
Series: Wasm SIMD, WebNN XNNPack, Native XNNPack, WebNN vs Native.
• Browser: Chrome Canary 118.0.5943.0
• DUT: Dell / Linux / i7-1260P, single P-core
• Workloads: MediaPipe solution models (FP32, batch=1)

Chart: WebNN DirectML vs. Native DirectML
Axes: Inference Time (ms, log scale); Percentage (%).
Series: WebNN GPU, Native DirectML, WebNN GPU vs. Native DirectML.
Ratio data labels on the chart range from 71.4% to 95.6%.
• Browser: Chrome Canary 126.0.6459.0
• OS: Windows 11 Pro 23H2
• DUT: Asus Zenbook
• CPU: Intel(R) Core(TM) Ultra 7 155H 3.80 GHz
• GPU: Intel(R) Arc(TM) Graphics
• GPU Driver: 31.0.101.5512

Chart: WebNN DirectML vs Native on MTL NPU
Axes: Inference Time (ms, 0.00 to 8.00); WebNN vs Native (%).
Models: MobileNetV2, SqueezeNet 1.0, ResNet50 v1, EfficientNet Lite4.
Series: WebNN DirectML NPU, Native NPU, WebNN NPU vs Native.
Ratio data labels on the chart: 95.8%, 73.4%, 86.1%, 62.7%.
• Browser: Chrome Canary 126.0.6459.0
• OS: Windows 11 Pro 23H2
• DUT: Asus Zenbook
• CPU: Intel(R) Core(TM) Ultra 7 155H 3.80 GHz
• NPU: Intel(R) AI Boost
• NPU Driver: 32.0.100.2381

The average performance of the 4 listed models on WebNN DirectML is about 80% of native DML on the MTL (Meteor Lake) NPU.

Demo: Speech-to-Text PoC for Khan Academy Khanmigo.
WebNN Execution Provider of ONNX Runtime Web with NPU acceleration from DirectML. Running on Intel® Core Ultra 7 processor 155H with integrated Intel® AI Boost NPU.

THANKS

Large Language Model Is Redefining The Software
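The "about 80%" average quoted for the NPU comparison can be checked against the four ratio labels that appear on that chart (95.8%, 73.4%, 86.1%, 62.7%). A minimal sketch, assuming those four labels are the per-model WebNN-vs-native ratios and that the slide's average is a plain arithmetic mean; which label belongs to which model is not stated here, and the mean does not depend on it.

```python
# WebNN-vs-native ratio labels from the MTL NPU chart, in percent.
# Assumption: one label per model, in unknown order.
ratios = [95.8, 73.4, 86.1, 62.7]

# Plain arithmetic mean over the four models.
mean_ratio = sum(ratios) / len(ratios)

print(f"mean WebNN vs native ratio: {mean_ratio:.1f}%")
```

Under these assumptions the mean comes out just under 80%, which is consistent with the slide's rounded figure.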