
Accelerate Spark Flink Perform

2023-03-09 NetEase 嗯***

(Part 1) Accelerating Big Data Analytics with Intel Optane PMEM Technology
Terry Wei (Intel), 徐铖 (Intel)

Notices and Disclaimers

© 2018 Intel Corporation. Intel, the Intel logo, 3D XPoint, Optane, Xeon, Xeon logos, and Intel Optane logo are trademarks of Intel Corporation in the U.S. and/or other countries. All products, computer systems, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.

No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.

The cost reduction scenarios described are intended to enable you to get a better understanding of how the purchase of a given Intel based product, combined with a number of situation-specific variables, might affect future costs and savings. Circumstances will vary and there may be unaccounted-for costs related to the use and deployment of a given product. Nothing in this document should be interpreted as either a promise of or contract for a given level of costs or cost reduction.

The benchmark results reported above may need to be revised as additional testing is conducted. The results depend on the specific platform configurations and workloads utilized in the testing, and may not be applicable to any particular user’s components, computer system or workloads. The results are not necessarily representative of other benchmarks and other benchmark results may show greater or lesser impact from mitigations.

Results have been estimated based on tests conducted on pre-production systems, and provided to you for informational purposes. Any differences in your system hardware, software or configuration may affect your actual performance. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors.

Performance results are based on testing as of 03-14-2019 and may not reflect all publicly available security updates. See configuration disclosure for details. No product can be absolutely secure.
Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to www.intel.com/benchmarks.

Intel processors of the same SKU may vary in frequency or power as a result of natural variability in the production process. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors.

Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice Revision #20110804.

Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.

*Other names and brands may be claimed as the property of others.
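Agenda
• OAP overview
• Introduction to OAP Analytic Cache
• Introduction to Optane PMEM
• OAP Analytic Cache features
• Native Parquet Reader
• Caching and its multiple modes
• Operator pushdown (Filter/Project/Aggregation)
• Future work

Optimized Analytics Package (OAP) overview
• OAP is an open-source project maintained by Intel and the related community. It aims to optimize and extend the performance and functionality of Spark.
• Spark already performs well in many respects, but improving its performance further raises new challenges:
• The existing JVM-based, row-based compute engine prevents Spark from taking full advantage of Intel hardware features such as AVX/AVX512 and GPUs.
• At the implementation level, components such as memory management and shuffle do not take the latest advanced hardware, such as Optane PMEM, into account.
• Batch processing often still cannot meet high-performance requirements.
https://github.com/oap-project/

OAP Analytic Cache overview
• Provides a high-performance, fine-grained cache built on a columnar storage layout. It supports both compute-local caching and disaggregated caching (independent of the compute nodes).
• Uses FSDAX (together with PMEM) to bypass the system page cache.
• Uses shared memory to support zero-copy sharing of cached data on a single machine.
• Supports operator pushdown (for example Aggregation, Filter, Project) and adapts to mainstream compute engines such as Spark and Flink.
• Supports next-generation hardware acceleration (for example QAT and PMEM).

Optimized Analytics Package (OAP) components

Intel tiered storage architecture design
[Figure labels: performance tier, capacity tier; DRAM, hot tier]
Intel Optane technology can help memory and storage applications!

The unique value of Intel Optane technology
• Persistence
• Write in place
• Byte addressability
• Low latency

1. “The Challenge of Keeping Up with Data” https://www.intel.com/content/www/us/en/products/memory-storage/optane-dc-persistent-memory.html For workloads and configurations, visit www.intel.com/Performanceindex. Results may vary.

The value of the Intel Optane persistent memory series
• Extract more value from larger data sets than before
• Deliver more services to more customers at scale, with a very attractive TCO
• Improve security automatically without sacrificing performance

128, 256, and 512 GB modules
• Larger data sets can reside closer to the CPU for faster processing, which means deeper insight.
• A larger memory pool helps break through input/output (I/O) bottlenecks.
• Keeping larger data sets in memory costs less.
• In-memory databases restart faster.
• Supports more virtual machines and more memory per virtual machine, all at a lower cost than typical DRAM.
• Savings on memory cost can help meet high-capacity storage needs.
• Increases the density of virtual machines, users, and applications, and meets business service level agreements (SLAs) with fewer servers.
• Improves business uptime when power is lost.
• AES 256-bit hardware encryption helps protect firmware and data.
• Data placed on the module is always encrypted.
• Security-enabled cryptographic erase and DIMM overwrite help prevent data from being accessed on repurposed or discarded modules.

Solution: existing memory and high-latency storage systems cannot meet high-throughput, high-bandwidth demands.
Solution: data-intensive workloads perform best when hot data is kept in memory, but memory is expensive and limited in capacity.
Solution: data is vulnerable to persistent, sophisticated attacks, but enabling security and encryption sacrifices performance.

A complete module system
• PMIC: generates the power rails for the media and controller
• SPI Flash: stores the firmware
• Intel®Op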