
Optical Character Recognition Technology Report

2022-08-26, uploaded by institution Mob研究院 (Mob Research Institute)

OPTICAL CHARACTER RECOGNITION TECHNOLOGY GUIDE FOR BUSINESS OWNERS

Written by Evgeniy Krasnokutsky, AI/ML Team Leader at MobiDev, PhD

Table of Contents

What is OCR and How Does It Work?
    STEP 1. CHECKING THE DOCUMENT TYPE & IMAGE PRE-PROCESSING
    STEP 2. CHARACTER RECOGNITION
    STEP 3. POST-PROCESSING
OCR is a Machine Learning and Computer Vision Task
OCR Business Cases
    OCR IN FINANCIAL SERVICES
    OCR IN HEALTHCARE
    OCR IN RETAIL
    OCR IN SECURITY AND LAW ENFORCEMENT
Hardware for OCR
Out-of-the-box Solutions vs Custom OCR Development
    COMMERCIAL VS OPEN-SOURCE OCR SOLUTIONS
Limitations of OCR Technology and How to Overcome Them
Key Takeaways

With the growing interest in OCR and Machine Learning, more and more business owners are looking for ways to apply this killer combination to optimize their business processes. If you are one of them, this article is for you.

Let's find out more about what OCR is, how OCR powered with machine learning is different from the original technology, and how it can be used in business.

What is OCR and How Does It Work?
Optical character recognition (OCR) converts any kind of image containing written text into machine-readable text data. OCR allows you to quickly and automatically digitize a document without the need for manual data entry. That's why OCR is commonly used for business flow optimization and automation. The output of OCR is further used for electronic document editing and compact data storage, and it also forms the basis for cognitive computing, machine translation, and text-to-speech technologies.

There are different types of OCR depending on the tasks they solve:

• Intelligent Word Recognition (IWR) is used for the recognition of unconstrained handwritten words instead of individual characters.
• Intelligent Character Recognition (ICR) is a more advanced form of OCR for recognizing hand-printed characters.
• Optical Word Recognition (OWR) scans typewritten text word by word.
• Optical Mark Recognition (OMR) is used to identify the information that people mark on surveys, tests, etc.

Let's find out how OCR works. A traditional optical character recognition system works in three stages: image pre-processing, character recognition, and post-processing.

[Figure: document processing flow: pipeline selection, pre-processing, OCR, result]

STEP 1. CHECKING THE DOCUMENT TYPE & IMAGE PRE-PROCESSING

The main challenge of text recognition is that each document template has its own set of entities, values, and entity locations. For OCR software to work accurately, it must be able to identify different types of documents and run the correct predefined pipeline for each. For example, PDF documents may or may not contain a text layer. If the PDF does not contain a text layer, we must process it differently than if it did.

After the right pipeline is chosen, the image moves to the pre-processing step. This preparation step affects the final outcome. Image pre-processing helps to remove image noise and increase the contrast between the background and the text, which improves text recognition. At this step, the OCR program converts the document to a black-and-white version and then analyzes it for light and dark areas. Light areas are identified as the background, while dark areas are identified as characters to be processed.
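The black-and-white conversion described above can be illustrated with a short sketch. The guide does not say which thresholding method an OCR program uses; Otsu's method is assumed here purely for illustration, and the function names `otsu_threshold` and `binarize` are hypothetical. The image is a plain list of rows of 0-255 grayscale values so that the example needs no third-party library.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale pixel values


def otsu_threshold(image: Gray) -> int:
    """Pick the gray level that best separates dark from light pixels.

    Assumption: Otsu's method (maximize between-class variance).
    """
    hist = [0] * 256
    for row in image:
        for px in row:
            hist[px] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    n_dark = 0    # number of pixels with value <= t
    sum_dark = 0  # sum of those pixel values
    for t in range(256):
        n_dark += hist[t]
        sum_dark += t * hist[t]
        if n_dark == 0:
            continue  # no dark pixels yet
        n_light = total - n_dark
        if n_light == 0:
            break  # every pixel is already in the dark class
        mean_dark = sum_dark / n_dark
        mean_light = (sum_all - sum_dark) / n_light
        between_var = n_dark * n_light * (mean_dark - mean_light) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(image: Gray) -> List[List[int]]:
    """Return 1 for dark (character) pixels and 0 for light (background) pixels."""
    t = otsu_threshold(image)
    return [[1 if px <= t else 0 for px in row] for row in image]


# Tiny example: two dark "ink" pixels on a light page.
page = [
    [200, 210, 30],
    [220, 25, 215],
]
mask = binarize(page)  # [[0, 0, 1], [0, 1, 0]]
```

A global threshold like this is the simplest option. Unevenly lit scans usually call for a locally adaptive threshold instead, which this sketch does not cover.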
STEP 2. CHARACTER RECOGNITION

Individual characters are detected using pattern recognition or feature detection algorithms. The recognized characters are then assembled into words and sentences.

• Pattern recognition is a method based on finding matches between the image text and text samples embedded in the system in various fonts and formats. This method works best with typewritten text, and it doesn't work well when it encounters new fonts that are not included in the system.
• The feature detection algorithm makes it possible to recognize new characters from their features. Such features may include the number of slanted lines, intersecting lines, or curves in the symbol being compared.

Most often, OCR programs with feature detection use classifiers based on machine learning or neural networks to process characters. Classifiers compare image features with the examples stored in the system and select the closest match. The feature detection algorithm is good for unusual fonts or low-quality images where the font is distorted.

STEP 3. POST-PROCESSING

Once a symbol is identified, it is converted into a code that computer systems can use for further processing. We should mention that the output of any OCR or OCR-related technology or algorithm contains a lot of noise and false positives. This makes it difficult to use OCR output directly, so we have to:

• Filter out noisy outputs and false positives
• Combine recognized entities with their extracted meaning
• Check for possible mistakes and, if any are found, prevent the output from reaching the user
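The pattern recognition idea from Step 2 and the first post-processing task from Step 3 can be sketched together. This is a minimal illustration under stated assumptions, not how any particular OCR engine works: the 3x5 glyph templates, the pixel-agreement score, the 0.8 cutoff, and all function names are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Glyph = Tuple[str, ...]  # rows of '#' (dark) and '.' (light) pixels

# Hypothetical stored samples; a real system would hold many fonts and sizes.
TEMPLATES: Dict[str, Glyph] = {
    "I": ("###", ".#.", ".#.", ".#.", "###"),
    "L": ("#..", "#..", "#..", "#..", "###"),
    "T": ("###", ".#.", ".#.", ".#.", ".#."),
}


def match_score(glyph: Glyph, template: Glyph) -> float:
    """Fraction of pixels that agree between two same-sized glyphs."""
    pairs = [
        (a, b)
        for glyph_row, template_row in zip(glyph, template)
        for a, b in zip(glyph_row, template_row)
    ]
    return sum(a == b for a, b in pairs) / len(pairs)


def recognize(glyph: Glyph) -> Tuple[str, float]:
    """Return the character of the closest stored sample and its score."""
    scored = ((ch, match_score(glyph, tpl)) for ch, tpl in TEMPLATES.items())
    return max(scored, key=lambda result: result[1])


def post_process(results: Iterable[Tuple[str, float]], min_score: float = 0.8) -> str:
    """Filter out low-confidence characters (assumed cutoff: 0.8)."""
    return "".join(ch for ch, score in results if score >= min_score)


# A clean "L", a "T" with one missing pixel, and a noise blob.
clean_l: Glyph = ("#..", "#..", "#..", "#..", "###")
noisy_t: Glyph = ("###", ".#.", ".#.", "...", ".#.")
blob: Glyph = ("#.#", ".#.", "#.#", ".#.", "#.#")

text = post_process(recognize(g) for g in (clean_l, noisy_t, blob))  # "LT"
```

The noise blob still gets a closest match ("I"), which is exactly the kind of false positive the filter is meant to catch. Template matching like this also shows why the method struggles with unseen fonts: a glyph that differs from every stored sample scores low everywhere.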
