RESEARCH METHODS AND DATA ANALYTICS

RESEARCH REPORT

Decennial Disclosure
An Explainer on Formal Privacy and the TopDown Algorithm

Claire McKay Bowen, Aaron R. Williams, Madeline Pickens
September 2022

ABOUT THE URBAN INSTITUTE
The nonprofit Urban Institute is a leading research organization dedicated to developing evidence-based insights that improve people’s lives and strengthen communities. For 50 years, Urban has been the trusted source for rigorous analysis of complex social and economic issues; strategic advice to policymakers, philanthropists, and practitioners; and new, promising ideas that expand opportunities for all. Our work inspires effective decisions that advance fairness and enhance the well-being of people and places.

Copyright © September 2022. Urban Institute. Permission is granted for reproduction of this file, with attribution to the Urban Institute. Cover image by Tim Meko.

Contents
Acknowledgments
Decennial Disclosure
Introduction to the 2020 Census and Data Privacy
Data Privacy Definitions and Terminology
Data Privacy Methodology Workflow
Introduction to Formal Privacy
Formal Privacy
Differential Privacy and Other Formally Private Definitions
Privacy-Loss Budget
Global Sensitivity
Gaussian Mechanism
Models of Differential Privacy
Introduction to 2020 Disclosure Avoidance System
Privacy and Utility Measures
Statistical Disclosure Control Method
Takeaways and Ongoing Challenges
Notes
References
About the Authors
Statement of Independence

Acknowledgments
This report was funded by the Tableau Foundation. We are grateful to them and to all our funders, who make it possible for Urban to advance its mission.

The views expressed are those of the authors and should not be attributed to the Urban Institute, its trustees, or its funders. Funders do not determine research findings or the insights and recommendations of Urban experts. Further information on the Urban Institute’s funding principles is available at urban.org/fundingprinciples.
The authors thank the following individuals who generously provided invaluable feedback that greatly improved this explainer:

Constance Citro, Senior Scholar, Committee on National Statistics at the National Academies of Sciences, Engineering, and Medicine
Ron Prevost, Research Professor, Massive Data Institute, McCourt School of Public Policy at Georgetown University
Leslie Reynolds, Research Support Specialist, Program on Applied Demographics, Cornell Jeb E. Brooks School of Public Policy
Joseph Salvo, Fellow, Social and Decision Analytics Division at the University of Virginia Biocomplexity Institute
Meghan Stuessy, Analyst, Government Organization and Management at Congressional Research Service
David Van Riper, Director of Spatial Analysis, Institute for Social Research and Data Innovation at the University of Minnesota
Jan Vink, Extension Associate, Program on Applied Demographics, Cornell Jeb E. Brooks School of Public Policy
Izzy Youngs, Research Specialist, Massive Data Institute, McCourt School of Public Policy at Georgetown University

Decennial Disclosure

Although collecting more and better data can provide great benefits to society, such as furthering medical research or targeting investments to those most in need, those charged with protecting data raise privacy concerns when that information can be de-anonymized and used maliciously.

For example, the US Census Bureau conducted a simulated attack on the 2010 Decennial Census and discovered it could reidentify about one-sixth of the US population using publicly available data (such as name, sex, and age) from external sources, like public social media profiles (Leclerc 2019). This type of attack on the 2020 Decennial Census has the potential to be even more disclosive because of the detailed information collected, such as more race and ethnicity categories, that could lead to more individuals being identified with great specificity. The reconstruction attack results and the more detailed information available in the decennial census motivated the Census Bureau to update its Disclosure Avoidance System (DAS) from traditional statistical disclosure control methods to a formally private method, the TopDown Algorithm, for the 2020 Decennial Census.
However, this drastic change in how data privacy and confidentiality were defined for the 2020 DAS caused significant friction between the US Census Bureau and census data users. For instance, leaders from states, counties, cities, and towns rely on census data for school planning, budgeting, social program provisions, redistricting, revenue sharing, and a multitude of other statutory requirements. These data users want more accurate data at granular geographic areas and fear that the updated DAS will lead to incorrect public policy decisions.

This explainer aims to help readers better understand what formal privacy is and how the TopDown Algorithm works. The explainer is also a continuation of “Personal Privacy and the Public Good: Balancing Data Privacy and Data Utility” (Bowen 2021), and we encourage readers to read that report first.

Introduction to the 2020 Census and Data Privacy

The decennial census data products affect how the United States apportions the 435 seats for the United States House of Representatives, redistricts voting lines, plans for natural disasters, and carries out many other functions. Therefore, the Census Bureau’s mission is “…to count everyone once, only once, and in the right place.” With this goal in mind, the US Census Bureau collects information on every person and household at various geographic levels for the United States (figure 1).

FIGURE 1
US Census Bureau’s Geographic Levels
Source: Authors’ illus