Intelligent Enterprise

Better Insight for Business Decisions





June 13, 2001



Mining a Demographic Mother Lode

Mine census data to enhance your organization's marketing programs and customer relationships

By Seth Grimes

The first detailed results from the U.S. Census 2000 - a demographic snapshot of the American people - hit the streets back in March 2001, with increasingly informative data sets slated for release through 2003 and beyond. You won't find a better statistical picture of the country. Study the New Economy all you want, but you'll make the best sense of your findings in light of the baseline provided by census data.

Census summary (aggregated) data sets are yours to use as you see fit - for state and local government, marketing, or academic research - to better understand your neighbors, customers, and business opportunities. They're available on CD and for download on the U.S. Census Bureau (USCB) Web site. You can query them and draw thematic maps at the USCB's American FactFinder Web site or obtain value-added analytic and geospatial-display packages from commercial vendors. The first steps, however, are to learn what data is available and understand how you can (and can't) use it: the subject of this column.

What You Get

Apportionment of congressional seats among the states is the raison d'etre of the U.S. census, and the most important data set is the one the states use to draw congressional district boundaries. Sets of redistricting data files, one set for each state plus the District of Columbia and Puerto Rico, were released in February and March 2001. This data focuses on total and voting-age population counts. Given voting-rights sensitivity, it breaks out tallies by membership in one or more of six major racial categories and by Hispanic or Latino origin.

Data sets slated for release starting in July will cover housing and family, as well as population. They will constitute an in-depth demographic analysis with, for instance, detailed age categories and more than 250 iterations of race, ancestry, ethnicity, and much more.

Data drawn from the detailed questionnaire sent to a sample of one in six households will be weighted to extend its validity to the full population and analyzed in data sets slated for release in 2002 and 2003. Because the sample data products draw on a smaller survey universe and report more detailed results than the 100 percent coverage data products, they will be computed only to the census tract level rather than the census block level, to avoid disclosing information about individuals.
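To make the idea of weighting concrete, here is a minimal sketch that assumes plain inverse-probability weighting: each sampled household stands in for as many households as the sampling rate implies (six, for a one-in-six sample). The tract labels, household sizes, and the names SampleHousehold and weighted_population_by_tract are hypothetical, and the USCB's actual weighting procedure is certainly more elaborate than this.

```python
from dataclasses import dataclass


@dataclass
class SampleHousehold:
    tract: str    # hypothetical census tract label
    persons: int  # people reported on the detailed questionnaire
    one_in: int   # household was sampled at a rate of 1 in `one_in`


def weighted_population_by_tract(sample):
    """Estimate tract populations from a sample.

    Assumption: simple inverse-probability weighting, so a household
    sampled at 1-in-6 counts six times toward its tract total.
    """
    totals = {}
    for household in sample:
        weight = household.one_in
        totals[household.tract] = (
            totals.get(household.tract, 0) + weight * household.persons
        )
    return totals


# Made-up sample records, for illustration only.
sample = [
    SampleHousehold("T001", 3, 6),
    SampleHousehold("T001", 2, 6),
    SampleHousehold("T002", 4, 6),
]
estimates = weighted_population_by_tract(sample)
print(estimates)
```

With only a handful of sampled households behind each estimate, small-area figures become both noisy and potentially revealing, which is one way to see why the sample products stop at the tract level.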

What You Don't Get

U.S. law backs its requirement that each U.S. household file a census form with guaranteed confidentiality: Untreated responses from individual forms may not be released until 72 years after the census. These rules, Titles 13 and 44 of the U.S. Code, clearly say that legislative representation is more important than gleaning evidence of illegal immigration or criminal activity from household forms. So what you don't get first and foremost is access to unsummarized microdata.

The USCB does accommodate academic and commercial researchers by creating "public use microdata samples," large-geographic-area subsets of the raw census form responses, with information that can identify individuals removed.

Other steps are taken to prevent disclosure of individual information in the aggregated macro data. They include thresholding (the USCB suppresses certain results if the population of a geographic area is less than 100) and rounding of aggregate income values if too few individuals contribute to them. Lastly, the USCB alters raw data through techniques such as swapping: exchanging selected records between geographic areas in ways that should not alter the statistical characteristics of results. These techniques protect against dangers like complementary disclosure, where manipulating published values can reveal protected, unpublished values. Disclosure-avoidance alterations should make little difference relative to survey errors, however.
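The thresholding rule and the complementary-disclosure risk it leaves open can be illustrated with a short sketch. Everything here is hypothetical: the area names, populations, and counts are invented, and suppress_small_areas is an illustrative helper, not a USCB procedure. Only the cutoff of 100 comes from the description above.

```python
THRESHOLD = 100  # population cutoff described in the text


def suppress_small_areas(table, populations, threshold=THRESHOLD):
    """Return a copy of `table` with values hidden (None) for any
    area whose population falls below the threshold."""
    return {
        area: (value if populations[area] >= threshold else None)
        for area, value in table.items()
    }


# Invented figures for three areas that make up one larger region.
populations = {"A": 1200, "B": 850, "C": 40}
counts = {"A": 310, "B": 95, "C": 12}

published = suppress_small_areas(counts, populations)
print(published)  # area C is suppressed because its population is below 100

# Complementary disclosure: if the regional total is published too,
# subtracting the visible values recovers the suppressed one.
region_total = sum(counts.values())
visible_sum = sum(v for v in published.values() if v is not None)
recovered = region_total - visible_sum
print(recovered)
```

The subtraction at the end is why suppressing a single cell is not enough on its own, and why techniques such as swapping are applied to the underlying records as well.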

The USCB computed summary results both with and without statistical adjustments designed to counteract the effects of undercounting and double counting. The USCB conducted an exhaustive control survey of 314,000 households from which it estimated a net undercount of 1.18 percent, or about 3.3 million people. The USCB ultimately determined that efforts to improve the accuracy of full census results through reference to the control survey could introduce their own errors. Given the current balance of power in Washington, D.C. - the consensus understanding is that correcting the miscount would add a high proportion of minority voters who have not historically voted Republican - the government decided not to release the adjusted figures.






