How have the characteristics of population and housing in the United States changed over time, and what information is available to support this analysis? The Long Form data set released by the US Census Bureau contains the most detailed set of official US demographics available. For 1990 this data set is called Summary Tape File 3 (STF3); for 2000 it is called Summary File 3 (SF3). Direct comparisons between these two data sets, collected ten years apart, are complicated by two factors: (1) changes in questionnaire design and (2) changes in area boundary definitions.
The first issue, changes in the questionnaire design, has several components: the wording of questions varies, the ordering of questions changes, and some categories of questions are dropped while others are added. Some cross-tabulation tables also changed, and many cross-tabulations that were released at the Block Group level in 1990 were released only at the Tract level in 2000. Furthermore, data for small areas such as Block Groups and Tracts were sometimes imputed, or taken from similar or nearby areas, to protect confidentiality, which decreases the reliability of the data at smaller levels of geography. Likewise, population under-counting and over-counting may be addressed differently in different census years. These issues can be investigated by reviewing the summary information provided by the US Census Bureau.
The second obstacle is change in geographic definitions. Boundaries change because areas split (1:2), merge (2:1), or both (2:3). The remainder of this paper discusses how GeoLytics normalized the 1990 Long Form census data to various 2000 geographies, so that 1990 and 2000 Long Form data can be compared in standard 2000 geographies. The normalization of 1990 STF3 data starts by weighting and converting 1990 Block Group data to 2000 areas. 1990 Block Group data is used because the Block Group is the smallest level of 1990 geography at which the full set of 1990 Long Form data is available. To facilitate the splitting and merging of 1990 Block Groups into 2000 areas, Census Blocks are used. A Census Block is much smaller than a Block Group: there are approximately 30 to 40 Blocks in each Block Group. Unlike in previous censuses, Blocks and Block Groups cover 100% of the US in both 1990 and 2000.
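The split, merge, and combined cases described above can be told apart mechanically once a list of overlapping 1990 and 2000 areas is available. The following is a minimal sketch in Python, not GeoLytics' actual procedure; the function name `classify_relations`, the labels, and the sample area identifiers are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def classify_relations(pairs):
    """Label each 1990 area by how it relates to 2000 areas.

    pairs: iterable of (area_1990, area_2000) tuples, one per overlap.
    (Hypothetical input format, assumed for this sketch.)
    """
    targets = defaultdict(set)  # 1990 area -> 2000 areas it overlaps
    sources = defaultdict(set)  # 2000 area -> 1990 areas it overlaps
    for old, new in pairs:
        targets[old].add(new)
        sources[new].add(old)

    labels = {}
    for old, news in targets.items():
        splits = len(news) > 1  # one 1990 area feeds several 2000 areas
        merges = any(len(sources[new]) > 1 for new in news)  # shares a 2000 area
        if splits and merges:
            labels[old] = "split and merge"
        elif splits:
            labels[old] = "split"
        elif merges:
            labels[old] = "merge"
        else:
            labels[old] = "unchanged"
    return labels

# Made-up identifiers covering each case named in the text.
pairs = [
    ("A", "a"),                                          # 1:1
    ("B", "b1"), ("B", "b2"),                            # split, 1:2
    ("C", "cd"), ("D", "cd"),                            # merge, 2:1
    ("E", "e1"), ("E", "ef"), ("F", "ef"), ("F", "f1"),  # both, 2:3
]
labels = classify_relations(pairs)
```

Only the areas labeled other than "unchanged" need weights; a 1:1 area carries its 1990 values over directly.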
The 1990 to 2000 Block relationships were determined from TIGER/Line 2000, Type 1 and Type 3 records. Of the Blocks, 85% had a 1:1 relationship, 10% had a 2:1 relationship, and 5% had a relationship greater than 2:1. Block splits between 1990 and 2000 were weighted by an analysis of the 1990 streets, which were determined using TIGER/Line 1992: when a Block was split into parts, each sub-Block area was weighted according to the 1990 streets falling within the corresponding 2000 Block part. The assumption is that local roads indicate where the population lived. Using TIGER/Line 1992 and TIGER/Line 2000, we created a correspondence between 1990 and 2000 Blocks, together with a weighting value. The weighting value was then used to apportion Block demographics for those Blocks that had been split or merged between 1990 and 2000. The file produced by this process is the 1990 to 2000 Block Weighting File (BWF). From this BWF we can roll up the 1990 data to any 2000 geography (tract, zip code, county, etc.).
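To make the street-based weighting and the roll-up concrete, here is a small Python sketch under stated assumptions. It assumes the weight of each 2000 Block part is its share of the 1990 street length in the original 1990 Block, and that a count has already been assigned to each 1990 Block (a step this section does not detail). The function names, the equal-share fallback for Blocks with no streets, and all identifiers and numbers are hypothetical; the actual BWF layout and weighting formula may differ.

```python
from collections import defaultdict

def build_weights(street_lengths):
    """Turn street lengths into per-piece weights.

    street_lengths: {(block_1990, block_2000): length of 1990 streets
                     falling in that 2000 part of the 1990 Block}
    Returns {(block_1990, block_2000): weight}, summing to 1 per 1990 Block.
    """
    totals = defaultdict(float)
    pieces = defaultdict(int)
    for (old, _new), length in street_lengths.items():
        totals[old] += length
        pieces[old] += 1

    weights = {}
    for (old, new), length in street_lengths.items():
        if totals[old] > 0:
            weights[(old, new)] = length / totals[old]
        else:
            # Assumption for this sketch: no streets, so share equally.
            weights[(old, new)] = 1.0 / pieces[old]
    return weights

def roll_up(counts_1990, weights, block_to_area_2000):
    """Apportion 1990 Block counts to 2000 Blocks, then sum by 2000 area."""
    out = defaultdict(float)
    for (old, new), w in weights.items():
        out[block_to_area_2000[new]] += counts_1990.get(old, 0) * w
    return dict(out)

# Made-up example: Block X was split in two, Block Y is unchanged.
street_lengths = {
    ("1990-X", "2000-X1"): 300.0,
    ("1990-X", "2000-X2"): 100.0,
    ("1990-Y", "2000-Y"): 250.0,
}
counts_1990 = {"1990-X": 400, "1990-Y": 120}
block_to_tract = {"2000-X1": "T1", "2000-X2": "T2", "2000-Y": "T2"}

weights = build_weights(street_lengths)
tract_totals = roll_up(counts_1990, weights, block_to_tract)
```

In this example Block X sends three quarters of its count to tract T1 and one quarter to T2, and the national total is preserved because the weights for each 1990 Block sum to 1. Swapping `block_to_tract` for a Block-to-county or Block-to-zip-code lookup would give the roll-up to another 2000 geography.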
A final weighting consideration should be noted. The weighting of 1990 Block Group data to 2000 areas has been done as accurately as is statistically possible. The 1990 STF3 data is the official Census data, and our methodology offers an accurate and comprehensive way to compare 1990 data with 2000 data statistically. However, the converted 1990 data in 2000 boundaries cannot be considered official census data. While a major obstacle to comparing altered geographic areas has been overcome, even areas that did not change between 1990 and 2000 may contain rounding differences introduced by the weighting process and may not exactly match the official census figures.