Available Data Sets:
The Core Demographics database includes a wide range of demographic variables for the current year and 5-year projections, covering five broad topic areas:
- labor force
- and dwelling
With a foundation of the Experian household level databases and over fifteen years of experience in demographic forecasting, AGS offers the highest quality demographic estimates in the marketplace today.
Current Year Estimates & 5 Year Projections
Householder under 25 years
Less than $10,000
$10,000 to $14,999
$15,000 to $19,999
$20,000 to $24,999
$25,000 to $34,999
$35,000 to $39,999
$40,000 to $49,999
$50,000 to $59,999
$60,000 to $74,999
$75,000 to $99,999
$100,000 to $124,999
$125,000 to $149,999
$150,000 to $199,999
$200,000 and over
Householder 25 to 34 years
Householder 35 to 44 years
Householder 45 to 54 years
Householder 55 to 64 years
Householder 65 to 74 years
Householder 75 and over
BusinessCounts is a geographic summary database of business establishments, employment, and occupation. The core BusinessCounts data, which now utilizes the industry standard InfoUSA database as its primary source data, includes data to the major SIC group with detailed establishment types. The database is available at the block group level and higher, including all standard geographic aggregations.
BusinessCounts is a vital addition to residential demographic data, in that the success of many business establishments is dependent upon not only the residential population, but also the working population during the daytime. Based primarily on the InfoUSA business database and supplemented by various public data sources, BusinessCounts provides a clear look at the range and size of establishments and their employees within any geographic area.
The Consumer Spending database covers most major household expenditures in a multi-level hierarchical classification. Expenditures can be expressed either as aggregate expenditure or per household expenditure for any geographic level from the block group to national.
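The relationship between the two forms of expression can be sketched as follows. This is a minimal illustration, assuming that aggregate expenditure equals households multiplied by per household expenditure; the block group names and dollar values are hypothetical and are not AGS data.

```python
# Hypothetical block groups: (households, average annual spending per household in dollars).
block_groups = {
    "BG-A": (400, 5000.0),
    "BG-B": (600, 7000.0),
}


def aggregate(households, per_household):
    """Aggregate expenditure for one area."""
    return households * per_household


def roll_up(groups):
    """Combine areas into a higher geographic level.

    Returns (aggregate expenditure, per household expenditure) for the
    combined area. The per household figure is household-weighted.
    """
    total_households = sum(h for h, _ in groups.values())
    total_spending = sum(aggregate(h, avg) for h, avg in groups.values())
    return total_spending, total_spending / total_households


total, per_household = roll_up(block_groups)
print(total, per_household)
```

Rolling up the aggregate first and dividing afterwards keeps the per household figure consistent at every geographic level.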
The major categories represented are:
- Total Expenditure
- Food and Beverages
- Household Operations
- Household Furnishings/Equipment
- Health Care
- Personal Care
- Tobacco Products
- Miscellaneous Expenses
- Cash Contributions
- Personal Insurance
Most of these categories include two or three levels of sub-category detail. For example, a typical classification for an item in the food group is:
TOTAL Total Expenditure
FB Food and Beverage
FB1 Food At Home
FB102 Dairy Products
This structure permits ready analysis of expenditures at any level of detail and between levels of detail. It is possible to analyze any individual category within the context of its parent category (e.g. cheese expenditures as a share of total dairy product expenditures or total food at home expenditures).
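As an illustration of this kind of parent and child analysis, the sketch below computes a category's share of any of its ancestors. The four codes come from the example above; the dollar values and the `parent` mapping are assumptions made for the example, not figures from the database.

```python
# Hypothetical annual spending per household in dollars (illustrative values only).
spending = {
    "TOTAL": 40000.0,  # Total Expenditure
    "FB": 6000.0,      # Food and Beverage
    "FB1": 3500.0,     # Food At Home
    "FB102": 400.0,    # Dairy Products
}

# Parent of each code, following the example hierarchy in the text.
parent = {"FB": "TOTAL", "FB1": "FB", "FB102": "FB1"}


def ancestors(code):
    """Return the chain of parent codes, from the immediate parent up to TOTAL."""
    chain = []
    while code in parent:
        code = parent[code]
        chain.append(code)
    return chain


def share_of(code, ancestor):
    """Share of the ancestor's spending accounted for by `code`."""
    if ancestor not in ancestors(code):
        raise ValueError(f"{ancestor} is not an ancestor of {code}")
    return spending[code] / spending[ancestor]


print(ancestors("FB102"))
print(round(share_of("FB102", "FB1"), 4))
```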
Methodology and Data Sources
The consumer spending database consists of a multi-level hierarchical classification of household expenditures, which covers the majority of annual household expenditures. It is derived from an extensive modeling effort using the 1998, 1999 and 2000 Consumer Expenditure Survey data from the Bureau of Labor Statistics. The BLS survey is a comprehensive survey that interviews an average of more than 5,000 households four times a year using a rotating sampling frame. The use of several consecutive years of data provides a rich base of expenditure data from which to build expenditure models based on household demographics.
The database consists of a total of 493 base variables, which are aggregated in up to four levels of detail. A hierarchical structure is utilized throughout, so that it is possible to aggregate or disaggregate categories as required for analysis. The survey includes a wide range of demographic attributes related to “consuming units” (generally households), which have been modeled separately for each discrete expenditure category. The older surveys were first inflated to the 1997 price levels using the detailed consumer price index series. For each individual expenditure category in the survey, summary statistics were calculated for each separate element in the list below. In several cases, it was possible to utilize cross tabulation data (e.g. income by age of head of household). These variables are listed below:
- Geographic region (Northeast, South, Midwest, West)
- Metropolitan status (metropolitan, non-metropolitan) and size (e.g. > 4 million)
- Housing tenure (owner or renter)
- Age of head of household (< 25 years, 25-34 years, 35-44 years, 45-54 years, 55-64 years, 65-74 years, and 75+ years)
- Size of household (1 person, 2 persons, 3 persons, 4 persons, 5 persons, 6+ persons)
- Household income (< 5000, 5-10000, 10-15000, 15-20000, 20-30000, 30-40000, 40-50000, 50-70000, 70000+)
- Race (White, Black, American Indian, Asian)
- Number of vehicles (none, 1, 2+ vehicles per household)
The total sample was utilized to obtain an average expenditure for each item. For each expenditure item, a series of adjustment factors were derived for each unique demographic attribute. These adjustment factors were then applied to the block group level using the same demographic variables in order to create estimates at the local level, which are consistent with local characteristics. Consistency checks were undertaken in order to ensure that the results at the block group level were consistent in the aggregate with overall income levels and published expenditures. Finally, the 1998 estimates were inflated using detailed consumer price indexes to current year levels.
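The adjustment step can be sketched roughly as below. The text does not say how the factors are combined, so this sketch assumes one plausible approach: each attribute's factor is averaged using the block group's household shares, and the attribute factors are then multiplied together. All factor values, shares, and price index levels are hypothetical.

```python
# Hypothetical adjustment factors: category mean spending divided by the overall mean.
factor_table = {
    "tenure": {"owner": 1.10, "renter": 0.85},
    "size": {"1 person": 0.60, "2+ persons": 1.20},
}

# Hypothetical block group profile: share of households in each category.
profile = {
    "tenure": {"owner": 0.5, "renter": 0.5},
    "size": {"1 person": 0.25, "2+ persons": 0.75},
}


def attribute_factor(shares, factors):
    """Share-weighted average factor for one demographic attribute."""
    return sum(share * factors[category] for category, share in shares.items())


def local_estimate(average, profile, factor_table):
    """Adjust the sample-wide average by every attribute (assumed multiplicative)."""
    estimate = average
    for attribute, shares in profile.items():
        estimate *= attribute_factor(shares, factor_table[attribute])
    return estimate


def inflate(value, cpi_base, cpi_current):
    """Bring a base year dollar value to current year price levels."""
    return value * cpi_current / cpi_base


base_year_estimate = local_estimate(400.0, profile, factor_table)
print(round(base_year_estimate, 2))
print(inflate(100.0, 160.0, 200.0))
```

A block group whose household mix matches the sample as a whole would, under these assumptions, receive factors close to 1 and an estimate close to the sample-wide average.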
In addition to providing average household expenditures, AGS also provides total market estimates for use in market share and demand analysis.
Marketers are challenged by how to reach and influence today’s dramatically evolving and digitally engaged American consumer. Over the past five years, household composition, economic status and technology usage have morphed due to the recession, unemployment, a housing market crash and a digital revolution. The combination of these forces has changed how Americans live, behave, communicate and interact on every level.
Many consumers have altered their lifestyles to accommodate their current socioeconomic situations. It’s not new news that during economic changes there is an impact in household formation. The fact that seven in 10 college graduates will live at home after college is a game-changing statistic for data-driven marketers. As the dynamics of American households transform, it is critical for marketers to recognize changes in their customer landscape and understand the emerging values driving behavior so that they can communicate with greater relevancy and impact.
Understanding consumers in this evolving environment is a crucial business task. Prioritizing and targeting the best customers for the greatest return on marketing investment requires an updated and accurate customer segmentation system. Additionally, unifying marketing programs across traditional and digital media with defined customer segments enables maximum customer engagement, profitable acquisition, increased customer loyalty, retention and lifetime value.
Key American consumer dynamic shifts in the past five years:
- Aging of America — baby-boomer population now turning 65
- Increase of multigenerational households
- Moms having children later in life
- Digital diversity and mobile movement — more and increasing ways for consumers to connect
- Consumer trends — GreenAwareSM, healthy lifestyles
Experian Marketing Services’ Mosaic® USA is a household-based consumer lifestyle segmentation system that classifies all U.S. households and neighborhoods into 71 unique segments and 19 overarching groups, providing a 360-degree view of consumers’ choices, preferences and habits. The new Mosaic® system is the first segmentation tool built in the U.S. market in the past five to 10 years and reflects critical new data presented in the 2010 census.
This groundbreaking classification system paints a rich picture of U.S. consumers and their sociodemographics, lifestyles, behaviors and culture, providing marketers with the most accurate and comprehensive view of their customers, prospects and markets.
Mosaic USA offers a common customer language to define, measure, describe and engage target audiences through accurate segment definitions that enable more strategic and sophisticated conversations with consumers. Using Mosaic USA lifestyle segmentation, marketers can anticipate the behavior, attitudes and preferences of their best customers and reach them in the most effective traditional and digital channels with the best messages.
CrimeRisk is a block group and higher level geographic database consisting of a series of standardized indexes for a range of serious crimes against both persons and property. It is derived from an extensive analysis of several years of crime reports from the vast majority of law enforcement jurisdictions nationwide. The crimes include murder, rape, robbery, assault, burglary, larceny, and motor vehicle theft. These categories are the primary reporting categories used by the FBI in its Uniform Crime Report (UCR), with the exception of Arson, for which data is very inconsistently reported at the jurisdictional level.
In accordance with the reporting procedures used in the UCR reports, aggregate indexes have been prepared for personal and property crimes separately, as well as a total index. While this provides a useful measure of the relative “overall” crime rate in an area, it must be recognized that these are unweighted indexes, in that a murder is weighted no more heavily than a purse snatching in the computation. For this reason, caution is advised when using any of the aggregate index values.
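The effect of an unweighted aggregate can be seen in a small sketch. It assumes that an index is the local rate expressed relative to the national rate, with 100 meaning equal to the national rate; the text does not state the exact index formula, and every count below is hypothetical.

```python
PERSONAL = ("murder", "rape", "robbery", "assault")


def rate_per_100k(count, population):
    """Crime rate per 100,000 population."""
    return count * 100_000 / population


def crime_index(local, local_pop, national, national_pop, crimes):
    """Unweighted index: every incident counts the same, whatever its type."""
    local_rate = rate_per_100k(sum(local[c] for c in crimes), local_pop)
    national_rate = rate_per_100k(sum(national[c] for c in crimes), national_pop)
    return 100 * local_rate / national_rate


# Hypothetical counts of personal crimes.
national = {"murder": 50, "rape": 300, "robbery": 1000, "assault": 2650}
area_a = {"murder": 0, "rape": 0, "robbery": 5, "assault": 35}
area_b = {"murder": 10, "rape": 10, "robbery": 10, "assault": 10}

index_a = crime_index(area_a, 10_000, national, 1_000_000, PERSONAL)
index_b = crime_index(area_b, 10_000, national, 1_000_000, PERSONAL)
print(index_a, index_b)
```

Both areas receive the same personal crime index even though one of them has ten murders and the other has none, which is why the individual crime indexes are the safer basis for comparison.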
The primary source of CrimeRisk was a careful compilation and analysis of the FBI Uniform Crime Report databases. On an annual basis, the FBI collects data from each of about 16,000 separate law enforcement jurisdictions at the city, county, and state levels and compiles these into its annual Uniform Crime Report (UCR). The latest national crime report can be obtained from the FBI web site in Adobe Portable Document Format (PDF) or ordered directly from the FBI. While useful, the UCR provides detailed data only for the largest cities, counties, and metropolitan areas.
The original analysis was undertaken by obtaining detailed jurisdictional level data for the years 1990 through 1996, which were supplemented with 1999 preliminary UCR statistics at the State level and for cities and metropolitan areas where those have been released. AGS now uses UCR data from 1998-2006. The preliminary 2007 release data was used to balance the models to the latest available data.
A considerable effort was made to correct a number of problems that are prevalent within the FBI databases, including:
- The standardization of jurisdictional names: the FBI does not employ Census Bureau codes in its databases, and the jurisdictional names contain numerous typographical errors and format discrepancies, which needed to be manually corrected
- Reporting by individual jurisdictions can be inconsistent from year to year, in that data for some jurisdictions is missing for one or more years and required handling
- Reporting for some crime types is inconsistent between jurisdictions. The FBI handles this by simply suppressing the statistics entirely for those areas. This primarily affects the rape category for Illinois, where statistics are suppressed for all but the largest jurisdictions. These missing values were handled via the modeling process, in which rape estimates were prepared for these jurisdictions by using a model which related rape incidence to other crime types
- The standardization of the database to account for jurisdictional overlaps. For example, the California Highway Patrol has jurisdiction over only state and Interstate highways in urban areas.
- Crime rates in general have been declining over the past several years, so it was necessary to adjust the historical data to reflect current crime rates.
Once this correction and standardization effort was completed, the database consisted of a time series of six years of data covering:
- All cities and towns which have their own police agency.
- All cities and towns where policing for the local jurisdiction is contracted to a higher level agency but which tracks statistics separately.
- A record for each county, which covers the population not covered by either of the two cases above. This is normally either a County Sheriff (or equivalent) or a State level jurisdiction, which reports incidence of crime by county (e.g. in New York, the State Trooper).
The initial models were undertaken using a subset of this database. In the smallest cities, a single murder will have a profound effect on the crime rate per 100,000 population that would severely distort the resulting models. Cities with fewer than 2,500 people were reassigned to their parent counties for the purpose of the analysis. A wide range of 1990 Census and current year demographic attributes was extracted from AGS’ databases for the remaining areas (approximately 8,500 separate “jurisdictions”). This database was then used as the primary modeling database and was used later for scaling purposes.
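The small city problem and the reassignment step can be illustrated as follows. The 2,500 threshold comes from the text; the record layout, place names, and counts are hypothetical, and the sketch tracks only one crime type for brevity.

```python
MIN_POPULATION = 2500  # threshold stated in the text


def rate_per_100k(count, population):
    """Crime rate per 100,000 population."""
    return count * 100_000 / population


def reassign_small_cities(cities, county_remainders):
    """Fold cities below MIN_POPULATION into their parent county's record.

    cities: list of dicts with keys name, county, population, murders.
    county_remainders: dict of county name -> {"population", "murders"}.
    Returns (kept_cities, updated_counties) without mutating the inputs.
    """
    counties = {name: dict(rec) for name, rec in county_remainders.items()}
    kept = []
    for city in cities:
        if city["population"] < MIN_POPULATION:
            record = counties[city["county"]]
            record["population"] += city["population"]
            record["murders"] += city["murders"]
        else:
            kept.append(city)
    return kept, counties


cities = [
    {"name": "Smallville", "county": "Example County", "population": 1000, "murders": 1},
    {"name": "Midtown", "county": "Example County", "population": 50000, "murders": 2},
]
county_remainders = {"Example County": {"population": 19000, "murders": 1}}

kept, counties = reassign_small_cities(cities, county_remainders)
print(rate_per_100k(1, 1000))  # one murder in a town of 1,000
print(counties["Example County"])
```

One murder in a town of 1,000 people is a rate of 100 per 100,000. Once the town is folded into its county record, the combined rate in this example drops to 10 per 100,000.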
Each of the seven crime types was modeled separately, using an initial range of about 65 socio-economic characteristics taken from the 2000 Census and AGS’ current year estimates. Separate models were constructed for each of the nine Census divisions (e.g. New England, East North Central, Pacific) in order to account for regional differences in crime rates and the demographic characteristics that underlie them. The models constructed typically accounted for over 85% of the variance in crime rates at this “jurisdiction” level, although it should be noted that the results for property crimes were generally more reliable than for personal crimes.
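To make "variance accounted for" concrete, the sketch below fits a separate one-predictor least squares line per geographic grouping and reports R squared for each. This is a deliberately simplified stand-in: the actual models drew on about 65 characteristics and their form is not described in the text. The predictor, the rows, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def fit_by_group(rows):
    """rows: (group, predictor, crime_rate) tuples. One model per group."""
    grouped = {}
    for group, x, y in rows:
        xs, ys = grouped.setdefault(group, ([], []))
        xs.append(x)
        ys.append(y)
    models = {}
    for group, (xs, ys) in grouped.items():
        slope, intercept = fit_line(xs, ys)
        models[group] = (slope, intercept, r_squared(xs, ys, slope, intercept))
    return models


# Hypothetical jurisdictions: (grouping, a demographic predictor, crime rate).
rows = [
    ("Pacific", 1.0, 3.0), ("Pacific", 2.0, 5.0),
    ("Pacific", 3.0, 7.0), ("Pacific", 4.0, 9.0),
    ("New England", 1.0, 2.0), ("New England", 2.0, 2.5),
    ("New England", 3.0, 4.5), ("New England", 4.0, 5.0),
]
models = fit_by_group(rows)
print(models)
```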
The results of these models were then applied to the block group level using the same demographic attributes compiled at the block group level. The resulting estimates were then scaled to match the master database of 8,500 jurisdictions. For cities, the block groups within each city were scaled to match the city total. For areas outside of these cities (or for smaller centers), results were scaled to match the county total after adjusting for those cities scaled separately.
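The scaling step can be sketched as a proportional adjustment, which is an assumption here since the text does not name the scaling method. Each block group's modeled value is multiplied by the ratio of the jurisdiction's control total to the sum of the modeled values. Block group names and values are hypothetical.

```python
def scale_to_total(estimates, control_total):
    """Proportionally scale modeled values so they sum to control_total.

    estimates: dict of block group id -> modeled crime count.
    If the modeled values sum to zero there is nothing to scale, so the
    values are returned unchanged.
    """
    model_total = sum(estimates.values())
    if model_total == 0:
        return dict(estimates)
    factor = control_total / model_total
    return {bg: value * factor for bg, value in estimates.items()}


# Hypothetical modeled burglary counts for the block groups of one city.
modeled = {"BG-1": 10.0, "BG-2": 30.0}
city_total = 60.0  # hypothetical control total from the jurisdiction database

scaled = scale_to_total(modeled, city_total)
print(scaled)
```

Proportional scaling preserves the relative ranking of block groups within a city while ensuring that they sum to the jurisdiction figure.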
The final crime rate estimates were then weighted by population and aggregated to the national totals.