Reproducible Area Deprivation Index (ReADI)
Background on the Original ADI
The Area Deprivation Index (ADI) was first developed by Singh (2003) using 1990 U.S. Census data. It is a factor-based measure that combines 17 socioeconomic indicators, covering poverty, education, housing, and employment, to summarize area-level deprivation.
Kind and colleagues subsequently updated the ADI using 2000 Census data (Kind et al., 2014) and released national estimates through the Neighborhood Atlas (Kind et al., 2018). Later releases used 5-year American Community Survey (ACS) data (2015, 2020, and 2022) to generate ADI scores at the census block-group level. These data products provide both national percentile ranks (1–100) and state-specific decile ranks (1–10). For tract- and county-level applications, ADI values are typically constructed as population-weighted averages of the underlying block-group scores. The ADI has since been widely adopted in health services research and incorporated into multiple federal and state policy applications, including risk adjustment and resource allocation models.
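As a minimal sketch of the population-weighted aggregation mentioned above, the following Python snippet averages block-group scores up to a single tract. The function name, scores, and populations are hypothetical values chosen for illustration; they are not Neighborhood Atlas data.

```python
def population_weighted_mean(scores, populations):
    """Average area scores, weighting each area by its population."""
    total_pop = sum(populations)
    if total_pop == 0:
        raise ValueError("total population is zero; no weighted mean is defined")
    return sum(s * p for s, p in zip(scores, populations)) / total_pop


# Hypothetical block groups within one tract.
block_group_scores = [30.0, 60.0, 90.0]
block_group_pops = [1000, 2000, 1000]

tract_score = population_weighted_mean(block_group_scores, block_group_pops)
print(tract_score)
```

With equal populations this reduces to a simple mean; with unequal populations, larger block groups pull the tract value toward their own score.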
Singh's Original Method
1. Singh initially selected 21 socioeconomic indicators and conducted factor analysis using 1990 U.S. Census data.
2. Seventeen variables with strong loadings on the first factor (≥ 0.45) were retained.
3. A single-factor solution was used to derive factor score coefficients, which served as variable weights.
4. The resulting scores were standardized (mean = 100, SD = 20).
5. County-level ADI values were generated by averaging tract-level scores.
6. Higher ADI values indicate higher levels of socioeconomic deprivation.
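Step 4 above can be sketched in a few lines of Python. This is an illustration only: the factor scores are hypothetical, and the use of the population (rather than sample) standard deviation is an assumption, since the source does not specify which was used.

```python
import statistics


def rescale_mean_sd(scores, target_mean=100.0, target_sd=20.0):
    """Linearly rescale scores to a chosen mean and standard deviation."""
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores)  # assumption: population SD
    return [target_mean + target_sd * (s - mean) / sd for s in scores]


# Hypothetical factor scores for five areas.
factor_scores = [-1.2, -0.3, 0.0, 0.4, 1.1]
adi_like = rescale_mean_sd(factor_scores)
print([round(v, 1) for v in adi_like])
```

Because the transformation is linear with a positive slope, the ordering of areas is unchanged; only the scale of the index differs.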
Why We Created ReADI
Over time, inconsistencies and methodological limitations in later ADI implementations became apparent. Independent groups, including our own, identified issues that affect the reproducibility, comparability, and interpretability of ADI scores. The Reproducible Area Deprivation Index (ReADI) was developed to address these concerns while preserving the original purpose of the ADI: a transparent, consistent, and scalable measure of area deprivation based on publicly available data.
Improvements Made in ReADI
In the original ADI, Singh (2003) used standardized input data to perform factor analysis and then applied the resulting factor score coefficients to compute index scores, either by applying them to standardized data or by using factor scores directly.
Modern factor-analysis implementations (e.g., psych::fa() in R or SAS PROC FACTOR) standardize the input variables internally, apply the estimated weights, and return standardized factor scores as part of the output. These scores already reflect both the loadings and the appropriate scaling of the input variables.
In ReADI, we use this standard approach. We extract factor scores directly from the factor analysis output. This avoids ad hoc post-processing and ensures that scoring is reproducible and correctly scaled.
In contrast, in the Neighborhood Atlas ADI, Singh’s 1990 weights were manually applied to raw, unstandardized ACS variables. Because several variables (e.g., median home value, gross rent) are orders of magnitude larger than others, they end up dominating the index when unstandardized values are multiplied by weights that were originally derived from z-scored data. This violates a core assumption of factor-based scoring and distorts the resulting index (Petterson, 2023).
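The scaling problem can be illustrated with a small Python sketch. All numbers here are hypothetical: two variables, three areas, and two weights of equal magnitude stand in for the full set of indicators and the actual factor score coefficients.

```python
import statistics


def zscores(values):
    """Standardize values to mean 0 and sample SD 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical areas (illustrative numbers only).
home_value = [150_000.0, 300_000.0, 450_000.0]  # dollars
pct_poverty = [30.0, 15.0, 5.0]                 # percent

# Hypothetical weights of equal magnitude; the signs encode that higher
# home values mean less deprivation and higher poverty means more.
w_home, w_poverty = -0.5, 0.5

# Weights applied to raw values: the dollar-scaled variable swamps the percent.
raw_index = [w_home * h + w_poverty * p for h, p in zip(home_value, pct_poverty)]
raw_home_share = [
    abs(w_home * h) / (abs(w_home * h) + abs(w_poverty * p))
    for h, p in zip(home_value, pct_poverty)
]

# Weights applied to z-scores: both variables contribute on a common scale.
std_index = [
    w_home * zh + w_poverty * zp
    for zh, zp in zip(zscores(home_value), zscores(pct_poverty))
]

print([round(s, 4) for s in raw_home_share])
print([round(s, 2) for s in std_index])
```

In the raw version, home value accounts for nearly all of each area's index value even though the two weights have the same magnitude; after standardization, the poverty measure contributes on a comparable footing.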
By relying on factor scores produced directly by the factor model, ReADI maintains appropriate scaling and yields a more valid and interpretable measure of deprivation.
The Neighborhood Atlas continues to apply the original 1990 weights to newer data releases. In contrast, Singh recalculated weights when validating the ADI on earlier Census years (e.g., 1970), indicating that re-estimation was part of the intended design.
ReADI re-estimates factor loadings for each version of the index using psych::fa() in R, with a one-factor solution and population weighting. This allows the contribution of each component to adapt to contemporary distributions rather than being locked to 1990 patterns.
Several original ADI thresholds are no longer aligned with current socioeconomic conditions. For example:
- Education categories based on “< 9th grade” understate contemporary educational disadvantage.
- Income categories are not adjusted for inflation.
- Telephone ownership is no longer a meaningful deprivation marker.
ReADI updates these definitions to reflect current contexts. Examples include:
- Education: replacing “< 9th grade” with “< high school diploma,” and “≥ high school diploma” with “≥ bachelor’s degree” (Table 1).
- Income disparity: redefining the disparity measure as log(100 × (Households ≤ $20,000 / Households ≥ $100,000)).
These changes preserve the original construct (relative deprivation) while updating thresholds to contemporary realities.
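The redefined income disparity measure can be sketched as follows. The function names and household counts are hypothetical, and the natural logarithm is assumed because Table 1 writes the formula with ln. The second function mirrors the Table 1 variant, which adds 1 to each household count.

```python
import math


def income_disparity(low_income_households, high_income_households):
    """log(100 * low / high), using the natural logarithm (assumed)."""
    return math.log(100.0 * low_income_households / high_income_households)


def income_disparity_plus_one(low_income_households, high_income_households):
    """Table 1 variant: add 1 to each count before taking the ratio."""
    return math.log(
        100.0 * (low_income_households + 1) / (high_income_households + 1)
    )


# Hypothetical area: 200 households at or below $20,000,
# 100 households at or above $100,000.
print(round(income_disparity(200, 100), 3))
print(round(income_disparity_plus_one(200, 100), 3))

# The +1 variant stays defined when a count is zero.
print(round(income_disparity_plus_one(0, 0), 3))
```

Higher values indicate more low-income households relative to high-income households. The plain ratio is undefined when the high-income count is zero, which the +1 variant sidesteps.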
Census block groups, tracts, and counties vary substantially in population size and distribution. To reduce skew and improve comparability across areas:
- We derive a population weight from B01003_001 (total population).
- This weight is transformed using sqrt(pop) + 1 to dampen the influence of very large areas while still reflecting their greater information content.
- These weights are incorporated into the factor analysis so that the index reflects the population structure of the areas being measured, rather than treating all units as equally informative regardless of size.
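A minimal Python sketch of the weighting idea follows. The weight transform is the sqrt(pop) + 1 form given above; the weighted correlation is shown as one common way for observation weights to enter a factor analysis, and whether the ReADI code uses exactly this route is an assumption. The populations and variable values are hypothetical.

```python
import math


def population_weights(populations):
    """Dampened weights: sqrt(pop) + 1, so zero-population areas get weight 1."""
    return [math.sqrt(p) + 1.0 for p in populations]


def weighted_corr(x, y, w):
    """Weighted Pearson correlation between x and y with weights w."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y)) / total
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x)) / total
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y)) / total
    return cov / math.sqrt(var_x * var_y)


# Hypothetical areas with very different populations.
pops = [0, 100, 10_000]
weights = population_weights(pops)
print(weights)

# Hypothetical indicator values for the same three areas.
pct_poverty = [25.0, 12.0, 8.0]
pct_no_vehicle = [18.0, 9.0, 4.0]
print(round(weighted_corr(pct_poverty, pct_no_vehicle, weights), 3))
```

A 100-fold difference in population (100 versus 10,000) becomes roughly a 9-fold difference in weight, so large areas count for more without overwhelming the estimate.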
ReADI Methods Summary
1. Construct input variables: Calculate 17 socioeconomic variables using ACS 5-year data at the relevant geographic level (Table 1).
| Variable | Census Table | Formula |
|---|---|---|
| Median family income, $ | B19013 | 001 |
| Income disparity𝑎 | B19001 | ln((sum(002-004)+1)/(sum(014-017)+1)x100) |
| Families below poverty level, % | B17017 | (002/001)x100 |
| Population below 150% of the poverty threshold, % | C17002 | (sum(002-005)/001)x100 |
| Population ≥25y with < a high school diploma, % | B15003 | (sum(002-016)/001)x100 |
| Population ≥25y with at least a bachelor’s degree, % | B15003 | (sum(022-025)/001)x100 |
| Employed persons ≥16y in white-collar occupations, % | C24010 | (sum(003, 027, 039, 063)/001)x100 |
| Civilian labor force population ≥16 y unemployed, % | B23025 | (005/003)x100 |
| Median home value, $ | B25077 | 001 |
| Median gross rent, $ | B25064 | 001 |
| Median monthly mortgage, $ | B25088 | 002 |
| Owner-occupied housing units, % | B25003 | (002/001)x100 |
| Single-parent households with children <18y, % | B11012 | (sum(010, 015)/001)x100 |
| Households without a motor vehicle, % | B25044 | (sum(003, 010)/001)x100 |
| Households without internet, % | B28002 | (013/001)x100 |
| Occupied housing units without complete plumbing, % | B25049 | ln((sum(004, 007)+1)/(001+1)x100) |
| Households with more than 1 person per room, % | B25014 | (sum(005-007, 011-013)/001)x100 |
𝑎 Income disparity in 2022 is defined as log(100 × (Households ≤ $20,000 / Households ≥ $100,000)); the Table 1 formula adds 1 to each household count before taking the ratio.
2. Handle structural zeros: Replace resulting NAs with 0 when they arise from a denominator of 0 (e.g., no households in a given category), as these represent structural zeros rather than true missing data.
3. Standardize variables: Standardize all variables using a z-score transformation.
4. Compute population weights: Calculate population weights (POPWT) as each unit’s share of the total population, then transform using sqrt(POPWT) + 1 to reduce the influence of very large units while retaining population information.
5. Estimate the factor model: Fit a one-factor model using the fa() function from the psych package in R, with population weights and a principal-factor extraction method.
6. Generate ReADI scores: Extract factor scores directly from the factor analysis output and rescale them to a 0–100 range to produce the final ReADI scores (higher values = greater deprivation).
Together, these steps provide a fully reproducible implementation of the Area Deprivation Index that is aligned with contemporary data, transparent in its assumptions, and suitable for use in research and policy applications.
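The data-preparation and rescaling steps (2, 3, 4, and 6) can be sketched in plain Python as below. This is not the ReADI code: the helper names and all input values are hypothetical, min-max scaling is an assumption about how the 0–100 rescaling is done, and step 5 is not reproduced because it relies on fa() from the psych package in R. The factor scores used in the example are placeholders for that output.

```python
import math
import statistics


def safe_pct(numerator, denominator):
    """Step 2: a zero denominator is a structural zero, so return 0."""
    if denominator == 0:
        return 0.0
    return 100.0 * numerator / denominator


def zscore(values):
    """Step 3: standardize to mean 0 and sample SD 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def share_weights(populations):
    """Step 4: square root of each unit's population share, plus 1."""
    total = sum(populations)
    return [math.sqrt(p / total) + 1.0 for p in populations]


def rescale_0_100(scores):
    """Step 6: min-max rescaling to 0-100 (assumed rescaling method)."""
    lo, hi = min(scores), max(scores)
    return [100.0 * (s - lo) / (hi - lo) for s in scores]


# Hypothetical inputs for three areas; the second has no labor force.
unemployed = [5, 0, 20]
labor_force = [100, 0, 200]
populations = [900, 0, 1600]

pct_unemployed = [safe_pct(u, n) for u, n in zip(unemployed, labor_force)]
z_unemployed = zscore(pct_unemployed)
weights = share_weights(populations)

# Placeholder factor scores standing in for the output of step 5.
factor_scores = [-0.8, 0.1, 1.2]
readi_like = rescale_0_100(factor_scores)

print(pct_unemployed, z_unemployed)
print([round(w, 3) for w in weights])
print([round(r, 1) for r in readi_like])
```

In this sketch the area with the lowest factor score maps to 0 and the area with the highest maps to 100, so the rescaled values describe position within the set of areas being scored.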