The paper evaluates 11 measures of inequality between two proportions p_{1} and p_{2}, some of which are new to the health disparities literature. These measures are selected because they are continuous, nonnegative, equal to zero if and only if |p_{1}-p_{2}|=0, and maximal when |p_{1}-p_{2}|=1. They are also symmetric (unchanged when p_{1} and p_{2} are interchanged) and complement-invariant (unchanged when (p_{1},p_{2}) is replaced by (1-p_{2},1-p_{1})). To study inter-measure agreement, five of the 11 measures, including the absolute difference, are retained, because they remain finite and are maximal if and only if |p_{1}-p_{2}|=1. Even when the two proportions are assumed to be drawn at random from a shared distribution (interpreted as the absence of an avoidable difference), the expected value of each measure depends on the shape of the distribution (and the choice of

Healthy People is a U.S. public health initiative that, for four decades, has established national goals and measurable objectives with 10-year targets to guide evidence-based policies, programs, and other actions to improve health and well-being (

Much of the above-cited literature is concerned with summary health inequality indices, summarizing all possible pairwise comparisons among population subgroups (e.g., by race/ethnicity). However, even for conceptually simple comparisons between two subgroup proportions, the assessment of the magnitude of inequality as well as the direction and magnitude of the change in inequality over time depend on analytic choices. For example, an analysis could focus on health attainment versus shortfall, or absolute versus relative differences (

Many measures of inequality are available and may lead to diverging conclusions about magnitude and change in inequality over time (

The main objective of this paper is to present a standard measurement unit for any inequality measure between two proportions p_{1} and p_{2}, defined as:

This metric is constructed in the same way as a z-score, by centering and scaling the inequality measure between two subgroup proportions p_{1} and p_{2} relative to its mean and variance under the assumption that the proportions are drawn at random from a known underlying distribution, allowing an “apples-to-apples” comparison of magnitude among measures.
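As a sketch of the construction just described, write M(p_{1},p_{2}) for a generic inequality measure and Z(p_{1},p_{2}) for its standardized version. Both symbols are placeholders assumed here for illustration and are not necessarily the paper's own notation. Centering at the mean and scaling by the standard deviation (the square root of the variance), as for a z-score, gives:

```latex
Z(p_1,p_2) \;=\; \frac{M(p_1,p_2) - \mathrm{E}\left[M(p_1,p_2)\right]}{\sqrt{\mathrm{Var}\left[M(p_1,p_2)\right]}}
```

Here the expectation and variance are taken under the assumed underlying distribution of p_{1} and p_{2}, so that Z has mean zero and variance one by construction.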

Prior to constructing the standard measurement unit, the paper abstracts mathematical properties that may impact comparability among different inequality measures. Two mathematical properties are most useful in distinguishing among inequality measures. The first is whether a measure attains a “unique maximum”, namely that it is maximal if and only if |p_{1}-p_{2}|=1; this property facilitates a simple reading of the worst-case scenario for inequality, in which one subgroup proportion is zero and the other is one. The second is the extent of the “penalty” that each measure assesses when small departures from equality occur (e.g., p_{1}=p_{2}+δ for a small δ>0), which may impact the mean and variance of the measure in

This paper focuses on pairwise comparisons between proportions (typically multiplied by 100 and reported as percentages), because 70% of the nearly 1,100 measurable Healthy People 2020 objectives are tracked using percentages, and proportions are commonly used elsewhere (_{1},p_{2}) that separates two subgroup proportions p_{1} and p_{2}. Even though this operational definition is agnostic to whether the difference was avoidable, the extent to which inequality decreases reflects progress toward achieving equity (

Drawing from the related concepts of statistical effect size and information-theoretic divergence, 11 candidate inequality measures are formulated. Statistical effect size quantifies the magnitude of the difference between two proportions (_{1} and P_{2} (_{1} and p_{2} by specifying P_{j} = (p_{j},1-p_{j}), for

As discussed in (_{1} and p_{2} within the unit interval [0,1]. To investigate the impact of their underlying distribution, the proportions are conceptualized as independent beta random variables. The beta family encompasses uniform, U-shaped, and unimodal densities, the latter being symmetric, right-skewed, or left-skewed. The paper argues that the magnitude of inequality will be impacted by the mean and variance of the inequality measure given the underlying distribution for p_{1} and p_{2}, and proposes the standard measurement unit shown in

Drawing from an empirical investigation of properties of two benchmark inequality measures, the absolute difference, |p_{1}-p_{2}|, and the ratio, p_{1}/p_{2}, this section abstracts some mathematical properties that may or may not be met by any given measure of inequality between p_{1} and p_{2}. All 11 measures surveyed satisfy properties 1–3 and 5–7, below. Seven of the 11 measures also satisfy property 4; see

A nonnegative measure of inequality takes values greater than or equal to 0 for all p_{1} and p_{2}. The absolute difference, |p_{1}-p_{2}|, and the ratio, p_{1}/p_{2}, are nonnegative measures of inequality.

Absence of inequality postulates that the measure takes some fixed minimal value when p_{1}=p_{2}, reflecting attainment of equality between the two subgroup proportions.

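Different minimal values are possible. For the absolute difference, p_{1}=p_{2} if and only if |p_{1}-p_{2}|=0; thus, its minimal value is 0. For the ratio, p_{1}=p_{2} if and only if p_{1}/p_{2}=1; thus, its minimal value is 1, except when p_{1}=p_{2}=0.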

The property that a measure is bounded below by its minimal value, which it takes when |p_{1}-p_{2}|=0 (minimal absolute difference), is consistent with Property 2. The property that a measure is bounded above by its maximal value, which it takes when |p_{1}-p_{2}|=1 (maximal absolute difference), is concerned with defining magnitude of inequality in the worst-case scenario |p_{1}-p_{2}|=1, when one proportion is zero and the other is one.

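While |p_{1}-p_{2}|=1 is sufficient for a measure to be maximal, it may not be necessary: some measures can be maximal even when |p_{1}-p_{2}|<1. For example, the (absolute value of the) logarithm of the odds ratio can be maximal (infinite) even when |p_{1}-p_{2}| remains small; see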

The property that a measure is maximal if and only if |p_{1}-p_{2}|=1 is known as the “orthogonal maximum” property; see (

A measure of inequality is a continuous function of its arguments if, for δ>0, its values at (p_{1},p_{2}-δ), (p_{1},p_{2}+δ), (p_{1}-δ,p_{2}), and (p_{1}+δ,p_{2}) all converge to its value at (p_{1},p_{2}) as δ approaches 0.

The absolute difference |p_{1}-p_{2}| and ratio p_{1}/p_{2} are both continuous measures. A graph usually is sufficient for visual confirmation, but calculus techniques for demonstrating continuity are available (

There are two corollaries to Property 5, which allow for limiting forms of Properties 2 and 3.

Property 2′: For continuous measures, absence of inequality is understood to state that the measure will approach minimal inequality as |p_{1}-p_{2}| approaches 0. For example, the ratio p_{1}/p_{2} is undefined when both proportions are zero, yet it satisfies this limiting form of Property 2 (with

Property 3′: As with minimal inequality, for continuous measures, the maximal inequality property is understood to indicate that the measure will approach its maximal value as |p_{1}-p_{2}| approaches 1.

This paper defines a

A measure of inequality is symmetric (or “undirected”) if its value is unchanged when p_{1} and p_{2} are interchanged.

The absolute difference |p_{1}-p_{2}| is symmetric. The ratio p_{1}/p_{2} is not symmetric, since p_{2}/p_{1}≠p_{1}/p_{2} unless p_{1}=p_{2}; its value therefore depends on which subgroup proportion is used in the denominator.

As seen below, a measure that lacks symmetry can be symmetrized by averaging its values at (p_{1},p_{2}) and (p_{2},p_{1}). Even though the magnitude of symmetrized measures may become difficult to interpret, symmetrization remains useful for priming various measures under consideration for a comparative assessment when directionality is only a secondary concern.

A measure of inequality is complement-invariant if its value at (p_{1},p_{2}) equals its value at (q_{2},q_{1}), where q_{j}=1-p_{j}, j=1,2, are the complementary proportions.

The absolute difference |p_{1}-p_{2}| is complement-invariant. The ratio p_{1}/p_{2} is not, since q_{2}/q_{1}≠p_{1}/p_{2} in general. The property of complement-invariance allows one to re-express inequality between proportions so that its magnitude is independent of whether the underlying health outcome is expressed as a favorable or an adverse outcome, which, as discussed previously, is a major limitation for the ratio p_{1}/p_{2}. As with lack of symmetry, lack of complement-invariance can be corrected, albeit at the expense of interpretability, e.g., by averaging the values of the measure at (p_{1},p_{2}) and (q_{2},q_{1}), to prepare different inequality measures or health outcomes for comparison.
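The symmetry and complement-invariance checks described above can be sketched numerically. This is a minimal illustration, not the paper's code: the helper names (`symmetrize`, `complement_average`, `is_symmetric`, `is_complement_invariant`) are hypothetical, and the checks are made at a single pair of proportions rather than proved for all pairs.

```python
def abs_diff(p1, p2):
    """Absolute difference |p1 - p2|."""
    return abs(p1 - p2)


def ratio(p1, p2):
    """Ratio p1 / p2 (requires p2 > 0)."""
    return p1 / p2


def symmetrize(measure):
    """Average the measure over the two orderings (p1, p2) and (p2, p1)."""
    return lambda p1, p2: (measure(p1, p2) + measure(p2, p1)) / 2


def complement_average(measure):
    """Average the measure at (p1, p2) and at the complements (q2, q1)."""
    return lambda p1, p2: (measure(p1, p2) + measure(1 - p2, 1 - p1)) / 2


def is_symmetric(measure, p1, p2, tol=1e-12):
    """Check measure(p1, p2) == measure(p2, p1) up to rounding error."""
    return abs(measure(p1, p2) - measure(p2, p1)) <= tol


def is_complement_invariant(measure, p1, p2, tol=1e-12):
    """Check measure(p1, p2) == measure(q2, q1) up to rounding error."""
    return abs(measure(p1, p2) - measure(1 - p2, 1 - p1)) <= tol


# Applying both corrections to the ratio yields a doubly symmetric quantity.
doubly_symmetric_ratio = complement_average(symmetrize(ratio))

for name, m in [("abs_diff", abs_diff), ("ratio", ratio),
                ("corrected ratio", doubly_symmetric_ratio)]:
    print(name, is_symmetric(m, 0.3, 0.1), is_complement_invariant(m, 0.3, 0.1))
```

Composing the two corrections is one possible design; as noted in the text, the resulting quantity may be harder to interpret than the original ratio.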

The 11 measures surveyed are formulated from the related concepts of statistical effect size and information-theoretic divergence. These two classes of measures are elucidated in

The ratio p_{1}/p_{2}, with _{1}>0 whenever p_{2}=0. Additionally, it is neither doubly symmetric nor readily corrected for lack thereof, hence it is excluded from further comparisons. Nonetheless, as seen in _{12},Ř_{21}) of the two ratios R_{12}=p_{1}/p_{2} and Ř_{21}=q_{2}/q_{1}. Thus, the selected measures, including the absolute difference, are seen as

The absolute difference |p_{1}-p_{2}| serves as a benchmark for interpreting the minimum and maximum of those measures. Some of the measures shown (e.g., the absolute logit difference) will attain their maximum no matter how small |p_{1}-p_{2}| may be. If one wishes to avoid such an arbitrarily large “penalty” when proportions are near the boundary of the unit interval, then one may rule out those measures by requiring uniqueness of maximal inequality (property 4).
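The boundary behavior described above can be illustrated with the absolute logit difference, taking logit(p)=ln[p/(1-p)]. In this sketch, the function `compare_near_boundary` and its default arguments are hypothetical choices made for illustration: p_{2} is held at 0.01 while p_{1} shrinks toward 0, so that |p_{1}-p_{2}| stays below 0.01 while the logit-based measure keeps growing.

```python
import math


def logit(p):
    """Log-odds ln[p / (1 - p)], defined for 0 < p < 1."""
    return math.log(p / (1 - p))


def abs_logit_diff(p1, p2):
    """Absolute logit difference, i.e., |ln(odds ratio)|."""
    return abs(logit(p1) - logit(p2))


def compare_near_boundary(p2=0.01, exponents=(3, 6, 9, 12)):
    """Return (p1, |p1 - p2|, |logit difference|) as p1 = 10**-k shrinks toward 0."""
    rows = []
    for k in exponents:
        p1 = 10.0 ** (-k)
        rows.append((p1, abs(p1 - p2), abs_logit_diff(p1, p2)))
    return rows


for p1, gap, penalty in compare_near_boundary():
    print(f"p1={p1:.0e}  |p1-p2|={gap:.4f}  |logit diff|={penalty:.2f}")
```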

Another point of reference for comparing the selected measures is the rate of change in _{1},p_{2}) as |p_{1}-p_{2}| decreases toward zero. Thus, measures in _{1}-p_{2}| and whether/how their rate of change depends on the location of the two proportions on the unit interval (e.g., both near 0 or 1, or both near 0.5). Mathematical derivations are included in

Measures 5–8 in

D_{2}: Standardized absolute difference, with pooled variance;

√Δ: Square root of triangle discrimination measure;

√S_{2}: Square root of rescaled Jensen-Shannon divergence; and

As shown in _{2} is equal to √Δ. Agreement between the absolute difference _{2}, or √Δ is illustrated in
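The equality between D_{2} and √Δ can be checked numerically under two assumptions that are not confirmed by the text here. First, D_{2} is taken to be |p_{1}-p_{2}|/(2√(p*q*)), with p*=(p_{1}+p_{2})/2 and q*=1-p*. Second, Δ is taken to be one half of the sum of (P_{i}-Q_{i})²/(P_{i}+Q_{i}) over the two components of P=(p_{1},q_{1}) and Q=(p_{2},q_{2}), so that its maximum is 1. The function names are hypothetical.

```python
import math


def d2(p1, p2):
    """Assumed form of D_2: |p1 - p2| / (2 * sqrt(p* q*)), with p* = (p1 + p2) / 2."""
    if p1 == p2:
        return 0.0  # includes (0, 0) and (1, 1), resolved by continuity
    p_star = (p1 + p2) / 2
    q_star = 1 - p_star
    return abs(p1 - p2) / (2 * math.sqrt(p_star * q_star))


def triangle(p1, p2):
    """Assumed form of the triangle discrimination, rescaled to have maximum 1."""
    if p1 == p2:
        return 0.0  # includes (0, 0) and (1, 1), resolved by continuity
    q1, q2 = 1 - p1, 1 - p2
    total = (p1 - p2) ** 2 / (p1 + p2) + (q1 - q2) ** 2 / (q1 + q2)
    return total / 2


for p1, p2 in [(0.3, 0.1), (0.9, 0.05), (1.0, 0.0)]:
    print(p1, p2, d2(p1, p2), math.sqrt(triangle(p1, p2)))
```

Under these assumed forms, the identity follows algebraically because 1/(p_{1}+p_{2}) + 1/(q_{1}+q_{2}) = 1/(2p*q*).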

In _{2} for “small”, “medium”, and “large” effect sizes are shown. The latter are from the thresholds 0.2, 0.5, and 0.8, respectively, for Cohen’s h-index, defined as π×_{2} are also larger than _{2} corresponding to the three effect size thresholds (
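The relationship between Cohen's h and the rescaled absolute arcsine difference can be sketched as follows. The sketch assumes the rescaled measure is (2/π)|arcsin√p_{1} − arcsin√p_{2}|, which takes values in [0,1] and makes Cohen's h equal to π times the rescaled measure, as stated above. The function names and the "below small" label are hypothetical.

```python
import math

# Conventional thresholds for Cohen's h, as cited in the text.
THRESHOLDS = {"small": 0.2, "medium": 0.5, "large": 0.8}


def rescaled_arcsine_diff(p1, p2):
    """Assumed form: (2 / pi) * |arcsin(sqrt(p1)) - arcsin(sqrt(p2))|, in [0, 1]."""
    return (2 / math.pi) * abs(math.asin(math.sqrt(p1)) - math.asin(math.sqrt(p2)))


def cohens_h(p1, p2):
    """Cohen's h, recovered as pi times the rescaled arcsine difference."""
    return math.pi * rescaled_arcsine_diff(p1, p2)


def effect_size_label(p1, p2):
    """Largest threshold category reached by Cohen's h, or 'below small'."""
    h = cohens_h(p1, p2)
    label = "below small"
    for name, cutoff in THRESHOLDS.items():  # insertion order: small, medium, large
        if h >= cutoff:
            label = name
    return label


# The same thresholds expressed on the rescaled [0, 1] scale.
rescaled_thresholds = {name: cutoff / math.pi for name, cutoff in THRESHOLDS.items()}
print(rescaled_thresholds)
print(effect_size_label(0.6, 0.2), effect_size_label(0.5, 0.35))
```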

The proportions p_{1} and p_{2} may be conceptualized _{1}≈p_{2}, values of p_{2} exist such that, in turn: (i) _{2}≤_{2}≤√Δ; (iii) _{2}≤√Δ; or (iv) _{2}≤√Δ; see _{1},p_{2})] for _{2}, or √Δ using numerical integration (values not shown here), one finds similar relationships: for the symmetric beta densities with α=β=0.5, 1, 2, 4, 8, or 16, expected values are ordered as in (i); for the skewed beta(1,3) and beta(2,6) densities, with mean=0.25, their order is as in (ii); for beta(2,13), with mean≈0.13, it is as in (iii); and for beta(1,6.5), with mean≈0.13, and beta(1,9) and beta(2,18), with mean=0.10, it is as in (iv).

Thus, of the inequality measures considered, only the square root of the triangle discrimination measure _{1}, _{2}) for all p_{1} and p_{2} and the distributional scenarios considered.

_{2}, _{1},p_{2}) for which _{1},p_{2}) is at 0, 1, and 2 standard deviations from its expected value _{1}-p_{2}| are shown at finer granularity.

Under the U-shaped beta(0.5,0.5) and uniform beta(1,1) densities (_{2}=p_{1}.

For the two unimodal symmetric densities beta(4,4) and beta(16,16) (

For the two unimodal skewed densities beta(2,6) and beta(2,13): using beta(2,6), (p_{1},p_{2})=(0.6,0.2), with |p_{1}-p_{2}|=0.4, registers at 2 standard deviations away from the mean for the absolute difference, whereas the difference between p_{1} and p_{2} needs to increase further, to |p_{1}-p_{2}|=0.5, e.g., (p_{1},p_{2})=(0.7,0.2), for the other four measures to register at 2 standard deviations. Using beta(2,13), (p_{1},p_{2})=(0.45,0.2), with |p_{1}-p_{2}|=0.25, registers at 2 standard deviations for the absolute difference, whereas |p_{1}-p_{2}|=0.35, e.g., (p_{1},p_{2})=(0.55,0.2), is needed for the other four measures to register at 2 standard deviations.

Thus, the standard measurement unit, with mean zero and variance one regardless of the choice of health inequality measure or the underlying distribution of p_{1} and p_{2}, allows assessment of inequality relative to what one would expect given prior information on the two proportions.
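A minimal sketch of how such a standardized value might be computed is given below, assuming p_{1} and p_{2} are independent beta(a,b) random variables and approximating the mean and variance of the measure with a midpoint rule on a grid. The function names, the grid size, and the choice of integration rule are assumptions made for illustration; the paper's own numerical integration method is not specified here. The midpoint rule is slow to converge for densities that are unbounded at 0 or 1, such as beta(0.5,0.5).

```python
import math


def abs_diff(p1, p2):
    """Absolute difference |p1 - p2|."""
    return abs(p1 - p2)


def beta_pdf(x, a, b):
    """Beta(a, b) density at 0 < x < 1, computed on the log scale."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def moments(measure, a, b, n=200):
    """Midpoint-rule mean and variance of measure(p1, p2) for p1, p2 iid beta(a, b)."""
    step = 1.0 / n
    grid = [(i + 0.5) * step for i in range(n)]
    weights = [beta_pdf(x, a, b) * step for x in grid]
    total = sum(weights)  # close to 1; renormalize to absorb quadrature error
    weights = [w / total for w in weights]
    first = second = 0.0
    for x, wx in zip(grid, weights):
        for y, wy in zip(grid, weights):
            value = measure(x, y)
            first += wx * wy * value
            second += wx * wy * value * value
    return first, second - first * first


def standard_unit(measure, p1, p2, a, b, n=200):
    """Center and scale measure(p1, p2) by its mean and standard deviation."""
    mean, var = moments(measure, a, b, n)
    return (measure(p1, p2) - mean) / math.sqrt(var)


print(standard_unit(abs_diff, 0.6, 0.2, a=2, b=6))
```

Any other measure taking two proportions can be passed in place of `abs_diff`, which is what allows values of different measures to be compared on a common scale.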

The paper evaluated 11 measures of inequality between proportions p_{1} and p_{2}. These measures were selected because they are continuous, nonnegative, equal to zero if and only if |p_{1}-p_{2}|=0, maximal when |p_{1}-p_{2}|=1, and doubly symmetric. To assess inter-measure agreement and develop a standard measurement unit for the magnitude of inequality, the absolute difference and four other measures were retained for further analysis, because, in addition to the mathematical properties they share with the remaining six measures, they are finite, and maximal if and only if |p_{1}-p_{2}|=1. For skewed underlying beta distributions, the retained measures, once standardized relative to their mean and variance, were more conservative than the absolute difference in their assessment of magnitude of inequality, demonstrating the potential impact of the underlying distribution.

This paper did not address the difficult methodological issue that different measures may lead to divergent assessments of changes in inequality over time. Theoretical bounds for the range of proportions where the selected measures converge in their assessment of trends are not readily available. Instead,

The proposed standard measurement unit depends on the specification of an underlying distribution. The paper adopted a Bayesian perspective and assumed that the two proportions p_{1} and p_{2} were independent and identically distributed beta random variables; thus, the assumed-known mean and variance of _{1},p_{2}) were calculated directly (albeit via numerical approximation) rather than estimated from data. In practice, analysts could elicit a prior distribution for p_{1} and p_{2} from consensus or expert opinion (_{j}/n_{j} as being overdispersed relative to their binomial variance, leading to the beta-binomial distribution (

The comparative analysis in the paper was restricted to mathematical properties that were formulated following an empirical investigation of two benchmark measures, the absolute difference and the ratio. In selecting from the 11 measures surveyed, the paper did not consider the interpretability and clinical or public health relevance of those measures, nor did it consider ease of communication to stakeholders. The standard measurement unit may offer an easily accessible gauge for the magnitude of inequality, useful in meta-analyses, but its dependence on a potentially elicited prior distribution may remain a barrier to interpretability.

Starting with Healthy People 2020, the Healthy People initiative has moved to considering a suite of measures to examine health disparities instead of relying on a single measure (

Funding: The authors have no funding to report.

Conflict of interest: The authors have no conflict of interest to declare.

Disclaimer: The findings and conclusions in this paper are those of the authors and do not necessarily represent the official position of the National Center for Health Statistics or the Centers for Disease Control and Prevention.

US Government Agreement: Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health 2020. This work is written by (a) US Government employee(s) and is in the public domain in the US.

Possible Values of Selected Inequality Measures (Shaded Areas) as the Absolute Difference Varies from Zero to One

Possible Values of Selected Inequality Measures (Shaded Areas) as the Rescaled Absolute Arcsine Difference Varies from Zero to One, with Three Thresholds for Small, Medium, and Large Cohen’s Effect Sizes

The ranges of values of _{2} for “small”, “medium”, and “large” effect sizes are represented in A) through D), respectively, using the thick vertical line segments. Small, medium, and large effect sizes for the rescaled absolute arcsine difference (

Level Curves of Selected Inequality Measures After Standardization Relative to their Expected Values and Variances for Various Choices of the Underlying Beta Distribution for the Proportions

Level lines for the absolute difference (_{1}-p_{2}| at 0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, and 3.5 standard deviations from the mean E[|p_{1}-p_{2}|] under each of the selected beta distributions—A) beta(0.5,0.5); B) beta(1,1); C) beta(4,4); D) beta(16,16); E) beta(2,6); and F) beta(2,13)—and are labeled using the oblique set of labels shown above the x-axis in each graph. Level curves for the other four measures _{2}, _{1},p_{2}) at 0.0, 1.0, and 2.0 standard deviations from the mean E[_{1},p_{2})] and are labeled using the vertical set of labels shown to the right of each graph. Recall that √Δ denotes the square root of triangle discrimination measure, √_{2} the square root of rescaled Jensen-Shannon divergence,

Summary of Two Mathematical Properties of 11 Measures of Inequality Between Two Proportions^{a}

Measure of inequality between proportions p_{1} and p_{2} | Mathematical expression | Property 4: Uniqueness of maximal inequality, i.e., is the measure maximal if and only if |p_{1}-p_{2}|=1? | Behavior for |p_{1}-p_{2}| near 0, e.g., approximate value of the measure at (p_{2}+δ,p_{2}) for δ approaching 0 from above |
---|---|---|---|
1. Absolute difference | |p_{1}-p_{2}| | Yes: | |
2. Standardized absolute difference, with pooled variance^{b} | | Yes: | |
3. Rescaled absolute arcsine difference^{c} | | Yes: | |
4. Standardized absolute difference^{d} | | Yes: | |
5. Absolute logit difference | | No: maximal even if |p_{1}-p_{2}|<1, e.g., if p_{1}=0. | |
6. Absolute probit difference | | No: maximal even if |p_{1}-p_{2}|<1, e.g., if p_{1}=0. | |
7. Symmetrized chi-squared divergence | | No: maximal even if |p_{1}-p_{2}|<1, e.g., if p_{1}=0. | |
8. Jeffreys divergence | | No: maximal even if |p_{1}-p_{2}|<1, e.g., if p_{1}=0. | |
9. Triangle discrimination measure^{e} | | Yes: | |
10. Rescaled Jensen-Shannon divergence^{f} | | Yes: | |
11. Hellinger distance | | Yes: | |

^{a} All 11 measures satisfy properties 1–3 and 5–7, and equal zero if and only if the two proportions p_{1} and p_{2} are equal. Given the two proportions p_{1} and p_{2}, we define: q_{1}=1-p_{1}, q_{2}=1-p_{2}, p*=(p_{1}+p_{2})/2, and q*=1-p*=(q_{1}+q_{2})/2; OR=(p_{1}/p_{2})/(q_{1}/q_{2}); logit(p)=ln[p/(1-p)]; and probit(p)=Φ^{−1}(p), where Φ(x) is the standard normal cumulative distribution function.

^{b} 2×D_{2} appears in the classical test of the null hypothesis that p_{1}=p_{2}. Additionally, D_{2}=√Δ. D_{2} is resolved by continuity to 0 when the proportions are both 0 or both 1.

^{c} Cohen’s index of effect size for the difference in proportions is π×h; see (37, 38).

^{d} D_{1}(0,0) and D_{1}(1,1) are resolved by continuity to 0.

^{e} Δ(0,0) and Δ(1,1) are resolved by continuity to 0.

^{f} Using the convention 0×ln 0=0, S_{2}(0,0) and S_{2}(1,1) are resolved by continuity to 0.