PLoS ONE (ISSN 1932-6203). Public Library of Science, San Francisco, CA USA.
Research Article. Manuscript PONE-D-19-14500. doi:10.1371/journal.pone.0221433

Cumulative ROC curves for discriminating three or more ordinal outcomes with cutpoints on a shared continuous measurement scale
Short title: Cumulative ROC curves for cutpoints between three or more ordinal outcomes

B. Rey deCastro^{¤}* (http://orcid.org/0000-0002-1610-5849)
National Center for Environmental Health, Centers for Disease Control and Prevention, Atlanta, Georgia, United States of America
Author contributions: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing.
Editor: Alessandro Parolari, University of Milano, ITALY
Competing Interests: The author has declared that no competing interests exist.
Current address: National Center for HIV/AIDS, Viral Hepatitis, STD, and TB Prevention, Centers for Disease Control and Prevention, Atlanta, Georgia, United States of America
* E-mail: rdecastro@cdc.gov
PLoS ONE 14(8): e0221433. Received 22 May 2019; accepted 6 August 2019; published 30 August 2019.
This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
Cumulative receiver operator characteristic (ROC) curve analysis extends classic ROC curve analysis to discriminate three or more ordinal outcome levels on a shared continuous scale. The procedure combines cumulative logit regression with a cumulative extension to the ROC curve and performs as expected with ternary (three-level) ordinal outcomes under a variety of simulated conditions (unbalanced data, proportional and non-proportional odds, areas under the ROC curve [AUCs] from 0.70 to 0.95). Simulations also compared several criteria for selecting cutpoints to discriminate outcome levels: the Youden Index, Matthews Correlation Coefficient, Total Accuracy, and Markedness. Total Accuracy demonstrated the least absolute percent-bias. Cutpoints computed from maximum likelihood regression parameters demonstrated bias that was often negligible. The procedure was also applied to publicly available data related to computer imaging and biomarker exposure science, yielding good to excellent AUCs, as well as cutpoints with sensitivities and specificities of commensurate quality. Implementation of cumulative ROC curve analysis and extension to more than three outcome levels are straightforward. The author’s programs for ternary ordinal outcomes are publicly available.
Funding: The author received no specific funding for this work.
Data Availability: All relevant data are within the manuscript and its Supporting Information files.
Introduction
Classic receiver operator characteristic (ROC) curve analysis addresses the relation of continuous measurements to binary outcomes [1], and enables selection of a cutpoint or threshold on the continuous measurement scale discriminating the outcome levels. From its origins in signal detection theory [2] and application in early radio detection and ranging systems, the technique has been used in fields as diverse as clinical chemistry [3], radiology [4], psychology [5], and machine learning [6–9].
Extension beyond binary outcomes would be desirable for the increased scope of applications. One readily implemented approach is to group multinomial outcome levels into binomial levels and run classic ROC curve analysis, but this loses information and biases test accuracy [10]. There have been other, more sophisticated proposals spanning a range of theoretical approaches [11–18], but the complexity of these noteworthy proposals has limited their application. Additional methods have been implemented and some enjoy broad use [19–22], yet theoretical justification may be sparse.
This paper proposes a two-stage, semiparametric approach combining conventional cumulative logit regression with a cumulative extension of ROC curve analysis to discriminate ordinal outcome levels. The performance of this approach is evaluated under simulation, with comparison of several criteria used with classic ROC curves to select cutpoints. Results from these criteria are compared to cutpoints computed from maximum likelihood estimates (MLEs) of the regression parameters. The procedure is also demonstrated with publicly available data.
Formulation
The classic empirical ROC curve is computed by comparing a binary outcome Y with a continuous measure X where each observed level of X is evaluated as a candidate cutpoint discriminating observed Y = 1 (positive) from Y = 2 (negative). Observations exceeding the candidate cutpoint are classified positive with respect to the continuous measurement, while those less than or equal to the cutpoint are classified negative. As in a 2 × 2 contingency table, the count of correct classifications among positive outcomes comprises the true positives (TP) and among negative outcomes the true negatives (TN). The count of incorrect classifications among negative outcomes comprises the false positives (FP) and among positive outcomes the false negatives (FN). These counts are used to compute: sensitivity, which is the probability that an observation with a positive outcome is correctly classified by a continuous measurement above a candidate cutpoint (sensitivity = TP/[TP+FN]); and specificity, which is the probability that an observed negative outcome is correctly classified by a continuous measurement at or below a candidate cutpoint (specificity = TN/[TN+FP]). Thence, coordinates for the empirical ROC curve are computed where the abscissa is 1 − specificity (= false positive rate; FPR) and the ordinate is sensitivity (= true positive rate; TPR).
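The counting procedure described above can be sketched in a few lines. This is a minimal illustration in Python with NumPy, not the author's SAS implementation; the function name `empirical_roc` and the toy data are hypothetical.

```python
import numpy as np

def empirical_roc(x, y_pos):
    """Return candidate cutpoints with FPR and TPR for a binary outcome.

    x     : continuous measurements
    y_pos : boolean array, True where the outcome is positive
    An observation is classified positive when x exceeds the cutpoint.
    """
    x = np.asarray(x, dtype=float)
    y_pos = np.asarray(y_pos, dtype=bool)
    cutpoints = np.unique(x)              # each observed level is a candidate
    fpr, tpr = [], []
    for c in cutpoints:
        called_pos = x > c
        tp = np.sum(called_pos & y_pos)
        fn = np.sum(~called_pos & y_pos)
        fp = np.sum(called_pos & ~y_pos)
        tn = np.sum(~called_pos & ~y_pos)
        tpr.append(tp / (tp + fn))        # sensitivity
        fpr.append(fp / (fp + tn))        # 1 - specificity
    return cutpoints, np.array(fpr), np.array(tpr)

# Hypothetical toy data: six measurements, three positive outcomes.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_pos = [False, False, True, False, True, True]
cuts, fpr, tpr = empirical_roc(x, y_pos)
```

Each row of (`fpr`, `tpr`) is one coordinate of the empirical ROC curve, indexed by the observed measurement that served as the candidate cutpoint.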
The best cutpoint X* given the data may be identified from the ROC curve coordinates with a criterion that maximizes TPR and minimizes FPR. Cross-referencing the identified ROC curve coordinate with its observed continuous measurement yields the cutpoint distinguishing the binary outcomes. A variety of cutpoint criteria are available, such as the Youden Index, Matthews Correlation Coefficient, and Total Accuracy [23–25]. In addition, the ability of the continuous measurement to discriminate between outcome levels, which is equivalent to the strength of the association between the two, may be represented by the area under the ROC curve (AUC; also known as the c-statistic), which is the probability that an observation with a positive outcome will have a higher continuous measurement than an observation with a negative outcome.
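As an illustration of these two ideas, the sketch below selects a cutpoint with the Youden Index and computes the AUC directly from its probabilistic definition (ties counted as one half, which is a common convention assumed here). The function names and toy data are hypothetical.

```python
import numpy as np

def auc_concordance(x, y_pos):
    """AUC as Pr[X_positive > X_negative], counting ties as one half."""
    x = np.asarray(x, dtype=float)
    y_pos = np.asarray(y_pos, dtype=bool)
    diff = x[y_pos][:, None] - x[~y_pos][None, :]   # all positive-negative pairs
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

def youden_cutpoint(x, y_pos):
    """Observed measurement maximizing sensitivity + specificity - 1."""
    x = np.asarray(x, dtype=float)
    y_pos = np.asarray(y_pos, dtype=bool)
    cuts = np.unique(x)
    called_pos = x[None, :] > cuts[:, None]         # rows: candidate cutpoints
    sens = (called_pos & y_pos).sum(axis=1) / y_pos.sum()
    spec = (~called_pos & ~y_pos).sum(axis=1) / (~y_pos).sum()
    j = sens + spec - 1
    return cuts[np.argmax(j)], j.max()

# Hypothetical toy data: three positive and four negative outcomes.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y_pos = [False, False, False, True, False, True, True]
auc = auc_concordance(x, y_pos)
best_cut, best_j = youden_cutpoint(x, y_pos)
```

With these data, 11 of the 12 positive-negative pairs are concordant, and the Youden Index peaks at the observed measurement 3.0.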
Since the ROC curve describes the relationship between a binary outcome and continuous predictor, it is directly related to logistic regression [26–28]. For the binary outcome where Y = 2 is the reference outcome level, let π_{1} = Pr[Y = 1]. The univariate logistic model with continuous predictor X and linear parameters {α, β} is:
$$\operatorname{logit}(\pi_{1}) = \log\left(\frac{\pi_{1}}{1-\pi_{1}}\right) = \alpha + \beta X \tag{1}$$
Let \hat{π}_{1i} be the probability that Y = 1 predicted by the regression model at the ith observation X_{i} for i = 1, …, N where N is the number of observations. Analogous to the approach above, each predicted probability may serve as a candidate cutpoint discriminating Y = 1 from Y = 2. Coordinates comprising the ROC curve may then be computed, except they are based on counts on the probability scale monotonically transformed by the regression model from the original continuous scale. As above, the best probability cutpoint \hat{π}^{*} may be selected with a suitable optimality criterion. The cutpoint on the scale of the continuous predictor X* can be recovered by cross-referencing \hat{π}^{*} with its corresponding observed measurement. Whether using the scale of the continuous predictor or the predicted probability, the resulting ROC curves are identical because the curve is rank-based and invariant to monotonic transformations of the continuous predictor [1]. The first stage of the proposed approach exploits this invariance through a generalization of the logistic model embodied by the cumulative logit model.
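The invariance can be checked numerically. The sketch below assumes a positive slope, so that the logistic transformation preserves the ordering of the measurements; the parameter values (0.5 and 1.5), the seed, and the helper `roc_points` are hypothetical choices for illustration only.

```python
import numpy as np

def roc_points(score, y_pos):
    """FPR and TPR at every distinct value of `score` (positive if score > cut)."""
    score = np.asarray(score, dtype=float)
    y_pos = np.asarray(y_pos, dtype=bool)
    cuts = np.unique(score)
    called_pos = score[None, :] > cuts[:, None]
    tpr = (called_pos & y_pos).sum(axis=1) / y_pos.sum()
    fpr = (called_pos & ~y_pos).sum(axis=1) / (~y_pos).sum()
    return fpr, tpr

rng = np.random.default_rng(0)
x = rng.normal(size=200)                           # continuous predictor
p_hat = 1 / (1 + np.exp(-(0.5 + 1.5 * x)))         # logistic transform, assumed alpha and beta
y_pos = rng.random(200) < p_hat                    # binary outcome drawn from the model

fpr_x, tpr_x = roc_points(x, y_pos)                # ROC on the measurement scale
fpr_p, tpr_p = roc_points(p_hat, y_pos)            # ROC on the probability scale
same_curve = np.allclose(fpr_x, fpr_p) and np.allclose(tpr_x, tpr_p)
```

Because the transformation is strictly increasing here, the kth smallest measurement and the kth smallest predicted probability classify exactly the same observations as positive, so `same_curve` evaluates to `True`.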
Stage 1: Cumulative logit model
The cumulative logit regression model predicts probabilities for an ordinal outcome Y = j with j = 1, …, J levels, where for demonstration J = 3 and the reference outcome level is Y = 3. Let π_{j} = Pr[Y ≤ j], then with continuous predictor X and linear parameters {α_{j}, β_{j}} the cumulative logit regression model is:
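$$\operatorname{logit}(\pi_{j}) = \log\left(\frac{\pi_{j}}{1-\pi_{j}}\right) = \alpha_{j} + \beta_{j} X, \quad \text{for } j = 1, \ldots, J-1 \tag{2}$$
For each jth outcome level up to J − 1, a cumulative logit is estimated with its own regression parameters {\hat{α}_{j}, \hat{β}_{j}}. The formulation in Eq 2 is known as non-proportional odds where the log-odds \hat{β}_{j} differs between outcome levels. Simplification is possible with the proportional odds formulation where constant log-odds β_{j} = β is assumed among outcome levels [29]. Let \hat{π}_{ij} = \hat{Pr}[Y_{i} ≤ j | X_{i}], then in terms of the parameter estimates {\hat{α}_{j}, \hat{β}_{j}} the predicted cumulative probability for the jth outcome level at the ith observation is:
$$\hat{\pi}_{ij} = \frac{\exp(\hat{\alpha}_{j} + \hat{\beta}_{j} X_{i})}{1 + \exp(\hat{\alpha}_{j} + \hat{\beta}_{j} X_{i})}, \quad \text{for } j = 1, \ldots, J-1 \tag{3}$$
and the predicted individual probability is the difference of adjacent cumulative probabilities:
$$\hat{\pi}_{ij} - \hat{\pi}_{i,j-1} = \frac{\exp(\hat{\alpha}_{j} + \hat{\beta}_{j} X_{i})}{1 + \exp(\hat{\alpha}_{j} + \hat{\beta}_{j} X_{i})} - \frac{\exp(\hat{\alpha}_{j-1} + \hat{\beta}_{j-1} X_{i})}{1 + \exp(\hat{\alpha}_{j-1} + \hat{\beta}_{j-1} X_{i})} \tag{4}$$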
For J = 2 this model reduces to the logistic model, but the cumulative logit model is similarly able to transform the continuous predictor to the predicted probability scale except that each outcome level gets a predicted probability function.
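A minimal sketch of Eqs 3 and 4 follows. It assumes the usual boundary conventions Pr[Y ≤ 0] = 0 and Pr[Y ≤ J] = 1 so that the first and last levels also receive individual probabilities; these conventions, the function names, and the example parameters are assumptions for illustration.

```python
import numpy as np

def cumulative_probs(x, alphas, betas):
    """Eq 3: predicted Pr[Y <= j | x] for j = 1..J-1; result has shape (N, J-1)."""
    x = np.asarray(x, dtype=float)[:, None]
    eta = np.asarray(alphas, dtype=float)[None, :] + np.asarray(betas, dtype=float)[None, :] * x
    return 1.0 / (1.0 + np.exp(-eta))

def individual_probs(x, alphas, betas):
    """Eq 4: predicted Pr[Y = j | x] for j = 1..J; result has shape (N, J)."""
    cum = cumulative_probs(x, alphas, betas)
    n = cum.shape[0]
    # Pad with Pr[Y <= 0] = 0 and Pr[Y <= J] = 1 (assumed conventions), then difference.
    padded = np.hstack([np.zeros((n, 1)), cum, np.ones((n, 1))])
    return np.diff(padded, axis=1)

# Example with J = 3 and a shared slope (proportional odds); values are illustrative.
alphas, betas = [-1.70, 1.70], [0.34, 0.34]
cum = cumulative_probs([5.0, -5.0], alphas, betas)
ind = individual_probs([5.0, -5.0], alphas, betas)
```

Each row of `ind` sums to one, and with these parameters the first cumulative probability equals 0.5 at x = 5 while the second equals 0.5 at x = −5.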
We have so far recalled that ROC curves are invariant to monotonic transformation of the continuous measurement, including transformation by a logistic regression model to the predicted probability scale. In addition, we have reviewed the cumulative logit model and its transformation of a single continuous predictor to a series of separate predicted probabilities for each level of the ordinal outcome. These probabilities are comprehensive and mutually exclusive with respect to the outcome (∑_{j} \hat{Pr}[Y_{i} = j | X_{i}] = 1) and are suitable for computing a series of “cumulative” ROC curves.
Stage 2: Cumulative ROC curves
Calculation of the classic ROC curve on the predicted probability scale can be readily extended to count TP, TN, FP, and FN for each cumulative logit, resulting in J − 1 cumulative ROC curves. For the cumulative logit associated with the jth outcome, let p_{jk} be the kth candidate cumulative probability cutpoint from among the \hat{π}_{ij}, then one may count TP_{jk}, TN_{jk}, FP_{jk}, and FN_{jk} with the indicator function I(⋅) by comparing the outcome Y_{i} with \hat{π}_{ij} vs. p_{jk} for i = 1, …, N; j = 1, …, J − 1; and k = 1, …, N:
$$\begin{aligned}
TP_{jk} &= \sum_{i} I(\hat{\pi}_{ij} > p_{jk} \text{ AND } Y_{i} \le j) \\
TN_{jk} &= \sum_{i} I(\hat{\pi}_{ij} \le p_{jk} \text{ AND } Y_{i} > j) \\
FP_{jk} &= \sum_{i} I(\hat{\pi}_{ij} > p_{jk} \text{ AND } Y_{i} > j) \\
FN_{jk} &= \sum_{i} I(\hat{\pi}_{ij} \le p_{jk} \text{ AND } Y_{i} \le j)
\end{aligned} \tag{5}$$
From these counts, the coordinates (FPR_{jk}, TPR_{jk}) for the jth cumulative ROC_{j} curve can be computed, where FPR_{jk} = 1 − (TN_{jk}/[TN_{jk} + FP_{jk}]) and TPR_{jk} = (TP_{jk}/[TP_{jk} + FN_{jk}]). Continuing with the case of the ternary ordinal outcome, p_{1k} is the kth candidate cutpoint from the first cumulative logit and p_{2k} from the second, so that the cumulative ROC_{1} curve discriminates between Y = 1 vs. Y = 2 or 3, and the cumulative ROC_{2} curve discriminates between Y = 1 or 2 vs. Y = 3. Analogous to the binary case, a probability cutpoint for the jth outcome level \hat{π}_{j}^{*} may be selected from its respective cumulative ROC_{j} curve using a suitable criterion. The cutpoint on the scale of the continuous predictor X_{j}^{*} is recovered by cross-referencing \hat{π}_{j}^{*} with its corresponding observed measurement.
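The counts in Eq 5 and the cross-referencing step can be sketched as follows. This is an illustrative Python version, not the author's SAS programs; the function `cumulative_roc`, the toy data, and the use of fixed rather than fitted parameters (the proportional odds values designated in the Simulations section) are assumptions.

```python
import numpy as np

def cumulative_roc(p_hat_j, y, j):
    """Counts and coordinates for the jth cumulative ROC curve (Eq 5).

    p_hat_j : predicted cumulative probabilities Pr[Y <= j | X_i]
    y       : observed ordinal outcomes coded 1..J
    Each distinct predicted probability serves as a candidate cutpoint p_jk.
    """
    p_hat_j = np.asarray(p_hat_j, dtype=float)
    event = np.asarray(y) <= j                     # "positive" means Y_i <= j
    cuts = np.unique(p_hat_j)
    above = p_hat_j[None, :] > cuts[:, None]       # rows index candidates k
    tp = (above & event).sum(axis=1)
    fn = (~above & event).sum(axis=1)
    fp = (above & ~event).sum(axis=1)
    tn = (~above & ~event).sum(axis=1)
    return {"cut": cuts, "fpr": fp / (fp + tn), "tpr": tp / (tp + fn),
            "tp": tp, "tn": tn, "fp": fp, "fn": fn}

# Hypothetical toy data with a ternary outcome.
x = np.array([8.0, 6.0, 4.0, 1.0, -1.0, -4.0, -6.0, -8.0])
y = np.array([1, 1, 2, 2, 2, 3, 2, 3])
p1 = 1 / (1 + np.exp(-(-1.70 + 0.34 * x)))         # Pr[Y <= 1 | x]
p2 = 1 / (1 + np.exp(-(1.70 + 0.34 * x)))          # Pr[Y <= 2 | x]
roc1 = cumulative_roc(p1, y, j=1)                  # Y = 1 vs. Y = 2 or 3
roc2 = cumulative_roc(p2, y, j=2)                  # Y = 1 or 2 vs. Y = 3

# Cross-reference a chosen probability cutpoint with its observed measurement.
k = 5
x_at_k = x[np.isclose(p1, roc1["cut"][k])][0]
```

Any cutpoint criterion can then be evaluated on the returned counts, and the index of its maximum plays the role of `k` in the cross-referencing step.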
Alternatively, ROC curve analysis may be forgone altogether by computing cutpoints from the MLE cumulative logit regression parameters, where X_{j}^{*} = −(\hat{α}_{j}/\hat{β}_{j}). Since this parametric cutpoint is the ratio of two model parameters, both the Delta Method and Fieller’s Method are applicable for computing the variance [30]. Fieller’s Method is favored, however, since it tends to provide better coverage despite potential asymmetry of the confidence interval [31, 32]. In addition, Hirschberg and Lye 2010 [31] recommend Fieller’s Method when the computed ratios are positive and correlation between the numerator and denominator is negative. Accordingly, the standard deviation s_{X*} is estimated here with Fieller’s Method [33] when computing confidence intervals for parametric cutpoints as ± (t_{df = 2,1−(α/2)} × s_{X*}).
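For orientation, the parametric cutpoint can be read as the measurement at which the predicted cumulative probability equals one half, since the logit is then zero. The first-order Delta Method variance of the ratio is shown alongside it as a worked reference; it is the standard textbook approximation, not the Fieller calculation used in this paper.

$$\hat{\alpha}_{j} + \hat{\beta}_{j} X_{j}^{*} = 0 \quad\Longrightarrow\quad X_{j}^{*} = -\frac{\hat{\alpha}_{j}}{\hat{\beta}_{j}}$$

$$\operatorname{Var}(X_{j}^{*}) \approx \frac{1}{\hat{\beta}_{j}^{2}} \left[ \operatorname{Var}(\hat{\alpha}_{j}) + (X_{j}^{*})^{2} \operatorname{Var}(\hat{\beta}_{j}) + 2 X_{j}^{*} \operatorname{Cov}(\hat{\alpha}_{j}, \hat{\beta}_{j}) \right]$$

The second expression follows from the partial derivatives ∂X*/∂α = −1/β and ∂X*/∂β = α/β^{2} evaluated at the estimates.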
Simulations
Cumulative ROC curve analysis for a ternary ordinal outcome was evaluated under conditions simulating AUCs = 0.70, 0.75, 0.85, 0.90, and 0.95. Cutpoints for the continuous predictor were set at X_{2}^{*} = −5 and X_{3}^{*} = 5 by designating α_{j} and β_{j} based on the relationship X_{j}^{*} = −(α_{j}/β_{j}). Random variates of the continuous predictor were obtained from a normal distribution X_{i} ∼ N(0, σ^{2} = 100) truncated at the 10th and 90th percentiles. Truncation improved the chances of obtaining random variates that would successfully converge to a maximum likelihood solution for the regression model. Random variates of the ternary outcome Y_{i} were then obtained from a multinomial distribution defined by probabilities computed from Eq 4 with α_{j}, β_{j}, and random variates X_{i}. For the proportional odds condition with AUC_{1} = AUC_{2} = 0.90, parameters were designated at α_{1} = −1.70, α_{2} = 1.70, and β = 0.34. For the first non-proportional odds condition (referred to as the NPO1 condition) with AUC_{1} = 0.75 and AUC_{2} = 0.85, parameters were designated at α_{1} = −0.75, β_{1} = 0.15 and α_{2} = 1.25, β_{2} = 0.25; and for the second (NPO2 condition) with AUC_{1} = 0.70 and AUC_{2} = 0.95, parameters were α_{1} = −0.70, β_{1} = 0.14 and α_{2} = 4.70, β_{2} = 0.94. For each condition, 10,000 datasets were simulated with nested sample sizes n = 75, 150, and 300 unequally allocated among the outcome levels. A cumulative logit regression model was fit to each dataset and cumulative ROC curves computed. Simulations were run with the FREQ, LOGISTIC, and SURVEYSELECT procedures of the SAS^{®} software application, version 9.4 [34].
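The data-generating step for the proportional odds condition might be sketched as below. This is a simplified Python illustration and not the author's SAS code: it draws outcomes by inverting the cumulative probabilities, uses rejection sampling for the truncated normal, and does not reproduce the unequal allocation of sample sizes among outcome levels. The function name, seed, and sampling strategy are assumptions.

```python
import numpy as np

Z_90 = 1.2815515655446004   # 90th percentile of the standard normal distribution

def simulate_dataset(n, alphas, beta, rng, sd=10.0):
    """Draw (x, y) for one dataset under proportional odds (Eq 2 with beta_j = beta)."""
    bound = Z_90 * sd                       # 10th and 90th percentiles are -bound, +bound
    x = np.empty(0)
    while x.size < n:                       # rejection sampling for the truncated normal
        draw = rng.normal(0.0, sd, size=2 * n)
        x = np.concatenate([x, draw[np.abs(draw) <= bound]])
    x = x[:n]
    # Cumulative probabilities Pr[Y <= j | x] for j = 1..J-1 (Eq 3).
    cum = 1.0 / (1.0 + np.exp(-(np.asarray(alphas)[None, :] + beta * x[:, None])))
    # Inverse-CDF draw: Y = 1 + number of cumulative probabilities below u.
    u = rng.random(n)
    y = 1 + (u[:, None] > cum).sum(axis=1)
    return x, y

rng = np.random.default_rng(1)
x_sim, y_sim = simulate_dataset(300, alphas=(-1.70, 1.70), beta=0.34, rng=rng)
```

The inverse-CDF draw is equivalent to sampling from the multinomial probabilities of Eq 4 as long as the cumulative probabilities are ordered in j, which holds under proportional odds when α_{1} < α_{2}.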
Several cutpoint selection criteria were evaluated for their ability to correctly identify designated cutpoints from cumulative ROC curves: the Youden Index (also known as Informedness and ΔP′), Matthews Correlation Coefficient, Total Accuracy, and Markedness (ΔP) [35, 36]. These criteria and their ranges are presented in Table 1. Each criterion embodies certain merits, but all achieve their optimal level at the ROC curve coordinate where the criterion is at its observed maximum. Cutpoints were also computed directly from MLE cumulative logit regression parameters.
Table 1. Cutpoint selection criteria based on evaluation of empirical ROC curves.

| Criterion | Formula | Range |
|---|---|---|
| Youden Index (or Informedness, ΔP′) | sensitivity + specificity − 1 | (0, 1) |
| Matthews Correlation Coefficient | ((TP × TN) − (FP × FN)) / ([TP + FP][TP + FN][TN + FP][TN + FN])^{1/2} | (−1, 1) |
| Total Accuracy | (TP + TN) / (TP + FN + TN + FP) | (0, 1) |
| Markedness (ΔP) | TP / (TP + FP) + TN / (TN + FN) − 1 | (0, 1) |
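The four criteria can be computed from the cell counts of a single candidate cutpoint, as in this short sketch. The function name and the example counts are hypothetical.

```python
import math

def cutpoint_criteria(tp, tn, fp, fn):
    """Evaluate the four cutpoint selection criteria at one candidate cutpoint."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "youden": sensitivity + specificity - 1,
        "mcc": (tp * tn - fp * fn)
               / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        "total_accuracy": (tp + tn) / (tp + fn + tn + fp),
        "markedness": tp / (tp + fp) + tn / (tn + fn) - 1,
    }

# Hypothetical counts at one candidate cutpoint.
crit = cutpoint_criteria(tp=40, tn=45, fp=5, fn=10)
```

Applied to every candidate cutpoint along a curve, each criterion selects the coordinate at which its value is largest.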
Tables 2 and 3 confirm that distributions realized during the proportional odds and NPO1 simulations were approximately centered at the levels designated above for α_{j}, β_{j}, and AUC. Table 4 shows, however, that for the NPO2 condition the medians of the realized distributions for α_{j} and β_{j} were about 14–30 percent above designated levels for α_{1}, α_{2}, and β_{2} and about 14 percent below for β_{1}. Designated levels for all regression parameters, however, were between the 2.5th and 97.5th percentiles of their realized distributions, and the realized AUCs were centered on their designated values.
Table 2. Parameter estimates and AUCs realized from cumulative ROC curve analysis of 10,000 simulated datasets parameterized with proportional odds and AUC_{1} = AUC_{2} = 0.90.
Tables 5–7 display the median, 2.5th and 97.5th percentiles, and percent-bias of cutpoints selected by each criterion across sample sizes n. Percent-biases are the median of percent-differences between realized cutpoints (selected and parametric) and designated cutpoints. The ROC curve-based cutpoint selection criteria exhibited a range of biases. Among both proportional (Table 5) and non-proportional (Tables 6 and 7) odds conditions, absolute values of the percent-biases ranged from 2.8–144.2 percent. Total Accuracy demonstrated the best performance with biases ranging from 2.8–11.7 percent, while the other criteria performed considerably worse.
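The percent-bias summary can be sketched as follows. Scaling each difference by the absolute value of the designated cutpoint is an assumption made here so that the sign of the bias is comparable for positive and negative cutpoints; the function name and example values are hypothetical.

```python
import numpy as np

def percent_bias(realized, designated):
    """Median of percent-differences between realized and designated cutpoints."""
    realized = np.asarray(realized, dtype=float)
    pct_diff = 100.0 * (realized - designated) / abs(designated)
    return float(np.median(pct_diff))

# Hypothetical realized cutpoints from three simulated datasets, designated cutpoint 5.
bias_upper = percent_bias([2.50, 2.87, 3.10], designated=5.0)
# A single realized cutpoint against a designated cutpoint of -5.
bias_lower = percent_bias([-2.93], designated=-5.0)
```

A realized cutpoint of 2.87 against a designated 5 gives −42.6 percent, and −2.93 against −5 gives +41.4 percent under this convention.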
Table 5. Percentiles and biases of cutpoints selected from cumulative ROC curves with several criteria and computed parametrically: 10,000 simulated datasets parameterized for proportional odds and AUC_{1} = AUC_{2} = 0.90.
Table 6. Percentiles and biases of cutpoints selected from cumulative ROC curves with several criteria and computed parametrically: 10,000 simulated datasets parameterized for non-proportional odds, AUC_{1} = 0.75, and AUC_{2} = 0.85.
Table 7. Percentiles and biases of cutpoints selected from cumulative ROC curves with several criteria and computed parametrically: 10,000 simulated datasets parameterized for non-proportional odds, AUC_{1} = 0.70, and AUC_{2} = 0.95.

| Cutpoint | Criterion | n = 75: %Bias | n = 75: Median [2.5th, 97.5th %ile] | n = 150: %Bias | n = 150: Median [2.5th, 97.5th %ile] | n = 300: %Bias | n = 300: Median [2.5th, 97.5th %ile] |
|---|---|---|---|---|---|---|---|
| 5.00 | Youden Index | −42.6 | 2.87 [−2.16, 8.43] | −43.4 | 2.83 [−1.30, 7.12] | −45.5 | 2.73 [−0.66, 6.25] |
| 5.00 | Total Accuracy | 4.6 | 5.23 [−2.74, 11.55] | 5.0 | 5.25 | 3.3 | 5.17 [1.28, 9.38] |
| 5.00 | Matthews Correlation | −28.3 | 3.59 [−3.47, 10.90] | −30.3 | 3.48 [−2.89, 10.14] | −32.9 | 3.36 [−1.83, 8.74] |
| 5.00 | Markedness | 105.2 | 10.26 [−5.38, 12.72] | 131.8 | 11.59 [−5.76, 12.78] | 143.9 | 12.2 [−6.08, 12.80] |
| 5.00 | Parametric −(\hat{α}_{j}/\hat{β}_{j}) | −1.4 | 4.93 [−8.12, 30.83] | 1.5 | 5.07 [0.94, 14.99] | 2.3 | 5.11 [2.55, 9.54] |
| −5.00 | Youden Index | 41.4 | −2.93 [−4.77, −0.28] | 44.1 | −2.79 [−4.46, −0.59] | 46.4 | −2.68 [−4.04, −0.96] |
| −5.00 | Total Accuracy | 9.4 | −4.53 [−6.17, −2.73] | 5.0 | −4.75 [−6.07, −3.38] | 2.8 | −4.86 [−5.98, −3.75] |
| −5.00 | Matthews Correlation | 24.5 | −3.78 [−5.77, −1.37] | 20.1 | −3.99 [−5.58, −2.09] | 19.7 | −4.02 [−5.31, −2.45] |
| −5.00 | Markedness | 2.3 | −4.88 [−6.55, −2.06] | −8.0 | −5.4 [−6.59, −3.43] | −16.0 | −5.8 [−6.63, −4.35] |
| −5.00 | Parametric −(\hat{α}_{j}/\hat{β}_{j}) | 1.5 | −4.93 [−5.96, −3.76] | −0.1 | −5.00 [−5.75, −4.20] | −1.3 | −5.06 [−5.59, −4.51] |
Forgoing ROC curve analysis and calculating cutpoints from the MLE regression parameters yielded small, often negligible absolute percent-biases (≤2.3 percent) for both proportional and non-proportional odds conditions. In addition, across all sample sizes, parametric cutpoints consistently out-performed ROC curve-based cutpoint selection criteria.
Notably, for the NPO2 condition (Table 7), divergence of the realized cumulative logit parameters from designated values did not entail discrepancies in realized cutpoints compared to the other simulation conditions, whether the cutpoints were selected by criteria or calculated parametrically.
Absolute percent-bias for Total Accuracy cutpoints usually diminished with increasing sample size from n = 75 to 300. The only exception was for the upper cutpoint (X = 5.00) in the NPO2 simulation condition, where absolute percent-bias worsened slightly at n = 150. In addition, the absolute percent-bias of lower Total Accuracy cutpoints was usually greater than that of upper cutpoints, except for n = 150 and 300 of the NPO2 condition. For parametric cutpoints, absolute percent-bias was negligible at all sample sizes, except in the NPO2 condition, where although cutpoint bias was the lowest within the condition (0.1–2.3 percent), it was slightly greater compared to other conditions (0.0–0.3 percent).
Real-World Data
Two publicly available datasets with ternary ordinal outcomes were analyzed with the cumulative ROC curve approach where cutpoints were selected with the Total Accuracy criterion and computed parametrically. Fig 1 displays histograms for each dataset overlaid with Total Accuracy cutpoints, while Fig 2 shows the cutpoints on their respective cumulative ROC curves.
Fig 1. Percent distributions overlaid with cutpoints (dashed lines) selected from cumulative ROC curves with the Total Accuracy criterion.
A: Cork Stopper Quality (N = 150): total defective area [px] by cork stopper quality levels. B: NHANES Tobacco Smoke Exposure (N = 16,900): natural log of urinary NNAL [ng/mL] by tobacco smoke exposure levels.
Fig 2. Cumulative ROC curves (focused on upper-left quadrant) indicating coordinates for Total Accuracy cutpoints.
A: Cork Stopper Quality: poor vs. normal, superior quality (solid line) and poor, normal vs. superior (dashed line). B: NHANES Tobacco Smoke Exposure: none vs. second-hand smoke (SHS), smoker (solid line) and none, SHS vs. smoker (dashed line).
Tables 8 and 9 present the Total Accuracy and parametric cutpoints, as well as their sensitivities, specificities, and AUCs. Confidence intervals for parametric cutpoints were calculated with Fieller’s Method [33] and for AUCs with Wald’s Method [37]. Cumulative logit regression models and cumulative ROC curves were computed with the FREQ and LOGISTIC procedures of the SAS software application, version 9.4 [34].
Table 8. Cutpoints of total defective area [px] discriminating cork stopper quality levels were selected from cumulative ROC curves with Total Accuracy criterion and computed parametrically. Proportional odds were assumed among quality levels.
Table 9. Cutpoints of the natural log of urinary NNAL [ng/mL] discriminating tobacco smoke exposure levels. Cutpoints were selected from cumulative ROC curves with Total Accuracy criterion and computed parametrically. Non-proportional odds were assumed among exposure levels.

| Method | Categories | Cutpoint [95% CI] | Sensitivity (Sn) | Specificity (Sp) | AUC [95% CI] |
|---|---|---|---|---|---|
| Total Accuracy | Non-Exposed vs. SHS, Exclusive Smokers | −4.092 | 0.9597 | 0.8060 | 0.9535 [0.9497, 0.9573] |
| Total Accuracy | Non-Exposed, SHS vs. Exclusive Smokers | −3.168 | 0.9689 | 0.8458 | 0.9679 [0.9646, 0.9712] |
| Parametric | Non-Exposed vs. SHS, Exclusive Smokers | −4.053 [−4.108, −3.998] | 0.8029 | 0.9602 | n/a |
| Parametric | Non-Exposed, SHS vs. Exclusive Smokers | −3.264 [−3.319, −3.208] | 0.8537 | 0.9661 | n/a |
Cork stopper quality
The data comprise measurements of material defects appearing in digital images of cork stoppers [38, 39], available in S1 File. An automated image processing system scanned cork defects and quantified several characteristics, including the number, area, and perimeter of the defects. Fifty cork stoppers were quantified in each of three quality levels subjectively assigned by human experts (N = 150), where Y = 1 (poor), 2 (normal), and 3 (superior). In the Stage 1 cumulative logit model, cork stopper quality was predicted by the total number of pixels with defects [px]. The score test for proportional odds (p-value = 0.31) supported a proportional odds configuration for the model. The parameter estimates are: \hat{α}_{1} = −13.64, \hat{α}_{2} = −7.05, and \hat{β}_{area} = −0.036.
The ability of defect area to discriminate cork stopper quality is excellent, with AUCs > 0.97 for both cumulative ROC curves (Table 8). Total Accuracy identified cutpoints where the total number of pixels with defects were 205 (distinguishing poor or normal quality vs. superior) and 369 (poor vs. normal, superior), and both had excellent sensitivities and specificities > 0.93. Parametrically computed cutpoints were at 194.8 [95%CI: 177.1, 213.7] and 376.8 [352.9, 403.9] pixels, with sensitivities and specificities comparable to those of the Total Accuracy cutpoints, although specificity for the lower cutpoint and sensitivity for the upper cutpoint were somewhat attenuated.
NHANES tobacco smoke exposure
Human exposure to chemicals can be estimated from measurements of trace compounds in samples of human urine. Some of these compounds, known as biomarkers, are associated with exposure to tobacco smoke, which may arise either from direct inhalation while smoking, or from indirect inhalation of tobacco smoke present in the environment (i.e., second-hand tobacco smoke; SHS). One such biomarker is a tobacco-specific N-nitrosamine known as NNAL (4-[methylnitrosamino]-1-[3-pyridyl]-1-butanol; CAS No. 76014-81-8), which is present in both mainstream tobacco smoke and smokeless tobacco products. NNAL was measured in urine from a representative multi-level sample of the United States civilian population ≥ 12 years old (N = 16,900) obtained during the 2007–2012 cycles of the National Health and Nutrition Examination Survey (NHANES) [40], available in S2 File. Subjects reported being in one of three ordinal exposure categories: non-exposed subjects who neither used tobacco products nor were exposed to SHS (Y = 1; n_{1} = 12,372); SHS-exposed subjects who did not smoke tobacco (Y = 2; n_{2} = 927); and exclusive tobacco smokers (Y = 3; n_{3} = 3,691). In order to eliminate a non-combustible source of NNAL, subjects were excluded from analysis if they reported using smokeless tobacco. The natural log of urinary NNAL concentration predicted self-reported exposure categories in the Stage 1 cumulative logit model. The score test (p-value <0.001) supported a non-proportional odds configuration for the model. The parameter estimates are: \hat{α}_{1} = −4.60, \hat{α}_{2} = −4.08, \hat{β}_{ln(NNAL),1} = −1.13, and \hat{β}_{ln(NNAL),2} = −1.25.
The ability of ln(NNAL) to discriminate ternary tobacco smoke exposure levels is excellent with AUCs > 0.95 for both cumulative ROC curves (Table 9). Total Accuracy identified cutpoints at ln(NNAL) concentrations of −4.092 (non-exposed vs. SHS-exposed, smokers) and −3.168 ng/mL (non-exposed, SHS-exposed vs. smokers). Exponentiated, the respective cutpoints are 16.71 and 42.09 pg/mL. Since the non-proportional odds configuration permits each cumulative ROC curve to differ in discriminatory power, the cumulative ROC curve associated with the lower cutpoint had an AUC of 0.9535 [95%CI: 0.9497, 0.9573], while the upper cutpoint’s curve had an AUC that was slightly, but significantly better at 0.9679 [0.9646, 0.9712]. Parametric cutpoints are at −4.053 [95%CI: −4.108, −3.998] and −3.264 [−3.319, −3.208] ng/mL (exponentiated: 17.37 [16.44, 18.35] and 38.24 [36.19, 40.44] pg/mL, respectively). Total Accuracy’s upper cutpoint was above the parametric upper cutpoint’s upper 95 percent confidence limit, but it is unclear which is preferable. The Total Accuracy upper cutpoint had excellent sensitivity (0.9689) and good specificity (0.8458), but this was reversed for the parametric upper cutpoint, which had good sensitivity (0.8537) and excellent specificity (0.9661). Another basis for comparison is the Total Accuracy criterion, which can be calculated for parametric cutpoints from their TP_{j}, TN_{j}, FP_{j}, and FN_{j}. This, too, failed to be conclusive since the values of the criterion for the Total Accuracy vs. parametric upper cutpoints were hardly different at 0.9421 vs. 0.9417.
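The back-transformation of the cutpoints is simple arithmetic: exponentiate the natural-log concentration to obtain ng/mL, then multiply by 1,000 to express it in pg/mL. A small sketch, with a hypothetical helper name:

```python
import math

def ln_ng_per_ml_to_pg_per_ml(ln_conc):
    """Convert a natural-log concentration in ng/mL to a concentration in pg/mL."""
    return math.exp(ln_conc) * 1000.0

# Total Accuracy cutpoints on the ln(NNAL) scale, as reported in the text.
total_accuracy_cutpoints = [-4.092, -3.168]
in_pg_per_ml = [round(ln_ng_per_ml_to_pg_per_ml(c), 2) for c in total_accuracy_cutpoints]
```

The same conversion applies to the parametric cutpoints and to the limits of their confidence intervals, because exponentiation is monotonic.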
Discussion
The cumulative logit model subsumes multinomial ordinal outcome levels within a single model, yet each outcome level gets its own cumulative logit, so that predicted individual probabilities for each level (Eq 4) are mutually exclusive, comprehensive over the outcome levels, and sum to unity for each observation of the continuous measurement. Another appeal of the model is that its predicted probabilities (Eqs 3 and 4) change in direct proportion to the continuous measurement across all outcome levels. Even more, the ordinality of the outcome ensures that cutpoints separate successive pairs of adjacent outcome levels.
The assumption of proportional odds constrains the log-odds of the continuous predictor to be constant for all levels of the ordinal outcome. This imposes statistical equivalence on the AUCs of the cumulative ROC curves, so that the ROC curves will appear approximately overlapped. In contrast, when the log-odds of the predictor are non-proportional, which represents varying strength in the continuous predictor’s association at each outcome level, the AUCs of the cumulative ROC curves will differ and the curves will appear nested. Notably, the rank-order of the AUCs (and hence the order of nesting) is independent of the order of the ordinal outcome levels. This flexibility may be especially desirable in certain settings, such as in a clinical trial where a medication may be associated with greater potency at the worst level of the health outcome.
Evaluated under simulation, cumulative ROC curve analysis performed as expected for a variety of conditions, but with the qualification that if ROC curve-based cutpoint criteria are to be used, results from simulated unbalanced data indicate that Total Accuracy yields minimally biased cutpoints compared to the Youden Index, Matthews Correlation Coefficient, and Markedness. In contrast to cutpoints selected by criteria, parametric cutpoints have the advantage of being maximum likelihood and consequently had absolute percent-biases that were less than Total Accuracy’s and were often negligible.
The previously noted divergence of the cumulative logit parameters in the NPO2 simulation condition also suggests that caution may be warranted in some non-proportional odds situations, particularly when AUCs of the cumulative ROC curves are widely separated, as in NPO2. If, however, the primary aim is cutpoint estimation, the NPO2 condition indicates that estimated cutpoints were robust against divergence in the logit parameters. Moreover, qualitative results from the NPO2 condition regarding cutpoint selection criteria and parametric cutpoints were consistent with those from the proportional odds and NPO1 simulations.
Analysis of real-world data demonstrated that cumulative ROC curve analysis yields reasonable results. Continuous measurements in both datasets displayed varying degrees of overlap among the ternary outcome levels. The tobacco smoke exposure data were relatively large, but also strikingly unbalanced across the outcome levels, especially at the intermediate outcome level. The intermediate SHS-exposed category was small (5.5 percent) and the distribution was skewed toward the extreme exposure levels (72.8 percent non-exposed vs. 21.7 percent smokers). Notwithstanding, the cumulative ROC curve approach identified cutpoints with good to excellent sensitivity and specificity.
The cumulative ROC curve approach readily generalizes to more than three outcome levels through specification of the cumulative logit model. Nonetheless, discriminating discrete outcome levels postulates that the continuous measurement is associated with an a priori number of latent and ordinal classes. If the cumulative logit model in Stage 1 specifies an outcome with J > 2 ordinal levels, determination of cutpoints may be difficult if the outcome is actually binomial or is otherwise different than assumed. The magnitude of this difficulty may be revealed in exploratory data analysis, by poor model fit, and by cutpoints with poor sensitivity and specificity. For the tobacco smoke exposure data, although prior assumption of an intermediate outcome level (i.e., secondhand smoke-exposed) was plausible, there was cause for doubt since this level was observed infrequently. In addition, the infrequency of the SHS-exposed outcome level contrasts with the simulated data, where use of a normal distribution as a source of random variates for the continuous predictor leads to more frequent intermediate outcomes (~40 percent) compared to the extreme outcomes (~20 percent each). These considerations notwithstanding, the natural log of urinary NNAL was an excellent discriminator of the three tobacco smoke exposure levels.
The proposed approach admits alternative formulations of the Stage 1 model, where other multinomial models may be implemented through substitution of the cumulative logit link function. Alternative models for ordinal outcomes, such as adjacent categories and continuation ratio (including complementary log-log and Cox proportional hazards), and nominal outcomes (with the generalized logit) all predict probabilities entirely suitable for subsequent calculation of cumulative ROC curves. Conceptual interpretation of these alternative link functions, however, necessarily varies, sometimes substantially, and may therefore be less directly interpretable than the cumulative logit. Exploring the performance and utility of these alternative link functions may nonetheless be fruitful.
Cumulative ROC curve analysis appears to be efficacious for a univariate continuous predictor, and the regression framework may be extended with the addition of covariates to the Stage 1 cumulative logit model. This can be expected to enhance discriminatory power by accounting for other influential or potentially confounding factors [41]. In any particular case, however, it may not be clear whether additional covariates will adversely affect the overall concavity of the cumulative ROC curves for the continuous predictor of interest, thereby hindering selection of cutpoints. Stratification by discrete factors may be helpful in resolving some of these difficulties.
One challenge posed by the cumulative logit model is its sample size demands, which arise from the potentially numerous parameters that must be estimated. In the proportional odds configuration, the univariate cumulative logit model has J − 1 intercepts plus one slope for the continuous predictor, but this nearly doubles in the non-proportional odds configuration, which has 2 × (J − 1) regression parameters.
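The parameter counts stated above can be tabulated with a short Python sketch. The function name `n_parameters` is hypothetical and the code only restates the arithmetic in the text for the univariate model (J − 1 intercepts plus either one shared slope or J − 1 level-specific slopes).

```python
def n_parameters(J, proportional_odds=True):
    """Number of regression parameters in a univariate cumulative logit
    model with J ordinal outcome levels."""
    if J < 2:
        raise ValueError("an ordinal outcome needs at least two levels")
    intercepts = J - 1
    # Proportional odds: one slope shared by all cumulative splits.
    # Non-proportional odds: a separate slope for each of the J - 1 splits.
    slopes = 1 if proportional_odds else J - 1
    return intercepts + slopes


for J in (3, 4, 5):
    po = n_parameters(J)
    npo = n_parameters(J, proportional_odds=False)
    print(f"J = {J}: proportional odds = {po}, non-proportional odds = {npo}")
```

For the ternary case (J = 3) the counts are 3 and 4. The ratio of the two counts approaches two as J grows, which is the sense in which the non-proportional odds configuration nearly doubles the number of parameters.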
Conclusions
The cumulative ROC curve method comprises a straightforward combination of cumulative logit regression with ROC curve analysis, and is readily implemented with available statistical software. Cutpoint selection criteria from classic ROC curve analysis are still applicable, as well as established performance measures, such as sensitivity, specificity, and AUC. Cumulative ROC curve analysis performed as expected under simulation and with real-world data for a variety of conditions, including balanced and unbalanced data, proportional and non-proportional odds assumptions for the cumulative logit model, and AUCs associated with fair, good, and excellent performance (AUC = 0.70–0.95). Of the ROC curve-based cutpoint criteria, Total Accuracy was the least biased in simulation compared to the Youden Index, Matthews Correlation Coefficient, and Markedness. Calculation of cutpoints from cumulative logit regression parameters, which forgoes evaluation of cumulative ROC curves, demonstrated minimal bias, owing to parameter estimation with maximum likelihood methods. The author’s SAS programs implementing cumulative ROC curve analysis for ternary ordinal outcomes (J = 3) with parametric cutpoints are freely available for download in S1 Programs and from the author’s GitHub repository [42]: https://github.com/intelligo1466/cumRoc3.
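For readers who want to compare the four cutpoint criteria named above, the following Python sketch computes them from a single 2 × 2 classification table using their standard textbook definitions. It is an illustration only and is not taken from the author's SAS programs. The function name `criteria` and the example counts are hypothetical, and the sketch assumes every row and column total of the table is nonzero.

```python
from math import sqrt


def criteria(tp, fn, fp, tn):
    """Cutpoint selection criteria for one dichotomized split at one
    candidate cutpoint. Assumes all marginal totals are nonzero."""
    sens = tp / (tp + fn)   # sensitivity (true positive rate)
    spec = tn / (tn + fp)   # specificity (true negative rate)
    ppv = tp / (tp + fp)    # positive predictive value
    npv = tn / (tn + fn)    # negative predictive value
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "youden": sens + spec - 1,
        "mcc": (tp * tn - fp * fn) / denom,
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "markedness": ppv + npv - 1,
    }


# Hypothetical counts for one candidate cutpoint.
scores = criteria(tp=40, fn=10, fp=5, tn=45)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

In a cutpoint search, a function like this would be evaluated at every candidate cutpoint for each cumulative split, and the cutpoint that maximizes the chosen criterion would be retained.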
Supporting information
%cumRoc3: Cumulative ROC curve analysis of three-level ordinal outcomes.
A SAS macro that implements cumulative ROC curve analysis for three-level (ternary) ordinal outcomes, as described in this article. Requires SAS v9.4 or later.
(ZIP)
Data, cork quality.
Demonstration dataset comprising a ternary ordinal outcome representing levels of cork quality and a predictor representing the number of image pixels exhibiting defects.
(ZIP)
Data, NHANES NNAL tobacco smoke exposure.
Demonstration dataset comprising a ternary ordinal outcome representing levels of self-reported tobacco smoke exposure and a predictor representing measurements of a tobacco-specific biomarker in urine.
(ZIP)
The author is indebted to this journal’s associate editors and reviewers for their gracious and constructive comments that significantly improved this paper. Disclaimers: The findings and conclusions in this report are those of the author and do not necessarily represent the views of the Centers for Disease Control and Prevention. Use of trade names is for identification only and does not imply endorsement by the Centers for Disease Control and Prevention.
Cumulative ROC curves for discriminating three or more ordinal outcomes with cutpoints on a shared continuous measurement scale
PLOS ONE
Dear Dr. deCastro,
Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.
We would appreciate receiving your revised manuscript by Aug 18 2019 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.
We look forward to receiving your revised manuscript.
Kind regards,
Alessandro Parolari, MD, PhD
Academic Editor
PLOS ONE
Journal Requirements:
1. When submitting your revision, we need you to address these additional requirements.
Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at
http://www.journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and http://www.journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf
2. We note that you have stated that you will provide repository information for your data at acceptance. Should your manuscript be accepted for publication, we will hold it until you provide the relevant accession numbers or DOIs necessary to access your data. If you wish to make changes to your Data Availability statement, please describe these changes in your cover letter and we will update your Data Availability statement to reflect the information you provide.
Comments to the Author
1. Is the manuscript technically sound, and do the data support the conclusions?
The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #1: Yes
Reviewer #2: Yes
**********
2. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes
Reviewer #2: Yes
**********
3. Have the authors made all data underlying the findings in their manuscript fully available?
The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.
Reviewer #1: Yes
Reviewer #2: Yes
**********
4. Is the manuscript presented in an intelligible fashion and written in standard English?
PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #1: Yes
Reviewer #2: Yes
**********
5. Review Comments to the Author
Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)
Reviewer #1: The paper is well written, even if the subject is rather complex. The methodology is sound and the conclusions are in line with the results obtained. I have a few minor comments and modifications to suggest:
1. Line 192: it would be helpful to specify on which base (arbitrary or objective?) the three quality levels for the cork stoppers were established.
2. The discussion of the simulations is not deep enough (lines 268-274). For instance, the author should discuss the contrasting results of NPO1 and NPO2 (tables 4), where a substantial under- or over-estimation of the parameters occurs.
3. The Discussion includes some methodological specification that should be moved to a previous section: for example, 321-327 should be moved before the ‘Simulations’ section.
4. The same for the last two sentences of the Conclusions, which contain methodological specifications.
Reviewer #2: I read with interest the manuscript. It's a very statistical summary of the employment of cumulative ROC curves for discriminating ordinal outcomes. I have no major comment.
**********
6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.
If you choose “no”, your identity will remain anonymous but your review may still be made public.
Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: No
Reviewer #2: No
Author response to Decision Letter 0 (10.1371/journal.pone.0221433.r002)
Submission Version 1
11 Jul 2019
Please see uploaded PONE-D-19-14500_R1_Response to Reviewers.pdf
Submitted filename: PONE-D-19-14500_R1_Response to Reviewers.pdf
Cumulative ROC curves for discriminating three or more ordinal outcomes with cutpoints on a shared continuous measurement scale
PONE-D-19-14500R1
Dear Dr. deCastro,
We are pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it complies with all outstanding technical requirements.
Within one week, you will receive an e-mail containing information on the amendments required prior to publication. When all required modifications have been addressed, you will receive a formal acceptance letter and your manuscript will proceed to our production department and be scheduled for publication.
If your institution or institutions have a press office, please notify them about your upcoming paper to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, you must inform our press team as soon as possible and no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.
Cumulative ROC curves for discriminating three or more ordinal outcomes with cutpoints on a shared continuous measurement scale
Dear Dr. deCastro:
I am pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.
For any other questions or concerns, please email plosone@plos.org.