Epidemiologic studies usually use database diagnoses or patient self-report to identify disease cohorts, but no previous research has examined the extent to which self-report of chronic disease agrees with database diagnoses in a Veterans Affairs (VA) health care setting.
All veterans who had a medical care visit from October 1, 1996, through May 31, 1998, at any of the Veterans Integrated Service Network 13 facilities were surveyed about physician diagnosis of chronic obstructive pulmonary disease (COPD)/asthma, depression, diabetes, and heart disease. Four administrative case definitions (data from VA databases) consisting of combinations of International Classification of Diseases, Ninth Revision, codes and disease-specific medication data were compared with self-report of each disease to assess sensitivity, specificity, positive and negative predictive values, area under the receiver operating characteristic curve, and κ statistic.
Sensitivity for administrative definitions compared with self-report of physician diagnosis was 24% to 54% for COPD/asthma, 25% to 47% for depression, 27% to 59% for heart disease, and 64% to 78% for diabetes. Specificity was 88% to 100% for all diseases. The κ statistic showed fair-to-moderate agreement for COPD/asthma, depression, and heart disease and substantial agreement for diabetes.
In a VA health care setting, diagnoses identified from databases agree well with self-report for diabetes but not for COPD/asthma, depression, or heart disease.
Large epidemiologic or health services studies usually resort to using administrative or clinical databases (eg, Medicare, Medicaid, Veterans Affairs, health maintenance organizations) or patient self-report (eg, data from the Behavioral Risk Factor Surveillance System) for case detection. Few studies, however, have examined the extent to which self-reported diagnoses agree with those obtained from databases. Self-reported diagnoses have good agreement with those obtained from databases for hypertension (
The Veterans Health Administration is the largest integrated health care system in the United States. One Veterans Affairs (VA) study compared diagnostic accuracy in veterans with serious mental illness and found that they are less aware of comorbidities (
In a study of quality of life of veterans receiving health care in Veterans Integrated Service Network 13 (VISN-13), data were collected regarding patient self-report of chronic diseases, administrative diagnoses, and use of medications (
The Veterans' Quality of Life Study was a cohort study of all veterans who received inpatient or outpatient health care at any of the VISN-13 facilities (covering all of Minnesota, North Dakota, and South Dakota and selected counties in Iowa, Nebraska, Wisconsin, and Wyoming) from October 1, 1996, through May 31, 1998, and had a valid mailing address (
The survey included questions regarding 1) self-report of physician diagnosis of chronic conditions, including chronic obstructive pulmonary disease (COPD) or asthma, depression, diabetes, heart disease, hypertension, and arthritis; 2) demographic information, including sex, education level, and race/ethnicity; 3) smoking status; and 4) functional limitation as assessed by limitation of activities of daily living, such as bathing, dressing, eating, getting in and out of a chair, walking, and using the bathroom (
Prospective and retrospective cohort data were obtained for the year before and the year after the survey from the Patient Treatment File and the Outpatient Clinic data sets in the VA administrative databases at the Austin Automation Center in Austin, Texas. These data are reliable for demographic characteristics and most common diagnoses (
In addition to the above data, International Classification of Diseases, Ninth Revision (ICD-9), codes and prescription data for 4 self-reported comorbidities (COPD/asthma, depression, diabetes, and heart disease) were extracted from each facility for this study. For heart disease only, Current Procedural Terminology codes for percutaneous transluminal coronary angioplasty and coronary artery bypass grafting were also extracted. The extraction covered a 2-year period comprising the year before and the year after the survey (
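The four administrative case definitions compared in this study (ICD-9 code, medication use, either, or both) can be illustrated with a minimal sketch. The diabetes code prefix and drug names come from the ICD-9 and medication table in this article; the function names, the input layout, and the use of prefix matching are assumptions made for illustration and do not describe the actual extraction procedure.

```python
# Diabetes entries taken from this article's ICD-9/medication table.
# Only a subset of the listed medications is shown here.
DIABETES_ICD9_PREFIXES = ("250",)
DIABETES_MEDICATIONS = {"metformin", "insulin", "glyburide", "glipizide"}


def has_icd9(icd9_codes, prefixes):
    """True if any recorded ICD-9 code starts with one of the prefixes."""
    return any(code.startswith(prefixes) for code in icd9_codes)


def has_medication(medications, drug_names):
    """True if any dispensed drug is a disease-specific medication."""
    return any(drug.lower() in drug_names for drug in medications)


def case_definitions(icd9_codes, medications, prefixes, drug_names):
    """Return the four administrative case definitions as booleans."""
    icd9 = has_icd9(icd9_codes, prefixes)
    med = has_medication(medications, drug_names)
    return {
        "icd9": icd9,            # ICD-9 code only
        "medication": med,       # medication use only
        "either": icd9 or med,   # ICD-9 code or medication use
        "both": icd9 and med,    # ICD-9 code and medication use
    }


# Hypothetical patient: a diabetes code on file, no diabetes medication.
example = case_definitions(
    ["250.00", "401.9"], ["Lisinopril"],
    DIABETES_ICD9_PREFIXES, DIABETES_MEDICATIONS,
)
print(example)
```

By construction, "either" is the broadest definition and "both" the narrowest, which is why the former would be expected to favor sensitivity and the latter specificity.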
The accuracy of various administrative database case definitions was calculated for each disease by comparing them with patient self-report of physician diagnosis for each condition. The agreement between database case definitions and self-report was assessed by calculating the κ statistic (
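As a minimal sketch of these accuracy measures, the standard formulas can be applied to a 2×2 table in which self-report of physician diagnosis serves as the reference and the administrative definition as the test. The counts below are hypothetical and are not study data; the function name is likewise an illustrative assumption.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Accuracy and agreement measures from a 2x2 table.

    tp: administrative definition positive, self-report positive
    fp: administrative definition positive, self-report negative
    fn: administrative definition negative, self-report positive
    tn: administrative definition negative, self-report negative
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (observed - expected) / (1 - expected),
    }


# Hypothetical counts for illustration only.
measures = accuracy_measures(tp=40, fp=10, fn=20, tn=130)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

The κ statistic corrects observed agreement for the agreement expected by chance, so it can be modest even when specificity is high, as happens when many self-reported cases are missed by the administrative definition.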
Multivariable logistic regression analyses were performed to determine the factors significantly associated with disagreement between self-report and administrative database definition for various chronic diseases. To avoid multiple analyses, the database case definition of ICD-9 code in the year before the survey was used. This definition was chosen for multiple reasons: 1) ICD-9 code is frequently used for case detection in large epidemiologic studies, 2) ICD-9 code is easy to extract from most large databases, and 3) this administrative case definition was associated with the most agreement (highest κ statistic) with the self-report case definition in most instances. The year before the survey was chosen for the definition because patients can report a disease only if they were told of the diagnosis by their physician before the survey (and not after the survey). Various predictor factors that were modeled in these regression analyses included demographics; clinical measures; health care use, access, and eligibility measures; and health and functional status. For the purpose of analysis, outpatient visits, physical component summary, and mental component summary scores were divided into tertiles. All participants were included in the main logistic regression analysis, and differences were considered significant at
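The outcome and the tertile predictors described above can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the cut-point method (the default of Python's `statistics.quantiles`) is an assumption because the method used for the study is not stated here, and the regression model itself is not reproduced.

```python
from statistics import quantiles


def is_discordant(self_report, admin_definition):
    """Outcome for the regression: the two sources disagree."""
    return self_report != admin_definition


def tertile_cut_points(values):
    """Return the two cut points that split values into three groups."""
    lower, upper = quantiles(values, n=3)
    return lower, upper


def assign_tertile(value, cut_points):
    """Label a value as lowest, middle, or highest tertile."""
    lower, upper = cut_points
    if value <= lower:
        return "lowest"
    if value <= upper:
        return "middle"
    return "highest"


# Hypothetical outpatient visit counts for nine patients.
visits = [1, 2, 3, 4, 5, 6, 7, 8, 9]
cuts = tertile_cut_points(visits)
labels = [assign_tertile(v, cuts) for v in visits]
print(cuts, labels)
```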
Participants with various conditions had similar outpatient and inpatient use (
Fair-to-moderate agreement for COPD/asthma, depression, and heart disease and substantial agreement for diabetes were found (
Sensitivity and negative predictive value were highest for the administrative case definition of ICD-9 code or medication use, and specificity and positive predictive value were highest for the administrative case definition of ICD-9 code and medication use (
A lower number of outpatient visits, a higher number of comorbidities, and a lower physical component summary score were associated with higher odds of disagreement for most chronic diseases (
This study of elderly veterans in VISN-13 found fair-to-moderate agreement between administrative definitions and self-report of COPD/asthma, heart disease, and depression and substantial agreement for diabetes. High κ and positive predictive values for administrative database definitions versus self-report for diabetes confirm similar earlier findings of κ 0.70 to 0.93 (
The finding of a much higher level of agreement between self-report and administrative database diagnosis of diabetes as compared with COPD/asthma, depression, and heart disease confirms similar previous findings in heart disease (
The finding that a higher number of comorbidities and older age increased discrepancy (decreased κ) between self-report and database diagnoses confirms an earlier finding of lower κ between self-report and medical records–based algorithms in women aged 65 years or older (
Findings of more disagreement in nonwhite and less educated patients confirm similar findings (
More physician visits are associated with more disagreement between self-report and medical record evidence for cardiovascular diseases (
This study has several limitations. The nonresponse bias and cohort characteristics (elderly veterans, predominantly male and white) may limit the generalizability. However, these data should be useful to VA epidemiologists who use computerized databases. Shortcomings in the questionnaire design may also have influenced the level of agreement, as previously described (
This study also has several strengths. The sample was large, and results were robust across database definitions, including various combinations of ICD-9 codes and prescription of disease-specific medication. The self-report definition in this study is, in fact, self-report of physician diagnosis, which is more accurate than self-report alone.
These findings also have clinical implications. The finding that 89% to 91% of elderly veterans with COPD/asthma, depression, or heart disease who are being treated for the condition (ICD-9 code plus medication) could identify their diagnosis implies that these veterans can be identified without accessing medical records and could be targeted for interventions at a community level, such as education on self-management, healthy lifestyles, and exercise and other nonpharmacologic interventions. These interventions may be even more relevant for patients with diabetes (exercise, weight reduction, foot care, and self-monitoring), since 98% can identify their disease.
In summary, agreement between self-report of physician diagnosis and database diagnoses differs by the diagnosis. Agreement is fair to moderate for COPD/asthma, heart disease, and depression and substantial for diabetes. The effect of patient demographic, clinical, health care use, and access measures underscores the limitation of common approaches that use patient self-report or administrative databases to identify disease cohorts. Further studies should develop algorithms to improve the methods of patient cohort selection.
Grant support was provided by the VA Upper Midwest Veterans Network (VISN-13). I thank Sean Nugent and Ann Bangerter of the Minneapolis VA's Center for Chronic Disease Outcomes Research for extracting data for this database.
ICD-9 Codes and Medications Used to Determine Disease Diagnoses
| Disease | ICD-9 Codes | Medications |
|---|---|---|
| Chronic obstructive pulmonary disease/asthma | 490 (bronchitis not specified as acute or chronic), 491 (chronic bronchitis), 492 (emphysema), 493 (asthma), 495 (extrinsic allergic alveolitis), 496 (chronic airway obstruction not elsewhere classified) | Albuterol inhaler/MDI, metaproterenol inhaler/MDI, formoterol inhaler/MDI, salmeterol inhaler/MDI, beclomethasone inhaler/MDI, flunisolide inhaler/MDI, fluticasone inhaler/MDI, budesonide inhaler/MDI, ipratropium bromide inhaler/MDI, cromolyn sodium inhaler/MDI, bitolterol mesylate aerosol, isoetharine aerosol, albuterol aerosol, pirbuterol aerosol, terbutaline, terbutaline sulfate, aminophylline, dyphylline, oxtriphylline, theophylline, theophylline SR, ephedrine sulfate, montelukast, nedocromil sodium, racemic epinephrine |
| Depression | 296.xx (affective psychoses), 300.4x (neurotic depression/dysthymic disorder), 301.1x (affective personality disorder), 298 (other nonorganic psychosis), 311 (depression disorder not elsewhere classified) | Doxepin, clomipramine, amoxapine, nortriptyline, trazodone, venlafaxine, amitriptyline, maprotiline, fluvoxamine, isocarboxazid, phenelzine, desipramine, tranylcypromine, paroxetine, fluoxetine, mirtazapine, nefazodone, trimipramine, imipramine, protriptyline, bupropion, sertraline, citalopram, escitalopram |
| Diabetes mellitus | 250 (diabetes mellitus) | Metformin, acarbose, insulin, repaglinide, glimepiride, glyburide, chlorpropamide, glipizide, troglitazone, pioglitazone, rosiglitazone, tolazamide, tolbutamide, acetohexamide |
| Heart disease | 410 (acute myocardial infarction), 411 (other acute and subacute forms of ischemic heart disease), 412 (old myocardial infarction), 413 (angina pectoris), 414 (other forms of chronic ischemic heart disease) | Isosorbide dinitrate, nitroglycerin, aspirin, enteric-coated aspirin |
Abbreviations: ICD-9, International Classification of Diseases, Ninth Revision; MDI, metered-dose inhaler; SR, sustained release.
For heart disease, some codes listed are Current Procedural Terminology (CPT) codes, as indicated.
Because some medications were dispensed as brand-name drugs, proprietary names were also included in the search, but only generic names are listed.
Demographic, Clinical, and Health Care Use Characteristics of Veterans With Self-Reported Chronic Diseases
| Characteristic [mean (SD) or %] | COPD/Asthma (9,309-10,135) | Depression (10,016-10,754) | Heart Disease (10,761-11,676) | Diabetes (6,469-7,066) |
|---|---|---|---|---|
|  | 66 (12) | 62 (14) | 68 (11) | 68 (11) |
|  | 96 | 95 | 98 | 97 |
|  | 96 | 95 | 97 | 94 |
|  | 65 | 57 | 70 | 66 |
| Less than 8th grade | 23 | 18 | 23 | 23 |
| Some high school | 13 | 11 | 13 | 13 |
| High school graduate | 34 | 34 | 35 | 34 |
| At least some college | 29 | 37 | 29 | 29 |
| Employed | 25 | 27 | 24 | 23 |
| Unemployed | 19 | 28 | 16 | 16 |
| Retired | 51 | 40 | 55 | 55 |
| Unknown | 5 | 6 | 6 | 5 |
|  | 26 | 31 | 17 | 18 |
|  | 17 | 17 | 18 | 17 |
|  | 16 | 37 | 13 | 13 |
| Primary care clinic | 4 (4) | 3 (4) | 4 (4) | 4 (4) |
| Specialty medicine clinic | 2 (4) | 2 (5) | 2 (5) | 2 (6) |
| Surgery clinic | 2 (3) | 2 (3) | 2 (3) | 3 (4) |
|  | 6 | 5 | 6 | 6 |
|  | 11 | 13 | 12 | 11 |
|  | 39 | 46 | 39 | 38 |
Abbreviations: SD, standard deviation; COPD, chronic obstructive pulmonary disease.
Range of number of patients indicates the number of patients for whom survey responses or database information were available. The number varies by the question because some questions were skipped.
Data on race were available for only 67% of patients.
Accuracy of Administrative Case Definitions Compared With Self-Report of Chronic Diseases
| Chronic Disease and Administrative Case Definition | Sensitivity, % (95% CI) | Specificity, % (95% CI) | PPV, % (95% CI) | NPV, % (95% CI) | ROC Area (95% CI) | κ (95% CI) |
|---|---|---|---|---|---|---|
| COPD/asthma | | | | | | |
| ICD-9 code | 49 (48-49) | 94 (93-94) | 73 (72-73) | 84 (84-84) | 0.71 (0.71-0.72) | 0.47 (0.47-0.48) |
| Medication use | 29 (28-29) | 96 (95-96) | 70 (69-70) | 79 (79-80) | 0.62 (0.62-0.63) | 0.30 (0.30-0.31) |
| Either ICD-9 code or medication use | 54 (53-54) | 90 (90-91) | 66 (65-66) | 85 (84-85) | 0.72 (0.71-0.73) | 0.46 (0.46-0.47) |
| Both ICD-9 code and medication use | 24 (23-24) | 99 (99-99) | 91 (90-91) | 79 (78-79) | 0.62 (0.61-0.62) | 0.30 (0.30-0.31) |
| Depression | | | | | | |
| ICD-9 code | 34 (33-34) | 97 (97-98) | 83 (83-84) | 79 (78-79) | 0.65 (0.65-0.66) | 0.38 (0.37-0.39) |
| Medication use | 38 (38-39) | 94 (94-94) | 72 (71-72) | 80 (79-80) | 0.66 (0.66-0.67) | 0.38 (0.38-0.39) |
| Either ICD-9 code or medication use | 47 (46-47) | 93 (92-93) | 72 (71-72) | 82 (81-82) | 0.70 (0.69-0.70) | 0.44 (0.44-0.45) |
| Both ICD-9 code and medication use | 25 (25-26) | 99 (99-99) | 89 (89-89) | 77 (77-78) | 0.62 (0.61-0.63) | 0.31 (0.30-0.32) |
| Heart disease | | | | | | |
| ICD-9 code | 36 (35-36) | 97 (97-97) | 86 (86-87) | 75 (74-75) | 0.66 (0.66-0.67) | 0.38 (0.38-0.39) |
| Medication use | 50 (49-50) | 89 (89-90) | 70 (70-71) | 78 (77-78) | 0.70 (0.69-0.70) | 0.37 (0.36-0.38) |
| Either ICD-9 code or medication use | 59 (59-60) | 88 (87-88) | 71 (71-72) | 81 (80-81) | 0.73 (0.73-0.74) | 0.49 (0.48-0.50) |
| Both ICD-9 code and medication use | 27 (26-27) | 99 (98-99) | 90 (90-90) | 72 (72-73) | 0.63 (0.62-0.63) | 0.30 (0.30-0.31) |
| Diabetes | | | | | | |
| ICD-9 code | 76 (75-76) | 98 (98-98) | 91 (91-91) | 95 (94-95) | 0.87 (0.86-0.88) | 0.79 (0.79-0.80) |
| Medication use | 66 (66-67) | 100 (100-100) | 97 (97-98) | 93 (93-93) | 0.83 (0.82-0.84) | 0.75 (0.75-0.76) |
| Either ICD-9 code or medication use | 78 (78-79) | 98 (98-98) | 91 (91-91) | 95 (95-95) | 0.88 (0.88-0.89) | 0.81 (0.81-0.82) |
| Both ICD-9 code and medication use | 64 (63-64) | 100 (100-100) | 98 (97-98) | 92 (92-93) | 0.82 (0.81-0.82) | 0.73 (0.73-0.74) |
Abbreviations: CI, confidence interval; PPV, positive predictive value; NPV, negative predictive value; ROC, receiver operating characteristic; COPD, chronic obstructive pulmonary disease; ICD-9, International Classification of Diseases, Ninth Revision.
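For a single yes/no case definition, the ROC curve has one operating point, and the area under the line segments joining (0, 0), that point, and (1, 1) equals the mean of sensitivity and specificity. The sketch below applies this identity to the first four rows of the accuracy table above, whose sensitivities match the 24% to 54% range reported for COPD/asthma. Whether the ROC areas in the table were computed this way is an assumption; the check shows only that the reported values are consistent with the identity, within the rounding of the published percentages.

```python
def single_point_roc_area(sensitivity, specificity):
    """Area under a one-point ROC curve (trapezoidal rule)."""
    return (sensitivity + specificity) / 2


# (sensitivity, specificity, reported ROC area) from the first four table rows.
copd_asthma_rows = {
    "ICD-9 code": (0.49, 0.94, 0.71),
    "Medication use": (0.29, 0.96, 0.62),
    "Either ICD-9 code or medication use": (0.54, 0.90, 0.72),
    "Both ICD-9 code and medication use": (0.24, 0.99, 0.62),
}

differences = {}
for definition, (sens, spec, reported) in copd_asthma_rows.items():
    area = single_point_roc_area(sens, spec)
    differences[definition] = abs(area - reported)
    print(f"{definition}: computed {area:.3f}, reported {reported:.2f}")
```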
Factors Significantly Associated With Overall Discordance Between Self-Report and ICD-9 Code for Each Chronic Disease
| Predictor [odds ratio (95% confidence interval)] | COPD/Asthma | Depression | Heart Disease | Diabetes |
|---|---|---|---|---|
| ≤50 | NS | 1 [Reference] | 1 [Reference] | 1 [Reference] |
| 51-65 | NS | 0.9 (0.8-1.0) | 1.9 (1.7-2.1) | 1.4 (1.1-1.8) |
| >65 | NS | 0.8 (0.7-0.9) | 2.9 (2.6-3.3) | 1.9 (1.5-2.4) |
| Married | NS | 1 [Reference] | NS | NS |
| Unmarried | NS | 1.3 (1.2-1.4) | NS | NS |
| Less than 8th grade | NS | NS | 1 [Reference] | NS |
| Some high school | NS | NS | 0.9 (0.8-1.0) | NS |
| High school graduate | NS | NS | 0.9 (0.8-0.9) | NS |
| At least some college | NS | NS | 0.9 (0.8-1.0) | NS |
| White | NS | 1 [Reference] | 1 [Reference] | 1 [Reference] |
| Nonwhite | NS | 1.1 (1.0-1.3) | 1.2 (1.1-1.3) | 1.5 (1.2-1.9) |
| Employed | NS | 1 [Reference] | NS | NS |
| Unemployed | NS | 1.2 (1.1-1.3) | NS | NS |
| Retired | NS | 1.0 (0.9-1.1) | NS | NS |
| Nonsmoker | 1 [Reference] | NS | NS | NS |
| Smoker | 1.6 (1.5-1.8) | NS | NS | NS |
| 0 | NS | 1 [Reference] | 1 [Reference] | 1 [Reference] |
| 1 | NS | 1.4 (1.3-1.6) | 1.2 (1.1-1.3) | 1.3 (1.0-1.7) |
| 2 | NS | 1.6 (1.4-1.8) | 1.4 (1.3-1.6) | 1.7 (1.3-2.1) |
| ≥3 | NS | 2.2 (2.0-2.6) | 1.9 (1.7-2.1) | 2.1 (1.7-2.7) |
| 0 | 1 [Reference] | 1 [Reference] | NS | NS |
| 1 | 1.3 (1.2-1.5) | 0.7 (0.7-0.8) | NS | NS |
| 2 | 1.6 (1.4-1.8) | 0.7 (0.6-0.8) | NS | NS |
| ≥3 | 2.1 (1.8-2.4) | 0.9 (0.8-1.0) | NS | NS |
| 0 | 1 [Reference] | NS | NS | 1 [Reference] |
| ≥1 | 1.1 (1.0-1.2) | NS | NS | 0.8 (0.6-1.0) |
| Multiple site | NS | 1 [Reference] | NS | NS |
| Single site | NS | 1.2 (1.1-1.4) | NS | NS |
| 0 | NS | 1 [Reference] | NS | 1 [Reference] |
| 10-50 | NS | 1.1 (1.0-1.2) | NS | 1.1 (0.9-1.3) |
| >50 | NS | 1.4 (1.2-1.5) | NS | 1.4 (1.1-1.7) |
| Highest tertile | 1 [Reference] | 1 [Reference] | NS | 1 [Reference] |
| Middle tertile | 1.0 (0.9-1.1) | 1.1 (1.0-1.1) | NS | 1.2 (1.0-1.4) |
| Lowest tertile | 1.1 (1.0-1.2) | 1.4 (1.3-1.6) | NS | 2.2 (1.8-2.5) |
| Lowest tertile | 1 [Reference] | NS | 1 [Reference] | 1 [Reference] |
| Middle tertile | 0.8 (0.8-0.9) | NS | 0.8 (0.7-0.9) | 0.9 (0.8-1.1) |
| Highest tertile | 0.6 (0.5-0.6) | NS | 0.6 (0.5-0.6) | 0.7 (0.6-0.9) |
| Lowest tertile | 1 [Reference] | 1 [Reference] | NS | NS |
| Middle tertile | 1.0 (0.9-1.0) | 0.4 (0.3-0.4) | NS | NS |
| Highest tertile | 0.8 (0.7-0.9) | 0.1 (0.1-0.1) | NS | NS |
Abbreviations: ICD-9, International Classification of Diseases, Ninth Revision; COPD, chronic obstructive pulmonary disease; NS, not significant; ADL, activities of daily living; PCS, physical component summary; MCS, mental component summary.