Despite the critical importance of historical data in medical decision making
One such condition is domestic abuse,
Studies have shown that screening for domestic abuse, along with appropriate follow-up,
Screening for domestic abuse is particularly important in the emergency department, where victims are most often encountered.
Despite the growing evidence and official recommendations, actual screening rates remain low in practice,
Screening tools and scoring systems developed to assist doctors in detecting domestic abuse,
We evaluated the usefulness of commonly available longitudinal medical information for predicting a patient’s risk of receiving a future diagnosis of abuse. We developed intelligent histories—Bayesian models aimed at predicting the risk of an individual receiving a future diagnosis based on that individual’s diagnostic history.
Our modelling approach could form the basis for an early warning system that monitors longitudinal health data for long term indicators of abuse risk and alerts clinicians when high risk patients are identified. As a first step towards this goal, we describe a prototype risk visualisation we are developing to provide clinicians with instant overviews of longitudinal medical histories and related risk profiles at the point of care. In conjunction with alerts for high risk patients, this could enable clinicians to rapidly review and act on all available historical information by identifying important risk factors and long term trends.
We analysed longitudinal diagnostic histories of patients aged over 18 who had at least four years between their earliest and latest diagnoses recorded in an anonymised state-wide claims database covering six years of admissions to hospital, stays at hospitals for observation, and emergency department encounters. Some 561 216 patients met the inclusion criteria, having a total of 16 785 977 diagnoses among them.
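The inclusion rule above can be expressed as a simple filter. This is a minimal sketch, not the study's code: the function name `meets_inclusion`, the approximation of four years as 1460 days, and the use of a single age value per patient are all assumptions made for illustration.

```python
from datetime import date

MIN_SPAN_DAYS = 4 * 365  # assumption: "four years" approximated as 1460 days


def meets_inclusion(age_years, diagnosis_dates):
    """Return True if the patient is aged over 18 and the earliest and
    latest recorded diagnoses are at least four years apart."""
    if age_years <= 18 or not diagnosis_dates:
        return False
    span = max(diagnosis_dates) - min(diagnosis_dates)
    return span.days >= MIN_SPAN_DAYS


# Hypothetical patients, for illustration only.
long_history = [date(2000, 1, 10), date(2002, 6, 1), date(2004, 3, 15)]
short_history = [date(2003, 1, 10), date(2004, 3, 15)]
print(meets_inclusion(42, long_history))   # True
print(meets_inclusion(42, short_history))  # False
```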
Cases of abuse were identified according to ICD-9 (international classification of diseases, ninth revision) diagnostic codes, by using two different case definitions. The first, narrow case definition included all codes that explicitly refer to abuse (table 1
Table 1 Abuse related ICD-9 codes comprising narrow case definition
| ICD-9 | Description |
|---|---|
| 995.5 | Child maltreatment syndrome |
| 995.50 | Child abuse, unspecified |
| 995.51 | Child emotional/psychological abuse |
| 995.52 | Child neglect (nutritional) |
| 995.53 | Child sexual abuse |
| 995.54 | Child physical abuse |
| 995.59 | Child abuse/neglect (not classified elsewhere) |
| 995.80 | Adult maltreatment, unspecified |
| 995.81 | Adult physical abuse |
| 995.82 | Adult emotional/psychological abuse |
| 995.83 | Adult sexual abuse |
| 995.84 | Adult neglect (nutritional) |
| 995.85 | Other adult abuse and neglect |
| E967.0 | Perpetrator of child and adult abuse: by father, stepfather, or boyfriend |
| E967.1 | Perpetrator of child and adult abuse: by other specified person |
| E967.2 | Perpetrator of child and adult abuse: by mother, stepmother, or girlfriend |
| E967.3 | Perpetrator of child and adult abuse: by spouse or partner |
| E967.4 | Perpetrator of child and adult abuse: by child |
| E967.5 | Perpetrator of child and adult abuse: by sibling |
| E967.6 | Battering by grandparent |
| E967.7 | Perpetrator of child and adult abuse: by other relative |
| E967.8 | Perpetrator of child and adult abuse: by non-related caregiver |
| E967.9 | Perpetrator of child and adult abuse: by unspecified person |
| V15.41 | History of physical abuse—rape |
| V15.42 | History of emotional abuse—neglect |
| V61.11 | Counselling for victim of spousal and partner abuse |
| V61.21 | Counselling for victim of child abuse |
Table 2 Assault and intentional injury related ICD-9 codes added to codes in table 1 to form broader case definition
| ICD-9 | Description |
|---|---|
| E960 | Fight, brawl, rape |
| E960.0 | Unarmed fight or brawl |
| E960.1 | Rape |
| E961 | Assault—corrosive/caustic agent |
| E962.0 | Assault—poisoning with medical agent |
| E962.1 | Other solid and liquid substances |
| E962.2 | Assault—poisoning with gas/vapour |
| E962.9 | Unspecified poisoning |
| E963 | Assault—hanging/strangulation |
| E964 | Assault by submersion |
| E965.0 | Assault—handgun |
| E965.1 | Assault—shotgun |
| E965.3 | Assault—military firearms |
| E965.4 | Assault—firearm (not classified elsewhere) |
| E965.6 | Gasoline bomb |
| E965.8 | Assault—explosive (not classified elsewhere) |
| E965.9 | Unspecified explosive |
| E966 | Assault by cutting and piercing instrument |
| E968 | Assault by other and unspecified means |
| E968.0 | Assault—fire |
| E968.1 | Assault—push from high place |
| E968.2 | Assault—striking with object |
| E968.3 | Assault—hot liquid |
| E968.4 | Criminal neglect: abandonment of child, infant, or other helpless person with intent to injure or kill |
| E968.5 | Assault—transport vehicle |
| E968.6 | Assault—air gun |
| E968.7 | Human bite—assault |
| E968.8 | Assault (not classified elsewhere) |
| E968.9 | Assault (not otherwise specified) |
| E969 | Late effect assault |
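The two case definitions amount to set membership tests on a patient's recorded codes. The sketch below transcribes the two code lists above into Python sets; the names `NARROW_CODES`, `ASSAULT_CODES`, `BROAD_CODES`, and `meets_case_definition` are illustrative, and exact string matching of codes is an assumption about how the data are stored.

```python
# Codes from the narrow case definition (27 codes).
NARROW_CODES = (
    {"995.5", "995.50", "995.51", "995.52", "995.53", "995.54", "995.59"}
    | {f"995.8{i}" for i in range(6)}    # 995.80 to 995.85
    | {f"E967.{i}" for i in range(10)}   # E967.0 to E967.9
    | {"V15.41", "V15.42", "V61.11", "V61.21"}
)

# Assault and intentional injury codes added for the broader definition (30 codes).
ASSAULT_CODES = (
    {"E960", "E960.0", "E960.1", "E961", "E962.0", "E962.1", "E962.2",
     "E962.9", "E963", "E964", "E965.0", "E965.1", "E965.3", "E965.4",
     "E965.6", "E965.8", "E965.9", "E966", "E968", "E969"}
    | {f"E968.{i}" for i in range(10)}   # E968.0 to E968.9
)

BROAD_CODES = NARROW_CODES | ASSAULT_CODES


def meets_case_definition(patient_codes, definition_codes):
    """True if any of the patient's codes belongs to the case definition."""
    return any(code in definition_codes for code in patient_codes)


# Hypothetical patient with one assault code and one unrelated code.
patient = ["E960.1", "305.00"]
print(meets_case_definition(patient, NARROW_CODES))  # False
print(meets_case_definition(patient, BROAD_CODES))   # True
```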
In total, 5829 patients (1.04%) met the narrower case definition, with 511 659 diagnoses among them (average of 87.8 diagnoses per patient), and 555 387 patients did not meet the narrower case definition, with 16 774 318 diagnoses among them (average of 30.2 diagnoses per patient). Some 19 303 patients (3.44%) met the broader case definition, with 1 156 325 diagnoses among them (average of 59.9 diagnoses per patient), and 541 913 patients did not meet the broader case definition, with 15 629 652 diagnoses among them (average of 28.8 diagnoses per patient).
We developed Bayesian models to estimate a patient’s risk of receiving a future diagnosis of abuse based on the diagnostic history. We used naive Bayesian classifiers,
In summary, patients meeting the inclusion criteria were randomly assigned to a training set used to train the model (two thirds) or to a testing set used to validate it (one third). To account for sex specific differences in risk, we trained separate models for men and women. After training, we calculated a “partial risk score” for each diagnosis—the higher the partial risk score, the more predictive the diagnosis was of abuse. In addition to diagnoses, the model also incorporated the average number of visits a year recorded for the patient over the study period. This average number of visits, v, was categorised into one of six groups: v≤1, 1<v≤2, 2<v≤4, 4<v≤6, 6<v≤10, or v>10, and a partial risk score was calculated for each group.
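As a rough illustration of the training step, the sketch below bins visit frequency into the six groups listed above and derives one score per feature. The exact formula for the partial risk score is not given in this section, so the smoothed log likelihood ratio used here is an assumption, as are the function names and the toy data.

```python
import math
from collections import Counter


def visit_group(v):
    """Map average visits per year to one of the six groups in the text."""
    for upper, label in [(1, "v<=1"), (2, "1<v<=2"), (4, "2<v<=4"),
                         (6, "4<v<=6"), (10, "6<v<=10")]:
        if v <= upper:
            return label
    return "v>10"


def partial_risk_scores(case_histories, control_histories, smoothing=1.0):
    """Assumed scoring rule: log of P(feature | case) / P(feature | control),
    with additive smoothing so that unseen features do not divide by zero."""
    case_counts = Counter(f for h in case_histories for f in set(h))
    control_counts = Counter(f for h in control_histories for f in set(h))
    n_case, n_control = len(case_histories), len(control_histories)
    scores = {}
    for f in set(case_counts) | set(control_counts):
        p_case = (case_counts[f] + smoothing) / (n_case + 2 * smoothing)
        p_control = (control_counts[f] + smoothing) / (n_control + 2 * smoothing)
        scores[f] = math.log(p_case / p_control)
    return scores


# Toy training data: each history is a set of diagnosis categories plus
# the patient's visit group. All values are hypothetical.
cases = [{"alcohol", visit_group(12)}, {"alcohol", visit_group(7)}]
controls = [{"fracture", visit_group(0.5)}, {"alcohol", visit_group(1)},
            {"fracture", visit_group(1.5)}]
scores = partial_risk_scores(cases, controls)
print(scores["alcohol"] > 0, scores["fracture"] < 0)  # True True
```

To mirror the sex specific models described above, the same function would be called once on the training histories of women and once on those of men.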
We used the testing set, containing the remaining third of the patients, to validate the model. The model was applied retrospectively to the diagnostic histories of each patient in the testing set, analysing the data for each patient one visit at a time in chronological order and generating an “overall risk score” for the patient at the time of each new visit based on the sum of all the partial risk scores for that patient. These overall risk scores were interpreted with empirical thresholds determined according to desired specificity levels, and the corresponding sensitivity and timeliness levels were measured. To systematically gauge the actual trade-off between different levels of sensitivity and specificity in the testing set, the thresholds were set using the testing set itself. In an operational setting, users can set thresholds in advance based on the training set. In such a case, differences between the testing and training data might lead to a difference between desired specificity levels and actual specificity levels achieved.
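The validation procedure can be sketched in three small steps: accumulate a running score visit by visit, choose a threshold that achieves a desired specificity among patients without the diagnosis, and measure the resulting sensitivity. The function names, the choice to count each distinct diagnosis once, and the use of each patient's peak score are assumptions for illustration rather than details confirmed by the text.

```python
import math


def running_risk_scores(visits, partial_scores):
    """Return [(visit_index, overall_score)] for visits in chronological order.

    Each distinct diagnosis adds its partial score once (assumption);
    diagnoses without a trained score contribute zero.
    """
    seen, total, out = set(), 0.0, []
    for i, codes in enumerate(visits):
        for code in codes:
            if code not in seen:
                seen.add(code)
                total += partial_scores.get(code, 0.0)
        out.append((i, total))
    return out


def threshold_for_specificity(control_peak_scores, specificity):
    """Smallest observed score such that at least the desired fraction of
    patients without the diagnosis score at or below it."""
    ranked = sorted(control_peak_scores)
    k = math.ceil(specificity * len(ranked)) - 1
    return ranked[max(k, 0)]


def sensitivity_at(case_peak_scores, threshold):
    """Fraction of patients with the diagnosis whose score exceeds the threshold."""
    return sum(s > threshold for s in case_peak_scores) / len(case_peak_scores)


# Hypothetical scores, for illustration only.
history = running_risk_scores([["A"], ["B", "A"], ["C"]], {"A": 1.0, "B": 0.5})
print(history)                      # [(0, 1.0), (1, 1.5), (2, 1.5)]
threshold = threshold_for_specificity(list(range(1, 11)), 0.9)
print(threshold)                    # 9
print(sensitivity_at([5, 9.5, 12, 20], threshold))  # 0.75
```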
In predicting the risk of patients receiving future abuse diagnoses, the intelligent history models achieved an area under the ROC curve of 0.88 for the narrower case definition and 0.82 for the broader case definition. Figure 1
Table 3 Performance of intelligent histories models using narrower case definition of abuse and broader case definition of abuse, assault, or intentional injury
| Sensitivity (%) | Specificity (%) | PPV (%) | Mean days from detection to first abuse diagnosis |
|---|---|---|---|
| Narrower case definition (abuse) | | | |
| 1.8 | 99.9 | 14.4 | 280 |
| 3.5 | 99.8 | 14.3 | 331 |
| 3.9 | 99.75 | 13.0 | 350 |
| 6.5 | 99.5 | 10.9 | 390 |
| 10.3 | 99.0 | 8.9 | 459 |
| 17.5 | 98.0 | 7.6 | 501 |
| 21.1 | 97.5 | 7.4 | 523 |
| 35.5 | 95.0 | 6.3 | 613 |
| 50.8 | 92.5 | 6.0 | 661 |
| 64.2 | 90.0 | 5.7 | 749 |
| 82.6 | 85.0 | 4.9 | 890 |
| 87.3 | 80.0 | 4.0 | 898 |
| Broader case definition (abuse, assault, or intentional injury) | | | |
| 0.7 | 99.9 | 18.9 | 382 |
| 1.4 | 99.8 | 18.6 | 364 |
| 1.7 | 99.75 | 17.6 | 398 |
| 2.8 | 99.5 | 15.0 | 421 |
| 5.5 | 99.0 | 14.8 | 435 |
| 9.6 | 98.0 | 13.0 | 501 |
| 11.5 | 97.5 | 12.6 | 509 |
| 20.9 | 95.0 | 11.6 | 564 |
| 29.2 | 92.5 | 10.9 | 585 |
| 37.3 | 90.0 | 10.5 | 620 |
| 51.2 | 85.0 | 9.7 | 696 |
| 64.7 | 80.0 | 9.2 | 775 |
PPV=positive predictive value.
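The timeliness measure in the last column of the table can be computed per patient as the number of days between the first visit at which the overall risk score exceeds the threshold and the first recorded abuse diagnosis. The sketch below shows one way to do this; the function name `detection_lead_days` and the rule that only visits before the first abuse diagnosis count as detections are assumptions.

```python
from datetime import date


def detection_lead_days(scored_visits, threshold, first_abuse_date):
    """Days from first threshold crossing to first abuse diagnosis.

    `scored_visits` is a chronologically ordered list of (visit_date, score).
    Returns None if the score never exceeds the threshold beforehand.
    """
    for visit_date, score in scored_visits:
        if visit_date >= first_abuse_date:
            break
        if score > threshold:
            return (first_abuse_date - visit_date).days
    return None


# Hypothetical patient, for illustration only.
visits = [(date(2001, 1, 1), 0.5), (date(2002, 3, 1), 2.1),
          (date(2003, 1, 1), 3.0)]
print(detection_lead_days(visits, 2.0, date(2003, 6, 1)))  # 457
```

Averaging this value over all detected patients would give a mean lead time comparable in kind to the figures reported in the table.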
The model could detect high levels of risk of abuse far in advance of the first diagnosis of abuse recorded in the system (fig 2
Examination of the internal parameters of the model showed interesting findings. Firstly, we examined the effects of frequency of visits. As described above, each range of average number of visits a year was assigned a partial risk score. Figure 3
Next, we examined the risks associated with different categories of illness. Figure 4
We also examined sex based differences in risk profiles. Figure 5
Table 4 Partial risk scores* for women and men for select clinical categories
| Category† | Women (95% CI) | Men (95% CI) |
|---|---|---|
| Alcohol, substance related mental disorders | 1.455 (1.440 to 1.471) | 1.253 (1.235 to 1.271) |
| Injuries from external causes | 0.885 (0.843 to 0.925) | 0.175 (0.098 to 0.249) |
| Poisoning | 1.326 (1.279 to 1.373) | 1.039 (0.960 to 1.115) |
| Affective disorders | 1.435 (1.410 to 1.459) | 1.726 (1.688 to 1.764) |
| Other mental conditions | 1.283 (1.260 to 1.305) | 1.640 (1.606 to 1.673) |
| Other psychoses | 1.065 (0.980 to 1.148) | 1.326 (1.209 to 1.434) |
*The higher the partial risk score, the more predictive the category of diagnoses is of abuse.
†First three categories listed are more predictive of abuse in women than in men. Second three categories listed are more predictive of abuse in men than in women.
We also took the first steps towards describing how these models might form the basis of an early warning system to help doctors identify high risk patients for further screening. Figure 6
Longitudinal diagnostic data commonly available in electronic health information systems can be valuable for predicting a patient’s risk of receiving a future diagnosis of abuse. Unlike previous approaches to estimating risk,
We found significant differences in longitudinal patterns of diagnoses between abused and non-abused individuals, and these differences can be used for early identification—up to years in advance—of individuals at high risk for receiving a future diagnosis of abuse. Certain broad categories of diagnoses, like psychological related conditions, were highly associated with risk of abuse. This is noteworthy as screening rates in practice have actually been found to be lower among patients presenting with psychological conditions compared with other conditions.
Risk characteristics of specific diagnoses varied across sexes, and it is therefore useful to construct separate sex specific models of abuse risk. Abused patients had a higher average number of visits a year,
We used a state-wide dataset covering six years of admissions to hospital, observation stays in hospital, and encounters in emergency departments. Any visits taking place outside this state, beyond this time period, or in a different care setting were not included. As a result, certain diagnoses that would have helped or hindered in identifying high risk patients might not be recorded in the dataset, thus affecting the results for that patient. Furthermore, certain people might have received a diagnosis of abuse that was not recorded in the dataset, and these people might have been misclassified as not meeting the case definition or as meeting the case definition at a different time than they actually did. Our dataset did include comprehensive coverage of all encounters in emergency departments in the state. As described above, the emergency department is where abused patients are most often encountered,
Our case definition includes codes highly specific for abuse, assault, and intentional injury. As with all real world data, however, some visits might have been miscoded. Such omissions and inaccuracies in the data might reduce the performance of the model, but the demonstration of the utility of this approach using real world data has the potential to catalyse additional efforts in generating accurate diagnostic coding for each care episode.
Depending on the case definition used and the desired levels of specificity, the model can yield low to moderate positive predictive values (up to 14.4% for the narrow case definition and 18.9% for the broader case definition, see table 3). This is to be expected with conditions having a low prevalence (in the present case, 1.04% with the narrow definition and 3.44% with the broad definition), as, for a given sensitivity and specificity, the positive predictive value falls as the prevalence of the condition being detected falls. These levels could be clinically useful in settings where the model is being used to identify patients for whom standard screening should be performed, especially when screening rates in practice remain below desired levels.
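The dependence of positive predictive value on prevalence follows from Bayes' rule: PPV = (sensitivity × prevalence) / (sensitivity × prevalence + (1 − specificity) × (1 − prevalence)). The short sketch below evaluates this expression. The whole-cohort prevalence is used as an approximation, so the result is not expected to match the measured values in table 3 exactly, which were obtained on the testing set.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Narrow definition at 99.9% specificity, using whole-cohort prevalence.
print(round(ppv(0.018, 0.999, 0.0104), 3))  # 0.159

# At fixed sensitivity and specificity, PPV rises with prevalence.
print(ppv(0.5, 0.9, 0.0344) > ppv(0.5, 0.9, 0.0104))  # True
```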
We focused on predicting the risk of future diagnoses of abuse, and the model is trained on patients who have been diagnosed in a clinical setting. Potential differences between cases of abuse that typically get diagnosed versus those cases that typically do not get diagnosed might serve as an important bias and might hinder the model’s ability to detect the latter. As mentioned above, however, domestic abuse often goes undiagnosed or is diagnosed only after considerable delay. Given the current high levels of underdiagnosis, it is likely that use of the model in a clinical setting would lead to the detection of some of the cases that are currently not typically diagnosed. The effect of implementing such a model in clinical practice is an important empirical question for future research.
Differences in care and coding practices might affect the generalisability of models from one health environment to another. We therefore recommend the training of a specific model for each healthcare environment. We expect the modelling approach to be generalisable to other settings inside and outside the US, as the minimal set of data elements (ICD-9 codes, dates of visits) used by the model are commonly stored throughout many countries with electronic medical record systems or claims systems. In countries that do not yet have electronic medical record systems, these models would be difficult to implement, though with time, electronic medical record systems are being deployed more widely throughout the world.
Our goal was to predict a patient’s risk of receiving a future diagnosis of abuse, based on the patient’s longitudinal diagnostic record to date. This prediction can help care givers to identify individuals who fall into either of two categories: those who may be currently experiencing abuse but have yet to be diagnosed and those who are not yet experiencing abuse but are at a high risk of being abused in the future. Currently, the model does not differentiate between these two types, though this is an important area for future research, as such a differentiation might enable explicit attempts to estimate time to event.
Further aspects are worthy of future study. Currently, the risk associated with each diagnosis is modelled separately. More complex models can be developed to explicitly incorporate the relations between multiple diagnostic codes—for example, the presence of diagnosis A together with diagnosis B might be more or less predictive of abuse risk than the combination of the individual risks of A or B alone.
While the present analysis relied on claims data, the structured information and text available in more comprehensive electronic health information systems can provide a richer substrate for future intelligent history models. Explicitly modelling temporality, such as the order in which visits occurred and the intervals of time between certain diagnoses, might further improve performance.
With proper integration into the clinical workflow, the intelligent history could aid the already overloaded clinician in identifying high risk patients who warrant further in-depth screening. Such screening must always take place in the context of proper training for physicians in handling abuse and an environment that offers appropriate resources and referrals for abused patients.
Potential next steps towards the development of an early warning system for clinicians would include automation of the intelligent history as a service-oriented tool and rigorous design work on the human interface to refine and test the numerical and visual presentation. The approach would work as follows. A patient’s longitudinal medical history accumulates over time inside an electronic health record system. Whenever new information is recorded for the patient, the intelligent histories model re-analyses the information accumulated to date to estimate the patient’s risk of receiving a future diagnosis of abuse. The patient’s physician is notified if the patient is at high risk of abuse. The physician uses the visualisation to quickly review the patient’s past diagnoses and identify important long term trends in the patient’s history. The risk estimate, together with the high level view of the patient’s diagnostic history, enables the physician to make a better informed decision about whether to proceed with further screening of the patient. In this way, the intelligent histories model could improve screening by helping physicians to identify high risk patients who might otherwise be missed.
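The workflow described above could be prototyped as a small monitoring component that re-scores a patient whenever new diagnoses are recorded and notifies the physician when the score first exceeds a threshold. This is a hypothetical sketch only: the class name `RiskMonitor`, the callback interface, and the choice to alert once per patient are assumptions, not features of an existing system.

```python
from collections import defaultdict


class RiskMonitor:
    """Re-scores a patient on each new visit and alerts once per patient."""

    def __init__(self, partial_scores, threshold, notify):
        self.partial_scores = partial_scores
        self.threshold = threshold
        self.notify = notify              # callback taking (patient_id, score)
        self.history = defaultdict(set)   # patient_id -> diagnoses to date
        self.alerted = set()

    def record_visit(self, patient_id, codes):
        """Add new diagnoses, recompute the overall score, alert if needed."""
        self.history[patient_id].update(codes)
        score = sum(self.partial_scores.get(c, 0.0)
                    for c in self.history[patient_id])
        if score > self.threshold and patient_id not in self.alerted:
            self.alerted.add(patient_id)
            self.notify(patient_id, score)
        return score


# Hypothetical scores and visits, for illustration only.
alerts = []
monitor = RiskMonitor({"A": 1.0, "B": 0.75, "C": -0.5}, threshold=1.5,
                      notify=lambda pid, s: alerts.append((pid, s)))
monitor.record_visit("p1", ["A"])   # score 1.0, below threshold
monitor.record_visit("p1", ["B"])   # score 1.75, alert raised
monitor.record_visit("p1", ["C"])   # score 1.25, no second alert
print(alerts)                       # [('p1', 1.75)]
```

In a real deployment the callback would feed the alert and the risk visualisation into the clinical workflow rather than append to a list.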
In conclusion, our findings suggest that the vast quantities of longitudinal data accumulating in electronic health information systems present an untapped opportunity for improving medical screening and diagnosis. In addition to the direct implications for prediction of risk of abuse, the general modelling framework presented here has far reaching potential implications for automated screening of other clinical conditions where longitudinal historical information can be useful for estimating clinical risk.
Domestic violence is a dangerous condition that is difficult to detect, and screening rates are low
Diagnostic histories might be useful in identifying patients who are at high risk of abuse, but physicians typically do not have time to thoroughly review this information during the course of a clinical visit
Longitudinal medical information commonly available in electronic health systems can be useful for predicting the risk of a patient receiving a future diagnosis of abuse
The Bayesian models used can serve as the basis for a future early warning system that could help doctors to identify high risk patients for further screening
We thank Karen Olson for preparing the dataset for analysis.
Contributors: BYR designed the study, developed the models, analysed the results, wrote the manuscript, and is guarantor. ISK contributed to study design and writing the manuscript and advised on clinical issues. KDM contributed to study design and writing the manuscript and advised on clinical issues.
Funding: This work was supported by the US Centers for Disease Control and Prevention (grant R01 PH000040) and the National Library of Medicine (grants R01 LM009879, R01 LM007677, and G08LM009778). The funders had no involvement with the research.
Statement of independence of researchers from funders: The authors and the research are completely independent of the funders.
Competing interests: None declared.
Ethical approval: This study was approved by the institutional review board.