<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" article-type="abstract"><?properties open_access?><front><journal-meta><journal-id journal-id-type="nlm-ta">Online J Public Health Inform</journal-id><journal-id journal-id-type="iso-abbrev">Online J Public Health Inform</journal-id><journal-id journal-id-type="publisher-id">OJPHI</journal-id><journal-title-group><journal-title>Online Journal of Public Health Informatics</journal-title></journal-title-group><issn pub-type="epub">1947-2579</issn><publisher><publisher-name>University of Illinois at Chicago Library</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="pmc">6087926</article-id><article-id pub-id-type="publisher-id">ojphi-10-e13</article-id><article-id pub-id-type="doi">10.5210/ojphi.v10i1.8328</article-id><article-categories><subj-group subj-group-type="heading"><subject>ISDS 2018 Conference Abstracts</subject></subj-group></article-categories><title-group><article-title>Free-Text Mining to Improve Syndrome Definition Matching Across
Emergency Departments</article-title></title-group><contrib-group><contrib contrib-type="author"><name><surname>Arkin</surname><given-names>Kristin</given-names></name><xref ref-type="corresp" rid="cor1">*</xref><xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref><xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref></contrib><aff id="aff1"><label>1</label><institution>Centers for Disease Control and
Prevention</institution>, <addr-line>Atlanta, GA</addr-line>,
<country>USA</country>; </aff><aff id="aff2"><label>2</label><institution>Idaho Division of Public Health,</institution>
<addr-line>Boise, ID</addr-line>, <country>USA</country></aff></contrib-group><author-notes><corresp id="cor1"><label>*</label>Kristin Arkin E-mail: <email xlink:href="Kristinaarkin@gmail.com">Kristinaarkin@gmail.com</email></corresp></author-notes><pub-date pub-type="epub"><day>30</day><month>5</month><year>2018</year></pub-date><pub-date pub-type="collection"><year>2018</year></pub-date><volume>10</volume><issue>1</issue><elocation-id>e13</elocation-id><permissions><license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by-nc/3.0/"><license-p>ISDS Annual Conference Proceedings 2018. This is an Open Access
article distributed under the terms of the Creative Commons
Attribution-Noncommercial 3.0 Unported License (<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by-nc/3.0/">http://creativecommons.org/licenses/by-nc/3.0/</ext-link>), permitting all
non-commercial use, distribution, and reproduction in any medium, provided the
original work is properly cited.</license-p></license></permissions><kwd-group kwd-group-type="author"><title>Keywords </title><kwd>Syndromic</kwd><kwd>Syndrome Definition</kwd><kwd>Free-text Mining</kwd><kwd>ESSENCE</kwd><kwd>ILI</kwd></kwd-group></article-meta></front><body><sec><title>Objective</title><p>We sought to use free text mining tools to improve emergency department (ED) chief
complaint and discharge diagnosis data syndrome definition matching across
facilities with differing robustness of data in the Electronic Surveillance System
for the Early Notification of Community-based Epidemics (ESSENCE) application in
Idaho&#x02019;s syndromic surveillance system.</p></sec><sec sec-type="intro"><title>Introduction</title><p>Standard syndrome definitions for ED visits in ESSENCE rely on chief complaints.
Visits with more words in the chief complaint field are more likely to match
syndrome definitions. While using ESSENCE, we observed geographic differences in
chief complaint length, apparently related to differences in electronic health
record (EHR) systems, which resulted in disparate syndrome matching across Idaho
regions. We hypothesized that chief complaint and diagnosis code co-occurrence among
ED visits to facilities with long chief complaints could help identify terms that
would improve syndrome matching among facilities with short chief complaints.</p></sec><sec sec-type="methods"><title>Methods</title><p>The ESSENCE-defined influenza-like illness (ILI) chief complaint syndrome was used as
the base syndrome for this analysis. Syndrome-matched visits were defined as visits
that matched the syndrome definition. We assessed chief complaint and diagnosis code
co-occurrence among syndrome-matched visits using the R tidytext package (CRAN) and
developed a bigram network from normalized, concatenated chief complaint and
diagnosis code (CCDD) fields and normalized diagnosis code (DD) fields per
previously described methodologies.<sup>1</sup> Common connections were defined by a
natural break in frequency of pair occurrence for CCDD pairs (30 occurrences) and DD
pairs (5 occurrences). The ESSENCE syndrome was revised by adding relevant bigram
network clusters and logic operators. We compared time series of the percent of ED
visits matched to the ESSENCE syndrome with those matched to the revised syndrome.
We stratified the time series by facility group: facilities with short (average &#x0003c; 4 words,
&#x0201c;Group A&#x0201d;) and long (average &#x02265; 4 words, &#x0201c;Group
B&#x0201d;) chief complaint fields (Figure 1). Influenza season start was defined as
two consecutive weeks above baseline (the 95% upper confidence limit of the percent of
syndrome-matched visits outside of the CDC ILI surveillance season). Season trends
and influenza-related deaths in Idaho residents were compared.</p></sec><sec sec-type="results"><title>Results</title><p>During August 1, 2016 through July 31, 2017, 1,587 (1.17%) of 135,789 ED visits
matched the ESSENCE syndrome. Bigram networks of CCDD fields produced clusters
already included in the ESSENCE syndrome. The bigram network of DD fields (Figure 2)
produced six clusters. The revised syndrome definition included the ESSENCE
syndrome, three single DD terms, and three combinations of two DD terms. The start of influenza
season was identified as the same week for both ILI syndrome definitions (ESSENCE
baseline 0.70%; revised baseline 2.21%). The ESSENCE syndrome indicated the season
peaked during Morbidity and Mortality Weekly Report (MMWR) week 2017-05, with the
season ending in MMWR week 2017-14. The revised syndrome indicated MMWR week 2017-20 as the
season end. Multiple peaks seen with the revised syndrome during MMWR weeks 2017-02,
2017-05, and 2017-10 mirrored peaks in influenza-related deaths during MMWR weeks
2017-03, 2017-06, and 2017-11. ILI season onset was five weeks earlier with the
revised syndrome compared with the ESSENCE syndrome in Group A facilities, but
remained the same in Group B. The annual percentage of ED visits related to ILI was
more uniform between facility groups under the revised syndrome than the ESSENCE
syndrome. Unlike the trend seen with the ESSENCE syndrome, the revised syndrome
showed low-level ILI activity in both groups year-round.</p></sec><sec sec-type="conclusions"><title>Conclusions</title><p>In Idaho, dramatic differences in ED visit chief complaint word counts were seen
between facilities. Bigram networks were an important tool for identifying
diagnosis codes and logical operators that, when added to an existing chief complaint
syndrome, produced a more inclusive syndrome definition. Bigram networks may also aid in
understanding the relationship between chief complaints and diagnosis codes in
syndrome-matched visits. Use of trade names and commercial sources is for
identification only and does not imply endorsement by the Centers for Disease
Control and Prevention, the Public Health Service, or the U.S. Department of Health
and Human Services.</p><fig id="f1" fig-type="figure" orientation="portrait" position="float"><label>Figure 1</label><caption><p>Percent of influenza-like illness-related emergency department visits by MMWR
week for the original ESSENCE syndrome (grey) and revised syndrome (blue)
grouped by facilities with short (top) and long (bottom) chief complaint
fields.</p></caption><graphic xlink:href="ojphi-10-e13-g001"/></fig><fig id="f2" fig-type="figure" orientation="portrait" position="float"><label>Figure 2</label><caption><p>A bigram network displaying common diagnosis code pairs for emergency
department visits matched to the ESSENCE influenza-like illness
syndrome.</p></caption><graphic xlink:href="ojphi-10-e13-g002"/></fig></sec></body><back><ref-list><title>References</title><ref id="r1"><label>1</label><mixed-citation publication-type="book">Silge J, Robinson D. (2017). &#x0201c;Text
Mining with R&#x0201d;. O&#x02019;Reilly.</mixed-citation></ref></ref-list></back></article>