Computer Algorithm for Automated Work Group Classification from Free Text: The DREAM Technique
-
2007/01/01
-
Details
-
Description:Objective: This study developed and tested a computer method to automatically assign subjects to aggregate work groups based on their free text work descriptions. Methods: The Double Root Extended Automated Matcher (DREAM) algorithm classifies individuals based on pairs of subjects' free text word roots in common with those of standard classification systems and several explicitly defined linkages between term roots and aggregates. Results: DREAM effectively analyzed free text from 5887 participants in a multisite chronic obstructive pulmonary disease prevention study (Lung Health Study). For a test set of 533 cases, DREAM's classifications compared favorably with those of a four-human panel. The humans rated the accuracy of DREAM as good or better in 80% of the test cases. Conclusions: Automated text interpretation is a promising tool for analyzing large data sets for applications in data mining, research, and surveillance. Work descriptive information is most useful when it can link an individual to aggregate entities that have occupational health relevance. Determining the appropriate group requires considerable expertise. This article describes a new method for making such assignments using a computer algorithm to reduce dependence on the limited number of occupational health experts. In addition, computer algorithms foster consistency of assignments. [Description provided by NIOSH]
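Illustration: the Methods sentence above describes matching on pairs of word roots plus explicit root-to-aggregate linkages. The following is a minimal sketch of that general idea, not the published DREAM implementation. The suffix-stripping `root` function, the `REFERENCE` titles, the `LINKS` table, and the scoring rule are all hypothetical stand-ins chosen for illustration; the abstract does not specify any of them.

```python
from itertools import combinations

# Hypothetical toy stemmer: strips a few common suffixes.
# A real system would use a proper stemming algorithm.
SUFFIXES = ("ing", "ers", "er", "ion", "es", "s")

def root(word):
    """Reduce a word to a crude root by removing one known suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def roots_of(text):
    """Set of roots for the alphabetic words in a free-text description."""
    return {root(w) for w in text.split() if w.isalpha()}

def root_pairs(text):
    """All unordered pairs of distinct roots, in a canonical (sorted) order."""
    return set(combinations(sorted(roots_of(text)), 2))

# Hypothetical job titles standing in for a standard classification system.
REFERENCE = {
    "Construction": ["building construction laborers", "construction carpenters"],
    "Health care": ["registered nurses", "nursing aides"],
}

# Hypothetical explicit linkages from a single root to an aggregate group.
LINKS = {"nurs": "Health care"}

def classify(text):
    """Return the best-scoring aggregate group, or None if nothing matches."""
    pairs = root_pairs(text)
    # One point per root pair shared with a reference title in the group.
    scores = {
        group: sum(len(pairs & root_pairs(title)) for title in titles)
        for group, titles in REFERENCE.items()
    }
    # One extra point per explicitly linked root found in the text.
    for r in roots_of(text):
        if r in LINKS:
            scores[LINKS[r]] += 1
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

Under these assumptions, `classify("construction laborer on building sites")` returns `"Construction"` through shared root pairs, `classify("nursing at a hospital")` returns `"Health care"` through the explicit linkage alone, and `classify("retired")` returns `None`.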
-
ISSN:1076-2752
-
Pages in Document:41-49
-
Volume:49
-
Issue:1
-
NIOSHTIC Number:nn:20058524
-
Citation:J Occup Environ Med 2007 Jan; 49(1):41-49
-
Contact Point Address:Philip Harber, MD, MPH, Division of Occupational and Environmental Medicine, David Geffen School of Medicine at UCLA, 10880 Wilshire Boulevard, Suite 1800, Los Angeles, CA 90024
-
Email:pharber@mednet.ucla.edu
-
Federal Fiscal Year:2007
-
Performing Organization:University of California Los Angeles
-
Peer Reviewed:True
-
Start Date:20050701
-
Source Full Name:Journal of Occupational and Environmental Medicine
-
End Date:20270630
-
Main Document Checksum:urn:sha-512:77bf44f27a8a6e58c1df12e1c023caef467ddfb8296c04e0c4f708462eadd047a36e4eb1d23e12301905bc64c7b4cd1beb0029e15dadc415243911ba702c0aa9