Online Program

Using text summarization for surveillance: A case study involving OSHA fatality and catastrophe reports

Monday, November 4, 2013 : 9:15 a.m. - 9:30 a.m.

Robin Ackerman, JD, SM, MSN, Office of the Solicitor, U.S. Department of Labor (listed for affiliation purposes only; I am participating in an unofficial capacity), Boston, MA
Luke Miratrix, PhD, Department of Statistics, Harvard University, Cambridge, MA
Background: OSHA and NIOSH recently released a Hazard Alert on methylene chloride and bathroom refinishing, following the deaths of at least 14 workers since 2000 in related circumstances. Although OSHA maintains a database of narrative reports describing fatalities and catastrophes, similar patterns of preventable exposure to occupational hazards may be difficult to identify efficiently through manual review alone, given the large number of narratives in this database. Using methylene chloride as a case study, we consider whether text mining techniques can help identify important patterns in circumstances of hazardous exposures.

Methods: Using keywords entered by compliance officers to categorize narrative reports, we apply our text algorithms to generate summaries for each category. By manually examining the resulting words and phrases associated with methylene chloride, we consider whether our text summarization tool effectively characterizes the circumstances of the bathroom refinishing fatalities.

Initial and Expected Results: Based on initial results and earlier work using the same algorithms, we expect that our text summarization tools will accurately portray the differences and unique characteristics of each keyword category. This presentation will explore whether these automated summaries are useful for surveillance purposes in particular.

Preliminary Conclusions: While text mining algorithms are unlikely to identify all meaningful patterns in narrative reports, the computational advantages of text summarization tools may complement manual review. The summaries themselves, for example, can serve as a launching point for inquiry by raising potential flags that can direct further investigation.
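
Illustrative sketch: The Methods step described above (grouping narratives by compliance-officer keyword, then summarizing each group by its characteristic words) could look roughly like the Python below. The abstract does not specify the summarization algorithm, so a simple smoothed log-odds score is swapped in here purely for illustration; it is not the authors' method. The function names, the `top_k` and `smoothing` parameters, and the toy narratives are all assumptions invented for this sketch and are not actual OSHA records.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def summarize_categories(reports, top_k=3, smoothing=1.0):
    """Return the top_k most distinctive words for each keyword category.

    reports: list of (keywords, narrative) pairs, where keywords is a list
    of category labels. Each word is scored by a smoothed log-odds ratio of
    its frequency inside the category versus in all remaining text.
    This scoring rule is an assumption, not the method used in the study.
    """
    per_category = {}
    overall = Counter()
    for keywords, narrative in reports:
        tokens = tokenize(narrative)
        overall.update(tokens)
        for kw in set(keywords):
            per_category.setdefault(kw, Counter()).update(tokens)

    vocab_size = len(overall)
    total = sum(overall.values())
    summaries = {}
    for kw, counts in per_category.items():
        in_total = sum(counts.values())
        out_total = total - in_total
        scores = {}
        for word, n_in in counts.items():
            n_out = overall[word] - n_in
            p_in = (n_in + smoothing) / (in_total + smoothing * vocab_size)
            p_out = (n_out + smoothing) / (out_total + smoothing * vocab_size)
            scores[word] = math.log(p_in / p_out)
        # Highest score first; ties are broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        summaries[kw] = ranked[:top_k]
    return summaries


# Invented toy narratives for illustration only (not real report text).
toy_reports = [
    (["methylene chloride"], "worker stripping bathtub collapsed from vapors"),
    (["methylene chloride"], "employee refinishing bathtub overcome by vapors"),
    (["fall"], "worker fell from roof"),
    (["fall"], "employee fell from ladder"),
]
toy_summaries = summarize_categories(toy_reports, top_k=2)
```

On these toy inputs, the words ranked highest for a category are those that appear in its narratives but not elsewhere, which is the kind of flag a reviewer might then follow up by reading the underlying reports manually.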

Learning Areas:

Communication and informatics
Occupational health and safety

Learning Objectives:
Evaluate the utility of automated text summarization for surveillance purposes in the context of OSHA fatality and catastrophe reports.
Discuss whether text summarization algorithms are likely to facilitate the detection of recurring patterns of preventable exposure to occupational hazards.

Keyword(s): Data/Surveillance, OSHA

Presenting author's disclosure statement:

Qualified on the content I am responsible for because: I recently completed a year-long special assignment at OSHA focusing on new uses and analyses of administrative datasets (including fatality/catastrophe reports) for evaluation, targeting, and surveillance. Last year I presented a poster on text mining at a meeting sponsored by OSHA, NIOSH, and BLS. Within the Department of Labor, I have worked as a health scientist, special assistant, and attorney.
Any relevant financial relationships? No

I agree to comply with the American Public Health Association Conflict of Interest and Commercial Support Guidelines, and to disclose to the participants any off-label or experimental uses of a commercial product or service discussed in my presentation.