Online Program

Learning from Big Epidemiology Data: A Practical Approach

Tuesday, November 5, 2013: 4:50 p.m. - 5:10 p.m.

David Dunson, PhD, Department of Statistical Science, Duke University, Durham, NC
In biomedical research it is now standard to collect large amounts of information on the patients under study, which can make the analysis daunting. This is true not only in genomics but also in environmental epidemiology and other settings. For example, we are motivated by data from a birth defects epidemiology study containing information on many different types of defects, occupations, exposures, and covariates. In such settings it remains standard practice to apply logistic regression, but unfortunately standard approaches to model fitting based on maximum likelihood require the number of variables in the logistic regression model to be small. For this reason, epidemiologists typically examine one birth defect type and one exposure type at a time, while also limiting the number of possible confounders included. Even when significance levels are adjusted to guard against false positives, such an independent screening approach has major limitations, and it is clearly preferable to analyze all the variables simultaneously. This can be accomplished using recently developed Bayesian methods for multiway tables, applying a simple Gibbs sampler that can be implemented routinely in R or Matlab without needing to fully understand what is involved "under the hood". This talk is designed to be accessible to epidemiologists and focuses on the practical advantages and issues in applying such "modern" methods, with birth defect applications providing a concrete illustration.

Learning Areas:

Biostatistics, economics

Learning Objectives:
Define 'big data' in epidemiological studies.
Describe a case of 'big data' in birth defects epidemiology.
Explain statistical methods that efficiently handle 'big data' in epidemiological studies.

Presenting author's disclosure statement:

Qualified on the content I am responsible for because: I am qualified to present because I am a Professor of Biostatistics.
Any relevant financial relationships? No

I agree to comply with the American Public Health Association Conflict of Interest and Commercial Support Guidelines, and to disclose to the participants any off-label or experimental uses of a commercial product or service discussed in my presentation.
