Abstract
Presentation Description:
In this talk, we consider whether Big Data are really a revolution or simply hype. Big Data present exciting opportunities to better understand risk factors, build improved predictors, and elucidate causal relationships. Along with inspiring new computing and statistical tools, the continuing explosion of diverse data generates new questions about the demographic, clinical, contextual, and social drivers of health and well-being. Yet an association between two variables can arise from many sources: direct effects, indirect effects, measured confounding, unmeasured confounding, and selection bias. Methods to distinguish causation from correlation are perhaps more pressing now than ever. In this Big Data era, we highlight the importance of the Causal Roadmap to ensure that the parameters being estimated match the questions posed, to explicitly state and evaluate the assumptions, and to harness recent advances in machine learning. In an application to HIV prevention in East Africa, we illustrate how the common challenges of differential measurement and censoring can still bias our estimates even with a sample size of n=77,774 adults.
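As a rough illustration of the last point, the following Python sketch simulates a complete-case analysis under differential outcome measurement. It is not the study's data or analysis: the function name, the 50% exposure probability, the 10% outcome risk, and the measurement probabilities are all hypothetical values chosen for illustration. The exposure has no effect on the outcome by construction, so the true risk difference is zero, yet the estimate among measured participants does not move toward zero as n grows.

```python
import random


def complete_case_risk_difference(n, seed=0):
    """Simulate n adults and return the risk difference among those measured.

    Hypothetical setup (all values are assumptions for illustration):
    exposure A and outcome Y are independent, so the true risk difference is 0,
    but exposed adults who have the outcome are more likely to be measured.
    """
    rng = random.Random(seed)
    # arm -> [number measured, number measured with the outcome]
    counts = {0: [0, 0], 1: [0, 0]}
    for _ in range(n):
        a = 1 if rng.random() < 0.5 else 0    # exposure, assumed 50%
        y = 1 if rng.random() < 0.10 else 0   # outcome, assumed 10% risk in both arms
        # Differential measurement: depends on both exposure and outcome.
        p_measured = 0.9 if (a == 1 and y == 1) else 0.6
        if rng.random() < p_measured:
            counts[a][0] += 1
            counts[a][1] += y
    risk = {a: counts[a][1] / counts[a][0] for a in (0, 1)}
    return risk[1] - risk[0]


if __name__ == "__main__":
    # A larger sample narrows the confidence interval but leaves the bias in place.
    for n in (1_000, 77_774):
        print(n, round(complete_case_risk_difference(n), 3))
```

Under these assumed probabilities, the complete-case estimate converges to roughly 0.04 rather than 0. A larger sample only makes the biased estimate more precise.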
This abstract was presented at the 2019 ARVO Annual Meeting, held in Vancouver, Canada, April 28 - May 2, 2019.