New public health threats and recent advances in the capture and transmission of electronic health data are transforming public health surveillance. Public health agencies now have access to large volumes of real-time data from clinical and other settings. While these data offer great potential to identify emerging public health threats, their large volume and non-specific nature pose new challenges for analysis. To take advantage of these novel data sources, many public health agencies operate surveillance systems that monitor data automatically in real time. As adoption of these systems increases, understanding the determinants of their effectiveness is essential, and a principled means of selecting and evaluating aberrancy detection algorithms is needed. Our central hypothesis is that, through empirical experimentation, we can discover fundamental knowledge about aberrancy detection that will guide the process of selecting algorithms for use in automated surveillance systems. This evidence will allow system developers to select an appropriate algorithm, given a data stream and a surveillance goal.
The goals of this project are to develop fundamental knowledge about the performance of aberrancy detection algorithms used in public health surveillance and to develop methodologies and tools that use this knowledge to automatically select and apply surveillance algorithms. We have developed a software system called BioSTORM (Biological Spatio-Temporal Outbreak Reasoning Module) that allows aberrancy detection algorithms to be specified and deployed for both evaluation and on-line surveillance. BioSTORM enables concurrent application and high-throughput evaluation of aberrancy detection algorithms.