Project

Real-Time Data Mining and Perturbation

Complex data, such as neurophysiological recordings, measures of human behavior, and Internet and general network data, are extremely difficult to analyze because they are generated by a high-dimensional set of interacting processes that is itself dynamic. Accordingly, traditional statistical and data analysis methods (clustering, correlation, and so forth) can rarely produce models sophisticated enough to explain the data without fitting noise, demanding astronomically large datasets, or requiring enormous amounts of hand-tuning by skilled analysts. We propose to design and develop a system that continuously generates novel data-modeling hypotheses and evaluates them in real time, testing models of ever-increasing complexity on incoming data.