Automating Data Clearning, Merging, Processing, and Visualization in Real Time


This was a poster presentation covering work performed for Virginia Tech during the pandemic as part of their effort to combat Covid-19 outbreaks on campus.


During the COVID-19 pandemic, Virginia Tech tried to be proactive instead of reactive in regard to outbreaks on campus. Using wastewater collected from dormitory outflow locations, models were fit with the hope that upticks in COVID-19 infection could be predicted early enough that resources such as extra tests could be allocated intelligently, instead of randomly. In order to accomplish this, various data streams such as dormitory swipe card data, wastewater test results, and university isolation and quarantine information, all needed to be kept up to date so that decision-makers had access to the most recent visualizations and predictions at any moment. To this end, the entire process of data acquisition, cleaning, merging of data streams, processing, modeling, and visualization was automated so that no human interaction was needed daily. In addition, this all needed to be done in a way that was HIPAA compliant, since student health records were an important part of the modeling process. This presentation focuses on the steps taken to achieve the goal of complete automation, from automatically collecting new data when streams were updated, to providing updated visualizations and model results to decision-makers in the form of an R Shiny app whenever they needed it.