Background: Influenza virus causes a yearly epidemic in much of the world. To better predict short-term, seasonal variations in flu infection rates and to explore possible mechanisms of year-to-year variation in infection, we trained a Long Short-Term Memory (LSTM)-based deep neural network on historical Influenza-Like-Illness (ILI), climate, and population data. Methods: Data were collected from the Centers for Disease Control and Prevention (CDC), the National Centers for Environmental Information (NCEI), and the United States Census Bureau. The model was initially built in Python using the Keras API and tuned manually. We explored the roles of temperature, precipitation, local wind speed, population size, vaccination rate, and vaccination efficacy. The model was validated using k-fold cross-validation as well as forward-chaining cross-validation and compared to several standard algorithms. Finally, simulation data were generated in R and used for further exploration of the model. Results: Temperature was the strongest predictor of ILI rates, and adding precipitation further increased the predictive power of the network. The proposed model achieved a +1 week prediction mean absolute error (MAE) of 0.1973, less than half of the MAE achieved by the next-best-performing algorithm. The model also accurately predicted simulation data. To test the role of temperature in the network, we phase-shifted temperature in time and found a predictable reduction in prediction accuracy. Conclusions: The results of this study suggest that short-term flu forecasting may be effectively accomplished using architectures traditionally reserved for time series analysis. The proposed LSTM-based model outperformed comparison models at the +1 week time point. Additionally, this model provided insight into the week-to-week effects of climatic and biotic factors and revealed potential patterns in data series.
Specifically, we found that temperature is the strongest predictor of seasonal flu infection rates. This information may prove to be especially important for flu forecasting given the uncertain long-term impact of the SARS-CoV-2 pandemic on seasonal influenza.
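For readers unfamiliar with the validation scheme named in the Methods, the following is a minimal sketch of forward-chaining cross-validation scored by MAE for a +1 week forecast. It is not the study's code: the function names, the fold sizes, and the persistence (last-value) forecaster standing in for the LSTM are all assumptions made for illustration.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

# A "predictor" maps the observed history to a forecast for the next week.
Predictor = Callable[[Sequence[float]], float]


def forward_chaining_splits(
    n_samples: int, n_folds: int, min_train: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges where each test block follows its training data in time."""
    if n_folds < 1 or min_train < 1 or min_train >= n_samples:
        raise ValueError("need n_folds >= 1 and 1 <= min_train < n_samples")
    test_size = (n_samples - min_train) // n_folds
    if test_size < 1:
        raise ValueError("not enough samples for the requested number of folds")
    for k in range(n_folds):
        train_end = min_train + k * test_size
        yield range(0, train_end), range(train_end, train_end + test_size)


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_persistence(train_values: Sequence[float]) -> Predictor:
    """Hypothetical stand-in model: forecast next week as the last observed value.

    A trainable model (for example an LSTM) would use train_values to fit its weights.
    """
    return lambda history: history[-1]


def evaluate_one_week_ahead(
    series: Sequence[float],
    fit: Callable[[Sequence[float]], Predictor],
    n_folds: int = 3,
    min_train: int = 4,
) -> List[float]:
    """Return the +1 week MAE of each forward-chaining fold."""
    fold_maes = []
    for train, test in forward_chaining_splits(len(series), n_folds, min_train):
        predictor = fit(series[train.start:train.stop])  # fit on past data only
        # Forecast each test week from everything observed before it.
        preds = [predictor(series[:t]) for t in test]
        actual = [series[t] for t in test]
        fold_maes.append(mean_absolute_error(actual, preds))
    return fold_maes


if __name__ == "__main__":
    weekly_ili = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]  # made-up values
    print(evaluate_one_week_ahead(weekly_ili, fit_persistence))
```

Unlike shuffled k-fold splits, every fold here trains only on weeks that precede the weeks being predicted, so no future information leaks into training.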
All Science Journal Classification (ASJC) codes
- Public Health, Environmental and Occupational Health
- Machine learning