A Cluster-Aggregate-Pool (CAP) Ensemble Algorithm for Improved Forecast Performance of influenza-like illness

Authors: Ningxi Wei, Xinze Zhou, Wei-Min Huang, Thomas McAndrew

License: CC BY 4.0

Abstract: Seasonal influenza causes on average 425,000 hospitalizations and 32,000 deaths per year in the United States. Forecasts of influenza-like illness (ILI), a surrogate for the proportion of patients infected with influenza, support public health decision making. The goal of an ensemble forecast of ILI is to improve accuracy and calibration compared to individual forecasts and to provide a single, cohesive prediction of future influenza. However, an ensemble may be composed of models that produce similar forecasts, which can harm ensemble forecast performance and lead to non-identifiability. To address these issues we propose a novel Cluster-Aggregate-Pool, or "CAP", ensemble algorithm that first clusters individual forecasts, then aggregates the individual models that belong to the same cluster into a single forecast (called a cluster forecast), and finally combines the cluster forecasts via a linear pool. We find that a CAP ensemble improves calibration by approximately 10% compared to non-CAP alternatives while maintaining similar accuracy. In addition, our CAP algorithm (i) generalizes past ensemble work associated with influenza forecasting and introduces a framework for future ensemble work, (ii) automatically accounts for missing forecasts from individual models, (iii) allows public health officials to participate in the ensemble by assigning individual models to clusters, and (iv) provides an additional signal about when peak influenza may be near.
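The three CAP steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the clustering rule, the aggregation rule, or the pooling weights. The sketch assumes forecasts are probability vectors over a shared set of ILI bins, and it swaps in a greedy total-variation threshold for clustering and equal weights for both the aggregate and pool steps. All function names and the `threshold` parameter are hypothetical.

```python
from typing import Dict, List, Optional, Sequence

Forecast = List[float]  # probabilities over a shared set of ILI bins (assumption)


def total_variation(p: Sequence[float], q: Sequence[float]) -> float:
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def cluster(forecasts: Dict[str, Forecast], threshold: float) -> List[List[str]]:
    """Cluster step (assumed rule): a model joins the first cluster whose
    first member is within `threshold`; otherwise it starts a new cluster."""
    clusters: List[List[str]] = []
    for name, dist in forecasts.items():
        for members in clusters:
            if total_variation(dist, forecasts[members[0]]) <= threshold:
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters


def linear_pool(dists: Sequence[Forecast],
                weights: Optional[Sequence[float]] = None) -> Forecast:
    """Weighted average of distributions; equal weights unless given."""
    if weights is None:
        weights = [1.0 / len(dists)] * len(dists)
    n_bins = len(dists[0])
    return [sum(w * d[i] for w, d in zip(weights, dists)) for i in range(n_bins)]


def cap_ensemble(forecasts: Dict[str, Optional[Forecast]],
                 threshold: float = 0.2) -> Forecast:
    """Cluster, aggregate within clusters, then pool the cluster forecasts."""
    # Models with no forecast this week (None) are skipped.
    present = {k: v for k, v in forecasts.items() if v is not None}
    if not present:
        raise ValueError("at least one model must supply a forecast")
    clusters = cluster(present, threshold)
    # Aggregate step: one equally weighted forecast per cluster.
    cluster_forecasts = [linear_pool([present[m] for m in members])
                         for members in clusters]
    # Pool step: linear pool across clusters (equal weights here).
    return linear_pool(cluster_forecasts)
```

For example, with `model_a = [0.7, 0.2, 0.1]`, `model_b = [0.6, 0.3, 0.1]`, `model_c = [0.1, 0.2, 0.7]`, and a missing `model_d`, the two similar models fall into one cluster and `cap_ensemble` returns approximately `[0.375, 0.225, 0.4]`. An equally weighted pool of the three individual models would instead give the near-duplicate pair two thirds of the total weight. Replacing `cluster` with a manual assignment of models to clusters would correspond to point (iii) of the abstract.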

Submitted to arXiv on 29 Dec. 2023
