clusterBMA: Bayesian model averaging for clustering

AI-generated keywords: Bayesian model averaging, unsupervised clustering, ensemble clustering, probabilistic interpretation, model-based uncertainty

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • The paper introduces clusterBMA, a method for weighted model averaging across multiple unsupervised clustering algorithms.
  • Bayesian model averaging (BMA) is proposed to combine results from multiple models, providing a probabilistic interpretation of the combined cluster structure and quantifying model-based uncertainty.
  • Internal validation criteria are used to approximate posterior model probabilities for weighting results from each model.
  • A consensus matrix is constructed to represent a weighted average of clustering solutions across models, with symmetric simplex matrix factorization applied to calculate final probabilistic cluster allocations.
  • The method outperforms other ensemble clustering methods on simulated data and offers unique features such as probabilistic allocation to averaged clusters, combining 'hard' and 'soft' clustering algorithms, and measuring model-based uncertainty in averaged cluster allocation.
  • The method is implemented in an accompanying R package of the same name, clusterBMA.
  • Overall, the paper provides a comprehensive framework for combining inference across multiple sets of results for unsupervised clustering through Bayesian model averaging, addressing uncertainties associated with model selection in ensemble clustering literature.
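The weighting and averaging steps described in the key points can be sketched as follows. This is a minimal illustration in plain Python, not the clusterBMA package's API: the function names, the example allocations, and the validation scores are all hypothetical, and the similarity matrix is formed as the product of an allocation matrix with its transpose, which is one common choice assumed here. The final symmetric simplex matrix factorization step is not shown.

```python
def one_hot(labels):
    """Turn hard cluster labels into an N x K allocation matrix of 0/1 rows."""
    clusters = sorted(set(labels))
    return [[1.0 if lab == c else 0.0 for c in clusters] for lab in labels]


def similarity_from_allocations(alloc):
    """N x N similarity matrix S = A A^T for an N x K allocation matrix A.

    With one-hot rows, S[i][j] is 1 when points i and j share a cluster.
    With probabilistic rows, S[i][j] is the dot product of the two rows.
    """
    n = len(alloc)
    return [
        [sum(a * b for a, b in zip(alloc[i], alloc[j])) for j in range(n)]
        for i in range(n)
    ]


def normalise(scores):
    """Rescale non-negative scores so they sum to 1 and can act as weights."""
    total = sum(scores)
    return [s / total for s in scores]


def consensus_matrix(similarities, weights):
    """Element-wise weighted average of the per-model similarity matrices."""
    n = len(similarities[0])
    return [
        [sum(w * s[i][j] for w, s in zip(weights, similarities)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical results for four data points from two clustering models.
hard_labels = [0, 0, 1, 1]                       # a 'hard' algorithm
soft_alloc = [[0.9, 0.1], [0.6, 0.4],            # a 'soft' algorithm
              [0.2, 0.8], [0.1, 0.9]]

# Hypothetical validation-based scores standing in for the approximate
# posterior model probabilities; real values would come from internal
# validation criteria computed on each model's solution.
weights = normalise([3.0, 1.0])

similarities = [
    similarity_from_allocations(one_hot(hard_labels)),
    similarity_from_allocations(soft_alloc),
]
consensus = consensus_matrix(similarities, weights)
```

Because both hard and soft results are expressed as allocation matrices before averaging, the same code path handles either kind of algorithm, which mirrors the feature highlighted in the key points.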

Authors: Owen Forbes, Edgar Santos-Fernandez, Paul Pao-Yen Wu, Hong-Bo Xie, Paul E. Schwenn, Jim Lagopoulos, Lia Mills, Dashiell D. Sacks, Daniel F. Hermens, Kerrie Mengersen

License: CC BY-NC-ND 4.0

Abstract: Various methods have been developed to combine inference across multiple sets of results for unsupervised clustering, within the ensemble clustering literature. The approach of reporting results from one 'best' model out of several candidate clustering models generally ignores the uncertainty that arises from model selection, and results in inferences that are sensitive to the particular model and parameters chosen. Bayesian model averaging (BMA) is a popular approach for combining results across multiple models that offers some attractive benefits in this setting, including probabilistic interpretation of the combined cluster structure and quantification of model-based uncertainty. In this work we introduce clusterBMA, a method that enables weighted model averaging across results from multiple unsupervised clustering algorithms. We use clustering internal validation criteria to develop an approximation of the posterior model probability, used for weighting the results from each model. From a consensus matrix representing a weighted average of the clustering solutions across models, we apply symmetric simplex matrix factorisation to calculate final probabilistic cluster allocations. In addition to outperforming other ensemble clustering methods on simulated data, clusterBMA offers unique features including probabilistic allocation to averaged clusters, combining allocation probabilities from 'hard' and 'soft' clustering algorithms, and measuring model-based uncertainty in averaged cluster allocation. This method is implemented in an accompanying R package of the same name.
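For reference, the general BMA identity that this kind of weighted averaging builds on can be written as below. The symbols are notation chosen here for illustration and are not taken from the paper: $\Delta$ is a quantity of interest, $y$ the data, and $\mathcal{M}_1, \dots, \mathcal{M}_M$ the candidate models.

```latex
p(\Delta \mid y) \;=\; \sum_{m=1}^{M} p(\Delta \mid \mathcal{M}_m, y)\, p(\mathcal{M}_m \mid y),
\qquad
\sum_{m=1}^{M} p(\mathcal{M}_m \mid y) = 1 .
```

In the setting the abstract describes, the posterior model probabilities $p(\mathcal{M}_m \mid y)$ are not computed exactly but approximated from clustering internal validation criteria, and they serve as the weights applied to each model's results.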

Submitted to arXiv on 09 Sep. 2022


Results of the summarizing process for the arXiv paper: 2209.04117v2

This paper's license doesn't allow us to build upon its content, so the summary below is generated from the paper's metadata rather than the full article.

The paper "clusterBMA: Bayesian model averaging for clustering" by Owen Forbes, Edgar Santos-Fernandez, Paul Pao-Yen Wu, Hong-Bo Xie, Paul E. Schwenn, Jim Lagopoulos, Lia Mills, Dashiell D. Sacks, Daniel F. Hermens, and Kerrie Mengersen introduces a method for weighted model averaging across results from multiple unsupervised clustering algorithms. The traditional approach of selecting the 'best' model out of several candidate clustering models often overlooks the uncertainty arising from model selection, which makes the resulting inferences sensitive to the specific model and parameters chosen.

Bayesian model averaging (BMA) is proposed as a way to combine results across multiple models, providing a probabilistic interpretation of the combined cluster structure and quantifying model-based uncertainty. The method uses internal validation criteria to approximate posterior model probabilities, which weight the results from each model.

From a consensus matrix that represents a weighted average of clustering solutions across models, symmetric simplex matrix factorization is applied to calculate final probabilistic cluster allocations. Notably, clusterBMA outperforms other ensemble clustering methods on simulated data and offers unique features such as probabilistic allocation to averaged clusters, combining allocation probabilities from both 'hard' and 'soft' clustering algorithms, and measuring model-based uncertainty in averaged cluster allocation.

The method is implemented in an accompanying R package of the same name, clusterBMA. Overall, the paper provides a framework for combining inference across multiple sets of results for unsupervised clustering through Bayesian model averaging, addressing the model-selection uncertainty that the ensemble clustering literature has often left unquantified.
Created on 30 Jan. 2025
