SZ3: A Modular Framework for Composing Prediction-Based Error-Bounded Lossy Compressors

AI-generated keywords: Scientific simulations

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Scientific simulations generate massive amounts of data, requiring data reduction.
  • Error-bounded lossy compression is an effective solution.
  • Customization and optimization are needed for the best-fit compression method.
  • SZ3 is a modular framework for composing prediction-based error-bounded compressors.
  • SZ3 offers easy integration of new compression modules.
  • It supports multialgorithm predictors and selects the most suitable predictor for each data block.
  • Users can compose different compression pipelines on demand.
  • SZ3 achieved up to a 20% improvement in compression ratios compared to state-of-the-art approaches while maintaining the same level of data distortion.
  • The framework addresses the data-volume challenges of large-scale scientific simulations.
  • It improves compression quality, performance, and flexibility.

Authors: Xin Liang, Kai Zhao, Sheng Di, Sihuan Li, Robert Underwood, Ali M. Gok, Jiannan Tian, Junjing Deng, Jon C. Calhoun, Dingwen Tao, Zizhong Chen, Franck Cappello

13 pages

Abstract: Today's scientific simulations require a significant reduction of data volume because of the extremely large amounts of data they produce and the limited I/O bandwidth and storage space. Error-bounded lossy compression has been considered one of the most effective solutions to this problem. In practice, however, the best-fit compression method often needs to be customized or optimized, in particular because of the diverse characteristics of different datasets and the various user requirements on compression quality and performance. In this paper, we develop a novel modular, composable compression framework (namely SZ3), which involves three significant contributions. (1) SZ3 features a modular abstraction for the prediction-based compression framework such that new compression modules can be plugged in easily. (2) SZ3 supports multialgorithm predictors and can automatically select the best-fit predictor for each data block based on the designed error estimation criterion. (3) SZ3 allows users to easily compose different compression pipelines on demand, such that both compression quality and performance can be significantly improved for their specific datasets and requirements. In addition, we evaluate several lossy compressors composed from SZ3 using real-world datasets. Specifically, we leverage SZ3 to improve the compression quality and performance for different use cases, including the GAMESS quantum chemistry dataset and the Advanced Photon Source (APS) instrument dataset. Experiments show that our customized compression pipelines lead to up to 20% improvement in compression ratios under the same data distortion compared with the state-of-the-art approaches.
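
To make "prediction-based error-bounded" concrete, here is a minimal Python sketch of the general idea: predict each value from already-reconstructed data, then quantize the residual in steps of twice the error bound. This is an illustration only, not SZ3's code or API. The function names, the previous-value predictor, and the omission of entropy coding are all assumptions made for brevity.

```python
def compress(data, eb):
    """Return integer quantization codes for `data` under absolute error bound `eb`.

    Hypothetical helper for illustration; not part of SZ3.
    """
    codes = []
    prev = 0.0  # last *reconstructed* value, shared state with the decompressor
    for x in data:
        pred = prev  # assumed predictor: repeat the previous reconstructed value
        # Linear-scaling quantization: bins of width 2*eb centred on the prediction.
        code = round((x - pred) / (2 * eb))
        codes.append(code)
        # Track what the decompressor will reconstruct so predictions stay in sync.
        prev = pred + 2 * eb * code
    return codes


def decompress(codes, eb):
    """Rebuild values from codes; each differs from the original by at most
    eb, up to floating-point rounding."""
    out = []
    prev = 0.0
    for code in codes:
        prev = prev + 2 * eb * code
        out.append(prev)
    return out
```

The codes are small integers when the predictor is accurate, which is what makes a later lossless coding stage effective. A production compressor would also handle values whose codes fall outside a fixed range; that case is left out here.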

Submitted to arXiv on 04 Nov. 2021

Results of the summarizing process for the arXiv paper: 2111.02925v2

This paper's license does not allow us to build upon its content, so the summary below was generated from the paper's metadata rather than the full article.

Today's scientific simulations generate massive amounts of data, necessitating the reduction of data volume due to limited I/O bandwidth and storage space. One effective solution to this problem is error-bounded lossy compression. However, finding the best-fit compression method often requires customization and optimization based on diverse dataset characteristics and user requirements for compression quality and performance. To address these challenges, this paper introduces a novel modular framework called SZ3 for composing prediction-based error-bounded compressors.

SZ3 offers three significant contributions. Firstly, it features a modular abstraction that allows easy integration of new compression modules into the framework. Secondly, SZ3 supports multialgorithm predictors and automatically selects the most suitable predictor for each data block using a designed error estimation criterion. Lastly, SZ3 enables users to compose different compression pipelines on demand, enhancing both compression quality and performance according to specific dataset requirements.

The authors evaluate several lossy compressors created with SZ3 using real-world datasets, including the GAMESS quantum chemistry dataset and the Advanced Photon Source (APS) instrument dataset. The experiments demonstrate that the customized compression pipelines achieved up to a 20% improvement in compression ratios compared to state-of-the-art approaches while maintaining the same level of data distortion.

Overall, this paper presents an innovative approach to the data-volume challenges of large-scale scientific simulations through a modular and composable compression framework. The proposed framework not only improves compression quality and performance but also provides flexibility in adapting to diverse datasets and user requirements.
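
The per-block predictor selection described above can be sketched in Python as follows. This is a hedged illustration, not SZ3's implementation: the two predictors, the block size, the function names, and the selection cost (sum of absolute prediction errors on the original values) are assumptions, since the metadata does not describe the paper's actual error estimation criterion.

```python
def predict_prev(hist):
    # Order-1 predictor: repeat the last reconstructed value.
    return hist[-1] if hist else 0.0


def predict_linear(hist):
    # Order-2 predictor: extrapolate a line through the last two values.
    if len(hist) < 2:
        return predict_prev(hist)
    return 2 * hist[-1] - hist[-2]


# Pluggable list of predictors; adding a module means appending a function.
PREDICTORS = [predict_prev, predict_linear]


def estimate_error(predictor, block, context):
    # Assumed cost: total absolute prediction error, estimated on original values.
    h = list(context)
    total = 0.0
    for x in block:
        total += abs(x - predictor(h))
        h.append(x)
    return total


def compress_blocks(data, eb, block_size=8):
    hist = []    # reconstructed values, mirrored by the decompressor
    blocks = []  # one (predictor index, quantization codes) pair per block
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        context = hist[-2:]
        # Pick the predictor with the lowest estimated error for this block.
        best = min(range(len(PREDICTORS)),
                   key=lambda i: estimate_error(PREDICTORS[i], block, context))
        codes = []
        for x in block:
            pred = PREDICTORS[best](hist)
            code = round((x - pred) / (2 * eb))  # bins of width 2*eb
            codes.append(code)
            hist.append(pred + 2 * eb * code)
        blocks.append((best, codes))
    return blocks


def decompress_blocks(blocks, eb):
    hist = []
    for best, codes in blocks:
        for code in codes:
            hist.append(PREDICTORS[best](hist) + 2 * eb * code)
    return hist
```

Storing the chosen predictor index with each block lets the decompressor replay the same predictions, so the error bound holds regardless of which predictor was selected.
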
Created on 10 Feb. 2024
