PFT-SSR: Parallax Fusion Transformer for Stereo Image Super-Resolution

AI-generated keywords: Stereo Image Super-Resolution, Cross-View Fusion Transformer, Intra-View Refinement Transformer, Parallax Fusion Transformer, State-of-the-Art

AI-generated Key Points

  • Stereo image super-resolution is a technique that enhances the performance of image super-resolution by utilizing additional information provided by binocular systems.
  • Previous methods did not fully utilize cross-view and intra-view information.
  • A Parallax Fusion Transformer (PFT) module was proposed to address this issue, consisting of Cross-view Fusion Transformer (CVFT) and Intra-view Refinement Transformer (IVRT).
  • CVFT fuses features from the left and right images across different parallaxes, while IVRT refines the intra-view features by removing noise and enhancing texture details.
  • PFT was combined with Swin Transformer as the backbone for feature extraction and SR reconstruction to form a pure Transformer architecture called PFT-SSR.
  • Extensive experiments compared PFT-SSR with other state-of-the-art methods such as EDSR, RCAN, StereoSR, PASSRnet, iPASSR, and SSRDE-FNet.
  • The results showed that PFT-SSR outperformed most SOTA methods on various datasets, especially on Flickr1024.
  • An ablation study on the choice of cross-view interaction module showed that PFT fuses stereo information more effectively, and improves model performance more, than Swin Transformer blocks or biPAM.
  • In conclusion, PFT-SSR is a promising approach for stereo image super-resolution that exploits both cross-view and intra-view information through its Parallax Fusion Transformer module.
  • Experiments demonstrated its advantage over most state-of-the-art methods in reconstructing edges and texture details accurately.
  • The source code for PFT-SSR is available on GitHub.
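
The cross-view fusion idea in the key points above can be illustrated with a small sketch. This is not the authors' CVFT implementation (see the GitHub repository for that); it is generic single-head cross-attention in NumPy, under the assumption that the stereo pair is rectified so that corresponding pixels lie on the same image row. The function names, the random projection matrices, and the residual connection are all illustrative choices, not details taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_view_attention(target, source, w_q, w_k, w_v):
    """Fuse `source`-view features into `target`-view features, row by row.

    target, source: (H, W, C) feature maps of the two views.
    w_q, w_k, w_v:  (C, C) projection matrices (hypothetical; random below,
                    learned in a real network).
    Returns the fused (H, W, C) features and the (H, W, W) attention maps.
    """
    q = target @ w_q                      # queries from the target view
    k = source @ w_k                      # keys from the other view
    v = source @ w_v                      # values from the other view
    scale = 1.0 / np.sqrt(q.shape[-1])
    # For each row, score every target column against every source column.
    scores = (q @ k.transpose(0, 2, 1)) * scale   # (H, W, W)
    attn = softmax(scores, axis=-1)
    fused = attn @ v                      # (H, W, C)
    # Residual connection (an assumption): keep the original target features.
    return target + fused, attn

# Toy usage with random features standing in for backbone outputs.
rng = np.random.default_rng(0)
H, W, C = 4, 6, 8
left = rng.standard_normal((H, W, C))
right = rng.standard_normal((H, W, C))
w_q, w_k, w_v = (0.1 * rng.standard_normal((C, C)) for _ in range(3))

left_fused, attn_l = cross_view_attention(left, right, w_q, w_k, w_v)
right_fused, attn_r = cross_view_attention(right, left, w_q, w_k, w_v)
print(left_fused.shape, attn_l.shape)
```

In this sketch each view queries the other, so both the left and right features receive cross-view information. An intra-view refinement step, as IVRT is described above, would then apply self-attention within each fused feature map.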

Authors: Hansheng Guo, Juncheng Li, Guangwei Gao, Zhi Li, Tieyong Zeng

ICASSP 2023
5 pages, 3 figures
License: CC BY 4.0

Abstract: Stereo image super-resolution aims to boost the performance of image super-resolution by exploiting the supplementary information provided by binocular systems. Although previous methods have achieved promising results, they did not fully utilize cross-view and intra-view information. To further unleash the potential of binocular images, in this letter, we propose a novel Transformer-based parallax fusion module called Parallax Fusion Transformer (PFT). PFT employs a Cross-view Fusion Transformer (CVFT) to utilize cross-view information and an Intra-view Refinement Transformer (IVRT) for intra-view feature refinement. Meanwhile, we adopt the Swin Transformer as the backbone for feature extraction and SR reconstruction to form a pure Transformer architecture called PFT-SSR. Extensive experiments and ablation studies show that PFT-SSR achieves competitive results and outperforms most SOTA methods. Source code is available at https://github.com/MIVRC/PFT-PyTorch.

Submitted to arXiv on 24 Mar. 2023


Results of the summarizing process for the arXiv paper: 2303.13807v1

Stereo image super-resolution uses the additional information provided by binocular systems to enhance the performance of image super-resolution. Previous methods have not fully utilized cross-view and intra-view information. To address this issue, the researchers proposed a Parallax Fusion Transformer (PFT) module consisting of two components: a Cross-view Fusion Transformer (CVFT) and an Intra-view Refinement Transformer (IVRT). CVFT fuses features from the left and right images across different parallaxes, while IVRT refines the intra-view features by removing noise and enhancing texture details. The team then combined PFT with a Swin Transformer backbone for feature extraction and SR reconstruction to form a pure Transformer architecture called PFT-SSR. Extensive experiments and ablation studies compared PFT-SSR with other state-of-the-art methods such as EDSR, RCAN, StereoSR, PASSRnet, iPASSR, and SSRDE-FNet. The results showed that PFT-SSR outperformed most SOTA methods on various datasets, especially on Flickr1024. Additionally, an ablation study on the choice of cross-view interaction module showed that PFT fuses stereo information more effectively, and improves model performance more, than Swin Transformer blocks or biPAM. In conclusion, PFT-SSR is a promising approach for stereo image super-resolution that exploits both cross-view and intra-view information through its Parallax Fusion Transformer module. Experiments demonstrated its advantage over most state-of-the-art methods in reconstructing edges and texture details accurately. The source code for PFT-SSR is available on GitHub.
Created on 03 Apr. 2023

