I2F: A Unified Image-to-Feature Approach for Domain Adaptive Semantic Segmentation

AI-generated keywords: Semantic Segmentation

AI-generated Key Points

  • Unsupervised domain adaptation (UDA) in semantic segmentation eliminates the need for extensive annotation efforts.
  • Challenges in UDA arise from domain variations in low-level image statistics and high-level contexts, affecting segmentation performance in the target domain.
  • The proposed UDA pipeline integrates image-level and feature-level adaptation techniques for semantic segmentation.
  • Image-level domain shifts are addressed through global photometric alignment and global texture alignment modules.
  • Feature-level domain shifts are handled by performing global manifold alignment of pixel features from both domains onto the source domain's feature manifold.
  • Category centers in the source domain are regularized using a category-oriented triplet loss, while target domain consistency regularization is applied over augmented target images.
  • Experimental results show a significant improvement over previous methods: on the GTA5→Cityscapes task with Deeplab V3+ as the backbone, the method surpasses the previous state of the art by 8%, reaching 58.2% mean Intersection over Union (mIoU).
  • The proposed method outperforms state-of-the-art techniques by effectively addressing both image-level and feature-level adaptations in UDA for semantic segmentation tasks.
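
The mIoU metric quoted in the key points can be illustrated with a minimal NumPy sketch. It assumes integer class maps and an ignore label of 255 (a common convention, not stated in the source); the function name `mean_iou` is hypothetical. Benchmark evaluations typically accumulate the confusion matrix over the whole validation set before averaging, which this single-call version does not do.

```python
import numpy as np

def mean_iou(pred, label, num_classes, ignore_index=255):
    """Mean IoU over the classes that occur in the prediction or the label."""
    pred = np.asarray(pred).ravel()
    label = np.asarray(label).ravel()
    # Drop pixels whose ground truth is the (assumed) ignore label.
    valid = label != ignore_index
    pred, label = pred[valid], label[valid]
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(label * num_classes + pred, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)
    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    # Classes that never occur are left out of the mean.
    present = union > 0
    return float((intersection[present] / union[present]).mean())

score = mean_iou(pred=[0, 1, 1, 1], label=[0, 0, 1, 1], num_classes=2)
```

Here class 0 has IoU 1/2 and class 1 has IoU 2/3, so `score` is their mean, 7/12.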

Authors: Haoyu Ma, Xiangru Lin, Yizhou Yu

To appear in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
License: CC BY 4.0

Abstract: Unsupervised domain adaptation (UDA) for semantic segmentation is a promising task freeing people from heavy annotation work. However, domain discrepancies in low-level image statistics and high-level contexts compromise the segmentation performance over the target domain. A key idea to tackle this problem is to perform both image-level and feature-level adaptation jointly. Unfortunately, there is a lack of such unified approaches for UDA tasks in the existing literature. This paper proposes a novel UDA pipeline for semantic segmentation that unifies image-level and feature-level adaptation. Concretely, for image-level domain shifts, we propose a global photometric alignment module and a global texture alignment module that align images in the source and target domains in terms of image-level properties. For feature-level domain shifts, we perform global manifold alignment by projecting pixel features from both domains onto the feature manifold of the source domain; and we further regularize category centers in the source domain through a category-oriented triplet loss and perform target domain consistency regularization over augmented target domain images. Experimental results demonstrate that our pipeline significantly outperforms previous methods. In the commonly tested GTA5$\rightarrow$Cityscapes task, our proposed method using Deeplab V3+ as the backbone surpasses previous SOTA by 8%, achieving 58.2% in mIoU.
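
The abstract does not spell out how the global photometric alignment module works. As a rough illustration of the general idea of image-level alignment, the sketch below uses classic per-channel histogram matching as a stand-in technique: it remaps a source image so that each channel's value distribution follows that of a reference (target-domain) image. This is an assumption for illustration only, not the paper's module, and the function name `match_histogram` is hypothetical.

```python
import numpy as np

def match_histogram(source, reference):
    """Remap each channel of `source` to follow the value distribution of `reference`.

    Both inputs are arrays of shape (H, W, C); the shapes may differ in H and W.
    """
    source = np.asarray(source)
    reference = np.asarray(reference)
    matched = np.empty(source.shape, dtype=np.float64)
    for c in range(source.shape[-1]):
        src = source[..., c].ravel()
        ref = reference[..., c].ravel()
        # Unique values, their positions in the source, and their counts.
        src_vals, src_idx, src_counts = np.unique(
            src, return_inverse=True, return_counts=True
        )
        ref_vals, ref_counts = np.unique(ref, return_counts=True)
        # Empirical CDFs of both channels.
        src_cdf = np.cumsum(src_counts) / src.size
        ref_cdf = np.cumsum(ref_counts) / ref.size
        # For each source quantile, look up the reference value at that quantile.
        mapped = np.interp(src_cdf, ref_cdf, ref_vals)
        matched[..., c] = mapped[src_idx].reshape(source.shape[:-1])
    return matched

dark = np.array([[0, 1], [2, 3]])[..., None]            # toy 2x2 one-channel "source"
bright = np.array([[100, 101], [102, 103]])[..., None]  # toy "target" reference
aligned = match_histogram(dark, bright)
```

In this toy case the two images have the same rank structure, so `aligned` takes on the brightness values of the reference while keeping the spatial layout of the source.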

Submitted to arXiv on 03 Jan. 2023

Results of the summarizing process for the arXiv paper: 2301.01149v1

In semantic segmentation, unsupervised domain adaptation (UDA) removes the need for extensive annotation of the target domain. The difficulty is that the two domains differ in low-level image statistics and high-level contexts, which degrades segmentation performance in the target domain. To address this, the paper introduces a UDA pipeline for semantic segmentation that integrates image-level and feature-level adaptation.

For image-level domain shifts, the approach includes a global photometric alignment module and a global texture alignment module, which align source and target images in terms of image-level properties. For feature-level domain shifts, the method performs global manifold alignment by projecting pixel features from both domains onto the feature manifold of the source domain. In addition, category centers in the source domain are regularized with a category-oriented triplet loss, and consistency regularization is applied over augmented target-domain images.

Experiments show a significant improvement over previous methods. On the commonly tested GTA5→Cityscapes task, the method with Deeplab V3+ as the backbone surpasses the previous state of the art by 8%, achieving 58.2% mIoU. Qualitative comparisons with existing methods also highlight its advantage in categories such as 'road', 'sidewalk', 'building', 'fence', 'vegetation', 'terrain', 'person', 'car', 'rider', 'truck', 'train', 'bus', 'motor' and 'bike'.
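
The summary names a category-oriented triplet loss over source-domain category centers but does not give its formula. A minimal sketch of one plausible form is shown below, under stated assumptions: each pixel feature is the anchor, its own category center is the positive, the nearest other category center is the negative, and distances are Euclidean. The margin value and the function name `category_triplet_loss` are hypothetical, and the paper's actual loss may differ.

```python
import numpy as np

def category_triplet_loss(features, labels, margin=1.0):
    """Hinge loss pulling features to their own class center and away from the nearest other one.

    features: (N, D) pixel features; labels: (N,) integer class ids.
    """
    features = np.asarray(features, dtype=np.float64)
    labels = np.asarray(labels)
    present = np.unique(labels)
    if present.size < 2:
        return 0.0  # no negative center exists with a single class
    # One center per class present in the batch: (K, D).
    centers = np.stack([features[labels == k].mean(axis=0) for k in present])
    # Distance from every feature to every center: (N, K).
    dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=-1)
    own = labels[:, None] == present[None, :]
    d_pos = dists[own]                                # distance to own center, (N,)
    d_neg = np.where(own, np.inf, dists).min(axis=1)  # nearest other center, (N,)
    return float(np.maximum(0.0, d_pos - d_neg + margin).mean())

separated = category_triplet_loss([[0, 0], [0, 0], [10, 0], [10, 0]], [0, 0, 1, 1])
overlapping = category_triplet_loss([[0, 0], [2, 0], [1, 0], [3, 0]], [0, 0, 1, 1])
```

With well-separated classes the hinge is inactive and `separated` is 0; with interleaved features some anchors sit closer to the wrong center, so `overlapping` is positive. In a training setting this term would be computed on differentiable tensors rather than NumPy arrays.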
Created on 17 Jun. 2025
