Foundation Models for Remote Sensing and Earth Observation: A Survey

AI-generated keywords: Remote Sensing

AI-generated Key Points

  • Remote Sensing (RS) is crucial for observing and interpreting the planet, with applications in various fields.
  • Artificial intelligence (AI), particularly deep learning, has made strides in RS but faces challenges due to Earth's complexity and diverse sensor modalities.
  • Recent advancements in large Foundation Models (FMs) have led to the emergence of Remote Sensing Foundation Models (RSFMs) tailored for Earth Observation tasks.
  • Challenges in developing RSFMs include domain discrepancies between natural and RS data, limited pre-training datasets, a lack of specialized architectures, and the distinct demands of RS applications.
  • Efforts are underway to address these challenges by developing advanced RSFMs and integrating FMs within the RS domain.
  • The paper provides a comprehensive survey of recent advancements in RSFMs categorized into Visual Foundation Models (VFMs), Visual-Language Models (VLMs), Large Language Models (LLMs), and generative FMs for RS.
  • Key contributions include a systematic review of advancements in RSFMs across different model types and sensor modalities, benchmarking performance on various tasks, and identifying research challenges for future exploration.

Authors: Aoran Xiao, Weihao Xuan, Junjue Wang, Jiaxing Huang, Dacheng Tao, Shijian Lu, Naoto Yokoya

License: CC BY-NC-SA 4.0

Abstract: Remote Sensing (RS) is a crucial technology for observing, monitoring, and interpreting our planet, with broad applications across geoscience, economics, humanitarian fields, etc. While artificial intelligence (AI), particularly deep learning, has achieved significant advances in RS, unique challenges persist in developing more intelligent RS systems, including the complexity of Earth's environments, diverse sensor modalities, distinctive feature patterns, varying spatial and spectral resolutions, and temporal dynamics. Meanwhile, recent breakthroughs in large Foundation Models (FMs) have expanded AI's potential across many domains due to their exceptional generalizability and zero-shot transfer capabilities. However, their success has largely been confined to natural data like images and video, with degraded performance and even failures for RS data of various non-optical modalities. This has inspired growing interest in developing Remote Sensing Foundation Models (RSFMs) to address the complex demands of Earth Observation (EO) tasks, spanning the surface, atmosphere, and oceans. This survey systematically reviews the emerging field of RSFMs. It begins with an outline of their motivation and background, followed by an introduction of their foundational concepts. It then categorizes and reviews existing RSFM studies including their datasets and technical contributions across Visual Foundation Models (VFMs), Visual-Language Models (VLMs), Large Language Models (LLMs), and beyond. In addition, we benchmark these models against publicly available datasets, discuss existing challenges, and propose future research directions in this rapidly evolving field.

Submitted to arXiv on 22 Oct. 2024


Results of the summarizing process for the arXiv paper: 2410.16602v1

Remote Sensing (RS) plays a crucial role in observing, monitoring, and interpreting our planet, with applications spanning geoscience, economics, humanitarian fields, and more. Artificial intelligence (AI), particularly deep learning, has made significant strides in RS but faces challenges due to the complexity of Earth's environments, diverse sensor modalities, and varying resolutions. Recent advancements in large Foundation Models (FMs) have shown promise in various domains but struggle with RS data of non-optical modalities. This has led to the emergence of Remote Sensing Foundation Models (RSFMs) tailored for Earth Observation (EO) tasks.
Developing RSFMs presents challenges such as domain discrepancies between natural and RS data, limited pre-training datasets, a lack of specialized architectures, and the distinct demands of RS applications. Efforts are underway to address these challenges by developing advanced RSFMs and integrating FMs within the RS domain. However, the field lacks a comprehensive survey on RSFMs.
This paper aims to fill this gap by providing an extensive survey of recent advancements in RSFMs. It categorizes existing methods into Visual Foundation Models (VFMs), Visual-Language Models (VLMs), Large Language Models (LLMs), and generative FMs for RS. The survey covers learning paradigms, datasets, technical approaches, benchmarks, and future research directions. Key contributions include a systematic review of recent advancements in RSFMs across different model types and sensor modalities. The paper also benchmarks and analyzes the performance of RSFMs on various tasks and identifies research challenges for future exploration.
The survey is structured as follows: background knowledge on RSFMs in Section 2, foundations of RSFMs in Section 3, reviews of VFMs in Section 4, VLMs in Section 5, and other types of RSFMs in Section 6. Performance comparisons across benchmark datasets are presented in Section 7, with future research directions outlined in Section 8.
Additionally, early insights from Manvi et al. revealed that LLMs possess spatial knowledge but struggle with accurate predictions for geospatial indicators like population density. To address this limitation, GeoLLM was introduced to fine-tune LLMs using prompts enriched with auxiliary map data from OpenStreetMap. Generative models for RS have also been explored for image generation tasks like inpainting and colorization but face challenges due to the unique characteristics of multi-spectral RS data. Overall, this detailed summary highlights the importance of developing specialized AI models like RSFMs for effectively utilizing large-scale geospatial data and addressing complex Earth surface dynamics while offering insights into current advancements and future research directions in this rapidly evolving field.
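The GeoLLM idea mentioned above, enriching an LLM prompt with auxiliary map data, might be sketched roughly as follows. This is only an illustration under stated assumptions: the `NearbyPlace` type, the `build_geo_prompt` function, the prompt layout, the rating scale, and the example values are all hypothetical and are not taken from GeoLLM or from the OpenStreetMap API. In practice the nearby places would be looked up from map data rather than typed in by hand.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NearbyPlace:
    """Hypothetical record for one map feature near the query location."""
    name: str
    distance_km: float
    direction: str


def build_geo_prompt(lat: float, lon: float, address: str,
                     places: List[NearbyPlace], indicator: str) -> str:
    """Assemble a text prompt that pairs coordinates with map context.

    The layout is an assumption for illustration, not a published format.
    """
    lines = [
        f"Coordinates: ({lat:.4f}, {lon:.4f})",
        f"Address: {address}",
        "Nearby places:",
    ]
    # List the closest features first so the most local context leads.
    for place in sorted(places, key=lambda p: p.distance_km):
        lines.append(
            f"- {place.distance_km:.1f} km {place.direction}: {place.name}"
        )
    # End with the quantity the model is asked to estimate.
    lines.append(f"{indicator} (on a scale from 0.0 to 9.9):")
    return "\n".join(lines)


# Made-up example values; a real pipeline would query map data instead.
example_places = [
    NearbyPlace("Example Park", 2.3, "north-east"),
    NearbyPlace("Example Station", 0.4, "south"),
]
example_prompt = build_geo_prompt(
    35.6812, 139.7671, "Example Ward, Example City",
    example_places, "Population density",
)
print(example_prompt)
```

A prompt built this way would then be paired with a ground-truth label to form one fine-tuning example. The point of the sketch is only that map context is serialized into plain text alongside the coordinates.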
Created on 19 Feb. 2026

