Evaluating Integrative Strategies for Incorporating Phenotypic Features in Spatial Transcriptomics

AI-generated keywords: Spatial transcriptomics, MERFISH, variational autoencoder, biological variation, multi-modal integration

AI-generated Key Points

  • Spatial transcriptomics (ST) technologies allow analysis of intact biological samples in a spatially informed manner
  • Integration of ST with imaging is a current challenge
  • The authors used a publicly available murine ileum MERFISH dataset to explore the potential of a minimally tuned variational autoencoder (VAE)
  • The VAE extracted informative low-dimensional representations from cell crops of spot counts, nuclear stain, membrane stain, or a combination thereof
  • Evaluation methods included PERMANOVA, cross-validated classification, and unsupervised Leiden clustering
  • VAE-derived latent spaces (LSs) captured meaningful biological variation and improved label recovery for specific cell types
  • LS2, trained solely on morphological input, showed moderate predictive power for a handful of genes in a ridge regression model
  • Combining transcript counts (TC) with LSs through multiplex clustering enhanced cluster homogeneity
  • Features derived from CellProfiler underperformed compared to LSs, emphasizing the advantage of learned representations
  • VAEs can extract biologically relevant signals from imaging data even under constrained conditions
  • VAEs offer promise for multi-modal integration in spatial transcriptomics research

Authors: Levin M Moser, Ahmad Kamal Hamid, Esteban Miglietta, Nodar Gogoberidze, Beth A Cimini

arXiv: 2507.22212v1 (q-bio.QM)
License: CC BY 4.0

Abstract: Spatial transcriptomics (ST) technologies not only offer an unprecedented opportunity to interrogate intact biological samples in a spatially informed manner, but also set the stage for integration with other imaging-based modalities. However, how to best exploit spatial context and integrate ST with imaging-based modalities remains an open question. To address this, particularly under real-world experimental constraints such as limited dataset size, class imbalance, and bounding-box-based segmentation, we used a publicly available murine ileum MERFISH dataset to evaluate whether a minimally tuned variational autoencoder (VAE) could extract informative low-dimensional representations from cell crops of spot counts, nuclear stain, membrane stain, or a combination thereof. We assessed the resulting embeddings through PERMANOVA, cross-validated classification, and unsupervised Leiden clustering, and compared them to classical image-based feature vectors extracted via CellProfiler. While transcript counts (TC) generally outperformed other feature spaces, the VAE-derived latent spaces (LSs) captured meaningful biological variation and enabled improved label recovery for specific cell types. LS2, in particular, trained solely on morphological input, also exhibited moderate predictive power for a handful of genes in a ridge regression model. Notably, combining TC with LSs through multiplex clustering led to consistent gains in cluster homogeneity, a trend that also held when augmenting only subsets of TC with the stain-derived LS2. In contrast, CellProfiler-derived features underperformed relative to LSs, highlighting the advantage of learned representations over hand-crafted features. Collectively, these findings demonstrate that even under constrained conditions, VAEs can extract biologically meaningful signals from imaging data and constitute a promising strategy for multi-modal integration.
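
To make the PERMANOVA evaluation mentioned in the abstract more concrete, the following is a minimal sketch of the general technique, not the authors' pipeline. It computes the pseudo-F statistic from pairwise distances and estimates a p-value by permuting labels. The Euclidean distance, the function names, the toy coordinates, and the cell-type labels are all assumptions made for illustration; the abstract does not state which distance or software was used.

```python
import math
import random


def pseudo_f(dist, labels):
    """PERMANOVA pseudo-F from a square distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    a = len(groups)
    # Total sum of squares: squared pairwise distances divided by n.
    ss_total = sum(
        dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)
    ) / n
    # Within-group sum of squares, computed the same way per group.
    ss_within = 0.0
    for g in groups:
        idx = [i for i in range(n) if labels[i] == g]
        ss_within += sum(
            dist[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:]
        ) / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(points, labels, n_perm=199, seed=0):
    """Return (pseudo-F, permutation p-value) for embedded points."""
    rng = random.Random(seed)
    n = len(points)
    # Assumption: Euclidean distance between embedding vectors.
    dist = [[math.dist(points[i], points[j]) for j in range(n)]
            for i in range(n)]
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any label/embedding association
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy 2-D "latent space" with two hypothetical, well-separated cell types.
toy_points = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2), (0.2, 0.4), (0.4, 0.1),
    (5.0, 5.1), (5.2, 4.9), (4.9, 5.3), (5.1, 5.0), (5.3, 5.2), (4.8, 4.9),
]
toy_labels = ["enterocyte"] * 6 + ["goblet"] * 6
f_stat, p_value = permanova(toy_points, toy_labels)
print(f"pseudo-F = {f_stat:.1f}, p = {p_value:.3f}")
```

A large pseudo-F with a small permutation p-value indicates that the labels explain more of the distance structure than expected by chance, which is the sense in which an embedding can be said to capture variation between cell types.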

Submitted to arXiv on 29 Jul. 2025


Results of the summarizing process for the arXiv paper: 2507.22212v1

Spatial transcriptomics (ST) technologies make it possible to analyze intact biological samples in a spatially informed manner and to integrate the results with other imaging-based modalities. How best to exploit spatial context and combine ST with imaging remains an open question. In this study, the authors used a publicly available murine ileum MERFISH dataset to test whether a minimally tuned variational autoencoder (VAE) could extract informative low-dimensional representations from cell crops of spot counts, nuclear stain, membrane stain, or a combination thereof, under real-world constraints such as limited dataset size, class imbalance, and bounding-box-based segmentation. The resulting embeddings were evaluated with PERMANOVA, cross-validated classification, and unsupervised Leiden clustering, and were compared with classical image-based feature vectors extracted via CellProfiler.

While transcript counts (TC) generally outperformed the other feature spaces, the VAE-derived latent spaces (LSs) captured meaningful biological variation and improved label recovery for specific cell types. LS2, trained solely on morphological input, also showed moderate predictive power for a handful of genes in a ridge regression model. Combining TC with LSs through multiplex clustering consistently improved cluster homogeneity, and this trend held even when only subsets of TC were augmented with the stain-derived LS2. In contrast, CellProfiler-derived features underperformed relative to LSs, which points to an advantage of learned representations over hand-crafted features. Overall, the findings indicate that VAEs can extract biologically relevant signals from imaging data even under constrained conditions and are a promising strategy for multi-modal integration in spatial transcriptomics research. More broadly, the results suggest that learned image representations can complement transcript counts when analyzing tissue at single-cell spatial resolution.
Created on 02 Aug. 2025
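
The summary above reports gains in "cluster homogeneity" without defining the metric. As a hedged illustration, the sketch below implements one common definition, the entropy-based homogeneity score (1 minus the conditional entropy of the reference labels given the clusters, divided by the entropy of the reference labels). Whether the paper uses this exact measure is an assumption, and the labels and cluster assignments are invented toy values, not results from the study.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def homogeneity(true_labels, cluster_ids):
    """Entropy-based homogeneity: 1.0 when every cluster holds one class."""
    n = len(true_labels)
    h_class = entropy(true_labels)
    if h_class == 0.0:
        return 1.0  # a single reference class is trivially homogeneous
    joint = Counter(zip(cluster_ids, true_labels))
    cluster_sizes = Counter(cluster_ids)
    # Conditional entropy of the reference class given the cluster.
    h_class_given_cluster = -sum(
        (count / n) * math.log(count / cluster_sizes[k])
        for (k, _), count in joint.items()
    )
    return 1.0 - h_class_given_cluster / h_class


# Invented toy data: reference cell types and two hypothetical clusterings.
cell_types = ["enterocyte"] * 4 + ["goblet"] * 4
clusters_mixed = [0, 0, 0, 1, 1, 1, 1, 1]  # one enterocyte lands in cluster 1
clusters_pure = [0, 0, 0, 0, 1, 1, 1, 1]   # every cluster holds a single type

print("mixed:", round(homogeneity(cell_types, clusters_mixed), 3))
print("pure: ", round(homogeneity(cell_types, clusters_pure), 3))
```

Note that this score rewards purity only: splitting every cell into its own cluster also yields 1.0, so it is usually read alongside a complementary measure such as completeness.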
