Evaluating Integrative Strategies for Incorporating Phenotypic Features in Spatial Transcriptomics

AI-generated keywords: Spatial transcriptomics, MERFISH, variational autoencoder, biological variation, multi-modal integration

AI-generated Key Points

Spatial transcriptomics (ST) technologies allow analysis of intact biological samples in a spatially informed manner
Integration of ST with imaging is a current challenge
Researchers used a publicly available murine ileum MERFISH dataset to explore the potential of a minimally tuned variational autoencoder (VAE)
The VAE extracted informative low-dimensional representations from cell crops of spot counts, nuclear stain, membrane stain, or a combination of these
Evaluation methods included PERMANOVA, cross-validated classification, and unsupervised Leiden clustering
Transcript counts (TC) generally outperformed other feature spaces, but VAE-derived latent spaces (LSs) captured meaningful biological variation and improved label recovery for specific cell types
LS2, trained solely on morphological input, showed moderate predictive power for a handful of genes in a ridge regression model
Combining transcript counts (TC) with LSs through multiplex clustering enhanced cluster homogeneity
Features derived from CellProfiler underperformed compared to LSs, emphasizing the advantage of learned representations
VAEs can extract biologically relevant signals from imaging data even under constrained conditions
VAEs offer promise for multi-modal integration in spatial transcriptomics research
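
The key points mention gains in "cluster homogeneity" without defining the score. A minimal sketch of one common entropy-based definition follows, assuming the metric used in the paper is comparable; the function names and toy labels are illustrative and do not come from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def conditional_entropy(classes, clusters):
    """H(classes | clusters): class uncertainty left once the cluster is known."""
    n = len(classes)
    joint = Counter(zip(clusters, classes))
    cluster_sizes = Counter(clusters)
    return -sum(
        (count / n) * math.log(count / cluster_sizes[cluster])
        for (cluster, _), count in joint.items()
    )


def homogeneity(classes, clusters):
    """1 when every cluster contains a single class, 0 when clusters are uninformative."""
    h_classes = entropy(classes)
    if h_classes == 0.0:
        return 1.0  # only one class present, so any clustering is homogeneous
    return 1.0 - conditional_entropy(classes, clusters) / h_classes


# Hypothetical cell-type labels and two candidate clusterings.
cell_types = ["enterocyte", "enterocyte", "goblet", "goblet"]
pure_clusters = [0, 0, 1, 1]
mixed_clusters = [0, 1, 0, 1]
```

Here `homogeneity(cell_types, pure_clusters)` scores the clustering that separates the two cell types, while `mixed_clusters` splits each cell type across both clusters and scores near zero.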

Authors: Levin M Moser, Ahmad Kamal Hamid, Esteban Miglietta, Nodar Gogoberidze, Beth A Cimini

arXiv: 2507.22212v1 (q-bio.QM)

License: CC BY 4.0

Abstract: Spatial transcriptomics (ST) technologies not only offer an unprecedented opportunity to interrogate intact biological samples in a spatially informed manner, but also set the stage for integration with other imaging-based modalities. However, how to best exploit spatial context and integrate ST with imaging-based modalities remains an open question. To address this, particularly under real-world experimental constraints such as limited dataset size, class imbalance, and bounding-box-based segmentation, we used a publicly available murine ileum MERFISH dataset to evaluate whether a minimally tuned variational autoencoder (VAE) could extract informative low-dimensional representations from cell crops of spot counts, nuclear stain, membrane stain, or a combination thereof. We assessed the resulting embeddings through PERMANOVA, cross-validated classification, and unsupervised Leiden clustering, and compared them to classical image-based feature vectors extracted via CellProfiler. While transcript counts (TC) generally outperformed other feature spaces, the VAE-derived latent spaces (LSs) captured meaningful biological variation and enabled improved label recovery for specific cell types. LS2, in particular, trained solely on morphological input, also exhibited moderate predictive power for a handful of genes in a ridge regression model. Notably, combining TC with LSs through multiplex clustering led to consistent gains in cluster homogeneity, a trend that also held when augmenting only subsets of TC with the stain-derived LS2. In contrast, CellProfiler-derived features underperformed relative to LSs, highlighting the advantage of learned representations over hand-crafted features. Collectively, these findings demonstrate that even under constrained conditions, VAEs can extract biologically meaningful signals from imaging data and constitute a promising strategy for multi-modal integration.
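
For readers unfamiliar with VAEs, the two terms of a standard VAE training objective can be sketched as below. This is a generic formulation, not the authors' model: the squared-error reconstruction term, the `beta` weight, and all function names are assumptions, and the abstract does not state the architecture or latent dimension.

```python
import numpy as np


def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps, with eps drawn from a standard normal."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps


def kl_to_standard_normal(mu, logvar):
    """Per-sample KL divergence from N(mu, diag(exp(logvar))) to N(0, I)."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=1)


def negative_elbo(x, x_recon, mu, logvar, beta=1.0):
    """Batch mean of reconstruction error plus beta-weighted KL term.

    The reconstruction term is a plain squared error, which matches a
    Gaussian likelihood up to constants (an assumption of this sketch).
    """
    recon = np.sum((x - x_recon) ** 2, axis=1)
    return np.mean(recon + beta * kl_to_standard_normal(mu, logvar))
```

In a trained model, `mu` and `logvar` would come from an encoder applied to each cell crop and `x_recon` from a decoder applied to the sampled `z`; the per-cell `mu` vectors are what is typically used as the low-dimensional embedding.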

Submitted to arXiv on 29 Jul. 2025

Results of the summarizing process for the arXiv paper: 2507.22212v1

Comprehensive Summary
Key points
Layman's Summary
Blog article

Spatial transcriptomics (ST) technologies provide a unique opportunity to analyze intact biological samples in a spatially informed manner and enable integration with other imaging-based modalities. However, the optimal way to leverage spatial context and integrate ST with imaging remains an ongoing challenge. In this study, researchers utilized a murine ileum MERFISH dataset to investigate the potential of a minimally tuned variational autoencoder (VAE) in extracting informative low-dimensional representations from cell crops of spot counts, nuclear stain, membrane stain, or their combinations. The resulting embeddings were evaluated through various methods such as PERMANOVA, cross-validated classification, and unsupervised Leiden clustering. These embeddings were compared with classical image-based feature vectors obtained using CellProfiler. While transcript counts (TC) generally performed better than other feature spaces, the VAE-derived latent spaces (LSs) successfully captured meaningful biological variation and improved label recovery for specific cell types. Particularly noteworthy was LS2, trained solely on morphological input, which exhibited moderate predictive power for select genes in a ridge regression model. Combining TC with LSs through multiplex clustering consistently enhanced cluster homogeneity. This trend persisted even when augmenting only subsets of TC with stain-derived LS2. In contrast, features derived from CellProfiler underperformed relative to LSs, underscoring the advantage of learned representations over manually crafted features. Overall, these findings highlight the capability of VAEs to extract biologically relevant signals from imaging data even under constrained conditions. The study demonstrates that VAEs represent a promising strategy for multi-modal integration in spatial transcriptomics research. 
More broadly, the analysis illustrates how learned image representations can complement transcript counts when studying biological systems at spatial resolution.
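
PERMANOVA, one of the evaluation methods named above, tests whether groups of cells differ in an embedding by comparing between-group and within-group distances against label permutations. The sketch below is a minimal NumPy version under stated assumptions: it takes a precomputed distance matrix, and the function names, permutation count, and seed are illustrative. It is not the implementation used in the paper.

```python
import numpy as np


def pseudo_f(dist, labels):
    """PERMANOVA pseudo-F statistic from a square distance matrix."""
    labels = np.asarray(labels)
    n = len(labels)
    groups = np.unique(labels)
    d2 = dist**2
    # Total sum of squares from all pairwise squared distances.
    ss_total = d2[np.triu_indices(n, k=1)].sum() / n
    # Within-group sum of squares, accumulated group by group.
    ss_within = 0.0
    for g in groups:
        idx = np.flatnonzero(labels == g)
        sub = d2[np.ix_(idx, idx)]
        ss_within += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)
    ss_between = ss_total - ss_within
    a = len(groups)
    return (ss_between / (a - 1)) / (ss_within / (n - a))


def permanova(dist, labels, n_perm=999, seed=0):
    """Return the observed pseudo-F and a permutation p-value."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    f_obs = pseudo_f(dist, labels)
    f_perm = np.array(
        [pseudo_f(dist, rng.permutation(labels)) for _ in range(n_perm)]
    )
    # Add-one correction so the p-value is never exactly zero.
    p_value = (np.sum(f_perm >= f_obs) + 1) / (n_perm + 1)
    return f_obs, p_value
```

With an embedding matrix `X` (cells by latent dimensions), Euclidean distances could be obtained as `np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)` and passed together with cell-type labels; the choice of Euclidean distance is also an assumption of this sketch.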

Summary
1. Spatial transcriptomics (ST) helps scientists study samples while considering where each part is located.
2. Combining ST with images is a challenge researchers are working on.
3. Scientists used data from mouse intestines to test a type of computer program called a variational autoencoder (VAE).
4. The VAE found important patterns in images of cells and described each cell with a much smaller set of numbers.
5. VAEs are helpful tools that can improve how we understand genes and cells in our bodies.

Definitions
- Spatial transcriptomics (ST): A method that allows studying biological samples while considering their location.
- Variational autoencoder (VAE): A type of computer program that can find patterns and compact representations in data.
- Dataset: A collection of data or information for analysis or research purposes.
- Representation: A way of describing something in a specific form to understand it better.
- Gene: A unit of heredity that carries instructions for making proteins and determining traits in living organisms.

Spatial transcriptomics (ST) is a rapidly evolving field that combines gene expression analysis with spatial information to gain a deeper understanding of biological systems. The technology allows researchers to analyze intact tissue samples in a spatially informed manner, providing insight into the interactions between cells and their microenvironment. One of the key challenges in ST research is how to integrate these data with other imaging-based modalities. In this study, posted as an arXiv preprint, researchers used a publicly available murine ileum MERFISH dataset to explore whether variational autoencoders (VAEs) can help integrate ST data with imaging. The goal was to investigate whether a minimally tuned VAE could extract informative low-dimensional representations from cell crops of spot counts, nuclear stain, membrane stain, or their combinations. The resulting embeddings were evaluated with PERMANOVA, cross-validated classification, and unsupervised Leiden clustering, and were compared with classical image-based feature vectors obtained using CellProfiler. The results showed that while transcript counts (TC) generally performed better than other feature spaces, the VAE-derived latent spaces (LSs) captured meaningful biological variation and improved label recovery for specific cell types. LS2, which was trained solely on morphological input, exhibited moderate predictive power for a handful of genes in a ridge regression model. This suggests that even under constrained conditions such as limited dataset size, class imbalance, and bounding-box-based segmentation, VAEs can still extract biologically relevant signals from imaging data. Furthermore, combining TC with LSs through multiplex clustering consistently enhanced cluster homogeneity. This trend persisted even when only subsets of TC were augmented with the stain-derived LS2.
In contrast, features derived from CellProfiler underperformed relative to LSs, highlighting the advantage of learned representations over hand-crafted features. Overall, these findings show that VAEs can extract biologically relevant signals from imaging data and that those signals can be combined with ST data, which makes them a promising strategy for multi-modal integration in spatial transcriptomics research. In conclusion, the study illustrates how learned image representations can add to transcript counts when examining the spatial organization of, and interactions within, biological systems. As the technology advances, further developments in this field may contribute to a better understanding of complex diseases and, potentially, to new treatment strategies.
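
The ridge regression step described in the article, predicting a gene's expression from latent features, can be sketched with a closed-form solver and a simple cross-validated R². This is a minimal illustration assuming a feature matrix `X` (cells by latent dimensions) and a target vector `y` (one gene's expression). The function names, the `alpha` value, and the fold count are assumptions, not settings from the paper.

```python
import numpy as np


def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc for the weights.
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b


def cv_r2(X, y, alpha=1.0, n_folds=5, seed=0):
    """R^2 computed from out-of-fold predictions across shuffled folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    preds = np.empty(len(y), dtype=float)
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        w, b = ridge_fit(X[train_idx], y[train_idx], alpha)
        preds[test_idx] = X[test_idx] @ w + b
    ss_res = np.sum((y - preds) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot
```

Running `cv_r2` once per gene and ranking the scores is one way to identify the genes for which an image-derived embedding carries predictive signal.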

Created on 02 Aug. 2025
