Generative Modeling Reveals the Connection between Cellular Morphology and Gene Expression
COSMIC (Cross-modal generation between Single-cell RNA-seq and MICroscopy images) is a bidirectional generative framework that quantitatively links single-cell nuclear morphology and gene expression.
How transcriptional programs give rise to cellular morphology, and how morphological features reflect and influence cell identity and function, remains poorly understood. This is due in part to the lack of large-scale datasets pairing the two modalities and to the absence of computational frameworks capable of modeling their cross-modal structure. Here, we introduce COSMIC, a bidirectional generative framework that enables quantitative decomposition of the transcriptional variance reflected in morphology and the morphological variance explained by gene expression. COSMIC builds on a foundation model trained on over 21 million segmented nuclei and couples it with existing transcriptomic embeddings. To enable cross-modal learning, we leveraged a newly generated multimodal dataset acquired using IRIS, a technology that captures high-resolution images and transcriptomes from the same single cells at scale. COSMIC accurately modeled cell type identity as well as continuous dynamics such as cell-cycle progression, establishing a quantitative link between morphological phenotypes and underlying gene expression. In prostate cancer cells, COSMIC identified morphological and transcriptomic differences between chemotherapy-responsive and chemotherapy-resistant cells, and revealed morphology-associated genes linked to tumor state. Together, these results demonstrate that generative modeling powered by paired single-cell measurements can capture the bidirectional flow of information between cellular form and gene expression, opening new avenues for mechanistic discovery and predictive modeling in both basic and translational cell biology.
Publication
Generative Modeling Reveals the Connection between Cellular Morphology and Gene Expression
Shuo Wen, Ramon Vinas Torne, Johannes Bues, Camille Lucie Lambert, Nadia Grenningloh, Timothee Ferrari, Elisa Bugani, Joern Pezoldt, Jillian Rose Love, Wouter Karthaus, Bart Deplancke, and Maria Brbic
Overview of COSMIC: a bidirectional generative framework that quantitatively links single-cell nuclear morphology and gene expression.
COSMIC is built on two key ingredients:
(1) a foundation model of nuclear morphology pretrained on over 21 million segmented nuclei, and
(2) conditional diffusion models trained on paired single-cell images and transcriptomes generated using the IRIS platform.
By learning to translate from transcriptomes to nuclear images and from nuclear images to transcriptomes, COSMIC captures how transcriptional programs manifest in cellular form and how morphological variation reflects underlying gene expression. This unified framework enables generation, prediction, and discovery of morphology-associated genes at single-cell resolution.
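As a rough illustration of ingredient (2), the sketch below shows the noise-prediction objective commonly used to train conditional diffusion models (the standard DDPM formulation). It is not COSMIC's training code: the linear noise schedule, the array shapes, and the names make_alpha_bar, add_noise, denoising_loss, and zero_denoiser are all assumptions made for illustration, and NumPy stands in for the PyTorch implementation.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention schedule for a linear-beta DDPM (assumed schedule)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def add_noise(x0, t, alpha_bar, rng):
    """Sample x_t from q(x_t | x_0); return it together with the noise that was added."""
    eps = rng.standard_normal(x0.shape)
    # Broadcast one schedule value per sample over the remaining axes.
    a = alpha_bar[t].reshape(-1, *([1] * (x0.ndim - 1)))
    x_t = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps
    return x_t, eps

def denoising_loss(denoiser, x0, cond, alpha_bar, rng):
    """Mean squared error between the true noise and the noise predicted given `cond`."""
    t = rng.integers(0, len(alpha_bar), size=x0.shape[0])
    x_t, eps = add_noise(x0, t, alpha_bar, rng)
    eps_hat = denoiser(x_t, t, cond)
    return float(np.mean((eps_hat - eps) ** 2))

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
images = rng.standard_normal((4, 1, 32, 32))    # stand-in nuclear images
expr_embedding = rng.standard_normal((4, 16))   # stand-in transcriptome embeddings

# Hypothetical placeholder for a trained network: always predicts zero noise.
zero_denoiser = lambda x_t, t, cond: np.zeros_like(x_t)
loss = denoising_loss(zero_denoiser, images, expr_embedding, alpha_bar, rng)
```

In this sketch the image is the denoised variable and the transcriptome embedding is the condition. Swapping the two roles gives the image-to-transcriptome direction under the same objective.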
COSMIC generates realistic nuclear images from gene expression
COSMIC accurately synthesizes high-resolution nuclear images conditioned on single-cell transcriptomic profiles. Generated images closely match real nuclei in size, shape, and texture, and preserve cell-type-specific morphology. In the embedding space, real and generated nuclei overlap strongly.
Generated images retain sufficient biological signal to support accurate cell type classification, approaching the performance achieved on real microscopy images. COSMIC generalizes to unseen batches, new donors, and even across species.
COSMIC predicts transcriptomic profiles from nuclear morphology
In the reverse direction, COSMIC infers biologically meaningful transcriptomic profiles directly from nuclear images. Predicted transcriptomes recover the global structure of gene expression space, preserve cell-type separation, and enable accurate downstream cell-type classification.
At the gene level, COSMIC identifies a subset of genes whose expression can be robustly predicted from morphology, including cell-type marker genes with high correlation to ground truth. These results demonstrate that nuclear morphology encodes precise information about specific transcriptional programs.
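One simple way to quantify gene-level predictability of this kind is a per-gene Pearson correlation between measured and predicted expression across cells. The sketch below assumes cells-by-genes matrices and uses toy values; the function name per_gene_pearson and the 0.5 cutoff are illustrative choices, not values taken from the paper.

```python
import numpy as np

def per_gene_pearson(measured, predicted):
    """Pearson correlation per gene (column) between two cells-by-genes matrices."""
    m = measured - measured.mean(axis=0)
    p = predicted - predicted.mean(axis=0)
    denom = np.sqrt((m ** 2).sum(axis=0) * (p ** 2).sum(axis=0))
    # Genes with zero variance in either matrix get NaN rather than a misleading 0.
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(denom > 0, (m * p).sum(axis=0) / denom, np.nan)

# Toy data: 4 cells x 3 genes (made-up values, for illustration only).
measured = np.array([[1, 0, 5], [2, 1, 5], [3, 0, 5], [4, 1, 5]], dtype=float)
predicted = np.array([[2, 1, 1], [4, 0, 2], [6, 1, 3], [8, 0, 4]], dtype=float)

r = per_gene_pearson(measured, predicted)
# Treat undefined correlations as "not predictable" when applying the assumed cutoff.
predictable = np.flatnonzero(np.nan_to_num(r, nan=-np.inf) >= 0.5)
```

Here gene 0 is perfectly correlated, gene 1 is perfectly anti-correlated, and gene 2 is constant in the measured data, so only gene 0 passes the cutoff.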
COSMIC captures continuous cell-cycle dynamics
Beyond discrete cell types, COSMIC learns continuous biological processes. In mouse fibroblasts, COSMIC accurately recovers cell-cycle dynamics from both modalities. Predicted transcriptomes reproduce phase-specific expression patterns of canonical cell-cycle genes, while generated nuclear images reflect expected morphological changes across the cell cycle, such as systematic variation in nuclear size.
COSMIC identifies morphology-associated genes in cancer
Applied to prostate cancer cells treated with chemotherapy, COSMIC uncovers genes whose expression is tightly coupled to nuclear morphology. Using a cycle-consistent generative strategy, COSMIC identifies morphology-associated genes linked to treatment response and cell-cycle arrest.
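This page does not spell out the cycle-consistent procedure. One plausible reading, sketched below purely as an assumption, is a round trip: map each transcriptome to a generated image, map that image back to a predicted transcriptome, and score each gene by how well it survives. The two mapping functions here are toy stand-ins rather than COSMIC's models, and every name is hypothetical.

```python
import numpy as np

def round_trip_scores(expr, expr_to_image, image_to_expr):
    """Per-gene correlation between input expression and its round-trip reconstruction."""
    recon = image_to_expr(expr_to_image(expr))
    return np.array([
        np.corrcoef(expr[:, g], recon[:, g])[0, 1]
        for g in range(expr.shape[1])
    ])

rng = np.random.default_rng(0)
expr = rng.standard_normal((200, 4))  # 200 toy cells x 4 toy genes

# Toy stand-ins: the "image" carries information about genes 0 and 1 only,
# so genes 2 and 3 can only be filled in with unrelated noise on the way back.
expr_to_image = lambda x: x[:, :2]
image_to_expr = lambda img: np.hstack([img, rng.standard_normal((img.shape[0], 2))])

scores = round_trip_scores(expr, expr_to_image, image_to_expr)
ranking = np.argsort(-scores)  # genes ordered from most to least morphology-associated
```

Under this reading, genes with high round-trip scores are those whose expression is recoverable through the image and are therefore candidates for being morphology-associated.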
Code
A PyTorch implementation of COSMIC is available on GitHub.
Contributors
The following people contributed to this work:
Ramon Vinas Torne
Johannes Bues
Camille Lucie Lambert
Nadia Grenningloh
Timothee Ferrari
Elisa Bugani
Joern Pezoldt
Jillian Rose Love
Wouter Karthaus
Bart Deplancke






