AI for Science: Peter Koo "Toward interpretable and generalizable AI for virtual biology"

Deep learning models trained on large-scale biological data, from genome-wide functional assays to digitized histopathology slides, now achieve strong predictive performance across a wide range of tasks. This predictive capability has opened the door to two further uses: designing new biological entities with desired properties, and interpreting what these models have learned about underlying mechanisms. In this talk, I will present efforts from my group along each of these directions. First, I will introduce a causal refinement framework that improves the generalization of pre-trained genomic models by integrating locus-specific perturbation data from massively parallel reporter assays (MPRAs) and CRISPR interference (CRISPRi) screens through continual learning, allowing models to incorporate causal information while preserving previously learned knowledge. Next, I will present DNA Discrete Diffusion (D3), a generative framework for designing regulatory DNA with tunable, cell-type-specific activity that also yields representations useful for downstream supervised tasks. Finally, I will describe PICASSO, an approach that uses sparse dictionary learning to interpret and steer histopathology foundation models through counterfactuals over learned morphological concepts. Together, these efforts show how predictive, generative, and interpretive AI can advance biological discovery across modalities, from regulatory DNA to histopathology.

Bio:
Dr. Peter Koo is an Associate Professor at the Simons Center for Quantitative Biology at Cold Spring Harbor Laboratory, where he leads a research group at the intersection of machine learning and genomics. His lab develops interpretable and generalizable deep learning frameworks to decode the regulatory genome, spanning innovations in biologically informed model architectures, robust training strategies, and explainable AI methods for uncovering how DNA sequences control gene expression. Beyond genomics, his group also advances interpretable methods for foundation models in histopathology, aiming to reveal biologically meaningful representations in complex tissue images. By tightly integrating advances in machine learning with fundamental questions in gene regulation and disease biology, his research seeks to transform how models generate and test biological hypotheses, paving the way for more mechanistic and predictive genomics. Dr. Koo received his Ph.D. in Physics from Yale University and completed his postdoctoral training at Harvard University, where he transitioned into deep learning for biological sequence analysis.