Calendar
Module I — The Generalization Gap in Biohealth: Why “Scale” Fails
Objective: Understand why predictive success often fails to translate to deployment for two distinct reasons — generalization failure (models break under distribution shift) and identification failure (models capture associations, not causal mechanisms).
Week 1: Predictive Success vs. Causal Validity
- Mar 30
- Lecture 1: Why causality matters in biohealth
- Course framing: identification and generalization as scientific validity in real-world clinical and biological practice
- “World models” vs. shortcut predictors; predictive accuracy vs. counterfactual validity; getting to mechanistic validity
- Canonical failure modes across domains: proxy learning, selection/measurement bias, and feedback loops
- Course overview and logistics
- Apr 1
- Lecture 2: Dataset shift and identification, reframed causally
- Shift taxonomy (covariate/label/concept): what changed in the data-generating process?
- Selection vs. sampling; collider bias and selection; Berkson’s bias
- Feedback and performativity: when deployment changes the data-generating process
- Identification failure within a single population
- RCTs as the gold standard for identification: what they solve (confounding) and what they don’t (transportability, mechanism)
Week 2: Foundations & Aspirations
- Apr 6
- Lecture 3 (TA-led): Causal inference primer
- DAGs, the do-operator, confounding, d-separation, backdoor criterion
- Identification strategies: adjustment, instrumental variables, front-door criterion
- ATE, ATT, CATE: the estimands that matter in biomedicine
- Apr 8
- Lecture 4 (guest lecture): Virtual cell models, hope vs. hype
- Objectives and evaluation of “virtual cells” / “digital twins”
- Interpolation vs. extrapolation in perturbation space
- Evaluation beyond reconstruction: interventional prediction, transport across labs, mechanistic sanity checks
Module II — Mechanistic & Hybrid Models
Objective: Integrate mechanistic knowledge with ML to improve both identification (constraining models toward causal mechanisms) and generalization (enabling extrapolation beyond the training distribution, e.g., to new interventions and contexts).
Week 3: Inductive Bias and the Hybrid Modeling Toolkit
- Apr 13
- Lecture 5: Inductive bias taxonomy through case studies
- Taxonomy: architectural / regularization / data / evaluation, with biomedical examples (equivariance, pathway priors, biological data augmentation, benchmark leakage)
- Cautionary tales: mechanism-aligned bias vs. “bias toward the wrong story”
- How bias choice connects to both failure modes: shift-robust features and mechanism-aligned representations
- Student presentation: inductive bias in biomedical ML (e.g., equivariance in molecular models, graph-structured priors, or evaluation-as-bias)
- Apr 15
- Lecture 6: The hybrid modeling toolkit
- The hybrid spectrum: pure mechanistic → gray-box → pure data-driven
- Neural ODEs, universal differential equations, physics-informed neural networks
- Case studies: glucose dynamics (CGM), pharmacokinetics, wearable biosignals
- When hybrids help (extrapolation, sample efficiency, identifiability, interpretability) vs. when they mislead (compensating errors)
- Student presentation: hybrid modeling (e.g., neural ODE for clinical trajectories, PK/PD, mechanistic pathway integration, or gray-box approaches in biological systems)
Module III — Causal Representations & Learning from Interventions
Objective: Learn representations that capture causal structure rather than associational shortcuts; leverage interventional data to validate and improve them.
Week 4: Causal Representation Learning
- Apr 20
- Lecture 7: From pixels and counts to causal state
- Why representation is the bottleneck for both generalization and identification
- Invariance across environments; identifiability of latent causal variables
- Causal disentanglement; representations as hypotheses tested by interventional and OOD probes
- Student presentation: hybrid or mechanistic modeling (e.g., structured dynamics, physics-informed approaches to clinical data, or domain-knowledge-constrained learning)
- Apr 22
- Lecture 8 (student presentations): Causal representation learning (3 papers)
- Invariant/causal representations across environments, or causal foundation models
- Non-identifiability, nuisance leakage, or representation failure
- Causal disentanglement, independent mechanism analysis, or identifiability in single cells
Week 5: Learning from Interventional Data — Perturbation Biology as Causal Inference
- Apr 27
- Lecture 9: Perturbation biology, multimodal representations, and interpretability
- Estimands in perturbation biology
- Perturbation screens as the biological analogue of RCTs, with their own identification challenges (batch/plate and CRISPR non-targeting confounders)
- CRISPR as “intent-to-treat”: PerturbVI
- Multimodal learning from unpaired data
- Counterfactual inference in single cells; the benchmarking challenge (linear baselines vs. deep models)
- Student presentation: perturbation biology (e.g., response prediction, counterfactual inference, or benchmarking)
- Apr 29
- Lecture 10 (student presentations): Perturbation biology, counterfactual inference & causal discovery (3 papers)
- Perturbation response prediction or counterfactual inference in single cells
- Causal structure learning from interventional data
- Experimental design or active learning for perturbation screens
- May 1
- Project proposal due
- 1-page proposal (teams of up to 2)
Week 6: Foundation Models, Generative Approaches, and Evaluation
- May 4
- Lecture 11 (guest lecture): CellFlux, flow matching for perturbation prediction
- CellFlux: flow matching for modeling morphological responses to perturbations
- SDE extension with Bayesian treatment for improved generalization and OOD detection
- CellFluxRL: RL-based post-training with biologically anchored rewards
- Student presentation: generative modeling or flow matching for biological data
- May 6
- Lecture 12 (student presentations): Foundation models, evaluation & benchmarking (3 papers)
- Foundation models for single-cell or perturbation data
- Evaluation methodology and benchmarking
- Multimodal biological learning or mechanistic interpretability
Module IV — Decision-Making and Moving Models Across Domains
Objective: Learn and evaluate treatment policies from observational data; formalize when and how causal effects transfer across populations and biological systems.
Week 7: Policy Learning — Off-Policy Evaluation & Treatment Decisions
- May 11
- Lecture 13: Estimating the value of a policy you’ve never run
- The decision problem: learning a treatment policy from observational data
- Why naive evaluation fails; inverse propensity weighting and its instability
- Doubly robust estimation; learning individualized treatment rules
- Biomedical applications: adaptive treatment strategies, personalized dosing
- Student presentation: clinical policy learning or off-policy evaluation
- May 13
- Lecture 14 (student presentations): Policy learning & experimental design (3 papers)
- Clinical policy learning or off-policy evaluation
- Active learning or Bayesian experimental design
- Treatment effect estimation or confounding-robust evaluation
Week 8: Causal Transportability
- May 18
- Lecture 15: When can you trust a model trained elsewhere?
- Pearl’s transportability framework vs. domain adaptation; selection diagrams as a tool for reasoning about what must be invariant
- Two failure modes at the transport level: distribution shift vs. misidentified mechanism
- The biological evidence ladder as a transportability problem: cell lines → organoids → animal models → patients; transportability across cellular contexts
- Practical transportability across institutions and populations: what target-site data and operational constraints are needed
- Student presentations: cross-site/cross-population transfer; external validity across cellular contexts or populations
- May 18
- Project midway (stress test) report due
- One negative control plus one domain-shift or robustness experiment
- May 20
- Lecture 16 (guest lecture): Transportability in clinical development
- Synthetic control arms, real-world evidence (RWE), bridging RCTs and observational data
- FDA’s evolving stance on external controls; “virtual twin” approaches
- Student presentation: synthetic control arms, RWE, or external validity in clinical trials
Module V — Frontiers & Course Wrap-up
Objective: Evaluate foundation models, AI agents, and “world models” as scientific tools in biohealth; synthesize the course’s dual “identification + generalization” framework into a practical audit checklist.
Week 9: Agentic AI and Scientific Reasoning
- May 25
- No class (Memorial Day)
- May 27
- Lecture 17 (guest lecture): Can LLMs and AI agents reason causally about biology?
- Where foundation models help: representation, multimodal alignment, hypothesis generation, protocol writing
- Where they fail: hallucination, implicit selection bias, weak causal grounding
- Evaluation: stress tests under shift, counterfactual probes, calibration of scientific claims
- Student presentation: AI agents for science, or evaluation of foundation models in biomedicine
Week 10: Course Synthesis & Final Presentations
- Jun 1
- Lecture 18: Integrative synthesis
- What we learned about inductive bias, state representation, interventions, and transport
- The dual thesis: every model claim stress-tested against identification and generalization
- A “checklist for mechanistic generalization claims” to carry into research
- Open problems and where the field is headed
- Student presentations: LLMs and AI agents for causal reasoning in biology (2 papers)
- Jun 3
- Final project presentations
- Short talks or poster session (TBD)
Week 11: Final Report Submission
- Jun 8
- Final project report due
- 8-page report (plus references), including a “generalization and identification contract” section