Course Description
Over the last decade, massively parallel DNA sequencing has markedly impacted the practice of modern biology and is increasingly used in the practice of medicine. The constant improvement of these platforms means that costs and data generation timelines have been reduced by orders of magnitude, enabling investigators to conceive and carry out sequencing-based projects that were previously prohibitive in time, cost, and sample number. Furthermore, the application of these technologies to questions that were previously not experimentally approachable is broadening their impact. However, data analysis remains a complex and often vexing challenge, especially as data volumes grow and AI-assisted tools rapidly reshape how that analysis is performed.
This intensive two-week course provides a comprehensive introduction to genomics technologies and bioinformatics, with a strong emphasis on hands-on data analysis. The appropriateness of different sequence data types and issues of data quality will be emphasized. Participants will explore short- and long-read platforms, single-cell and spatial transcriptomics, bulk RNA-seq, epigenomics, and germline and somatic variant analysis (including structural and copy-number variation) through a combination of lectures, labs, and group exercises. Most analyses are run in the cloud, with topics including experimental design, command-line basics, differential expression, variant calling, batch correction, pathway analysis, data visualization, and more in R and Python. Drawing on more than 100 software tools and packages, students gain experience with widely used tools and file formats (e.g., FASTQ, BAM, VCF, IGV, GATK, DESeq2, Seurat/Scanpy, Cell Ranger, Loupe) through interactive assignments that mirror real-world applications in cancer genomics, agriculture, clinical diagnostics, and more.
Central to the course is a structured framework for the responsible use of AI in genomic analysis: through dedicated lectures and integrated exercises, students apply AI chatbots (e.g., Claude, ChatGPT), AI-assisted coding tools (e.g., Claude Code, Codex, Copilot), and notebook-based agents to draft, debug, and interpret analysis code while critically evaluating its output, reproducibility, and ethical implications. Daily “bring your own data” office hours, team-based assignments, alumni and guest lectures, and a dedicated Slack community further support a rigorous, application-driven curriculum designed to prepare students for high-impact genomic research.
We encourage applicants from a diversity of scientific backgrounds including molecular evolution, development, neuroscience, medicine, cancer, plant biology and microbiology.
An overview of the course schedule can be found here. Please note that schedules are subject to change.
Matthew Attreed, Oxford Nanopore Technologies, New York, NY
Katie Campbell, University of California, Los Angeles, Los Angeles, CA
Bimal Chaudhari, Nationwide Children's Hospital, Columbus, OH
Lara Ianov, University of Alabama at Birmingham, Birmingham, AL
Justin Kinney, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
Zachary Lippman, CSHL/HHMI, Cold Spring Harbor, NY
Jessica Mozersky, Washington University in St. Louis, St. Louis, MO
Michael Schatz, Johns Hopkins University, Baltimore, MD
Alex Wagner, Nationwide Children's Hospital, Columbus, OH