You are viewing a past event. Please visit the course homepage for current courses.
Statistical Analysis of Genome Scale Data
June 6 - 20, 2025

Key Dates
Application Deadline: March 1, 2025
Arrival: June 6th by 6pm ET
Departure: June 20th around 12pm ET

CSHL courses are intensive, running all day and often including evenings and weekends; students are expected to attend all sessions and reside on campus for the duration of the course.

Instructors:
Harmen Bussemaker, Columbia University
Sean Davis, University of Colorado Anschutz School of Medicine
Hans Tomas Rube, University of California Merced
Min Zhang, University of California, Irvine


See the roll of honor - who's taken the course in the past

Over the past decade, high-throughput assays have become pervasive in biological research due to both rapid technological advances and decreases in overall cost. To properly analyze the large data sets generated by such assays and thus make meaningful biological inferences, both experimental and computational biologists must understand the fundamental statistical principles underlying analysis methods. This course is designed to build competence in statistical methods for analyzing high-throughput data in genomics and molecular biology.

Topics Include:
  • The R environment for statistical computing and graphics
  • Introduction to Bioconductor
  • Review of basic statistical theory and hypothesis testing
  • Experimental design, quality control, and normalization
  • High-throughput sequencing technologies
  • Expression profiling using RNA-seq and microarrays
  • In vivo protein binding using ChIP-seq
  • High-resolution chromatin footprinting using ATAC-seq
  • Integrative analysis of data from parallel assays
  • Representations of DNA binding specificity and motif discovery algorithms
  • Predictive modeling of gene regulatory networks using machine learning

Format: Detailed lectures and presentations by instructors and guest speakers will be combined with hands-on computer tutorials. The methods covered in the lectures will be applied to example high-throughput data sets.

2024 Speakers:
Leonardo Collado Torres, Lieber Institute for Brain Development, Baltimore, MD 
Ludwig Geistlinger, Harvard Medical School, Boston, MA
Jussi Taipale, University of Cambridge, United Kingdom  
Julia Zeitlinger, Stowers Institute for Medical Research, Kansas City, MO 


This course is supported with funds provided by the National Human Genome Research Institute of the National Institutes of Health.

Support & Stipends:

On average, 50% of trainees receive financial support based on need.

Stipends are available to offset tuition costs as follows:

        

Please indicate your eligibility for funding in your stipend request submitted when you apply to the course. Stipend requests do not affect selection decisions made by the instructors. 

Cost (including board and lodging): $4,560 USD

No fees are due until you have completed the full application process and are accepted into the course.

Before applying, ensure you have:
  1. Personal statement/essay;
  2. Letter(s) of recommendation;
  3. Curriculum vitae/resume (optional);
  4. Financial aid request (optional).

If you are not ready to fully apply but wish to express interest in applying, receive a reminder two weeks prior to the deadline, and tell us about your financial aid requirements, click below: