The Genome Access Course (TGAC)
April 2 - 4, 2023
The April Genome Access Course is full, but you can still register for the waitlist. Please register with "Fees Waived" and we will contact you if a place becomes available.
Delphine Fagegaltier, Merck Research
Emily Hodges, Vanderbilt University School of Medicine
Benjamin King, University of Maine
Steven Munger, The Jackson Laboratory
COVID-19: All participants planning to attend in person will be required to attest to recent COVID vaccination with an FDA- or WHO-approved vaccine. Additional safety measures will be in line with current NY and federal guidelines applicable in Spring 2023.
The Genome Access Course (TGAC) is an intensive two-day introduction to bioinformatics. Participants are expected to arrive by 6 p.m. on the first day (Sunday, April 2); the course then runs for two full days, ending at 5 p.m. on the third day (Tuesday, April 4).
TGAC is divided into modules, each designed to give a broad overview of a given topic, with ample time for examples chosen by the instructors. Each module features a brief lecture describing the theory, methods, and tools, followed by a set of worked examples that students complete. Students are encouraged to engage instructors during the course with specific tasks or problems that pertain to their own research.
The core of the course is the analysis of sequence information framed in the context of completed genome sequences. Featured resources and examples come primarily from mammalian species, but the concepts can be applied to any species. The course also features methods to assist the analysis and prioritization of gene lists from large-scale microarray gene expression and proteomics experiments. The topics covered in each two-day iteration of TGAC are taken from the following list.
NCBI and Model Organism Database Resources
- NCBI sequence, gene, and protein resources
- Gene Ontology
- Model organism databases: Mouse Genome Informatics, ZFIN
UCSC and Ensembl Genome Browsers
- Overview and comparison of resources
- Adding custom tracks
- Bulk genome analysis tools: BioMart, UCSC Table Browser
- Types of sequence and structural variation
- Large-scale biobanks: UK BioBank, All of Us
- SNP resources: dbSNP, gnomAD
- GWAS resources: GWAS Catalog
- Phenome-wide association resources: PheWeb, Open Targets
- Prioritizing variants by predicting variant effects: Ensembl VEP
Functional Genomic Elements and ENCODE
- Functional genomic resources: ENCODE
- Viewing ENCODE data using UCSC Genome Browser TrackHubs
High-Throughput Sequence Data Analysis
- Reference-based analysis workflows
- Common file formats: FASTQ, SAM, BAM
- Quality control and diagnostic analyses
- Repositories of high-throughput sequence data: GEO and SRA
RNA Sequencing Data Analysis
- Experimental design
- Analysis workflows
Hands-On Analysis of Sequence Variation Data Using Galaxy
- Mapping reads to a reference sequence
- Generating read counts to infer gene expression levels
- Generating lists of differentially expressed genes
Hands-On Introduction to the R Statistical Computing Environment
- Overview of R packages, Bioconductor, and RStudio
- Basic R syntax for data analysis and plotting
Hands-On Analysis of Bulk RNA Sequencing Data Using the R/DESeq2 Package
- Analyze bulk RNA sequencing read count data using the R/DESeq2 package in RStudio
- Generate diagnostic plots: MA plots, volcano plots, and heatmaps
Gene Set Enrichment and Pathway Analysis
- Gene set enrichment analysis tools: DAVID
- Pathway resources: KEGG
- Protein interaction resources: STRING
Each student will be provided with a laptop (if needed) and internet access for the duration of the course. You can also bring your own laptop, provided it meets the following requirements: 1) the ability to run R and RStudio (installation instructions will be provided), 2) a standard browser (Chrome, Internet Explorer, Firefox, etc.) that is up to date with security patches and bug fixes, 3) wireless internet capability, and 4) the ability to view and modify plain text files and spreadsheets (e.g., Microsoft Word and Excel). Both PCs and Macs are acceptable as long as they are updated with all security patches and bug fixes.
The Genome Access Course is open to all on a first-come, first-served registration system. It is most beneficial for bench scientists transitioning into projects that require intensive analysis or integration of large data sets. The course will introduce you to publicly available resources, and it will also help you develop a vocabulary that can be used to collaborate with computational scientists.
If you already have significant programming or data analysis experience, TGAC is not appropriate for you. For a more detailed curriculum on methods used in computational biology, please see the Computational Genomics course. Students interested in the practical aspects of software development are encouraged to apply to the course on Programming for Biology. Students who would like in-depth training in the analysis of next-generation sequencing data (e.g., genome assembly and annotation, SNP calling, and the detection of structural variants) may be interested in the course on Advanced Sequencing Technologies. Finally, please see the course on Statistical Methods of Genome Scale Data if you would like training in the statistical analysis of high-throughput genomics data.
Major support is provided by the Helmsley Charitable Trust. Limited financial aid is available; please apply in writing to Olivia Mulligan describing your need for financial support.
Academic Package (two nights of housing/Single Room/Private Bath): $1,185
Academic Package (two nights of housing/Single Room/Communal Bath): $1,135
Academic Package (two nights of housing/Double Occupancy): $1,060
Corporate Package (two nights of housing/Single Room/Private Bath): $1,880
Academic No-Housing Package: $875
Corporate No-Housing Package: $1,570
Extra nights: $285 per night, including food
All packages cover registration, food, coffee breaks, and a reception. Transportation to and from Cold Spring Harbor is not included. Full payment is due three weeks prior to the course.