Programming for Biology
October 10 - 25, 2016
Application Deadline: August 19, 2016
Simon Prochnik, DOE Joint Genome Institute
Sofia Robb, Stowers Institute for Medical Research

See the roll of honor - who's taken the course in the past

Web-based tools are no longer enough for today's biologist, who must access and analyze large data sets from myriad sources in disparate formats. The need to design and implement custom analysis pipelines is becoming ever more important as new technologies increase the already-exponential rate at which biological data are generated. Designed for lab biologists with little or no programming experience, this course will give students the bioinformatics and scripting skills necessary to exploit this abundance of biological data. The only prerequisite for the course is basic knowledge of UNIX; some scripting experience is also helpful. Lectures and problem sets from previous years are available online, and students are welcome to study this background material before starting the course.

This course teaches Perl, a scripting language that is easy to learn and efficient to use. Perl also has a vast array of ready-built tools, such as BioPerl, that are designed to solve common biological problems. The course begins with one week of introductory coding, continues with a survey of available biological libraries and practical topics in bioinformatics, and ends with a final group project. Formal instruction is provided on every topic by the instructors, teaching assistants, and invited experts. Students will work together to solve problem sets covering common scenarios in the acquisition, validation, integration, analysis, and visualization of biological data. They will learn how to design, construct, and run powerful and extensible analysis pipelines in a straightforward manner. For their final projects, students will pose problems using their own data and work with each other and the faculty to solve them. In the past, final projects have formed the basis of publications and freely available resources (see, for example, the Deobfuscator module in BioPerl). Students are provided with a library of Perl reference books that they can take home with them.

Note that the primary focus of this course is to provide students with practical programming experience, rather than to present a detailed description of the algorithms used in computational biology. For the latter, we recommend the Computational & Comparative Genomics course.

Support & Financial Aid

Major support is generously provided by the National Human Genome Research Institute.

Access to cloud computational resources may be supported by an education grant from Amazon Web Services.

Financial aid is available to help offset tuition costs as follows:


1) Financial aid for US applicants is provided by the NIH National Human Genome Research Institute
2) Financial aid for international applicants is provided by the Howard Hughes Medical Institute
3) Interdisciplinary Fellowships (transitioning from outside biology) & Scholarships (transitioning from other biological disciplines) are provided by the Helmsley Charitable Trust

Please indicate your eligibility for any of these funding sources in a financial aid request submitted as part of your application materials. Financial aid requests do not affect selection decisions made by the instructors.

Cost (including board and lodging): $4,080

A short form is available to confirm your interest in the course, but it is not an official application form. To apply to the course, complete the full application. No fees are due until you have completed the full application process and are accepted into the course.

Students accepted into the course should plan to arrive by early evening on October 9 and to depart after lunch on October 25.