PROGRAMMING FOR BIOLOGY
October 18 - 31, 2006
Application Deadline: July 15, 2006
Instructors:
Suzanna Lewis, University of California, Berkeley
Simon Prochnik, University of California, Berkeley
Lincoln Stein, Cold Spring Harbor Laboratory
James Tisdall, Dupont Corporation & Biocomputing Associates
Today, the computer is an indispensable part of a research biologist's
toolkit. The success of the human and other organism genome
projects has created terabytes of data on everything from
genetic linkage mapping, to nucleotide sequences, to protein
structures, stashed away in databases around the globe.
Large-scale technologies such as DNA microarrays and high-throughput
genotyping have transformed the nature of laboratory experimentation.
Furthermore, even when biologists are not generating large
data sets of their own, they will want to collect and analyze
data from myriad sources in the pursuit of novel candidates
or even entire research avenues. A few years ago it might
have been sufficient to use Excel spreadsheets for managing
laboratory data and canned Web interfaces for searching,
but as the volume of data grows and the subtlety of analysis
increases, these techniques, even supplemented by some simple
programming skills, have become inadequate. Modern biologists
must be adept at juggling disparate data sets in order to
pursue their research.
Designed for students and researchers
with some prior programming experience, the two-week Advanced
Bioinformatics program will give biologists the expanded
bioinformatics skills necessary to construct computational
systems that can exploit this increasingly complex information
landscape, with an emphasis on fitting the wide range of
existing analysis tools into extensible bioinformatics systems.
The course combines formal lectures with hands-on sessions
in which students work to solve a series of problem sets
covering common scenarios in the acquisition, validation,
integration, analysis and visualization of biological data.
For their final projects, students will pose problems using
their own data and work with each other and the faculty
to solve them.
The prerequisites for the course are basic
knowledge of UNIX, procedural Perl programming, HTML document
creation and the database query language, SQL. Lectures
and problem sets covering this background material are available
online, and students can study them before starting
the course.
Note that the primary focus of this course is to provide students
with practical programming experience, rather than to present
a detailed description of the algorithms used in computational
biology. For the latter, we recommend the Computational
Genomics course.
Speakers in the 2005 course included:
Emina Begovic, University of California, Berkeley
Peter Brokstein, DOE Joint Genome Institute
Roderic Guigo, Institut Municipal d'Investigacio Medica, Spain
Winston Hide, University of the Western Cape, South Africa
Gabor Marth, Boston College
Sheldon McKay, Cold Spring Harbor Laboratory
Chris Mungall, Berkeley Drosophila Genome Project
Lior Pachter, University of California, Berkeley
William Pearson, University of Virginia
Jason Stajich, Duke University
Paul Thomas, Applied Biosystems
Olga Troyanskaya, Princeton University
This course is supported by the National Human Genome Research Institute