PROGRAMMING FOR BIOLOGY
October 15 - 28, 2008
Application Deadline: July 15, 2008
Instructors:
Suzanna Lewis, University of California, Berkeley
Simon Prochnik, University of California, Berkeley
James Tisdall, Dupont Corporation & Biocomputing Associates
A computer is already an indispensable tool for database searches,
but web-based tools alone are not enough for today’s
biologist, who needs to access and work with data from myriad
sources in disparate formats. This need will become ever
more important as new technologies increase the already
exponential rate at which biological data is generated.
Designed for students and researchers with little or no
prior programming experience, the two-week Programming
for Biology course will give biologists the bioinformatics
skills necessary to exploit this abundance of biological
data.
The course is based around the Perl scripting language,
because of its ease of learning and its incredible wealth
of ready-built code modules (e.g. bioperl) designed to solve
common biological problems. Starting with introductory coding,
and continuing with a survey of available biological libraries
and practical topics in bioinformatics, students end by
learning how to construct and run powerful and extensible
analysis pipelines in a straightforward manner. The course
combines formal lectures with hands-on sessions in which
students work to solve problem sets covering common scenarios
in the acquisition, validation, integration, analysis and
visualization of biological data. For their final projects,
which run during the second week of the course, students
will pose problems using their own data and work with each
other and the faculty to solve them. Final projects have
formed the basis of publications as well as public biological
websites (see, for example: http://bio.perl.org/wiki/Deobfuscator).
The prerequisite for the course is a basic knowledge of
UNIX. Lectures and problem sets covering this background
material are available online from previous years and students
can study this material before starting the course. Note
that the primary focus of this course is to provide students
with practical programming experience, rather than to present
a detailed description of the algorithms used in computational
biology. For the latter, we recommend the Computational
Genomics course.
Speakers in the 2007 course included: Tyler Alioto,
Emina Begovic, Scott John Cain, George Hartzell,
Matt Hibbs, Gabor Marth, Jason Stajich, Paul Thomas &
Mark Yandell.
This course is supported by the National Human Genome
Research Institute.