ECS 124 THEORY AND PRACTICE OF BIOINFORMATICS
(4) III

Course Information

Lecture: Tuesday, Thursday 1:30 - 3:00
1128 Hart Hall

Laboratory: Friday 12:00 - 1:00, with spillover from 1:00 - 2:00
Lab Location: 73 Hutchison Hall (Enter through 75 Hutchison in the basement)

Instructor

Dan Gusfield, gusfield@cs.ucdavis.edu
2125 Kemper
Office hours will be posted next week.

Teaching Assistant

Marisano James, marisano@yahoo.com
Office hours: 1-2 Monday and 3-5 Wednesday in 55 Kemper Hall; 1-2 Friday in 73 Hutchison.

Class Webpage:

cs124.cs.ucdavis.edu

Lecture Videos:

These lecture videos were recorded in 2002. I will provide an index of the lectures as we go along; they roughly follow the order of this year's lectures. However, some lectures given this year have no counterpart in 2002, and conversely, some 2002 lectures cover material we will not cover. Note that the URL for the 2002 class and the password for access to class material, both given in the first lecture, are no longer correct. See the 2008 material for that information.

Prerequisites:

Some exposure to computer programming at the level of ECS 10 or 30 or E 5 or E 6; Stat 12 or 13 or 32 or 100 or Math 131/Stat 131A; some exposure to genetics/molecular biology at the level of Bio Sci 1A or MCB 10.

Grading:

To pass the course, it is necessary to do the labs, but most parts of the labs will not be intensely graded, only checked to see that they were done. The grade will come from the parts of the labs that are intensely graded (20% of the grade), quizzes/midterm (40%), and the final (40%).

Catalog Description:

Fundamental biological, mathematical and algorithmic models underlying bioinformatics; sequence analysis, database search, gene prediction, molecular structure comparison and prediction, phylogenetic trees, high throughput biology, massive datasets; applications in molecular biology and genetics; use and extension of common bioinformatics tools.

Goals:

I. Understanding the role and utility of bioinformatics in modern biology

II. Understanding basic biological, mathematical and algorithmic concepts, techniques and models underlying bioinformatics tools

III. Mastery of common bioinformatics tools

IV. Simple programming in Perl to extend the utility of common bioinformatics tools

Expanded Course Description:

I. Initial examples of the power of bioinformatics in modern biology

A. The importance of sequence and structure comparison and of database search
B. The use of sequence analysis in laboratory protocols
C. The use of phylogenetics in evolution and non-evolutionary areas of biology

II. Sequence analysis

A. Probabilistic and biological models underlying sequence alignment
B. Computational efficiency and the need for compromises in the models
C. The general technique of Dynamic Programming
D. Pairwise sequence alignment - algorithms for global, local alignment and variations
E. Algorithms for multiple sequence alignment and the identification/use of motifs
F. Database search - FASTA, BLAST, PSI-BLAST, scoring matrices, statistical significance and its significance
G. Creation and use of motif models
H. Novel uses of sequence analysis in studying DNA, RNA and proteins
I. Sequence analysis in genomics and high throughput biology

III. Phylogenetic algorithms

A. Probabilistic and ideal-data models underlying phylogenetic algorithms
B. Distance-based methods
C. Character/parsimony-based methods
D. Maximum-likelihood methods
E. PHYLIP
F. Evolutionary and non-evolutionary uses for phylogenetics
G. The interaction of phylogenetics and sequence analysis

IV. Protein and RNA structure comparison and prediction

A. Ideal-data models underlying structure comparison and prediction
B. Algorithms for RNA folding
C. Methods and problems in protein structure comparison and prediction
D. Biological use of structure prediction and comparison tools
E. Overview of common bioinformatics utilities and web-based resources such as GCG and Entrez

Textbooks:

Recommended Texts:

Elements of Programming with Perl, A. Johnson, Manning Publications, 2000

Notes for an Undergraduate Course on Bioinformatics,
D. Gusfield and K. Stevens, 200?, distributed on-line.

Additional library readings. Many available on-line, some on

Supplemental Texts:

R. Durbin et al., Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Cambridge University Press, 1998

A. Baxevanis and B. Ouellette, Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, Wiley-Interscience 1998

D. Gusfield, Algorithms on Strings, Trees and Sequences: Computer
Science and Computational Biology, Cambridge University Press, 1997.

P. Clote, Introduction to Computational Molecular Biology, Wiley (2000)

J.-M. Claverie and C. Notredame, Bioinformatics for Dummies

I. Korf, M. Yandell and J. Bedell, BLAST, O'Reilly, 2003

J. Tisdall, Beginning Perl for Bioinformatics, O'Reilly (2002)

Homework:

Each homework set includes creative problems as well as recitation problems to strengthen understanding and discover new material.

Computer Usage:

The lab portion of the class will emphasize practical computer exercises, both using established bioinformatics software and writing simple programs in Perl.

The course is aimed both at biology and computer science students. It is expected that the typical biology student will have a stronger background in molecular biology, genetics and biochemistry (not listed as a prerequisite) than is reflected in the prerequisite list, and that the computer science student will have a stronger background in programming and mathematics than is listed in the prerequisites. Some of the laboratory assignments will be done by groups mixing biology and computer science students whose backgrounds should complement each other.

The laboratory portions of the course will teach the hands-on computer tools and some programming in Perl, while the lectures will focus on the fundamental biological, mathematical and algorithmic chain of reasoning leading to the models that underlie these tools. Thus, the course requires some sophistication in mathematics, and some intuitive understanding of what computer programming is (a prior exposure to some computer programming is required, and a prior exposure to Unix is desirable), but we do not assume an extensive background in programming. No prior experience in Perl is assumed. Facility in using a web-browser is assumed. Almost all the lab assignments can be done on any machine that has web access and Perl. The scheduled time for the computer lab is intended to get students started on their computer work, but additional computer work outside the lab time is expected.