ECS 124 THEORY AND PRACTICE OF BIOINFORMATICS
(4) III
Course Information
Lecture:
Tuesday, Thursday 1:30 - 3:00
1128 Hart Hall
Laboratory:
Friday 12:00 - 1:00, with spillover from 1:00 - 2:00
Lab Location: 73 Hutchison Hall (Enter through 75 Hutchison in the basement)
Instructor
Dan Gusfield, gusfield@cs.ucdavis.edu
2125 Kemper
Office hours will be posted next week.
Teaching Assistant
Marisano James, marisano@yahoo.com
Office hours: 1-2 Monday and 3-5 Weds. in 55 Kemper Hall; 1-2 Friday in 73 Hutchison.
Class Webpage:
cs124.cs.ucdavis.edu
Lecture Videos:
These lecture videos were recorded in 2002. I will provide an index of the lectures as we go along;
they roughly follow the order of this year's lectures. However, not every lecture given this year has
a 2002 counterpart, and conversely, some 2002 lectures cover material we will not cover.
Note that the URL for the 2002 class and the password for access to class material, both given in the
first lecture, are no longer correct. See the 2008 material for that information.
Prerequisites:
Some exposure to computer programming at the level of ECS 10 or 30 or E 5 or E 6; Stat 12
or 13 or 32 or 100 or Math 131/Stat 131A; some exposure to genetics/molecular biology at the level of
Bio Sci 1A or MCB 10.
Grading:
To pass the course, you must do the labs. Most parts of the labs will not be
graded in detail, only checked to see that they were done. The grade will come from
the parts of the labs that are graded in detail (20% of the grade),
quizzes/midterm (40%), and the final (40%).
Catalog Description:
Fundamental biological, mathematical and
algorithmic models underlying bioinformatics; sequence analysis,
database search, gene prediction, molecular structure comparison
and prediction, phylogenetic trees, high throughput biology,
massive datasets; applications in molecular biology and genetics;
use and extension of common bioinformatics tools.
Goals:
I. Understanding the role and utility of bioinformatics in modern biology
II. Understanding basic biological, mathematical and algorithmic concepts, techniques and models underlying bioinformatics tools
III. Mastery of common bioinformatics tools
IV. Simple programming in Perl to extend the utility of common bioinformatics tools
Expanded Course Description:
I. Initial examples of the power of bioinformatics in modern biology
A. The importance of sequence and structure comparison and of database search
B. The use of sequence analysis in laboratory protocols
C. The use of phylogenetics in evolution and non-evolutionary areas of biology
II. Sequence analysis
A. Probabilistic and biological models underlying sequence alignment
B. Computational efficiency and the need for compromises in the models
C. The general technique of Dynamic Programming
D. Pairwise sequence alignment - algorithms for global, local alignment and variations
E. Algorithms for multiple sequence alignment and the identification/use of motifs
F. Database search - FASTA, BLAST, PSI-BLAST, scoring matrices, statistical significance and its significance
G. Creation and use of motif models
H. Novel uses of sequence analysis in studying DNA, RNA and proteins
I. Sequence analysis in genomics and high throughput biology
III. Phylogenetic algorithms
A. Probabilistic and ideal-data models underlying phylogenetic algorithms
B. Distance-based methods
C. Character/parsimony-based methods
D. Maximum-likelihood methods
E. PHYLIP
F. Evolutionary and non-evolutionary uses for phylogenetics
G. The interaction of phylogenetics and sequence analysis
IV. Protein and RNA structure comparison and prediction
A. Ideal-data models underlying structure comparison and prediction
B. Algorithms for RNA folding
C. Methods and problems in protein structure comparison and prediction
D. Biological use of structure prediction and comparison tools
E. Overview of common bioinformatics utilities and web-based resources such as GCG and Entrez
Textbooks:
Recommended Texts:
A. Johnson, Elements of Programming with Perl, Manning Publications, 2000
D. Gusfield and K. Stevens, Notes for an Undergraduate Course on Bioinformatics, 200?, distributed on-line.
Additional library readings. Many available
on-line, some on
Supplemental Texts
R. Durbin et al., Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Cambridge Press, 1998
A. Baxevanis and B. Ouellette, Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, Wiley-Interscience, 1998
D. Gusfield, Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology, Cambridge Press, 1997
P. Clote, Introduction to Computational Molecular Biology, Wiley, 2000
Claverie and Notredame, Bioinformatics for Dummies
I. Korf, M. Yandell and J. Bedell, BLAST, O'Reilly, 2003
J. Tisdall, Beginning Perl for Bioinformatics, O'Reilly (2002)
Homework:
Each homework set includes creative problems
as well as recitation problems to strengthen understanding and
discover new material.
Computer Usage:
The lab portion of the class will emphasize
practical computer exercises that involve both using established bioinformatics
software and writing simple programs in Perl.
The course is aimed both at biology and
computer science students. It is expected that the typical biology
student will have a stronger background in molecular biology,
genetics and biochemistry (not listed as a prerequisite) than
is reflected in the prerequisite list, and that the computer
science student will have a stronger background in programming
and mathematics than is listed in the prerequisites. Some of
the laboratory assignments will be done by groups mixing biology
and computer science students whose backgrounds should complement
each other.
The laboratory portions of the course will
teach the hands-on computer tools and some programming in Perl,
while the lectures will focus on the fundamental biological,
mathematical and algorithmic chain of reasoning leading to the
models that underlie these tools. Thus, the course requires some
sophistication in mathematics, and some intuitive understanding
of what computer programming is (a prior exposure to some computer
programming is required, and a prior exposure to Unix is desirable),
but we do not assume an extensive background in programming.
No prior experience in Perl is assumed. Facility in using a web-browser
is assumed.
Almost all the lab assignments can be done on any machine that
has web access and Perl. The scheduled time for the computer
lab is intended to get students started on their computer work,
but additional computer work outside the lab time is expected.