Selected Topics in Computational Biology (SS 2007)
    
    
      - Lectures:
 Dr. Jens Ernst
       
- Module:
	IN2127
	
       
- 
	3 hrs of lecture per week (in English)
 
       
- Location and Times:
 Fri. 10:45 - 13:00, MI Room 03.11.18
 
      
- Audience:
 Students of Bioinformatics
       
- Prerequisites:
 Discrete Structures
 Fundamental Algorithms or Efficient Algorithms
 Randomized Algorithms
 Algorithms for Bioinformatics I,II
 
       
- Information and Course Material:
 
       
- Topics:
 - 
	     -  1. Pairwise Sequence Alignment
- 
	         -  1.1. Preliminaries 
- 
	            -  1.1.1. Basic Definitions
-  1.1.2. Edit Distances 
-  1.1.3. Alignments and their Relations to Edit Distances
 
-  1.2. Needleman Wunsch Algorithm
- 
	            -  1.2.1. Overview and Definitions
-  1.2.2. Inductive Derivation
-  1.2.3. Hirschberg Optiomization
 
-  1.3. Gapped Alignments
- 
	            -  1.3.1. End-gap Free Alignments
-  1.3.2. Smith Waterman Algorithm
-  1.3.3. Special Gap Penalties - Introduction
-  1.3.4. Byers Algorithm
-  1.3.5. Affine Gap Penalties (Gotoh)
-  1.3.6. Convex Gap Penalties (Myers, Miller)
-  1.3.7. Four-Russians Speedup
 
 
-  2. Suffix Trees - Theory, Implementation and Applications
- 
	         -  2.1. Ukkonen's Online Linear-Time Algorithm 
- 
	            -  2.1.1. Constructing Suffix Tries 
-  2.1.2. Structure and Representation of Suffix Trees 
-  2.1.3. Efficient Construction of Suffix Trees 
 
-  2.2. McCreight's Algorithm (Outline Only)
- 
	            -  2.2.1. Overview and Definitions
-  2.2.2. Linear-time Algorithm
-  2.2.3. Hash Code Implementation
 
 
-  3. Applications of Suffix Trees in Computational Biology 
- 
	         -  3.0. Classification of Exact String Matching Problems  
- 
	            -  3.0.1. Indexing Problems
-  3.0.2. Dictionary Matching Problems
 
-  3.1. Indexing - Obvious Application of Suffix Trees 
-  3.2. Indexing with One Error  
- 
	            -  3.2.1. Central Idea
-  3.2.2. Efficient Computation of Set Intersections
-  3.2.3. Indexing With One Error in the Presence of Exact Matches
 
-  3.3. Using Suffix Trees for Dictionary Matching
-  3.4. Dictionary Matching with One Error
- 
	            -  3.4.1. Central Idea
-  3.4.2. Set Intersection on Tree Paths 
 
-  3.5. Longest Common Substrings of Two or More Strings
 
-  4. Suffix Arrays - Outline 
- 
	         -  4.1. Basic Definitions 
-  4.2. Indexing using Accelerants and Super-Accelerants 
 
 
-  5. Selected Algorithms for Gene Expression Analysis 
                  -  5.1. Introduction 
                         -  5.1.1. Basics of Gene Expression 
-  5.1.2. DNA Microarray Technology 
-  5.1.3. Sources of Systematic and Random Error in Microarray Experiments 
-  5.1.4. Gene Expression Profiling (Type-II Experiments) 
-  5.2. Similarity and Dissimilarity Measures for Gene Expression Profiles 
                         -  5.2.1. Definition for Similarity and Distance Measures 
-  5.2.2. Minkowski Distance 
-  5.2.3. Pearson Correlation Coefficient 
-  5.2.4. Spearman Rank-Order Correlation 
-  5.2.5. Kendall's Tau 
-  5.2.6. Jackknife Correlation 
-  5.2.7. Mutual Information 
-  5.3. Elementary Clustering Algorithms 
                         -  5.3.1. Hierarchical Clustering
                             
	                     -  5.3.1.1. Single Linkage 
-  5.3.1.2. Complete Linkage 
-  5.3.1.3. Average Linkage 
 
-  5.3.2. K-Means 
-  5.3.3. HCS
                         
-  5.3.4. Self-Organizing Maps 
-  5.4. Probabilistic Algorithms 
                          -  5.4.1. Short Reminder On Probability Theory
                          
-  5.4.2. Introduction to Tail Inequalities
                              
	                      -  5.4.2.1. Markov and Chebyshev 
-  5.4.2.2. Chernoff 
 
-  5.4.3. CAST
                          
-  5.4.4. CLICK
                          
	  (Note: The information given here is incomplete, preliminary and will be subject to change.)
      
    
      
    
    
Suggested Reading:
	
    	
    
    
Selected Literature:
	
	-  "The smallest atuomaton recognizing the subwords of a text", A. Blumer et. al., Theoretical Computer Science 40 (1985), 31-55 
-  "A comparison of imperative and purely functional suffix tree constructions", R. Giegerich, S. Kurz, Science of Computer Programming 25 (1995), 187-218 
-  "From Ukkonen to McCreight and Weiner: A Unifying View to Linear-Time Suffix Tree Construction", R. Giegerich, S. Kurtz, Algorithmica 19 (1997), 331-353
	
-  "Algorithms on Strings, Trees and Sequences", Dan Gusfield Cambridge University Press 
-  "Suffix Arrays: A new method for on-line string searches", U. Manber, G. Myers, SIAM J. Comput., Vol 22, No 5 (1993), 935-948 
-  "A Space-Economical Sufix Tree Construction Algoritm", E. McCreight, Journal of the ACM, Vol 23, No 2 (1976), 262-272 
-  "Practical Suffix Tree Construction", S. Tata et. al., Proceedings of the 30th VLDB Conference (2004) 
-  "On-Line Construction of Suffix Trees", E. Ukkonen Algorithmica (1995), 249-260 
    
      Office hours:
	by appointment