Oberseminar Lehrstuhl Mayr
-
Speicherplatzeffiziente q-Gramm-Indexe
Raum: 03.09.058
Zeit: Donnerstag, 23.02.2012, 11:00–11:30
Vortragender: Johannes Merkle
Ein wichtiges Forschungsgebiet der Bioinformatik ist das Durchsuchen von Sequenzen,
die biologische Daten repräsentieren, beispielsweise DNA. Da diese Sequenzen sehr groß
sein können, ist es nötig, effiziente Verfahren bei der Arbeit mit diesen Daten zu verwenden.
q-Gramm-Indexe sind Indexstrukturen mit der man effizient in großen Texten Zeichenketten
suchen kann. Sie haben allerdings den Nachteil, dass sie selbst sehr viel Speicher
benötigen, wenn der Text sehr groß ist. Deshalb wurden diverse Ansätze entwickelt, deren
Ziel es ist, die Größe von q-Gramm-Indexen zu reduzieren, darunter der n-Gram/2L sowie
der VGRAM-Index. Beide sollen im Rahmen dieser Bachelorarbeit vorgestellt und
- mit Fokus auf ihren Speicherbedarf - analysiert werden. Als Vergleich wird die Implementierung
des q-Gramm-Index der SeqAn-Programmbibliothek verwendet, in welcher
die beiden Verfahren auch implementiert wurden.
-
Inequalities for the Number of Walks in Graphs
Raum: 03.11.018
Zeit: Mittwoch, 21.12.2011, 13:00–14:00
Vortragender: Jeremias Weihmann
We investigate the growth of the number w_k of directed walks of length k in undirected graphs as well as related inequalities.
In that regard we generalize the inequalities of Erdös/Simonovits and Dress/Gutman using an inequality of Blakley/Dixon. An eigenwert approach yields the sandwich theorem which generalizes the inequalities of Dress/Gutman, Lagarias et al. and the inequality w_{2a} w_{2(a+b+c)} >= w_{2a+c} w_{2(a+b)+c} which is implied by a matrix theorem of Marcus/Newman.
We show a family of lower bounds for the largest eigenvalue of the adjacency matrix which contains Nikiforovs family of lower bounds as a special case. Using the sandwich theorem, we prove the monotonicity of this general family of lower bounds.
Finally, we show that the inequality w_0 w_3 >= w_1 w_2 holds for all trees while it isn't valid for general graphs. This demonstrates that there are nontrivial special cases of generalizations of the sandwich theorem that are valid if we consider nontrivial restricted graph families.
-
Implementation and comparison of suffix tree representations
Raum: 03.11.018
Zeit: Dienstag, 20.12.2011, 10:30–11:15
Vortragender: Alexander Aumann
Searching through large amounts of sequence data plays an important role in computer
science in general and in bio informatics specifically. Thus efficient indexing techniques are
required, one of them being the suffix tree. Suffix trees are very large though in relation to
the input sequence and this becomes a growing problem as the gap between disk storage
capacity and memory size increases. Luckily there has been intensive research on this topic
in the last 20 years and the purpose of this paper is to present some of the latest ideas for
disk based suffix trees as well as implementing one memory based and one of the proposed
disk based algorithms within the SeqAn library which is free to use for everybody who
in need for sequence analysis tools. These implementations are finally evaluated, with a
strong focus on the implementation of the disk based algorithm, the STTD64 proposed by
Halachev, to show that they are of practical usage.
-
Distributed Verification and Hardness of Distributed Approximation
Raum: 03.11.018
Zeit: Mittwoch, 16.11.2011, 14:00–15:15
Vortragender: Stephan Holzer (ETH Zürich)
We study the verification problem in distributed networks and show that in a distributed setting e.g. verifying that a subgraph H of a graph G is a minimum spanning tree takes much more time then actually computing a minimum spanning tree of G. We introduce a framework for a systematic study of distributed verification, and give almost tight lower bounds on the running time of distributed verification algorithms for many fundamental problems such as connectivity, spanning connected subgraph, and s - t cut verification.
We then show applications of these results in deriving strong unconditional time lower bounds on the hardness of distributed approximation for many classical optimization problems including minimum spanning tree, shortest paths, and minimum cut. Many of these results are the first non-trivial lower bounds for both exact and approximate distributed computation and they resolve previous open questions. Our result implies that there can be no distributed approximation algorithm for minimum spanning tree that is significantly faster than the current exact algorithm, for any approximation factor.