LEA

Oberseminar Lehrstuhl Mayr

  • Speicherplatzeffiziente q-Gramm-Indexe

    Raum: 03.09.058
    Zeit: Donnerstag, 23.02.2012, 11:00–11:30
    Vortragender: Johannes Merkle

    Ein wichtiges Forschungsgebiet der Bioinformatik ist das Durchsuchen von Sequenzen, die biologische Daten repräsentieren, beispielsweise DNA. Da diese Sequenzen sehr groß sein können, ist es nötig, effiziente Verfahren bei der Arbeit mit diesen Daten zu verwenden. q-Gramm-Indexe sind Indexstrukturen mit der man effizient in großen Texten Zeichenketten suchen kann. Sie haben allerdings den Nachteil, dass sie selbst sehr viel Speicher benötigen, wenn der Text sehr groß ist. Deshalb wurden diverse Ansätze entwickelt, deren Ziel es ist, die Größe von q-Gramm-Indexen zu reduzieren, darunter der n-Gram/2L sowie der VGRAM-Index. Beide sollen im Rahmen dieser Bachelorarbeit vorgestellt und - mit Fokus auf ihren Speicherbedarf - analysiert werden. Als Vergleich wird die Implementierung des q-Gramm-Index der SeqAn-Programmbibliothek verwendet, in welcher die beiden Verfahren auch implementiert wurden.
  • Inequalities for the Number of Walks in Graphs

    Raum: 03.11.018
    Zeit: Mittwoch, 21.12.2011, 13:00–14:00
    Vortragender: Jeremias Weihmann

    We investigate the growth of the number w_k of directed walks of length k in undirected graphs as well as related inequalities. In that regard we generalize the inequalities of Erdös/Simonovits and Dress/Gutman using an inequality of Blakley/Dixon. An eigenwert approach yields the sandwich theorem which generalizes the inequalities of Dress/Gutman, Lagarias et al. and the inequality w_{2a} w_{2(a+b+c)} >= w_{2a+c} w_{2(a+b)+c} which is implied by a matrix theorem of Marcus/Newman. We show a family of lower bounds for the largest eigenvalue of the adjacency matrix which contains Nikiforovs family of lower bounds as a special case. Using the sandwich theorem, we prove the monotonicity of this general family of lower bounds. Finally, we show that the inequality w_0 w_3 >= w_1 w_2 holds for all trees while it isn't valid for general graphs. This demonstrates that there are nontrivial special cases of generalizations of the sandwich theorem that are valid if we consider nontrivial restricted graph families.
  • Implementation and comparison of suffix tree representations

    Raum: 03.11.018
    Zeit: Dienstag, 20.12.2011, 10:30–11:15
    Vortragender: Alexander Aumann

    Searching through large amounts of sequence data plays an important role in computer science in general and in bio informatics specifically. Thus efficient indexing techniques are required, one of them being the suffix tree. Suffix trees are very large though in relation to the input sequence and this becomes a growing problem as the gap between disk storage capacity and memory size increases. Luckily there has been intensive research on this topic in the last 20 years and the purpose of this paper is to present some of the latest ideas for disk based suffix trees as well as implementing one memory based and one of the proposed disk based algorithms within the SeqAn library which is free to use for everybody who in need for sequence analysis tools. These implementations are finally evaluated, with a strong focus on the implementation of the disk based algorithm, the STTD64 proposed by Halachev, to show that they are of practical usage.
  • Distributed Verification and Hardness of Distributed Approximation

    Raum: 03.11.018
    Zeit: Mittwoch, 16.11.2011, 14:00–15:15
    Vortragender: Stephan Holzer (ETH Zürich)

    We study the verification problem in distributed networks and show that in a distributed setting e.g. verifying that a subgraph H of a graph G is a minimum spanning tree takes much more time then actually computing a minimum spanning tree of G. We introduce a framework for a systematic study of distributed verification, and give almost tight lower bounds on the running time of distributed verification algorithms for many fundamental problems such as connectivity, spanning connected subgraph, and s - t cut verification.
    We then show applications of these results in deriving strong unconditional time lower bounds on the hardness of distributed approximation for many classical optimization problems including minimum spanning tree, shortest paths, and minimum cut. Many of these results are the first non-trivial lower bounds for both exact and approximate distributed computation and they resolve previous open questions. Our result implies that there can be no distributed approximation algorithm for minimum spanning tree that is significantly faster than the current exact algorithm, for any approximation factor.