Oberseminar Lehrstuhl Mayr
-
Space-efficient q-gram indices
Raum: 03.11.018
Zeit: Donnerstag, 22.09.2011, 14.00 Uhr
Vortragender: Johannes Merkle
-
tt-analyze and tt-generate: Tools to Analyze and Generate Sequences with Trained Statistical Properties
Raum: 03.11.018
Zeit: Montag, 29.08.2011, 13.30 Uhr
Vortragender: Johannes Krugel
Algorithms working on sequences are influenced by the statistical properties of the sequences. Algorithms for fragment assembly for example usually produce a worse result if there are many repetitions. Also the space usage and running time of many data structures and algorithms depend on the statistical properties of the underlying text.
We implemented tt-analyze, a tool to analyze sequences for certain statistical properties, among others the entropy, the number and distribution of different substrings, and the repeat structure. Besides, we also designed and implemented tt-generate, a tool to generate synthetic sequences with certain predefined properties, using models such as a Markov process, a discrete autoregressive process, and a repeat model. In bioinformatics these models have primarily been used to analyze given sequences, whereas here, we use them to also generate synthetic ones. The respective parameters of the models can be defined manually or be learned from given training data.
The combination of both tools allows to generate sequences that are similar to real world sequences with respect to certain properties. This will allow to investigate the performance of algorithms under to some extent realistic, yet controlled conditions, and to determine the degree of dependence from parameters of the underlying sequence.
Both tools have an extensible design which allows the integration of new modules for other statistical properties or generating models with the same programming interface.
-
Distributed Shared Memory und Transactional Memory
Raum: 03.11.018
Zeit: Mittwoch, 10.8.2011, 14.00 Uhr
Vortragender: Björn Saballus, Stephan Posselt und Thomas Fuhrmann
-
Improvement Strategies for Iterative Topological Sorting in External Memory
Raum: 03.11.018
Zeit: Mittwoch, 27.7.2011, 14.00 Uhr>
Vortragender: Christoph Flake
-
On the I/O complexity of stencil computations on 2 dimensional grids
Raum: 03.11.018
Zeit: Mittwoch, 6.7.2011, 14.00 Uhr
Vortragender: Philipp Hupp
Over the past years computer architecture has changed and the gap
between floating point performance and memory bandwidth has expanded.
Furthermore the size of datasets increases continuously. Therefore the
bottleneck limiting the performance of many computations seems to be
more and more given by the memory subsystem.
The calculations performed by many numerical tasks, like evaluating an
interpolation function or solving a partial differential equation, can
be modeled by data access patterns, so called stencils. Stencils are
used vastly in numerics and memory efficient algorithms to evaluate them
are known. However, they have not been studied from an theoretical I/O
point of view and the lower bounds are missing.
Here we study a particular stencil, the 5-point stencil which arises
when differential quotients are approximated in two dimensions. Lower
and upper I/O bounds are derived for evaluating the 5-point stencil on a
full grid in a model closely related to the red-blue pebble game of Hong
and Kung.
The research prepares future analysis of stencil computations in higher
dimensions and on sparse grids, a variation of grids that reduce the
number of grid points to tackle high dimensional problems.
-
An Almost Tight Bound for Reordering Buffer Management
Raum: 03.11.018
Zeit: Mittwoch, 25.5.2011, 14.00 Uhr
Vortragender: Harald Räcke
In the reordering buffer problem, we are given an input sequence of
requests for service each of which corresponds to a point in a metric
space. The cost of serving the requests heavily depends on the
processing order. Serving a request induces cost corresponding to the
distance between itself and the previously served request, measured in
the underlying metric space. A reordering buffer with storage capacity
$k$ can be used to reorder the input sequence in a restricted fashion so
as to construct an output sequence with lower service cost. This simple
and universal framework is useful for many applications in computer
science and economics, e.g., disk scheduling, rendering in computer
graphics, or painting shops in car plants.
We present the first non-constant lower bound on the competitive ratio
of determinstic online algorithms for the reordering buffer problem in a
uniform metric. Precisely, we show that any online algorithm for the
problem has a competitive ratio of $\Omega(\sqrt{\log k/\log\logk})$.
We complement this result by presenting a determinstic online algorithm
that obtains a competitive ratio of $O(\sqrt{\log k})$ for the problem.
This improves upon an algorithm by Avigdor-Elgrabli and Rabani [SODA
2010] that obtained a competitive ratio of $O(\log k/\log\logk)$.
-
Tuning Funnelsort for Permuting
Raum: 03.11.018
Zeit: Mittwoch, 18.5.2011, 14.00 Uhr
Vortragender: Yang Guo
Several approaches are known for the task of permuting data in memory. A particular one that is based on funnelsort can be adapted very well to the characteristics of the memory hierarchy and therefore has the potential to perform very well. We compare it to other algorithms with theoretical analysis using an I/O model and with experiments conducted on real hardware.
-
Optimal Hierarchical Decompositions for Congestion Minimization in
Networks
Raum: 03.11.018
Zeit: Mittwoch, 11.5.2011, 13.45 Uhr
Vortragender: Harald Räcke
An oblivious routing protocol makes its routing decisions independent of
the traffic in the underlying network. This means that the path chosen
for a routing request may only depend on its source node, its
destination node, and on some random input. In spite of these serious
limitations it has been shown that there are oblivious routing
algorithms that obtain a polylogarithmic competitive ratios w.r.t. the
congestion in the network (i.e., maximum load of a network link).
We present a new hierarchical decomposition technique that leads to good
oblivious routing algorithms. This decomposition can also be used as a
generic tool for solving a large variety of cut based problems in
undirected graphs. As one example a straightforward dynamic
programming algorithm based on the decomposition gives an O(log
n)-approximation to the Minimum Bisection problem improving on a result
of Feige and Krauthgamer who obtained an approximation ratio of
O(\log^1.5 n).
-
Engineering Architecture Aware Algorithms
- two case studies and some thoughts on Algorithms Engineering
Raum: 03.11.018
Zeit: Mittwoch, 4.5.2011, 13.45 Uhr
Vortragender: Riko Jacob
I will present preliminary results from two ongoing projects, both
regarding memory intensive tasks in (huge) internal memory.
One is asking the question if I/O-efficient (sorting based)
algorithms can outperform the direct algorithm when permuting
several gigabytes of data in internal memory.
The other one reports on an efficient implementation for numerical
computations in high dimensional settings using so called sparse
grids.
Again, the improved implementation is inspired by I/O-efficient algorithms.
Finally, I will try to show how these two examples fit into a more
general theme of using and combining theoretical models and
experiments to find the algorithm and implementation that performs
best on a given hardware.
-
Algorithms and Index Structures for Approximate Pattern Matching
Raum: 03.11.018
Zeit: Mittwoch, 27.04.2011, 13.45 Uhr
Vortragender: Johannes Krugel
In many applications it is necessary to efficiently find a search pattern in a long text, for example while searching DNA databases. Many different index structures can be used to speed up the searches (e.g. q-gram index, suffix tree, suffix array). An extension of this problem is _approximate_ pattern matching where we want to find also those occurrences that contain some errors (spelling mistakes, gene mutations, etc.).
The talk presents three approximate search algorithms we implemented. An experimental comparison shows the strengths and weaknesses of the three algorithms, also compared to an online search.
Furthermore, two advanced index structures will shortly be presented: compressed index structures (with space usage proportional to the size of the compressed text), and DiGeST (a special representation of suffix trees optimized for secondary memory). With our test framework we plan to perform a systematic experimental comparison of the different classes of solutions for approximate pattern matching.
-
Ungleichungen für die Anzahl der Wege in Graphen
Raum: 03.11.018
Zeit: Mittwoch, 20.4.2011, 13.45 Uhr
Vortragender: Hanjo Täubig und Jeremias Weihmann
Wir untersuchen das Wachstum der Anzahl w_k von gerichteten Wegen der Länge
k in ungerichteten Graphen, sowie damit verbundene Ungleichungen.
Wir zeigen, dass es sowohl zusammenhängende bipartite Graphen als auch
unzusammenhängende kreisfreie Graphen gibt, für die die Ungleichung w_0 w_3
>= w_1 w_2 nicht gilt. Weiterhin zeigen wir, dass überraschenderweise die
Ungleichung aber für Bäume gültig ist.
Außerdem demonstrieren wir eine Methode, mit der man für eine gegebene
Gradsequenz entsprechende Bäume mit geringstmöglicher Differenz der beiden
Seiten erzeugen kann (sozusagen worst-case Instanzen bzgl. der Ungleichung).
Wir zeigen weiterhin die Gültigkeit der Ungleichung w_{2a} w_{2(a+b+c)} >=
w_{2a+c} w_{2(a+b)+c} für alle Graphen, was eine Verallgemeinerung der
Ungleichung w_{2a} w_{2b} >= w_{a+b}2 von Dress und Gutman darstellt. Ein
analoges Sandwich-Theorem leiten wir ebenfalls für geschlossene Wege von
einem gegebenen Knoten her.
Weiterhin beweisen wir, dass w_{2l+pk} w_{2l}^{k-1} >= w_{2l+p}^k gilt, was
eine weitere Verallgemeinerung der Dress/Gutman-Ungleichung sowie gleichzeit
eine Verallgemeinerung einer Ungleichung von Erdös und Simonovits darstellt.
Beide neuen Ungleichungen lassen sich direkt in entsprechende Ungleichungen
für Dichten höherer Ordnung übertragen.
Schließlich zeigen wir noch eine Folge unterer Schranken für den größten
Eigenwert der Adjazenzmatrix und beweisen diesbezügliche Monotonieresultate,
auch für eine bereits von Nikiforov publizierte Schranke.
-
Rangieren am Ablaufberg
Raum: 03.11.018
Zeit: Mittwoch, 13.4.2011, 15.15 Uhr
Vortragender: Riko Jacob
Im Güterverkehr der Eisenbahn ist es mitunter zur Auslastung der Fernstrecken notwendig, Züge zu verwenden die aus Waggons mit ähnlichen, aber nicht gleichen, Zielen bestehen. Dies macht Rangiervorgänge notwendig, deren Effizienz (Kosten, erzwungene Verweildauer im Bahnhof) direkt die Attraktivität der erbrachten Transportleistung beeinflussen. Sehr oft wird zum Rangieren ein Ablaufberg genutzt, an dem die Wagen eines Zuges in schneller Abfolge auf mehrere Gleise verteilt und so zu neuen Zügen zusammengestellt werden.
Im Vortrag werde ich zeigen wie derartige Rangiervorgänge durch Binärzahlen kodiert werden können. Dabei führen die Anzahl der verfügbaren Gleise und ihre Länge zu natürliche Einschränkungen der verwendbaren Kodes. Diese Kodierung erlaubt zum einen für bestimmte Situationen optimale Kodes anzugeben, aber auch die NP-Vollständigkeit einer allgemeineren Variante des Optimierungsproblems zu zeigen.
Die vorgestellten Überlegungen sind motiviert aus einer Zusammenarbeit der ETH Zürich mit den Schweizer Bundesbahnen.