The keynote talk will be given by Mikkel Thorup from the Department of Computer Science, University of Copenhagen.
Fast and Powerful Hashing using Tabulation
Randomized algorithms are often enjoyed for their simplicity, but the hash functions employed to yield the desired probabilistic guarantees are often too complicated to be practical. Here we survey recent results on how simple hashing schemes based on tabulation provide unexpectedly strong guarantees.
Simple tabulation hashing dates back to Zobrist 1970. Keys are viewed as consisting of c characters and we have precomputed character tables h1,...,hq mapping characters to random hash values. A key x = (x1, ..., xc) is hashed to h1x1 ⊕ h2x2..... ⊕ hcxc. This schemes is very fast with character tables in cache. While simple tabulation is not even 4-independent, it does provide many of the guarantees that are normally obtained via higher independence, e.g., linear probing and Cuckoo hashing.
Next we consider twisted tabulation where one character is "twisted" with some simple operations. The resulting hash function has powerful distributional properties: Chernoff-Hoeffding type tail bounds and a very small bias for min-wise hashing.
Finally, we consider double tabulation where we compose two simple tabulation functions, applying one to the output of the other, and show that this yields very high independence in the classic framework of Carter and Wegman 1977. In fact, w.h.p., for a given set of size proportional to that of the space consumed, double tabulation gives fully-random hashing.
While these tabulation schemes are all easy to implement and use, their analysis is not.
Mikkel Thorup is a Fellow of the ACM and of AT&T, and a Member of the Royal Danish Academy of Sciences and Letters. He is co-winner of the 2011 MAA Robbins Award and winner of the 2015 Villum Kann Rasmussen Award for Technical and Scientific Research, which is Denmark's biggest individual prize for research. His main work is in algorithms and data structures, and he is the editor of this area for the Journal of the ACM. One of his best-known results is a linear-time algorithm for the single-source shortest paths problem in undirected graphs.