{"id":12012,"date":"2024-11-06T00:13:28","date_gmt":"2024-11-05T23:13:28","guid":{"rendered":"https:\/\/monodes.com\/predaelli\/?p=12012"},"modified":"2024-11-06T00:13:28","modified_gmt":"2024-11-05T23:13:28","slug":"fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo","status":"publish","type":"post","link":"https:\/\/monodes.com\/predaelli\/2024\/11\/06\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/","title":{"rendered":"Fibonacci Hashing: The Optimization that the World Forgot (or: a Better Alternative to Integer Modulo)"},"content":{"rendered":"<p><em><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/\">Fibonacci Hashing: The Optimization that the World Forgot (or: a Better Alternative to Integer Modulo)<\/a><\/em><\/p>\n<p><!--more--><!--nextpage--><\/p>\n<blockquote>\n<div id=\"post-9623\" class=\"post-9623 post type-post status-publish format-standard hentry category-programming tag-fibonacci-hashing tag-hash-table\">\n<div class=\"post-content\">\n<h3 class=\"entry-title\">Fibonacci Hashing: The Optimization that the World Forgot (or: a Better Alternative to Integer\u00a0Modulo)<\/h3>\n<h4 class=\"vcard author\">by <span class=\"fn\">Malte Skarupke<\/span><\/h4>\n<div class=\"entry-content\">\n<p>I recently posted a blog post about a new hash table, and whenever I do something like that, I learn at least one new thing from my comments. In my last comment section Rich Geldreich talks about his hash table which uses \u201cFibonacci Hashing\u201d, which I hadn\u2019t heard of before. 
I have worked a lot on hash tables, so I thought I had at least heard of all the big important tricks and techniques, but I also know that there are so many small tweaks and improvements that you can\u2019t possibly know them all. I thought this might be another neat small trick to add to the collection.<\/p>\n<p>Turns out I was wrong. This is a big one. And everyone should be using it. Hash tables should not be prime number sized and they should not use an integer modulo to map hashes into slots. Fibonacci hashing is just better. Yet somehow nobody is using it and lots of big hash tables (including all the big implementations of std::unordered_map) are much slower than they should be because they don\u2019t use Fibonacci Hashing. So let\u2019s figure this out.<\/p>\n<p>First of all, how do we find out what this Fibonacci Hashing is? Rich Geldreich called it \u201cKnuth\u2019s multiplicative method,\u201d but before looking it up in The Art of Computer Programming, I tried googling for it. The top result right now is <a href=\"http:\/\/book.huihoo.com\/data-structures-and-algorithms-with-object-oriented-design-patterns-in-c++\/html\/page214.html\">this page<\/a> which is old, with a copyright from 1997. Fibonacci Hashing is not mentioned on Wikipedia. You will find a few more pages mentioning it, mostly from universities that present this in their \u201cintroduction to hash tables\u201d material.<\/p>\n<p>From that I thought it\u2019s one of those techniques that they teach in university, but that nobody ends up using because it\u2019s actually more expensive for some reason. There are plenty of those in hash tables: Things that get taught because they\u2019re good in theory, but they\u2019re bad in practice so nobody uses them.<\/p>\n<p>Except somehow, on this one, the wires got crossed. 
Everyone uses the algorithm that\u2019s unnecessarily slow and leads to more problems, and nobody is using the algorithm that\u2019s faster while at the same time being more robust to problematic patterns. Knuth talked about Integer Modulo and about Fibonacci Hashing, and everybody should have taken away from that that they should use Fibonacci Hashing, but they didn\u2019t and everybody uses integer modulo.<\/p>\n<p>Before diving into this, let me just show you the results of a simple benchmark: Looking up items in a hash table:<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-full wp-image-9650\" src=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2024\/11\/unordered_map_comparison_larger_font.png?w=910&#038;ssl=1\" sizes=\"(max-width: 1176px) 100vw, 1176px\" srcset=\"https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/unordered_map_comparison_larger_font.png 1176w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/unordered_map_comparison_larger_font.png?w=150&amp;h=76 150w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/unordered_map_comparison_larger_font.png?w=300&amp;h=151 300w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/unordered_map_comparison_larger_font.png?w=768&amp;h=387 768w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/unordered_map_comparison_larger_font.png?w=1024&amp;h=516 1024w\" alt=\"unordered_map_comparison_larger_font\" data-attachment-id=\"9650\" data-permalink=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/unordered_map_comparison_larger_font\/\" data-orig-file=\"https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/unordered_map_comparison_larger_font.png\" data-orig-size=\"1176,593\" data-comments-opened=\"1\" 
data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"unordered_map_comparison_larger_font\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/unordered_map_comparison_larger_font.png?w=300\" data-large-file=\"https:\/\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2024\/11\/unordered_map_comparison_larger_font.png\" \/><\/p>\n<p>In this benchmark I\u2019m comparing various unordered_map implementations. I\u2019m measuring their lookup speed when the key is just an integer. On the X-axis is the size of the container, the Y-axis is the time to find one item. To measure that, the benchmark is just spinning in a loop calling find() on this container, and at the end I divide the time that the loop took by the number of iterations in the loop. So on the left hand side, when the table is small enough to fit in cache, lookups are fast. On the right hand side the table is too big to fit in cache and lookups become much slower because we\u2019re getting cache misses for most lookups.<\/p>\n<p>But the main thing I want to draw attention to is the speed of ska::unordered_map, which uses Fibonacci hashing. Otherwise it\u2019s a totally normal implementation of unordered_map: It\u2019s just a vector of linked lists, with every element being stored in a separate heap allocation. On the left hand side, where the table fits in cache, ska::unordered_map can be more than twice as fast as the Dinkumware implementation of std::unordered_map, which is the next fastest implementation. 
(this is what you get when you use Visual Studio)<\/p>\n<p>So if you use std::unordered_map and look things up in a loop, that loop could be twice as fast if the hash table used Fibonacci hashing instead of integer modulo.<\/p>\n<h3>How it works<\/h3>\n<p>So let me explain how Fibonacci Hashing works. It\u2019s related to the golden ratio <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cphi%3D1.6180339...&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cphi%3D1.6180339...&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=%5Cphi%3D1.6180339...&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"\\phi=1.6180339...\" \/> which is related to the Fibonacci numbers, hence the name. One property of the Golden Ratio is that you can use it to subdivide any range roughly evenly without ever looping back to the starting position. What do I mean by subdividing? For example if you want to divide a circle into 8 sections, you can just make each step around the circle be an angle of <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=360%5E%5Ccirc%2F8&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=360%5E%5Ccirc%2F8&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=360%5E%5Ccirc%2F8&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"360^\\circ\/8\" \/> degrees. And after eight steps you\u2019ll be back at the start. 
And for any <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=n&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=n&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=n&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"n\" \/> number of steps you want to take, you can just change the angle to be <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=360%5E%5Ccirc%2Fn&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=360%5E%5Ccirc%2Fn&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=360%5E%5Ccirc%2Fn&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"360^\\circ\/n\" \/>. But what if you don\u2019t know ahead of time how many steps you\u2019re going to take? What if the <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=n&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=n&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=n&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"n\" \/> value is determined by something you don\u2019t control? 
Like maybe you have a picture of a flower, and you want to implement \u201cevery time the user clicks the mouse, add a petal to the flower.\u201d In that case you want to use the golden ratio: Make the angle from one petal to the next <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=360%5E%5Ccirc%2F%5Cphi&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=360%5E%5Ccirc%2F%5Cphi&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=360%5E%5Ccirc%2F%5Cphi&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"360^\\circ\/\\phi\" \/> and you can loop around the circle forever, adding petals, and the next petal will always fit neatly into the biggest gap and you\u2019ll never loop back to your starting position. Vi Hart has a good video about the topic:<\/p>\n<div class=\"embed-youtube\"><iframe loading=\"lazy\" title=\"Doodling in Math Class: Spirals, Fibonacci, and Being a Plant [2 of 3]\" src=\"https:\/\/www.youtube.com\/embed\/lOIP_Z_-0Hs?feature=oembed\" width=\"650\" height=\"488\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\" data-mce-fragment=\"1\"><\/iframe><\/div>\n<p>(The video is part two of a three-part series, part one is <a href=\"https:\/\/www.youtube.com\/watch?v=ahXIMUkSXX0\">here<\/a>)<\/p>\n<p>I knew about that trick because it\u2019s useful in procedural content generation: Any time that you want something to look randomly distributed, but you want to be sure that there are no clusters, you should at least try to see if you can use the golden ratio for that. (if that doesn\u2019t work, Halton Sequences are also worth trying before you try random numbers) But somehow it had never occurred to me to use the same trick for hash tables.<\/p>\n<p>So here\u2019s the idea: Let\u2019s say our hash table is 1024 slots large, and we want to map an arbitrarily large hash value into that range. 
The first thing we do is we map it using the above trick into the full 64 bit range of numbers. So we multiply the incoming hash value with <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=2%5E%7B64%7D%2F%5Cphi+%5Capprox+11400714819323198485&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=2%5E%7B64%7D%2F%5Cphi+%5Capprox+11400714819323198485&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=2%5E%7B64%7D%2F%5Cphi+%5Capprox+11400714819323198485&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"2^{64}\/\\phi \\approx 11400714819323198485\" \/>. (the number 11400714819323198486 is closer but we don\u2019t want multiples of two because that would throw away one bit) Multiplying with that number will overflow, but just as we wrapped around the circle in the flower example above, this will wrap around the whole 64 bit range in a nice pattern, giving us an even distribution across the whole range from <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=0&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=0&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=0&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"0\" \/> to <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=2%5E%7B64%7D&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=2%5E%7B64%7D&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=2%5E%7B64%7D&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"2^{64}\" \/>. To illustrate, let\u2019s just look at the upper three bits. 
So we\u2019ll do this:<\/p>\n<div>\n<div id=\"highlighter_254386\" class=\"syntaxhighlighter  cpp\">\n<table border=\"0\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td class=\"code\">\n<div class=\"container\">\n<div class=\"line number1 index0 alt2\"><code class=\"\" data-line=\"\">size_t<\/code> <code class=\"\" data-line=\"\">fibonacci_hash_3_bits(<\/code><code class=\"\" data-line=\"\">size_t<\/code> <code class=\"\" data-line=\"\">hash)<\/code><\/div>\n<div class=\"line number2 index1 alt1\"><code class=\"\" data-line=\"\">{<\/code><\/div>\n<div class=\"line number3 index2 alt2\"><code class=\"\" data-line=\"\">return<\/code> <code class=\"\" data-line=\"\">(hash * 11400714819323198485llu) &gt;&gt; 61;<\/code><\/div>\n<div class=\"line number4 index3 alt1\"><code class=\"\" data-line=\"\">}<\/code><\/div>\n<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<\/div>\n<p>This will return the upper three bits after doing the multiplication with the magic constant. And we\u2019re looking at just three bits because it\u2019s easy to see how the golden ratio wraparound behaves when we just look at the top three bits. 
If we pass in some small numbers for the hash value, we get the following results from this:<\/p>\n<p>fibonacci_hash_3_bits(0) == 0<br \/>\nfibonacci_hash_3_bits(1) == 4<br \/>\nfibonacci_hash_3_bits(2) == 1<br \/>\nfibonacci_hash_3_bits(3) == 6<br \/>\nfibonacci_hash_3_bits(4) == 3<br \/>\nfibonacci_hash_3_bits(5) == 0<br \/>\nfibonacci_hash_3_bits(6) == 5<br \/>\nfibonacci_hash_3_bits(7) == 2<br \/>\nfibonacci_hash_3_bits(8) == 7<br \/>\nfibonacci_hash_3_bits(9) == 4<br \/>\nfibonacci_hash_3_bits(10) == 1<br \/>\nfibonacci_hash_3_bits(11) == 6<br \/>\nfibonacci_hash_3_bits(12) == 3<br \/>\nfibonacci_hash_3_bits(13) == 0<br \/>\nfibonacci_hash_3_bits(14) == 5<br \/>\nfibonacci_hash_3_bits(15) == 2<br \/>\nfibonacci_hash_3_bits(16) == 7<\/p>\n<p>This gives a pretty even distribution: The number 0 comes up three times, all other numbers come up twice. And every number is far removed from the previous and the next number. If we increase the input by one, the output jumps around quite a bit. So this is starting to look like a good hash function. And also a good way to map a number from a larger range into the range from 0 to 7.<\/p>\n<p>In fact we already have the whole algorithm right here. All we have to do to get an arbitrary power of two range is to change the shift amount. So if my hash table is size 1024, then instead of just looking at the top 3 bits I want to look at the top 10 bits. So I shift by 54 instead of 61. Easy enough.<\/p>\n<p>Now if you actually run a full hash function analysis on this, you find that it doesn\u2019t make for a great hash function. It\u2019s not terrible, but you will quickly find patterns. But if we make a hash table with an STL-style interface, we don\u2019t control the hash function anyway. The hash function is being provided by the user. 
So we will just use Fibonacci hashing to map the result of the hash function into the range that we want.<\/p>\n<h3>The problems with integer modulo<\/h3>\n<p>So why is integer modulo bad anyways? Two reasons: 1. It\u2019s slow. 2. It can be real stupid about patterns in the input data. So first of all, how slow is integer modulo? If you\u2019re just doing the straightforward implementation like this:<\/p>\n<div>\n<div id=\"highlighter_185216\" class=\"syntaxhighlighter  cpp\">\n<table border=\"0\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td class=\"code\">\n<div class=\"container\">\n<div class=\"line number1 index0 alt2\"><code class=\"\" data-line=\"\">size_t<\/code> <code class=\"\" data-line=\"\">hash_to_slot(<\/code><code class=\"\" data-line=\"\">size_t<\/code> <code class=\"\" data-line=\"\">hash, <\/code><code class=\"\" data-line=\"\">size_t<\/code> <code class=\"\" data-line=\"\">num_slots)<\/code><\/div>\n<div class=\"line number2 index1 alt1\"><code class=\"\" data-line=\"\">{<\/code><\/div>\n<div class=\"line number3 index2 alt2\"><code class=\"\" data-line=\"\">return<\/code> <code class=\"\" data-line=\"\">hash % num_slots;<\/code><\/div>\n<div class=\"line number4 index3 alt1\"><code class=\"\" data-line=\"\">}<\/code><\/div>\n<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<\/div>\n<p>Then this is real slow. It takes roughly 9 nanoseconds on my machine. Which, if the hash table is in cache, is about five times longer than the rest of the lookup takes. If you get cache misses then those dominate, but it\u2019s not good that this integer modulo is making our lookups several times slower when the table is in cache. 
Still the GCC, LLVM and Boost implementations of unordered_map use this code to map the hash value to a slot in the table. And they are really slow because of this. The Dinkumware implementation is a little bit smarter: It takes advantage of the fact that when the table is sized to be a power of two, you can do an integer modulo by using a binary and:<\/p>\n<div>\n<div id=\"highlighter_755282\" class=\"syntaxhighlighter  cpp\">\n<table border=\"0\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td class=\"code\">\n<div class=\"container\">\n<div class=\"line number1 index0 alt2\"><code class=\"\" data-line=\"\">size_t<\/code> <code class=\"\" data-line=\"\">hash_to_slot(<\/code><code class=\"\" data-line=\"\">size_t<\/code> <code class=\"\" data-line=\"\">hash, <\/code><code class=\"\" data-line=\"\">size_t<\/code> <code class=\"\" data-line=\"\">num_slots_minus_one)<\/code><\/div>\n<div class=\"line number2 index1 alt1\"><code class=\"\" data-line=\"\">{<\/code><\/div>\n<div class=\"line number3 index2 alt2\"><code class=\"\" data-line=\"\">return<\/code> <code class=\"\" data-line=\"\">hash &amp; num_slots_minus_one;<\/code><\/div>\n<div class=\"line number4 index3 alt1\"><code class=\"\" data-line=\"\">}<\/code><\/div>\n<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<\/div>\n<p>Which takes roughly 0 nanoseconds on my machine. Since num_slots is a power of two, this just chops off all the upper bits and only keeps the lower bits. So in order to use this you have to be certain that all the important information is in the lower bits. Dinkumware ensures this by using a more complicated hash function than the other implementations use: For integers it uses an FNV1 hash. 
It\u2019s much faster than an integer modulo, but it still makes your hash table twice as slow as it could be since FNV1 is expensive. And there is a bigger problem: If you provide your own hash function because you want to insert a custom type into the hash table, you have to know about this implementation detail.<\/p>\n<p>We have been bitten by that implementation detail several times at work. For example we had a custom ID type that\u2019s just a wrapper around a 64 bit integer which is composed from several sources of information. And it just so happens that that ID type has really important information in the upper bits. It took surprisingly long until someone noticed that we had a slow hash table in our codebase that could literally be made a hundred times faster just by changing the order of the bits in the hash function, because the integer modulo was chopping off the upper bits.<\/p>\n<p>Other tables, like google::dense_hash_map, also use a power of two hash size to get the fast integer modulo, but it doesn\u2019t provide its own implementation of std::hash&lt;int&gt; (because it can\u2019t) so you have to be real careful about your upper bits when using dense_hash_map.<\/p>\n<p>Talking about google::dense_hash_map, integer modulo brings even more problems with it for open addressing tables. Because if you store all your data in one array, patterns in the input data suddenly start to matter more. For example google::dense_hash_map gets really, really slow if you ever insert a lot of sequential numbers. Because all those sequential numbers get assigned slots right next to each other, and if you\u2019re then trying to look up a key that\u2019s not in the table, you have to probe through a lot of densely occupied slots before you find your first empty slot. 
You will never notice this if you only look up keys that are actually in the map, but unsuccessful lookups can be dozens of times slower than they should be.<\/p>\n<p>Despite these flaws, all of the fastest hash table implementations use the \u201cbinary and\u201d approach to assign a hash value to a slot. And then you usually try to compensate for the problems by using a more complicated hash function, like FNV1 in the Dinkumware implementation.<\/p>\n<h3>Why Fibonacci Hashing is the Solution<\/h3>\n<p>Fibonacci hashing solves both of these problems. 1. It\u2019s really fast. It\u2019s an integer multiplication followed by a shift. It takes roughly 1.5 nanoseconds on my machine, which is fast enough that it\u2019s getting real hard to measure. 2. It mixes up input patterns. It\u2019s like you\u2019re getting a second hashing step for free after the first hash function finishes. If the first hash function is actually just the identity function (as it should be for integers) then this gives you at least a little bit of mixing that you wouldn\u2019t otherwise get.<\/p>\n<p>But really it\u2019s better because it\u2019s faster. When I worked on hash tables I was always frustrated by how much time we were spending on the simple problem of \u201cmap a large number to a small number.\u201d It\u2019s literally the slowest operation in the hash table. (outside of cache misses of course, but let\u2019s pretend you\u2019re doing several lookups in a row and the table is cached) And the only alternative was the \u201cpower of two binary and\u201d version which discards bits from the hash function and can lead to all kinds of problems. So your options are either slow and safe, or fast and losing bits and getting potentially many hash collisions if you\u2019re ever not careful. And everybody had this problem. 
I googled a lot for this problem thinking \u201csurely somebody must have a good method for bringing a large number into a small range\u201d but everybody was doing either slow or bad things. For example <a href=\"https:\/\/github.com\/lemire\/fastrange\">here<\/a> is an approach (called \u201cfastrange\u201d) that almost re-invents Fibonacci hashing, but it exaggerates patterns where Fibonacci hashing breaks up patterns. It\u2019s the same speed as Fibonacci hashing, but when I\u2019ve tried to use it, it never worked for me because I would suddenly find patterns in my hash function that I wasn\u2019t even aware of. (and with fastrange your subtle patterns suddenly get exaggerated to be huge problems) Despite its problems it is being used in TensorFlow, because everybody is desperate for a faster solution to the problem of mapping a large number into a small range.<\/p>\n<h3>If Fibonacci Hashing is so great, why is nobody using it?<\/h3>\n<p>That\u2019s a tricky question because there is so little information about Fibonacci hashing on the Internet, but I think it has to do with a historical misunderstanding. In The Art of Computer Programming, Knuth introduces three hash functions to use for hash tables:<\/p>\n<ol>\n<li>Integer Modulo<\/li>\n<li>Fibonacci Hashing<\/li>\n<li>Something related to CRC hashes<\/li>\n<\/ol>\n<p>The inclusion of the integer modulo in this list is a bit weird from today\u2019s perspective because it\u2019s not much of a hash function. It just maps from a larger range into a smaller range, and doesn\u2019t otherwise do anything. Fibonacci hashing is actually a hash function, not the greatest hash function, but it\u2019s a good introduction. And the third one is too complicated for me to understand. It\u2019s something about coming up with good coefficients for a CRC hash that has certain properties about avoiding collisions in hash tables. 
Probably very clever, but somebody else has to figure that one out.<\/p>\n<p>So what\u2019s happening here is that Knuth uses the term \u201chash function\u201d differently than we use it today.\u00a0 Today the steps in a hash table are something like this:<\/p>\n<ol>\n<li>Hash the key<\/li>\n<li>Map the hash value to a slot<\/li>\n<li>Compare the item in the slot<\/li>\n<li>If it\u2019s not the right item, repeat step 3 with a different item until the right one is found or some end condition is met<\/li>\n<\/ol>\n<p>We use the term \u201chash function\u201d to refer to step 1. But Knuth uses the term \u201chash function\u201d to refer to something that does both step 1 and step 2. So when he refers to a hash function, he means something that both hashes the incoming key, and assigns it to a slot in the table. So if the table is only 1024 items large, the hash function can only return a value from 0 to 1023. This explains why \u201cinteger modulo\u201d is a hash function for Knuth: It doesn\u2019t do anything in step 1, but it does work well for step 2. So if those two steps were just one step, then integer modulo does a good job at that one step since it does a good job at our step 2. But when we take it apart like that, we\u2019ll see that Fibonacci Hashing is an improvement compared to integer modulo in both steps. And since we\u2019re only using it for step 2, it allows us to use a faster implementation for step 1 because the hash function gets some help from the additional mixing that Fibonacci hashing does.<\/p>\n<p>But this difference in terms, where Knuth uses \u201chash function\u201d to mean something different than \u201chash function\u201d means for std::unordered_map, explains to me why nobody is using Fibonacci hashing. 
When judged as a \u201chash function\u201d in today\u2019s terms, it\u2019s not that great.<\/p>\n<p>After I found that Fibonacci hashing is not mentioned anywhere, I did more googling and was more successful searching for \u201cmultiplicative hashing.\u201d Fibonacci hashing is just a simple multiplicative hash with a well-chosen magic number. But the language that I found describing multiplicative hashing explains why nobody is using this. For example Wikipedia has <a href=\"https:\/\/en.wikipedia.org\/wiki\/Hash_function#Multiplicative_hashing\">this<\/a> to say about multiplicative hashing:<\/p>\n<blockquote><p>Multiplicative hashing is a simple type of hash function often used by teachers introducing students to hash tables. Multiplicative hash functions are simple and fast, but have higher collision rates in hash tables than more sophisticated hash functions.<\/p><\/blockquote>\n<p>So just from that, I certainly don\u2019t feel encouraged to check out what this \u201cmultiplicative hashing\u201d is. Or to get a feeling for how teachers introduce this, <a href=\"https:\/\/www.youtube.com\/watch?v=0M_kIqhwbFo\">here<\/a> is MIT instructor Erik Demaine (whose videos I very much recommend) introducing hash functions, and he says this:<\/p>\n<blockquote><p>I\u2019m going to give you three hash functions. Two of which are, let\u2019s say common practice, and the third of which is actually theoretically good. So the first two are not good theoretically, you can prove that they\u2019re bad, but at least they give you some flavor.<\/p><\/blockquote>\n<p>Then he talks about integer modulo, multiplicative hashing, and a combination of the two. 
He doesn\u2019t mention the Fibonacci hashing version of multiplicative hashing, and the introduction probably wouldn\u2019t inspire people to go seek out more information about it.<\/p>\n<p>So I think people just learn that multiplicative hashing is not a good hash function, and they never learn that multiplicative hashing is a great way to map large values into a small range.<\/p>\n<p>Of course it could also be that I missed some unknown big downside to Fibonacci hashing and that there is a real good reason why nobody is using this, but I didn\u2019t find anything like that. But it could be that I didn\u2019t find anything bad about Fibonacci hashing simply because it\u2019s hard to find anything at all about Fibonacci hashing, so let\u2019s do our own analysis:<\/p>\n<h3>Analyzing Fibonacci Hashing<\/h3>\n<p>So I have to confess that I don\u2019t know much about analyzing hash functions. It seems like the best test is to see how close a hash function gets to the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Avalanche_effect\">strict avalanche criterion<\/a> which \u201cis satisfied if, whenever a single input bit is changed, each of the output bits changes with a 50% probability.\u201d<\/p>\n<p>To measure that I wrote a small program that takes a hash <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=H&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=H&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=H&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"H\" \/>, and runs it through Fibonacci hashing to get a slot in the hash table <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=S&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=S&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, 
https:\/\/s0.wp.com\/latex.php?latex=S&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"S\" \/>. Then I change a single bit in <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=H&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=H&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=H&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"H\" \/>, giving me <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=H%27&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=H%27&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=H%27&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"H'\" \/>, and after I run that through Fibonacci hashing I get a slot <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=S%27&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=S%27&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=S%27&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"S'\" \/>. 
Then I measure, depending on which bit I changed in <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=H%27&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=H%27&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=H%27&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"H'\" \/>, which bits are likely to change in <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=S%27&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=S%27&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=S%27&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"S'\" \/> compared to <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=S&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=S&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=S&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"S\" \/> and which bits are unlikely to change.<\/p>\n<p>I then repeat that same test each time I double the size of the hash table, because a bigger hash table has more bits in the output: If the hash table only has four slots, there are only two bits in the output. If the hash table has 1024 slots, there are ten bits in the output. 
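<\/p>\n<p>A minimal sketch of that measurement, for readers who want to reproduce it. This is not the program used for the pictures: the names fibonacci_index and flip_probability, the trial count and the fixed seed are assumptions made for illustration. It estimates how often one output bit flips when one input bit is flipped.<\/p>\n

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Fibonacci hashing into a table with 2^bits slots (hypothetical helper name).
inline uint64_t fibonacci_index(uint64_t hash, int bits) {
    assert(bits >= 1 && bits <= 64);
    return (hash * 11400714819323198485ull) >> (64 - bits);
}

// Estimate the probability that output bit out_bit flips when input bit
// in_bit is flipped, for a table with 2^bits slots.
double flip_probability(int in_bit, int out_bit, int bits, int trials = 10000) {
    std::mt19937_64 rng(12345);  // fixed seed (an assumption) so runs repeat
    int flips = 0;
    for (int i = 0; i < trials; ++i) {
        const uint64_t h = rng();
        const uint64_t s = fibonacci_index(h, bits);
        const uint64_t s2 = fibonacci_index(h ^ (uint64_t{1} << in_bit), bits);
        flips += static_cast<int>(((s ^ s2) >> out_bit) & 1);
    }
    return static_cast<double>(flips) / trials;
}
```

<p>Under these assumptions, flipping input bit 63 for a ten-bit table always flips output bit 9 and never any other output bit, because the multiplier is odd and adding 2^63 modulo 2^64 only toggles the top bit of the product.<\/p>\n<p>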
Finally I color code the result and plot the whole thing as a picture that looks like this:<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-full wp-image-9634\" src=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2024\/11\/avalanche_fibonacci1.png?w=910&#038;ssl=1\" sizes=\"(max-width: 1100px) 100vw, 1100px\" srcset=\"https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_fibonacci1.png 1100w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_fibonacci1.png?w=150&amp;h=35 150w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_fibonacci1.png?w=300&amp;h=70 300w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_fibonacci1.png?w=768&amp;h=179 768w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_fibonacci1.png?w=1024&amp;h=238 1024w\" alt=\"Avalanche_fibonacci.png\" data-attachment-id=\"9634\" data-permalink=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/avalanche_fibonacci\/\" data-orig-file=\"https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_fibonacci1.png\" data-orig-size=\"1100,256\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Avalanche_fibonacci\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_fibonacci1.png?w=300\" 
data-large-file=\"https:\/\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2024\/11\/avalanche_fibonacci1.png\" \/><\/p>\n<p>Let me explain this picture. Each row of pixels represents one of the 64 bits of the input hash. The bottom-most row is the first bit, the topmost row is the 64th bit. Each column represents one bit in the output. The first two columns are the output bits for a table of size 4, the next three columns are the output bits for a table of size 8, etc., until the last 23 columns are for a table of size eight million. The color coding means this:<\/p>\n<ul>\n<li>A black pixel indicates that when the input bit for that row changes, the output bit for that column has a 50% chance of changing. (this is ideal)<\/li>\n<li>A blue pixel means that when the input bit changes, the output bit has a 100% chance of changing.<\/li>\n<li>A red pixel means that when the input bit changes, the output bit has a 0% chance of changing.<\/li>\n<\/ul>\n<p>For a really good hash function the entire picture would be black. So Fibonacci hashing is not a really good hash function.<\/p>\n<p>The worst pattern we can see is at the top of the picture: The last bit of the input hash (the top row in the picture) can only ever affect the last bit of the output slot in the table. (the last column of each section) So if the table has 1024 slots, the last bit of the input hash can only determine the bit in the output slot for the number 512. It will never change any other bit in the output. Lower bits in the input can affect more bits in the output, so there is more mixing going on for those.<\/p>\n<p>Is it bad that the last bit in the input can only affect one bit in the output? It would be bad if we used this as a hash function, but it\u2019s not necessarily bad if we just use this to map from a large range into a small range. 
Since each row has at least one blue or black pixel in it, we can be certain that we don\u2019t lose information, since every bit from the input will be used. What would be bad for mapping from a large range into a small range is if we had a row or a column that has only red pixels in it.<\/p>\n<p>Let\u2019s also look at what this would look like for integer modulo, starting with integer modulo using prime numbers:<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-full wp-image-9635\" src=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2024\/11\/avalanche_prime1.png?w=910&#038;ssl=1\" sizes=\"(max-width: 1188px) 100vw, 1188px\" srcset=\"https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_prime1.png 1188w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_prime1.png?w=150&amp;h=32 150w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_prime1.png?w=300&amp;h=65 300w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_prime1.png?w=768&amp;h=165 768w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_prime1.png?w=1024&amp;h=221 1024w\" alt=\"Avalanche_prime.png\" data-attachment-id=\"9635\" data-permalink=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/avalanche_prime\/\" data-orig-file=\"https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_prime1.png\" data-orig-size=\"1188,256\" data-comments-opened=\"1\" 
data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Avalanche_prime\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_prime1.png?w=300\" data-large-file=\"https:\/\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2024\/11\/avalanche_prime1.png\" \/><\/p>\n<p>This one has more randomness at the top, but a clearer pattern at the bottom. All that red means that the first few bits in the input hash can only determine the first few bits in the output hash. Which makes sense for integer modulo. A small number modulo a large number will never result in a large number, so a change to a small number can never affect the later bits.<\/p>\n<p>This picture is still \u201cgood\u201d for mapping from a large range into a small range because we have that diagonal line of bright blue pixels in each block. 
To show a bad function, here is integer modulo with a power of two size:<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-full wp-image-9636\" src=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2024\/11\/avalanche_power_of_two1.png?w=910&#038;ssl=1\" sizes=\"(max-width: 1100px) 100vw, 1100px\" srcset=\"https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_power_of_two1.png 1100w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_power_of_two1.png?w=150&amp;h=35 150w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_power_of_two1.png?w=300&amp;h=70 300w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_power_of_two1.png?w=768&amp;h=179 768w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_power_of_two1.png?w=1024&amp;h=238 1024w\" alt=\"Avalanche_power_of_two.png\" data-attachment-id=\"9636\" data-permalink=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/avalanche_power_of_two\/\" data-orig-file=\"https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_power_of_two1.png\" data-orig-size=\"1100,256\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Avalanche_power_of_two\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_power_of_two1.png?w=300\" 
data-large-file=\"https:\/\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2024\/11\/avalanche_power_of_two1.png\" \/><\/p>\n<p>This one is obviously bad: The upper bits of the hash value have completely red rows, because they will simply get chopped off. Only the lower bits of the input have any effect, and they can only affect their own bits in the output. This picture right here shows why using a power of two size requires that you are careful with your choice of hash function for the hash table: If those red rows represent important bits, you will simply lose them.<\/p>\n<p>Finally let\u2019s also look at the \u201cfastrange\u201d algorithm that I briefly mentioned above. For power of two sizes it looks really bad, so let me show you what it looks like for prime sizes:<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-full wp-image-9637\" src=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2024\/11\/avalanche_fastrange_prime.png?w=910&#038;ssl=1\" sizes=\"(max-width: 1188px) 100vw, 1188px\" srcset=\"https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_fastrange_prime.png 1188w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_fastrange_prime.png?w=150&amp;h=32 150w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_fastrange_prime.png?w=300&amp;h=65 300w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_fastrange_prime.png?w=768&amp;h=165 768w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_fastrange_prime.png?w=1024&amp;h=221 1024w\" alt=\"Avalanche_fastrange_prime.png\" data-attachment-id=\"9637\" data-permalink=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/avalanche_fastrange_prime\/\" data-orig-file=\"https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_fastrange_prime.png\" 
data-orig-size=\"1188,256\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Avalanche_fastrange_prime\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_fastrange_prime.png?w=300\" data-large-file=\"https:\/\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2024\/11\/avalanche_fastrange_prime.png\" \/><\/p>\n<p>What we see here is that fastrange throws away the lower bits of the input range. It only uses the upper bits. I had used it before and I had noticed that a change in the lower bits doesn\u2019t seem to make much of a difference, but I had never realized that it just completely throws the lower bits away. This picture totally explains why I had so many problems with fastrange. Fastrange is a bad function to map from a large range into a small range because it\u2019s throwing away the lower bits.<\/p>\n<p>Going back to Fibonacci hashing, there is actually one simple change you can make to improve the bad pattern for the top bits: Shift the top bits down and xor them once. 
So the code changes to this:<\/p>\n<div>\n<div id=\"highlighter_553146\" class=\"syntaxhighlighter  cpp\">\n<table border=\"0\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td class=\"gutter\">\n<div class=\"line number1 index0 alt2\">1<\/div>\n<div class=\"line number2 index1 alt1\">2<\/div>\n<div class=\"line number3 index2 alt2\">3<\/div>\n<div class=\"line number4 index3 alt1\">4<\/div>\n<div class=\"line number5 index4 alt2\">5<\/div>\n<\/td>\n<td class=\"code\">\n<div class=\"container\">\n<div class=\"line number1 index0 alt2\"><code class=\"\" data-line=\"\">size_t<\/code> <code class=\"\" data-line=\"\">index_for_hash(<\/code><code class=\"\" data-line=\"\">size_t<\/code> <code class=\"\" data-line=\"\">hash)<\/code><\/div>\n<div class=\"line number2 index1 alt1\"><code class=\"\" data-line=\"\">{<\/code><\/div>\n<div class=\"line number3 index2 alt2\"><code class=\"\" data-line=\"\">hash ^= hash &gt;&gt; shift_amount;<\/code><\/div>\n<div class=\"line number4 index3 alt1\"><code class=\"\" data-line=\"\">return<\/code> <code class=\"\" data-line=\"\">(11400714819323198485llu * hash) &gt;&gt; shift_amount;<\/code><\/div>\n<div class=\"line number5 index4 alt2\"><code class=\"\" data-line=\"\">}<\/code><\/div>\n<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<\/div>\n<p>It\u2019s almost looking more like a proper hash function, isn\u2019t it? 
This makes the function two cycles slower, but it gives us the following picture:<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" data-attachment-id=\"9640\" data-permalink=\"https:\/\/monodes.com\/predaelli\/?attachment_id=9640\" data-orig-file=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2022\/09\/dragonstore-classifica-2022-08.webp?fit=1080%2C1080&amp;ssl=1\" data-orig-size=\"1080,1080\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"dragonstore-classifica-2022-08\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2022\/09\/dragonstore-classifica-2022-08.webp?fit=300%2C300&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2022\/09\/dragonstore-classifica-2022-08.webp?fit=510%2C510&amp;ssl=1\" class=\"alignnone size-full wp-image-9640\" src=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2024\/11\/avalanche_fibxor.png?w=910&#038;ssl=1\" sizes=\"(max-width: 1100px) 100vw, 1100px\" srcset=\"https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_fibxor.png 1100w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_fibxor.png?w=150&amp;h=35 150w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_fibxor.png?w=300&amp;h=70 300w, https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_fibxor.png?w=768&amp;h=179 768w, 
https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_fibxor.png?w=1024&amp;h=238 1024w\" alt=\"Avalanche_fibxor\" data-attachment-id=\"9640\" data-permalink=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/avalanche_fibxor\/\" data-orig-file=\"https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_fibxor.png\" data-orig-size=\"1100,256\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Avalanche_fibxor\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/probablydance.com\/wp-content\/uploads\/2018\/06\/avalanche_fibxor.png?w=300\" data-large-file=\"https:\/\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2024\/11\/avalanche_fibxor.png\" \/><\/p>\n<p>This looks a bit nicer, with the problematic pattern at the top gone. (and we\u2019re seeing more black pixels now which is the ideal for a hash function) I\u2019m not using this though because we don\u2019t really need a good hash function, we need a good function to map from a large range into a small range. And this is on the critical path for the hash table, before we can even do the first comparison. Any cycle added here makes the whole line in the graph above move up.<\/p>\n<p>So I keep on saying that we need a good function to map from a large range into a small range, but I haven\u2019t defined what \u201cgood\u201d means there. 
I don\u2019t know of a proper test like the avalanche analysis for hash functions, but my first attempt at a definition for \u201cgood\u201d would be that every value in the smaller range is equally likely to occur. That test is very easy to fulfill though: all of the methods (including fastrange) fulfill that criterion. So how about we pick a sequence of values in the input range and check if every value in the output is equally likely. I had given the examples for numbers 0 to 16 above. We could also do multiples of 8 or all powers of two or all prime numbers or the Fibonacci numbers. Or let\u2019s just try as many sequences as possible until we figure out the behavior of the function.<\/p>\n<p>Looking at the above list we see that there might be a problematic pattern with multiples of 4: fibonacci_hash_3_bits(4) returned 3, fibonacci_hash_3_bits(8) returned 7, fibonacci_hash_3_bits(12) returned 3 again and fibonacci_hash_3_bits(16) returned 7 again. Let\u2019s see how this develops if we print the first sixteen multiples of four:<\/p>\n<p>Here are the results:<\/p>\n<p>0 -&gt; 0<br \/>\n4 -&gt; 3<br \/>\n8 -&gt; 7<br \/>\n12 -&gt; 3<br \/>\n16 -&gt; 7<br \/>\n20 -&gt; 2<br \/>\n24 -&gt; 6<br \/>\n28 -&gt; 2<br \/>\n32 -&gt; 6<br \/>\n36 -&gt; 1<br \/>\n40 -&gt; 5<br \/>\n44 -&gt; 1<br \/>\n48 -&gt; 5<br \/>\n52 -&gt; 1<br \/>\n56 -&gt; 4<br \/>\n60 -&gt; 0<br \/>\n64 -&gt; 4<\/p>\n<p>Doesn\u2019t look too bad actually: Every number shows up twice, except the number 1 shows up three times. What about multiples of eight?<\/p>\n<p>0 -&gt; 0<br \/>\n8 -&gt; 7<br \/>\n16 -&gt; 7<br \/>\n24 -&gt; 6<br \/>\n32 -&gt; 6<br \/>\n40 -&gt; 5<br \/>\n48 -&gt; 5<br \/>\n56 -&gt; 4<br \/>\n64 -&gt; 4<br \/>\n72 -&gt; 3<br \/>\n80 -&gt; 3<br \/>\n88 -&gt; 3<br \/>\n96 -&gt; 2<br \/>\n104 -&gt; 2<br \/>\n112 -&gt; 1<br \/>\n120 -&gt; 1<br \/>\n128 -&gt; 0<\/p>\n<p>Once again doesn\u2019t look too bad, but we are definitely getting more repeated numbers. 
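<\/p>\n<p>Tables like these can be reproduced with a short helper. The sketch below assumes that fibonacci_hash_3_bits is the multiply by the magic constant followed by a right shift of 61, which matches the values listed here; slot_counts is a hypothetical name introduced only for this illustration.<\/p>\n

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumed definition: multiply by the magic constant, keep the top 3 bits.
inline uint64_t fibonacci_hash_3_bits(uint64_t hash) {
    return (hash * 11400714819323198485ull) >> 61;
}

// Count how often each of the eight slots is hit by the sequence
// 0, stride, 2 * stride, ... (count values in total).
std::array<int, 8> slot_counts(uint64_t stride, int count) {
    assert(count >= 0);
    std::array<int, 8> counts{};  // zero-initialized
    for (int i = 0; i < count; ++i) {
        ++counts[fibonacci_hash_3_bits(stride * static_cast<uint64_t>(i))];
    }
    return counts;
}
```

<p>Calling slot_counts(4, 17) covers the same seventeen inputs as the first table, and a lopsided result for some stride is a hint that the stride deserves a closer look.<\/p>\n<p>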
So how about multiples of sixteen?<\/p>\n<p>0 -&gt; 0<br \/>\n16 -&gt; 7<br \/>\n32 -&gt; 6<br \/>\n48 -&gt; 5<br \/>\n64 -&gt; 4<br \/>\n80 -&gt; 3<br \/>\n96 -&gt; 2<br \/>\n112 -&gt; 1<br \/>\n128 -&gt; 0<br \/>\n144 -&gt; 7<br \/>\n160 -&gt; 7<br \/>\n176 -&gt; 6<br \/>\n192 -&gt; 5<br \/>\n208 -&gt; 4<br \/>\n224 -&gt; 3<br \/>\n240 -&gt; 2<br \/>\n256 -&gt; 1<\/p>\n<p>This looks a bit better again, and if we were to look at multiples of 32 it would look better still. The reason why the number 8 was starting to look problematic was not because it\u2019s a power of two. It was starting to look problematic because it is a Fibonacci number. If we look at later Fibonacci numbers, we see more problematic patterns. For example here are multiples of 34:<\/p>\n<p>0 -&gt; 0<br \/>\n34 -&gt; 0<br \/>\n68 -&gt; 0<br \/>\n102 -&gt; 0<br \/>\n136 -&gt; 0<br \/>\n170 -&gt; 0<br \/>\n204 -&gt; 0<br \/>\n238 -&gt; 0<br \/>\n272 -&gt; 0<br \/>\n306 -&gt; 0<br \/>\n340 -&gt; 1<br \/>\n374 -&gt; 1<br \/>\n408 -&gt; 1<br \/>\n442 -&gt; 1<br \/>\n476 -&gt; 1<br \/>\n510 -&gt; 1<br \/>\n544 -&gt; 1<\/p>\n<p>That\u2019s looking bad. And later Fibonacci numbers will only look worse. But then again how often are you going to insert multiples of 34 into a hash table? In fact if you had to pick a group of numbers that\u2019s going to give you problems, the Fibonacci numbers are not the worst choice because they don\u2019t come up that often naturally. As a reminder, here are the first couple Fibonacci numbers: 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584\u2026 The first couple numbers don\u2019t give us bad patterns in the output, but anything bigger than 13 does. And most of those are pretty harmless: I can\u2019t think of any case that would give out multiples of those numbers. 
144 bothers me a little bit because it\u2019s a multiple of 8 and you might have a struct of that size, but even then your pointers will just be eight byte aligned, so you\u2019d have to get unlucky for all your pointers to be multiples of 144.<\/p>\n<p>But really what you do here is that you identify the bad pattern and you tell your users \u201cif you ever hit this bad pattern, provide a custom hash function to the hash table that fixes it.\u201d I mean people are happy to use integer modulo with powers of two, and for that it\u2019s ridiculously easy to find bad patterns: Normal pointers are a bad pattern for that. Since it\u2019s harder to come up with use cases that spit out lots of multiples of Fibonacci numbers, I\u2019m fine with having \u201cmultiples of Fibonacci numbers\u201d as bad patterns.<\/p>\n<p>So why are Fibonacci numbers a bad pattern for Fibonacci hashing anyways? It\u2019s not obvious if we just have the magic number multiplication and the bit shift. First of all we have to remember that the magic constant came from dividing by the golden ratio: <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=2%5E%7B64%7D%2F%5Cphi+%5Capprox+11400714819323198485&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=2%5E%7B64%7D%2F%5Cphi+%5Capprox+11400714819323198485&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=2%5E%7B64%7D%2F%5Cphi+%5Capprox+11400714819323198485&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"2^{64}\/\\phi \\approx 11400714819323198485\" \/>. 
And then since we are truncating the result of the multiplication before we shift it, there is actually a hidden modulo by <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=2%5E%7B64%7D&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=2%5E%7B64%7D&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=2%5E%7B64%7D&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"2^{64}\" \/> in there. So whenever we are hashing a number <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=x&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=x&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=x&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"x\" \/> the slot is actually determined by this:<\/p>\n<p><img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=slot%5C_before%5C_shift%28x%29+%3D+%28x+%2A+2%5E%7B64%7D%2F%5Cphi%29+%5C%25+2%5E%7B64%7D&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=slot%5C_before%5C_shift%28x%29+%3D+%28x+%2A+2%5E%7B64%7D%2F%5Cphi%29+%5C%25+2%5E%7B64%7D&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=slot%5C_before%5C_shift%28x%29+%3D+%28x+%2A+2%5E%7B64%7D%2F%5Cphi%29+%5C%25+2%5E%7B64%7D&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"slot\\_before\\_shift(x) = (x * 2^{64}\/\\phi) \\% 2^{64}\" \/><\/p>\n<p>I\u2019m leaving out the shift at the end because that part doesn\u2019t matter for figuring out why Fibonacci numbers are giving us problems. 
In the example of stepping around a circle (from the Vi Hart video above) the equation would look like this:<\/p>\n<p><img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=angle%5C_for%5C_leaf%28x%29+%3D+%28x+%2A+360%2F%5Cphi%29+%5C%25+360&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=angle%5C_for%5C_leaf%28x%29+%3D+%28x+%2A+360%2F%5Cphi%29+%5C%25+360&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=angle%5C_for%5C_leaf%28x%29+%3D+%28x+%2A+360%2F%5Cphi%29+%5C%25+360&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"angle\\_for\\_leaf(x) = (x * 360\/\\phi) \\% 360\" \/><\/p>\n<p>This would give us an angle from 0 to 360. These functions are obviously similar. We just replaced <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=2%5E%7B64%7D&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=2%5E%7B64%7D&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=2%5E%7B64%7D&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"2^{64}\" \/> with <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=360&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=360&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=360&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"360\" \/>. 
So while we\u2019re in math-land with infinite precision, we might as well make the function return something in the range from 0 to 1, and then multiply the constant in afterwards:<\/p>\n<p><img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=hash%5C_0%5C_to%5C_1%28x%29+%3D+frac%28x%2F%5Cphi%29&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=hash%5C_0%5C_to%5C_1%28x%29+%3D+frac%28x%2F%5Cphi%29&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=hash%5C_0%5C_to%5C_1%28x%29+%3D+frac%28x%2F%5Cphi%29&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"hash\\_0\\_to\\_1(x) = frac(x\/\\phi)\" \/><\/p>\n<p>Where <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=frac%28x%29&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=frac%28x%29&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=frac%28x%29&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"frac(x)\" \/> returns the fractional part of a number. So <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=frac%281.1%29+%3D+0.1&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=frac%281.1%29+%3D+0.1&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=frac%281.1%29+%3D+0.1&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"frac(1.1) = 0.1\" \/>. In this last formulation it\u2019s easy to find out why Fibonacci numbers give us problems. 
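<\/p>\n<p>The fractional-part formulation can also be tried numerically. The following is only a sketch in double precision, not the integer code a hash table would use; distance_to_integer is a hypothetical helper added for illustration. Note that the raw fractional part can land just below 1 instead of just above 0, and on the circle those two are neighbors, which is why the helper measures the distance to the nearest integer.<\/p>\n

```cpp
#include <cassert>
#include <cmath>

// Fractional part of x / phi, in the range [0, 1).
double hash_0_to_1(double x) {
    assert(x >= 0.0);
    const double phi = (1.0 + std::sqrt(5.0)) / 2.0;  // golden ratio
    const double q = x / phi;
    return q - std::floor(q);
}

// Distance from x / phi to the nearest integer, in the range [0, 0.5].
// Values near 0 mean the input lands right next to the start of the circle.
double distance_to_integer(double x) {
    const double f = hash_0_to_1(x);
    return std::fmin(f, 1.0 - f);
}
```

<p>With this helper, inputs such as 144 and 1597 come out much closer to an integer than 8 does.<\/p>\n<p>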
Let\u2019s try putting in a few Fibonacci numbers:<\/p>\n<p><img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=hash%5C_0%5C_to%5C_1%28144%29+%3D+frac%28144%2F%5Cphi%29+%5Capprox+frac%2889%29+%3D+0&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=hash%5C_0%5C_to%5C_1%28144%29+%3D+frac%28144%2F%5Cphi%29+%5Capprox+frac%2889%29+%3D+0&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=hash%5C_0%5C_to%5C_1%28144%29+%3D+frac%28144%2F%5Cphi%29+%5Capprox+frac%2889%29+%3D+0&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"hash\\_0\\_to\\_1(144) = frac(144\/\\phi) \\approx frac(89) = 0\" \/><br \/>\n<img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=hash%5C_0%5C_to%5C_1%281597%29+%3D+frac%281597%2F%5Cphi%29+%5Capprox+frac%28987%29+%3D+0&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=hash%5C_0%5C_to%5C_1%281597%29+%3D+frac%281597%2F%5Cphi%29+%5Capprox+frac%28987%29+%3D+0&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=hash%5C_0%5C_to%5C_1%281597%29+%3D+frac%281597%2F%5Cphi%29+%5Capprox+frac%28987%29+%3D+0&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"hash\\_0\\_to\\_1(1597) = frac(1597\/\\phi) \\approx frac(987) = 0\" \/><br \/>\n<img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=hash%5C_0%5C_to%5C_1%288%29+%3D+frac%288%2F%5Cphi%29+%5Capprox+frac%284.94%29+%3D+0.94&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=hash%5C_0%5C_to%5C_1%288%29+%3D+frac%288%2F%5Cphi%29+%5Capprox+frac%284.94%29+%3D+0.94&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, 
https:\/\/s0.wp.com\/latex.php?latex=hash%5C_0%5C_to%5C_1%288%29+%3D+frac%288%2F%5Cphi%29+%5Capprox+frac%284.94%29+%3D+0.94&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"hash\\_0\\_to\\_1(8) = frac(8\/\\phi) \\approx frac(4.94) = 0.94\" \/><\/p>\n<p>What we see here is that if we divide a large Fibonacci number by the golden ratio, we get almost exactly the previous Fibonacci number. The result is so close to a whole number that the fractional part is effectively 0 (or a value just below 1, which wraps around to land right next to 0). So even if we multiply the full range of <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=2%5E%7B64%7D&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=2%5E%7B64%7D&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=2%5E%7B64%7D&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"2^{64}\" \/> back in, the results still cluster around 0. But for smaller Fibonacci numbers there is some imprecision because the Fibonacci sequence is just an integer approximation of golden ratio growth. That approximation gets more exact the further along we get into the sequence, but for the number 8 it\u2019s not that exact. That\u2019s why 8 was not a problem, 34 started to look problematic, and 144 is going to be real bad.<\/p>\n<p>Except that when we talk about badness, we also have to consider the size of the hash table. It\u2019s really easy to find bad patterns when the table only has eight slots. 
If the table is bigger and has, say 64 slots, suddenly multiples of 34 don\u2019t look as bad:<\/p>\n<p>0 -&gt; 0<br \/>\n34 -&gt; 0<br \/>\n68 -&gt; 1<br \/>\n102 -&gt; 2<br \/>\n136 -&gt; 3<br \/>\n170 -&gt; 4<br \/>\n204 -&gt; 5<br \/>\n238 -&gt; 5<br \/>\n272 -&gt; 6<br \/>\n306 -&gt; 7<br \/>\n340 -&gt; 8<br \/>\n374 -&gt; 9<br \/>\n408 -&gt; 10<br \/>\n442 -&gt; 10<br \/>\n476 -&gt; 11<br \/>\n510 -&gt; 12<br \/>\n544 -&gt; 13<\/p>\n<p>And if the table has 1024 slots we get all the multiples nicely spread out:<\/p>\n<p>0 -&gt; 0<br \/>\n34 -&gt; 13<br \/>\n68 -&gt; 26<br \/>\n102 -&gt; 40<br \/>\n136 -&gt; 53<br \/>\n170 -&gt; 67<br \/>\n204 -&gt; 80<br \/>\n238 -&gt; 94<br \/>\n272 -&gt; 107<br \/>\n306 -&gt; 121<br \/>\n340 -&gt; 134<br \/>\n374 -&gt; 148<br \/>\n408 -&gt; 161<br \/>\n442 -&gt; 175<br \/>\n476 -&gt; 188<br \/>\n510 -&gt; 202<br \/>\n544 -&gt; 215<\/p>\n<p>At size 1024 even the multiples of 144 don\u2019t look scary any more because they\u2019re starting to be spread out now:<\/p>\n<p>0 -&gt; 0<br \/>\n144 -&gt; 1020<br \/>\n288 -&gt; 1017<br \/>\n432 -&gt; 1014<br \/>\n576 -&gt; 1011<br \/>\n720 -&gt; 1008<br \/>\n864 -&gt; 1004<br \/>\n1008 -&gt; 1001<br \/>\n1152 -&gt; 998<\/p>\n<p>So the bad pattern of multiples of Fibonacci numbers goes away with bigger hash tables. Because Fibonacci hashing spreads out the numbers, and the bigger the table is, the better it gets at that. This doesn\u2019t help you if your hash table is small, or if you need to insert multiples of a larger Fibonacci number, but it does give me confidence that this \u201cbad pattern\u201d is something we can live with.<\/p>\n<p>So I am OK with living with the bad pattern of Fibonacci hashing. It\u2019s less bad than making the hash table a power of two size. It can be slightly more bad than using prime number sizes, as long as your prime numbers are well chosen. 
But I still think that on average Fibonacci hashing is better than prime number sized integer modulo, because Fibonacci hashing mixes up sequential numbers. So it fixes a real problem I have run into in the past while introducing a theoretical problem that I am struggling to find real examples for. I think that\u2019s a good trade.<\/p>\n<p>Also prime number integer modulo can have problems if you choose bad prime numbers. For example boost::unordered_map can choose size 196613, which is 0b110000000000000101 in binary, which is a pretty round number in the same way that 15000005 is a pretty round number in decimal. Since this prime number is \u201ctoo round of a number\u201d this causes lots of hash collisions in one of my benchmarks, and I didn\u2019t set that benchmark up to find bad cases like this. It was totally accidental and took lots of debugging to figure out why boost::unordered_map does so badly in that benchmark. (the benchmark in question was set up to find problems with sequential numbers) But I won\u2019t go into that and will just say that while prime numbers give fewer problematic patterns than Fibonacci hashing, you still have to choose them well to avoid introducing hash collisions.<\/p>\n<h3>Conclusion<\/h3>\n<p>Fibonacci hashing may not be the best hash function, but I think it\u2019s the best way to map from a large range of numbers into a small range of numbers. And we are only using it for that. When used only for that part of the hash table, we have to compare it against two existing approaches: Integer modulo with prime numbers and Integer modulo with power of two sizes. It\u2019s almost as fast as the power of two size, but it introduces far fewer problems because it doesn\u2019t discard any bits. It\u2019s much faster than the prime number size, and it also gives us the bonus of breaking up sequential numbers, which can be a big benefit for open addressing hash tables. 
It does introduce a new problem with multiples of large Fibonacci numbers in small hash tables, but I think that problem can be solved by using a custom hash function when you encounter it. Experience will tell how often we will have to use this.<\/p>\n<p>All of my hash tables now use Fibonacci hashing by default. For my flat_hash_map the property of breaking up sequential numbers is particularly important because I have had real problems caused by sequential numbers. For the others it\u2019s just a faster default. It might almost make the option to use the power of two integer modulo unnecessary.<\/p>\n<p>It\u2019s surprising that the world forgot about this optimization and that we\u2019re all using prime number sized hash tables instead. (Or we use Dinkumware\u2019s solution, which uses a power of two integer modulo but spends more time on the hash function to make up for the problems of the power of two integer modulo.) Thanks to Rich Geldreich for writing a hash table that uses this optimization and for mentioning it in my comments. But this is an interesting example because academia had a solution to a real problem in existing hash tables, but professors didn\u2019t realize that they did. The most likely reason for that is that it\u2019s not well known how big the problem of \u201cmapping a large number into a small range\u201d is and how much time it takes to do an integer modulo.<\/p>\n<p>For future work it might be worth looking into Knuth\u2019s third hash function: the one that\u2019s related to CRC hashes. It seems to be a way to construct a good CRC hash function if you need an n-bit output for a hash table. But it was too complicated for me to look into, so I\u2019ll leave it as an exercise for the reader to find out whether that one is worth using.<\/p>\n<p>Finally <a href=\"https:\/\/github.com\/skarupke\/flat_hash_map\/blob\/master\/unordered_map.hpp\">here<\/a> is the link to my implementation of unordered_map. 
My other two hash tables are also there: flat_hash_map has very fast lookups and bytell_hash_map is also very fast but was designed more to save memory compared to flat_hash_map.<\/p>\n<\/div>\n<\/div>\n<div class=\"post-meta\">\n<div class=\"post-date\">Published: <abbr class=\"published\" title=\"2018-06-16T11:26:17-0700\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/\">June 16, 2018<\/a><\/abbr><\/div>\n<div class=\"categories\">Filed Under: <a href=\"https:\/\/probablydance.com\/category\/programming\/\" rel=\"category tag\">Programming<\/a><\/div>\n<p>Tags: <a href=\"https:\/\/probablydance.com\/tag\/fibonacci-hashing\/\" rel=\"tag\">fibonacci hashing<\/a> : <a href=\"https:\/\/probablydance.com\/tag\/hash-table\/\" rel=\"tag\">hash table<\/a><\/div>\n<\/div>\n<div id=\"comments\">\n<h3 id=\"comments\">99 Comments to \u201cFibonacci Hashing: The Optimization that the World Forgot (or: a Better Alternative to Integer\u00a0Modulo)\u201d<\/h3>\n<div class=\"navigation\">\n<div class=\"alignleft\"><\/div>\n<div class=\"alignright\"><\/div>\n<\/div>\n<ol class=\"commentlist\">\n<li id=\"comment-3597\" class=\"comment byuser comment-author-sebastiansylvan even thread-even depth-1 parent\">\n<div id=\"div-comment-3597\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=144&amp;d=identicon&amp;r=G 3x, 
https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/sebastiansylvan.wordpress.com\" rel=\"ugc external nofollow\">Sebastian Sylvan<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3597\">June 16, 2018 at 14:39<\/a><\/div>\n<p>You can also use fast range reduction: <a href=\"https:\/\/lemire.me\/blog\/2016\/06\/27\/a-fast-alternative-to-the-modulo-reduction\/\" rel=\"nofollow ugc\">https:\/\/lemire.me\/blog\/2016\/06\/27\/a-fast-alternative-to-the-modulo-reduction\/<\/a><\/p>\n<p>It\u2019s also super fast (a multiply and a shift), but the nice part about it is that you can map it to any sized range, not just powers of two (so e.g. you can grow your hash table by 1.5x each time, instead of 2.. 
or use prime number sizes, or whatever you want).<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3597#respond\" rel=\"nofollow\" data-commentid=\"3597\" data-postid=\"9623\" data-belowelement=\"div-comment-3597\" data-respondelement=\"respond\" data-replyto=\"Reply to Sebastian Sylvan\" aria-label=\"Reply to Sebastian Sylvan\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3598\" class=\"comment byuser comment-author-sebastiansylvan odd alt depth-2 parent\">\n<div id=\"div-comment-3598\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/sebastiansylvan.wordpress.com\" rel=\"ugc external nofollow\">Sebastian Sylvan<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a 
href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3598\">June 16, 2018 at 15:16<\/a><\/div>\n<p>Oh just realized you briefly mentioned it. IMO you\u2019re not giving it enough credit. Neither Fibonacci nor fastrange will work as a hash function alone; you need a decent hash function to start with, and the benefit of fastrange is that it allows you to use non-pow-of-2 sizes (if you do use power of two sizes it\u2019s true that it\u2019s exactly equivalent to just throwing away the lower bits). You could do the trick you do above of xor-ing the low bits into the high bits before mapping to precondition it a bit and ensure you don\u2019t throw away the low bits completely (but the original hash function should already have done plenty of such mixing).<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3598#respond\" rel=\"nofollow\" data-commentid=\"3598\" data-postid=\"9623\" data-belowelement=\"div-comment-3598\" data-respondelement=\"respond\" data-replyto=\"Reply to Sebastian Sylvan\" aria-label=\"Reply to Sebastian Sylvan\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3613\" class=\"comment byuser comment-author-sagan1338 bypostauthor even depth-3\">\n<div id=\"div-comment-3613\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G 1x, 
https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/probablydance.wordpress.com\" rel=\"ugc external nofollow\">Malte Skarupke<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3613\">June 16, 2018 at 16:52<\/a><\/div>\n<p>Why does Fibonacci hashing not work as a hash function alone? It should, unless you have a use case that results in lots of Fibonacci numbers being inserted. You can use the identity function as the hash when you use Fibonacci hashing to assign a hash to an index. In fact you can also use the identity function when using prime number sizes. Only for power of two sizes and for fastrange do you need a separate hash function. Or when your type is bigger than 64 bits obviously.<\/p>\n<p>Since both the hash function, and the function that you use to map from the hash value to a slot are on the critical path of the hash table, you want to be very careful what you do in there. If the hash function can be identity, it should be identity. And the cases where it can\u2019t be identity are pretty rare. 
(except of course when using power of two sizes or when using fastrange, then those cases are common)<\/p>\n<p>I think it is possible to improve fastrange so that it also uses the low bits, but why not just use Fibonacci hashing? To improve fastrange you would have to slow it down. Fibonacci hashing is already the same speed and gives better results.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-3624\" class=\"comment odd alt depth-3\">\n<div id=\"div-comment-3624\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/c3da23674e3b824deb9dcd120fc45d809632826e378b202dc7f5c9e4498f8fd5?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/c3da23674e3b824deb9dcd120fc45d809632826e378b202dc7f5c9e4498f8fd5?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/c3da23674e3b824deb9dcd120fc45d809632826e378b202dc7f5c9e4498f8fd5?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/c3da23674e3b824deb9dcd120fc45d809632826e378b202dc7f5c9e4498f8fd5?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/c3da23674e3b824deb9dcd120fc45d809632826e378b202dc7f5c9e4498f8fd5?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/c3da23674e3b824deb9dcd120fc45d809632826e378b202dc7f5c9e4498f8fd5?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Grant Husbands<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3624\">June 17, 2018 at 05:40<\/a><\/div>\n<p>Can\u2019t reply to Malte (no reply button), so I\u2019ll reply here. 
The reason you need another hash function is that you can be hashing arbitrary length things like strings, and you haven\u2019t supplied an iterative version of the Fibonacci hash.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-3667\" class=\"comment byuser comment-author-sebastiansylvan even depth-3\">\n<div id=\"div-comment-3667\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/sebastiansylvan.wordpress.com\" rel=\"ugc external nofollow\">Sebastian Sylvan<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3667\">June 17, 2018 at 21:38<\/a><\/div>\n<p>A few thoughts: You literally say \u201cFibonacci hashing is not a really good hash function.\u201d in the article. Then you proceed to add a really ghetto hashing algorithm preprocessing step to improve it somewhat. 
Having a real hash function like xxHash (or for integer-&gt;integer, the permutation step from PCG, perhaps) should be much better. As far as I can tell your argument here basically boils down to \u201csometimes a bad hash function is okay, if we do some post processing on it\u201d, which is true, but there are many ways of doing such post processing, and in fact hashing functions do just this to get their final output.<\/p>\n<p>Second, fast range preserves the order of elements regardless of table size. So when resizing you can do a linear scan to move elements into the new table. This is faster.<\/p>\n<p>Third, it allows arbitrary sizes. I don\u2019t understand your comment that Fibonacci works with prime sizes. Maybe I missed something, but you don\u2019t seem to show that you can use Fibonacci hashing with arbitrary sizes (do you mean by adding an integer modulo at the end? That\u2019s the whole thing we\u2019re trying to avoid.. fast range doesn\u2019t need modulo).<\/p>\n<p>I don\u2019t think you need to improve fast range, I think it works just fine with a decent hash function, which I think is fair to assume for someone using a hash table (xxHash does 10+GB\/s, there\u2019s no real excuse to not use a decent hash function).<\/p>\n<p>Finally, as Grant Husbands points out \u2013 I\u2019d wager most uses of a hash table isn\u2019t with an integer key \u2013 you need a hash function anyway.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-3670\" class=\"comment byuser comment-author-sagan1338 bypostauthor odd alt depth-3\">\n<div id=\"div-comment-3670\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G\" 
srcset=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/probablydance.wordpress.com\" rel=\"ugc external nofollow\">Malte Skarupke<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3670\">June 18, 2018 at 05:34<\/a><\/div>\n<p>Alright sorry, this needs a lot more clarification. So first of all the \u201cprime number sizes\u201d was referring to prime number integer modulo.<\/p>\n<p>Then we have two cases:<br \/>\n1. The type is smaller than 64 bits. This includes ints, floats, small custom structs or small std::pairs.<br \/>\n2. The type is variable size or larger than 64 bits. This includes std::string and other sequences.<\/p>\n<p>Now we can break down point 1 more:<br \/>\n1.1 You are not inserting lots of multiples of Fibonacci numbers into your table. This is the usual case. In this case just use Fibonacci hashing. And use the identity function for the hash function. Adding a custom hashing function on top would just slow you down unnecessarily.<br \/>\n1.2 You have a pattern that has problems with Fibonacci hashing. In this case use prime number integer modulo. 
I didn\u2019t mention this in the blog post, but it can actually also be made faster. If you look at flat_hash_map.hpp in my repository you will see a huge code block dedicated to making prime number integer modulo faster. And with a compile time switch you can use this behavior. (define the hash_policy typedef in your hash type. See the power_of_two_std_hash struct as an example that just uses std::hash but uses the power of two integer modulo policy) But once again even with a prime number integer modulo, you want to use the identity function as your hash function, because there is no reason to use anything more complicated.<\/p>\n<p>There are very, very few other cases. The only other case I would maybe consider is if you\u2019re super confident that all the information is in the lower bits. Then you can use power of two sized integer modulo and get a tiny speed up over Fibonacci hashing. But now you\u2019re playing with fire, because if your inputs ever change you\u2019ll get hash collisions. Fibonacci hashing is the safer choice and it\u2019s not much slower.<\/p>\n<p>When would you use fastrange? Never. If you know the information is in the upper bits, you could use fastrange. But then you could also use Fibonacci hashing because it\u2019s the same speed but it uses all the bits.<\/p>\n<p>Let\u2019s break down point 2, too:<\/p>\n<p>2.1. The type already has a good hash function. For example std::string. In this case you want to use power of two sized integer modulo. I have been thinking of doing that by default for strings. If there is already a good hash function, there is no reason to use anything more complicated. You\u2019d just be adding cost on top of the hash function.<br \/>\n2.2. You have to write a hash function for the type. In that case use an existing hash function. 
If you use a good one, just use power of two sized integer modulo; if you use a bad one (maybe because it\u2019s faster) use Fibonacci hashing.<\/p>\n<p>So what we have here is that in the most common cases you want to use Fibonacci hashing, or if you already have a good hash function, you want to use power of two sized integer modulo because that is simply the fastest thing, and you don\u2019t need anything more fancy if you have a good hash function. There are no cases where you want to use fastrange.<\/p>\n<p>So what about the benefits of \u201cpreserving order independent of size\u201d or \u201cit allows arbitrary sizes\u201d? I don\u2019t know, man. Why are those big benefits? It\u2019s fairly standard to just double the size of the table on resize. I even do that for my prime number size table, where I could grow in smaller steps. And I doubt I can take advantage of the items being in a certain order. Once you get a few hash collisions you have to resolve those and shuffle items around. And that will play out differently depending on the size. (When you double the size you get fewer collisions and thus a different order of your elements.) So you can\u2019t rely on things being in a certain order, so I really don\u2019t know what benefit you get from the results of the hash function being in the same order. Also even if they were in the same order, they won\u2019t be in the same position. So you can\u2019t just memcpy. And otherwise the loop that you do when resizing is already really simple, so I struggle to come up with a way to simplify it more, even if you could know that items were in the same order. Plus, how often do you resize your hash table? If you use fastrange you\u2019re inviting a lot of problems and thus a lot of complexity. Maybe you now have to use a slower hash function (like xxHash instead of the identity function) because fastrange otherwise gives you problems. 
How many of those slowdowns and problems are you willing to add just to make table resizing faster? It\u2019s just putting the priorities in the wrong place. You should make sure that your lookups are fast, not your table resizing.<\/p>\n<p>I think there is a common misunderstanding where people are really, really scared of bad inputs and are thus saying that \u201cyou always have to run a proper hash function on the input.\u201d That is simply wrong. If you use prime number integer modulo you are fine. If you use Fibonacci hashing, you are fine. You don\u2019t need a perfect distribution. So just use the identity function when you can, and only use a more expensive hash function when you have to. (when the type is bigger or variable size) Yes, there are certain patterns that can hurt you, but those are very, very rare. When picking these defaults you have to consider that everyone is going to use them. You don\u2019t want to slow down the common case in order to make the very, very rare case faster. I admit that this is a trade-off. For example I do slow down the common case by not using power of two sized integer modulo. If I did that, the common case would be even faster. But there are too many inputs that break for power of two sized integer modulo, so I\u2019m not happy to use that as the default. Where for Fibonacci hashing, bad inputs are \u201cvery, very rare\u201d for power of two sized integer modulo bad inputs are common. But part of the reason why Fibonacci hashing is so brilliant is that it\u2019s a great trade-off here. It\u2019s really fast while also being really good at reducing the range. 
So use that and use the identity hasher by default.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-3671\" class=\"comment byuser comment-author-sebastiansylvan even depth-3\">\n<div id=\"div-comment-3671\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/sebastiansylvan.wordpress.com\" rel=\"ugc external nofollow\">Sebastian Sylvan<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3671\">June 18, 2018 at 09:41<\/a><\/div>\n<p>Re: 1.1, I feel like your argument can be restated as \u201csometimes multiplicative hashing works okay, even though it\u2019s a terrible hash function\u201d. 
And yes, that\u2019s true, but this is an argument for weak hashing being practical in some cases, but obviously this isn\u2019t true in all cases and you seem weirdly excited about this particular version as if it\u2019s not just another weak hash function. IMO there are much better behaved ways of turning an int into a reasonably uniform int (see the permutation step of PCG, for example) that are still cheap but behave much better than multiplicative hashing.<br \/>\nRe: 1.2 As I\u2019m sure you\u2019re aware, Fibonacci numbers come up *all the time* naturally. Seems kind of like a double standard to make the assumption that the input is not going to play badly with your particularly chosen weak hash function, while on the other hand complaining that fast range behaves poorly on (very) weak hash functions.<\/p>\n<p>You sort of just sidestep the fact that fast range doesn\u2019t need any clever tricks to avoid integer modulo. It \u201cjust works\u201d. The reason you don\u2019t want power of two table sizes is two-fold: 1) It wastes memory. 2) It plays poorly with the allocator. To elaborate on that last point, if you use a growth factor of 1.5x, it\u2019s more likely that you can reuse previously used memory blocks which is more efficient (so e.g. 4, then 6, then 9 \u2013 when allocating the array for 9 elements, the previous two can be merged and reused because 4+6&gt;9).<\/p>\n<p>The reason preserving order is good is because it speeds up re-insertion. It becomes a linear scan over the new and old arrays rather than having to jump around all over the new one because the order changed. For small hash tables in particular, the cost of growth can be a very significant average overhead, and for large tables it can be extremely bad for predictability. 
You have to make sure when you move an element to the new table that it\u2019s at \u201cat least\u201d its \u201ctarget slot\u201d (because the new table is bigger, you\u2019ll want to insert some \u201choles\u201d occasionally) but memory access is much better.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-3672\" class=\"comment byuser comment-author-sebastiansylvan odd alt depth-3\">\n<div id=\"div-comment-3672\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/sebastiansylvan.wordpress.com\" rel=\"ugc external nofollow\">Sebastian Sylvan<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3672\">June 18, 2018 at 09:45<\/a><\/div>\n<p>BTW, re: the allocator-aware growth thing: you do have to make sure you use \u201crealloc\u201d for that to work, *and* that your allocator tries to merge 
blocks like that, but it\u2019s fairly standard to use 1.5x instead of 2x for this reason. Unfortunately with C++ not all types can be realloc\u2019d.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-3673\" class=\"comment byuser comment-author-sebastiansylvan even depth-3\">\n<div id=\"div-comment-3673\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/fc998fecaef9d96857358c3d4f916ab4723ab96426fd1cacec23cb9a42ba3ea7?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/sebastiansylvan.wordpress.com\" rel=\"ugc external nofollow\">Sebastian Sylvan<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3673\">June 18, 2018 at 11:55<\/a><\/div>\n<p>I also just realized that the allocation block sizes for the aforementioned allocator-aware growth strategy form a Fibonacci sequence. 
That\u2019s just ironic and drives home the point that the Fibonacci sequence is literally famous for showing up all over in unexpected places.<\/p>\n<p>I think my broader point is: The user can already customize the hash function. If they have well behaved data and want to use a weak hash (like Fibonacci, or any other) then they already can. This shouldn\u2019t be something the table does on top of what the user\u2019s hash function returns.<\/p>\n<p>If you made the growth factor customizable too, you could special case it for growth=2 and do just plain shifting in that case (but tbh I wouldn\u2019t be surprised if you found that fast range performed the same in practice.. theoretically a simple shift may be simpler than a multiply (esp. for 32 bit hashes), but in practice, does it make a difference in a superscalar CPU? Maybe, maybe not)<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-3688\" class=\"comment byuser comment-author-sagan1338 bypostauthor odd alt depth-3\">\n<div id=\"div-comment-3688\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=192&amp;d=identicon&amp;r=G 4x\" 
alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/probablydance.wordpress.com\" rel=\"ugc external nofollow\">Malte Skarupke<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3688\">June 19, 2018 at 19:06<\/a><\/div>\n<p>For the realloc strategy: It\u2019s a clever trick, but I can\u2019t use it because I am using std::allocator. Yes, nobody likes std::allocator, but if you want to be interface-compatible with std::unordered_map, you have to use it. And I think it is very important to be interface-compatible with std::unordered_map. Also, as you observed yourself, in a generic hash table you can\u2019t use the trick anyway because you need to keep the old block around because you need to move-construct from the old block into the new one. So there is no time where the allocator could merge the old blocks because the hash table is still holding on to it, and it will only be freed once the data has been transferred to the larger block.<\/p>\n<p>For the \u201csometimes multiplicative hashing works okay, even though it\u2019s a terrible hash function\u201d thing: I think \u201cterrible\u201d is too strong of a word, but that\u2019s beside the point because my argument is actually even stronger: It literally doesn\u2019t matter how good of a hash function it is. I think part of the problem is that I couldn\u2019t come up with a good measure for what makes a function be a good mapping function from a large range into a small range. If a function is a good hash function, then it will also be a good function to map from a large range into a small range, (just chop off the top bits after hashing) but it can also be a good function for that purpose even if it\u2019s a bad hash function. 
To illustrate that, lets consider the case of prime number integer modulo, which I think everyone would agree is a really terrible hash function.<\/p>\n<p>As I mentioned in the blog post, most unordered_map implementations use a prime number sized integer modulo, and then just use the identity function as the hash function for integer types. Why do they do that? These hash tables were written by really smart people. Why would smart people decide to use no hash function at all and to only rely on the mapping function of the prime number integer modulo? Because it\u2019s a really good trade-off when you look at all the different considerations of a hash table. I actually think it\u2019s a better decision than what Dinkumware did where they went with a power of two size and a more complex hash function. The other unordered_map implementations should just use a faster integer modulo like I do in my prime_number_hash_policy, then they would beat Dinkumware both in performance, and they would have fewer problems than Dinkumware.<\/p>\n<p>Now it may sound like heresy to say that you\u2019d get fewer problems with the identity hash function, but it really is true. Because all you need is that every slot in the table is equally likely to be used. Prime number integer modulo gives you that. You can check this with all kinds of patterns, and you will always find that every slot is equally likely to be used. Of course you can always construct patterns that intentionally break the hash table, but that is always the case, even if you use a really complicated hash function. But if you can show me a real pattern that comes up in real code that breaks prime number integer modulo, then I will consider using a more complicated hash function. If you can\u2019t, then it is a good solution. 
If you have to artificially construct inputs to find a weakness, then it is a good solution.<\/p>\n<p>The other thing to consider is that hash tables have the customization point of providing your own hash function. And lots of people use that customization point because lots of people want to store their own types in hash tables. So now the hash table has to decide on some method of turning the result from the user provided hash function into a slot in the table. If you use power of two sized integer modulo, there are lots of traps to fall into for your users, but prime number integer modulo stays robust. I think this is also very important, and this is why I think Dinkumware made the wrong choice by just chopping off the upper bits. Your users won\u2019t necessarily know how to write a good hash function, and the table has to be somewhat robust against that. Fastrange is really not robust against bad hash functions. It has all the same problems as power of two sized integer modulo, because it just chops off the bottom bits. (I actually think it has more problems because the bottom bits more often have information in them) Fibonacci hashing is robust in the same way that prime number integer modulo is robust: I don\u2019t know of a real pattern that comes up in real code that would break it. I know how to break it if I want to, but I can\u2019t come up with any situation where I would break it in real code. I will say that I wouldn\u2019t at all be surprised if I will soon come across such a use case, but I haven\u2019t yet, and it doesn\u2019t make sense to be scared to use something because maybe someday I will potentially run into a problem with it, even though right now it looks like I never will. If you think like that, you shouldn\u2019t use hash tables at all.<\/p>\n<p>So that brings me to this part of your comment:<br \/>\n\u201cThe user can already customize the hash function. 
If they have well behaved data and want to use a weak hash (like Fibonacci, or any other) then they already can. This shouldn\u2019t be something the table does on top of what the user\u2019s hash function returns.\u201d<\/p>\n<p>The beauty of Fibonacci hashing is that it does two jobs in one: 1. It maps the number from a large range into a small range (which the hash table definitely has to do) and 2. It does some additional mixing of the inputs. So yes, the table doesn\u2019t have to do any hashing on top of what the user does, but it does have to map the number that the user provides into a small range. If we can get some additional hashing at the same time for free, that\u2019s great.<\/p>\n<p>There is also one meta-problem with this discussion, which is that when talking about hash tables, you can always come up with situations where one table is faster than another. Give me your fastest hash table, and I will come up with a situation where std::unordered_map is faster. So the discussion can\u2019t be about \u201cI found a case where your table is slower than another\u201d but it has to be about weighing different trade-offs and about deciding what\u2019s important to be fast.<\/p>\n<p>For me, when considering various trade-offs, it\u2019s always important for lookups to be as fast as possible. So when we talk about how much better the behavior would be when using a more complicated hash function, you really have to think about how much slower or faster your average lookup will be. Let\u2019s say you use a more complicated hash function like xxHash. If you insert random numbers, you immediately find that your lookups are now more than twice as slow as Fibonacci hashing when the table is cached. Bummer. So now instead of random numbers we throw a bunch of different use cases at the hash tables. Because xxHash will give us fewer hash collisions when we insert non-random numbers. How much do you think hash collisions will go down? 1 percent? 5 percent? 
Enough to make up for a factor two difference in speed? Because if you can\u2019t drastically reduce hash collisions, your table won\u2019t be any faster and you are only losing by using a more complicated hash function. And you can\u2019t reduce hash collisions much compared to Fibonacci hashing, because there are already very few hash collisions with Fibonacci hashing.<\/p>\n<p>And even if you find a use case where Fibonacci hashing does badly and you want to use a more complicated hash function like xxHash, you should try using prime number integer modulo first. It\u2019s very easy to change that in my hash tables, just use the prime_number_hash_policy. That is still twice as fast as using xxHash, and it will definitely handle your pattern. If it doesn\u2019t, then I would actually be really interested in that. I have a collection of bad cases that break hash tables, and I am always looking for more of those. They have to be real cases though, not artificial constructs. Because when it comes to hash tables you can always come up with edge cases where a hash table does really badly.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-4133\" class=\"comment byuser comment-author-sagan1338 bypostauthor even depth-3\">\n<div id=\"div-comment-4133\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=96&amp;d=identicon&amp;r=G 2x, 
https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/probablydance.wordpress.com\" rel=\"ugc external nofollow\">Malte Skarupke<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-4133\">September 7, 2018 at 02:41<\/a><\/div>\n<p>Coming back to this conversation months later: I actually had a use case where I needed to save as much space as possible, so I wanted to pack the hash table as tight as possible. Which, as you pointed out, fibonacci hashing doesn\u2019t support because the table has to be sized to be a power of two. But fastrange also didn\u2019t work because I got too many hash collisions with that. Integer modulo with constants would have worked and was fairly fast, (faster than trying to fix the hash collisions in fastrange) but it was then that I realized that a combination of fibonacci hashing and fastrange actually makes a lot of sense: Fibonacci hashing gives you a good mixing of the upper bits, and fastrange only looks at the upper bits. So I ended up using fibonacci hashing followed by fastrange to make it possible to give the container an arbitrary size. 
It was still faster than any alternative and saved me a lot of disk space and memory.<\/p>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li id=\"comment-3604\" class=\"comment byuser comment-author-deathbuffer odd alt thread-odd thread-alt depth-1 parent\">\n<div id=\"div-comment-3604\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/1.gravatar.com\/avatar\/a8bb96e0070d88cc37664602a172a1c57109f20c5058a7ce1da6cd13c4672f32?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/1.gravatar.com\/avatar\/a8bb96e0070d88cc37664602a172a1c57109f20c5058a7ce1da6cd13c4672f32?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/1.gravatar.com\/avatar\/a8bb96e0070d88cc37664602a172a1c57109f20c5058a7ce1da6cd13c4672f32?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/1.gravatar.com\/avatar\/a8bb96e0070d88cc37664602a172a1c57109f20c5058a7ce1da6cd13c4672f32?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/1.gravatar.com\/avatar\/a8bb96e0070d88cc37664602a172a1c57109f20c5058a7ce1da6cd13c4672f32?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/1.gravatar.com\/avatar\/a8bb96e0070d88cc37664602a172a1c57109f20c5058a7ce1da6cd13c4672f32?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">deathbuffer<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3604\">June 16, 2018 at 15:24<\/a><\/div>\n<p>Didn\u2019t finish reading yet, just a correction: I think you got Rich Geldreich\u2019s name wrong in the third paragraph. 
Please feel free to delete this comment later.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3604#respond\" rel=\"nofollow\" data-commentid=\"3604\" data-postid=\"9623\" data-belowelement=\"div-comment-3604\" data-respondelement=\"respond\" data-replyto=\"Reply to deathbuffer\" aria-label=\"Reply to deathbuffer\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3611\" class=\"comment byuser comment-author-sagan1338 bypostauthor even depth-2\">\n<div id=\"div-comment-3611\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/probablydance.wordpress.com\" rel=\"ugc external nofollow\">Malte Skarupke<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a 
href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3611\">June 16, 2018 at 16:44<\/a><\/div>\n<p>Thanks! Fixed.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3611#respond\" rel=\"nofollow\" data-commentid=\"3611\" data-postid=\"9623\" data-belowelement=\"div-comment-3611\" data-respondelement=\"respond\" data-replyto=\"Reply to Malte Skarupke\" aria-label=\"Reply to Malte Skarupke\">Reply<\/a><\/div>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<li id=\"comment-3608\" class=\"comment byuser comment-author-cpergiel odd alt thread-even depth-1 parent\">\n<div id=\"div-comment-3608\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/1.gravatar.com\/avatar\/7b7041979f722735f7190300e85d9fddc334a53180fb65711cffaee32df4520f?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/1.gravatar.com\/avatar\/7b7041979f722735f7190300e85d9fddc334a53180fb65711cffaee32df4520f?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/1.gravatar.com\/avatar\/7b7041979f722735f7190300e85d9fddc334a53180fb65711cffaee32df4520f?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/1.gravatar.com\/avatar\/7b7041979f722735f7190300e85d9fddc334a53180fb65711cffaee32df4520f?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/1.gravatar.com\/avatar\/7b7041979f722735f7190300e85d9fddc334a53180fb65711cffaee32df4520f?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/1.gravatar.com\/avatar\/7b7041979f722735f7190300e85d9fddc334a53180fb65711cffaee32df4520f?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/pergelator.blogspot.com\" rel=\"ugc external 
nofollow\">Charles Pergiel<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3608\">June 16, 2018 at 15:38<\/a><\/div>\n<p>The graph is too small, the legend is unreadable.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3608#respond\" rel=\"nofollow\" data-commentid=\"3608\" data-postid=\"9623\" data-belowelement=\"div-comment-3608\" data-respondelement=\"respond\" data-replyto=\"Reply to Charles Pergiel\" aria-label=\"Reply to Charles Pergiel\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3609\" class=\"comment byuser comment-author-sagan1338 bypostauthor even depth-2\">\n<div id=\"div-comment-3609\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" 
height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/probablydance.wordpress.com\" rel=\"ugc external nofollow\">Malte Skarupke<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3609\">June 16, 2018 at 16:34<\/a><\/div>\n<p>Thanks for pointing it out. I made the font larger.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3609#respond\" rel=\"nofollow\" data-commentid=\"3609\" data-postid=\"9623\" data-belowelement=\"div-comment-3609\" data-respondelement=\"respond\" data-replyto=\"Reply to Malte Skarupke\" aria-label=\"Reply to Malte Skarupke\">Reply<\/a><\/div>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<li id=\"comment-3610\" class=\"comment byuser comment-author-cpergiel odd alt thread-odd thread-alt depth-1\">\n<div id=\"div-comment-3610\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/1.gravatar.com\/avatar\/7b7041979f722735f7190300e85d9fddc334a53180fb65711cffaee32df4520f?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/1.gravatar.com\/avatar\/7b7041979f722735f7190300e85d9fddc334a53180fb65711cffaee32df4520f?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/1.gravatar.com\/avatar\/7b7041979f722735f7190300e85d9fddc334a53180fb65711cffaee32df4520f?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/1.gravatar.com\/avatar\/7b7041979f722735f7190300e85d9fddc334a53180fb65711cffaee32df4520f?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/1.gravatar.com\/avatar\/7b7041979f722735f7190300e85d9fddc334a53180fb65711cffaee32df4520f?s=144&amp;d=identicon&amp;r=G 3x, 
https:\/\/1.gravatar.com\/avatar\/7b7041979f722735f7190300e85d9fddc334a53180fb65711cffaee32df4520f?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/pergelator.blogspot.com\" rel=\"ugc external nofollow\">Charles Pergiel<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3610\">June 16, 2018 at 16:37<\/a><\/div>\n<p>In the graphs of the odds of a bit changing, blue and black are a little hard to tell apart, but you probably already know that.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3610#respond\" rel=\"nofollow\" data-commentid=\"3610\" data-postid=\"9623\" data-belowelement=\"div-comment-3610\" data-respondelement=\"respond\" data-replyto=\"Reply to Charles Pergiel\" aria-label=\"Reply to Charles Pergiel\">Reply<\/a><\/div>\n<\/div>\n<\/li>\n<li id=\"comment-3612\" class=\"comment even thread-even depth-1 parent\">\n<div id=\"div-comment-3612\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=72&amp;d=identicon&amp;r=G 1.5x, 
https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/www.sesse.net\/\" rel=\"ugc external nofollow\">Steinar H. Gunderson<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3612\">June 16, 2018 at 16:47<\/a><\/div>\n<p>For a production example, Snappy uses multiplicative hashing:<\/p>\n<p><a href=\"https:\/\/github.com\/google\/snappy\/blob\/master\/snappy.cc#L65\" rel=\"nofollow ugc\">https:\/\/github.com\/google\/snappy\/blob\/master\/snappy.cc#L65<\/a><\/p>\n<p>Originally (before the open-source release), it used a different constant that was much closer to the golden ratio, which also worked well. I remember at some point just brute-forcing all 2^32 constants to see if any did markedly better on the test set, but it really doesn\u2019t matter all that much.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3612#respond\" rel=\"nofollow\" data-commentid=\"3612\" data-postid=\"9623\" data-belowelement=\"div-comment-3612\" data-respondelement=\"respond\" data-replyto=\"Reply to Steinar H. Gunderson\" aria-label=\"Reply to Steinar H. 
Gunderson\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3614\" class=\"comment byuser comment-author-sagan1338 bypostauthor odd alt depth-2 parent\">\n<div id=\"div-comment-3614\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/probablydance.wordpress.com\" rel=\"ugc external nofollow\">Malte Skarupke<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3614\">June 16, 2018 at 17:04<\/a><\/div>\n<p>Thanks! Do you know why that constant was chosen? I also looked into a lot of different constants as alternatives to the golden ratio, but every time that I tried a new number I discovered a new way that multiplicative hashing can break. In the end I had to conclude that the golden ratio really is a very good choice. 
I am certain that there are other good choices out there, and I wouldn\u2019t be surprised if there were better choices than the golden ratio, I just don\u2019t know how to go looking for them.<\/p>\n<p>Oh and the ways in which it can break can be subtle. Like for a certain constant I might find \u201cif the hash table is either size 32768 or size 65536 then you get more collisions, in all other cases it\u2019s better than the golden ratio.\u201d Or I would find it being better in the general case but being worse in specific tests where I insert numbers with specific patterns. It was really hard to predict which number would do well, but somehow the golden ratio always did well.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3614#respond\" rel=\"nofollow\" data-commentid=\"3614\" data-postid=\"9623\" data-belowelement=\"div-comment-3614\" data-respondelement=\"respond\" data-replyto=\"Reply to Malte Skarupke\" aria-label=\"Reply to Malte Skarupke\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3641\" class=\"comment byuser comment-author-darkmistfire even depth-3\">\n<div id=\"div-comment-3641\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/8a125555c4d6fdd4309fa49eb632be0280ae9c939cc309ff95c65496c43158d6?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/8a125555c4d6fdd4309fa49eb632be0280ae9c939cc309ff95c65496c43158d6?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/8a125555c4d6fdd4309fa49eb632be0280ae9c939cc309ff95c65496c43158d6?s=72&amp;d=identicon&amp;r=G 1.5x, 
https:\/\/2.gravatar.com\/avatar\/8a125555c4d6fdd4309fa49eb632be0280ae9c939cc309ff95c65496c43158d6?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/8a125555c4d6fdd4309fa49eb632be0280ae9c939cc309ff95c65496c43158d6?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/8a125555c4d6fdd4309fa49eb632be0280ae9c939cc309ff95c65496c43158d6?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/walcomdynamics.wordpress.com\" rel=\"ugc external nofollow\">darkmistfire<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3641\">June 17, 2018 at 12:22<\/a><\/div>\n<p>The brotli code actually has some of the properties they were looking for when searching out this number as well as referencing some heuristic of which they ran tests in order to optimize this for. 
Interesting you can see this number popping up in a number of open sourced google tools that need extremely fast hashing (snappy, brotli, webp).<\/p>\n<p><a href=\"https:\/\/github.com\/google\/brotli\/blob\/master\/c\/enc\/hash.h#L79\" rel=\"nofollow ugc\">https:\/\/github.com\/google\/brotli\/blob\/master\/c\/enc\/hash.h#L79<\/a><\/p>\n<\/div>\n<\/li>\n<li id=\"comment-3668\" class=\"comment odd alt depth-3\">\n<div id=\"div-comment-3668\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/www.sesse.net\/\" rel=\"ugc external nofollow\">Steinar H. Gunderson<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3668\">June 18, 2018 at 03:29<\/a><\/div>\n<p>The choice of constant was in a sense a bit political (I never really agreed with it). 
IIRC it was based on good distribution of some common English byte sequences (e.g. \u201d the\u201d and similar) in Brotli, and someone(TM) wanted Snappy and Brotli to use the same hash constant. It didn\u2019t really matter much for any of the tests we had at the time, so it went through.<\/p>\n<p>The previous one was a) close to the golden ratio, b) prime, and c) had each sub-byte within some special set of ranges as described by Knuth. IIRC all of these actually helped a tiny bit on the benchmarks when we changed the hash (it was originally something that was home-grown and both quite bad and slow), but the latter was so microscopic that it could just have been luck.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-3716\" class=\"comment even depth-3\">\n<div id=\"div-comment-3716\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/6df8f315ab71e3c11b2f80b19cb3226118bfde092a24639efe4104f86844a782?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/6df8f315ab71e3c11b2f80b19cb3226118bfde092a24639efe4104f86844a782?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/6df8f315ab71e3c11b2f80b19cb3226118bfde092a24639efe4104f86844a782?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/6df8f315ab71e3c11b2f80b19cb3226118bfde092a24639efe4104f86844a782?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/6df8f315ab71e3c11b2f80b19cb3226118bfde092a24639efe4104f86844a782?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/6df8f315ab71e3c11b2f80b19cb3226118bfde092a24639efe4104f86844a782?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Peter de Heer<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a 
href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3716\">July 2, 2018 at 05:45<\/a><\/div>\n<p>Just a crazy out of the blue idea here.<\/p>\n<div class=\"embed-youtube\"><iframe loading=\"lazy\" title=\"Beyond the Golden Ratio | Infinite Series\" src=\"https:\/\/www.youtube.com\/embed\/MIxvZ6jwTuA?feature=oembed\" width=\"650\" height=\"366\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\" data-mce-fragment=\"1\"><\/iframe><\/div>\n<p>Next to the gold, there are other \u201cmetallic\u201d ratios. If cheap enough to do, why not pick two of those available, execute them in parallel (well the CPU will do that for you) and mix the result?<\/p>\n<p>My guesstimate is that it will no longer match naturally occurring patterns, while still showing some sort of unique spread. That or the emergence of a disastrous interfering pattern, but I think that is a bit less likely with these kinds of ratios.<\/p>\n<p>Now onto a totally different line of thought. Are multipliers not known for cooking your CPU, like in consuming lots of power? That should have an effect on clock-speed in stressing scenarios. There must be something to say for choosing an energy-efficient method in today\u2019s auto-tuning CPUs with many cores on the same chip.<\/p>\n<p>Granted a measured 2x speedup for common cases is likely worth it, but in less extreme situations, higher overall clocks are a safer bet I think. 
But man, will this be hard to benchmark objectively.<\/p>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li id=\"comment-3615\" class=\"comment odd alt thread-odd thread-alt depth-1\">\n<div id=\"div-comment-3615\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/5b79175a120778d5d23b80623cc99f58ae8ea6b7de60fa146ce4fd2a220e3aa5?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/5b79175a120778d5d23b80623cc99f58ae8ea6b7de60fa146ce4fd2a220e3aa5?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/5b79175a120778d5d23b80623cc99f58ae8ea6b7de60fa146ce4fd2a220e3aa5?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/5b79175a120778d5d23b80623cc99f58ae8ea6b7de60fa146ce4fd2a220e3aa5?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/5b79175a120778d5d23b80623cc99f58ae8ea6b7de60fa146ce4fd2a220e3aa5?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/5b79175a120778d5d23b80623cc99f58ae8ea6b7de60fa146ce4fd2a220e3aa5?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Marcelo Albertini<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3615\">June 16, 2018 at 18:06<\/a><\/div>\n<p>Nice article. 
Your hash function seems to be an instance of a near-universal family called binary multiplicative hashing.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3615#respond\" rel=\"nofollow\" data-commentid=\"3615\" data-postid=\"9623\" data-belowelement=\"div-comment-3615\" data-respondelement=\"respond\" data-replyto=\"Reply to Marcelo Albertini\" aria-label=\"Reply to Marcelo Albertini\">Reply<\/a><\/div>\n<\/div>\n<\/li>\n<li id=\"comment-3616\" class=\"comment even thread-even depth-1 parent\">\n<div id=\"div-comment-3616\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/1.gravatar.com\/avatar\/4eb14ea35872fefbdd3e511f704fa64488840d02c6612dd041f83590724f48ab?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/1.gravatar.com\/avatar\/4eb14ea35872fefbdd3e511f704fa64488840d02c6612dd041f83590724f48ab?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/1.gravatar.com\/avatar\/4eb14ea35872fefbdd3e511f704fa64488840d02c6612dd041f83590724f48ab?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/1.gravatar.com\/avatar\/4eb14ea35872fefbdd3e511f704fa64488840d02c6612dd041f83590724f48ab?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/1.gravatar.com\/avatar\/4eb14ea35872fefbdd3e511f704fa64488840d02c6612dd041f83590724f48ab?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/1.gravatar.com\/avatar\/4eb14ea35872fefbdd3e511f704fa64488840d02c6612dd041f83590724f48ab?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">allenz<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3616\">June 16, 2018 
at 19:23<\/a><\/div>\n<p>If you mainly care about speed, you should use xor + bit mask instead of a multiplicative hash:<\/p>\n<p>Android (C++): <a href=\"https:\/\/android.googlesource.com\/platform\/system\/core\/+\/master\/libcutils\/hashmap.cpp#83\" rel=\"nofollow ugc\">https:\/\/android.googlesource.com\/platform\/system\/core\/+\/master\/libcutils\/hashmap.cpp#83<\/a><\/p>\n<p>Java SDK: <a href=\"http:\/\/hg.openjdk.java.net\/jdk8\/jdk8\/jdk\/file\/687fd7c7986d\/src\/share\/classes\/java\/util\/HashMap.java#l320\" rel=\"nofollow ugc\">http:\/\/hg.openjdk.java.net\/jdk8\/jdk8\/jdk\/file\/687fd7c7986d\/src\/share\/classes\/java\/util\/HashMap.java#l320<\/a><\/p>\n<p>By the way, I found this post from Hacker News: <a href=\"https:\/\/news.ycombinator.com\/item?id=17328756\" rel=\"nofollow ugc\">https:\/\/news.ycombinator.com\/item?id=17328756<\/a><\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3616#respond\" rel=\"nofollow\" data-commentid=\"3616\" data-postid=\"9623\" data-belowelement=\"div-comment-3616\" data-respondelement=\"respond\" data-replyto=\"Reply to allenz\" aria-label=\"Reply to allenz\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3630\" class=\"comment byuser comment-author-sagan1338 bypostauthor odd alt depth-2 parent\">\n<div id=\"div-comment-3630\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G 1x, 
https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/probablydance.wordpress.com\" rel=\"ugc external nofollow\">Malte Skarupke<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3630\">June 17, 2018 at 07:14<\/a><\/div>\n<p>I think if you go down the path of xor+shift, and you try to make it good, you will get Knuth\u2019s third hash function, which I briefly mentioned in the blog post. And I think it\u2019s worth investigating, I just don\u2019t have the time to do it. My prediction is that it may be faster for 32 bit hashes but it will be slower for 64 bit hashes because you have to do more operations to mix in all the bits.<\/p>\n<p>The C++ code you linked to is already slower than Fibonacci hashing. It does two additions, two xors, one bit negation, four shifts and one binary and. The single multiplication followed by a single shift of Fibonacci hashing is going to be faster than that. Just to be sure I actually measured it, and it just comes out slower. Which isn\u2019t surprising because there is no reason why it should be faster.<\/p>\n<p>The other thing to consider with that C++ code is that it is written for a 32 bit hashtable. 
If you\u2019re on a 64 bit platform you have to mix in more bits, so you probably have to do even more operations and it would be even slower. Or you just continue to use 32 bit hashes even on 64 bit platforms, but now you\u2019re not compatible with the interface of std::unordered_map. And you can\u2019t just chop off the top 32 bits because the whole point of this was to use all the bits.<\/p>\n<p>The Java code may be fast since it just does a shift, xor and a binary and. But I don\u2019t think it\u2019ll be good. For example I think the very common pattern of inserting pointers will still give you lots of hash collisions with this. If your pointers are eight-byte aligned, you are losing three bits at the bottom. The code is hoping that by shifting exactly 16 bits over you are getting three more helpful bits mixed in. And maybe that\u2019s the case in Java, because maybe they wrote their allocator to work like that, but I wouldn\u2019t rely on that in general. If you\u2019ve looked at a lot of pointers, you see that they have very clear patterns, and so it wouldn\u2019t surprise me if you shift down sixteen bits and find that all your pointers have the same values in the shifted-down bits and you get just as many hash collisions as before. (Oh, and if code uses SIMD instructions, it\u2019s common to just align all heap allocations to sixteen bytes because you can get segfaults for doing certain SIMD instructions on unaligned data. So in my world all pointers are sixteen-byte aligned and you are losing one more bit that you have to make up for with just that single shift.)<\/p>\n<p>Another pattern that would give you problems is if you have two values and you combine them in the hash. For example I once had a grid of data, and I used the X and Y coordinate of the cell as the hash: The X coordinate was in the lower 32 bits, the Y coordinate was in the upper 32 bits. That immediately gives you problems if you just do a single shift by sixteen. 
Half the data is 32 bits over, not 16 bits over, so now your table has to be at least size 65536 before you start using the Y coordinate at all, and you\u2019re getting a massive number of hash collisions before that. Let\u2019s say I compress things so that the X coordinate is in the lower 16 bits and the Y coordinate is in the next 16 bits. Now you get a problem where all diagonal coordinates like (1, 1) hash to the same value, because the xor makes the low 16 bits zero. So the next thing to try would be to have a more complicated hash function on the client side so that the X and Y coordinates contribute in a less predictable pattern. (Maybe use an FNV1 hash like Dinkumware.) But now you\u2019re even slower, just because you had to work around a problem where the hash table had a bad way to map from hash values to slots. (And you\u2019re still paying the cost for that on top of the cost of FNV1.)<\/p>\n<p>You may say that that is a bit of a constructed example (it\u2019s a real example though, honest) but even if you never do this specific thing, there is a very common pattern of having IDs composed of several different bits of information. Like maybe the lowest 32 bits of your ID are just an incrementing counter, the next 8 bits of the ID indicate a type, the next 16 bits have different meaning depending on the type, and the top 8 bits indicate a sort order. I\u2019m sure you have seen constructions like that. And I\u2019m sure you can immediately see how those break with the Java method. (Like any change to the \u201cdifferent meaning\u201d bits will just get ignored if your hash table is small.)<\/p>\n<p>Plus it also suffers from the same problem as the C++ code where it\u2019s written for a 32 bit platform, and for a 64 bit platform you\u2019d want to do more work to use the whole hash. And if you change it so that it does a second shift and a second xor, it\u2019s already slower than Fibonacci hashing. 
So we\u2019d have to be more clever to adapt it to 64 bits, otherwise we might as well use Fibonacci hashing. (more cleverness meaning you\u2019d have to take advantage of instruction level parallelism somehow, which neither the Java code nor the C++ code does)<\/p>\n<p>All that being said I still think the path of \u201cxor + shift\u201d is worth investigating. I would just start off by looking into Knuth\u2019s third hash function, because he actually proves that that has good properties.<\/p>\n<p>And finally, I actually think that the two examples you posted are perfect examples of how much of a shame it is that the world forgot about Fibonacci Hashing. People saw that prime number integer modulo is slow, so they use power of two integer modulo. But now they get lots of problems so they try to deal with those problems, but they\u2019re improvising and it ends up bad or too slow. They should have just used Fibonacci hashing. It\u2019s simply the best trade-off for this problem.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3630#respond\" rel=\"nofollow\" data-commentid=\"3630\" data-postid=\"9623\" data-belowelement=\"div-comment-3630\" data-respondelement=\"respond\" data-replyto=\"Reply to Malte Skarupke\" aria-label=\"Reply to Malte Skarupke\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-4735\" class=\"comment even depth-3\">\n<div id=\"div-comment-4735\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/5349625b453159f8c6ec10228ca97f2f43d95cb338c7a7ec96c7d1225f23cab5?s=48&amp;d=identicon&amp;r=G\" 
srcset=\"https:\/\/2.gravatar.com\/avatar\/5349625b453159f8c6ec10228ca97f2f43d95cb338c7a7ec96c7d1225f23cab5?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/5349625b453159f8c6ec10228ca97f2f43d95cb338c7a7ec96c7d1225f23cab5?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/5349625b453159f8c6ec10228ca97f2f43d95cb338c7a7ec96c7d1225f23cab5?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/5349625b453159f8c6ec10228ca97f2f43d95cb338c7a7ec96c7d1225f23cab5?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/5349625b453159f8c6ec10228ca97f2f43d95cb338c7a7ec96c7d1225f23cab5?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Nima<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-4735\">November 30, 2018 at 20:35<\/a><\/div>\n<p>I don\u2019t know what you mean by Knuth\u2019s third hash function, but in the hardware world, building hash tables with randomly-generated CRC-like matrices is very standard and those hash functions have a lot of great properties. The problem is that those don\u2019t translate well to Intel instructions. The CRC instructions don\u2019t mix bits well. If you\u2019re taking the hit to use AVX\/AVX2 instructions, you can do something similar to a hardware hash table and be very happy, but the instruction path might be a little long through SSE instructions only (although still full-throughput unlike integer modulo), and the cost is going to grow as the hash table gets bigger. 
That being said, I\u2019m not all that familiar with how people tend to handle the issues related to power of 2 hashing, so there\u2019s a chance that even doing the \u201cslow\u201d thing is fast enough.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-5329\" class=\"comment odd alt depth-3\">\n<div id=\"div-comment-5329\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/6ea76dce6fcaf93a10e00d04b6bdff9bd85a2cdc0acc4e504a98bd1c0f2406e4?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/6ea76dce6fcaf93a10e00d04b6bdff9bd85a2cdc0acc4e504a98bd1c0f2406e4?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/6ea76dce6fcaf93a10e00d04b6bdff9bd85a2cdc0acc4e504a98bd1c0f2406e4?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/6ea76dce6fcaf93a10e00d04b6bdff9bd85a2cdc0acc4e504a98bd1c0f2406e4?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/6ea76dce6fcaf93a10e00d04b6bdff9bd85a2cdc0acc4e504a98bd1c0f2406e4?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/6ea76dce6fcaf93a10e00d04b6bdff9bd85a2cdc0acc4e504a98bd1c0f2406e4?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">St\u00e9phane L.<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-5329\">March 12, 2019 at 11:00<\/a><\/div>\n<p>Great article as always, and thank you for all that useful information you provide!<\/p>\n<p>Now at the risk of being nitpicky (but I also write that as reference for whoever will read this thread in the future), I think your assumptions regarding Java are not correct. 
Although Java objects are indeed 8-byte aligned in the JVM heap, you never manipulate the actual addresses in (non-native) Java code, let alone when hashing object references. There seems to be a widespread idea that Object\u2019s default hashCode implementation (the so-called \u201cidentity hashcode\u201d) uses the address of the instance, but it normally does not, for the very good reason that the JVM may move objects around during GC sweeps (and that, as you point out, it\u2019s not a very good hash anyway since it only uses a limited number of actual bits). This article [https:\/\/srvaroa.github.io\/jvm\/java\/openjdk\/biased-locking\/2017\/01\/30\/hashCode.html] explains what really happens in OpenJDK\u2019s HotSpot, for instance; it is quite involved and has evolved across different Java versions from outright random numbers to a hash based on the state of the current thread. The point is, the JDK HashMap implementation, with its xorshifting and power-of-2 tables, actually works really well with identity hashcodes. 
Now of course, if instead of objects, you hash scalars which tend to be not-so-large multiples of 8\/16\/32\u2026, then your point becomes valid again.<\/p>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li id=\"comment-3618\" class=\"comment even thread-odd thread-alt depth-1\">\n<div id=\"div-comment-3618\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/1.gravatar.com\/avatar\/75cc1d4e64ed530927639869f04547dca99f570f3a3565ede57ee5ad49f218c7?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/1.gravatar.com\/avatar\/75cc1d4e64ed530927639869f04547dca99f570f3a3565ede57ee5ad49f218c7?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/1.gravatar.com\/avatar\/75cc1d4e64ed530927639869f04547dca99f570f3a3565ede57ee5ad49f218c7?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/1.gravatar.com\/avatar\/75cc1d4e64ed530927639869f04547dca99f570f3a3565ede57ee5ad49f218c7?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/1.gravatar.com\/avatar\/75cc1d4e64ed530927639869f04547dca99f570f3a3565ede57ee5ad49f218c7?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/1.gravatar.com\/avatar\/75cc1d4e64ed530927639869f04547dca99f570f3a3565ede57ee5ad49f218c7?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">VG<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3618\">June 17, 2018 at 00:42<\/a><\/div>\n<p>Awesome writeup! 
Thanks for sharing<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3618#respond\" rel=\"nofollow\" data-commentid=\"3618\" data-postid=\"9623\" data-belowelement=\"div-comment-3618\" data-respondelement=\"respond\" data-replyto=\"Reply to VG\" aria-label=\"Reply to VG\">Reply<\/a><\/div>\n<\/div>\n<\/li>\n<li id=\"comment-3619\" class=\"comment byuser comment-author-pablofeo odd alt thread-even depth-1\">\n<div id=\"div-comment-3619\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/1.gravatar.com\/avatar\/45315121cc21fabae721fb746a53c937b13123671df309a55c3d53db0ad1e243?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/1.gravatar.com\/avatar\/45315121cc21fabae721fb746a53c937b13123671df309a55c3d53db0ad1e243?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/1.gravatar.com\/avatar\/45315121cc21fabae721fb746a53c937b13123671df309a55c3d53db0ad1e243?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/1.gravatar.com\/avatar\/45315121cc21fabae721fb746a53c937b13123671df309a55c3d53db0ad1e243?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/1.gravatar.com\/avatar\/45315121cc21fabae721fb746a53c937b13123671df309a55c3d53db0ad1e243?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/1.gravatar.com\/avatar\/45315121cc21fabae721fb746a53c937b13123671df309a55c3d53db0ad1e243?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/leitzakomapa.wordpress.com\" rel=\"ugc external nofollow\">pablofeo<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3619\">June 
17, 2018 at 00:42<\/a><\/div>\n<p>Maths in Nature<\/p>\n<div class=\"embed-youtube\"><iframe loading=\"lazy\" title=\"Nature by Numbers\" src=\"https:\/\/www.youtube.com\/embed\/kkGeOWYOFoA?feature=oembed\" width=\"650\" height=\"366\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\" data-mce-fragment=\"1\"><\/iframe><\/div>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3619#respond\" rel=\"nofollow\" data-commentid=\"3619\" data-postid=\"9623\" data-belowelement=\"div-comment-3619\" data-respondelement=\"respond\" data-replyto=\"Reply to pablofeo\" aria-label=\"Reply to pablofeo\">Reply<\/a><\/div>\n<\/div>\n<\/li>\n<li id=\"comment-3621\" class=\"comment even thread-odd thread-alt depth-1 parent\">\n<div id=\"div-comment-3621\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/f5c8ab1a4a405f877f5caf23a7105cae64785ae147161b93922c6e26020be3e5?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/f5c8ab1a4a405f877f5caf23a7105cae64785ae147161b93922c6e26020be3e5?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/f5c8ab1a4a405f877f5caf23a7105cae64785ae147161b93922c6e26020be3e5?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/f5c8ab1a4a405f877f5caf23a7105cae64785ae147161b93922c6e26020be3e5?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/f5c8ab1a4a405f877f5caf23a7105cae64785ae147161b93922c6e26020be3e5?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/f5c8ab1a4a405f877f5caf23a7105cae64785ae147161b93922c6e26020be3e5?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" 
href=\"http:\/\/atom-symbol.net\" rel=\"ugc external nofollow\">Jan Ziak<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3621\">June 17, 2018 at 01:23<\/a><\/div>\n<p>How is multiplication by 11400714819323198485 different from multiplication by a large prime number?<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3621#respond\" rel=\"nofollow\" data-commentid=\"3621\" data-postid=\"9623\" data-belowelement=\"div-comment-3621\" data-respondelement=\"respond\" data-replyto=\"Reply to Jan Ziak\" aria-label=\"Reply to Jan Ziak\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3625\" class=\"comment byuser comment-author-sagan1338 bypostauthor odd alt depth-2 parent\">\n<div id=\"div-comment-3625\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=144&amp;d=identicon&amp;r=G 3x, 
https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/probablydance.wordpress.com\" rel=\"ugc external nofollow\">Malte Skarupke<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3625\">June 17, 2018 at 06:08<\/a><\/div>\n<p>I don\u2019t know if it matters here whether the number is prime or not. The properties that you want for multiplicative hashing are different, such as \u201cthere shouldn\u2019t be too many zero bits in a row.\u201d Prime numbers may or may not have that property.<\/p>\n<p>The one thing that you definitely need is that the constant comes from a division of <img decoding=\"async\" class=\"latex\" src=\"https:\/\/s0.wp.com\/latex.php?latex=2%5E%7B64%7D&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002\" srcset=\"https:\/\/s0.wp.com\/latex.php?latex=2%5E%7B64%7D&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=2%5E%7B64%7D&amp;bg=fff&amp;fg=444444&amp;s=0&amp;c=20201002&amp;zoom=4.5 4x\" alt=\"2^{64}\" \/> by an irrational number. If you can accidentally get there by dividing by an integer or by a rational number, then when you wrap around you will eventually get back to the starting position, and you won\u2019t use all the slots in the hash table. For example, here is a large prime that would be a terrible choice: 11068046444225730979. I found that number by entering \u201cnearestprime(2^64\/(5\/3))\u201d in Wolfram Alpha. This number comes from a division by a rational number and will not visit all the slots in your hash table as it wraps around. I just tried this for a table of size 8, and it only visits slots 0, 1, 3, 4 and 6. 
For larger tables it\u2019s even more wasteful.<\/p>\n<p>I was playing around with different constants before posting the blog post, but I only ever found bad constants, (and found lots of new ways that constants can be bad) so I left that part out of the text.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3625#respond\" rel=\"nofollow\" data-commentid=\"3625\" data-postid=\"9623\" data-belowelement=\"div-comment-3625\" data-respondelement=\"respond\" data-replyto=\"Reply to Malte Skarupke\" aria-label=\"Reply to Malte Skarupke\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3629\" class=\"comment byuser comment-author-77ae1a014f even depth-3\">\n<div id=\"div-comment-3629\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/f5c8ab1a4a405f877f5caf23a7105cae64785ae147161b93922c6e26020be3e5?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/f5c8ab1a4a405f877f5caf23a7105cae64785ae147161b93922c6e26020be3e5?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/f5c8ab1a4a405f877f5caf23a7105cae64785ae147161b93922c6e26020be3e5?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/f5c8ab1a4a405f877f5caf23a7105cae64785ae147161b93922c6e26020be3e5?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/f5c8ab1a4a405f877f5caf23a7105cae64785ae147161b93922c6e26020be3e5?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/f5c8ab1a4a405f877f5caf23a7105cae64785ae147161b93922c6e26020be3e5?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">atomsymbol (<img decoding=\"async\" class=\"emoji\" role=\"img\" 
draggable=\"false\" src=\"https:\/\/s0.wp.com\/wp-content\/mu-plugins\/wpcom-smileys\/twemoji\/2\/svg\/269b.svg\" alt=\"\u269b\" \/>)<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3629\">June 17, 2018 at 07:14<\/a><\/div>\n<p>Taking the lowest bits, instead of the highest bits, of the multiplication provides an even distribution.<\/p>\n<p>The GCD of a prime number other than 2 and 2^N is 1, which means it outputs the whole permutation as it wraps around the powers of 2.<\/p>\n<p>Taking the lowest bits gives the remainder of dividing by 2^K.<\/p>\n<p>#include &lt;stdio.h&gt;<br \/>\n#include &lt;stdint.h&gt;<\/p>\n<p>int main() {<br \/>\nconst uint64_t m = 11068046444225730979llu; \/\/ prime<br \/>\n\/\/const uint64_t m = 18446744073709551629llu; \/\/ nearest prime 2^64<br \/>\n\/\/const uint64_t m = 3; \/\/ smallest prime other than 2<br \/>\n#define K 1024<br \/>\nint count[K] = {0};<br \/>\nfor(uint64_t i=0; i&lt;100*K; i++) {<br \/>\nuint64_t h = (i*m) &amp; (K-1);<br \/>\nprintf(&quot;%llu\\n&quot;, (unsigned long long)h);<br \/>\ncount[h]++;<br \/>\n}<br \/>\nprintf(&quot;\\n&quot;);<br \/>\nfor(size_t i=0; i&lt;K; i++) {<br \/>\nprintf(&quot;%d\\n&quot;, count[i]);<br \/>\n}<br \/>\nreturn 0;<br \/>\n}<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-3639\" class=\"comment byuser comment-author-sagan1338 bypostauthor odd alt depth-3\">\n<div id=\"div-comment-3639\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 
wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/probablydance.wordpress.com\" rel=\"ugc external nofollow\">Malte Skarupke<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3639\">June 17, 2018 at 11:52<\/a><\/div>\n<p>I think that function is discarding the upper bits. In this loop header:<br \/>\nfor(uint64_t i=0; i&lt;100*K; i++)<br \/>\n{<br \/>\n\/\/ \u2026<br \/>\n}<\/p>\n<p>Try doing larger step sizes. 
For example doing this:<br \/>\nuint64_t step_size = 1024 * 1024;<br \/>\nfor(uint64_t i = 0; i &lt; 100 * K * step_size; i += step_size)<br \/>\n{<br \/>\n\/\/ \u2026<br \/>\n}<\/p>\n<p>Now they will all be mapped to zero because now we&#8217;re iterating through the upper bits, and that hash function will discard the upper bits.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-3642\" class=\"comment byuser comment-author-77ae1a014f even depth-3\">\n<div id=\"div-comment-3642\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/f5c8ab1a4a405f877f5caf23a7105cae64785ae147161b93922c6e26020be3e5?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/f5c8ab1a4a405f877f5caf23a7105cae64785ae147161b93922c6e26020be3e5?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/f5c8ab1a4a405f877f5caf23a7105cae64785ae147161b93922c6e26020be3e5?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/f5c8ab1a4a405f877f5caf23a7105cae64785ae147161b93922c6e26020be3e5?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/f5c8ab1a4a405f877f5caf23a7105cae64785ae147161b93922c6e26020be3e5?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/f5c8ab1a4a405f877f5caf23a7105cae64785ae147161b93922c6e26020be3e5?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">atomsymbol (<img decoding=\"async\" class=\"emoji\" role=\"img\" draggable=\"false\" src=\"https:\/\/s0.wp.com\/wp-content\/mu-plugins\/wpcom-smileys\/twemoji\/2\/svg\/269b.svg\" alt=\"\u269b\" \/>)<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3642\">June 17, 2018 at 12:55<\/a><\/div>\n<p>Well, but 
there is no such thing as step_size=1024*1024 in my code, because I always try to keep the randomness in the lower bits or, even better, uniformly distributed over all bits of the 64-bit integer.<\/p>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li id=\"comment-3622\" class=\"comment odd alt thread-even depth-1 parent\">\n<div id=\"div-comment-3622\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/1.gravatar.com\/avatar\/1cb1f5b296c0d7eac5e6a138ad6b61e5c39d4ba57c123693fd556777502c3317?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/1.gravatar.com\/avatar\/1cb1f5b296c0d7eac5e6a138ad6b61e5c39d4ba57c123693fd556777502c3317?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/1.gravatar.com\/avatar\/1cb1f5b296c0d7eac5e6a138ad6b61e5c39d4ba57c123693fd556777502c3317?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/1.gravatar.com\/avatar\/1cb1f5b296c0d7eac5e6a138ad6b61e5c39d4ba57c123693fd556777502c3317?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/1.gravatar.com\/avatar\/1cb1f5b296c0d7eac5e6a138ad6b61e5c39d4ba57c123693fd556777502c3317?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/1.gravatar.com\/avatar\/1cb1f5b296c0d7eac5e6a138ad6b61e5c39d4ba57c123693fd556777502c3317?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">anon<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3622\">June 17, 2018 at 02:35<\/a><\/div>\n<p>Hi! 
Any chance you can upload your benchmarks and also your implementation of google\u2019s new hash table?<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3622#respond\" rel=\"nofollow\" data-commentid=\"3622\" data-postid=\"9623\" data-belowelement=\"div-comment-3622\" data-respondelement=\"respond\" data-replyto=\"Reply to anon\" aria-label=\"Reply to anon\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3631\" class=\"comment byuser comment-author-sagan1338 bypostauthor even depth-2\">\n<div id=\"div-comment-3631\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/probablydance.wordpress.com\" rel=\"ugc external nofollow\">Malte Skarupke<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a 
href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3631\">June 17, 2018 at 07:18<\/a><\/div>\n<p>Yes, sorry, it\u2019s still in the plans. I just got excited by Fibonacci hashing, so that took priority.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3631#respond\" rel=\"nofollow\" data-commentid=\"3631\" data-postid=\"9623\" data-belowelement=\"div-comment-3631\" data-respondelement=\"respond\" data-replyto=\"Reply to Malte Skarupke\" aria-label=\"Reply to Malte Skarupke\">Reply<\/a><\/div>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<li id=\"comment-3623\" class=\"comment odd alt thread-odd thread-alt depth-1 parent\">\n<div id=\"div-comment-3623\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/8863cad8a5a30285ffe8a2ec727a1079999b61c8626b47327238077a5977db38?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/8863cad8a5a30285ffe8a2ec727a1079999b61c8626b47327238077a5977db38?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/8863cad8a5a30285ffe8a2ec727a1079999b61c8626b47327238077a5977db38?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/8863cad8a5a30285ffe8a2ec727a1079999b61c8626b47327238077a5977db38?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/8863cad8a5a30285ffe8a2ec727a1079999b61c8626b47327238077a5977db38?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/8863cad8a5a30285ffe8a2ec727a1079999b61c8626b47327238077a5977db38?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Frank 
Zingsheim<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3623\">June 17, 2018 at 03:22<\/a><\/div>\n<p>You wrote: \u201cthe number 11400714819323198486 is closer but we don\u2019t want multiples of two because that would throw away one bit\u201d<\/p>\n<p>Note: This bit which is lost is used only if you have a bucket size of 2^64 (which is unrealistic). In all other cases where you have a bucket size of 2^n (with n&lt;64) you throw away this bit anyhow by the right shift.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3623#respond\" rel=\"nofollow\" data-commentid=\"3623\" data-postid=\"9623\" data-belowelement=\"div-comment-3623\" data-respondelement=\"respond\" data-replyto=\"Reply to Frank Zingsheim\" aria-label=\"Reply to Frank Zingsheim\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3632\" class=\"comment byuser comment-author-sagan1338 bypostauthor even depth-2 parent\">\n<div id=\"div-comment-3632\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=72&amp;d=identicon&amp;r=G 1.5x, 
https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/probablydance.wordpress.com\" rel=\"ugc external nofollow\">Malte Skarupke<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3632\">June 17, 2018 at 07:21<\/a><\/div>\n<p>You\u2019re actually losing the top-most bit of the input hash. Because if you multiply the top-most bit by anything other than 1, you will overflow and then lose it after truncation. So the 1 bit has to be set in the magic constant. 
And the top-most bit of the input hash is used for any number of buckets, not just for 2^64, because we shift the result down after the multiplication.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3632#respond\" rel=\"nofollow\" data-commentid=\"3632\" data-postid=\"9623\" data-belowelement=\"div-comment-3632\" data-respondelement=\"respond\" data-replyto=\"Reply to Malte Skarupke\" aria-label=\"Reply to Malte Skarupke\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3634\" class=\"comment odd alt depth-3\">\n<div id=\"div-comment-3634\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/8863cad8a5a30285ffe8a2ec727a1079999b61c8626b47327238077a5977db38?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/8863cad8a5a30285ffe8a2ec727a1079999b61c8626b47327238077a5977db38?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/8863cad8a5a30285ffe8a2ec727a1079999b61c8626b47327238077a5977db38?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/8863cad8a5a30285ffe8a2ec727a1079999b61c8626b47327238077a5977db38?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/8863cad8a5a30285ffe8a2ec727a1079999b61c8626b47327238077a5977db38?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/8863cad8a5a30285ffe8a2ec727a1079999b61c8626b47327238077a5977db38?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Frank Zingsheim<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a 
href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3634\">June 17, 2018 at 08:39<\/a><\/div>\n<p>Yes you are right. Thank you for the clarification.<\/p>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li id=\"comment-3628\" class=\"comment even thread-even depth-1 parent\">\n<div id=\"div-comment-3628\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/b60546a4e2520e6d6c193fb6ee3d7ace5f028845de7138defe1b8161f29f12e2?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/b60546a4e2520e6d6c193fb6ee3d7ace5f028845de7138defe1b8161f29f12e2?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/b60546a4e2520e6d6c193fb6ee3d7ace5f028845de7138defe1b8161f29f12e2?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/b60546a4e2520e6d6c193fb6ee3d7ace5f028845de7138defe1b8161f29f12e2?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/b60546a4e2520e6d6c193fb6ee3d7ace5f028845de7138defe1b8161f29f12e2?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/b60546a4e2520e6d6c193fb6ee3d7ace5f028845de7138defe1b8161f29f12e2?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/gravatar.com\/myhalloweencostumerocks\" rel=\"ugc external nofollow\">cornholio<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3628\">June 17, 2018 at 06:38<\/a><\/div>\n<p>Since Fibonacci multiples have essentially random values for the lower bits, I wonder if you can break up any remaining ill conditioned patterns by 
XOR-ing the result of the multiplication with the input value itself. Intuitively, it should combine the strong hashing power of the multiplication with the excellent spreading you would otherwise see for a Fibonacci pattern mapped 1:1 in a power of 2 sized table.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3628#respond\" rel=\"nofollow\" data-commentid=\"3628\" data-postid=\"9623\" data-belowelement=\"div-comment-3628\" data-respondelement=\"respond\" data-replyto=\"Reply to cornholio\" aria-label=\"Reply to cornholio\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3633\" class=\"comment byuser comment-author-sagan1338 bypostauthor odd alt depth-2 parent\">\n<div id=\"div-comment-3633\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" 
href=\"https:\/\/probablydance.wordpress.com\" rel=\"ugc external nofollow\">Malte Skarupke<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3633\">June 17, 2018 at 07:23<\/a><\/div>\n<p>I think that might work, but you\u2019re adding two instructions to the critical path there. You have to do the xor, and then you have to do another binary and to get rid of any bits that are too high. The other problem with that approach is that you\u2019re not using the top-bits more. You are only using the bottom bits more. So I think you\u2019re only going to gain a little at the cost of making all lookups more expensive.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3633#respond\" rel=\"nofollow\" data-commentid=\"3633\" data-postid=\"9623\" data-belowelement=\"div-comment-3633\" data-respondelement=\"respond\" data-replyto=\"Reply to Malte Skarupke\" aria-label=\"Reply to Malte Skarupke\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3636\" class=\"comment even depth-3\">\n<div id=\"div-comment-3636\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/b60546a4e2520e6d6c193fb6ee3d7ace5f028845de7138defe1b8161f29f12e2?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/b60546a4e2520e6d6c193fb6ee3d7ace5f028845de7138defe1b8161f29f12e2?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/b60546a4e2520e6d6c193fb6ee3d7ace5f028845de7138defe1b8161f29f12e2?s=72&amp;d=identicon&amp;r=G 1.5x, 
https:\/\/2.gravatar.com\/avatar\/b60546a4e2520e6d6c193fb6ee3d7ace5f028845de7138defe1b8161f29f12e2?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/b60546a4e2520e6d6c193fb6ee3d7ace5f028845de7138defe1b8161f29f12e2?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/b60546a4e2520e6d6c193fb6ee3d7ace5f028845de7138defe1b8161f29f12e2?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/gravatar.com\/myhalloweencostumerocks\" rel=\"ugc external nofollow\">cornholio<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3636\">June 17, 2018 at 09:27<\/a><\/div>\n<p>I think it\u2019s just a single instruction, since you are already clamping the upper bits to fit the table. For a power of 2 table, something like this:<br \/>\nreturn ((hash * 11400714819323198485llu) ^ hash) &gt;&gt; 54;<\/p>\n<p>Depending on the exact hardware architecture, an extra XOR could come almost free compared with a 64 bit multiplication and barrel shift. 
The sole purpose of this is to gain generality over your algorithm in the face of \u201cmultiple of N\u201d data, where N is variable and performance suddenly drops for magic values of N.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-3637\" class=\"comment odd alt depth-3\">\n<div id=\"div-comment-3637\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/b60546a4e2520e6d6c193fb6ee3d7ace5f028845de7138defe1b8161f29f12e2?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/b60546a4e2520e6d6c193fb6ee3d7ace5f028845de7138defe1b8161f29f12e2?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/b60546a4e2520e6d6c193fb6ee3d7ace5f028845de7138defe1b8161f29f12e2?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/b60546a4e2520e6d6c193fb6ee3d7ace5f028845de7138defe1b8161f29f12e2?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/b60546a4e2520e6d6c193fb6ee3d7ace5f028845de7138defe1b8161f29f12e2?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/b60546a4e2520e6d6c193fb6ee3d7ace5f028845de7138defe1b8161f29f12e2?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/gravatar.com\/myhalloweencostumerocks\" rel=\"ugc external nofollow\">cornholio<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3637\">June 17, 2018 at 09:41<\/a><\/div>\n<p>Oops, I see now that won\u2019t work since you\u2019re shifting out the lower bits. 
Indeed, it needs two extra instructions, at which point your shift-xor scheme before multiplication is probably preferable, with a strong avalanche to other bits.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-3638\" class=\"comment byuser comment-author-sagan1338 bypostauthor even depth-3\">\n<div id=\"div-comment-3638\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/probablydance.wordpress.com\" rel=\"ugc external nofollow\">Malte Skarupke<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3638\">June 17, 2018 at 11:33<\/a><\/div>\n<p>Yeah I think if you do the xor before the shift, you get a problem with the top-most bit. That bit will just be xor-ed out. 
So the contribution of the top-most bit of the input would disappear.<\/p>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li id=\"comment-3643\" class=\"comment odd alt thread-odd thread-alt depth-1 parent\">\n<div id=\"div-comment-3643\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/69cab060603b312c511650d43999c447f3abc8d0b5e97ead0c01453de5e286a4?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/69cab060603b312c511650d43999c447f3abc8d0b5e97ead0c01453de5e286a4?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/69cab060603b312c511650d43999c447f3abc8d0b5e97ead0c01453de5e286a4?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/69cab060603b312c511650d43999c447f3abc8d0b5e97ead0c01453de5e286a4?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/69cab060603b312c511650d43999c447f3abc8d0b5e97ead0c01453de5e286a4?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/69cab060603b312c511650d43999c447f3abc8d0b5e97ead0c01453de5e286a4?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/twitter.com\/richgel999\" rel=\"ugc external nofollow\">Rich Geldreich (@richgel999)<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3643\">June 17, 2018 at 14:55<\/a><\/div>\n<p>Great article. 
Glad my comment was useful!<\/p>\n<p>I\u2019ve read Knuth\u2019s \u201cSorting and Searching\u201d from cover to cover a few times, always on the lookout for new\/interesting algorithms.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3643#respond\" rel=\"nofollow\" data-commentid=\"3643\" data-postid=\"9623\" data-belowelement=\"div-comment-3643\" data-respondelement=\"respond\" data-replyto=\"Reply to Rich Geldreich (@richgel999)\" aria-label=\"Reply to Rich Geldreich (@richgel999)\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3644\" class=\"comment even depth-2\">\n<div id=\"div-comment-3644\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/69cab060603b312c511650d43999c447f3abc8d0b5e97ead0c01453de5e286a4?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/69cab060603b312c511650d43999c447f3abc8d0b5e97ead0c01453de5e286a4?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/69cab060603b312c511650d43999c447f3abc8d0b5e97ead0c01453de5e286a4?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/69cab060603b312c511650d43999c447f3abc8d0b5e97ead0c01453de5e286a4?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/69cab060603b312c511650d43999c447f3abc8d0b5e97ead0c01453de5e286a4?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/69cab060603b312c511650d43999c447f3abc8d0b5e97ead0c01453de5e286a4?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/twitter.com\/richgel999\" rel=\"ugc external nofollow\">Rich Geldreich (@richgel999)<\/a><\/cite><\/div>\n<div 
class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3644\">June 17, 2018 at 15:03<\/a><\/div>\n<p>(\u201cnew\u201d meaning algorithms or methods that modern software engineers have ignored or forgotten)<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3644#respond\" rel=\"nofollow\" data-commentid=\"3644\" data-postid=\"9623\" data-belowelement=\"div-comment-3644\" data-respondelement=\"respond\" data-replyto=\"Reply to Rich Geldreich (@richgel999)\" aria-label=\"Reply to Rich Geldreich (@richgel999)\">Reply<\/a><\/div>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<li id=\"comment-3674\" class=\"comment byuser comment-author-berkamin odd alt thread-even depth-1 parent\">\n<div id=\"div-comment-3674\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/6cecdbe28628de8796b8b4b7b86fb58968226e34f004bcdd5f19df504342d275?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/6cecdbe28628de8796b8b4b7b86fb58968226e34f004bcdd5f19df504342d275?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/6cecdbe28628de8796b8b4b7b86fb58968226e34f004bcdd5f19df504342d275?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/6cecdbe28628de8796b8b4b7b86fb58968226e34f004bcdd5f19df504342d275?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/6cecdbe28628de8796b8b4b7b86fb58968226e34f004bcdd5f19df504342d275?s=144&amp;d=identicon&amp;r=G 3x, 
https:\/\/0.gravatar.com\/avatar\/6cecdbe28628de8796b8b4b7b86fb58968226e34f004bcdd5f19df504342d275?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/gravatar.com\/berkamin\" rel=\"ugc external nofollow\">Berkana<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3674\">June 18, 2018 at 16:49<\/a><\/div>\n<p>Correct me if I\u2019m mistaken, but wouldn\u2019t any of the irrational square roots of whole numbers work just as well as phi in this case? By being an irrational number, the same property of distributing petals around a circle without overlapping should persist. Maybe using one of these would resolve the problem of certain numbers not working well with the Fibonacci hash.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3674#respond\" rel=\"nofollow\" data-commentid=\"3674\" data-postid=\"9623\" data-belowelement=\"div-comment-3674\" data-respondelement=\"respond\" data-replyto=\"Reply to Berkana\" aria-label=\"Reply to Berkana\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3676\" class=\"comment even depth-2\">\n<div id=\"div-comment-3676\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=48&amp;d=identicon&amp;r=G 1x, 
https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/www.sesse.net\/\" rel=\"ugc external nofollow\">Steinar H. Gunderson<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3676\">June 19, 2018 at 04:33<\/a><\/div>\n<p>Only if you disregard rounding, I believe. The beauty of Fibonacci hashing is that with incrementing inputs, the output will always fall in the largest open interval (i.e. as far as possible from any earlier element).<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3676#respond\" rel=\"nofollow\" data-commentid=\"3676\" data-postid=\"9623\" data-belowelement=\"div-comment-3676\" data-respondelement=\"respond\" data-replyto=\"Reply to Steinar H. Gunderson\" aria-label=\"Reply to Steinar H. 
Gunderson\">Reply<\/a><\/div>\n<\/div>\n<\/li>\n<li id=\"comment-3677\" class=\"comment byuser comment-author-sagan1338 bypostauthor odd alt depth-2 parent\">\n<div id=\"div-comment-3677\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/probablydance.wordpress.com\" rel=\"ugc external nofollow\">Malte Skarupke<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3677\">June 19, 2018 at 04:38<\/a><\/div>\n<p>Yes, any irrational number will wrap around without getting back to the starting point. And I actually spent a good amount of time searching for a replacement number that has fewer problematic numbers.<\/p>\n<p>But I just found lots of ways to break multiplicative hashing. Like if there are too many zero bits in the number, that causes problems. 
Or you try a number and it\u2019s not obviously bad, but some of your benchmarks suddenly run 10% slower, indicating that there are more hash collisions, and it\u2019s hard to find out where they come from.<\/p>\n<p>The golden ratio however did well in all my benchmarks. It\u2019s only when you stop using it that you discover how many problems you run into, and how careful the choice has to be.<\/p>\n<p>I still think there are other good numbers out there, but I don\u2019t know how to find them. But it wouldn\u2019t surprise me if there was a better choice out there than the golden ratio.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3677#respond\" rel=\"nofollow\" data-commentid=\"3677\" data-postid=\"9623\" data-belowelement=\"div-comment-3677\" data-respondelement=\"respond\" data-replyto=\"Reply to Malte Skarupke\" aria-label=\"Reply to Malte Skarupke\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3679\" class=\"comment even depth-3\">\n<div id=\"div-comment-3679\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/20f86699f03ffe4b77cd842a0bb5f3bcab9620c49f3e25072dd80c302ea56f85?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/20f86699f03ffe4b77cd842a0bb5f3bcab9620c49f3e25072dd80c302ea56f85?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/20f86699f03ffe4b77cd842a0bb5f3bcab9620c49f3e25072dd80c302ea56f85?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/20f86699f03ffe4b77cd842a0bb5f3bcab9620c49f3e25072dd80c302ea56f85?s=96&amp;d=identicon&amp;r=G 2x, 
https:\/\/2.gravatar.com\/avatar\/20f86699f03ffe4b77cd842a0bb5f3bcab9620c49f3e25072dd80c302ea56f85?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/20f86699f03ffe4b77cd842a0bb5f3bcab9620c49f3e25072dd80c302ea56f85?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Joern<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3679\">June 19, 2018 at 11:19<\/a><\/div>\n<p>While I\u2019m not sure how to find them either, I think I have a quick test to discard most bad ones. My old criteria were:<br \/>\n\u2013 hex representation doesn\u2019t contain 0 or f,<br \/>\n\u2013 popcount of both 32-bit halves is 16 and<br \/>\n\u2013 must be a prime number.<\/p>\n<p>Pick a random number and roughly one in 10k will match all criteria above. Maybe slightly weaker criteria would also be ok, you want roughly half of all bits to be 1 while my criteria enforce exactly half \u2013 and do that for both halves of the multiplier. Similar for the hex representation, you want to avoid long sequences of identical bits. Excluding 0 and f limits sequences to 6 identical bits, though not necessarily in the most mathematically sound way.<\/p>\n<p>Another important detail that I missed in the past is:<br \/>\n\u2013 the multiplicative inverse shouldn\u2019t be a common number like 3 or 5.<\/p>\n<p>My hunch is that enforcing the same criteria as above for the multiplicative inverse would be a good choice. But I have to spend more time on that.<\/p>\n<p>Your golden-ratio-based multiplier, 0x9E3779B97F4A7C15, is pretty close. I see one F and one sequence of 7 1-bits, popcount is 38. So you missed my criteria, but not by much. 
Might say more about my criteria being overly pessimistic than about your constant being bad.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-3779\" class=\"comment odd alt depth-3\">\n<div id=\"div-comment-3779\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/5232a76e1198f53163f09f7cf6d7752d94bc43d03723fa8bd19b166d3d63e29c?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/5232a76e1198f53163f09f7cf6d7752d94bc43d03723fa8bd19b166d3d63e29c?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/5232a76e1198f53163f09f7cf6d7752d94bc43d03723fa8bd19b166d3d63e29c?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/5232a76e1198f53163f09f7cf6d7752d94bc43d03723fa8bd19b166d3d63e29c?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/5232a76e1198f53163f09f7cf6d7752d94bc43d03723fa8bd19b166d3d63e29c?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/5232a76e1198f53163f09f7cf6d7752d94bc43d03723fa8bd19b166d3d63e29c?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/www.chaos.org.uk\/~eddy\/\" rel=\"ugc external nofollow\">Edward Welbourne<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3779\">July 24, 2018 at 06:57<\/a><\/div>\n<p>The golden ratio works well, for this job, because it\u2019s particularly hard to approximate well with a ratio of whole numbers. Successive entries in Fibonacci\u2019s sequence are as good as you\u2019ll get, and even that sequence, f(i+1)\/f(i) for successive i, converges slowly relative to good rational approximations to various other irrationals. 
Your problem sequences come where you look at a succession of values that are multiples of a good denominator for a rational approximation.<\/p>\n<p>You\u2019re using hash(x) = (x*k)%N with k = (r*M)%M for some irrational r, M = 2^64 and N as some smaller power of two. If it weren\u2019t for rounding, hash(x) would be (x*r*M)%N and, for some rational m\/n approximation to r, hash(i*n) is close to (i*m*M)%N, which is zero; the \u201cclose to\u201d difference is (i*(n*r -m)*M)%N and some rounding errors; if m\/n is a good approximation, this is small and you get the zeros you saw with larger Fibonacci numbers. At least, that\u2019s what I think is happening \u2026<\/p>\n<p>One way to consider rational approximation of an irrational is to split the irrational into whole number and a fractional part, then apply the same method to approximate the fractional part; go a few steps, approximate by a whole number, then unroll back to where you started. Thus, for pi we get pi = 3.1415\u2026, 1\/(pi \u2013 3) = 7.0625\u2026, 1\/(1\/(pi-3) -7) = 15.99\u2026, let\u2019s call that 16; so 1\/16 = 1\/(pi-3) -7, pi = 3 +1\/(7 +1\/16) = 3 +16\/(16*7 +1) = 3 +16\/113, good to three parts in ten million; and I only went a few steps in. One can slightly improve this by, rather than a whole-and-fractional split, using the nearest whole number and a +\/- fractional error, so that the error is never more than a half (where 15.99\u2026 would have had error .99\u2026) and the whole number is never (after the first step) less than two. This approach truncates a \u201ccontinued fraction\u201d approximation to the irrational, pi = 3 +1\/(7 +1\/(16 -1\/(294 -1\/(3 -1\/\u2026)))) in order to obtain a good rational approximation.<\/p>\n<p>You can do the same with the golden ratio, of course. It\u2019s the solution to x*x = x +1, so I can tell you right away that x = 1 +1\/x = 1 +1\/(1 +1\/x) = 1 +1\/(1 +1\/(1 +1\/(1 +1\/\u2026))). 
That\u2019s using the simple whole+fraction split; if we allow negatives, we have x -1 = 1\/x, so x\/(x -1) = x*x = x +1; with q = 1\/(2 -x) = 1\/(1 -1\/x) = x\/(x -1) = x +1, we get 1\/(3 -q) = 1\/(2 -x) = q, whence q = 3 -1\/q = 3 -1\/(3 -1\/(3 -\u2026)) and x = 2 -1\/q = 2 -1\/(3 -1\/(3 -1\/(3 -1\/\u2026))) and it\u2019s threes all the way down. This means you never get a nice big number in the sequence (like our sudden leap to 294 in pi\u2019s continued fraction), for which the 1\/\u2026 you\u2019ve got to add to it or subtract from it is a small fractional difference, making that a good place to truncate. So rational approximations to the golden ratio improve painfully slowly \u2013 which is a PITA if you want a good rational approximation, but a blessing when you\u2019re trying to pick an irrational to use in your multiplicative hashing. When we truncate 1 +1\/(1 +1\/\u2026) before successive +s, we get 1\/1, 2\/1, 3\/2, 5\/3, 8\/5, 13\/8, \u2026, the ratios of successive Fibonacci numbers. Truncating 2 -1\/(3 -1\/(3 -\u2026)) before successive -s, we get 2\/1, 5\/3, 13\/8, 34\/21, \u2026, selecting every second entry from the previous sequence; allowing subtraction doubles our speed of refinement, but it\u2019s still painfully slow.<\/p>\n<p>So can we find something more pathological than the golden ratio? Well, the only credible candidate is 2 +1\/(2 +1\/(2 +1\/\u2026)) = z = 2 +1\/z so 1 = z*z -2*z = (z -1)^2 -1 and z -1 = sqrt(2) so z = 1 +sqrt(2) = 1\/(sqrt(2) -1). So maybe give that a try \u2013 obviously, you\u2019ll be dividing by it, using r = sqrt(2) -1 in my analysis above. 
I\u2019d be interested to know how well it fares as a hash ;^)<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-3780\" class=\"comment even depth-3\">\n<div id=\"div-comment-3780\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/5232a76e1198f53163f09f7cf6d7752d94bc43d03723fa8bd19b166d3d63e29c?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/5232a76e1198f53163f09f7cf6d7752d94bc43d03723fa8bd19b166d3d63e29c?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/5232a76e1198f53163f09f7cf6d7752d94bc43d03723fa8bd19b166d3d63e29c?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/5232a76e1198f53163f09f7cf6d7752d94bc43d03723fa8bd19b166d3d63e29c?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/5232a76e1198f53163f09f7cf6d7752d94bc43d03723fa8bd19b166d3d63e29c?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/5232a76e1198f53163f09f7cf6d7752d94bc43d03723fa8bd19b166d3d63e29c?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/www.chaos.org.uk\/~eddy\/\" rel=\"ugc external nofollow\">Edward Welbourne<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3780\">July 24, 2018 at 07:15<\/a><\/div>\n<p>Sorry \u2013 I said: \u201csplit the irrational into whole number and a fractional part, then apply the same method to approximate the fractional part\u201d but meant \u2026 apply the same method to approximate the *inverse of the* fractional part.<\/p>\n<p>Note that successive truncations of 2 +1\/(2 +1\/(2 +\u2026)) are 2, 5\/2, 12\/5, 29\/12, 70\/29, 169\/70, 408\/169, \u2026 so it\u2019s the 
denominators showing up here that are apt to give trouble for 2^64 *(sqrt(2) -1)<\/p>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li id=\"comment-3681\" class=\"comment odd alt thread-odd thread-alt depth-1 parent\">\n<div id=\"div-comment-3681\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/677175c554caf70f227d878831c1c1a85a3b258180c1932cab204fa255a77a24?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/677175c554caf70f227d878831c1c1a85a3b258180c1932cab204fa255a77a24?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/677175c554caf70f227d878831c1c1a85a3b258180c1932cab204fa255a77a24?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/677175c554caf70f227d878831c1c1a85a3b258180c1932cab204fa255a77a24?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/677175c554caf70f227d878831c1c1a85a3b258180c1932cab204fa255a77a24?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/677175c554caf70f227d878831c1c1a85a3b258180c1932cab204fa255a77a24?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Vladimir G. 
Ivanovic<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3681\">June 19, 2018 at 12:18<\/a><\/div>\n<p>Curious as to why all implementations have such an obvious knee in the graph.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3681#respond\" rel=\"nofollow\" data-commentid=\"3681\" data-postid=\"9623\" data-belowelement=\"div-comment-3681\" data-respondelement=\"respond\" data-replyto=\"Reply to Vladimir G. Ivanovic\" aria-label=\"Reply to Vladimir G. Ivanovic\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3685\" class=\"comment byuser comment-author-sagan1338 bypostauthor even depth-2 parent\">\n<div id=\"div-comment-3685\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" 
width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/probablydance.wordpress.com\" rel=\"ugc external nofollow\">Malte Skarupke<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3685\">June 19, 2018 at 16:47<\/a><\/div>\n<p>It\u2019s where the table is too big to fit entirely in L3 cache. At that point you start getting more and more cache misses, and those are really slow.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3685#respond\" rel=\"nofollow\" data-commentid=\"3685\" data-postid=\"9623\" data-belowelement=\"div-comment-3685\" data-respondelement=\"respond\" data-replyto=\"Reply to Malte Skarupke\" aria-label=\"Reply to Malte Skarupke\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3687\" class=\"comment odd alt depth-3\">\n<div id=\"div-comment-3687\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/677175c554caf70f227d878831c1c1a85a3b258180c1932cab204fa255a77a24?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/677175c554caf70f227d878831c1c1a85a3b258180c1932cab204fa255a77a24?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/677175c554caf70f227d878831c1c1a85a3b258180c1932cab204fa255a77a24?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/677175c554caf70f227d878831c1c1a85a3b258180c1932cab204fa255a77a24?s=96&amp;d=identicon&amp;r=G 2x, 
https:\/\/0.gravatar.com\/avatar\/677175c554caf70f227d878831c1c1a85a3b258180c1932cab204fa255a77a24?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/677175c554caf70f227d878831c1c1a85a3b258180c1932cab204fa255a77a24?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Vladimir G. Ivanovic<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3687\">June 19, 2018 at 17:36<\/a><\/div>\n<p>I should have thought of that\u2026<\/p>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li id=\"comment-3682\" class=\"comment even thread-even depth-1 parent\">\n<div id=\"div-comment-3682\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/fabd517cef4433891ecce8c1fe2b25481b9d46fe04b0636794ba85c4fbf7336b?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/fabd517cef4433891ecce8c1fe2b25481b9d46fe04b0636794ba85c4fbf7336b?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/fabd517cef4433891ecce8c1fe2b25481b9d46fe04b0636794ba85c4fbf7336b?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/fabd517cef4433891ecce8c1fe2b25481b9d46fe04b0636794ba85c4fbf7336b?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/fabd517cef4433891ecce8c1fe2b25481b9d46fe04b0636794ba85c4fbf7336b?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/fabd517cef4433891ecce8c1fe2b25481b9d46fe04b0636794ba85c4fbf7336b?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Oleg Pliss<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a 
href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3682\">June 19, 2018 at 13:04<\/a><\/div>\n<p>The article *assumes* that integer division by a prime is slower than integer multiplication followed by the shift. This is not necessarily true.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3682#respond\" rel=\"nofollow\" data-commentid=\"3682\" data-postid=\"9623\" data-belowelement=\"div-comment-3682\" data-respondelement=\"respond\" data-replyto=\"Reply to Oleg Pliss\" aria-label=\"Reply to Oleg Pliss\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3684\" class=\"comment byuser comment-author-sagan1338 bypostauthor odd alt depth-2 parent\">\n<div id=\"div-comment-3684\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" 
height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/probablydance.wordpress.com\" rel=\"ugc external nofollow\">Malte Skarupke<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3684\">June 19, 2018 at 16:31<\/a><\/div>\n<p>Can you elaborate on that? The claim for integer division being slow comes from measurements, not assumptions. But I also wouldn\u2019t be surprised if there were faster ways of doing the division. I know of libdivide, but that\u2019s slower than Fibonacci hashing. But if you know of another way to make it fast, I would be very interested.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3684#respond\" rel=\"nofollow\" data-commentid=\"3684\" data-postid=\"9623\" data-belowelement=\"div-comment-3684\" data-respondelement=\"respond\" data-replyto=\"Reply to Malte Skarupke\" aria-label=\"Reply to Malte Skarupke\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-4064\" class=\"comment even depth-3\">\n<div id=\"div-comment-4064\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/cb4c36f94d471e6605022787e6f221746846ce65704cf5017564045abf099f38?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/cb4c36f94d471e6605022787e6f221746846ce65704cf5017564045abf099f38?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/cb4c36f94d471e6605022787e6f221746846ce65704cf5017564045abf099f38?s=72&amp;d=identicon&amp;r=G 1.5x, 
https:\/\/0.gravatar.com\/avatar\/cb4c36f94d471e6605022787e6f221746846ce65704cf5017564045abf099f38?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/cb4c36f94d471e6605022787e6f221746846ce65704cf5017564045abf099f38?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/cb4c36f94d471e6605022787e6f221746846ce65704cf5017564045abf099f38?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Max Langhof<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-4064\">August 21, 2018 at 04:17<\/a><\/div>\n<p>I think they are referring to some clever transformations a compiler can make to avoid divisions, for example: <a href=\"https:\/\/godbolt.org\/g\/7hfX19\" rel=\"nofollow ugc\">https:\/\/godbolt.org\/g\/7hfX19<\/a><\/p>\n<p>In that case it is very obvious that it\u2019s slower than a multiply + shift, but there might be \u201cgood\u201d primes that you can calculate division (mod 2^64) quite fast for.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-6088\" class=\"comment odd alt depth-3\">\n<div id=\"div-comment-6088\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/0aa37dc30fbc718713c14b149a5b518118b0c9d4e37f663b72fc630f1a586d4d?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/0aa37dc30fbc718713c14b149a5b518118b0c9d4e37f663b72fc630f1a586d4d?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/0aa37dc30fbc718713c14b149a5b518118b0c9d4e37f663b72fc630f1a586d4d?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/0aa37dc30fbc718713c14b149a5b518118b0c9d4e37f663b72fc630f1a586d4d?s=96&amp;d=identicon&amp;r=G 2x, 
https:\/\/0.gravatar.com\/avatar\/0aa37dc30fbc718713c14b149a5b518118b0c9d4e37f663b72fc630f1a586d4d?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/0aa37dc30fbc718713c14b149a5b518118b0c9d4e37f663b72fc630f1a586d4d?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/www.ffconsultancy.com\" rel=\"ugc external nofollow\">Jon Harrop<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-6088\">July 30, 2019 at 14:12<\/a><\/div>\n<p>\u201cCan you elaborate on that? The claim for integer division being slow comes from measurements, not assumptions.\u201d<\/p>\n<p>Looks like you\u2019ve measured the performance of modulo by a variable (i.e. a number unknown at compile time) when you should be measuring the performance of modulo by a constant.<\/p>\n<p>As others have said, modulo by a constant is optimised into much faster code by any decent compiler. 
I expect if you measure it you will find that there is no justification for the alternatives.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-6116\" class=\"comment byuser comment-author-sagan1338 bypostauthor even depth-3\">\n<div id=\"div-comment-6116\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/probablydance.wordpress.com\" rel=\"ugc external nofollow\">Malte Skarupke<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-6116\">August 6, 2019 at 05:33<\/a><\/div>\n<p>Modulo by a constant is what I used before I switched to Fibonacci hashing. It was slower. If there are \u201cgood\u201d primes that can do a modulo quickly, I don\u2019t know how to find them. 
(there is also a risk that those prime numbers might be bad for hash tables, which you can only really figure out by measuring a couple of bad patterns)<\/p>\n<p>Jon what do you mean when you say this:<br \/>\n\u201cAs others have said, modulo by a constant is optimised into much faster code by any decent compiler. I expect if you measure it you will find that there is no justification for the alternatives.\u201d<\/p>\n<p>I have measured it and it wasn\u2019t faster.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-6117\" class=\"comment byuser comment-author-1yk0s odd alt depth-3\">\n<div id=\"div-comment-6117\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">1yk0s<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-6117\">August 6, 2019 at 06:14<\/a><\/div>\n<p>I agree with Malte. 
\u201cgood primes\u201d are precisely good because they are of low complexity, which makes them \u201cbad primes\u201d for hashing.<\/p>\n<p>Modulo constant is going to reduce to around two multiplications, and some shifts and adds. But most importantly hash table sizes are not fixed, you\u2019d need to force compilation of reduction with all possible moduli, maybe like this: <a href=\"https:\/\/gcc.godbolt.org\/z\/4T6ZaD\" rel=\"nofollow ugc\">https:\/\/gcc.godbolt.org\/z\/4T6ZaD<\/a><br \/>\nI can easily imagine this being slower even if you have to emulate a 64 bit multiplication with three 32 bit multiplications and 4 add\/sub because of smaller code size at a similar computational effort.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-6778\" class=\"comment byuser comment-author-1yk0s even depth-3\">\n<div id=\"div-comment-6778\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">1yk0s<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a 
href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-6778\">November 13, 2019 at 06:52<\/a><\/div>\n<p>You can accelerate division by computing an analog to the inverse of a number with the Newton-Raphson method. To compute the division you multiply the inverse and the number you want to divide and then take the high bits. The result is potentially 1 less than the division you wanted, so you check for that and you are done. If you store the inverse and if the division is going to be repeated more than three times it should already be worth it according to my benchmarks. Computing and applying the inverse takes around 2.7 times more than the hardware division, but the hardware division takes 13.3 times longer than just applying the inverse, that is when the divisor is momentarily constant.<\/p>\n<p>Well I\u2019m not saying that you should do that, because you could just as well be using fastrange, it has the same overhead, the same operations even. 
But if you wanted to use modulo reduction then this is the way to go in my opinion.<\/p>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li id=\"comment-3689\" class=\"comment byuser comment-author-stevefink375311815 odd alt thread-odd thread-alt depth-1 parent\">\n<div id=\"div-comment-3689\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/94b511d319946b940bfeba1f668f161f452634aa2699170d326a3b56ad0d2316?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/94b511d319946b940bfeba1f668f161f452634aa2699170d326a3b56ad0d2316?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/94b511d319946b940bfeba1f668f161f452634aa2699170d326a3b56ad0d2316?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/94b511d319946b940bfeba1f668f161f452634aa2699170d326a3b56ad0d2316?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/94b511d319946b940bfeba1f668f161f452634aa2699170d326a3b56ad0d2316?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/94b511d319946b940bfeba1f668f161f452634aa2699170d326a3b56ad0d2316?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">stevefink375311815<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3689\">June 19, 2018 at 21:46<\/a><\/div>\n<p>There are definitely use cases where you need to be at least somewhat robust to adversarial input \u2014 eg, in a Javascript implementation that\u2019s going to be used to run arbitrary code on the Web. Then I think you need the property that a pathological input won\u2019t be problematic on all machines, only a random subset. 
I believe the standard fix is to use a hash function (in the traditional sense, not the big -&gt; tablesize mapping) that incorporates a random seed.<\/p>\n<p>I guess you could just declare this to be the problem of the hash function whenever the user controls the input. I don\u2019t know if it\u2019s enough to xor with the seed in between the hash and range-reducing steps? When using a Fibonacci reduction, I mean. (You could think of that as either the final step of the hash function, or the initial step of the reduction, depending on how much you trust the users to choose appropriate hash functions depending on where their input comes from.)<\/p>\n<p>I don\u2019t think this problem is restricted to directly web-exposed code, either. You don\u2019t want someone to be able to DOS you just by picking a whole bunch of awkward user names, for example. Databases probably need to worry about this too.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3689#respond\" rel=\"nofollow\" data-commentid=\"3689\" data-postid=\"9623\" data-belowelement=\"div-comment-3689\" data-respondelement=\"respond\" data-replyto=\"Reply to stevefink375311815\" aria-label=\"Reply to stevefink375311815\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3690\" class=\"comment byuser comment-author-sagan1338 bypostauthor even depth-2 parent\">\n<div id=\"div-comment-3690\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G\" 
srcset=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/probablydance.wordpress.com\" rel=\"ugc external nofollow\">Malte Skarupke<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3690\">June 20, 2018 at 03:56<\/a><\/div>\n<p>Yes, that is a great point because none of my hash tables handle this right now. I am also not sure where to handle this. My first thought would be to generate a random number on hash table creation, and to always xor that with the result of the hash function (just before the Fibonacci hashing). But I honestly don\u2019t know if that\u2019s good enough, and I wouldn\u2019t want to make a change like that unless I know it\u2019s good, and unless I can test it. And I don\u2019t work in an environment where I can test it, so for now I don\u2019t do this. I think it\u2019s better to obviously not handle this than it is to pretend to handle it with a bad solution.<\/p>\n<p>I might just use Google\u2019s solution when they open source their hash table. They xor the address of the bucket memory into the hash value. If that is good enough for them, it\u2019s good enough for me. 
(You need to have ASLR enabled)<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3690#respond\" rel=\"nofollow\" data-commentid=\"3690\" data-postid=\"9623\" data-belowelement=\"div-comment-3690\" data-respondelement=\"respond\" data-replyto=\"Reply to Malte Skarupke\" aria-label=\"Reply to Malte Skarupke\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3691\" class=\"comment odd alt depth-3\">\n<div id=\"div-comment-3691\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/www.sesse.net\/\" rel=\"ugc external nofollow\">Steinar H. 
Gunderson<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3691\">June 20, 2018 at 04:01<\/a><\/div>\n<p>As far as I know, randomizing isn\u2019t sufficient, because it\u2019s too easy to leak the seed inadvertently (or simply try a bunch of inserts, observe which of them take longer, repeat, and tada, you have a lot of them and can do an attack). I thought Robin Hood hashing was regarded as a more robust solution than randomizing, but I might be wrong.<\/p>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li id=\"comment-3694\" class=\"comment even thread-even depth-1 parent\">\n<div id=\"div-comment-3694\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/8226a0875d4cee58d52550ce141bf48211a7b2a1703976a3decc1deb8b7e749a?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/8226a0875d4cee58d52550ce141bf48211a7b2a1703976a3decc1deb8b7e749a?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/8226a0875d4cee58d52550ce141bf48211a7b2a1703976a3decc1deb8b7e749a?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/8226a0875d4cee58d52550ce141bf48211a7b2a1703976a3decc1deb8b7e749a?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/8226a0875d4cee58d52550ce141bf48211a7b2a1703976a3decc1deb8b7e749a?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/8226a0875d4cee58d52550ce141bf48211a7b2a1703976a3decc1deb8b7e749a?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Z01<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a 
href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3694\">June 21, 2018 at 19:32<\/a><\/div>\n<p>Btw, prime bucket sizes could also be improved.<br \/>\nAs you can see here:<br \/>\n<a href=\"https:\/\/godbolt.org\/g\/LZCtwi\" rel=\"nofollow ugc\">https:\/\/godbolt.org\/g\/LZCtwi<\/a><br \/>\ncompilers replace the slow div instruction with a mix of mul, shr, lea, and sub.<\/p>\n<p>Unfortunately I don\u2019t know the math behind it, so I would not know how to produce a C++ function that only depends on a few parameters (for example, as you can see, sometimes the codegen uses sub and sometimes it does not).<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3694#respond\" rel=\"nofollow\" data-commentid=\"3694\" data-postid=\"9623\" data-belowelement=\"div-comment-3694\" data-respondelement=\"respond\" data-replyto=\"Reply to Z01\" aria-label=\"Reply to Z01\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3695\" class=\"comment byuser comment-author-sagan1338 bypostauthor odd alt depth-2 parent\">\n<div id=\"div-comment-3695\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=72&amp;d=identicon&amp;r=G 1.5x, 
https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/probablydance.wordpress.com\" rel=\"ugc external nofollow\">Malte Skarupke<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3695\">June 22, 2018 at 05:54<\/a><\/div>\n<p>Yes, my old default (before Fibonacci hashing) was an improved version of integer modulo. And you can still use it by using the prime_number_hash_policy with my hash tables. The fastest version I have found of that is actually to use libdivide. (<a href=\"https:\/\/libdivide.com\" rel=\"nofollow ugc\">https:\/\/libdivide.com<\/a>) But the reason why that wasn\u2019t my default choice was that it caused the code size of my find() function to grow, and that made inlining less likely, and losing inlining was a bigger cost than the speedup from using libdivide. 
So I used the custom solution that\u2019s still in there.<\/p>\n<p>Fibonacci hashing is still faster though, and I like that it breaks up sequential numbers.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3695#respond\" rel=\"nofollow\" data-commentid=\"3695\" data-postid=\"9623\" data-belowelement=\"div-comment-3695\" data-respondelement=\"respond\" data-replyto=\"Reply to Malte Skarupke\" aria-label=\"Reply to Malte Skarupke\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3701\" class=\"comment even depth-3\">\n<div id=\"div-comment-3701\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/8226a0875d4cee58d52550ce141bf48211a7b2a1703976a3decc1deb8b7e749a?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/8226a0875d4cee58d52550ce141bf48211a7b2a1703976a3decc1deb8b7e749a?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/8226a0875d4cee58d52550ce141bf48211a7b2a1703976a3decc1deb8b7e749a?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/8226a0875d4cee58d52550ce141bf48211a7b2a1703976a3decc1deb8b7e749a?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/8226a0875d4cee58d52550ce141bf48211a7b2a1703976a3decc1deb8b7e749a?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/8226a0875d4cee58d52550ce141bf48211a7b2a1703976a3decc1deb8b7e749a?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Z01<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a 
href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3701\">June 26, 2018 at 13:24<\/a><\/div>\n<p>Regarding prime_number_hash_policy:<br \/>\nI am not surprised it is slow, since it cannot be inlined (in theory it can, but the function pointer makes that hard).<\/p>\n<p>What I was thinking about is not having a dozen functions and adjusting a pointer, but having one function and adjusting its arguments.<\/p>\n<p>But like I said, I don\u2019t know the recipe that compilers use for this optimization\u2026<\/p>\n<p>Still, I managed to find 5 primes that give a nice form of optimized code (all 5 primes have the same sequence of asm instructions with 2 different constants), which could be implemented as a simple function with 2 arguments (besides the input hash).<\/p>\n<p><a href=\"https:\/\/godbolt.org\/g\/97WZmn\" rel=\"nofollow ugc\">https:\/\/godbolt.org\/g\/97WZmn<\/a><\/p>\n<p>If you want, you could test it and see if it performs better.<\/p>\n<p>Regarding inlining:<br \/>\nmaybe you could try POGO; in theory every program that cares about performance will use it anyway. 
<img decoding=\"async\" class=\"emoji\" role=\"img\" draggable=\"false\" src=\"https:\/\/s0.wp.com\/wp-content\/mu-plugins\/wpcom-smileys\/twemoji\/2\/svg\/1f642.svg\" alt=\"\ud83d\ude42\" \/><\/p>\n<\/div>\n<\/li>\n<li id=\"comment-3705\" class=\"comment odd alt depth-3\">\n<div id=\"div-comment-3705\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/20f86699f03ffe4b77cd842a0bb5f3bcab9620c49f3e25072dd80c302ea56f85?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/20f86699f03ffe4b77cd842a0bb5f3bcab9620c49f3e25072dd80c302ea56f85?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/20f86699f03ffe4b77cd842a0bb5f3bcab9620c49f3e25072dd80c302ea56f85?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/20f86699f03ffe4b77cd842a0bb5f3bcab9620c49f3e25072dd80c302ea56f85?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/20f86699f03ffe4b77cd842a0bb5f3bcab9620c49f3e25072dd80c302ea56f85?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/20f86699f03ffe4b77cd842a0bb5f3bcab9620c49f3e25072dd80c302ea56f85?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Joern<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3705\">June 29, 2018 at 09:37<\/a><\/div>\n<p>To replace division with multiplication, you can look up the extended Euclidean algorithm. 
Or use this function:<br \/>\nstatic uint64_t inverse(uint64_t m)<br \/>\n{<br \/>\nuint64_t t, r, newt, newr, q, tmp;<\/p>\n<p>r = m; newr = -m;<br \/>\nt = 1; newt = -1;<\/p>\n<p>while (newr) {<br \/>\nq = r \/ newr;<br \/>\ntmp = newr; newr = r - q * newr; r = tmp;<br \/>\ntmp = newt; newt = t - q * newt; t = tmp;<br \/>\n}<br \/>\nreturn t;<br \/>\n}<\/p>\n<p>Finds the multiplicative inverse for any odd 64-bit number.<\/p>\n<p>While at it, here is a fun bit of brute-force math:<br \/>\nuint64_t mstep = -1ull;<br \/>\nfor (int i = 1; i &lt;= 1 &lt;&lt; 16; i++) {<br \/>\nuint64_t step = i * 0x9E3779B97F4A7C15ull;<br \/>\nuint64_t astep = step;<br \/>\nif (-astep &lt; astep) astep = -astep;<br \/>\nif (mstep &gt; astep) {<br \/>\nprintf(\"%8d: %16lx %16lx %16lx\\n\", i, step, astep, mstep);<br \/>\nmstep = astep;<br \/>\n}<br \/>\n}<br \/>\nThis just tries every increment and prints it out if the step for Fibonacci hashing would be worse than for any smaller increment. Some of the numbers printed might seem familiar.<\/p>\n<p>I suppose you could also use this code to search for a better multiplier. 
For example 0x6659948fa40e609bull appears better for most increments.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-3711\" class=\"comment even depth-3\">\n<div id=\"div-comment-3711\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/8226a0875d4cee58d52550ce141bf48211a7b2a1703976a3decc1deb8b7e749a?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/8226a0875d4cee58d52550ce141bf48211a7b2a1703976a3decc1deb8b7e749a?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/8226a0875d4cee58d52550ce141bf48211a7b2a1703976a3decc1deb8b7e749a?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/8226a0875d4cee58d52550ce141bf48211a7b2a1703976a3decc1deb8b7e749a?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/8226a0875d4cee58d52550ce141bf48211a7b2a1703976a3decc1deb8b7e749a?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/8226a0875d4cee58d52550ce141bf48211a7b2a1703976a3decc1deb8b7e749a?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Z01<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3711\">June 30, 2018 at 16:35<\/a><\/div>\n<p>@Joern<br \/>\nthank you\u2026<\/p>\n<p>but why do compilers use mul and shift if they could just use mul(if I understand you correctly)?<\/p>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li id=\"comment-3719\" class=\"comment byuser comment-author-cpergiel odd alt thread-odd thread-alt depth-1 parent\">\n<div id=\"div-comment-3719\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment 
grav-hashed grav-hijack\" src=\"https:\/\/1.gravatar.com\/avatar\/7b7041979f722735f7190300e85d9fddc334a53180fb65711cffaee32df4520f?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/1.gravatar.com\/avatar\/7b7041979f722735f7190300e85d9fddc334a53180fb65711cffaee32df4520f?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/1.gravatar.com\/avatar\/7b7041979f722735f7190300e85d9fddc334a53180fb65711cffaee32df4520f?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/1.gravatar.com\/avatar\/7b7041979f722735f7190300e85d9fddc334a53180fb65711cffaee32df4520f?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/1.gravatar.com\/avatar\/7b7041979f722735f7190300e85d9fddc334a53180fb65711cffaee32df4520f?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/1.gravatar.com\/avatar\/7b7041979f722735f7190300e85d9fddc334a53180fb65711cffaee32df4520f?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/pergelator.blogspot.com\" rel=\"ugc external nofollow\">Charles Pergiel<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3719\">July 2, 2018 at 17:55<\/a><\/div>\n<p>@Joern<br \/>\nI do not understand what you mean by \u201cmultiplicative inverse\u201d. I thought that multiplicative inverse was the reciprocal of a number, so it could not be an integer. 
Am I missing something?<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3719#respond\" rel=\"nofollow\" data-commentid=\"3719\" data-postid=\"9623\" data-belowelement=\"div-comment-3719\" data-respondelement=\"respond\" data-replyto=\"Reply to Charles Pergiel\" aria-label=\"Reply to Charles Pergiel\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-3739\" class=\"comment even depth-2\">\n<div id=\"div-comment-3739\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/20f86699f03ffe4b77cd842a0bb5f3bcab9620c49f3e25072dd80c302ea56f85?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/20f86699f03ffe4b77cd842a0bb5f3bcab9620c49f3e25072dd80c302ea56f85?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/20f86699f03ffe4b77cd842a0bb5f3bcab9620c49f3e25072dd80c302ea56f85?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/20f86699f03ffe4b77cd842a0bb5f3bcab9620c49f3e25072dd80c302ea56f85?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/20f86699f03ffe4b77cd842a0bb5f3bcab9620c49f3e25072dd80c302ea56f85?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/20f86699f03ffe4b77cd842a0bb5f3bcab9620c49f3e25072dd80c302ea56f85?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Joern<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-3739\">July 8, 2018 at 10:47<\/a><\/div>\n<p>Modular arithmetic makes a difference. 
If you multiply 0x9e3779b97f4a7c15 by 0xf1de83e19937733d, you get 0x957bbf35006ed6770000000000000001, a large 128-bit number. If you only look at the low 64 bits, you get 1. If you only consider the remainder after dividing by 2^64 (as computers do), the product is 1 and each of the two factors is the inverse of the other.<\/p>\n<p>Every odd number has a multiplicative inverse modulo 2^64. Even numbers can be \u201cmade odd\u201d by shifting, so replacing division by 6 involves an inverse for 3 and a shift.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=3739#respond\" rel=\"nofollow\" data-commentid=\"3739\" data-postid=\"9623\" data-belowelement=\"div-comment-3739\" data-respondelement=\"respond\" data-replyto=\"Reply to Joern\" aria-label=\"Reply to Joern\">Reply<\/a><\/div>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<li id=\"comment-4115\" class=\"comment odd alt thread-even depth-1\">\n<div id=\"div-comment-4115\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/e43d89cb1688eda77d7ff2fb95d9de77d81ee5905985a22d08d96cd25e848fa4?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/e43d89cb1688eda77d7ff2fb95d9de77d81ee5905985a22d08d96cd25e848fa4?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/e43d89cb1688eda77d7ff2fb95d9de77d81ee5905985a22d08d96cd25e848fa4?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/e43d89cb1688eda77d7ff2fb95d9de77d81ee5905985a22d08d96cd25e848fa4?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/e43d89cb1688eda77d7ff2fb95d9de77d81ee5905985a22d08d96cd25e848fa4?s=144&amp;d=identicon&amp;r=G 3x, 
https:\/\/2.gravatar.com\/avatar\/e43d89cb1688eda77d7ff2fb95d9de77d81ee5905985a22d08d96cd25e848fa4?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Tobin Baker<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-4115\">August 30, 2018 at 17:07<\/a><\/div>\n<p>I\u2019m a couple months late, but I just wanted to say that I agree with everything that Sebastian Sylvan said. In a nutshell, a hash table should take the output of a user-supplied hash function, \u201cmix\u201d the hash using a robust integer permutation, and then use fastrange to map the result into the table. If you use a permutation rather than a general function for the mixing step, you are guaranteed no collisions. That means that as long as your user-supplied hash function has no spurious collisions, the input to fastrange will have no collisions either. So the best possible user-supplied hash function for integers is in fact the identity function: it is cheap (free) and has no collisions. As for the mixing permutation, I used to use a reduced-round version of the 32-bit Speck block cipher, but I recently switched to the 2-round permutation described in <a href=\"https:\/\/github.com\/skeeto\/hash-prospector\" rel=\"nofollow ugc\">https:\/\/github.com\/skeeto\/hash-prospector<\/a>, and it\u2019s much faster and statistically just as good for my purposes. (I hear the Murmur3 finalizer also works quite well for this.)<\/p>\n<p>PS It makes no sense to use xxHash as a mixing function. 1) it is not a permutation, so it can have collisions, and 2) it is a byte-at-a-time hash function, designed for variable-length strings, not fixed-width integers.<\/p>\n<p>PPS Robin Hood hashing isn\u2019t quite the final word on linear probing. 
There\u2019s a couple of other linear probing variants I\u2019m currently benchmarking, and one of them is clearly superior to Robin Hood (and predates it by over 20 years)\u2026<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=4115#respond\" rel=\"nofollow\" data-commentid=\"4115\" data-postid=\"9623\" data-belowelement=\"div-comment-4115\" data-respondelement=\"respond\" data-replyto=\"Reply to Tobin Baker\" aria-label=\"Reply to Tobin Baker\">Reply<\/a><\/div>\n<\/div>\n<\/li>\n<li id=\"comment-4217\" class=\"comment byuser comment-author-1yk0s even thread-odd thread-alt depth-1 parent\">\n<div id=\"div-comment-4217\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">1yk0s<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a 
href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-4217\">September 30, 2018 at 11:35<\/a><\/div>\n<p>Carry-less multiplication, followed by XOR-ing the resulting two n-bit blocks to produce the hash, is a permutation that also spreads the entropy in the input very evenly. On a modern CPU this will only cost about 3 to 4 cycles in addition to a latency of 4 to 7 cycles, meaning it will be almost as fast as the Fibonacci hash.<\/p>\n<p>#include \/\/ compile with -mpclmul<br \/>\nuint64_t const clmul_circ(const uint64_t&amp; a,const uint64_t&amp; b){<br \/>\n__m128i ma{(const int64_t)(a),0ull};<br \/>\n__m128i mb{(const int64_t)(b),0ull};<br \/>\nauto t = _mm_clmulepi64_si128(ma,mb,0);<br \/>\nreturn t[0]^t[1];<br \/>\n}<\/p>\n<p>uint64_t const distribute(const uint64_t&amp; a){<br \/>\nreturn clmul_circ(a,15112557877901478707ul); \/\/ any odious integer will do<br \/>\n}<\/p>\n<p>One could go a step further: at the cost of two more carry-less multiplications and four XORs, one could construct a Galois field by reducing with an irreducible polynomial, as in AES-GCM.<\/p>\n<p><a href=\"https:\/\/www.intel.cn\/content\/dam\/www\/public\/us\/en\/documents\/white-papers\/carry-less-multiplication-instruction-in-gcm-mode-paper.pdf\" data-mtli=\"mtli_filesize1233kB\" target=\"_blank\" rel=\"noopener noreferrer nofollow\">Click to access carry-less-multiplication-instruction-in-gcm-mode-paper.pdf<\/a><\/p>\n<p>I don\u2019t think much would be gained by doing that though.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=4217#respond\" rel=\"nofollow\" data-commentid=\"4217\" 
data-postid=\"9623\" data-belowelement=\"div-comment-4217\" data-respondelement=\"respond\" data-replyto=\"Reply to 1yk0s\" aria-label=\"Reply to 1yk0s\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-4361\" class=\"comment byuser comment-author-1yk0s odd alt depth-2\">\n<div id=\"div-comment-4361\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">1yk0s<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-4361\">October 18, 2018 at 10:16<\/a><\/div>\n<p>Turns out clmul_circ is not mixing well enough, I ended up doing:<\/p>\n<p>uint64_t const gfmul64(const uint64_t&amp; i,const uint64_t&amp; j){<br \/>\n__m128i I{};I[0]^=i;<br \/>\n__m128i J{};J[0]^=j;<br \/>\n__m128i M{};M[0]^=0xb000000000000000ull;<br \/>\n__m128i X = _mm_clmulepi64_si128(I,J,0);<br \/>\n__m128i A = _mm_clmulepi64_si128(X,M,0);<br 
\/>\n__m128i B = _mm_clmulepi64_si128(A,M,0);<br \/>\nreturn A[0]^A[1]^B[1]^X[0]^X[1];<br \/>\n}<\/p>\n<p>where clmul is fast enough, and rotate_left(h*c1,21)*c1 where it is not, so very similar to what you did, but when rotating instead of shifting one does retain the injective property of the hash.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=4361#respond\" rel=\"nofollow\" data-commentid=\"4361\" data-postid=\"9623\" data-belowelement=\"div-comment-4361\" data-respondelement=\"respond\" data-replyto=\"Reply to 1yk0s\" aria-label=\"Reply to 1yk0s\">Reply<\/a><\/div>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<li id=\"comment-10420\" class=\"comment even thread-even depth-1 parent\">\n<div id=\"div-comment-10420\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/39537a44daf38636766f41ef608a923b08013be037c0749d82168b0fc87cc4c3?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/39537a44daf38636766f41ef608a923b08013be037c0749d82168b0fc87cc4c3?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/39537a44daf38636766f41ef608a923b08013be037c0749d82168b0fc87cc4c3?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/39537a44daf38636766f41ef608a923b08013be037c0749d82168b0fc87cc4c3?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/39537a44daf38636766f41ef608a923b08013be037c0749d82168b0fc87cc4c3?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/39537a44daf38636766f41ef608a923b08013be037c0749d82168b0fc87cc4c3?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Zo\u00eb the Scribe<\/cite><\/div>\n<div class=\"comment-meta 
commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-10420\">March 18, 2021 at 03:06<\/a><\/div>\n<p>Hello Malte! Thank you for the very informative post and for sparking so much interesting discussion. It turns out the technique is the same as the one used in the Linux kernel for hashing pointer address values since 2016:<\/p>\n<p><a href=\"https:\/\/git.kernel.org\/pub\/scm\/linux\/kernel\/git\/stable\/linux.git\/tree\/include\/linux\/hash.h?h=v5.11.7#n37\" rel=\"nofollow ugc\">https:\/\/git.kernel.org\/pub\/scm\/linux\/kernel\/git\/stable\/linux.git\/tree\/include\/linux\/hash.h?h=v5.11.7#n37<\/a><\/p>\n<p>The header\u2019s date says 2002, but before 2016 they used a prime factor which turned out not to be so well-conditioned. But the word \u201cprime\u201d in the macro stuck, hence the comments and compatibility macros defined at the beginning. So indeed we keep on rediscovering the technique, and it\u2019s right now being used whenever Linux runs ^^<\/p>\n<p>Curiously, they use the modular factor for (1 - \u03c6) instead of \u03c6 as the multiplier, and the code comment suggests that (1 - \u03c6) \u201cis very slightly easier to multiply by\u201d. I guess the reason is that the value is the smaller one, and the most significant nibble of the (1 - \u03c6) factor is 0x6 instead of 0x9, therefore one bit shorter than the \u03c6 factor when written out as binary. Maybe this could have made it \u201cslightly easier to multiply by\u201d on platforms that lack native hardware support for 32\/64-bit multiplication? 
Or maybe it\u2019s just that the 64-bit factor ends in 0xB, so that they could spell out the literal as \u201c0x\u2026Bull\u201d in the C code <img decoding=\"async\" class=\"emoji\" role=\"img\" draggable=\"false\" src=\"https:\/\/s0.wp.com\/wp-content\/mu-plugins\/wpcom-smileys\/twemoji\/2\/svg\/1f642.svg\" alt=\"\ud83d\ude42\" \/><\/p>\n<p>I personally find the multiplicative hash by \u03c6 or (1 - \u03c6) very effective when you want to hash pointer addresses into a small number of slots. Address values are not arbitrary, for the most significant bits tend to be fixed by the segment, and the least significant bits tend to be multiples of some common factors dictated by alignment requirements. The informative bits are likely in the middle, but there they could still be subject to patterns created by particularities of the memory allocator. By scrambling with a multiplier and taking the highest bits, these patterning effects tend to be reduced.<\/p>\n<p>There\u2019s another potentially useful technique if we want the hash to be perfect or near-perfect for a small and fixed collection of input values: rotate first, then hash by golden ratio. The \u201cideal\u201d number of bits to rotate by can be figured out by trial and error at setup time, provided that the input collection is small and the initial table leaves enough empty slots to allow for some slack. For instance, if we begin with 8 unique inputs and make the initial table size 16, I think we\u2019re guaranteed to find a rotation distance between 0 and 63 that puts the 8 inputs into distinct slots among the 16 total. I imagine this as rotating the inputs such that their informative bits are juggled out of the places not adequately scrambled by the Fibonacci hash and put into its sweet spots, without losing any bits. 
To reuse one of your examples, the multiple of 144 (a Fibonacci number) by any number in [1..8] maps to 7 identically if we just take the highest 3 bits of the hash, but if we rotate the input by 3 bits first, the outputs nicely distribute to unique values in [0..7].<\/p>\n<p>If the hardware supports it, the rotation arithmetic compiles down to one instruction. Compared with shift-and-xor, it saves one arithmetic but adds one load. I find it helpful for one specific job: building a small lookup table or \u201cfrozenset\u201d for fixed constant pointers. Once built, for any non-zero input it answers the question \u201cis this input among the elements in the set?\u201d in constant time using no branching. If the input is also guaranteed to be a member and we want to do lookup, the keys don\u2019t need to be stored.<\/p>\n<p>Anyway, thank you so much for the post! I hope you\u2019re doing well!<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=10420#respond\" rel=\"nofollow\" data-commentid=\"10420\" data-postid=\"9623\" data-belowelement=\"div-comment-10420\" data-respondelement=\"respond\" data-replyto=\"Reply to Zo\u00eb the Scribe\" aria-label=\"Reply to Zo\u00eb the Scribe\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-10422\" class=\"comment byuser comment-author-1yk0s odd alt depth-2 parent\">\n<div id=\"div-comment-10422\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=48&amp;d=identicon&amp;r=G 1x, 
https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/gravatar.com\/1yk0s\" rel=\"ugc external nofollow\">1yk0s<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-10422\">March 18, 2021 at 06:15<\/a><\/div>\n<p>You are not guaranteed to find that rotation at all. 
The probability of failure is 14750191\/16777216, about 88%.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=10422#respond\" rel=\"nofollow\" data-commentid=\"10422\" data-postid=\"9623\" data-belowelement=\"div-comment-10422\" data-respondelement=\"respond\" data-replyto=\"Reply to 1yk0s\" aria-label=\"Reply to 1yk0s\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-10423\" class=\"comment byuser comment-author-1yk0s even depth-3\">\n<div id=\"div-comment-10423\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">1yk0s<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-10423\">March 18, 2021 at 06:29<\/a><\/div>\n<p>For each rotation. 
For 64 rotations, if they are independent, the probability that every one of them fails is (14750191\/16777216)^64, roughly 2.6E-4. We can find a counterexample by making some rotations degenerate by randomly choosing n and then shifting it to the different bits.<br \/>\n<code class=\"\" data-line=\"\">n=rand(256);<br \/>\n{n&lt;&lt;(8*0),n&lt;&lt;(8*1),n&lt;&lt;(8*2),n&lt;&lt;(8*3),...,n&lt;&lt;(8*7)}<\/code><br \/>\nThis will mean only 8 out of the 64 rotations can be independent, lowering the chance of success to about 64%.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-10424\" class=\"comment odd alt depth-3\">\n<div id=\"div-comment-10424\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/39537a44daf38636766f41ef608a923b08013be037c0749d82168b0fc87cc4c3?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/39537a44daf38636766f41ef608a923b08013be037c0749d82168b0fc87cc4c3?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/39537a44daf38636766f41ef608a923b08013be037c0749d82168b0fc87cc4c3?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/39537a44daf38636766f41ef608a923b08013be037c0749d82168b0fc87cc4c3?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/39537a44daf38636766f41ef608a923b08013be037c0749d82168b0fc87cc4c3?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/39537a44daf38636766f41ef608a923b08013be037c0749d82168b0fc87cc4c3?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Zo\u00eb the Scribe<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-10424\">March 18, 2021 at 12:42<\/a><\/div>\n<p>Hi! 
Thanks for the correction; I should have been more precise <img decoding=\"async\" class=\"emoji\" role=\"img\" draggable=\"false\" src=\"https:\/\/s0.wp.com\/wp-content\/mu-plugins\/wpcom-smileys\/twemoji\/2\/svg\/1f642.svg\" alt=\"\ud83d\ude42\" \/> Since the comments don\u2019t nest, I\u2019m actually replying to your last message. As I understand it, if the inputs in this case are rotationally symmetric (is that what you mean?), rotation of course fails to adequately break the symmetry. As an extreme example, if the inputs are the 64 integers that each have a single \u201c1\u201d bit in one of the 64 positions, the rotation step is totally useless. For this particular case, we\u2019d have to enlarge the table size to 1024 (if working with powers of 2) before those keys start to map into distinct slots, which is super (16x) wasteful compared to a perfect and minimal hash function (the logarithm will do for this instance). Fortunately for my purpose (hashing pointers to static variables or long-living objects allocated in some pool), the memory allocator is not my adversary, hopefully! The keys are clumped in some range but otherwise without \u201cbad patterns\u201d. In the end I tend to get away with it without enlarging the table by too much ^^<\/p>\n<p>Even so, I can only use the rotation trick for small tables (typically &lt; 16); anything larger would have required a real perfect hash function or another data structure for efficient lookup.<\/p>\n<p>But back to Malte\u2019s original post, where the purpose is to map a real hash code into slots. I don\u2019t know if an additional rotation step could help alleviate some of the \u201cbad patterns\u201d such as multiples of a Fibonacci number. I guess it may not be worth the overhead except for some very specific cases.<\/p>\n<p>P.S. when you quoted the figure 14750191\/16777216, what do you mean by \u201cprobability\u201d, and what is the random space? 
I\u2019m afraid I\u2019m not on the same page with you. How did you derive these numbers? Thanks!<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-10425\" class=\"comment byuser comment-author-1yk0s even depth-3\">\n<div id=\"div-comment-10425\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">1yk0s<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-10425\">March 18, 2021 at 16:03<\/a><\/div>\n<p>n is the number of items with random hashes in the range [0,m), m is the number of slots, and the probability of no collision is (m-1)\/m * (m-2)\/m * \u2026 * (m-n+1)\/m, which is approximately exp(-n*(n-1)\/(2*m)). 
The expected number of collisions scales inversely with the number of slots but with the square of the number of items.<br \/>\nIn your scheme you could forgo the rotations and just try different odd integer multipliers; this would give you a larger space to search. But it would still not get you far. You should look at more advanced perfect hashing schemes. Or, if you just have 8 values, just use a flat array; it\u2019s faster than you think, since testing 8 values is very quick. And for more than that there\u2019s nothing wrong with a proper hash table.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-10426\" class=\"comment byuser comment-author-1yk0s odd alt depth-3\">\n<div id=\"div-comment-10426\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">1yk0s<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-10426\">March 19, 2021 at 
01:48<\/a><\/div>\n<p>We essentially get new random numbers in the range [0,16) for each rotation. The second number needs to fall in any of the 15 free slots out of 16 total, the third needs to fall in any of the 14 free slots out of 16 total, and so on.<\/p>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li id=\"comment-10983\" class=\"comment even thread-odd thread-alt depth-1 parent\">\n<div id=\"div-comment-10983\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">SadClouds<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-10983\">June 20, 2021 at 05:18<\/a><\/div>\n<p>This article is quite misleading. Also, the world did not forget about Fibonacci\/golden ratio\/multiplicative hashing, but some people choose not to use it, because this method, with power-of-2 hash table sizes, results in too many collisions. 
The best method is to use a hash table with a prime number. Yes modulus arithmetic is a little bit slower, but you will gain performance by not having to deal with a large number of collisions. There is no such thing as free lunch and I\u2019m yet to see a power of 2 hashing function that has comparable collision rate to the one with a prime number and modulus arithmetic. Please stop spreading misinformation.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=10983#respond\" rel=\"nofollow\" data-commentid=\"10983\" data-postid=\"9623\" data-belowelement=\"div-comment-10983\" data-respondelement=\"respond\" data-replyto=\"Reply to SadClouds\" aria-label=\"Reply to SadClouds\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-10994\" class=\"comment byuser comment-author-1yk0s odd alt depth-2 parent\">\n<div id=\"div-comment-10994\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=144&amp;d=identicon&amp;r=G 3x, 
https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/gravatar.com\/1yk0s\" rel=\"ugc external nofollow\">1yk0s<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-10994\">June 21, 2021 at 00:19<\/a><\/div>\n<p>Yes the title is slightly exaggerated to generate clicks. He showed you the number of collisions observed and powers of two do not lead to more collisions than any other hash table size, if you apply a good hash function. When the hash function has a uniform distribution you can use whatever modulus you like, and also other ways of range reduction like <code class=\"\" data-line=\"\">fastrange<\/code> too.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=10994#respond\" rel=\"nofollow\" data-commentid=\"10994\" data-postid=\"9623\" data-belowelement=\"div-comment-10994\" data-respondelement=\"respond\" data-replyto=\"Reply to 1yk0s\" aria-label=\"Reply to 1yk0s\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-10996\" class=\"comment even depth-3 parent\">\n<div id=\"div-comment-10996\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=48&amp;d=identicon&amp;r=G 
1x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">SadClouds<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-10996\">June 21, 2021 at 06:30<\/a><\/div>\n<p>Can you please show me a hash function where what you say is true for integer values like pointers that are aligned on 8-byte boundary \u2013 8, 16, 24, 32, etc.<\/p>\n<p>mul_hash(uint64_t start, uint64_t step):<\/p>\n<p>uint64_t *buckets;<br \/>\nuint64_t buckets_len = 4194304; \/* Power of 2 number: 2^22 *\/<br \/>\nuint64_t buckets_len_log2 = 22;<br \/>\nconst uint64_t gr = UINT64_C(11400714819323198485); \/* Golden ratio *\/<br \/>\nfor (i = 0, n = start; i &lt; buckets_len\/2; i++, n += step)<br \/>\n{<br \/>\nhcode = n;<br \/>\nindex = (hcode * gr) &gt;&gt; (64 - buckets_len_log2);<br \/>\n}<\/p>\n<p>mod_hash(uint64_t start, uint64_t step):<\/p>\n<p>uint64_t *buckets;<br \/>\nuint64_t buckets_len = 4194301; \/* Prime number close to 2^22 *\/<br \/>\nfor (i = 0, n = start; i &lt; buckets_len\/2; i++, n += step)<br \/>\n{<br \/>\nhcode = n;<br \/>\nindex = hcode % buckets_len;<br \/>\n}<\/p>\n<p>And the results are:<\/p>\n<p>mul_hash: start=8, step=8, time=15.62 msec, coll_cnt=100957<br \/>\nmul_hash: start=64, step=64, time=30.61 msec, coll_cnt=289511<\/p>\n<p>mod_hash: 
start=8, step=8, time=24.38 msec, coll_cnt=0<br \/>\nmod_hash: start=64, step=64, time=35.54 msec, coll_cnt=0<\/p>\n<p>The multiplicative hashing (as originally suggested by Knuth) results in around 100K and 289K collisions, while modulus hashing results in 0 collisions. If you change \u201chcode = n\u201d to any chosen hash function that returns 64-bit hash code based on value \u201cn\u201d, you still end up with huge number of collisions, and this is with a hash table at 50% load. I\u2019ll take modulus hashing any time of the day \u2013 simple and effective.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-10998\" class=\"comment byuser comment-author-1yk0s odd alt depth-3\">\n<div id=\"div-comment-10998\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">1yk0s<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-10998\">June 
21, 2021 at 06:49<\/a><\/div>\n<p>Have a look at my answer on stackoverflow here:<\/p>\n<p><a href=\"https:\/\/stackoverflow.com\/questions\/664014\/what-integer-hash-function-are-good-that-accepts-an-integer-hash-key\/57556517#57556517\" rel=\"nofollow ugc\">https:\/\/stackoverflow.com\/questions\/664014\/what-integer-hash-function-are-good-that-accepts-an-integer-hash-key\/57556517#57556517<\/a><\/p>\n<p>Or use splitmix64:<\/p>\n<div>\n<div id=\"highlighter_318027\" class=\"syntaxhighlighter  plain\">\n<table border=\"0\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td class=\"code\">\n<div class=\"container\">\n<div class=\"line number1 index0 alt2\"><code class=\"\" data-line=\"\">uint64_t hash(uint64_t x) {<\/code><\/div>\n<div class=\"line number2 index1 alt1\"><code class=\"\" data-line=\"\">\u00a0\u00a0\u00a0\u00a0<\/code><code class=\"\" data-line=\"\">x = (x ^ (x &gt;&gt; 30)) * UINT64_C(0xbf58476d1ce4e5b9);<\/code><\/div>\n<div class=\"line number3 index2 alt2\"><code class=\"\" data-line=\"\">\u00a0\u00a0\u00a0\u00a0<\/code><code class=\"\" data-line=\"\">x = (x ^ (x &gt;&gt; 27)) * UINT64_C(0x94d049bb133111eb);<\/code><\/div>\n<div class=\"line number4 index3 alt1\"><code class=\"\" data-line=\"\">\u00a0\u00a0\u00a0\u00a0<\/code><code class=\"\" data-line=\"\">x = x ^ (x &gt;&gt; 31);<\/code><\/div>\n<div class=\"line number5 index4 alt2\"><code class=\"\" data-line=\"\">\u00a0\u00a0\u00a0\u00a0<\/code><code class=\"\" data-line=\"\">return x;<\/code><\/div>\n<div class=\"line number6 index5 alt1\"><code class=\"\" data-line=\"\">}<\/code><\/div>\n<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<\/div>\n<p>For 
a well distributed hash function. But then you\u2019ll say this is two multiplications, and you\u2019d be correct. But even one multiplication is often sufficient, and then we have arrived exactly where Malte had.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-10997\" class=\"comment even depth-3\">\n<div id=\"div-comment-10997\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">SadClouds<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-10997\">June 21, 2021 at 06:44<\/a><\/div>\n<p>Actually the collision rate is even worse, as I copied and pasted the wrong lines of text<\/p>\n<p>mul_hash: start=8, step=8, time=15.81 msec, coll_cnt=201861<br \/>\nmul_hash: start=64, step=64, time=30.76 msec, coll_cnt=579039<\/p>\n<p>mod_hash: start=8, step=8, time=23.89 msec, coll_cnt=0<br \/>\nmod_hash: start=64, step=64, time=35.34 msec, 
coll_cnt=0<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-11000\" class=\"comment odd alt depth-3\">\n<div id=\"div-comment-11000\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">SadClouds<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-11000\">June 21, 2021 at 08:51<\/a><\/div>\n<p>It doesn\u2019t matter what hash function you use, when it comes to multiplicative hashing for mapping hash code to hash table index, you will get large number of collisions for certain data sets. 
Here is a link to my code, plug in whatever hash function you like:<br \/>\n<a href=\"https:\/\/drive.google.com\/file\/d\/1Qphu8JfZDZ9CBfIxk5jqAT8fqviPjBRv\/view\" rel=\"nofollow ugc\">https:\/\/drive.google.com\/file\/d\/1Qphu8JfZDZ9CBfIxk5jqAT8fqviPjBRv\/view<\/a><\/p>\n<\/div>\n<\/li>\n<li id=\"comment-11001\" class=\"comment even depth-3 parent\">\n<div id=\"div-comment-11001\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Steinar H. Gunderson<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-11001\">June 21, 2021 at 09:14<\/a><\/div>\n<p><em>Any<\/em> hash function will give large amounts of collision for \u201ccertain data sets\u201d. 
For the very specific case of a super-dense range of pointers whose range happens to fit exactly into the hash table, prime modulo will do much better than a multiplicative hash. Likewise, if you have modulo with a prime p, hashing integers that are exactly p apart will give you catastrophic amounts of collisions. But if either is a very important case for you, perhaps you shouldn\u2019t be using hashing in the first place.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-11003\" class=\"comment byuser comment-author-1yk0s odd alt depth-3\">\n<div id=\"div-comment-11003\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/8ac9dc426fac8668626b3ffb2398b0a8d6a93f326e073ea225d6047aa720caa0?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">1yk0s<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-11003\">June 21, 2021 at 10:04<\/a><\/div>\n<p>I could not have said it better myself, Steinar H. Gunderson. I just want to add one thing. 
If you have well distributed values and a 50% load factor the expected number of collisions is about 10.6% with respect to the number of buckets (equivalent to 21.3% w.r.t number of elements).<br \/>\n<a href=\"https:\/\/stackoverflow.com\/questions\/9104504\/expected-number-of-hash-collisions#11362027\" rel=\"nofollow ugc\">https:\/\/stackoverflow.com\/questions\/9104504\/expected-number-of-hash-collisions#11362027<\/a><br \/>\nIf you think you can do better you might be able to create a perfect hash function specifically for your data.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-11002\" class=\"comment even depth-3\">\n<div id=\"div-comment-11002\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">SadClouds<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-11002\">June 21, 2021 at 09:50<\/a><\/div>\n<p>I think you\u2019re 
missing a point. Quite often input data is not completely random and follows a specific pattern. A good hash table should handle a variety of different input data efficiently. Using modulus arithmetic with a prime number will give you the least number of collisions with many different patterns. It just works for all the usual cases. There are many patterns where using power of 2 hash table is quite sub-optimal due to high number of collisions.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-11004\" class=\"comment odd alt depth-3\">\n<div id=\"div-comment-11004\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/0.gravatar.com\/avatar\/cc2fd4102a2b970d591a80a05018cfe94856dabfbeb779490c13e33e579b8b10?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Steinar H. Gunderson<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-11004\">June 21, 2021 at 10:36<\/a><\/div>\n<p>Modulo will only have that (very nice!) property if you are hashing using the identity function (ie. 
no hash at all), which is a dangerous game to play.<\/p>\n<p>Note that it\u2019s not really the case that multiplicative hashing is doing particularly badly in this case; it\u2019s doing fairly well evenly across almost all inputs. It\u2019s just that the identity function + modulo does extremely well in this one specific case.<\/p>\n<p>Your benchmark is misleading, by the way; you\u2019re doing the modulo by a constant, which means the compiler can rewrite it into multiplicative form. This is only really possible if your hash table has a fixed number of buckets; otherwise, for a non-constant divisor, you will need either a real division, or something like libdivide and\/or a huge switch\/case (like ska_flat_map). Changing that to a more universally usable variant is going to increase your (already big!) speed penalty, and\/or code size.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-11005\" class=\"comment even depth-3\">\n<div id=\"div-comment-11005\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite 
class=\"fn\">SadClouds<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-11005\">June 21, 2021 at 13:14<\/a><\/div>\n<p>You have access to my code, so you can make whatever changes you like that inhibit compiler optimizations and see for yourself. I tried various things and did not notice much difference on my hardware. Using the identity function with a prime hash table size is the key method that gives low collision rates. Depending on the hardware, modulus arithmetic can be relatively quick. There are other things happening in tandem that will cause CPU pipeline stalls, no matter how quick your indexing function is. In my experience, for non-random integer types that increase sequentially (file descriptors, pointer values, etc.), using prime hash table sizes gives the best performance due to very low collision rates.<\/p>\n<\/div>\n<\/li>\n<li id=\"comment-11009\" class=\"comment byuser comment-author-sagan1338 bypostauthor odd alt depth-3\">\n<div id=\"div-comment-11009\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=96&amp;d=identicon&amp;r=G 2x, 
https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/probablydance.wordpress.com\" rel=\"ugc external nofollow\">Malte Skarupke<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-11009\">June 23, 2021 at 17:58<\/a><\/div>\n<p>I actually think the benchmark results are OK. Sure, at 2^22 = 4 million pointers (integers with stride 8) you start getting a few collisions. If you try the same test with 2^21 pointers, you get zero collisions.<\/p>\n<p>If you try it with an actual hash function, you get more collisions: Roughly the 10% that 1yk0s mentioned, when I tried with std::_Hash_impl::hash(). (not officially supported, but it was an easy choice)<\/p>\n<p>I still think on average this is pretty good and the performance benefits over integer modulo are still worth it. Going from 16ms to 24ms is not a small difference. (Plus, you don\u2019t quite measure the overhead of modulo correctly because presumably you won\u2019t be using only one known prime number. You want to allow tables of different sizes. The compiler has a harder time optimizing when there are multiple possible constants.)<\/p>\n<p>The goal isn\u2019t zero collisions. You won\u2019t get that on any data that\u2019s not regularly spaced. The goal is few collisions for all common patterns. And I mention one common pattern in the article where prime number modulo has problems: mostly sequential numbers. 
If your numbers are mostly sequential, meaning 1,2,3,4,\u2026,10000, except occasionally you have other numbers in there like -20 or 2^22 or 2^22+4, that behaves really badly in prime number modulo, because the hash table will be densely packed and on collisions you have to search for a long time to find a free slot. Fibonacci hashing will spread these out, which gives you plenty of space to cheaply resolve hash collisions.<\/p>\n<p>About the claim that the slower instructions are worth it because you get fewer collisions: Maybe. I\u2019d like to see it on real data. Intuitively it shouldn\u2019t be worth it because by far the most common case is that you immediately find the item you\u2019re looking for in a hash table, so you want to optimize for that and use the fastest instructions possible on the happy path.<\/p>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<li id=\"comment-11023\" class=\"comment even depth-2 parent\">\n<div id=\"div-comment-11023\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite 
class=\"fn\">SadClouds<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-11023\">June 26, 2021 at 01:38<\/a><\/div>\n<p>Not quite, the results speak for themselves. Insert 2 million integers into a hash table with 4 million buckets (only 50% usage) and you get:<\/p>\n<p>Modulus hashing using identity function: 0 collisions with most sequential numbers.<br \/>\nMultiplicative hashing using identity function: between 200K and 1 million collisions, depending on the sequence.<br \/>\nMultiplicative hashing using randomisation function: around 450K collisions.<\/p>\n<p>The best you can get with multiplicative hashing is by applying a randomisation hashing function prior to calculating the hash table index. This shuffles the bits in random order, but this is still 450K more collisions than with a modulus function. And it has the additional expense of randomisation code \u2013 multiplication, bit rotations and xor operations. After all that, a single modulus instruction doesn\u2019t look that bad.<\/p>\n<p>Yes sometimes input data is quite random, sometimes you need to randomise it to avoid hash table attacks, etc. And sometimes you don\u2019t need any of that \u2013 you already have a unique sequence of integers and modulus by a prime number will give you the least number of collisions. The fact that this gives sequential index values is a desirable property and what results in so few collisions. If you use something like linear probing then this is not a very smart way of resolving collisions, there are better ways. There is no optimal hash table design for all use cases, you need to take measurements and match to a specific hash table design. 
All I\u2019m saying is that multiplicative hashing is no silver bullet and on many occasions involving sequential number sets, prime number modulus hashing is a much better choice.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=11023#respond\" rel=\"nofollow\" data-commentid=\"11023\" data-postid=\"9623\" data-belowelement=\"div-comment-11023\" data-respondelement=\"respond\" data-replyto=\"Reply to SadClouds\" aria-label=\"Reply to SadClouds\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-11026\" class=\"comment odd alt depth-3\">\n<div id=\"div-comment-11026\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/27c41d7508b7fb2f40c9c25fb9f7193ff679e7efd1d3e0a6e554a6af64651113?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">SadClouds<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a 
href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-11026\">June 26, 2021 at 08:45<\/a><\/div>\n<p>And here is another test. I put the following functions into a separate shared library, so people can\u2019t complain that the compiler is optimising static constants.<\/p>\n<pre><code class=\"\" data-line=\"\">#include &lt;stdint.h&gt;\n\nuint64_t mul_index(uint64_t hash_code, uint64_t bits)\n{\n    const uint64_t gr = UINT64_C(11400714819323198485);\n\n    \/\/ Do some hash code randomisation\n    hash_code = (hash_code ^ (hash_code &gt;&gt; 30)) * UINT64_C(0xbf58476d1ce4e5b9);\n    hash_code = (hash_code ^ (hash_code &gt;&gt; 27)) * UINT64_C(0x94d049bb133111eb);\n    hash_code = hash_code ^ (hash_code &gt;&gt; 31);\n\n    return (hash_code * gr) &gt;&gt; (64 - bits);\n}\n\nuint64_t mod_index(uint64_t hash_code, uint64_t table_size)\n{\n    return hash_code % table_size;\n}\n<\/code><\/pre>\n<p>Measured time for a loop with 10,000,000 iterations for each function:<\/p>\n<p>Intel Xeon E5620 2.4GHz:<br \/>\nmul_index time = 91.76 msec<br \/>\nmod_index time = 157.84 msec<\/p>\n<p>ARM Cortex-A72 1.5GHz:<br \/>\nmul_index time = 170.44 msec<br \/>\nmod_index time = 112.83 msec<\/p>\n<p>The modulus function is slightly slower on Intel Xeon, but faster on ARM CPU. Feel free to run your own tests. 
On modern hardware, the speed of integer division is quite reasonable and not as horrible as some of you claim.<\/p>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li id=\"comment-17452\" class=\"comment even thread-even depth-1 parent\">\n<div id=\"div-comment-17452\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/1.gravatar.com\/avatar\/acb26653cfe5e03ff40f4d07ce5e7c1270e84e2af3e8e158a4ac31e4530ba47b?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/1.gravatar.com\/avatar\/acb26653cfe5e03ff40f4d07ce5e7c1270e84e2af3e8e158a4ac31e4530ba47b?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/1.gravatar.com\/avatar\/acb26653cfe5e03ff40f4d07ce5e7c1270e84e2af3e8e158a4ac31e4530ba47b?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/1.gravatar.com\/avatar\/acb26653cfe5e03ff40f4d07ce5e7c1270e84e2af3e8e158a4ac31e4530ba47b?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/1.gravatar.com\/avatar\/acb26653cfe5e03ff40f4d07ce5e7c1270e84e2af3e8e158a4ac31e4530ba47b?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/1.gravatar.com\/avatar\/acb26653cfe5e03ff40f4d07ce5e7c1270e84e2af3e8e158a4ac31e4530ba47b?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"http:\/\/avodonosov.blogspot.com\/\" rel=\"ugc external nofollow\">Anton Vodonosov<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-17452\">April 30, 2023 at 01:22<\/a><\/div>\n<p>Typo: \u201cdesperate for a faster solution of this the problem\u201d.<\/p>\n<p>\u201cthis the\u201d<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" 
href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=17452#respond\" rel=\"nofollow\" data-commentid=\"17452\" data-postid=\"9623\" data-belowelement=\"div-comment-17452\" data-respondelement=\"respond\" data-replyto=\"Reply to Anton Vodonosov\" aria-label=\"Reply to Anton Vodonosov\">Reply<\/a><\/div>\n<\/div>\n<ul class=\"children\">\n<li id=\"comment-17459\" class=\"comment byuser comment-author-sagan1338 bypostauthor odd alt depth-2\">\n<div id=\"div-comment-17459\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/2.gravatar.com\/avatar\/ee7110bc70d8ac628b157c947ab9ff65ba60c57c5b771c71bc85f5589803bd28?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/probablydance.wordpress.com\" rel=\"ugc external nofollow\">Malte Skarupke<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-17459\">April 30, 2023 at 
18:03<\/a><\/div>\n<p>Thanks. I tried fixing it, but at some point WordPress changed how its editor worked. Now whenever I edit old blog posts, all the formatting gets messed up. So I had to undo the change, which still broke some formatting that worked before, but it\u2019s the lesser evil. This blog post now has to be frozen as is\u2026<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=17459#respond\" rel=\"nofollow\" data-commentid=\"17459\" data-postid=\"9623\" data-belowelement=\"div-comment-17459\" data-respondelement=\"respond\" data-replyto=\"Reply to Malte Skarupke\" aria-label=\"Reply to Malte Skarupke\">Reply<\/a><\/div>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<li id=\"comment-17502\" class=\"comment even thread-odd thread-alt depth-1\">\n<div id=\"div-comment-17502\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/1.gravatar.com\/avatar\/439b70b58642004c4211d4e0314f7a87e29736186b0db3fb50dd7a0af651780d?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/1.gravatar.com\/avatar\/439b70b58642004c4211d4e0314f7a87e29736186b0db3fb50dd7a0af651780d?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/1.gravatar.com\/avatar\/439b70b58642004c4211d4e0314f7a87e29736186b0db3fb50dd7a0af651780d?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/1.gravatar.com\/avatar\/439b70b58642004c4211d4e0314f7a87e29736186b0db3fb50dd7a0af651780d?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/1.gravatar.com\/avatar\/439b70b58642004c4211d4e0314f7a87e29736186b0db3fb50dd7a0af651780d?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/1.gravatar.com\/avatar\/439b70b58642004c4211d4e0314f7a87e29736186b0db3fb50dd7a0af651780d?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Tony 
Warnock<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-17502\">May 7, 2023 at 11:49<\/a><\/div>\n<p>With fast integer multiplication and division (even if the latter through shifting), one can just use a rational approximation of a Weyl sequence. Take (simple case), M=256, A=181, then compute the hash index as 181*X mod 256. If the input X doesn\u2019t systematically have lots of powers of two, the result is very nicely distributed. The main point is to choose A and M so that A\/M has only small partial quotients in its continued fraction expansion. This does not eliminate secondary clustering due to hash results being close. Fibonacci stuff works well with Fj-1\/Fj always having 1s in its continued fraction expansion.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=17502#respond\" rel=\"nofollow\" data-commentid=\"17502\" data-postid=\"9623\" data-belowelement=\"div-comment-17502\" data-respondelement=\"respond\" data-replyto=\"Reply to Tony Warnock\" aria-label=\"Reply to Tony Warnock\">Reply<\/a><\/div>\n<\/div>\n<\/li>\n<li id=\"comment-17545\" class=\"comment odd alt thread-even depth-1\">\n<div id=\"div-comment-17545\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/1.gravatar.com\/avatar\/4b365082481c342b03e8a223789095ecbd381aceb26f334106f96aed96c27f0f?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/1.gravatar.com\/avatar\/4b365082481c342b03e8a223789095ecbd381aceb26f334106f96aed96c27f0f?s=48&amp;d=identicon&amp;r=G 1x, 
https:\/\/1.gravatar.com\/avatar\/4b365082481c342b03e8a223789095ecbd381aceb26f334106f96aed96c27f0f?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/1.gravatar.com\/avatar\/4b365082481c342b03e8a223789095ecbd381aceb26f334106f96aed96c27f0f?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/1.gravatar.com\/avatar\/4b365082481c342b03e8a223789095ecbd381aceb26f334106f96aed96c27f0f?s=144&amp;d=identicon&amp;r=G 3x, https:\/\/1.gravatar.com\/avatar\/4b365082481c342b03e8a223789095ecbd381aceb26f334106f96aed96c27f0f?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\"><a class=\"url\" href=\"https:\/\/gms.tf\" rel=\"ugc external nofollow\">Georg Sauthoff<\/a><\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-17545\">May 20, 2023 at 14:53<\/a><\/div>\n<p>FWIW, \u2018Introduction to Algorithms\u2019 (2nd ed, Cormen et al., 1990) already gives a good exposition of what you call \u2018Fibonacci Hashing\u2019. However, there it\u2019s called \u2018The multiplication method\u2019 (Section 11.3.2). That section cites Knuth for the selection of a good multiplicand (i.e. (sqrt(5)-1)\/2 * 2^w where w is the word size) and illustrates right-shifting the result by w-p bits (where the table size m = 2^p). That means the relation to the golden ratio isn\u2019t mentioned, but the result is the same function as detailed in your article.<\/p>\n<p>The point is that in that book the multiplication method section directly follows \u2018The division method\u2019 section and isn\u2019t neglected at all. Since it follows the modulo prime hashing presentation in the division method section, the reader is rather left with the impression that modulo prime isn\u2019t the last word and that using multiplication method hashing is advantageous because of its simplicity. 
(the 4th edition also explicitly mentions that it\u2019s fast).<\/p>\n<p>It\u2019s probably fair to say that this book has been required reading in many computer science beginner courses all over the world since the 1990s (the 4th edition was published in 2022). Definitely to a larger extent than \u2018The Art of Computer Programming\u2019. Thus, your assessment that Fibonacci Hashing somehow was forgotten isn\u2019t very convincing.<\/p>\n<div class=\"reply\"><a class=\"comment-reply-link\" href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/?replytocom=17545#respond\" rel=\"nofollow\" data-commentid=\"17545\" data-postid=\"9623\" data-belowelement=\"div-comment-17545\" data-respondelement=\"respond\" data-replyto=\"Reply to Georg Sauthoff\" aria-label=\"Reply to Georg Sauthoff\">Reply<\/a><\/div>\n<\/div>\n<\/li>\n<li id=\"comment-17829\" class=\"comment even thread-odd thread-alt depth-1\">\n<div id=\"div-comment-17829\" class=\"comment-body\">\n<div class=\"comment-author vcard\"><img loading=\"lazy\" decoding=\"async\" class=\"avatar avatar-48 wp-hovercard-attachment grav-hashed grav-hijack\" src=\"https:\/\/2.gravatar.com\/avatar\/b98d4e4629f080be6e969a88c43c9d3a8f24b0a3110fbc57b14ca20ae5b4d9de?s=48&amp;d=identicon&amp;r=G\" srcset=\"https:\/\/2.gravatar.com\/avatar\/b98d4e4629f080be6e969a88c43c9d3a8f24b0a3110fbc57b14ca20ae5b4d9de?s=48&amp;d=identicon&amp;r=G 1x, https:\/\/2.gravatar.com\/avatar\/b98d4e4629f080be6e969a88c43c9d3a8f24b0a3110fbc57b14ca20ae5b4d9de?s=72&amp;d=identicon&amp;r=G 1.5x, https:\/\/2.gravatar.com\/avatar\/b98d4e4629f080be6e969a88c43c9d3a8f24b0a3110fbc57b14ca20ae5b4d9de?s=96&amp;d=identicon&amp;r=G 2x, https:\/\/2.gravatar.com\/avatar\/b98d4e4629f080be6e969a88c43c9d3a8f24b0a3110fbc57b14ca20ae5b4d9de?s=144&amp;d=identicon&amp;r=G 3x, 
https:\/\/2.gravatar.com\/avatar\/b98d4e4629f080be6e969a88c43c9d3a8f24b0a3110fbc57b14ca20ae5b4d9de?s=192&amp;d=identicon&amp;r=G 4x\" alt=\"\" width=\"48\" height=\"48\" \/> <cite class=\"fn\">Karsten Blees<\/cite><\/div>\n<div class=\"comment-meta commentmetadata\"><a href=\"https:\/\/probablydance.com\/2018\/06\/16\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/#comment-17829\">July 4, 2023 at 14:14<\/a><\/div>\n<p>Great article, I also often use multiplicative hashing in my hash tables.<\/p>\n<p>However, I think one important point is missing\u2026<\/p>\n<p>Back when I started programming in BASIC, I often did things like \u201cINT(6 * RND()) + 1\u201d to roll dice or \u201cINT(32 * RND())\u201d to pick from a deck of cards. RND() would return a real number in the range [0..1), and it was kinda obvious that multiplying by the desired size M and rounding down would yield an integer in the range [0..M-1].<\/p>\n<p>That\u2019s the gist of multiplicative hashing: you <em>multiply<\/em> by the table size rather than dividing by it, hence the name.<\/p>\n<p>The problem with this is that it only works well for uniformly distributed data such as random numbers, while consecutive input values produce largely the same hash code. Therefore it is desirable to quasi-randomize the input data, e.g. using a simple linear congruential generator.<\/p>\n<pre><code class=\"\" data-line=\"\">#include &lt;stdint.h&gt;\n\nuint32_t knuth_multiplicative_hash(uint32_t key, uint32_t M) {\n    uint64_t rndQ32_32 = (0x9e3779b9ull * key) &amp; 0xffffffffull; \/\/ (0.618034 * key) mod 1, as a 32-bit fraction\n    uint64_t hashQ32_32 = rndQ32_32 * M; \/\/ calculate the multiplicative hash\n    return (uint32_t) (hashQ32_32 &gt;&gt; 32); \/\/ strip fractional bits (i.e. 
round down)\n}\n<\/code><\/pre>\n<p>Now, if the table size M is a power of two, let\u2019s say 2^m, multiplying by the table size can be optimized to a shift-left by m, which can be combined with the shift-right that strips the fractional bits, i.e. instead of \u201c(rnd * M) &gt;&gt; 32\u201d you do \u201crnd &gt;&gt; (32 - m)\u201d. This is the version you presented in this article.<\/p>\n<p>Unfortunately, the reason why it\u2019s called multiplicative hashing, and the fact that it actually works for any table size (not just powers of two), is lost in this optimized form.<\/p>\n<p>Btw. \u201cfastrange\u201d is Knuth\u2019s multiplicative hash without the randomization step (or A := 1). So if you\u2019ve combined your power-of-two Fibonacci hash with fastrange, as mentioned in one of the comments, you probably ended up with the full version <img decoding=\"async\" class=\"emoji\" role=\"img\" draggable=\"false\" src=\"https:\/\/s0.wp.com\/wp-content\/mu-plugins\/wpcom-smileys\/twemoji\/2\/svg\/1f609.svg\" alt=\"\ud83d\ude09\" \/><\/p>\n<\/div>\n<\/li>\n<\/ol>\n<\/div>\n<\/blockquote>\n","protected":false},"excerpt":{"rendered":"<p class=\"excerpt\">Fibonacci Hashing: The Optimization that the World Forgot (or: a Better Alternative to Integer Modulo)<\/p>\n<p class=\"more-link-p\"><a class=\"more-link\" href=\"https:\/\/monodes.com\/predaelli\/2024\/11\/06\/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo\/\">Read more 
&rarr;<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"activitypub_content_warning":"","activitypub_content_visibility":"","activitypub_max_image_attachments":4,"activitypub_interaction_policy_quote":"anyone","activitypub_status":"federated","footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[1],"tags":[421,374],"class_list":["post-12012","post","type-post","status-publish","format-standard","hentry","category-senza-categoria","tag-define","tag-include"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p6daft-37K","jetpack-related-posts":[{"id":10940,"url":"https:\/\/monodes.com\/predaelli\/2023\/11\/03\/versioning-data-in-postgres-testing-a-git-like-approach-specfy\/","url_meta":{"origin":12012,"position":0},"title":"Versioning data in Postgres? Testing a git like approach &#8211; Specfy","author":"Paolo Redaelli","date":"2023-11-03","format":false,"excerpt":"Versioning data in Postgres? Testing a git like approach - Specfy is fashinating but I think that most of the time these two proposed alternatives fit most of the needs: In-Table versioning, the Wordpress way of doing thing. 
Add a a column version (or modify date) and SELECT the maximum\u2026","rel":"","context":"In &quot;Tricks&quot;","block_context":{"text":"Tricks","link":"https:\/\/monodes.com\/predaelli\/category\/documentations\/tricks\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":11626,"url":"https:\/\/monodes.com\/predaelli\/2024\/05\/05\/40-tools-for-ethical-hacking\/","url_meta":{"origin":12012,"position":1},"title":"40 tools for ethical hacking","author":"Paolo Redaelli","date":"2024-05-05","format":false,"excerpt":"I know many of them, but not everyone! Shame on me! Here are 40 tools for ethical hacking! Nmap: Network scanner used for network discovery and security auditing. Wireshark: Network protocol analyzer for packet inspection and troubleshooting. Metasploit: Penetration testing framework for exploiting vulnerabilities. John the Ripper: Password cracking tool\u2026","rel":"","context":"In &quot;Tricks&quot;","block_context":{"text":"Tricks","link":"https:\/\/monodes.com\/predaelli\/category\/documentations\/tricks\/"},"img":{"alt_text":"\ud83d\udd0d","src":"https:\/\/static.xx.fbcdn.net\/images\/emoji.php\/v9\/tc1\/1\/16\/1f50d.png","width":350,"height":200},"classes":[]},{"id":13371,"url":"https:\/\/monodes.com\/predaelli\/2025\/05\/16\/saturating-the-name-space\/","url_meta":{"origin":12012,"position":2},"title":"Saturating the name-space","author":"Paolo Redaelli","date":"2025-05-16","format":false,"excerpt":"We are saturating the name space for programming languages. These days I discovered the Odin Programming Language \"\"The Data-Oriented Language for Sane Software Development.\" According to its FAQs there are some things we may learn for Eiffel. 
Its guiding principles are Simplicity and readability Minimal: there ought to be one\u2026","rel":"","context":"In &quot;Eiffel&quot;","block_context":{"text":"Eiffel","link":"https:\/\/monodes.com\/predaelli\/category\/eiffel\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":13542,"url":"https:\/\/monodes.com\/predaelli\/2025\/06\/08\/futura-agenda\/","url_meta":{"origin":12012,"position":3},"title":"Futura agenda","author":"Paolo Redaelli","date":"2025-06-08","format":false,"excerpt":"5 strutture dati strane (ma utili) nell'informatica (\"5 Strange (but useful) Data Structures in Computer Science\"). Let's look at five weird data structures that will help you when the arrays and hashmaps of this world aren't enough. B-Tree Self-Balancing. We do have AVL-Trees which are self-balancing but are all-in-memory trees.\u2026","rel":"","context":"In &quot;Agenda&quot;","block_context":{"text":"Agenda","link":"https:\/\/monodes.com\/predaelli\/category\/agenda\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":872,"url":"https:\/\/monodes.com\/predaelli\/2016\/01\/06\/saturation-arithmetic-wikipedia-the-free-encyclopedia\/","url_meta":{"origin":12012,"position":4},"title":"Saturation arithmetic &#8211; Wikipedia, the free encyclopedia","author":"Paolo Redaelli","date":"2016-01-06","format":false,"excerpt":"Saturation arithmetic for integers has also been implemented in software for a number of programming languages including C, C++, and Eiffel. Sorgente: Saturation arithmetic - Wikipedia, the free encyclopedia Oh, I almost forgot that. 
Such a library was conceived by Geoff 24 who gave up using Eiffel in Mar 2003\u2026","rel":"","context":"In &quot;Eiffel&quot;","block_context":{"text":"Eiffel","link":"https:\/\/monodes.com\/predaelli\/category\/eiffel\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":15394,"url":"https:\/\/monodes.com\/predaelli\/2026\/03\/29\/nobody-gets-fired-for-picking-json-but-maybe-they-should\/","url_meta":{"origin":12012,"position":5},"title":"Nobody Gets Fired for Picking JSON, but Maybe They Should?","author":"Paolo Redaelli","date":"2026-03-29","format":"link","excerpt":"Nobody Gets Fired for Picking JSON, but Maybe They Should? By Miguel Young de la Sota Nobody Gets Fired for Picking JSON, but Maybe They Should? JSON is extremely popular but deeply flawed. This article discusses the details of JSON\u2019s design, how it\u2019s used (and misused), and how seemingly helpful\u2026","rel":"","context":"In &quot;Javascript&quot;","block_context":{"text":"Javascript","link":"https:\/\/monodes.com\/predaelli\/category\/javascript\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/posts\/12012","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/comments?post=12012"}],"version-history":[{"count":0,"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/posts\/12012\/revisions"}],"wp:attachment":[{"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/media?parent=12012"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/categor
ies?post=12012"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/tags?post=12012"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}