
Besides the standard tricks to speed up the running time, like threads and mmap, the biggest speedup comes from using a custom hash table tuned for the data. I implemented 5 different versions of hash tables, including the Swiss hash table and more recently researched ones. None of them beats a simple linear-probing approach, because there are so few keys: only a couple hundred of them. Tuning a custom hash function to minimize key collisions helps more than anything.

There are also lots of SIMD operations, for which Zig has great support in the form of vector programming.

The only novel thing I did was to line up the temperature data from the back instead of from the front for the SIMD operations. That is, for temperatures 12.3, 4.5, 67.8, and 0.1, I line them up by the last digit after the decimal point before packing them into a SIMD register. That way I know their exact positions in the register.



I had similar results exploring hash tables for a nearly identical task here[0].

It seems like Swiss tables are tuned for workloads consisting mostly of gets of keys that are not present. Otherwise the two-level lookup is an extra cost with little benefit.

IIRC Joad Nacer says he is using a hash table for this (again, essentially identical) task on highload.fun[1]. This was sort of surprising to the other competitors, because the top couple of solutions below that one use some variety of bloom filter instead.

0: https://easyperf.net/blog/2022/05/28/Performance-analysis-an...

1: https://highload.fun/tasks/2/leaderboard


Yes. Swiss tables are good for non-duplicate lookups. For highly duplicate data, the extra memory fetch for the metadata byte really kills performance.

For this contest, there are a billion lookups over only a couple hundred distinct keys. That means for most lookups the full path of locating the key is executed: hashing, metadata comparison, hashed-value comparison, and full key comparison. That's actually quite expensive, and removing any part of that execution path really helps.


When it comes to custom hash tables, I always remember this video [0] by Strager about implementing the perfect hash table and hash function. Fascinating insights and perf hints.

[0]: https://youtu.be/DMQ_HcNSOAI


Many videos on hash tables are great. I found this one particularly good.

[https://www.youtube.com/watch?v=M2fKMP47slQ] C++Now 2018: You Can Do Better than std::unordered_map: New Improvements to Hash Table Performance


Gosh, and thanks for sharing. Hope this never becomes mainstream during coding interviews. Imagine: please list the top 10 hash table implementations and their time and space complexity.


Ha. I didn't know the details of any of these tables beforehand; it was just a matter of doing a bit of research. The main takeaway is that there's no perfect hash table. It all depends on the data and the operations. Contests like this are the perfect excuse to dive deep into these kinds of topics.



