Abstract
We demonstrate an efficient data-parallel algorithm for building large hash tables of millions of elements in real time. We consider two parallel algorithms for the construction: a classical sparse perfect hashing approach, and cuckoo hashing, which packs elements densely by allowing an element to be stored in one of multiple possible locations. Our construction is a hybrid approach that uses both algorithms. We measure the construction time, access time, and memory usage of our implementations and demonstrate real-time performance on large datasets: for 5 million key-value pairs, we construct a hash table in 35.7 ms using 1.42 times as much memory as the input data itself, and we can access all the elements in that hash table in 15.3 ms. For comparison, sorting the same data requires 36.6 ms, but accessing all the elements via binary search requires 79.5 ms. Furthermore, we show how our hashing methods can be applied to two graphics applications: 3D surface intersection for moving data and geometric hashing for image matching.
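To make the cuckoo hashing idea concrete, the following is a minimal sequential sketch in Python. It is not the paper's data-parallel GPU construction or its hybrid scheme. It only illustrates how each key has several candidate slots and how an insertion that finds its slot occupied evicts the occupant to one of that occupant's other candidates. The names `CuckooTable` and `build_table`, the choice of three hash functions, the linear hash family, the eviction limit, and the `space_factor` of 1.5 are all assumptions made for illustration.

```python
import random


class CuckooTable:
    """Sequential cuckoo hash table sketch for non-negative integer keys."""

    PRIME = 4294967291  # largest prime below 2**32; keys are assumed smaller

    def __init__(self, capacity, num_hashes=3, max_evictions=100, seed=0):
        self.capacity = capacity
        self.max_evictions = max_evictions
        self.slots = [None] * capacity  # each slot holds (key, value) or None
        rng = random.Random(seed)
        # One (a, b) pair per hash function: h(k) = ((a*k + b) mod p) mod capacity
        self.params = [
            (rng.randrange(1, self.PRIME), rng.randrange(self.PRIME))
            for _ in range(num_hashes)
        ]

    def _positions(self, key):
        """Return the candidate slots for a key, one per hash function."""
        return [((a * key + b) % self.PRIME) % self.capacity
                for a, b in self.params]

    def lookup(self, key):
        """Probe every candidate slot; at most num_hashes reads per query."""
        for pos in self._positions(key):
            entry = self.slots[pos]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def insert(self, key, value):
        """Insert a pair, evicting occupants as needed. Returns False on failure."""
        positions = self._positions(key)
        for pos in positions:  # overwrite if the key is already stored
            entry = self.slots[pos]
            if entry is not None and entry[0] == key:
                self.slots[pos] = (key, value)
                return True
        entry = (key, value)
        pos = positions[0]
        for _ in range(self.max_evictions):
            if self.slots[pos] is None:
                self.slots[pos] = entry
                return True
            # Take the slot and pick up the previous occupant.
            entry, self.slots[pos] = self.slots[pos], entry
            # Send the evicted entry to its next candidate slot in round-robin order.
            candidates = self._positions(entry[0])
            i = candidates.index(pos)
            pos = candidates[(i + 1) % len(candidates)]
        return False  # eviction chain too long; the caller should rebuild


def build_table(pairs, space_factor=1.5, max_attempts=100):
    """Build a table, retrying with fresh hash functions if an insert fails."""
    capacity = max(1, int(len(pairs) * space_factor))
    for seed in range(max_attempts):
        table = CuckooTable(capacity, seed=seed)
        if all(table.insert(k, v) for k, v in pairs):
            return table
    raise RuntimeError("could not place all keys; try a larger space_factor")
```

A table built with `build_table([(k, k * k) for k in range(1000)])` answers `lookup` by reading at most three slots per query, whereas binary search over sorted data needs a number of probes that grows with the logarithm of the input size. Rebuilding from scratch with new hash functions on failure is the simplest recovery strategy for this sketch; the failed attempt is discarded, so the entry left without a slot is not lost.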
Real-time parallel hashing on the GPU
SIGGRAPH Asia '09: ACM SIGGRAPH Asia 2009 papers