skip to main content
10.1145/1508244.1508283acmconferencesArticle/Chapter ViewAbstractPublication PagesasplosConference Proceedingsconference-collections
research-article

Architectural support for SWAR text processing with parallel bit streams: the inductive doubling principle

Published:07 March 2009Publication History

ABSTRACT

Parallel bit stream algorithms exploit the SWAR (SIMD within a register) capabilities of commodity processors in high-performance text processing applications such as UTF-8 to UTF-16 transcoding, XML parsing, string search and regular expression matching. Direct architectural support for these algorithms in future SWAR instruction sets could further increase performance as well as simplifying the programming task. A set of simple SWAR instruction set extensions are proposed for this purpose based on the principle of systematic support for inductive doubling as an algorithmic technique. These extensions are shown to significantly reduce instruction count in core parallel bit stream algorithms, often providing a 3X or better improvement. The extensions are also shown to be useful for SWAR programming in other application areas, including providing a systematic treatment for horizontal operations. An implementation model for these extensions involves relatively simple circuitry added to the operand fetch components in a pipelined processor.

References

  1. Krste Asanovic, Ras Bodik, Bryan Christopher Catanzaro, Joseph James Gebis, Parry Husbands, Kurt Keutzer, David A. Patterson, William Lester Plishker, John Shalf, Samuel Webb Williams, and Katherine A. Yelick. The landscape of parallel computing research: A view from berkeley. Technical Report UCB/EECS-2006-183, EECS Department, University of California, Berkeley, Dec 2006.Google ScholarGoogle Scholar
  2. Robert D. Cameron. u8u16 -- a high-speed UTF-8 to UTF-16 transcoder using parallel bit streams. Technical Report TR 2007-18, Simon Fraser University, Burnaby, BC, Canada, 2007.Google ScholarGoogle Scholar
  3. Robert D. Cameron. A case study in SIMD text processing with parallel bit streams. In ACM Symposium on Principles and Practice of Parallel Programming (PPoPP), Salt Lake City, Utah, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Robert D. Cameron, Kenneth S. Herdy, and Dan Lin. High performance XML parsing using parallel bit stream technology. In CASCON '08: Proceedings of the 2008 conference of the Centre for Advanced Studies on Collaborative research, Toronto, Ontario, Canada, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. James R. Green, Hanan Mahmoud, and Michel Dumontier. Towards real time protein identification using the Cell BE. In Workshop on Cell BE and Heterogeneous Multicore Systems: Architectures and Applications at CASCON '08, Toronto, Ontario, Canada, 2008.Google ScholarGoogle Scholar
  6. Kenneth S. Herdy, David S. Burggraf, and Robert D. Cameron. High performance GML to SVG transformation for the visual presentation of geographic data in web-based mapping systems. In Proceedings of SVG Open 2008, Nuremburg, Germany, 2008.Google ScholarGoogle Scholar
  7. Donald E. Knuth. The Art of Computer Programming Volume 4 Pre-Fascicle 1A: Bitwise Tricks and Techniques. Draft of 22 December 2008, Stanford University.Google ScholarGoogle Scholar
  8. R. Raman and D.S. Wise. Converting to and from dilated integers. IEEE Transactions on Computers, 57(4):567--573, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Chester Rebeiro, A. David Selvakumar, and A. S. L. Devi. Bitslice implementation of AES. In David Pointcheval, Yi Mu, and Kefei Chen, editors, CANS, volume 4301 of Lecture Notes in Computer Science, pages 203--212. Springer, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Leo Stocco and Günther Schrack. Integer dilation and contraction for quadtrees and octrees. In Proceedings of the IEEE Pacific Rim Conference on Communications, Computers, and Signal Processing, pages 426--428, Victoria, B.C., 1995.Google ScholarGoogle ScholarCross RefCross Ref
  11. Henry S. Warren. Hacker's Delight. Addison-Wesley, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Architectural support for SWAR text processing with parallel bit streams: the inductive doubling principle

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader
    About Cookies On This Site

    We use cookies to ensure that we give you the best experience on our website.

    Learn more

    Got it!