
Automated Real-Time Analysis of Streaming Big and Dense Data on Reconfigurable Platforms

Published: 19 December 2016

Abstract

We propose SSketch, a novel automated framework for efficient analysis of dynamic big data with dense (non-sparse) correlation matrices on reconfigurable platforms. SSketch targets streaming applications in which each data sample can be processed only once and storage is severely limited. The framework adaptively learns from the input data stream and updates a corresponding ensemble of lower-dimensional data structures, also known as a sketch matrix. We introduce a new sketching methodology that transforms big data with dense correlations into an ensemble of lower-dimensional subspaces in a form suited to acceleration on reconfigurable hardware. The method is scalable, and it significantly reduces costly memory interactions and improves matrix computation performance by exploiting the coarse-grained parallelism in the dataset. SSketch provides an automated optimization methodology that creates the most accurate data sketch for a given set of user-defined constraints, including runtime and power, as well as platform constraints such as memory. To facilitate automation, SSketch takes a Hardware/Software (HW/SW) co-design approach: it provides an Application Programming Interface (API) that can be customized for rapid prototyping of an arbitrary matrix-based data analysis algorithm. Proof-of-concept evaluations on a variety of visual datasets with more than 11 million non-zeros demonstrate up to a 200-fold speedup of our hardware-accelerated realization of SSketch over a software-based deployment on a general-purpose processor.
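To make the single-pass idea concrete, the following is a minimal software sketch of one way a stream of samples could be folded into a lower-dimensional representation. It is an illustration under stated assumptions, not the SSketch algorithm or its API: the abstract does not specify the update rule, so this example assumes a factorization A ≈ D V in which D collects a few samples from the stream and V holds each sample's coefficients. The names `stream_sketch`, `max_atoms`, and `tol` are hypothetical, and a dense least-squares solve stands in for whatever solver the hardware implementation uses.

```python
import numpy as np

def stream_sketch(columns, max_atoms, tol):
    """Visit each sample once; return (D, V) such that A is roughly D @ V.

    Hypothetical illustration only: `max_atoms` caps the dictionary width
    (a stand-in for a memory constraint) and `tol` is the relative
    projection error above which a sample is kept as a new dictionary column.
    """
    atoms = []   # unit-norm samples retained from the stream
    coeffs = []  # one coefficient vector per sample (length grows with atoms)
    for a in columns:
        a = np.asarray(a, dtype=float)
        norm = np.linalg.norm(a)
        if norm == 0.0:
            coeffs.append(np.zeros(len(atoms)))
            continue
        if atoms:
            D = np.column_stack(atoms)
            # Least-squares projection of the sample onto the current subspace.
            c, *_ = np.linalg.lstsq(D, a, rcond=None)
            err = np.linalg.norm(a - D @ c) / norm
        else:
            c, err = np.zeros(0), 1.0
        if err > tol and len(atoms) < max_atoms:
            # Poorly represented sample: keep it as a new dictionary column.
            atoms.append(a / norm)
            c = np.zeros(len(atoms))
            c[-1] = norm
        coeffs.append(c)
    k = len(atoms)
    D = np.column_stack(atoms) if atoms else np.zeros((0, 0))
    # Zero-pad early coefficient vectors to the final dictionary width.
    V = np.zeros((k, len(coeffs)))
    for j, c in enumerate(coeffs):
        V[:len(c), j] = c
    return D, V
```

Calling `stream_sketch(A.T, max_atoms=10, tol=1e-8)` on a NumPy array `A` feeds its columns one at a time. In this simplified version the raw samples are never revisited; only the retained columns and the coefficients are stored, which is the property the streaming setting requires.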

