DOI: 10.1145/3466752.3480133

SISA: Set-Centric Instruction Set Architecture for Graph Mining on Processing-in-Memory Systems

Published: 17 October 2021

Abstract

Simple graph algorithms such as PageRank have been the target of numerous hardware accelerators. Yet there also exist much more complex graph mining algorithms for problems such as clustering or maximal clique listing. These algorithms are memory-bound and thus could be accelerated by hardware techniques such as Processing-in-Memory (PIM). However, they also come with non-straightforward parallelism and complicated memory access patterns. In this work, we address this problem with a simple yet surprisingly powerful observation: operations on sets of vertices, such as intersection or union, form a large part of many complex graph mining algorithms, and can offer rich and simple parallelism at multiple levels. This observation drives our cross-layer design, in which we (1) expose set operations using a novel programming paradigm, (2) express and execute these operations efficiently with carefully designed set-centric ISA extensions called SISA, and (3) use PIM to accelerate SISA instructions. The key design idea is to alleviate the bandwidth needs of SISA instructions by mapping set operations to two types of PIM: in-DRAM bulk bitwise computing for bitvectors representing high-degree vertices, and near-memory logic layers for integer arrays representing low-degree vertices. Set-centric SISA-enhanced algorithms are efficient and outperform hand-tuned baselines, offering more than 10× speedup over the established Bron-Kerbosch algorithm for listing maximal cliques. We deliver more than 10 SISA set-centric algorithm formulations, illustrating SISA’s wide applicability.
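To make the set-centric idea concrete, the following is a minimal software sketch, not SISA itself and not the authors' implementation. It assumes two neighborhood representations like those named in the abstract: a sorted integer array, intersected with a merge loop, and a bitvector, intersected with a single bitwise AND. Python integers stand in for bitvectors here, and every function name is hypothetical. The sketch then writes the basic Bron-Kerbosch recursion (without pivoting) purely in terms of set intersections on bitvectors.

```python
from typing import Dict, List


def to_bitvector(vertices: List[int]) -> int:
    """Encode a vertex set as a bitvector (bit v is set iff v is in the set)."""
    bits = 0
    for v in vertices:
        bits |= 1 << v
    return bits


def intersect_sorted(a: List[int], b: List[int]) -> List[int]:
    """Merge-style intersection of two sorted integer arrays."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def intersect_bitvectors(x: int, y: int) -> int:
    """Intersection of two bitvector sets is one bulk bitwise AND."""
    return x & y


def bron_kerbosch(adj_bits: Dict[int, int], r: List[int], p: int, x: int,
                  cliques: List[List[int]]) -> None:
    """Basic Bron-Kerbosch without pivoting; P and X are bitvector sets."""
    if p == 0 and x == 0:
        cliques.append(sorted(r))  # R is a maximal clique
        return
    candidates = p  # snapshot of P to iterate over
    v = 0
    while candidates:
        if candidates & 1:
            nv = adj_bits[v]
            # Recurse on P ∩ N(v) and X ∩ N(v): both are set intersections.
            bron_kerbosch(adj_bits, r + [v], p & nv, x & nv, cliques)
            p &= ~(1 << v)  # P = P \ {v}
            x |= 1 << v     # X = X ∪ {v}
        candidates >>= 1
        v += 1


# Hypothetical example graph: triangle 0-1-2 plus the edge 2-3.
adjacency = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
adj_bits = {v: to_bitvector(ns) for v, ns in adjacency.items()}

maximal_cliques: List[List[int]] = []
bron_kerbosch(adj_bits, [], to_bitvector(list(adjacency)), 0, maximal_cliques)
print(maximal_cliques)
```

In this sketch nearly all of the work in the recursion is the two intersections per candidate vertex, which is the kind of operation the abstract proposes to offload to PIM. Which representation to use for a given neighborhood would depend on vertex degree; the abstract does not give a threshold, so none is assumed here.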

References

[1]
Christopher R Aberger, Andrew Lamb, Susan Tu, Andres Nötzli, Kunle Olukotun, and Christopher Ré. 2017. Emptyheaded: A relational engine for graph processing. ACM Transactions on Database Systems (TODS) 42, 4 (2017), 1–44.
[2]
Abraham Addisie, Hiwot Kassa, Opeoluwa Matthews, and Valeria Bertacco. 2018. Heterogeneous memory subsystem for natural graph analytics. In 2018 IEEE International Symposium on Workload Characterization (IISWC). IEEE, 134–145.
[3]
Shaizeen Aga, Supreet Jeloka, Arun Subramaniyan, Satish Narayanasamy, David Blaauw, and Reetuparna Das. 2017. Compute caches. In 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, 481–492.
[4]
Charu C Aggarwal and Haixun Wang. 2010. Managing and mining graph data. Vol. 40. Springer.
[5]
Rakesh Agrawal, Ramakrishnan Srikant, 1994. Fast algorithms for mining association rules. In Proc. 20th int. conf. very large data bases, VLDB, Vol. 1215. Citeseer, 487–499.
[6]
Junwhan Ahn, Sungpack Hong, Sungjoo Yoo, Onur Mutlu, and Kiyoung Choi. 2015. A scalable processing-in-memory accelerator for parallel graph processing. In ISCA.
[7]
Junwhan Ahn, Sungjoo Yoo, Onur Mutlu, and Kiyoung Choi. 2015. PIM-enabled instructions: a low-overhead, locality-aware processing-in-memory architecture. In Computer Architecture (ISCA), 2015 ACM/IEEE 42nd Annual International Symposium on. IEEE, 336–348.
[8]
Mohammad Al Hasan, Vineet Chaoji, Saeed Salem, and Mohammed Zaki. 2006. Link prediction using supervised learning. In SDM06: workshop on link analysis, counter-terrorism and security.
[9]
Mohammad Al Hasan and Mohammed J Zaki. 2011. A survey of link prediction in social networks. In Social network data analytics. Springer, 243–275.
[10]
Shaahin Angizi and Deliang Fan. 2019. Graphide: A graph processing accelerator leveraging in-dram-computing. In Proceedings of the 2019 on Great Lakes Symposium on VLSI. 45–50.
[11]
Shaahin Angizi, Jiao Sun, Wei Zhang, and Deliang Fan. 2019. GraphS: A graph processing accelerator leveraging SOT-MRAM. In 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 378–383.
[12]
Omar Batarfi, Radwa El Shawi, Ayman G Fayoumi, Reza Nouri, Ahmed Barnawi, and Sherif Sakr. 2015. Large scale graph processing systems: survey and an experimental evaluation. Cluster Computing 18, 3 (2015), 1189–1213.
[13]
Peter W Battaglia, Jessica B Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Vinicius Zambaldi, Mateusz Malinowski, Andrea Tacchetti, David Raposo, Adam Santoro, Ryan Faulkner, 2018. Relational inductive biases, deep learning, and graph networks. arXiv preprint arXiv:1806.01261(2018).
[14]
Scott Beamer, Krste Asanović, and David Patterson. 2015. The GAP benchmark suite. arXiv preprint arXiv:1508.03619(2015).
[15]
Tal Ben-Nun, Maciej Besta, Simon Huber, Alexandros Nikolaos Ziogas, Daniel Peter, and Torsten Hoefler. 2019. A modular benchmarking infrastructure for high-performance and reproducible deep learning. In 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE, 66–77.
[16]
Maciej Besta, Armon Carigiet, Zur Vonarburg-Shmaria, Kacper Janda, Lukas Gianinazzi, and Torsten Hoefler. 2020. High-performance parallel graph coloring with strong guarantees on work, depth, and quality. arXiv preprint arXiv:2008.11321(2020).
[17]
Maciej Besta, Raphael Grob, Cesare Miglioli, Nicola Bernold, Grzegorz Kwasniewski, Gabriel Gjini, Raghavendra Kanakagiri, Saleh Ashkboos, Lukas Gianinazzi, Nikoli Dryden, 2021. Motif Prediction with Graph Neural Networks. arXiv preprint arXiv:2106.00761(2021).
[18]
Maciej Besta and Torsten Hoefler. 2015. Accelerating irregular computations with hardware transactional memory and active messages. In Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing. 161–172.
[19]
Maciej Besta, Raghavendra Kanakagiri, Harun Mustafa, Mikhail Karasikov, Gunnar Rätsch, Torsten Hoefler, and Edgar Solomonik. 2020. Communication-efficient jaccard similarity for high-performance distributed genome comparisons. In 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE, 1122–1132.
[20]
Maciej Besta, Florian Marending, Edgar Solomonik, and Torsten Hoefler. 2017. SlimSell: A Vectorizable Graph Representation for Breadth-First Search. In Parallel and Distributed Processing Symposium (IPDPS), 2017 IEEE International. IEEE, 32–41.
[21]
Maciej Besta, Michał Podstawski, Linus Groner, Edgar Solomonik, and Torsten Hoefler. 2017. To Push or To Pull: On Reducing Communication and Synchronization in Graph Computations. In Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing. ACM, 93–104.
[22]
Maciej Besta, Dimitri Stanojevic, Johannes De Fine Licht, Tal Ben-Nun, and Torsten Hoefler. 2019. Graph Processing on FPGAs: Taxonomy, Survey, Challenges. arXiv preprint arXiv:1903.06697(2019).
[23]
Maciej Besta, Zur Vonarburg-Shmaria, Yannick Schaffner, Leonardo Schwarz, Grzegorz Kwasniewski, Lukas Gianinazzi, Jakub Beranek, Kacper Janda, Tobias Holenstein, Sebastian Leisinger, 2021. GraphMineSuite: Enabling High-Performance and Programmable Graph Mining Algorithms with Set Algebra. VLDB (2021).
[24]
Guy E. Blelloch and Bruce M. Maggs. 2010. Parallel Algorithms (2ed.). Chapman & Hall/CRC, 25.
[25]
Otakar Boruvka. 1926. O jistém problému minimálním. (1926).
[26]
Coen Bron and Joep Kerbosch. 1973. Algorithm 457: finding all cliques of an undirected graph. Commun. ACM 16, 9 (1973), 575–577.
[27]
Lázaro Bustio, René Cumplido, Raudel Hernández, José M Bande, and Claudia Feregrino. 2015. Frequent itemsets mining in data streams using reconfigurable hardware. In International Workshop on New Frontiers in Mining Complex Patterns. Springer, 32–45.
[28]
Lázaro Bustio-Martínez, René Cumplido, Martín Letras-Luna, Claudia Feregrino Uribe, Raudel Hernández-León, and José M Bande-Serrano. 2017. Approximate frequent itemsets mining on data streams using hashing and lexicographie order in hardware. In 2017 IEEE 8th Latin American Symposium on Circuits & Systems (LASCAS). IEEE, 1–4.
[29]
Frédéric Cazals and Chinmay Karande. 2008. A note on the problem of reporting maximal cliques. Theoretical Computer Science 407, 1-3 (2008), 564–568.
[30]
Deepayan Chakrabarti and Christos Faloutsos. 2006. Graph mining: Laws, generators, and algorithms. ACM computing surveys (CSUR) 38, 1 (2006), 2.
[31]
Nagadastagiri Challapalle, Sahithi Rampalli, Linghao Song, Nandhini Chandramoorthy, Karthik Swaminathan, John Sampson, Yiran Chen, and Vijaykrishnan Narayanan. 2020. GaaS-X: Graph Analytics Accelerator Supporting Sparse Data Representation using Crossbar Architectures. ISCA (2020).
[32]
Rohit Chandra, Leo Dagum, David Kohr, Ramesh Menon, Dror Maydan, and Jeff McDonald. 2001. Parallel programming in OpenMP. Morgan kaufmann.
[33]
Hongzhi Chen, Miao Liu, Yunjian Zhao, Xiao Yan, Da Yan, and James Cheng. 2018. G-Miner: an efficient task-oriented graph mining system. In Proceedings of the Thirteenth EuroSys Conference. ACM, 32.
[34]
Langshi Chen, Jiayu Li, Ariful Azad, Lei Jiang, Madhav Marathe, Anil Vullikanti, Andrey Nikolaev, Egor Smirnov, Ruslan Israfilov, and Judy Qiu. 2019. A GraphBLAS approach for subgraph counting. arXiv preprint arXiv:1903.04395(2019).
[35]
Xuhao Chen, Roshan Dathathri, Gurbinder Gill, and Keshav Pingali. 2019. Pangolin: An Efficient and Flexible Graph Mining System on CPU and GPU. arXiv preprint arXiv:1911.06969(2019).
[36]
Jiefeng Cheng, Jeffrey Xu Yu, Bolin Ding, S Yu Philip, and Haixun Wang. 2008. Fast graph pattern matching. In 2008 IEEE 24th International Conference on Data Engineering. IEEE, 913–922.
[37]
James Cheng, Linhong Zhu, Yiping Ke, and Shumo Chu. 2012. Fast algorithms for maximal clique enumeration with limited memory. In Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. 1240–1248.
[38]
Norishige Chiba and Takao Nishizeki. 1985. Arboricity and subgraph listing algorithms. SIAM Journal on computing 14, 1 (1985), 210–223.
[39]
Diane J Cook and Lawrence B Holder. 2006. Mining graph data. John Wiley & Sons.
[40]
Jonathan Corbet, Alessandro Rubini, and Greg Kroah-Hartman. 2005. Linux device drivers. ” O’Reilly Media, Inc.”.
[41]
Luigi P Cordella, Pasquale Foggia, Carlo Sansone, and Mario Vento. 2004. A (sub) graph isomorphism algorithm for matching large graphs. IEEE transactions on pattern analysis and machine intelligence 26, 10(2004), 1367–1372.
[42]
Thomas H Cormen, Charles E Leiserson, Ronald L Rivest, and Clifford Stein. 2009. Introduction to algorithms. MIT press.
[43]
Guohao Dai, Tianhao Huang, Yuze Chi, Jishen Zhao, Guangyu Sun, Yongpan Liu, Yu Wang, Yuan Xie, and Huazhong Yang. 2018. Graphh: A processing-in-memory architecture for large-scale graph processing. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 38, 4(2018), 640–653.
[44]
Maximilien Danisch, Oana Balalau, and Mauro Sozio. 2018. Listing k-cliques in sparse real-world graphs. In Proceedings of the 2018 World Wide Web Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 589–598.
[45]
William HE Day and David Sankoff. 1986. Computational complexity of inferring phylogenies by compatibility. Systematic Biology 35, 2 (1986), 224–229.
[46]
Laxman Dhulipala, Guy E Blelloch, and Julian Shun. 2018. Theoretically efficient parallel graph algorithms can be fast and scalable. In Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures. 393–404.
[47]
Laxman Dhulipala, Charles McGuffey, Hongbo Kang, Yan Gu, Guy Blelloch, Phillip Gibbons, and Julian Shun. 2020. Sage: Parallel Semi-Asymmetric Graph Algorithms for NVRAMs. PVLDB (2020).
[48]
Vinicius Dias, Carlos HC Teixeira, Dorgival Guedes, Wagner Meira, and Srinivasan Parthasarathy. 2019. Fractal: A General-Purpose Graph Pattern Mining System. In Proceedings of the 2019 International Conference on Management of Data. ACM, 1357–1374.
[49]
Sumeet Dua and Xian Du. 2016. Data mining and machine learning in cybersecurity. CRC press.
[50]
John D Eblen, Charles A Phillips, Gary L Rogers, and Michael A Langston. 2012. The maximum clique enumeration problem: algorithms, applications, and implementations. In BMC bioinformatics, Vol. 13. Springer, S5.
[51]
David Eppstein, Maarten Löffler, and Darren Strash. 2010. Listing All Maximal Cliques in Sparse Graphs in Near-Optimal Time. In Algorithms and Computation - 21st International Symposium, ISAAC 2010, Jeju Island, Korea, December 15-17, 2010, Proceedings, Part I. 403–414. https://doi.org/10.1007/978-3-642-17517-6_36
[52]
Brian Gallagher. 2006. Matching Structure and Semantics: A Survey on Graph-Based Pattern Matching. In AAAI Fall Symposium: Capturing and Using Patterns for Evidence Detection. 45–53.
[53]
Fei Gao, Georgios Tziantzioulis, and David Wentzlaff. 2019. Computedram: In-memory compute using off-the-shelf drams. In Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture. 100–113.
[54]
Mingyu Gao, Grant Ayers, and Christos Kozyrakis. 2015. Practical near-data processing for in-memory analytics frameworks. In 2015 International Conference on Parallel Architecture and Compilation (PACT). IEEE, 113–124.
[55]
Mingyu Gao and Christos Kozyrakis. 2016. HRL: Efficient and flexible reconfigurable logic for near-data processing. In 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA). Ieee, 126–137.
[56]
Saugata Ghose, Amirali Boroumand, Jeremie S Kim, Juan Gómez-Luna, and Onur Mutlu. 2019. Processing-in-Memory: A Workload-driven Perspective. IBM JRD (2019).
[57]
Saugata Ghose, Kevin Hsieh, Amirali Boroumand, Rachata Ausavarungnirun, and Onur Mutlu. 2019. The processing-in-memory paradigm: Mechanisms to enable adoption. In Beyond-CMOS Technologies for Next Generation Computer Design. Springer, 133–194.
[58]
Lukas Gianinazzi, Maciej Besta, Yannick Schaffner, and Torsten Hoefler. 2021. Parallel Algorithms for Finding Large Cliques in Sparse Graphs. In Proceedings of the 33rd ACM Symposium on Parallelism in Algorithms and Architectures. 243–253.
[59]
Lukas Gianinazzi, Maximilian Fries, Nikoli Dryden, Tal Ben-Nun, and Torsten Hoefler. 2021. Learning Combinatorial Node Labeling Algorithms. arXiv preprint arXiv:2106.03594(2021).
[60]
Lukas Gianinazzi, Pavel Kalvoda, Alessandro De Palma, Maciej Besta, and Torsten Hoefler. 2018. Communication-avoiding parallel minimum cuts and connected components. ACM SIGPLAN Notices 53, 1 (2018), 219–232.
[61]
David Gibson, Ravi Kumar, and Andrew Tomkins. 2005. Discovering large dense subgraphs in massive graphs. In Proceedings of the 31st international conference on Very large data bases. 721–732.
[62]
Justin Gilmer, Samuel S Schoenholz, Patrick F Riley, Oriol Vinyals, and George E Dahl. 2017. Neural message passing for quantum chemistry. In International Conference on Machine Learning. PMLR, 1263–1272.
[63]
Juan Gómez-Luna, Izzat El Hajj, Ivan Fernandez, Christina Giannoula, Geraldo F Oliveira, and Onur Mutlu. 2021. Benchmarking a New Paradigm: An Experimental Analysis of a Real Processing-in-Memory Architecture. arXiv preprint arXiv:2105.03814(2021).
[64]
Nastaran Hajinazar, Geraldo F Oliveira, Sven Gregorio, João Dinis Ferreira, Nika Mansouri Ghiasi, Minesh Patel, Mohammed Alser, Saugata Ghose, Juan Gómez-Luna, and Onur Mutlu. 2021. SIMDRAM: a framework for bit-serial SIMD processing using DRAM. In Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems. 329–345.
[65]
Tae Jun Ham, Lisa Wu, Narayanan Sundaram, Nadathur Satish, and Margaret Martonosi. 2016. Graphicionado: A high-performance and energy-efficient accelerator for graph analytics. In Microarchitecture (MICRO), 2016 49th Annual IEEE/ACM International Symposium on. IEEE, 1–13.
[66]
J Han and M Kamber. 2006. Data Mining Concepts and Techniques (A. Stephan, Ed.), 2nd edn., vol. 40.
[67]
Shuo Han, Lei Zou, and Jeffrey Xu Yu. 2018. Speeding Up Set Intersections in Graph Algorithms using SIMD Instructions. In Proceedings of the 2018 International Conference on Management of Data. ACM, 1587–1602.
[68]
Lei He. 2019. EnGN: A High-Throughput and Energy-Efficient Accelerator for Large Graph Neural Networks. arXiv preprint arXiv:1909.00155(2019).
[69]
Eric Robert Hein. 2018. Near-data processing for dynamic graph analytics. Ph.D. Dissertation. Georgia Institute of Technology.
[70]
Wim Heirman, Trevor Carlson, and Lieven Eeckhout. 2012. Sniper: Scalable and accurate parallel multi-core simulation. In 8th International Summer School on Advanced Computer Architecture and Compilation for High-Performance and Embedded Systems (ACACES-2012). High-Performance and Embedded Architecture and Compilation Network of …, 91–94.
[71]
Maurice Herlihy, Nir Shavit, Victor Luchangco, and Michael Spear. 2020. The art of multiprocessor programming. Newnes.
[72]
Shohei Hido and Hiroyuki Kawano. 2005. AMIOT: induced ordered tree mining in tree-structured databases. In Fifth IEEE International Conference on Data Mining (ICDM’05). IEEE, 8–pp.
[73]
Torsten Hoefler and Roberto Belli. 2015. Scientific benchmarking of parallel computing systems: twelve ways to tell the masses when reporting performance results. In Proceedings of the international conference for high performance computing, networking, storage and analysis. 1–12.
[74]
Tamás Horváth, Thomas Gärtner, and Stefan Wrobel. 2004. Cyclic pattern kernels for predictive graph mining. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 158–167.
[75]
Kevin Hsieh, Samira Khan, Nandita Vijaykumar, Kevin K Chang, Amirali Boroumand, Saugata Ghose, and Onur Mutlu. 2016. Accelerating pointer chasing in 3D-stacked memory: Challenges, mechanisms, evaluation. In 2016 IEEE 34th International Conference on Computer Design (ICCD). IEEE, 25–32.
[76]
Tianhao Huang, Guohao Dai, Yu Wang, and Huazhong Yang. 2018. HyVE: Hybrid vertex-edge memory hierarchy for energy-efficient graph processing. In 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 973–978.
[77]
Yu Huang, Long Zheng, Xiaofei Liao, Hai Jin, Pengcheng Yao, and Chuangyi Gui. 2019. RAGra: Leveraging Monolithic 3D ReRAM for Massively-Parallel Graph Processing. In 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 1273–1276.
[78]
Anand Padmanabha Iyer, Zaoxing Liu, Xin Jin, Shivaram Venkataraman, Vladimir Braverman, and Ion Stoica. 2018. {ASAP}: Fast, Approximate Graph Pattern Mining at Scale. In 13th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 18). 745–761.
[79]
Said Jabbour, Nizar Mhadhbi, Badran Raddaoui, and Lakhdar Sais. 2018. Pushing the Envelope in Overlapping Communities Detection. In International Symposium on Intelligent Data Analysis. Springer, 151–163.
[80]
Kasra Jamshidi, Rakesh Mahadasa, and Keval Vora. 2020. Peregrine: a pattern-aware graph mining system. In Proceedings of the Fifteenth European Conference on Computer Systems. 1–16.
[81]
Raymond Austin Jarvis and Edward A Patrick. 1973. Clustering using a similarity measure based on shared near neighbors. IEEE Transactions on computers 100, 11 (1973), 1025–1034.
[82]
Thomas Jech. 2013. Set theory. Springer Science & Business Media.
[83]
Joe Jeddeloh and Brent Keeth. 2012. Hybrid memory cube new DRAM architecture increases density and performance. In VLSI Technology (VLSIT), 2012 Symposium on. IEEE, 87–88.
[84]
Chuntao Jiang, Frans Coenen, and Michele Zito. 2013. A survey of frequent subgraph mining algorithms. The Knowledge Engineering Review 28, 1 (2013), 75–105.
[85]
Daxin Jiang and Jian Pei. 2009. Mining frequent cross-graph quasi-cliques. ACM Transactions on Knowledge Discovery from Data (TKDD) 2, 4(2009), 1–42.
[86]
Aparna Joshi, Yu Zhang, Petko Bogdanov, and Jeong-Hyon Hwang. 2018. An Efficient System for Subgraph Discovery. In 2018 IEEE International Conference on Big Data (Big Data). IEEE, 703–712.
[87]
Sang-Woo Jun, Andy Wright, Sizhuo Zhang, and Shuotao Xu. 2018. GraFBoost: Using accelerated flash storage for external graph analytics. In 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA). IEEE, 411–424.
[88]
Vasiliki Kalavri, Vladimir Vlassov, and Seif Haridi. 2017. High-level programming abstractions for distributed graph processing. IEEE Transactions on Knowledge and Data Engineering 30, 2(2017), 305–324.
[89]
Oren Kalinsky, Benny Kimelfeld, and Yoav Etsion. 2020. The TrieJax Architecture: Accelerating Graph Operations Through Relational Joins. In Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems. 1217–1231.
[90]
Jeremy Kepner, Peter Aaltonen, David Bader, Aydin Buluç, Franz Franchetti, John Gilbert, Dylan Hutchison, Manoj Kumar, Andrew Lumsdaine, and Henning Meyerhenke. 2016. Mathematical foundations of the GraphBLAS. In High Performance Extreme Computing Conference (HPEC), 2016 IEEE. IEEE, 1–9.
[91]
Arijit Khan. 2016. Vertex-centric graph processing: The good, the bad, and the ugly. arXiv preprint arXiv:1612.07404(2016).
[92]
Wissam Khaouid, Marina Barsky, Venkatesh Srinivasan, and Alex Thomo. 2015. K-core decomposition of large networks on a single PC. Proceedings of the VLDB Endowment 9, 1 (2015), 13–23.
[93]
Seongyun Ko and Wook-Shin Han. 2018. Turbograph++: A scalable and fast graph analytics system. In Proceedings of the 2018 International Conference on Management of Data. ACM, 395–410.
[94]
Michihiro Kuramochi and George Karypis. 2001. Frequent subgraph discovery. In Proceedings 2001 IEEE international conference on data mining. IEEE, 313–320.
[95]
Michihiro Kuramochi and George Karypis. 2004. An efficient algorithm for discovering frequent subgraphs. IEEE transactions on Knowledge and Data Engineering 16, 9(2004), 1038–1051.
[96]
Dominique Lavenier, Jean-Francois Roy, and David Furodet. 2016. DNA mapping using Processor-in-Memory architecture. In 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 1429–1435.
[97]
Victor E Lee, Ning Ruan, Ruoming Jin, and Charu Aggarwal. 2010. A survey of algorithms for dense subgraph discovery. In Managing and Mining Graph Data. Springer, 303–336.
[98]
Elizabeth A Leicht, Petter Holme, and Mark EJ Newman. 2006. Vertex similarity in networks. Physical Review E 73, 2 (2006), 026120.
[99]
Jure Leskovec, Deepayan Chakrabarti, Jon Kleinberg, Christos Faloutsos, and Zoubin Ghahramani. 2010. Kronecker graphs: An approach to modeling networks. Journal of Machine Learning Research 11, Feb (2010), 985–1042.
[100]
Shuangchen Li, Dimin Niu, Krishna T Malladi, Hongzhong Zheng, Bob Brennan, and Yuan Xie. 2017. Drisa: A dram-based reconfigurable in-situ accelerator. In 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). IEEE, 288–301.
[101]
Shuangchen Li, Cong Xu, Qiaosha Zou, Jishen Zhao, Yu Lu, and Yuan Xie. 2016. Pinatubo: A processing-in-memory architecture for bulk bitwise operations in emerging non-volatile memories. In Proceedings of the 53rd Annual Design Automation Conference. 1–6.
[102]
David Liben-Nowell and Jon Kleinberg. 2007. The link-prediction problem for social networks. Journal of the American society for information science and technology 58, 7 (2007), 1019–1031.
[103]
Siyuan Liu and Arijit Khan. 2018. An Empirical Analysis on Expressibility of Vertex Centric Graph Processing Paradigm. In 2018 IEEE International Conference on Big Data (Big Data). IEEE, 242–251.
[104]
Gabriel H Loh. 2008. 3D-stacked memory architectures for multi-core processors. In ACM SIGARCH computer architecture news, Vol. 36. IEEE Computer Society, 453–464.
[105]
Linyuan Lü and Tao Zhou. 2011. Link prediction in complex networks: A survey. Physica A: statistical mechanics and its applications 390, 6(2011), 1150–1170.
[106]
Chi-Keung Luk, Robert Cohn, Robert Muth, Harish Patil, Artur Klauser, Geoff Lowney, Steven Wallace, Vijay Janapa Reddi, and Kim Hazelwood. 2005. Pin: building customized program analysis tools with dynamic instrumentation. Acm sigplan notices 40, 6 (2005), 190–200.
[107]
Andrew Lumsdaine, Douglas Gregor, Bruce Hendrickson, and Jonathan W. Berry. 2007. Challenges in Parallel Graph Processing. Par. Proc. Let. 17, 1 (2007), 5–20.
[108]
Grzegorz Malewicz, Matthew H Austern, Aart JC Bik, James C Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski. 2010. Pregel: a system for large-scale graph processing. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of data. ACM, 135–146.
[109]
Jasmina Malicevic, Baptiste Lepers, and Willy Zwaenepoel. 2017. Everything you always wanted to know about multicore graph processing but were afraid to ask. In 2017 USENIX Annual Technical Conference (USENIX ATC’17). 631–643.
[110]
Kiran Kumar Matam, Gunjae Koo, Haipeng Zha, Hung-Wei Tseng, and Murali Annavaram. 2019. GraphSSD: graph semantics aware SSD. In Proceedings of the 46th International Symposium on Computer Architecture. 116–128.
[111]
Daniel Mawhirter, Sam Reinehr, Connor Holmes, Tongping Liu, and Bo Wu. 2019. GraphZero: Breaking Symmetry for Efficient Graph Mining. arXiv preprint arXiv:1911.12877(2019).
[112]
Daniel Mawhirter and Bo Wu. 2019. AutoMine: harmonizing high-level abstraction and high performance for graph mining. In Proceedings of the 27th ACM Symposium on Operating Systems Principles. ACM, 509–523.
[113]
Robert Ryan McCune, Tim Weninger, and Greg Madey. 2015. Thinking like a vertex: a survey of vertex-centric frameworks for large-scale distributed graph processing. ACM Computing Surveys (CSUR) 48, 2 (2015), 25.
[114]
Ulrich Meyer and Peter Sanders. 2003. Δ-stepping: a parallelizable shortest path algorithm. Journal of Algorithms 49, 1 (2003), 114–152.
[115]
Gary L Miller, Richard Peng, Adrian Vladu, and Shen Chen Xu. 2015. Improved parallel algorithms for spanners and hopsets. In Proceedings of the 27th ACM Symposium on Parallelism in Algorithms and Architectures. ACM, 192–201.
[116]
Sparsh Mittal, Jeffrey S Vetter, and Dong Li. 2014. Improving energy efficiency of embedded DRAM caches for high-end computing systems. In Proceedings of the 23rd international symposium on High-performance parallel and distributed computing. 99–110.
[117]
O. Mutlu 2019. Processing Data Where It Makes Sense: Enabling In-Memory Computation. MicPro (2019).
[118]
Onur Mutlu, Saugata Ghose, Juan Gómez-Luna, and Rachata Ausavarungnirun. 2020. A Modern Primer on Processing in Memory. arXiv preprint arXiv:2012.03112(2020).
[119]
Anirban Nag, CN Ramachandra, Rajeev Balasubramonian, Ryan Stutsman, Edouard Giacomin, Hari Kambalasubramanyam, and Pierre-Emmanuel Gaillardon. 2019. Gencache: Leveraging in-cache operators for efficient sequence alignment. In Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture. 334–346.
[120]
Lifeng Nai, Ramyad Hadidi, Jaewoong Sim, Hyojong Kim, Pranith Kumar, and Hyesoon Kim. 2017. Graphpim: Enabling instruction-level PIM offloading in graph computing frameworks. In High Performance Computer Architecture (HPCA), 2017 IEEE International Symposium on. IEEE, 457–468.
[121]
Neo4j, Inc.2019. The Neo4j Graph Algorithms User Guide v3.5. https://neo4j.com/docs/graph-algorithms/current.
[122]
Geraldo F Oliveira, Juan Gómez-Luna, Lois Orosa, Saugata Ghose, Nandita Vijaykumar, Ivan Fernandez, Mohammad Sadrosadati, and Onur Mutlu. 2021. DAMOV: A New Methodology and Benchmark Suite for Evaluating Data Movement Bottlenecks. arXiv preprint arXiv:2105.03725(2021).
[123]
Muhammet Mustafa Ozdal, Serif Yesil, Taemin Kim, Andrey Ayupov, John Greth, Steven Burns, and Ozcan Ozturk. 2016. Energy efficient architecture for graph analytics accelerators. In Computer Architecture (ISCA), 2016 ACM/IEEE 43rd Annual International Symposium on. IEEE, 166–177.
[124]
Subhankar Pal, Jonathan Beaumont, Dong-Hyeon Park, Aporva Amarnath, Siying Feng, Chaitali Chakrabarti, Hun-Seok Kim, David Blaauw, Trevor Mudge, and Ronald Dreslinski. 2018. Outerspace: An outer product based sparse matrix multiplication accelerator. In 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, 724–736.
[125]
Keshav Pingali, Donald Nguyen, Milind Kulkarni, Martin Burtscher, M Amber Hassaan, Rashid Kaleem, Tsung-Hsien Lee, Andrew Lenharth, Roman Manevich, and Mario Méndez-Lojo. 2011. The tao of parallelism in algorithms. In ACM Sigplan Notices, Vol. 46. ACM, 12–25.
[126]
T Ramraj and R Prabhakar. 2015. Frequent subgraph mining algorithms-a survey. Procedia Computer Science 47 (2015), 197–204.
[127]
Gengyu Rao, Jingji Chen, Jason Yik, and Xuehai Qian. 2021. IntersectX: An Accelerator for Graph Mining. arXiv preprint arXiv:2012.10848(2021).
[128]
Saif Ur Rehman, Asmat Ullah Khan, and Simon Fong. 2012. Graph mining: A survey of graph mining techniques. In Seventh International Conference on Digital Information Management (ICDIM 2012). IEEE, 88–92.
[129]
Nicholas Rhodes, Peter Willett, Alain Calvet, James B Dunbar, and Christine Humblet. 2003. CLIP: similarity searching of 3D databases using clique detection. Journal of chemical information and computer sciences 43, 2 (2003), 443–448.
[130]
Pedro Ribeiro, Pedro Paredes, Miguel EP Silva, David Aparicio, and Fernando Silva. 2019. A Survey on Subgraph Counting: Concepts, Algorithms and Applications to Network Motifs and Graphlets. arXiv preprint arXiv:1910.13011(2019).
[131]
Ian Robinson, Jim Webber, and Emil Eifrem. 2013. Graph databases. ” O’Reilly Media, Inc.”.
[132]
Ryan A. Rossi and Nesreen K. Ahmed. 2016. An Interactive Data Repository with Visual Analytics. SIGKDD Explor. 17, 2 (2016), 37–41. http://networkrepository.com
[133]
Ryan A Rossi and Nesreen K Ahmed. 2016. An interactive data repository with visual analytics. ACM SIGKDD Explorations Newsletter 17, 2 (2016), 37–41.
[134]
Amitabha Roy, Ivo Mihailovic, and Willy Zwaenepoel. 2013. X-stream: Edge-centric graph processing using streaming partitions. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles. ACM, 472–488.
[135]
Sherif Sakr, Angela Bonifati, Hannes Voigt, Alexandru Iosup, Khaled Ammar, Renzo Angles, Walid Aref, Marcelo Arenas, Maciej Besta, Peter A Boncz, 2020. The Future is Big Graphs! A Community View on Graph Processing Systems. arXiv preprint arXiv:2012.06171(2020).
[136]
Semih Salihoglu and Jennifer Widom. 2014. Optimizing graph algorithms on Pregel-like systems. Proceedings of the VLDB Endowment 7, 7 (2014), 577–588.
[137]
Satu Elisa Schaeffer. 2007. Graph clustering. Computer science review 1, 1 (2007), 27–64.
[138]
Thomas Schank. 2007. Algorithmic aspects of triangle-based network analysis. Phd in computer science, University Karlsruhe 3 (2007).
[139]
Vivek Seshadri, Abhishek Bhowmick, Onur Mutlu, Phillip B Gibbons, Michael A Kozuch, and Todd C Mowry. 2014. The dirty-block index. ACM SIGARCH Computer Architecture News 42, 3 (2014), 157–168.
[140]
Vivek Seshadri, Yoongu Kim, Chris Fallin, Donghyuk Lee, Rachata Ausavarungnirun, Gennady Pekhimenko, Yixin Luo, Onur Mutlu, Phillip B Gibbons, and Michael A Kozuch. 2013. RowClone: fast and energy-efficient in-DRAM bulk data copy and initialization. In Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture. 185–197.
[141]
Vivek Seshadri, Donghyuk Lee, Thomas Mullins, Hasan Hassan, Amirali Boroumand, Jeremie Kim, Michael A Kozuch, Onur Mutlu, Phillip B Gibbons, and Todd C Mowry. 2017. Ambit: In-memory accelerator for bulk bitwise operations using commodity DRAM technology. In Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture. ACM, 273–287.
[142]
Xuanhua Shi, Zhigao Zheng, Yongluan Zhou, Hai Jin, Ligang He, Bo Liu, and Qiang-Sheng Hua. 2018. Graph processing on GPUs: A survey. ACM Computing Surveys (CSUR) 50, 6 (2018), 81.
[143]
Yossi Shiloach and Uzi Vishkin. 1980. An O (log n) parallel connectivity algorithm. Technical Report. Computer Science Department, Technion.
[144]
Yossi Shiloach and Uzi Vishkin. 1982. An O (logn) parallel connectivity algorithm. Journal of Algorithms 3, 1 (1982), 57–67.
[145]
Julian Shun and Guy E Blelloch. 2013. Ligra: a lightweight graph processing framework for shared memory. In ACM SIGPLAN Notices, Vol. 48. 135–146.
[146]
Julian Shun and Kanat Tangwongsan. 2015. Multicore triangle computations without tuning. In Data Engineering (ICDE), 2015 IEEE 31st International Conference on. IEEE, 149–160.
[147]
S Skiena. 1990. Dijkstra’s algorithm. Implementing Discrete Mathematics: Combinatorics and Graph Theory with Mathematica, Reading, MA: Addison-Wesley (1990), 225–227.
[148]
Edgar Solomonik, Maciej Besta, Flavio Vella, and Torsten Hoefler. 2017. Scaling betweenness centrality using communication-efficient sparse matrix multiplication. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. ACM, 47.
[149]
Linghao Song, Youwei Zhuo, Xuehai Qian, Hai Li, and Yiran Chen. 2018. GraphR: Accelerating graph processing using ReRAM. In High Performance Computer Architecture (HPCA), 2018 IEEE International Symposium on. IEEE, 531–543.
[150]
Victor Spirin and Leonid A Mirny. 2003. Protein complexes and functional modules in molecular networks. Proceedings of the National Academy of Sciences 100, 21 (2003), 12123–12128.
[151]
Narayanan Sundaram, Nadathur Satish, Md Mostofa Ali Patwary, Subramanya R Dulloor, Michael J Anderson, Satya Gautam Vadlamudi, Dipankar Das, and Pradeep Dubey. 2015. GraphMat: High performance graph analytics made productive. Proceedings of the VLDB Endowment 8, 11 (2015), 1214–1225.
[152]
Michael Sutton, Tal Ben-Nun, and Amnon Barak. [n. d.]. Optimizing Parallel Graph Connectivity Computation via Subgraph Sampling. ([n. d.]).
[153]
Ichigaku Takigawa and Hiroshi Mamitsuka. 2013. Graph mining: procedure, application to drug discovery and recent advances. Drug discovery today 18, 1-2 (2013), 50–57.
[154]
Lei Tang and Huan Liu. 2010. Graph mining applications to social network analysis. In Managing and Mining Graph Data. Springer, 487–513.
[155]
Ben Taskar, Ming-Fai Wong, Pieter Abbeel, and Daphne Koller. 2004. Link prediction in relational data. In Advances in neural information processing systems. 659–666.
[156]
Carlos HC Teixeira, Alexandre J Fonseca, Marco Serafini, Georgos Siganos, Mohammed J Zaki, and Ashraf Aboulnaga. 2015. Arabesque: a system for distributed graph mining. In Proceedings of the 25th Symposium on Operating Systems Principles. ACM, 425–440.
[157]
Sutapat Thiprungsri and Miklos A Vasarhelyi. 2011. Cluster Analysis for Anomaly Detection in Accounting Data: An Audit Approach. International Journal of Digital Accounting Research 11 (2011).
[158]
Etsuji Tomita, Akira Tanaka, and Haruhisa Takahashi. 2006. The worst-case time complexity for generating all maximal cliques and computational experiments. Theor. Comput. Sci. 363, 1 (2006), 28–42. https://doi.org/10.1016/j.tcs.2006.06.015
[159]
Julian R Ullmann. 1976. An algorithm for subgraph isomorphism. Journal of the ACM (JACM) 23, 1 (1976), 31–42.
[160]
Kenzo Van Craeynest, Shoaib Akram, Wim Heirman, Aamer Jaleel, and Lieven Eeckhout. 2013. Fairness-aware scheduling on single-ISA heterogeneous multi-cores. In Proceedings of the 22nd international conference on Parallel architectures and compilation techniques. IEEE, 177–187.
[161]
Kai Wang, Zhiqiang Zuo, John Thorpe, Tien Quang Nguyen, and Guoqing Harry Xu. 2018. RStream: marrying relational algebra with streaming for efficient graph mining on a single machine. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18). 763–782.
[162]
Liang Wang, Ke Hu, and Yi Tang. 2014. Robustness of link-prediction algorithm based on similarity and application to biological networks. Current Bioinformatics 9, 3 (2014), 246–252.
[163]
Takashi Washio and Hiroshi Motoda. 2003. State of the art of graph-based data mining. Acm Sigkdd Explorations Newsletter 5, 1 (2003), 59–68.
[164]
Stanley Wasserman and Katherine Faust. 1994. Social network analysis: Methods and applications. Vol. 8. Cambridge university press.
[165]
Andrew Waterman, Yunsup Lee, David A Patterson, and Krste Asanovic. 2011. The RISC-V instruction set manual, volume I: Base user-level ISA. EECS Department, UC Berkeley, Tech. Rep. UCB/EECS-2011-62 116 (2011).
[166]
Andrew Shell Waterman. 2016. Design of the RISC-V instruction set architecture. Ph.D. Dissertation. UC Berkeley.
[167]
Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and Philip S Yu. 2020. A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems (2020).
[168]
Xin Xin, Youtao Zhang, and Jun Yang. 2020. ELP2IM: Efficient and Low Power Bitwise Operation Processing in DRAM. In 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, 303–314.
[169]
Chongchong Xu, Chao Wang, Lei Gong, Lihui Jin, Xi Li, and Xuehai Zhou. 2018. Domino: Graph Processing Services on Energy-Efficient Hardware Accelerator. In 2018 IEEE International Conference on Web Services (ICWS). IEEE, 274–281.
[170]
Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. 2018. How powerful are graph neural networks? arXiv preprint arXiv:1810.00826 (2018).
[171]
Da Yan, Hongzhi Chen, James Cheng, M Tamer Özsu, Qizhen Zhang, and John Lui. 2017. G-thinker: big graph mining made easier and faster. arXiv preprint arXiv:1709.03110 (2017).
[172]
Da Yan, James Cheng, Kai Xing, Yi Lu, Wilfred Ng, and Yingyi Bu. 2014. Pregel algorithms for graph connectivity problems with performance guarantees. Proceedings of the VLDB Endowment 7, 14 (2014), 1821–1832.
[173]
Da Yan, Wenwen Qu, Guimu Guo, and Xiaoling Wang. 2020. PrefixFPM: A Parallel Framework for General-Purpose Frequent Pattern Mining. In Proceedings of the 36th IEEE International Conference on Data Engineering (ICDE) 2020.
[174]
Mingyu Yan, Lei Deng, Xing Hu, Ling Liang, Yujing Feng, Xiaochun Ye, Zhimin Zhang, Dongrui Fan, and Yuan Xie. 2020. HyGCN: A GCN accelerator with hybrid architecture. In 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, 15–29.
[175]
Pengcheng Yao, Long Zheng, Zhen Zeng, Yu Huang, Chuangyi Gui, Xiaofei Liao, Hai Jin, and Jingling Xue. [n. d.]. A Locality-Aware Energy-Efficient Accelerator for Graph Mining Applications. ([n. d.]).
[176]
Pengcheng Yao, Long Zheng, Zhen Zeng, Yu Huang, Chuangyi Gui, Xiaofei Liao, Hai Jin, and Jingling Xue. 2020. A Locality-Aware Energy-Efficient Accelerator for Graph Mining Applications. In 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). IEEE, 895–907.
[177]
Mingxing Zhang, Youwei Zhuo, Chao Wang, Mingyu Gao, Yongwei Wu, Kang Chen, Christos Kozyrakis, and Xuehai Qian. 2018. GraphP: Reducing Communication for PIM-based Graph Processing with Efficient Data Partition. In High Performance Computer Architecture (HPCA), 2018 IEEE International Symposium on. IEEE, 544–557.
[178]
Yun Zhang, Faisal N Abu-Khzam, Nicole E Baldwin, Elissa J Chesler, Michael A Langston, and Nagiza F Samatova. 2005. Genome-scale computational approaches to memory-intensive applications in systems biology. In SC’05: Proceedings of the 2005 ACM/IEEE Conference on Supercomputing. IEEE, 12–12.
[179]
Cheng Zhao, Zhibin Zhang, Peng Xu, Tianqi Zheng, and Xueqi Cheng. 2019. Kaleido: An Efficient Out-of-core Graph Mining System on A Single Machine. arXiv preprint arXiv:1905.09572 (2019).
[180]
Kangfei Zhao and Jeffrey Xu Yu. 2017. All-in-one: Graph processing in RDBMSs revisited. In Proceedings of the 2017 ACM International Conference on Management of Data. 1165–1180.
[181]
Long Zheng, Jieshan Zhao, Yu Huang, Qinggang Wang, Zhen Zeng, Jingling Xue, Xiaofei Liao, and Hai Jin. 2020. Spara: An Energy-Efficient ReRAM-Based Accelerator for Sparse Graph Analytics Applications. In 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE, 696–707.
[182]
Minxuan Zhou, Mohsen Imani, Saransh Gupta, Yeseong Kim, and Tajana Rosing. 2019. GRAM: graph processing in a ReRAM-based computational memory. In ASP-DAC. 591–596.
[183]
Youwei Zhuo, Chao Wang, Mingxing Zhang, Rui Wang, Dimin Niu, Yanzhi Wang, and Xuehai Qian. 2019. GraphQ: Scalable PIM-based graph processing. In Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture. 712–725.

Published In

MICRO '21: MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture
October 2021
1322 pages
ISBN: 9781450385572
DOI: 10.1145/3466752
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 October 2021

Author Tags

  1. Clique Enumeration
  2. Clique Listing
  3. Clique Mining
  4. Graph Accelerators
  5. Graph Learning
  6. Graph Mining
  7. Graph Pattern Matching
  8. Instruction Set Architecture
  9. Parallel Graph Algorithms
  10. Processing In Memory
  11. Processing Near Memory
  12. Subgraph Isomorphism

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

MICRO '21

Acceptance Rates

Overall Acceptance Rate 484 of 2,242 submissions, 22%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 293
  • Downloads (Last 6 weeks): 26

Reflects downloads up to 23 Nov 2024

Citations

Cited By

  • (2024) O(n) Key–Value Sort With Active Compute Memory. IEEE Transactions on Computers 73:5 (1341-1356). DOI: 10.1109/TC.2024.3371773. Online publication date: 29-Feb-2024
  • (2024) NDPBridge: Enabling Cross-Bank Coordination in Near-DRAM-Bank Processing Architectures. 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (628-643). DOI: 10.1109/ISCA59077.2024.00052. Online publication date: 29-Jun-2024
  • (2024) MIMDRAM: An End-to-End Processing-Using-DRAM System for High-Throughput, Energy-Efficient and Programmer-Transparent Multiple-Instruction Multiple-Data Computing. 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA) (186-203). DOI: 10.1109/HPCA57654.2024.00024. Online publication date: 2-Mar-2024
  • (2024) Simultaneous Many-Row Activation in Off-the-Shelf DRAM Chips: Experimental Characterization and Analysis. 2024 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN) (99-114). DOI: 10.1109/DSN58291.2024.00024. Online publication date: 24-Jun-2024
  • (2024) Distributed Memory Implementation of Bron-Kerbosch Algorithm. IEEE Access 12 (59575-59588). DOI: 10.1109/ACCESS.2024.3393771. Online publication date: 2024
  • (2023) Flip: Data-centric Edge CGRA Accelerator. ACM Transactions on Design Automation of Electronic Systems 29:1 (1-25). DOI: 10.1145/3631118. Online publication date: 18-Dec-2023
  • (2023) GraphINC: Graph Pattern Mining at Network Speed. Proceedings of the ACM on Management of Data 1:2 (1-28). DOI: 10.1145/3589329. Online publication date: 20-Jun-2023
  • (2023) Phases, Modalities, Spatial and Temporal Locality: Domain Specific ML Prefetcher for Accelerating Graph Analytics. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (1-15). DOI: 10.1145/3581784.3607043. Online publication date: 12-Nov-2023
  • (2023) Sparse Stream Semantic Registers: A Lightweight ISA Extension Accelerating General Sparse Linear Algebra. IEEE Transactions on Parallel and Distributed Systems 34:12 (3147-3161). DOI: 10.1109/TPDS.2023.3322029. Online publication date: 4-Oct-2023
  • (2023) Evaluating Machine Learning Workloads on Memory-Centric Computing Systems. 2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) (35-49). DOI: 10.1109/ISPASS57527.2023.00013. Online publication date: Apr-2023
