 John Kim

Bibliometrics: publication history
Average citations per article: 21.74
Citation count: 761
Publication count: 35
Publication years: 2005-2017
Available for download: 26
Average downloads per article: 596.65
Downloads (cumulative): 15,513
Downloads (12 Months): 1,437
Downloads (6 Weeks): 124

35 results found

Result 1 – 20 of 35

1 published by ACM
April 2017 ASPLOS '17: Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 19,   Downloads (12 Months): 274,   Downloads (Overall): 274

NUMA (non-uniform memory access) servers are commonly used in high-performance computing and datacenters. Within each server, a processor-interconnect (e.g., Intel QPI, AMD HyperTransport) is used to communicate between the different sockets or nodes. In this work, we explore the impact of the processor-interconnect on overall performance -- in particular, the ...
Keywords: arbitration, processor-interconnect, numa servers, router concentration
Also published in:
May 2017 ACM SIGPLAN Notices - ASPLOS '17: Volume 52 Issue 4, April 2017
May 2017 ACM SIGARCH Computer Architecture News - ASPLOS '17: Volume 45 Issue 1, March 2017

2 published by ACM
December 2016 ACM Transactions on Architecture and Code Optimization (TACO): Volume 13 Issue 4, December 2016
Publisher: ACM
Bibliometrics:
Citation Count: 2
Downloads (6 Weeks): 9,   Downloads (12 Months): 181,   Downloads (Overall): 213

In this article, we describe how to ease memory management between a Central Processing Unit (CPU) and one or multiple discrete Graphics Processing Units (GPUs) by architecting a novel hardware-based Unified Memory Hierarchy (UMH). Adopting UMH, a GPU accesses the CPU memory only if it does not find its required ...
Keywords: Unified memory architecture, graphics processing units, memory hierarchy, high performance computing

3
February 2016 IEEE Transactions on Computers: Volume 65 Issue 2, February 2016
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 0

A cost-efficient network-on-chip is needed in scalable many-core systems. Recent multicore processors have leveraged a ring topology, and a hierarchical ring can increase scalability but presents different challenges, including higher hop count and a global ring bottleneck. In this work, we describe a hierarchical ring topology that we refer to as ...

4 published by ACM
April 2015 CHI EA '15: Proceedings of the 33rd Annual ACM Conference Extended Abstracts on Human Factors in Computing Systems
Publisher: ACM
Bibliometrics:
Citation Count: 2
Downloads (6 Weeks): 3,   Downloads (12 Months): 50,   Downloads (Overall): 157

Various conditions can result in scratching behavior, and severe itching conditions such as atopic dermatitis can significantly impact one's quality of life. Because the management of many itching conditions is not necessarily about curing the condition but instead about properly maintaining or controlling the condition, a proper understanding of ...
Keywords: mobile system, smartwatch, scratch recognition

5
November 2014 SC '14: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: IEEE Press
Bibliometrics:
Citation Count: 7
Downloads (6 Weeks): 4,   Downloads (12 Months): 34,   Downloads (Overall): 310

Through-Silicon Interposer (TSI) has recently been proposed to provide high memory bandwidth and improve energy efficiency of the main memory system. However, the impact of TSI on main memory system architecture has not been well explored. While TSI improves the I/O energy efficiency, we show that it results in an ...

6 published by ACM
November 2014 CCS '14: Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 7,   Downloads (12 Months): 54,   Downloads (Overall): 422

Servers that consist of multiple nodes and sockets are interconnected with a high-bandwidth, low-latency processor-interconnect network, such as Intel QPI or AMD HyperTransport. The different nodes exchange packets through routers which communicate with other routers. A key component of a router is the routing table which ...
Keywords: processor-interconnect, routing table, vulnerability, router

7 published by ACM
June 2014 ICS '14: Proceedings of the 28th ACM international conference on Supercomputing
Publisher: ACM
Bibliometrics:
Citation Count: 3
Downloads (6 Weeks): 4,   Downloads (12 Months): 34,   Downloads (Overall): 212

The scalability trends of modern semiconductor technology lead to increasingly dense multicore chips. Unfortunately, physical limitations in area, power, off-chip bandwidth, and yield constrain single-chip designs to a relatively small number of cores, beyond which scaling becomes impractical. Multi-chip designs overcome these constraints, and can reach scales impossible to realize ...
Keywords: interconnection networks, energy efficiency, nanophotonics

8
June 2014 IEEE Transactions on Computers: Volume 63 Issue 6, June 2014
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 1

Many-core processors will have many processing cores with a network-on-chip (NoC) that provides access to shared resources such as main memory and on-chip caches. However, locally-fair arbitration in multi-stage NoC can lead to globally unfair access to shared resources and impact system-level performance depending on where each task is physically ...

9
October 2013 PACT '13: Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
Publisher: IEEE Press
Bibliometrics:
Citation Count: 23
Downloads (6 Weeks): 11,   Downloads (12 Months): 97,   Downloads (Overall): 698

Memory bandwidth has been one of the most critical system performance bottlenecks. As a result, the HMC (Hybrid Memory Cube) has recently been proposed to improve DRAM bandwidth as well as energy efficiency. In this paper, we explore different system interconnect designs with HMCs. We show that processor-centric network architectures ...
Keywords: processor interconnect, hybrid memory cube, memory network, memory bandwidth wall

10
February 2012 HPCA '12: Proceedings of the 2012 IEEE 18th International Symposium on High-Performance Computer Architecture
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 3

Cost-efficient networks are critical in creating scalable large-scale systems, including those found in supercomputers and datacenters. High-radix routers reduce network cost by lowering the network diameter while providing a high bisection bandwidth and path diversity. However, as the port count increases, the high-radix router microarchitecture needs to scale efficiently. Hierarchical ...

11 published by ACM
December 2011 MICRO-44: Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Publisher: ACM
Bibliometrics:
Citation Count: 6
Downloads (6 Weeks): 0,   Downloads (12 Months): 16,   Downloads (Overall): 253

The nanophotonic signaling technology enables efficient global communication and low-diameter networks such as crossbars that are often optically arbitrated. However, existing optical arbitration schemes incur costly overheads (e.g., waveguides, laser power, etc.) to avoid starvation caused by their inherent fixed priority, which limits their applicability in power-bounded future many-core processors. ...
Keywords: arbitration, nanophotonics, interconnection networks

12
October 2011 PACT '11: Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 1

The unique characteristics of prefetch traffic have not been considered in on-chip network design for multicore architectures. Most prefetchers are oblivious to network congestion when generating prefetch requests. In this work, we investigate the interaction between prefetchers and on-chip networks and exploit the synergy of these two components ...
Keywords: Multi-cores, Prefetch, On-chip Networks

13
October 2011 ICCD '11: Proceedings of the 2011 IEEE 29th International Conference on Computer Design
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 4

On-chip networks are becoming more important as the number of on-chip components continues to increase. The 2D mesh is a commonly assumed topology for on-chip networks, but in this work, we make the argument that a 2D torus can provide a more cost-efficient on-chip network since the on-chip network datapath is ...

14 published by ACM
June 2011 DAC '11: Proceedings of the 48th Design Automation Conference
Publisher: ACM
Bibliometrics:
Citation Count: 15
Downloads (6 Weeks): 0,   Downloads (12 Months): 9,   Downloads (Overall): 135

The increasing number of integrated components on a single chip has increased the importance of on-chip networks. A significant part of on-chip network routers is the buffer, as it occupies a large area and consumes a significant amount of power. In this work, we propose FlexiBuffer, a microarchitecture in which ...
Keywords: on-chip networks, power gating, buffer organization, leakage power, routers

15
December 2010 MICRO '43: Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 11
Downloads (6 Weeks): 1,   Downloads (12 Months): 12,   Downloads (Overall): 210

Emerging many-core chip multiprocessors will integrate dozens of small processing cores with an on-chip interconnect consisting of point-to-point links. The interconnect enables the processing cores to not only communicate, but to share common resources such as main memory resources and I/O controllers. In this work, we propose an arbitration scheme ...
Keywords: on-chip network, age-based arbitration, fairness, quality of service (QoS)

16
December 2010 MICRO '43: Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 25
Downloads (6 Weeks): 6,   Downloads (12 Months): 54,   Downloads (Overall): 754

As the number of cores and threads in many-core compute accelerators such as Graphics Processing Units (GPUs) increases, so does the importance of on-chip interconnection network design. This paper explores throughput-effective networks-on-chip (NoCs) for future many-core accelerators that employ bulk-synchronous parallel (BSP) programming models such as CUDA and ...
Keywords: NoC, Compute accelerator, GPGPU

17
November 2010 SC '10: Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 2
Downloads (6 Weeks): 1,   Downloads (12 Months): 10,   Downloads (Overall): 211

With the number of cores on a chip continuing to increase, proper evaluation of the on-chip network is critical for not only network performance but also overall system performance. In this paper, we show how a network-only simulation can be limited as it does not provide an accurate representation of system ...

18 published by ACM
September 2010 PACT '10: Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 4,   Downloads (12 Months): 38,   Downloads (Overall): 344

There has been little work investigating the overall performance impact of on-chip communication in manycore compute accelerators. In this paper we evaluate performance of a GPU-like compute accelerator running CUDA workloads and consisting of compute nodes, interconnection network and the graphics DRAM memory system using detailed cycle-level simulation. First, we ...
Keywords: noc

19 published by ACM
September 2010 PACT '10: Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Publisher: ACM
Bibliometrics:
Citation Count: 2
Downloads (6 Weeks): 0,   Downloads (12 Months): 8,   Downloads (Overall): 213

The on-chip network of emerging many-core CMPs enables the sharing of numerous on-chip components. This on-chip network needs to ensure fairness when accessing the shared resources. In this work, we propose providing equality of service (EoS) in future many-core CMPs on-chip networks by leveraging distance, or hop count, to approximate ...
Keywords: on-chip network, age-based arbitration, fairness

20 published by ACM
December 2009 NoCArc '09: Proceedings of the 2nd International Workshop on Network on Chip Architectures
Publisher: ACM
Bibliometrics:
Citation Count: 10
Downloads (6 Weeks): 3,   Downloads (12 Months): 21,   Downloads (Overall): 545

On-chip networks are critical to the scaling of future multicore processors. Recent multicore processors have adopted ring topologies because of their simplicity and high bandwidth. In this paper, we first describe a bufferless router microarchitecture for an on-chip network ring topology. We propose to extend the bufferless router with an ...
Keywords: ring topology, router microarchitecture, on-chip network


