 Ping Xiang

Bibliometrics: publication history
Average citations per article: 12.80
Citation Count: 128
Publication count: 10
Publication years: 2010-2014
Available for download: 6
Average downloads per article: 725.00
Downloads (cumulative): 4,350
Downloads (12 Months): 273
Downloads (6 Weeks): 25
10 results found

1
May 2014 IPDPS '14: Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 1

The wide availability and the Single-Instruction Multiple-Thread (SIMT)-style programming model have made graphics processing units (GPUs) a promising choice for high performance computing. However, because of the SIMT style processing, an instruction will be executed in every thread even if the operands are identical for all the threads. To overcome ...
Keywords: GPGPU, scalar unit, SIMT, Vector unit

2 published by ACM
June 2013 MSPC '13: Proceedings of the ACM SIGPLAN Workshop on Memory Systems Performance and Correctness
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 2,   Downloads (12 Months): 11,   Downloads (Overall): 138

Full text available: PDF
In this paper we advocate formal locality analysis on memory references of GPGPU kernels. We investigate the locality of reference at different cache levels in the memory hierarchy. At the L1 cache level, we look into the locality behavior at the warp-, the thread block- and the streaming multiprocessor-level. Using ...
Keywords: GPGPU, locality of reference, matrix multiplication, tiling

3 published by ACM
June 2013 ICS '13: Proceedings of the 27th international ACM conference on International conference on supercomputing
Publisher: ACM
Bibliometrics:
Citation Count: 6
Downloads (6 Weeks): 5,   Downloads (12 Months): 49,   Downloads (Overall): 249

Full text available: PDF
State-of-art graphics processing units (GPUs) employ the single-instruction multiple-data (SIMD) style execution to achieve both high computational throughput and energy efficiency. As previous works have shown, there exists significant computational redundancy in SIMD execution, where different execution lanes operate on the same operand values. Such value locality is referred to ...
Keywords: GPGPU, redundancy

4 published by ACM
September 2012 PACT '12: Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Publisher: ACM
Bibliometrics:
Citation Count: 2
Downloads (6 Weeks): 0,   Downloads (12 Months): 14,   Downloads (Overall): 186

Full text available: PDF
Keywords: energy, gpgpu, heterogeneous, ilp

5
September 2012 ICPP '12: Proceedings of the 2012 41st International Conference on Parallel Processing
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 3

Given the extraordinary computational power of modern graphics processing units (GPUs), general purpose computation on GPUs (GPGPU) has become an increasingly important platform for high performance computing. To better understand how well the GPU resource has been utilized by application developers and then to facilitate them to develop high performance ...
Keywords: GPGPU, Performance, Compiler

6 published by ACM
June 2012 ACM Transactions on Architecture and Code Optimization (TACO): Volume 9 Issue 2, June 2012
Publisher: ACM
Bibliometrics:
Citation Count: 5
Downloads (6 Weeks): 5,   Downloads (12 Months): 45,   Downloads (Overall): 869

Full text available: PDF
This article presents a novel optimizing compiler for general purpose computation on graphics processing units (GPGPU). It addresses two major challenges of developing high performance GPGPU programs: effective utilization of GPU memory hierarchy and judicious management of parallelism. The input to our compiler is a naïve GPU kernel function, which ...
Keywords: GPGPU, CUDA, GPU Computing, CUBLAS, OpenCL

7
May 2012 IPDPS '12: Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 4

This paper revisits the fundamental concept of the locality of references and proposes to quantify it as a conditional probability: in an address stream, given the condition that an address is accessed, how likely the same address (temporal locality) or an address within its neighborhood (spatial locality) will be accessed ...
Keywords: locality of references, probability, memory hierarchy, cache

8
February 2012 HPCA '12: Proceedings of the 2012 IEEE 18th International Symposium on High-Performance Computer Architecture
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 14

This paper presents a novel approach to utilize the CPU resource to facilitate the execution of GPGPU programs on fused CPU-GPU architectures. In our model of fused architectures, the GPU and the CPU are integrated on the same die and share the on-chip L3 cache and off-chip memory, similar to ...

9 published by ACM
June 2010 PLDI '10: Proceedings of the 31st ACM SIGPLAN Conference on Programming Language Design and Implementation
Publisher: ACM
Bibliometrics:
Citation Count: 85
Downloads (6 Weeks): 12,   Downloads (12 Months): 129,   Downloads (Overall): 2,350

Full text available: PDF
This paper presents a novel optimizing compiler for general purpose computation on graphics processing units (GPGPU). It addresses two major challenges of developing high performance GPGPU programs: effective utilization of GPU memory hierarchy and judicious management of parallelism. The input to our compiler is a naïve GPU kernel function, which ...
Keywords: gpgpu, compiler
Also published in:
June 2010  ACM SIGPLAN Notices - PLDI '10: Volume 45 Issue 6, June 2010

10 published by ACM
January 2010 PPoPP '10: Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Publisher: ACM
Bibliometrics:
Citation Count: 2
Downloads (6 Weeks): 1,   Downloads (12 Months): 25,   Downloads (Overall): 558

Full text available: PDF
Developing high performance GPGPU programs is challenging for application developers since the performance is dependent upon how well the code leverages the hardware features of specific graphics processors. To solve this problem and relieve application developers of low-level hardware-specific optimizations, we introduce a novel compiler to optimize GPGPU programs. Our ...
Keywords: compiler, gpgpu
Also published in:
May 2010  ACM SIGPLAN Notices - PPoPP '10: Volume 45 Issue 5, May 2010



The ACM Digital Library is published by the Association for Computing Machinery. Copyright © 2018 ACM, Inc.