H. Esmaeilzadeh

Homepage: hadiatcc.gatech.edu

Bibliometrics: publication history
Average citations per article: 25.16
Citation Count: 780
Publication count: 31
Publication years: 2004-2017
Available for download: 22
Average downloads per article: 942.50
Downloads (cumulative): 20,735
Downloads (12 Months): 3,750
Downloads (6 Weeks): 381
Professional ACM Member

31 results found

Result 1 – 20 of 31

1 published by ACM
October 2017 MICRO-50 '17: Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 136,   Downloads (12 Months): 609,   Downloads (Overall): 609

Full text available: PDF
The growing scale and complexity of Machine Learning (ML) algorithms has resulted in prevalent use of distributed general-purpose systems. In a rather disjoint effort, the community is focusing mostly on high performance single-node accelerators for learning. This work bridges these two paradigms and offers CoSMIC, a full computing stack constituting ...
Keywords: accelerator, distributed, cloud, scale-out, machine learning

2 published by ACM
October 2016 Communications of the ACM: Volume 59 Issue 11, November 2016
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 18,   Downloads (12 Months): 333,   Downloads (Overall): 793

Full text available: HTML, PDF
Datacenter workloads demand high computational capabilities, flexibility, power efficiency, and low cost. It is challenging to improve all of these factors simultaneously. To advance datacenter capabilities beyond what commodity server designs can provide, we designed and built a composable, reconfigurable hardware fabric based on field programmable gate arrays (FPGA). Each ...

3
June 2016 ISCA '16: Proceedings of the 43rd International Symposium on Computer Architecture
Publisher: IEEE Press
Bibliometrics:
Citation Count: 2
Downloads (6 Weeks): 8,   Downloads (12 Months): 118,   Downloads (Overall): 135

Full text available: PDF
Conventionally, an approximate accelerator replaces every invocation of a frequently executed region of code without considering the final quality degradation. However, there is a vast decision space in which each invocation can either be delegated to the accelerator, improving performance and efficiency, or run on the precise core, maintaining quality. In this paper ...
Keywords: approximate computing, quality control, statistical compiler optimization, accelerators, statistical guarantees
Also published in:
October 2016  ACM SIGARCH Computer Architecture News - ISCA'16: Volume 44 Issue 3, June 2016

4 published by ACM
March 2016 ASPLOS '16: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 9,   Downloads (12 Months): 156,   Downloads (Overall): 335

Full text available: PDF
Approximate computing trades quality of application output for higher efficiency and performance. Approximation is useful only if its impact on application output quality is acceptable to the users. However, there is a lack of systematic solutions and studies that explore users' perspective on the effects of approximation. In this paper, ...
Keywords: approximate computing, games-with-a-purpose, crowdsourcing, energy-efficient computing
Also published in:
June 2016  ACM SIGPLAN Notices - ASPLOS '16: Volume 51 Issue 4, April 2016
July 2016  ACM SIGARCH Computer Architecture News - ASPLOS'16: Volume 44 Issue 2, May 2016

5
March 2016 DATE '16: Proceedings of the 2016 Conference on Design, Automation & Test in Europe
Publisher: EDA Consortium
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 1,   Downloads (12 Months): 26,   Downloads (Overall): 28

Full text available: PDF
Modern applications including graphics, multimedia, web search, and data analytics not only can benefit from acceleration, but also exhibit significant degrees of tolerance to imprecise computation. This amenability to approximation provides an opportunity to trade quality of the results for higher performance and better resource utilization. Exploiting this opportunity is ...

6 published by ACM
January 2016 ACM Transactions on Architecture and Code Optimization (TACO): Volume 12 Issue 4, January 2016
Publisher: ACM
Bibliometrics:
Citation Count: 2
Downloads (6 Weeks): 8,   Downloads (12 Months): 97,   Downloads (Overall): 215

Full text available: PDF
This article aims to tackle two fundamental memory bottlenecks: limited off-chip bandwidth (bandwidth wall) and long access latency (memory wall). To achieve this goal, our approach exploits the inherent error resilience of a wide range of applications. We introduce an approximation technique, called Rollback-Free Value Prediction (RFVP). When certain safe-to-approximate ...
Keywords: Load value approximation, memory bandwidth, value prediction, memory latency, GPUs

7 published by ACM
December 2015 MICRO-48: Proceedings of the 48th International Symposium on Microarchitecture
Publisher: ACM
Bibliometrics:
Citation Count: 5
Downloads (6 Weeks): 20,   Downloads (12 Months): 175,   Downloads (Overall): 404

Full text available: PDF
Graphics Processing Units (GPUs) can accelerate diverse classes of applications, such as recognition, gaming, data analytics, weather prediction, and multimedia. Many of these applications are amenable to approximate execution. This application characteristic provides an opportunity to improve GPU performance and efficiency. Among approximation techniques, neural accelerators have been shown to ...
Keywords: GPU, approximate computing, neural processing unit

8
October 2015 CASES '15: Proceedings of the 2015 International Conference on Compilers, Architecture and Synthesis for Embedded Systems
Publisher: IEEE Press
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 1,   Downloads (12 Months): 18,   Downloads (Overall): 109

Full text available: PDF
Performance is the raw material for computing. For more than 40 years, consistent and exponential improvement in transistor scaling coupled with continuous advances in general-purpose processor design has exponentially reduced its cost. However, as we enter the dark silicon era (as we projected in our study [1, 2] and others ...

9 published by ACM
August 2015 ESEC/FSE 2015: Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering
Publisher: ACM
Bibliometrics:
Citation Count: 6
Downloads (6 Weeks): 5,   Downloads (12 Months): 42,   Downloads (Overall): 140

Full text available: PDF
Energy efficiency is a primary constraint in modern systems. Approximate computing is a promising approach that trades quality of result for gains in efficiency and performance. State-of-the-art approximate programming models require extensive manual annotations on program data and operations to guarantee safe execution of approximate programs. The need for ...
Keywords: Language design, modular approximate programming

10
March 2015 DATE '15: Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition
Publisher: EDA Consortium
Bibliometrics:
Citation Count: 11
Downloads (6 Weeks): 3,   Downloads (12 Months): 14,   Downloads (Overall): 88

Full text available: PDF
Relaxing the traditional abstraction of "near-perfect" accuracy in hardware design can lead to significant gains in energy efficiency, area, and performance. To exploit this opportunity, there is a need for design abstractions that can systematically incorporate approximation in hardware design. We introduce Axilog, a set of language annotations, that provides ...

11 published by ACM
December 2014 Communications of the ACM: Volume 58 Issue 1, January 2015
Publisher: ACM
Bibliometrics:
Citation Count: 6
Downloads (6 Weeks): 22,   Downloads (12 Months): 281,   Downloads (Overall): 1,632

Full text available: HTML, PDF, PDF Chinese translation
As improvements in per-transistor speed and energy efficiency diminish, radical departures from conventional approaches are needed to continue improvements in the performance and energy efficiency of general-purpose processors. One such departure is approximate computing, where error in computation is acceptable and the traditional robust digital abstraction of near-perfect accuracy is ...

12 published by ACM
August 2014 PACT '14: Proceedings of the 23rd international conference on Parallel architectures and compilation
Publisher: ACM
Bibliometrics:
Citation Count: 5
Downloads (6 Weeks): 4,   Downloads (12 Months): 39,   Downloads (Overall): 151

Full text available: PDF
This paper demonstrates how to utilize the inherent error resilience of a wide range of applications to mitigate the memory wall, the discrepancy between core and memory speed. We define a new microarchitecturally-triggered approximation technique called rollback-free value prediction. This technique predicts the value of safe-to-approximate loads when they ...
Keywords: compilers, rollback-free value prediction, general-purpose approximate computing, memory systems

13
June 2014 ISCA '14: Proceeding of the 41st annual international symposium on Computer architecture
Publisher: IEEE Press
Bibliometrics:
Citation Count: 102
Downloads (6 Weeks): 28,   Downloads (12 Months): 283,   Downloads (Overall): 887

Full text available: PDF
Datacenter workloads demand high computational capabilities, flexibility, power efficiency, and low cost. It is challenging to improve all of these factors simultaneously. To advance datacenter capabilities beyond what commodity server designs can provide, we have designed and built a composable, reconfigurable fabric to accelerate portions of large-scale software services. Each instantiation ...
Also published in:
October 2014  ACM SIGARCH Computer Architecture News - ISCA '14: Volume 42 Issue 3, June 2014

14
June 2014 ISCA '14: Proceeding of the 41st annual international symposium on Computer architecture
Publisher: IEEE Press
Bibliometrics:
Citation Count: 29
Downloads (6 Weeks): 7,   Downloads (12 Months): 131,   Downloads (Overall): 530

Full text available: PDF
As improvements in per-transistor speed and energy efficiency diminish, radical departures from conventional approaches are becoming critical to improving the performance and energy efficiency of general-purpose processors. We propose a solution, from circuit to compiler, that enables general-purpose use of limited-precision, analog hardware to accelerate "approximable" code, that is, code that can tolerate imprecise execution. We ...
Also published in:
October 2014  ACM SIGARCH Computer Architecture News - ISCA '14: Volume 42 Issue 3, June 2014

15
July 2013 IEEE Computer Architecture Letters: Volume 12 Issue 2, July 2013
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 0

This paper describes a first order multicore model to project a tighter upper bound on performance than previous Amdahl's Law based approaches. The speedup over a known baseline is a function of the core performance, microarchitectural features, application parameters, chip organization, and multicore topology. The model is flexible enough to ...
Keywords: Multiple Data Stream Architectures (Multiprocessors), Computer Systems Organization, General, Modeling of computer architecture, Computer Systems Organization, Processor Architectures

16
May 2013 IEEE Micro: Volume 33 Issue 3, May 2013
Publisher: IEEE Computer Society Press
Bibliometrics:
Citation Count: 5

This work proposes an approximate algorithmic transformation and a new class of accelerators, called neural processing units (NPUs). NPUs leverage the approximate algorithmic transformation that converts regions of code from a Von Neumann model to a neural model. NPUs achieve an average 2.3× speedup and 3.0× energy savings for general-purpose ...
Keywords: Computer architecture, Neural networks, Approximation algorithms, Algorithm design and analysis, Accelerators, NPUs, approximate computing, accelerators, neural networks, Parrot algorithmic transformation, neural processing units

17
February 2013 HPCA '13: Proceedings of the 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA)
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 5

Dynamic multicore architectures, which fuse and split cores at run time, potentially offer a level of performance/energy agility that static multicore designs cannot achieve. Conventional ISAs, however, have scalability limits to fusion. EDGE-based designs offer greater scalability but to date have been performance limited by significant microarchitectural bottlenecks. This paper ...

18 published by ACM
February 2013 Communications of the ACM: Volume 56 Issue 2, February 2013
Publisher: ACM
Bibliometrics:
Citation Count: 21
Downloads (6 Weeks): 12,   Downloads (12 Months): 157,   Downloads (Overall): 2,708

Full text available: HTML, PDF
Starting in 2004, the microprocessor industry has shifted to multicore scaling (increasing the number of cores per die each generation) as its principal strategy for continuing performance growth. Many in the research community believe that this exponential core scaling will continue into the hundreds or thousands of cores per chip, auguring a ...

19
December 2012 MICRO-45: Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 111
Downloads (6 Weeks): 19,   Downloads (12 Months): 299,   Downloads (Overall): 1,212

Full text available: PDF
This paper describes a learning-based approach to the acceleration of approximate programs. We describe the Parrot transformation, a program transformation that selects and trains a neural network to mimic a region of imperative code. After the learning phase, the compiler replaces the original code with an invocation of a low-power ...
Keywords: Approximate Computing, Neural Networks, Accelerator, Neural Processing Unit, NPU

20 published by ACM
August 2012 ACM Transactions on Computer Systems (TOCS): Volume 30 Issue 3, August 2012
Publisher: ACM
Bibliometrics:
Citation Count: 6
Downloads (6 Weeks): 16,   Downloads (12 Months): 84,   Downloads (Overall): 995

Full text available: PDF
Since 2004, processor designers have increased core counts to exploit Moore’s Law scaling, rather than focusing on single-core performance. The failure of Dennard scaling, to which the shift to multicore parts is partially a response, may soon limit multicore scaling just as single-core scaling has been curtailed. This paper models ...
Keywords: modeling, power, Dark silicon, multicore, technology scaling



The ACM Digital Library is published by the Association for Computing Machinery. Copyright © 2018 ACM, Inc.