 Dmitry V Ponomarev

Bibliometrics: publication history
Average citations per article: 9.04
Citation count: 217
Publication count: 24
Publication years: 1998-2007
Available for download: 13
Average downloads per article: 336.77
Downloads (cumulative): 4,378
Downloads (12 Months): 102
Downloads (6 Weeks): 13
24 results found

Result 1 – 20 of 24

1 published by ACM
June 2007 ICS '07: Proceedings of the 21st annual international conference on Supercomputing
Publisher: ACM
Bibliometrics:
Citation Count: 2
Downloads (6 Weeks): 1,   Downloads (12 Months): 7,   Downloads (Overall): 221

Full text available: PDF
The register file is one of the most critical datapath components limiting the number of threads that can be supported on a Simultaneous Multithreading (SMT) processor. To allow the use of smaller register files without degrading performance, techniques that maximize the efficiency of using registers through aggressive register allocation/deallocation can ...
Keywords: register files, simultaneous multithreading

2 published by ACM
October 2006 ISLPED '06: Proceedings of the 2006 international symposium on Low power electronics and design
Publisher: ACM
Bibliometrics:
Citation Count: 2
Downloads (6 Weeks): 1,   Downloads (12 Months): 8,   Downloads (Overall): 180

Full text available: PDF
Today's superscalar microprocessors use large, heavily-ported physical register files (RFs) to increase the instruction throughput. The high complexity and power dissipation of such RFs mainly stem from the need to maintain each and every result for a large number of cycles after the result generation. We observed that a significant ...
Keywords: energy-efficiency, register files

3 published by ACM
September 2006 PACT '06: Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Publisher: ACM
Bibliometrics:
Citation Count: 3
Downloads (6 Weeks): 0,   Downloads (12 Months): 5,   Downloads (Overall): 181

Full text available: PDF
High-performance microprocessors use large, heavily-ported physical register files (RFs) to increase the instruction throughput. The high complexity and power dissipation of such RFs mainly stem from the need to maintain each and every result for a large number of cycles after the result generation. We observed that a significant fraction ...
Keywords: energy-efficiency, register files

4 published by ACM
September 2006 PACT '06: Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Publisher: ACM
Bibliometrics:
Citation Count: 5
Downloads (6 Weeks): 2,   Downloads (12 Months): 12,   Downloads (Overall): 321

Full text available: PDF
In SMT processors, the complex interplay between private and shared datapath resources needs to be considered in order to realize the full performance potential. In this paper, we show that blindly increasing the size of the per-thread reorder buffers to provide a larger number of in-flight instructions does not result ...
Keywords: reorder buffer, simultaneous multithreading

5
August 2006 ICPP '06: Proceedings of the 2006 International Conference on Parallel Processing
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 0

Simultaneous Multi-threading (SMT) architectures open up new avenues for datapath optimizations due to the presence of thread-level parallelism (TLP). One recent proposal for exploiting such parallelism is the 2OP_BLOCK scheduler design, which completely avoids the dispatch of instructions with two non-ready source operands into the issue queue. This technique reduces ...

6
August 2006 ICPP '06: Proceedings of the 2006 International Conference on Parallel Processing
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 3

We propose a series of aggressive register deallocation mechanisms to reduce the register file pressure and increase the parallelism exploited by superscalar microprocessors. Our techniques are based on a key observation that a register value can be temporarily decoupled from the register identifier. Specifically, even if a physical register is ...

7 published by ACM
June 2006 ACM Transactions on Architecture and Code Optimization (TACO): Volume 3 Issue 2, June 2006
Publisher: ACM
Bibliometrics:
Citation Count: 3
Downloads (6 Weeks): 2,   Downloads (12 Months): 9,   Downloads (Overall): 634

Full text available: PDF
Traditional dynamic scheduler designs use one issue queue entry per instruction, regardless of the actual number of operands actively involved in the wakeup process. We propose Instruction Packing---a novel microarchitectural technique that reduces both delay and power consumption of the issue queue by sharing the associative part of an issue ...
Keywords: low power, issue queue, instruction packing

8
October 2005 ICCD '05: Proceedings of the 2005 International Conference on Computer Design
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 2

The dynamic instruction scheduling logic is one of the most critical components of modern superscalar microprocessors, both from the delay and power dissipation standpoints. The delay and energy requirement of driving the wakeup tags across the associatively-addressed issue queue accounts for a significant percentage of the scheduler's overhead and also limits ...

9 published by ACM
August 2005 ISLPED '05: Proceedings of the 2005 international symposium on Low power electronics and design
Publisher: ACM
Bibliometrics:
Citation Count: 10
Downloads (6 Weeks): 2,   Downloads (12 Months): 5,   Downloads (Overall): 454

Full text available: PDF
The instruction scheduling logic used in modern superscalar microprocessors often relies on associative searching of the issue queue entries to dynamically wake up instructions for execution. Traditional designs use one issue queue entry for each instruction, regardless of the actual number of operands actively used in the wakeup process. In ...
Keywords: low power, issue queue, instruction packing

10
December 2004 MICRO 37: Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 26
Downloads (6 Weeks): 1,   Downloads (12 Months): 9,   Downloads (Overall): 384

Full text available: PDF
A large percentage of computed results have fewer significant bits compared to the full width of a register. We exploit this fact to pack multiple results into a single physical register to reduce the pressure on the register file in a superscalar processor. Two schemes for dynamically packing multiple "narrow-width" ...

11
October 2004 ICCD '04: Proceedings of the IEEE International Conference on Computer Design
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 25

Modern superscalar microprocessors need sizable register files to support a large number of in-flight instructions for exploiting ILP. An alternative to building large register files is to use a smaller number of registers, but manage them more effectively. More efficient management of registers can also result in higher performance if the reduction ...

12
October 2003 ICCD '03: Proceedings of the 21st International Conference on Computer Design
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 3

We consider two approaches for reducing the complexity and power dissipation in processors that use a separate register file to maintain committed register values. The first approach relies on a distributed implementation of the Reorder Buffer (ROB) that spreads the centralized ROB structure across the function units (FUs), with each distributed ...

13
October 2003 IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special section on low power: Volume 11 Issue 5, October 2003
Publisher: IEEE Educational Activities Department
Bibliometrics:
Citation Count: 17

The out-of-order issue queue (IQ) used in modern superscalar processors is a considerable source of energy dissipation. We consider design alternatives that result in significant reductions in the power dissipation of the IQ (by as much as 75%) through the use of comparators that dissipate energy mainly on a tag ...
Keywords: bitline segmentation, low-power instruction scheduling, low-power comparator, low-power superscalar data-path

14
September 2003 PACT '03: Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 12

We present a technique for reducing the power dissipation in the course of writebacks and commitments in a datapath that uses a dedicated architectural register file (ARF) to hold committed values. Our mechanism capitalizes on the observation that most of the produced register values are short-lived, meaning that the destination registers ...

15 published by ACM
August 2003 ISLPED '03: Proceedings of the 2003 international symposium on Low power electronics and design
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 0,   Downloads (12 Months): 4,   Downloads (Overall): 199

Full text available: PDF
Traditional pulldown comparators that are used to implement associative addressing logic in superscalar microprocessors dissipate energy on a mismatch in any bit position in the comparands. As mismatches occur much more frequently than matches in many situations, such circuits are extremely energy-inefficient. In recognition of this inefficiency, a series of ...
Keywords: low-power comparators, superscalar datapath

16 published by ACM
August 2003 ISLPED '03: Proceedings of the 2003 international symposium on Low power electronics and design
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 0,   Downloads (12 Months): 3,   Downloads (Overall): 279

Full text available: PDF
Modern superscalar processors implement precise interrupts by using the Reorder Buffer (ROB). In some microarchitectures, such as the Intel P6, the ROB also serves as a repository for the uncommitted results. In these designs, the ROB is a complex multi-ported structure that dissipates a significant percentage of the overall ...
Keywords: low-power design, low-complexity datapath, reorder buffer, short-lived values

17
January 2003
Bibliometrics:
Citation Count: 0

The design of high-end microprocessors is increasingly constrained by high levels of power consumption. In this dissertation, we propose several microarchitectural-level, technology-independent solutions for reducing the power requirements of a superscalar microprocessor without seriously impacting its performance. First, we introduce a mechanism to dynamically allocate processor's resources to adjust to ...

18
September 2002 PATMOS '02: Proceedings of the 12th International Workshop on Integrated Circuit Design. Power and Timing Modeling, Optimization and Simulation
Publisher: Springer-Verlag
Bibliometrics:
Citation Count: 4

Some of today's superscalar processors, such as the Intel Pentium III, implement physical registers using the Reorder Buffer (ROB) slots. As much as 27% of the total CPU power is expended within the ROB in such designs, making the ROB a dominant source of power dissipation within the processor. This ...

19 published by ACM
June 2002 ICS '02: Proceedings of the 16th international conference on Supercomputing
Publisher: ACM
Bibliometrics:
Citation Count: 8
Downloads (6 Weeks): 2,   Downloads (12 Months): 13,   Downloads (Overall): 561

Full text available: PDF
In some of today's superscalar processors (e.g., the Pentium III), the result repositories are implemented as the Reorder Buffer (ROB) slots. In such designs, the ROB is a complex multi-ported structure that occupies a significant portion of the die area and dissipates a non-trivial fraction of the total chip power, as ...
Keywords: low-power design, low-complexity datapath, reorder buffer

20
March 2002 DATE '02: Proceedings of the conference on Design, automation and test in Europe
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 13
Downloads (6 Weeks): 0,   Downloads (12 Months): 3,   Downloads (Overall): 207

Full text available: PDF
This paper describes the AccuPower toolset -- a set of simulation tools accurately estimating the power dissipation within a superscalar microprocessor. AccuPower uses a true hardware level and cycle level microarchitectural simulator and energy dissipation coefficients gleaned from SPICE measurements of actual CMOS layouts of critical datapath components. Transition counts can be obtained at the level of ...



The ACM Digital Library is published by the Association for Computing Machinery. Copyright © 2018 ACM, Inc.