 Mark Henry Oskin

Bibliometrics: publication history
Average citations per article: 19.31
Citation Count: 753
Publication count: 39
Publication years: 1998-2017
Available for download: 30
Average downloads per article: 1,062.87
Downloads (cumulative): 31,886
Downloads (12 Months): 1,464
Downloads (6 Weeks): 165

41 results found

Result 1 – 20 of 41

1
June 2018 ISCA '18: Proceedings of the 45th Annual International Symposium on Computer Architecture
Publisher: IEEE Press
Bibliometrics:
Citation Count: 2
Downloads (6 Weeks): 5,   Downloads (12 Months): 94,   Downloads (Overall): 94

Full text available: PDF
Recent studies on commercial hardware demonstrated that irregular GPU applications can bottleneck on virtual-to-physical address translations. In this work, we explore ways to reduce address translation overheads for such applications. We discover that the order of servicing a GPU's address translation requests (specifically, page table walks) plays a key role ...
Keywords: GPU, computer architecture, virtual address

2
June 2018 ISCA '18: Proceedings of the 45th Annual International Symposium on Computer Architecture
Publisher: IEEE Press
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 2,   Downloads (12 Months): 4,   Downloads (Overall): 4

Full text available: PDF
GPUs are becoming first-class compute citizens and increasingly support programmability-enhancing features such as shared virtual memory and hardware cache coherence. This enables them to run a wider variety of programs. However, a key aspect of general-purpose programming where GPUs still have room for improvement is the ability to invoke system ...
Keywords: GPUs, accelerators and domain-specific architecture, graphics oriented architecture, multicore and parallel architectures, virtualization and OS

3 published by ACM
November 2017 SC '17: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 3,   Downloads (12 Months): 31,   Downloads (Overall): 162

Full text available: PDF
Distributed systems incorporate GPUs because they provide massive parallelism in an energy-efficient manner. Unfortunately, existing programming models make it difficult to route a GPU-initiated network message. The traditional coprocessor model forces programmers to manually route messages through the host CPU. Other models allow GPU-initiated communication, but are inefficient for small ...
Keywords: fine-grain communication, graphics processing unit (GPU), message aggregation, partitioned global address space (PGAS)

4 published by ACM
July 2017 HPG '17: Proceedings of High Performance Graphics
Publisher: ACM
Bibliometrics:
Citation Count: 3
Downloads (6 Weeks): 15,   Downloads (12 Months): 117,   Downloads (Overall): 399

Full text available: PDF
Rendering 3D-360° VR video from a camera rig is computation-intensive and typically performed offline. In this paper, we target the most time-consuming step of the VR video creation process, high-quality flow estimation with the bilateral solver. We propose a new algorithm, the hardware-friendly bilateral solver, that enables faster runtimes than ...
Keywords: FPGA design, GPU algorithm, hardware accelerators, parallelism, real-time image processing, virtual reality

5 published by ACM
May 2017 DAMON '17: Proceedings of the 13th International Workshop on Data Management on New Hardware
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 4,   Downloads (12 Months): 81,   Downloads (Overall): 270

Full text available: PDF
General Purpose computing on Graphics Processing Units (GPGPU) has become an increasingly popular option for accelerating database queries. However, GPUs are not well-suited for all types of queries as data transfer costs can often dominate query execution. We develop a methodology for quantifying how well databases utilize GPU architectures using ...
Keywords: GPGPU, GPU database, profiling

6 published by ACM
October 2016 SoCC '16: Proceedings of the Seventh ACM Symposium on Cloud Computing
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 17,   Downloads (12 Months): 245,   Downloads (Overall): 245

Full text available: PDF
Distributed applications and web services, such as online stores or social networks, are expected to be scalable, available, responsive, and fault-tolerant. To meet these steep requirements in the face of high round-trip latencies, network partitions, server failures, and load spikes, applications use eventually consistent datastores that allow them to weaken ...
Keywords: consistency, programming model, type system

7
October 2015 PACT '15: Proceedings of the 2015 International Conference on Parallel Architecture and Compilation (PACT)
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 2

Advances in die-stacking (3D) technology have enabled the tight integration of significant quantities of DRAM with high-performance computation logic. How to integrate this technology into the overall architecture of a computing system is an open question. While much recent effort has focused on hardware-based techniques for using die-stacked memory (e.g., ...

8 published by ACM
October 2015 MEMSYS '15: Proceedings of the 2015 International Symposium on Memory Systems
Publisher: ACM
Bibliometrics:
Citation Count: 3
Downloads (6 Weeks): 3,   Downloads (12 Months): 6,   Downloads (Overall): 103

Full text available: PDF
Emerging classes of computer vision applications demand unprecedented computational resources and operate on large amounts of data. In particular, k-nearest neighbors (kNN), a cornerstone algorithm in these applications, incurs significant data movement. To address this challenge, the underlying architecture and memory subsystems must vertically evolve to address memory bandwidth and ...

9
July 2015 USENIX ATC '15: Proceedings of the 2015 USENIX Conference on Usenix Annual Technical Conference
Publisher: USENIX Association
Bibliometrics:
Citation Count: 7

We present Grappa, a modern take on software distributed shared memory (DSM) for in-memory data-intensive applications. Grappa enables users to program a cluster as if it were a single, large, non-uniform memory access (NUMA) machine. Performance scales up even for applications that have poor locality and input-dependent load distribution. Grappa ...

10 published by ACM
April 2015 PaPoC '15: Proceedings of the First Workshop on Principles and Practice of Consistency for Distributed Data
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 4,   Downloads (12 Months): 29,   Downloads (Overall): 94

Full text available: PDF
Out of the many NoSQL databases in use today, some that provide simple data structures for records, such as Redis and MongoDB, are now becoming popular. Building applications out of these complex data types provides a way to communicate intent to the database system without sacrificing flexibility or committing to ...

11 published by ACM
October 2014 OOPSLA '14: Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 1,   Downloads (12 Months): 15,   Downloads (Overall): 130

Full text available: PDF
Partitioned Global Address Space (PGAS) environments simplify writing parallel code for clusters because they make data movement implicit - dereferencing global pointers automatically moves data around. However, it does not free the programmer from needing to reason about locality - poor placement of data can lead to excessive and even ...
Keywords: continuation-passing style, llvm, locality, pgas, thread migration
Also published in:
December 2014  ACM SIGPLAN Notices - OOPSLA '14: Volume 49 Issue 10, October 2014

12 published by ACM
June 2014 MSPC '14: Proceedings of the workshop on Memory Systems Performance and Correctness
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 1,   Downloads (12 Months): 4,   Downloads (Overall): 95

Full text available: PDF
This paper introduces O-structures, a novel architectural memory element that can be used to facilitate parallelism in task-based execution models. Much like register renaming, each write to an O-structure creates a new version of program memory at that location. These versions can be accessed concurrently and out of program order. ...
Keywords: out-of-order, memory renaming, O-structures

13
May 2011 HotPar'11: Proceedings of the 3rd USENIX conference on Hot topic in parallelism
Publisher: USENIX Association
Bibliometrics:
Citation Count: 8

Crunching large graphs is the basis of many emerging applications, such as social network analysis and bioinformatics. Graph analytics algorithms exhibit little locality and therefore present significant performance challenges. Hardware multithreading systems (e.g., Cray XMT) show that with enough concurrency, we can tolerate long latencies. Unfortunately, this solution is not ...

14
January 2010 IEEE Micro: Volume 30 Issue 1, January 2010
Publisher: IEEE Computer Society Press
Bibliometrics:
Citation Count: 7

Shared-memory multicore and multiprocessor systems are nondeterministic, which frustrates debugging and complicates testing of multithreaded code, impeding parallel programming's widespread adoption. The authors propose fully deterministic shared-memory multiprocessing that not only enhances debugging by offering repeatability by default, but also improves the quality of testing and the deployment of production ...
Keywords: debugging, determinism, multiprocessors, reliability

15 published by ACM
March 2009 ASPLOS XIV: Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Publisher: ACM
Bibliometrics:
Citation Count: 131
Downloads (6 Weeks): 2,   Downloads (12 Months): 89,   Downloads (Overall): 1,531

Full text available: PDF
Current shared memory multicore and multiprocessor systems are nondeterministic. Each time these systems execute a multithreaded application, even if supplied with the same input, they can produce a different output. This frustrates debugging and limits the ability to properly test multithreaded code, becoming a major stumbling block to the much-needed ...
Keywords: debugging, determinism, multicores, parallel programming
Also published in:
February 2009  ACM SIGPLAN Notices - ASPLOS 2009: Volume 44 Issue 3, March 2009
March 2009  ACM SIGARCH Computer Architecture News - ASPLOS 2009: Volume 37 Issue 1, March 2009

16 published by ACM
July 2008 Communications of the ACM - Web science: Volume 51 Issue 7, July 2008
Publisher: ACM
Bibliometrics:
Citation Count: 3
Downloads (6 Weeks): 8,   Downloads (12 Months): 94,   Downloads (Overall): 7,434

Full text available: HTML, PDF
How changes in computer architecture are about to impact everyone in the IT business.

17 published by ACM
July 2008 Communications of the ACM - Web science: Volume 51 Issue 7, July 2008
Publisher: ACM
Bibliometrics:
Citation Count: 3
Downloads (6 Weeks): 29,   Downloads (12 Months): 237,   Downloads (Overall): 3,881

Full text available: HTML, PDF
Researchers are optimistic, but a practical device is years away.

18
June 2008 ISCA '08: Proceedings of the 35th Annual International Symposium on Computer Architecture
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 8
Downloads (6 Weeks): 1,   Downloads (12 Months): 5,   Downloads (Overall): 490

Full text available: PDF
In this paper we present the first ever systematic design space exploration of microcoded software fault tolerant ion-trap quantum computers. This exploration reveals the critical importance of a well-tuned microcode for providing high performance and ensuring system reliability. In addition, we find that, despite recent advances in the reliability of ...
Keywords: Microcoded, Quantum, Architecture, Ion-Trap
Also published in:
June 2008  ACM SIGARCH Computer Architecture News: Volume 36 Issue 3, June 2008

19
June 2008 ISCA '08: Proceedings of the 35th Annual International Symposium on Computer Architecture
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 25
Downloads (6 Weeks): 6,   Downloads (12 Months): 16,   Downloads (Overall): 657

Full text available: PDF
As the number of cores per die increases, be they processors, memory blocks, or custom accelerators, the on-chip interconnect the cores use to communicate gains importance. We begin this study with an area-performance analysis of the interconnect design space. We find that there is no single network design that yields ...
Keywords: on-chip network, configurable hardware
Also published in:
June 2008  ACM SIGARCH Computer Architecture News: Volume 36 Issue 3, June 2008

20 published by ACM
June 2007 ISCA '07: Proceedings of the 34th annual international symposium on Computer architecture
Publisher: ACM
Bibliometrics:
Citation Count: 4
Downloads (6 Weeks): 2,   Downloads (12 Months): 24,   Downloads (Overall): 493

Full text available: PDF
We introduce a novel chip fabrication technique called "brick and mortar", in which chips are made from small, pre-fabricated ASIC bricks and bonded in a designer-specified arrangement to an inter-brick communication backbone chip. The goal of brick and mortar assembly is to provide a low-overhead method to produce custom chips, ...
Keywords: chip assembly, design re-use, interconnect design
Also published in:
June 2007  ACM SIGARCH Computer Architecture News: Volume 35 Issue 2, May 2007


