Memory continues to be a major bottleneck in almost all computing systems. It is becoming more so as more cores and other agents share parts of the memory system, and as the applications that run on those cores become increasingly data intensive. Continuing the tradition of eight previous successful incarnations, MSPC 2014 provided a forum for discussing all aspects of memory performance and correctness on a variety of systems (multi-core, desktop, embedded, server/cloud, high-performance computing, sensor, etc.) and related software and hardware innovations at various levels of the technology stack.
Proceeding Downloads
A study of connected object locality in NUMA heaps
Reference locality is vital to the performance of parallel Garbage Collection (GC) running on Non-Uniform Memory Access (NUMA) machines. A GC thread may trace remotely placed objects that descend from the root set or, for load balance, a GC thread may ...
Affinity-based hash tables
From a trace of data accesses, it is possible to calculate an affinity hierarchy that groups related data together. Combining this hierarchy with the extremely common hash table, there is an opportunity to both improve cache performance and enable novel ...
Feedback directed optimization of TCMalloc
TCMalloc [9] is an open-source memory allocator. Its use of thread-local caches of free objects enables most allocations/deallocations to be satisfied from thread-local heaps not requiring locks, making it a highly scalable memory allocator for multi-...
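To make the thread-local caching idea concrete, here is a minimal sketch in Python. It is not TCMalloc's code or API: the class names `CentralFreeList` and `ThreadCachingAllocator`, the batch size, and the cache limit are all hypothetical, and integer ids stand in for memory blocks. It only illustrates how most allocations and frees can be served from a per-thread list without taking a lock.

```python
import threading


class CentralFreeList:
    """Shared pool of free object ids, guarded by a lock (hypothetical name)."""

    def __init__(self, n_objects):
        self._free = list(range(n_objects))
        self._lock = threading.Lock()
        self.lock_acquisitions = 0  # how often any thread had to lock the pool

    def take_batch(self, n):
        with self._lock:
            self.lock_acquisitions += 1
            batch, self._free = self._free[:n], self._free[n:]
            return batch

    def give_batch(self, objs):
        with self._lock:
            self.lock_acquisitions += 1
            self._free.extend(objs)


class ThreadCachingAllocator:
    """Serves requests from a per-thread cache; falls back to the shared pool."""

    def __init__(self, n_objects, batch=8, max_cached=16):
        self.central = CentralFreeList(n_objects)
        self.batch = batch            # assumed refill size
        self.max_cached = max_cached  # assumed per-thread cache limit
        self._tls = threading.local()

    def _cache(self):
        # Each thread lazily gets its own private free list.
        if not hasattr(self._tls, "free"):
            self._tls.free = []
        return self._tls.free

    def allocate(self):
        cache = self._cache()
        if not cache:
            # Slow path: refill the local cache from the shared pool.
            cache.extend(self.central.take_batch(self.batch))
            if not cache:
                raise MemoryError("pool exhausted")
        return cache.pop()  # fast path: no lock taken

    def free(self, obj):
        cache = self._cache()
        cache.append(obj)  # fast path: no lock taken
        if len(cache) > self.max_cached:
            # Slow path: hand the older half back to the shared pool.
            half = len(cache) // 2
            self.central.give_batch(cache[:half])
            del cache[:half]


alloc = ThreadCachingAllocator(n_objects=64)
objs = [alloc.allocate() for _ in range(8)]
for o in objs:
    alloc.free(o)
again = [alloc.allocate() for _ in range(8)]
print(alloc.central.lock_acquisitions)
```

In this run only the first allocation reaches the shared pool; the remaining allocations and all frees stay on the calling thread's private list.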
Main memory and cache performance of Intel Sandy Bridge and AMD Bulldozer
Application performance on multicore processors is seldom constrained by the speed of floating point or integer units. Much more often, limitations are caused by the memory subsystem, particularly shared resources such as last level caches or memory ...
Nonvolatile memory is a broken time machine
Energy harvesting enables intermittently powered devices to compute without built-in power. But frequent power failures, combined with nonvolatile memory intended to protect computational state, introduce strange control flow that turns sequential code ...
O-structures: semantics for versioned memory
This paper introduces O-structures, a novel architectural memory element that can be used to facilitate parallelism in task-based execution models. Much like register renaming, each write to an O-structure creates a new version of program memory at that ...
Outlawing ghosts: avoiding out-of-thin-air results
It is very difficult to define a programming language memory model for shared variables that both
• allows programmers to take full advantage of weakly-ordered memory operations, but still
• correctly disallows so-called "out-of-thin-air" results, i.e. ...
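A commonly cited illustration of an out-of-thin-air result, not taken from the truncated abstract above, is the two-thread program `r1 = x; y = r1` running alongside `r2 = y; x = r2`, with `x` and `y` initially 0. The sketch below is a toy model with hypothetical names. It enumerates every sequentially consistent interleaving and collects the possible register values. It does not model weakly-ordered memory; it only shows that no interleaving produces a value such as 42, which is the outcome a memory model is expected to keep forbidden even when it permits reordering.

```python
# Thread 1: r1 = x; y = r1        Thread 2: r2 = y; x = r2
# Each operation is (kind, shared variable, register).
T1 = [("load", "x", "r1"), ("store", "y", "r1")]
T2 = [("load", "y", "r2"), ("store", "x", "r2")]


def interleavings(a, b):
    """Yield every merge of a and b that preserves each thread's own order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(schedule):
    """Execute one interleaving against a single shared memory."""
    mem = {"x": 0, "y": 0}
    regs = {}
    for kind, var, reg in schedule:
        if kind == "load":
            regs[reg] = mem[var]
        else:
            mem[var] = regs[reg]
    return regs["r1"], regs["r2"]


outcomes = {run(s) for s in interleavings(T1, T2)}
print(outcomes)
```

Every value in this program is copied from the initial zeros, so any result other than `(0, 0)` would have to justify itself circularly.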
Trash in cache: detecting eternally silent stores
The gap between processing and storage speeds remains a concern for computer system designers and application developers. This disparity can be bridged in part by eliminating unnecessary stores, thereby reducing the amount of traffic that flows from the ...
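The truncated abstract does not define "eternally silent" stores, so the sketch below does not attempt to reproduce the paper's method. As an assumption, it uses the general notion of a silent store: a store that writes the value its location already holds, which makes the write redundant. The function name and the trace format, a list of `(address, value)` pairs, are hypothetical.

```python
def count_silent_stores(trace):
    """Count stores that rewrite the value already held at their address.

    The first store to an address is never counted, because the prior
    contents of that location are unknown in this toy model.
    """
    mem = {}
    silent = 0
    for addr, value in trace:
        if addr in mem and mem[addr] == value:
            silent += 1
        mem[addr] = value
    return silent


trace = [(0x10, 5), (0x10, 5), (0x20, 0), (0x10, 7), (0x10, 7)]
print(count_silent_stores(trace))
```

Here the second and fifth stores repeat the value already at `0x10`, so two of the five stores in the trace are redundant.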
Index Terms
Proceedings of the workshop on Memory Systems Performance and Correctness
Acceptance Rates
| Year | Submitted | Accepted | Rate |
|---|---|---|---|
| MSPC '14 | 20 | 6 | 30% |
| Overall | 20 | 6 | 30% |



