Domain Specific Memory Management for Large Scale Data Analytics

Shyamshankar, Panchapakesan Chitra

Files

SHYAMSHANKAR-DISSERTATION-2018.pdf (705.25 KB)

Embargo until

2019-05-01

Date

2018-04-12

Authors

Shyamshankar, Panchapakesan Chitra

Publisher

Johns Hopkins University

Abstract

Hardware trends over the last several decades have led to shifting priorities with respect to performance bottlenecks in the implementations of dataflows typically present in large-scale data analytics applications. In particular, efficient use of main memory has emerged as a critical aspect of dataflow implementation, due to the proliferation of multi-core architectures, as well as the rapid development of faster-than-disk storage media. At the same time, the wealth of static domain-specific information about applications remains an untapped resource when it comes to optimizing the use of memory in a dataflow application. We propose a compilation-based approach to the synthesis of memory-efficient dataflow implementations, using static analysis to extract and leverage domain-specific information about the application. Our program transformations use the combined results of type, effect, and provenance analyses to infer time- and space-effective placement of primitive memory operations, precluding the need for dynamic memory management and its attendant costs. The experimental evaluation of implementations synthesized with our framework shows both the importance of optimizing for memory performance and the significant benefits of our approach along multiple dimensions. Finally, we also demonstrate a framework for formally verifying the soundness of these transformations, laying the foundation for their use as a component of a more general implementation synthesis ecosystem.

Keywords

big data, data analytics, memory management, compilation, optimization, formal methods

URI

http://jhir.library.jhu.edu/handle/1774.2/61039

Collections

ETD -- Doctoral Dissertations
