Compute on Data Path: Combating Data Movement in High Performance Computing

Simulation enabled by high performance computing is widely considered a third pillar of science, alongside theory and experimentation, and is a strategic tool in many aspects of scientific discovery and innovation. These simulations, however, have become highly data intensive in recent years for three reasons: data acquisition and generation have become much cheaper; newer high-resolution, multi-model scientific discovery both produces and requires more data; and there is a substantially increased recognition that useful insight can be mined from large amounts of data.

This project combats the increasingly critical data movement challenge in high performance computing. It studies the feasibility of a new Compute on Data Path methodology that is expected to improve both performance and energy efficiency. The methodology models computations and data as objects, with a data model that encapsulates and binds them. It fuses data motion and computation by leveraging the programming model and compiler, and it develops an object-based store and runtime to enable computation along the data path pipeline. Recent years have seen a proliferation of advanced high performance computing architectures, including multi- and many-core systems, co-processors and accelerators, and heterogeneous computing platforms. Software solutions that address the critical data movement challenge, however, have lagged significantly behind. This project has the potential to advance both the understanding of the problem and the software solutions for it, further unleashing the power of simulation enabled by high performance computing.
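
As an illustration only, the idea of binding computations to data and executing them along the data path can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the project's actual data model, store, or runtime: the names `BoundObject`, `Stage`, and `move_along_path`, the stage names, and the rule that each compute-capable stage runs one pending operation are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Operation = Callable[[List[float]], List[float]]


@dataclass
class BoundObject:
    """Hypothetical object binding a payload to the computations pending on it."""
    payload: List[float]
    pending: List[Operation] = field(default_factory=list)


@dataclass
class Stage:
    """Hypothetical stage on the data path (for example a storage or I/O node)."""
    name: str
    can_compute: bool

    def forward(self, obj: BoundObject) -> BoundObject:
        # Assumption: a compute-capable stage runs one pending operation
        # while the object passes through; other stages only relay it.
        if self.can_compute and obj.pending:
            operation = obj.pending.pop(0)
            obj.payload = operation(obj.payload)
        return obj


def move_along_path(obj: BoundObject, path: List[Stage]) -> BoundObject:
    """Move the object through every stage, computing where possible."""
    for stage in path:
        obj = stage.forward(obj)
    return obj


def keep_positive(values: List[float]) -> List[float]:
    # Filtering early shrinks the payload before it moves further.
    return [v for v in values if v > 0]


def total(values: List[float]) -> List[float]:
    # Reduce the payload to a single value.
    return [sum(values)]


obj = BoundObject(payload=[-1.0, 2.0, 3.0, -4.0], pending=[keep_positive, total])
path = [Stage("storage", True), Stage("network", False), Stage("io_node", True)]
result = move_along_path(obj, path)
```

In this sketch the filter runs at the storage stage and the reduction at the I/O stage, so the payload reaching the end of the path is already reduced rather than being shipped in full and processed afterwards.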

This project is funded by the National Science Foundation under grant CCF-1409946.

ACKNOWLEDGMENT: We are grateful to the National Science Foundation for the sponsorship of this project.