for Next-Generation Hardware

To apply, please see the details of the job postings here.

On modern computer architectures, floating point computations are much less expensive than data movement, both between levels of the memory hierarchy and between parallel processors. This high cost of data movement can present a significant obstacle to scalable implementation of algorithms for numerical linear algebra, which are ubiquitous throughout scientific and data analysis applications. This is particularly true in the case of sparse computations, which inherently suffer from a low computation-to-communication ratio. Common approaches to eliminating performance bottlenecks in numerical linear algebra include, for example, reordering computations, sparsification, approximation of linear operators, asynchronous execution, and use of low precision hardware.
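The low computation-to-communication ratio of sparse kernels can be made concrete with a back-of-the-envelope arithmetic-intensity estimate. The sketch below (a hypothetical illustration, not a measurement; the storage-size assumptions are 8-byte values and 4-byte indices in CSR format) counts flops and a lower bound on memory traffic for a sparse matrix-vector product:

```python
def spmv_intensity(n_rows, nnz, val_bytes=8, idx_bytes=4):
    """Rough flops-per-byte estimate for y = A @ x with a square
    matrix A stored in CSR format (optimistic lower-bound traffic)."""
    flops = 2 * nnz                        # one multiply + one add per nonzero
    bytes_moved = (
        nnz * (val_bytes + idx_bytes)      # matrix values + column indices
        + (n_rows + 1) * idx_bytes         # row-pointer array
        + 2 * n_rows * val_bytes           # read x once (optimistic), write y
    )
    return flops / bytes_moved

# Example: a matrix with one million rows and ~5 nonzeros per row
# yields roughly 0.125 flops/byte -- far below the tens of flops
# per byte modern processors need to approach peak performance,
# so the kernel is bound by data movement, not arithmetic.
intensity = spmv_intensity(10**6, 5 * 10**6)
```

Under these assumptions the intensity stays near 0.1 flops/byte regardless of problem size, which is why the data-movement optimizations listed above (reordering, sparsification, low precision) matter so much for sparse computations.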

Although such modifications may improve the performance of certain computations, the introduction of inexactness can drastically alter numerical behavior in terms of convergence rate and accuracy. Because of finite-precision error, this holds even for variants that are mathematically equivalent to the original algorithm. In practical settings, understanding this behavior is crucial: if we design a new algorithm to reduce the time per iteration, we must ensure that any change to the convergence rate does not prohibit an overall speedup; if we design a technique to improve the convergence rate, we must also ensure that the extra computation does not exacerbate performance bottlenecks. Clearly, we must also ensure that the attainable accuracy remains suitable for the target application.

Thus a clear, thorough understanding of how inexact computations, due to finite-precision error and/or intentional approximation, affect the numerical behavior of iterative methods is imperative in balancing the tradeoffs between accuracy, convergence rate, and cost per iteration in high-performance codes. Such a holistic approach to performance optimization becomes especially important at the exascale level.

This project aims to address this challenge through the following objectives:

- analyze the numerical properties of iterative methods for sparse linear algebra problems and their popular high-performance variants when computations are subject to inexactness;
- based on insights from numerical analysis, design and implement new efficient and reliable algorithms and tools for emerging HPC machines that can exploit tradeoffs between performance and accuracy;
- apply these new algorithms to improve performance in large-scale applications in both scientific computing and data science application domains.

The ScaNLA project will begin in Fall 2019, funded through the Charles University PRIMUS initiative.