83. A Novel Variable-Blocking Representation for Efficient Sparse Matrix-Vector Multiply on GPUs
Authors: Tuowen Zhao (University of Utah), Tharindu Rusira (University of Utah), Khalid Ahmad (University of Utah), Mary Hall (University of Utah)
Abstract: Fillrate-guided block compressed sparse row (FBCSR) is a novel approach to improving the performance of sparse matrix-vector multiply (SpMV) on GPUs. It is motivated by the observation that many matrices arising in the finite element method consist of dense blocks of different sizes with unaligned starting positions. FBCSR identifies and extracts these local nonzero patterns, which can improve memory access and reduce intra-warp divergence in the corresponding SpMV kernels. Compared with other variable-blocking methods such as unaligned block compressed sparse row, FBCSR relies on local patterns and can generate larger blocks that are well suited to single instruction, multiple threads (SIMT) processing, while tolerating a bounded number of zero fill-ins. We present SpMV performance results for FBCSR on two generations of Nvidia GPUs, which show that FBCSR outperforms available alternatives for the matrices to which it is applicable.
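The idea of trading a bounded amount of zero fill-in for a dense, regularly shaped block can be illustrated with a small CPU-side sketch. This is not the FBCSR algorithm or its GPU kernel, whose details are not given in the abstract. The function names, the coordinate-dictionary matrix layout, the example matrix, and the 0.8 fill-rate threshold are all assumptions chosen for illustration.

```python
# Illustrative sketch only: NOT the FBCSR algorithm from the poster.
# All names, the matrix layout, and the threshold are hypothetical.

def fill_rate(nonzeros, row0, col0, height, width):
    """Fraction of cells in the candidate block that hold a stored nonzero."""
    hits = sum(1 for (r, c) in nonzeros
               if row0 <= r < row0 + height and col0 <= c < col0 + width)
    return hits / (height * width)

def extract_block(nonzeros, row0, col0, height, width):
    """Copy the block into a dense list of rows, padding holes with zeros."""
    block = [[0.0] * width for _ in range(height)]
    rest = {}
    for (r, c), v in nonzeros.items():
        if row0 <= r < row0 + height and col0 <= c < col0 + width:
            block[r - row0][c - col0] = v
        else:
            rest[(r, c)] = v  # entries outside the block stay in sparse form
    return block, rest

def spmv(block, row0, col0, rest, x, n_rows):
    """y = A @ x using one dense block plus the remaining scattered entries."""
    y = [0.0] * n_rows
    # Dense part: every row does the same amount of work, zeros included.
    for i, row in enumerate(block):
        for j, v in enumerate(row):
            y[row0 + i] += v * x[col0 + j]
    # Sparse remainder, one entry at a time.
    for (r, c), v in rest.items():
        y[r] += v * x[c]
    return y

# Hypothetical 4x4 matrix with a 2x3 block starting at the unaligned
# position (1, 1); the cell (2, 2) is a hole inside the block.
nonzeros = {(0, 0): 2.0, (1, 1): 1.0, (1, 2): 3.0, (1, 3): 4.0,
            (2, 1): 5.0, (2, 3): 6.0, (3, 0): 7.0}
x = [1.0, 2.0, 3.0, 4.0]

THRESHOLD = 0.8  # assumed minimum fill rate for accepting a block
rate = fill_rate(nonzeros, 1, 1, 2, 3)
if rate >= THRESHOLD:
    block, rest = extract_block(nonzeros, 1, 1, 2, 3)
    y = spmv(block, 1, 1, rest, x, 4)
```

In this example the candidate block stores five of its six cells, so accepting it costs one explicit zero. In a GPU kernel, the uniform per-row work of such a block is what would help keep threads in a warp on the same path, at the cost of the wasted multiply on the padded zero.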
Two-page extended abstract: pdf