Batched transpose-free ADI-type preconditioners for a Poisson solver on GPGPUs
Publisher
Elsevier
Abstract
We investigate the iterative solution of a symmetric positive definite linear system whose system matrix is the shifted Laplacian on General Purpose Graphics Processing Units (GPGPUs). We consider in particular the Chebyshev iteration because of its reduced global communication. The ADI-type preconditioner involves solving multiple (batched) symmetric positive definite tridiagonal Toeplitz systems along each coordinate direction. We investigate several variants for solving these tridiagonal systems: the Thomas algorithm, the Thomas algorithm combined with the SPIKE algorithm, and a polynomial approximation of the inverse. We test the various implementations numerically on two- and three-dimensional examples. It turns out that a combination of the Thomas algorithm and the approximate inverse leads to a solution that needs neither tiling nor transpositions. As a result, none of the kernels uses an extensive amount of shared memory, which yields very high GPU utilization and, more importantly, optimal coalesced global memory access patterns.
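As a rough illustration of the batched tridiagonal solves mentioned above, the following is a minimal sequential Python sketch of the Thomas algorithm for a symmetric tridiagonal Toeplitz matrix. It is not the authors' GPU implementation. The function name `thomas_toeplitz_batched` and the parameters `a` (constant diagonal), `b` (constant off-diagonal), and `rhs` (list of right-hand sides) are hypothetical names chosen for this example, and the matrix is assumed to be positive definite (for instance `a > 2*abs(b)`). Because the matrix is Toeplitz and shared by the whole batch, the elimination coefficients are computed once and reused for every right-hand side.

```python
def thomas_toeplitz_batched(a, b, rhs):
    """Solve T x = r for every right-hand side r in rhs.

    T is the n x n symmetric tridiagonal Toeplitz matrix with constant
    diagonal a and constant off-diagonal b (hypothetical parameter names).
    Assumption: T is positive definite, so no pivoting is required.
    """
    n = len(rhs[0])

    # The elimination coefficients depend only on (a, b, n), so they are
    # computed once and shared by all systems in the batch.
    inv_piv = [0.0] * n   # reciprocals of the pivots
    c = [0.0] * n         # modified super-diagonal entries
    inv_piv[0] = 1.0 / a
    c[0] = b * inv_piv[0]
    for i in range(1, n):
        inv_piv[i] = 1.0 / (a - b * c[i - 1])
        c[i] = b * inv_piv[i]

    solutions = []
    for r in rhs:
        # Forward substitution on the right-hand side.
        d = [0.0] * n
        d[0] = r[0] * inv_piv[0]
        for i in range(1, n):
            d[i] = (r[i] - b * d[i - 1]) * inv_piv[i]

        # Backward substitution.
        x = [0.0] * n
        x[n - 1] = d[n - 1]
        for i in range(n - 2, -1, -1):
            x[i] = d[i] - c[i] * x[i + 1]
        solutions.append(x)
    return solutions


# Example: two right-hand sides for the 5 x 5 matrix tridiag(-1, 4, -1).
batch = [[1.0, 2.0, 3.0, 4.0, 5.0], [0.0, 0.0, 1.0, 0.0, 0.0]]
xs = thomas_toeplitz_batched(4.0, -1.0, batch)
```

In this sketch each system in the batch is independent of the others, so the outer loop over right-hand sides is the natural candidate for parallel execution, while the inner recurrences remain sequential in the unknown index.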
Subject(s)
shifted Poisson problem, ADI preconditioner, batched triangular systems, General Purpose Graphics Processing Unit (GPGPU)
Citation
Journal of Parallel and Distributed Computing. 2020, vol. 137, p. 148-159.