Welcome to FFTMatvec's Documentation!

This repository contains the code for FFTMatvec, described in the paper "Sreeram Venkat, Milinda Fernando, Stefan Henneking, and Omar Ghattas. Fast and Scalable FFT-Based GPU-Accelerated Algorithms for Block-Triangular Toeplitz Matrices with Application to Linear Inverse Problems Governed by Autonomous Dynamical Systems. SIAM Journal on Scientific Computing. 2025. To appear. arXiv preprint arXiv:2407.13066."

FFTMatvec is now performance portable to AMD GPUs and supports mixed-precision computations. See "Sreeram Venkat, Kasia Swirydowicz, Noah Wolfe, and Omar Ghattas. Mixed-Precision Performance Portability of FFT-Based GPU-Accelerated Algorithms for Block-Triangular Toeplitz Matrices. Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis. 2025. To appear. arXiv preprint arXiv:2508.10202 (https://arxiv.org/abs/2508.10202)."

Performance

Algorithm Animation

View an animation of the FFTMatvec algorithm here.

Source Code

The source code is available on GitHub.

Getting Started

To learn how to build and run the code, and to walk through a working example, see the Getting Started guide.

I/O and Data Formats

To learn how matrices and vectors are stored on disk (HDF5 format, directory layout, and data ordering), see the I/O and Data Format guide.

pyFFTMatvec

To use FFTMatvec from Python, including installation, the pyFFTMatvec API, and PyTorch GPU integration, see the pyFFTMatvec guide.

License

This code is released under the MIT License. See LICENSE for more information.