Fast Matrix Multiplication in Small Formats: Discovering New Schemes with an Open-Source Flip Graph Framework
Abstract
An open-source C++ framework for discovering fast matrix multiplication schemes using the flip graph approach is presented. The framework supports multiple coefficient rings -- binary (Z_2), modular ternary (Z_3) and integer ternary (Z_T = {-1,0,1}) -- and implements both fixed-dimension and meta-dimensional search operators. Using efficient bit-level encoding of coefficient vectors and OpenMP parallelism, the tools enable large-scale exploration on commodity hardware. The study covers 680 schemes ranging from (2 times 2 times 2) to (16 times 16 times 16), with 276 schemes now in Z_T coefficients and 117 in integer coefficients. With this framework, the multiplicative complexity (rank) is improved for 79 matrix multiplication schemes. Notably, a new 4 times 4 times 10 scheme requiring only 115 multiplications is discovered, achieving ωapprox 2.80478 and beating Strassen's exponent for this specific size. Additionally, 93 schemes are rediscovered in ternary coefficients that were previously known only over rationals or integers, and 68 schemes in integer coefficients that previously required fractions. All tools and discovered schemes are made publicly available to enable reproducible research.
Community
71 improved ranks for matrix multiplication, a new 4x4x10 scheme with ω < 2.807 and 178 schemes rediscovered in ternary/integer coefficients. Results obtained with an open-source flip graph framework.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Tensor Decomposition for Non-Clifford Gate Minimization (2026)
- Complex to Rational Fast Matrix Multiplication (2026)
- Learning-Augmented Perfectly Secure Collaborative Matrix Multiplication (2026)
- Neural Learning of Fast Matrix Multiplication Algorithms: A StrassenNet Approach (2026)
- Massively Parallel Reductions in Multivariate Polynomial Systems: Bridging the Symbolic Preprocessing Gap on GPGPU Architectures (2026)
- Multivariate Polynomial Codes for Efficient Matrix Chain Multiplication in Distributed Systems (2026)
- Towards Minimal Fault-tolerant Error-Correction Sequence with Quantum Hamming Codes (2026)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any Paper on Hugging Face checkout this Space
You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend
Models citing this paper 0
No model linking this paper
Datasets citing this paper 0
No dataset linking this paper
Spaces citing this paper 0
No Space linking this paper
Collections including this paper 0
No Collection including this paper