Schwartz, Oded; Vaknin, Noa. Pebbling game and alternative basis for high performance matrix multiplication. (English) Zbl 1527.15001 SIAM J. Sci. Comput. 45, No. 6, C277-C303 (2023). MSC: 15-04 65F99 65K05
Gareev, Roman A.; Akimova, Elena N. Analytical modeling of matrix-vector multiplication on multicore processors. (English) Zbl 1527.65028 Math. Methods Appl. Sci. 45, No. 15, 8769-8799 (2022). MSC: 65F99 65Y05
Van Zee, Field G.; Parikh, Devangi N.; van de Geijn, Robert A. Supporting mixed-domain mixed-precision matrix multiplication within the BLIS framework. (English) Zbl 07467972 ACM Trans. Math. Softw. 47, No. 2, Article No. 12, 26 p. (2021). MSC: 65-XX
Ji, Hao; Mascagni, Michael; Li, Yaohang. Gaussian variant of Freivalds’ algorithm for efficient and reliable matrix product verification. (English) Zbl 1470.65087 Monte Carlo Methods Appl. 26, No. 4, 273-284 (2020). MSC: 65F99 65C05 62P99
Huang, Jianyu; Matthews, Devin A.; van de Geijn, Robert A. Strassen’s algorithm for tensor contraction. (English) Zbl 1416.65117 SIAM J. Sci. Comput. 40, No. 3, C305-C326 (2018). MSC: 65F30 65Y20 15A69
Matthews, Devin A. High-performance tensor contraction without transposition. (English) Zbl 1379.65024 SIAM J. Sci. Comput. 40, No. 1, C1-C24 (2018). MSC: 65F30 15A69 65Y20
Van Zee, Field G.; Smith, Tyler M. Implementing high-performance complex matrix multiplication via the 3m and 4m methods. (English) Zbl 1484.65093 ACM Trans. Math. Softw. 44, No. 1, Article No. 7, 36 p. (2017). MSC: 65F99
Bosner, Nela; Karlsson, Lars. Parallel and heterogeneous \(m\)-Hessenberg-triangular-triangular reduction. (English) Zbl 1355.65047 SIAM J. Sci. Comput. 39, No. 1, C29-C47 (2017). MSC: 65F05 15A21
Low, Tze Meng; Igual, Francisco D.; Smith, Tyler M.; Quintana-Orti, Enrique S. Analytical modeling is enough for high-performance BLIS. (English) Zbl 1369.65200 ACM Trans. Math. Softw. 43, No. 2, Article No. 12, 18 p. (2016). MSC: 65Y15 65Fxx
Liu, Hui; Chen, Zhangxin; Yang, Bo. Accelerating preconditioned iterative linear solvers on GPU. (English) Zbl 1463.65051 Int. J. Numer. Anal. Model., Ser. B 5, No. 1-2, 136-146 (2014). MSC: 65F10 65F08 65F50 65Y05 65Y10 68W10
Rump, Siegfried M. Fast interval matrix multiplication. (English) Zbl 1264.65065 Numer. Algorithms 61, No. 1, 1-34 (2012). Reviewer: Günter Mayer (Rostock) MSC: 65F30 65G30
Ozaki, Katsuhisa; Ogita, Takeshi; Oishi, Shin’ichi; Rump, Siegfried M. Error-free transformations of matrix multiplication by using fast routines of matrix multiplication and its applications. (English) Zbl 1244.65062 Numer. Algorithms 59, No. 1, 95-118 (2012). Reviewer: Frank Uhlig (Auburn) MSC: 65F30 15A24
Ozaki, Katsuhisa; Ogita, Takeshi; Oishi, Shin’ichi. Tight and efficient enclosure of matrix multiplication by using optimized BLAS. (English) Zbl 1249.65098 Numer. Linear Algebra Appl. 18, No. 2, 237-248 (2011). MSC: 65F30 65G30 65G20
Ballard, Grey; Demmel, James; Holtz, Olga; Schwartz, Oded. Minimizing communication in numerical linear algebra. (English) Zbl 1246.68128 SIAM J. Matrix Anal. Appl. 32, No. 3, 866-901 (2011). Reviewer: Gudula Rünger (Chemnitz) MSC: 68Q25 68W10 68W15 68W40 65Y05 65Y10 65Y20 65F30
Chowdhury, Rezaul Alam; Ramachandran, Vijaya. The cache-oblivious Gaussian elimination paradigm: theoretical framework, parallelization and experimental evaluation. (English) Zbl 1213.68070 Theory Comput. Syst. 47, No. 4, 878-919 (2010). MSC: 68M07 65F05 65Y10 68W40
Buluç, Aydın; Gilbert, John R.; Budak, Ceren. Solving path problems on the GPU. (English) Zbl 1204.68043 Parallel Comput. 36, No. 5-6, 241-253 (2010). MSC: 68M99
Kumar, Vinay B. Y.; Joshi, Siddharth; Patkar, Sachin B.; Narayanan, H. FPGA based high performance double-precision matrix multiplication. (English) Zbl 1206.68069 Int. J. Parallel Program. 38, No. 3-4, 322-338 (2010). MSC: 68M20 68M07 68M99
Liberty, Edo; Zucker, Steven W. The Mailman algorithm: a note on matrix-vector multiplication. (English) Zbl 1191.68832 Inf. Process. Lett. 109, No. 3, 179-182 (2009). MSC: 68W05
Guarracino, Mario R.; Perla, Francesca; Zanetti, Paolo. A sparse nonsymmetric eigensolver for distributed memory architectures. (English) Zbl 1146.68483 Int. J. Parallel Emergent Distrib. Syst. 23, No. 3, 259-270 (2008). MSC: 68W30 65F15 65F50 68W10
Ogita, Takeshi; Oishi, Shin’ichi. Fast inclusion of interval matrix multiplication. (English) Zbl 1072.65063 Reliab. Comput. 11, No. 3, 191-205 (2005). MSC: 65F30 65G30
Valsalam, Vinod; Skjellum, Anthony. A framework for high-performance matrix multiplication based on hierarchical abstractions, algorithms and optimized low-level kernels. (English) Zbl 1008.68530 Concurrency Comput. Pract. Exp. 14, No. 10, 805-839 (2002). MSC: 68U99 65F30 65Y10 68W15
Rump, Siegfried M. Fast verification algorithms in Matlab. (English) Zbl 0987.65037 In: Alefeld, Götz (ed.) et al., Symbolic algebraic methods and verification methods. Wien: Springer. 209-226 (2001). Reviewer: Günter Mayer (Speyer) MSC: 65G20 65F30 65G30 65F15 65Y15 68W30
Kågström, Bo; Ling, Per; Van Loan, Charles. GEMM-based level 3 BLAS: High-performance model implementations and performance evaluation benchmark. (English) Zbl 0930.65047 ACM Trans. Math. Softw. 24, No. 3, 268-302 (1998). MSC: 65F30 65F05 65Y20 65Y15 65Y05
Choi, Jaeyoung. A new parallel matrix multiplication algorithm on distributed-memory concurrent computers. (English) Zbl 0903.68088 Concurrency Pract. Exp. 10, No. 8, 655-670 (1998). MSC: 68W10 68M99
Dumitrescu, Bogdan. Improving and estimating the accuracy of Strassen’s algorithm. (English) Zbl 0911.65034 Numer. Math. 79, No. 4, 485-499 (1998). Reviewer: W. Gander (Zürich) MSC: 65F30 65F35 65G50
Field, Martyn R. Optimizing a parallel conjugate gradient solver. (English) Zbl 0913.65028 SIAM J. Sci. Comput. 19, No. 1, 27-37 (1998). Reviewer: W. Schönauer (Karlsruhe) MSC: 65F10 65Y05 65F50
Nool, Margreet. Explicit parallel block Cholesky algorithms on the CRAY APP. (English) Zbl 0866.65024 Appl. Numer. Math. 19, No. 1-2, 91-114 (1995). Reviewer: M. Vajteršic (Bratislava) MSC: 65F05 65Y05 65Y10 65Y20 65F30
Gutheil, Inge; Krotz-Vogel, Werner. Performance of a parallel matrix multiplication routine on Intel iPSC/860. (English) Zbl 0815.65054 Parallel Comput. 20, No. 7, 953-974 (1994). Reviewer: C. A. De Moura (Fortaleza) MSC: 65F30 65Y05 65Y20
Mathur, Kapil K.; Johnsson, S. Lennart. Multiplication of matrices of arbitrary shape on a data parallel computer. (English) Zbl 0808.65051 Parallel Comput. 20, No. 7, 919-951 (1994). Reviewer: W. Schönauer (Karlsruhe) MSC: 65F30 65Y05
Alpern, B.; Carter, L.; Feig, E.; Selker, T. The uniform memory hierarchy model of computation. (English) Zbl 0938.68638 Algorithmica 12, No. 2-3, 72-109 (1994). MSC: 68Q05 68Q10 65Y05 65F30 65T50
Daydé, M.; Duff, I. S.; Petitet, A. A parallel block implementation of level-3 BLAS for MIMD vector processors. (English) Zbl 0888.65047 ACM Trans. Math. Softw. 20, No. 2, 178-193 (1994). MSC: 65F30 65Y05
Stathopoulos, Andreas; Fischer, Charlotte F. A Davidson program for finding a few selected extreme eigenpairs of a large, sparse, real, symmetric matrix. (English) Zbl 0878.65029 Comput. Phys. Commun. 79, No. 2, 268-290 (1994). MSC: 65F15 15-04 65F50
Higham, Nicholas J. Stability of a method for multiplying complex matrices with three real matrix multiplications. (English) Zbl 0777.65027 SIAM J. Matrix Anal. Appl. 13, No. 3, 681-687 (1992). Reviewer: F. Ribière-Michaud (Paris) MSC: 65F30 65E05
Puglisi, Chiara. Modification of the Householder method based on the compact WY representation. (English) Zbl 0756.65040 SIAM J. Sci. Stat. Comput. 13, No. 3, 723-726 (1992). Reviewer: F. Szidarovszky (Tucson) MSC: 65F05
Demmel, James W.; Higham, Nicholas J. Stability of block algorithms with fast level-3 BLAS. (English) Zbl 0892.65016 ACM Trans. Math. Softw. 18, No. 3, 274-291 (1992). MSC: 65F05 65G50
Bailey, David H.; Lee, King; Simon, Horst D. Using Strassen’s algorithm to accelerate the solution of linear systems. (English) Zbl 1215.65049 J. Supercomput. 4, No. 4, 357-371 (1991). MSC: 65F05 65F30
Higham, Nicholas J. Exploiting fast matrix multiplication within the level 3 BLAS. (English) Zbl 0900.65118 ACM Trans. Math. Softw. 16, No. 4, 352-368 (1990). MSC: 65F30
Johnsson, S. Lennart. Data parallel programming and basic linear algebra subroutines. (English) Zbl 0646.68042 Mathematical aspects of scientific software, Proc. Workshop, IMA Vol. Math. Appl. 14, 183-196 (1988). MSC: 68W30 68N25 65F30