×

Efficient dimensionality reduction for canonical correlation analysis. (English) Zbl 1307.65015

Recall first that the main goal of the canonical correlation analysis (CCA) is to analyze the relation between pairs of datasets, each in matrix form. From the algebraic point of view CCA measures the similarities between two subspaces, while from the geometric point of view CCA computes the cosine of the principal angle between the two subspaces.
The main contribution of this paper is a fast algorithm to compute an approximate CCA. The algorithm computes an approximation both to all the canonical correlations and a set of approximate canonical weights with provable guarantee.
The authors claim that it is the first subcubic time algorithm approximating CCA that has provable guarantees. The scheme is particularly effective when the matrices are tall and thin, i.e., there are many more rows than columns. It is due to the fact that the proposed algorithm is based on dimensionality reduction. More precisely, given a pair of matrices the authors propose to transform the pair into a new pair of matrices that has many fewer rows, and then to calculate the canonical correlations of the new pair exactly, alongside a set of canonical weights using the Björck and Golub algorithm.

MSC:

65C60 Computational problems in statistics (MSC2010)
62H20 Measures of association (correlation, canonical correlation, etc.)
65F30 Other matrix algorithms (MSC2010)
11K45 Pseudo-random numbers; Monte Carlo methods
PDFBibTeX XMLCite
Full Text: DOI arXiv Link