Largest eigenvalues in multivariate statistical analysis
Multivariate statistical analysis aims to discover, and test for, structure
in sample data whose unit of observation is a possibly high-dimensional
vector with correlated components. In the classical techniques, the
eigenvalues of (Wishart-distributed) covariance matrices play a central role: we
introduce these techniques and illustrate them with examples from genetics and
finance. The
distribution theory for the eigenvalues is complicated, but in recent years
random matrix theory has given new impetus to simpler approximate results,
obtained by treating the number of variables as large. We focus on the largest
eigenvalue in particular, and review approximations to its null distribution
based on the celebrated Tracy-Widom laws, aiming to show that they are accurate
enough for routine applied use even in quite low dimensions. We also briefly
mention behavior under some non-null alternatives and make a few further
remarks about applications.