While a major use of PCA is in making sense of sensory data, we tend to use it to interpret peptide profiles obtained by reverse-phase HPLC of cheese extracts. These chromatograms are typically very complex (perhaps 60-80 co-eluting peaks):
The variables (peak height data) are preprocessed according to the method of Piraino et al. (2004). The output from this preprocessing consists of retention-time classes, within which peak heights are accumulated using the distance from the centre of the class as a weight. Principal component (PC) analysis and hierarchical cluster analysis are then performed on the preprocessed data, using a covariance matrix and the between-groups linkage cluster method, respectively, and the output looks like the following. We plot the factor loadings for the first two principal components on the same X-axis scale as the preprocessed data and the original chromatograms, which makes it easy to see which peptide classes are important for separating the samples.
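For readers who want to try the general idea, here is a minimal sketch in Python with NumPy. It is not the published algorithm of Piraino et al. (2004): the triangular weighting function, the evenly spaced class centres, the `half_width` parameter and the function names are all assumptions made for illustration, and the example numbers are invented. The sketch accumulates peak heights into retention-time classes, weighting each peak by its distance from the class centre, and then runs a PCA on the covariance matrix of the resulting class data.

```python
import numpy as np


def accumulate_into_classes(ret_times, heights, centres, half_width):
    """Accumulate peak heights into retention-time classes.

    Assumption: the weight falls linearly from 1 at the class centre
    to 0 at a distance of half_width (a triangular weight). The
    published method may use a different weighting function.
    """
    ret_times = np.asarray(ret_times, dtype=float)
    heights = np.asarray(heights, dtype=float)
    centres = np.asarray(centres, dtype=float)
    # Distance of every peak from every class centre: (n_classes, n_peaks)
    dist = np.abs(centres[:, None] - ret_times[None, :])
    weights = np.clip(1.0 - dist / half_width, 0.0, None)
    # Weighted sum of peak heights within each class
    return weights @ heights


def pca_covariance(X, n_components=2):
    """PCA on the covariance matrix (variables centred, not scaled).

    Returns sample scores, loadings (eigenvectors scaled by the square
    root of their eigenvalues) and the fraction of variance explained.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # ascending eigenvalues
    total = eigvals.sum()
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return scores, loadings, eigvals / total


# Invented example: four chromatograms, each a list of
# (retention time in min, peak height) pairs.
chromatograms = [
    [(10.0, 4.0), (15.0, 2.0), (25.0, 6.0)],
    [(11.0, 5.0), (19.0, 1.0), (28.0, 3.0)],
    [(12.0, 1.0), (21.0, 7.0), (30.0, 2.0)],
    [(10.0, 2.0), (22.0, 6.0), (27.0, 4.0)],
]
centres = np.array([10.0, 20.0, 30.0])  # hypothetical class centres

# One row per sample, one column per retention-time class
X = np.array([
    accumulate_into_classes(*zip(*peaks), centres, half_width=10.0)
    for peaks in chromatograms
])
scores, loadings, explained = pca_covariance(X, n_components=2)
```

Plotting each column of `loadings` against `centres` gives the kind of loading plot described above, on the same retention-time axis as the preprocessed data. Working from the covariance matrix rather than the correlation matrix means that classes with larger peaks carry more weight in the analysis.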
Overall, we now see much more in our HPLC chromatograms. Our application of multivariate statistics to cheese data was started by, and has been helped immensely by, two wizards of stats in biological systems, Prof Eugenio Parente and Dr Paolo Piraino of Universita della Basilicata, Potenza, Italy. Eugenio (below) visited UCC this week to teach a course on the use of multivariate techniques in food science, and we extend our thanks to him once more.
Piraino, P., E. Parente and P.L.H. McSweeney (2004). Processing of chromatographic data for chemometric analysis of peptide profiles from cheese extracts: a novel approach. Journal of Agricultural and Food Chemistry 52, 6904-6911.