## Statistics & Methods Centre - Exploratory analysis

- Exploratory multivariate interdependence techniques aim to unravel relationships between variables and/or subjects without explicitly assuming specific distributions for the variables. The idea is to describe the patterns in the data without making (very) strong assumptions about the variables. In a sense, these techniques treat the sample as if it were the population, so generalisability should come from replication rather than from assumptions about sampling.
- The references for the books mentioned in this section can be found on the Courses and textbooks page.

- Principal component analysis: Numerical variables
- Principal component analysis: Categorical, ordinal, & numerical variables
- Correspondence analysis
- Multidimensional scaling
- Cluster analysis

- Principal component analysis: Numerical variables
- Principal component (or components) analysis is (1) a multivariate technique to investigate the correlations among a (large) number of variables; (2) a multivariate technique to identify linear combinations of a set of variables which explain as much of the variance in the data as possible. There are several other definitions and uses.
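
Definition (2) can be illustrated with a minimal NumPy sketch. The data matrix `X` is hypothetical, and the analysis is run on the correlation matrix, which is one common choice rather than the only one. The components are the eigenvectors of the correlation matrix, and each eigenvalue is the variance explained by its component.

```python
import numpy as np

# Hypothetical data: 6 subjects (rows) measured on 3 numerical variables (columns).
X = np.array([
    [2.5, 2.4, 1.2],
    [0.5, 0.7, 0.3],
    [2.2, 2.9, 1.1],
    [1.9, 2.2, 0.9],
    [3.1, 3.0, 1.6],
    [2.3, 2.7, 1.0],
])

# Standardise each variable so that the analysis is based on correlations.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(X, rowvar=False)

# Eigendecomposition of the symmetric correlation matrix.
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]           # largest variance first
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Proportion of the total variance explained by each component.
explained = eigenvalues / eigenvalues.sum()

# Loadings: correlations between the variables and the components.
loadings = eigenvectors * np.sqrt(np.clip(eigenvalues, 0.0, None))

# Component scores: the linear combinations of the standardised variables.
scores = Z @ eigenvectors

print(np.round(explained, 3))
print(np.round(loadings, 3))
```

The eigenvalues sum to the number of variables because each standardised variable contributes a variance of one.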

In textbooks and computer programs, principal component analysis is generally (and sometimes confusingly) discussed together with, and intertwined with, exploratory factor analysis. The indicated references also explain the differences between principal component analysis and factor analysis.

**Basic reading**
- Field, Chapter 15, section 15.3
- Hair et al., Chapter 3: Factor analysis
- Meyers et al., Chapter 12A: Principal components and factor analysis, and Chapter 12B: Principal components and factor analysis using SPSS

**Advanced reading**
- Jolliffe, I.T. (2002). *Principal component analysis* (2nd edition). New York: Springer.

**Software**
- SPSS => Analyze => Data reduction => Factor (=> Extraction to see that the default is indeed Principal components)
- Annotated output for SPSS - UCLA-ATS site
- Annotated output for SPSS - Andy Field site

**Reporting Principal component analysis in publications**

**Reporting - examples in journal articles**

- Principal component analysis: Categorical, ordinal, and numerical variables
- Categorical (or nonlinear) principal component analysis is a multivariate technique which extends standard principal component analysis to variables with all kinds of measurement levels and can also handle nonlinear relationships between numerical variables. The SPSS program CATPCA is very useful for making biplots, also in the case of numerical variables.
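
The optimal-scaling idea behind this technique can be sketched in a strongly simplified form. The sketch below is not the CATPCA algorithm as implemented in SPSS: it assumes that all variables are nominal, extracts a single dimension, and uses a plain alternating least squares loop. The data and all names are hypothetical. Each category receives a numerical value (its quantification), chosen so that the quantified variables agree as closely as possible with a single set of subject scores.

```python
import numpy as np

# Hypothetical data: 8 subjects, 3 nominal variables coded as integers.
data = np.array([
    [0, 0, 1],
    [0, 1, 1],
    [0, 0, 0],
    [1, 1, 0],
    [1, 2, 0],
    [1, 2, 1],
    [2, 2, 0],
    [2, 1, 1],
])
n, m = data.shape


def normalise(x):
    """Centre the scores and scale them to unit variance."""
    x = x - x.mean()
    return x / np.sqrt((x ** 2).mean())


def quantify(x):
    """Replace each category by the mean score of the subjects in that category."""
    q = np.empty((n, m))
    for j in range(m):
        for category in np.unique(data[:, j]):
            mask = data[:, j] == category
            q[mask, j] = x[mask].mean()
    return q


rng = np.random.default_rng(0)
x = normalise(rng.normal(size=n))        # arbitrary starting scores

for _ in range(500):
    # Alternate: update quantifications given scores, then scores given quantifications.
    x_new = normalise(quantify(x).mean(axis=1))
    converged = np.allclose(x_new, x, rtol=0.0, atol=1e-10)
    x = x_new
    if converged:
        break

quantified = quantify(x)                 # numerically quantified variables
print(np.round(x, 3))
print(np.round(quantified, 3))
```

Ordinal and numerical measurement levels would add order or linearity restrictions to the quantification step, which this sketch leaves out.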

**Basic reading**
- No standard English-language textbook exists which treats this subject.
- De Heus, P., Van der Leeden, R., & Gazendam, B. (1995). *Toegepaste data-analyse*, Chapter xx. Utrecht: Lemma.

**Advanced reading**
- Linting, M. et al. (2007). Nonlinear principal components analysis: Introduction and applications. *Psychological Methods, 12*, 336-358.
- Meulman, J.J., Van der Kooij, A.J., & Heiser, W.J. (2004). Principal component analysis with nonlinear optimal scaling transformations for ordinal and nominal data. In Kaplan, D. (Ed.), *Handbook of quantitative methodology for the social sciences* (pp. 49-70). London: Sage.
- Meulman, J.J., Heiser, W.J., & SPSS Inc. (2004). *SPSS Categories 13.0, Chapter 3*. Chicago, IL: SPSS.

**Software**
- SPSS => Analyze => Data reduction => Optimal scaling => Some variable(s) not multiple nominal
- Annotated output for SPSS in the SPSS Manual: Meulman, Heiser, & SPSS Inc.

**Reporting Categorical principal component analysis in publications**

**Reporting - examples in journal articles**

- Correspondence analysis
- Correspondence analysis is a technique applied to a contingency table (cross table). Its aim is to represent the row categories and column categories as well as possible in a plot, so that the interaction between rows and columns can be examined.
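
A minimal NumPy sketch of simple correspondence analysis follows. The contingency table `N` is hypothetical. The sketch takes the singular value decomposition of the standardised residuals from the independence model and derives principal coordinates for rows and columns, which are the points usually shown in the plot.

```python
import numpy as np

# Hypothetical contingency table: 4 row categories by 3 column categories.
N = np.array([
    [20, 10,  5],
    [15, 25, 10],
    [ 5, 15, 30],
    [10, 10, 20],
], dtype=float)

P = N / N.sum()          # correspondence matrix (relative frequencies)
r = P.sum(axis=1)        # row masses
c = P.sum(axis=0)        # column masses

# Standardised residuals from the independence model (outer product of the masses).
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

U, sv, Vt = np.linalg.svd(S, full_matrices=False)

inertia = sv ** 2                                   # principal inertia per dimension
row_coords = (U / np.sqrt(r)[:, None]) * sv         # principal coordinates of the rows
col_coords = (Vt.T / np.sqrt(c)[:, None]) * sv      # principal coordinates of the columns

# The total inertia times the sample size equals the Pearson chi-square statistic.
chi2 = N.sum() * inertia.sum()

print(np.round(inertia / inertia.sum(), 3))         # share of inertia per dimension
print(np.round(row_coords[:, :2], 3))
print(np.round(col_coords[:, :2], 3))
```

The first two columns of `row_coords` and `col_coords` would normally be plotted together. The last singular value is zero up to rounding, because a table with three columns has at most two dimensions of interaction.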

**Basic reading**
- Hair et al., Chapter 9 (p. 663-674, 691-697).

**Advanced reading**
- Greenacre, M.J. (1985). *Theory and applications of correspondence analysis*. Academic Press.
- Greenacre, M.J. (2007). *Correspondence analysis in practice* (2nd edition). Chapman and Hall.

**Software**
- SPSS => Analyze => Data reduction => Correspondence analysis
- Annotated output for SPSS in Meulman, Heiser, & SPSS Inc. (2004). *SPSS Categories 14.0* (Chapter 5: Correspondence analysis). Chicago: SPSS.

**Reporting Correspondence analysis in publications**

**Reporting - examples in journal articles**

- Multidimensional scaling - MDS
- Multidimensional scaling refers to a collection of techniques which aim to portray (dis)similarities between objects (however defined) in a graph such that the distances in the graph correspond as well as possible with the observed (dis)similarities. Different MDS techniques can have somewhat different definitions from the one given here.
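
As an illustration, the sketch below uses classical (Torgerson) scaling, one of the simplest MDS techniques. It is not the ProxScal procedure; classical scaling is swapped in here because it fits in a few lines of NumPy. The dissimilarity matrix `D` is hypothetical and was constructed from the corners of a 3 by 4 rectangle, so a two-dimensional configuration reproduces it exactly.

```python
import numpy as np

# Hypothetical dissimilarities between 4 objects (symmetric, zero diagonal).
# They are the distances between the corners of a 3 by 4 rectangle.
D = np.array([
    [0.0, 3.0, 4.0, 5.0],
    [3.0, 0.0, 5.0, 4.0],
    [4.0, 5.0, 0.0, 3.0],
    [5.0, 4.0, 3.0, 0.0],
])

n = D.shape[0]
J = np.eye(n) - np.ones((n, n)) / n       # centring matrix
B = -0.5 * J @ (D ** 2) @ J               # double-centred squared dissimilarities

eigenvalues, eigenvectors = np.linalg.eigh(B)
order = np.argsort(eigenvalues)[::-1]     # largest eigenvalue first
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

k = 2                                     # number of dimensions in the graph
coords = eigenvectors[:, :k] * np.sqrt(np.clip(eigenvalues[:k], 0.0, None))

# Distances between the points in the fitted configuration.
fitted = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)

print(np.round(coords, 3))
print(np.round(fitted, 3))
```

With real data the fitted distances only approximate the observed dissimilarities, and the size of the discarded eigenvalues gives a rough indication of how much is lost.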

**Basic reading**
- Hair et al., Chapter 9 (p. 629-662, 679-690).

**Advanced reading**
- Borg, I. & Groenen, P.J.F. *Modern multidimensional scaling: Theory and applications*. New York: Springer.

**Software**
- SPSS => Analyze => Scaling => Multidimensional scaling (ProxScal)
- Annotated output for SPSS in Meulman, Heiser, & SPSS Inc. (2004). *SPSS Categories 14.0* (Chapter 7: Multidimensional scaling (ProxScal)). Chicago: SPSS.

**Reporting Multidimensional scaling in publications**

**Reporting - examples in journal articles**

- Cluster analysis
- Cluster analysis consists of a group of methods aimed at grouping or clustering objects such that objects or subjects in a group share many of the characteristics measured. Clustering is generally done on objects rather than variables, even though the latter is possible.
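
One member of this group of methods, agglomerative hierarchical clustering with average linkage, can be sketched as follows. The data are hypothetical, and the function name `agglomerate` is introduced only for this illustration. Every subject starts as its own cluster, and the two closest clusters are merged repeatedly until the requested number of clusters remains.

```python
import numpy as np

# Hypothetical data: 6 subjects on 2 variables, forming two well-separated groups.
X = np.array([
    [1.0, 1.1],
    [1.2, 0.9],
    [0.8, 1.0],
    [5.0, 5.2],
    [5.1, 4.8],
    [4.9, 5.0],
])


def agglomerate(X, n_clusters):
    """Average-linkage agglomerative clustering; returns a list of index lists."""
    # Euclidean distances between all pairs of subjects.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    clusters = [[i] for i in range(len(X))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average distance over all pairs with one member in each cluster.
                d = D[np.ix_(clusters[a], clusters[b])].mean()
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]   # merge the two closest clusters
        del clusters[b]
    return clusters


clusters = agglomerate(X, n_clusters=2)
print(sorted(sorted(members) for members in clusters))
```

Clustering variables instead of subjects would amount to passing the transposed data matrix, usually with a correlation-based distance rather than the Euclidean distance assumed here.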

**Basic reading**
- Hair et al., Chapter 8: Cluster analysis.

**Advanced reading**
- Everitt, B.S., Landau, S., & Leese, M. (2001). *Cluster analysis* (4th edition). London: Hodder Arnold.

**Software**
- SPSS => Analyze => Classify => TwoStep clustering
- SPSS => Analyze => Classify => K-means clustering
- SPSS => Analyze => Classify => Hierarchical clustering
- Annotated output for SPSS

**Reporting Cluster analysis in publications**

**Reporting - examples in journal articles**