Statistics & Methods Centre - Exploratory analysis
- Exploratory multivariate interdependence techniques aim to unravel relationships between variables and/or subjects
without explicitly assuming specific distributions for the variables. The idea is to describe the patterns in the data
without making (very) strong assumptions about the variables. In a sense, these techniques treat the sample
as if it were the population: generalisability should come from replication rather than from assumptions about sampling.
- The references for the books mentioned in this section can be found on the
Courses and textbooks page
- Principal component (or components) analysis is (1) a multivariate technique to investigate the correlations among a (large) number of variables;
(2) a multivariate technique to identify linear combinations of a set of variables which explain as much
of the variance in the data as possible. There are several other definitions and uses.
In textbooks and computer programs, principal component analysis is generally (and sometimes
confusingly) discussed together and intertwined with exploratory factor analysis.
The indicated references also explain the differences between principal component analysis and factor analysis.
- Basic reading
- Field, Chapter 15, section 15.3
- Hair et al., Chapter 3: Factor analysis
- Meyers et al., Chapter 12A: Principal components and Factor analysis,
and Chapter 12B: Principal components and factor analysis using SPSS
- Advanced reading
- Jolliffe, I.T. (2002). Principal component analysis (2nd edition). New York: Springer
- Software
- SPSS => Analyze => Data reduction => Factor (=> Extraction shows that the default method is indeed Principal components)
- Annotated output for SPSS - UCLA-ATS site
- Annotated output for SPSS - Andy Field site
- Reporting Principal component analysis in publications
- Reporting - examples in journal articles
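As a hedged illustration of definition (2) of principal component analysis, the following Python sketch computes the correlation matrix of a small, made-up data set and extracts the first principal component by power iteration. The data, the function names (correlation_matrix, first_component) and the choice of power iteration are assumptions made for this example. It is not the algorithm SPSS uses internally, and it recovers only the first component.

```python
import math


def correlation_matrix(data):
    """Pearson correlation matrix for a list of rows (cases x variables)."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    sds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in data) / (n - 1))
        for j in range(p)
    ]
    # Standardise every variable, then average the cross-products.
    z = [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in data]
    return [
        [sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
        for a in range(p)
    ]


def first_component(corr, iterations=500):
    """Leading eigenvector (weights) and eigenvalue of corr by power iteration."""
    p = len(corr)
    v = [1.0 / math.sqrt(p)] * p  # assumed start vector
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue belonging to v.
    eigenvalue = sum(
        v[a] * sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)
    )
    return v, eigenvalue


# Hypothetical scores of six subjects on three variables (illustration only).
scores = [
    [2.0, 3.0, 9.0],
    [4.0, 5.0, 4.0],
    [6.0, 6.0, 7.0],
    [8.0, 9.0, 3.0],
    [10.0, 10.0, 8.0],
    [12.0, 13.0, 5.0],
]

corr = correlation_matrix(scores)
weights, eigenvalue = first_component(corr)
explained = eigenvalue / len(corr)

print("weights of the first component:", [round(w, 3) for w in weights])
print("proportion of variance explained:", round(explained, 3))
```

Because the analysis is based on the correlation matrix, the eigenvalues sum to the number of variables, so dividing an eigenvalue by the number of variables gives the proportion of variance accounted for by that component.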
- Categorical (or nonlinear) principal component analysis is a multivariate technique which extends standard
principal component analysis to variables with all kinds of measurement levels and can also handle nonlinear relationships
between numerical variables. The SPSS program CATPCA is very useful for making biplots, also in the case of
numerical variables.
- Basic reading
- No standard English-language textbook exists which treats this subject.
- De Heus, P., Van der Leeden, R., & Gazendam, B. (1995).
Toegepaste data-analyse Chapter xx. Utrecht: Lemma
- Advanced reading
- Linting, M. et al. (2007). Nonlinear principal components analysis: Introduction and applications.
Psychological Methods, 12, 336-358.
- Meulman, J.J., Van der Kooij, A.J., & Heiser, W.J. (2004). Principal component analysis with nonlinear
optimal scaling transformations for ordinal and nominal data. In Kaplan, D. (Ed.),
Handbook of quantitative methodology for the social sciences (pp. 49-70). London: Sage.
- Meulman, J.J., Heiser, W.J., & SPSS Inc. (2004). SPSS Categories 13.0, Chapter 3. Chicago, IL: SPSS
- Software
- SPSS => Analyze => Data reduction => Optimal scaling => Some variable(s) not multiple nominal
- Annotated output for SPSS in the SPSS Manual: Meulman, Heiser, & SPSS Inc.
- Reporting Categorical principal component analysis in publications
- Reporting - examples in journal articles
- Correspondence analysis is a technique which is applied to a contingency table (cross table).
It aims to represent the row categories and column categories as well as possible in a
plot, such that the interaction between rows and columns can be examined.
- Basic reading
- Hair et al., Chapter 9 (pp. 663-674, 691-697).
- Advanced reading
- Greenacre, M.J. (1985). Theory and applications of correspondence analysis. Academic Press
- Greenacre, M.J. (2007). Correspondence analysis in practice (2nd edition). Chapman and Hall
- Software
- SPSS => Analyze => Data reduction => Correspondence analysis
- Annotated output for SPSS in Meulman, Heiser, & SPSS Inc. (2004).
SPSS Categories 14.0 (Chapter 5: Correspondence analysis). Chicago: SPSS
- Reporting Correspondence analysis in publications
- Reporting - examples in journal articles
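To make the idea of correspondence analysis more concrete, the sketch below works through a small contingency table in Python. It computes the total inertia (the chi-square statistic divided by the table total) and the coordinates of the row and column categories on the first dimension. The table, the function name (correspondence_first_dimension) and the use of power iteration are assumptions for illustration only. The sketch returns a single dimension and does not reproduce the SPSS output.

```python
import math


def correspondence_first_dimension(table, iterations=500):
    """Total inertia plus first-dimension principal coordinates of a two-way table."""
    n = sum(sum(row) for row in table)
    n_rows, n_cols = len(table), len(table[0])
    r = [sum(row) / n for row in table]  # row masses
    c = [sum(table[i][j] for i in range(n_rows)) / n for j in range(n_cols)]  # column masses

    # Standardised residuals: (observed - expected proportion) / sqrt(expected).
    s = [
        [(table[i][j] / n - r[i] * c[j]) / math.sqrt(r[i] * c[j]) for j in range(n_cols)]
        for i in range(n_rows)
    ]
    total_inertia = sum(x * x for row in s for x in row)  # equals chi-square / n

    # Power iteration on S'S for the first right singular vector.
    # The start vector is an assumption; it must not be orthogonal to the solution.
    v = [1.0] + [0.0] * (n_cols - 1)
    for _ in range(iterations):
        u = [sum(s[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        w = [sum(s[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    sv = [sum(s[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
    sigma = math.sqrt(sum(x * x for x in sv))  # first singular value

    # Principal coordinates: singular vectors scaled by sigma and by the masses.
    row_coords = [sv[i] / math.sqrt(r[i]) for i in range(n_rows)]
    col_coords = [sigma * v[j] / math.sqrt(c[j]) for j in range(n_cols)]
    return total_inertia, sigma ** 2, row_coords, col_coords


# Hypothetical cross table: three age groups (rows) by three preferences (columns).
table = [
    [30, 10, 5],
    [15, 25, 10],
    [5, 15, 35],
]

inertia, first_inertia, row_coords, col_coords = correspondence_first_dimension(table)
print("total inertia:", round(inertia, 4))
print("share of inertia on dimension 1:", round(first_inertia / inertia, 3))
print("row coordinates:", [round(x, 3) for x in row_coords])
print("column coordinates:", [round(x, 3) for x in col_coords])
```

Row and column categories with coordinates of the same sign on a dimension tend to occur together more often than expected under independence, which is the kind of interaction the plot is meant to reveal.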
- Multidimensional scaling refers to a collection of techniques which aim to portray (dis)similarity
between objects (however defined) in a graph such that the distances in the graph correspond as well
as possible with the observed (dis)similarities. Different MDS techniques can have somewhat different
definitions from the one given here.
- Basic reading
- Hair et al., Chapter 9 (pp. 629-662, 679-690).
- Advanced reading
- Borg, I. & Groenen, P.J.F. Modern multidimensional scaling:
Theory and applications. New York: Springer.
- Software
- SPSS => Analyze => Scaling => Multidimensional scaling (ProxScal)
- Annotated output for SPSS in Meulman, Heiser, & SPSS Inc. (2004).
SPSS Categories 14.0 (Chapter 7: Multidimensional scaling (ProxScal)). Chicago: SPSS
- Reporting Multidimensional scaling in publications
- Reporting - examples in journal articles
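The following Python sketch illustrates the general idea of multidimensional scaling with classical (Torgerson) scaling, one of the simplest members of the MDS family. It is not the ProxScal procedure listed under Software, which uses a different, iterative approach. The dissimilarities, the function names (power_iteration, classical_mds) and the random start vector are assumptions made for this example.

```python
import math
import random


def power_iteration(matrix, iterations=1000):
    """Dominant eigenvalue and eigenvector of a symmetric matrix."""
    n = len(matrix)
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    v = [rng.random() for _ in range(n)]
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(n)) for a in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    value = sum(v[a] * sum(matrix[a][b] * v[b] for b in range(n)) for a in range(n))
    return value, v


def classical_mds(dissimilarities, dimensions=2):
    """Coordinates whose distances approximate the given dissimilarities."""
    n = len(dissimilarities)
    sq = [[d * d for d in row] for row in dissimilarities]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double centring turns squared dissimilarities into scalar products.
    b = [
        [-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean) for j in range(n)]
        for i in range(n)
    ]
    coords = [[] for _ in range(n)]
    for _ in range(dimensions):
        value, vector = power_iteration(b)
        scale = math.sqrt(max(value, 0.0))
        for i in range(n):
            coords[i].append(scale * vector[i])
        # Deflation: remove the dimension just found before extracting the next one.
        b = [[b[i][j] - value * vector[i] * vector[j] for j in range(n)] for i in range(n)]
    return coords


# Hypothetical dissimilarities among four objects (symmetric, zero diagonal).
dissimilarities = [
    [0.0, 3.0, 5.0, 4.0],
    [3.0, 0.0, 4.0, 5.0],
    [5.0, 4.0, 0.0, 3.0],
    [4.0, 5.0, 3.0, 0.0],
]

coords = classical_mds(dissimilarities, dimensions=2)
for label, point in zip("ABCD", coords):
    print(label, [round(x, 3) for x in point])
```

These made-up dissimilarities happen to be exact distances between the corners of a 3 by 4 rectangle, so two dimensions reproduce them without error. With real (dis)similarity data the fit is usually only approximate, and the orientation of the resulting plot is arbitrary.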
- Cluster analysis consists of a group of methods aimed at grouping or clustering objects such that
objects or subjects in a group share many of the characteristics measured. Clustering is generally done
on objects rather than variables, even though the latter is possible.
- Basic reading
- Hair et al., Chapter 8: Cluster analysis.
- Advanced reading
- Everitt, B.S., Landau, S., & Leese, M. (2001). Cluster analysis (4th edition).
London: Hodder Arnold.
- Software
- SPSS => Analyze => Classify => TwoStep clustering
- SPSS => Analyze => Classify => K-means clustering
- SPSS => Analyze => Classify => Hierarchical clustering
- Annotated output for SPSS
- Reporting Cluster analysis in publications
- Reporting - examples in journal articles
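As a minimal sketch of one of the clustering methods listed under Software, the Python code below groups a handful of subjects with a basic k-means procedure. The data, the function name (k_means) and the naive choice of the first k cases as starting centres are assumptions for illustration. Statistical packages typically choose starting centres more carefully, and their results can differ from this sketch.

```python
def k_means(points, k, iterations=100):
    """Assign each point to one of k clusters by alternating two steps."""
    centres = [list(p) for p in points[:k]]  # naive start: the first k cases
    assignment = None
    for _ in range(iterations):
        # Step 1: assign every case to the nearest centre (squared Euclidean distance).
        new_assignment = [
            min(
                range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centres[c])),
            )
            for p in points
        ]
        if new_assignment == assignment:
            break  # no case changed cluster, so the solution is stable
        assignment = new_assignment
        # Step 2: move every centre to the mean of the cases assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centres


# Hypothetical scores of six subjects on two characteristics.
subjects = [
    (1.0, 1.2),
    (0.8, 1.0),
    (1.2, 0.9),
    (5.0, 5.2),
    (5.3, 4.9),
    (4.8, 5.1),
]

assignment, centres = k_means(subjects, k=2)
print("cluster membership:", assignment)
print("cluster centres:", [[round(x, 2) for x in c] for c in centres])
```

The cluster numbers themselves carry no meaning; only the grouping matters. To cluster variables rather than subjects, the same function could be applied to the transposed data, with each variable treated as a point.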