Exploring Subject-Related Interactions in Repeated Measures Data Using Three-Mode Principal Components Analysis

[Methodology]

Gilbert, Dorothy Ann; Sutherland, Michael; Kroonenberg, Pieter M.

Dorothy Ann Gilbert, PhD, RN, is an Associate Professor, School of Nursing, University of Massachusetts, Amherst, MA, where Michael Sutherland, PhD, is Director of the Statistical Consulting Center.

Pieter M. Kroonenberg, MS, PhD, is an Associate Professor, Leiden University, Leiden, The Netherlands.

Accepted for publication June 19, 1998.

The study used as an example in this paper was supported in part by grant R15-NR02460 from the National Institute of Nursing Research, National Institutes of Health.

Address reprint requests to Dorothy Ann Gilbert, School of Nursing, University of Massachusetts Amherst, Arnold House, Box 30420, Amherst, MA 01003-0420.

Repeated measures designs have several advantages for nursing research including control of error resulting from between-subjects differences, more efficient involvement of subjects, and an opportunity to clarify complex relationships in the resulting data such as three-way interactions among the subjects, variables, and conditions being repeated. Although information concerning three-way interactions is beyond the capability of most traditional methods of analysis (e.g., Girden, 1992), such information can be obtained using three-mode principal components analysis (PCA) (Kroonenberg, 1983; Tucker, 1966).

Information about subject-related interactions is important descriptively in itself. It also may indicate the presence of unanticipated structure in the data during preliminary phases of an experimental investigation, thus providing guidance for subsequent phases.

The purpose of this article is to describe the use of three-mode PCA to explore subject-related interactions in repeated measures data. A preliminary study of nurses' relational communication provides an illustration of the technique.

Repeated measurement of a set of variables under different conditions for the same subjects results in a three-mode data array. For example, in the preliminary study of nurses' relational communication, the variables were 23 items that communicated different relational themes (Burgoon & Hale, 1984) such as trust and nondominance. The conditions were six client-nurse interactions in which the nurses exhibited different styles of listening behavior (Gilbert, 1993) such as reciprocal and active styles. The participants were a convenience sample of 36 White women. The raw data, therefore, consisted of a 23 × 6 × 36 array of relational items for nurses' listening behavior as rated by study participants.

The 36 participants responded to the 23 items on a Likert-type scale concerning the relational communication they perceived in the videotaped behavior of the six nurses who were listening to the same client portrayed by a professional actress. The three-way interaction of interest was whether different styles of nurses' listening behavior communicated different kinds of relational items to different types of study participants.

Extent of Subject-Related Interaction in a Three-Mode Array

Three-mode PCA offers investigators information concerning the extent of subject-related interaction in a three-mode data array, as well as information concerning the possible nature of the interaction. The extent of subject-related interaction is estimated by identifying the portion of the sum of squares in the array attributable to each of the modes and interactions. If the sum of squares of a mode or interaction is large relative to the associated number of degrees of freedom, its contribution to the data structure is likely to be substantial, thus representing a potentially productive place to begin data exploration.

A simple, fixed analysis of variance (ANOVA) model with a single observation per cell can be used descriptively to decompose the total sum of squares in the array into portions attributable to the main effects of the three modes, the two-way interactions, and the three-way interaction. Sums of squares for each of the modes and selected interactions are calculated from corresponding matrices of marginal means (Winer, 1971).
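
The decomposition described above can be sketched in a few lines of Python. This is a minimal illustration, not the computation used for Table 1: the array `x` is synthetic random data with the same 23 × 6 × 36 layout as the study, and the function name `anova_three_way` and the mode labels are assumptions introduced here. Each effect is built from the corresponding marginal means, and its sum of squares is taken over all cells of the array.

```python
import numpy as np

def anova_three_way(x, names=("items", "nurses", "participants")):
    """Descriptive sums of squares and df for a three-way array, one observation per cell."""
    a, b, c = names
    grand = x.mean()
    # Marginal means, kept broadcastable against the full array.
    m_a = x.mean(axis=(1, 2), keepdims=True)
    m_b = x.mean(axis=(0, 2), keepdims=True)
    m_c = x.mean(axis=(0, 1), keepdims=True)
    m_ab = x.mean(axis=2, keepdims=True)
    m_ac = x.mean(axis=1, keepdims=True)
    m_bc = x.mean(axis=0, keepdims=True)

    effects = {
        a: m_a - grand,
        b: m_b - grand,
        c: m_c - grand,
        f"{a} x {b}": m_ab - m_a - m_b + grand,
        f"{a} x {c}": m_ac - m_a - m_c + grand,
        f"{b} x {c}": m_bc - m_b - m_c + grand,
    }
    # Whatever is left after all lower-order terms is the three-way interaction.
    effects[f"{a} x {b} x {c}"] = x - grand - sum(effects.values())

    # Sum each effect over every cell of the array.
    ss = {k: float((np.broadcast_to(e, x.shape) ** 2).sum()) for k, e in effects.items()}

    la, lb, lc = (n - 1 for n in x.shape)
    df = dict(zip(ss, (la, lb, lc, la * lb, la * lc, lb * lc, la * lb * lc)))
    return ss, df

# Synthetic stand-in for the 23 items x 6 nurses x 36 participants array.
rng = np.random.default_rng(0)
x = rng.normal(size=(23, 6, 36))

ss, df = anova_three_way(x)
for term in ss:
    print(f"{term:40s} SS = {ss[term]:9.2f}  df = {df[term]:5d}  MS = {ss[term] / df[term]:7.3f}")
```

Because the design is balanced, the seven sums of squares add up to the total sum of squares around the grand mean, which is what makes the percentages of the total meaningful.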

Table 1 shows the descriptive ANOVA decomposition of the preliminary nurses' relational communication data. The mean squares of the items and nurses' behavior modes were relatively large, suggesting that they were major contributors to the data structure as intended in the design of this study. Together, the sums of squares of the items mode, the nurses' behavior mode, and their two-way interaction accounted for 47.5% (df = 137) of the total sum of squares.

TABLE 1. Fixed Effects Three-Way Analysis of Variance (ANOVA) Without Replication: The Preliminary Nurses' Relational Communication Data

The remaining 52.5% (df = 4,830) resided in the terms related to the participants mode, the two-way interactions involving participants, and the three-way interaction. It contained the information on variability in participants' responses and the way in which that variability may have been affected by the nurses' behavior and items modes. In classical repeated measures ANOVA, it is essentially the within-subjects sum of squares. In three-mode PCA, it is the variability that remains after the sums of squares of the variables and conditions modes as well as their interaction have been accounted for.
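
The degrees of freedom quoted here follow directly from the 23 × 6 × 36 layout. As a worked check (the grouping of terms is the one described in the text; the subscript labels are introduced only for this illustration):

```latex
\begin{align*}
df_{\text{total}} &= 23 \times 6 \times 36 - 1 = 4967,\\
df_{\text{items}} + df_{\text{nurses}} + df_{\text{items} \times \text{nurses}}
  &= 22 + 5 + (22)(5) = 137,\\
df_{\text{subject-related}}
  &= 35 + (22)(35) + (5)(35) + (22)(5)(35)\\
  &= 35 + 770 + 175 + 3850 = 4830,
\end{align*}
```

so that 137 + 4830 = 4967, the total number of degrees of freedom in the array.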

It should be noted that for this part of the analysis, the participants mode is treated as fixed rather than random because the ANOVA is used in a descriptive rather than an inferential manner. Moreover, the individual differences between participants also are part of the investigation. Although pure error is certain to account for some of this subject-related variability, it also is true that some portion may have an interpretable structure, up to and including the three-way interaction. The problem is knowing how to go about discovering useful low-dimensional structure in a 4,830-df table of within-subjects variability so the extent of subject-related interaction can be estimated. In essence, the goal is to split the subject-related sum of squares into a structural part with few degrees of freedom and an error part with many degrees of freedom.

As a direct consequence of such an approach, a real error term emerges that can be used to (re)calculate the F statistics, in contrast with the pseudo error term (the three-way interaction). Another approach would have involved tackling the two-way interactions and the three-way interaction separately (Kettenring, 1983). However, in this case it was preferred to model all subject-related variability simultaneously.

Three-Mode Principal Components Analysis

Descriptive decomposition of three-way data using an ANOVA model suggests the extent of the largest contributors to the data structure, but it is provisional and incomplete. Strictly additive models, such as ANOVAs, require the investigator either to stop the analysis at the F test level with large amounts of variability and their degrees of freedom unaccounted for, or to specify, a priori, some set of contrasts so as to decompose fully the higher-order interaction terms. However, this is virtually impossible to do when many levels are involved.

Instead, multiplicative models such as three-mode PCA use the data's variability itself to discover the most parsimonious set of contrasts in order to explain the remaining interaction data structure. Therefore, the approach to the exploration of subject-related interaction described here actually is a hybrid additive and multiplicative model. After the major contributors have been accounted for (i.e., the items and nurses' behavior modes and their interaction in the case of the nurses' relational communication study), the modeling of the residual variability related to the subjects, with all its degrees of freedom, uses three-mode PCA to guide further modeling (Kroonenberg, 1983, 1994; Tucker, 1966). For two-way tables, there exists a long tradition, especially in agriculture, of analyzing interactions with multiplicative models (Van Eeuwijk, 1995).
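
The additive step of this hybrid approach can be illustrated as follows. Removing the grand mean, the two main effects, and the items by nurses interaction amounts to subtracting, from every observation, the mean of its item by nurse cell taken over participants. The sketch below uses synthetic data in place of the study data, and the variable names are assumptions made for the example.

```python
import numpy as np

# Synthetic stand-in for the 23 items x 6 nurses x 36 participants array.
rng = np.random.default_rng(1)
x = rng.normal(loc=3.0, size=(23, 6, 36))

# Mean of each item-by-nurse cell over participants: this equals the grand mean
# plus the items effect, the nurses effect, and their two-way interaction.
cell_means = x.mean(axis=2, keepdims=True)

# Subject-related residual: what remains for the multiplicative model to explore.
residual = x - cell_means

n_participants = x.shape[2]
total_ss = ((x - x.mean()) ** 2).sum()
design_ss = n_participants * ((cell_means - x.mean()) ** 2).sum()
residual_ss = (residual ** 2).sum()

print(f"share of total SS in items, nurses, and their interaction: {design_ss / total_ss:.3f}")
print(f"share of total SS that is subject-related:                 {residual_ss / total_ss:.3f}")
```

The two shares add to one, mirroring the split of the total sum of squares into a design-related part and a subject-related part.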

In three-mode PCA, separate components are defined for each of the modes (e.g., items, nurses' behavior, and participants). These components may be interpreted independently of each other, in contrast to regular PCA in which there is only one type of component. In addition, the model contains parameters indicating the importance of component combinations from the different modes, and they are assembled in a three-dimensional core matrix. The elements of the core matrix fulfill the same role as the regression coefficients (or weights) in regression analysis in that large values signal large contributions toward explaining the structure in the data. The model thus comprises a summation of terms that have the form of a weight multiplied by a multiplicative combination of one component from each mode (see Kroonenberg 1983, 1994 for full technical details). The necessary computations were carried out with the program suite 3WAY-PACK (Kroonenberg, 1997).
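
Written out, the model described in this paragraph has the following general form. The symbols are notation assumed here for illustration (the cited sources give the full technical treatment): $\tilde{x}_{ijk}$ is the subject-related residual for item $i$, nurse $j$, and participant $k$; $a_{ip}$, $b_{jq}$, and $c_{kr}$ are the component scores for the items, nurses' behavior, and participants modes; and $g_{pqr}$ is the element of the core matrix that weights the combination of components $p$, $q$, and $r$.

```latex
\[
\tilde{x}_{ijk} \;=\; \sum_{p=1}^{P} \sum_{q=1}^{Q} \sum_{r=1}^{R}
  g_{pqr}\, a_{ip}\, b_{jq}\, c_{kr} \;+\; e_{ijk},
\qquad i = 1,\dots,23;\; j = 1,\dots,6;\; k = 1,\dots,36 .
\]
```

Here $P$, $Q$, and $R$ are the numbers of components retained for the three modes, and $e_{ijk}$ is the part of the residual left unexplained by the model.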

The estimated extent of subject-related interaction depends on the model selected. As for regular PCA, there are no strict criteria in three-mode PCA for selecting the "correct" model or number of components for the different modes. Usually, the decision as to the number of components depends on the detail desired in a solution, the extent to which the observed patterns can be expected to recur in new samples, and the interpretability. Selection also depends on the fit of the solution relative to how many degrees of freedom are required.

The fit of several potential solutions for the nurses' relational communication data is summarized in Table 2, which reveals that the 2 × 4 × 4 solution explains 38% of the residual variability, with 212 degrees of freedom, and the 2 × 2 × 2 solution explains 29%, with 118 degrees of freedom. Thus, there are systematic patterns to the variability of participants' responses that may constitute a subject-related interaction to be explored with three-mode PCA, even though the data contain a considerable amount of random error or idiosyncratic judgment not well captured by relatively simple models of all observations that have so few degrees of freedom. For the purpose of using these data to illustrate how the subject-related interaction may be analyzed, the 2 × 2 × 2 solution was selected because it constitutes the most simple structure for describing the nature of subject-related variability.
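
To make the notion of fit concrete, the sketch below estimates a three-mode components model with a given number of components per mode and reports the proportion of the sum of squares it explains. It is a generic alternating scheme (a truncated higher-order SVD start followed by repeated updates of one mode at a time), written for illustration; it is not the 3WAY-PACK implementation, and the data are synthetic with a planted low-rank structure plus noise. All function and variable names are assumptions introduced here.

```python
import numpy as np

def unfold(t, mode):
    """Matricize a three-way array so that `mode` indexes the rows."""
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def leading_left_vectors(m, r):
    """First r left singular vectors of m (orthonormal columns)."""
    u, _, _ = np.linalg.svd(m, full_matrices=False)
    return u[:, :r]

def project(t, factors, skip=None):
    """Multiply t along each mode by the transposed component matrix, except `skip`."""
    for mode, f in enumerate(factors):
        if mode != skip:
            t = np.moveaxis(np.tensordot(f.T, t, axes=(1, mode)), 0, mode)
    return t

def three_mode_pca(t, ranks, n_iter=25):
    """Orthonormal components per mode and a core array, by alternating updates."""
    factors = [leading_left_vectors(unfold(t, m), r) for m, r in enumerate(ranks)]
    for _ in range(n_iter):
        for m, r in enumerate(ranks):
            partial = project(t, factors, skip=m)
            factors[m] = leading_left_vectors(unfold(partial, m), r)
    core = project(t, factors)
    return core, factors

# Synthetic residual-like array: planted 2 x 2 x 2 structure plus noise.
rng = np.random.default_rng(2)
a, b, c = rng.normal(size=(23, 2)), rng.normal(size=(6, 2)), rng.normal(size=(36, 2))
g = rng.normal(size=(2, 2, 2))
x = np.einsum("pqr,ip,jq,kr->ijk", g, a, b, c) + 0.5 * rng.normal(size=(23, 6, 36))

core, factors = three_mode_pca(x, ranks=(2, 2, 2))
x_hat = np.einsum("pqr,ip,jq,kr->ijk", core, *factors)

# With orthonormal components, the fitted sum of squares equals the squared core.
fit = (core ** 2).sum() / (x ** 2).sum()
print(f"proportion of sum of squares explained: {fit:.3f}")
```

Running the same function with other choices of `ranks` gives the kind of comparison summarized in Table 2, where the gain in fit has to be weighed against the additional degrees of freedom used.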

TABLE 2. Fit Information on Three-Way Solutions for the Subject-Related Variability in the Preliminary Nurses' Relational Communication Data

Nature of Subject-Related Interaction in a Three-Mode Array

To understand the substantive nature of the components (e.g., the "styles" of listening or the "kinds" of items), it is necessary to consider the components found by three-mode analysis for each mode separately. Table 3 provides the coordinates for the two components of the nurses' behavior mode. These two components grouped the nurses' behavior into two distinct styles based on the variability in participants' scores. Nurses 3, 5, and 6 had high scores on the first nurses' behavior component (B1), which explained 17.2% of the residual variability, and Nurses 1 and 4 had high scores on the second component (B2), which explained 11.4% of the residual variability. The variability of Nurse 2 was relatively low in this analysis. Thus, participants' scores for Nurse 2 did not differ substantially from the average pattern as indicated by scores close to zero.

TABLE 3. Components of the Nurses' Behavior Mode (Unit Length)

Because the scores on the first behavior component had a substantial correlation of .61 with the nurses' reciprocity scores (Gilbert, 1993), the component may represent a reciprocal listening style as intended in the original study design. The second component scores were highly correlated with the number of nurses' activities (r = -.88). This number ranged from 2.31 to 4.77 activities per second, in which activities included behavior such as head nodding and leaning toward the patient. However, scores departed from the average pattern most noticeably for Nurses 1 and 4, suggesting that the second component may be an inactive listening style, also as intended in the original design.

In a similar fashion, the components of the items and participants modes can be described, although their component scores are not presented here because of space limitations. In contrast to the nurses' behavior components, differential item usage did not account for much of the residual variability because the first component alone explained 27% of the variability in the items mode (Table 2). Twenty items, originally selected to identify the nurses' communication of trust, affection, receptivity, immediacy, composure, and informality, had relatively high positive scores on the first items component. This suggests that the first kind of item tended to be one generally considered to communicate a good client-nurse relationship. The remaining three items, concerning the nurses' nondominance in the interactions, had near-zero scores on the first component, indicating that they did not differ substantially from the average pattern. The separation of nondominance from other relational themes is one of the most widely recognized in relational communication (Burgoon & Hale, 1984), and it was expected in this study.

The second items component explained only 2% of the residual variability. Items containing the words "calm," "open," and "relaxed" had high positive scores for this component, and the "involved," "warm," and "not businesslike" items had high negative scores. Therefore, the component may have been complex, consisting of several subcomponents that cannot be seen in a low-dimensional analysis such as this.

The components of the participants mode remain to be considered. The first and second participants components explained 18% and 10% of the residual variability, respectively (Table 2). Participants who had high positive scores on the first participants component tended to be less outspoken in that they used the extreme points of the Likert-type scale less than the average participant. Participants' scores on the second participants component were moderately correlated with their satisfaction scores (r = .42), such that participants who had high positive second component scores were generally more satisfied with all of the nurses' listening behavior than the average participant. Although the structure found in the participants mode was not anticipated, neither of the two participants components could be linked to an outside characteristic such as participants' self-reported knowledge of communication.

Whereas the substantive nature of the components of a proposed model in three-mode analysis is understood by considering each mode separately, the nature of subject-related interaction is understood by considering the three modes together in the three-dimensional core matrix with relationships between components. The 2 × 2 × 2 solution for the preliminary nurses' relational communication data is presented as an example in Table 4. In this case, the core matrix shows that the two different components of participants (P1 and P2) perceived the two different components of relational items (I1 and I2) in the two different components of nurses' listening behavior (B1 and B2). In Table 4, the P1-I1-B1 and P2-I1-B2 combinations account for most of the subject-related interaction because they explain 16.8% and 10.1%, respectively, of the 29% of residual variability explained by the 2 × 2 × 2 solution.

TABLE 4. Core Matrix of the Preliminary Nurses' Relational Communication Data Using a 2 × 2 × 2 Model

Core matrices are most easily interpretable when the diagonal elements are nonzero numbers and the off-diagonal elements are very small numbers or zero, meaning that each component of a mode is exclusively linked to one component of another mode (Kroonenberg, 1983). In terms of the example, the nonzero, positive number 4.5 of the P1-I1-B1 cell in Table 4 (explaining 16.8% of the subject-related variability), along with the small values (.1 and -.1) for P2-I1-B1 and P1-I1-B2, both explaining less than .1%, means that a reciprocal listening style (B1) tended to communicate a kind of item characteristic of good client-nurse relationships (I1) to the less outspoken type of participant (P1) more than to the average participant. However, as can be seen by the low amount of variation in the P1-I1-B2 cell, the view of the less outspoken type of participant (P1) concerning the inactive style (B2) did not differ from that of the average participant.

Similarly, as indicated by a relatively high value in the P2-I1-B2 cell (core value 3.5, explaining 10.1% of the subject-related variability), an inactive listening style (B2) tended to communicate a kind of item characteristic of good client-nurse relationships (I1) to the generally satisfied type of participant (P2) more than to the average participant. The generally satisfied participant did not differ from the average participant in relation to B1. Finally, to a small but potentially interesting extent (core value 1.2, explaining 1.2%), an inactive listening style (B2) tended to communicate a second type of relational item (I2) to the less outspoken type of participant (P1). However, as noted in the presentation of the item components, the precise meaning of I2 remains unclear in this analysis.
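
The link between a core value and its percentage can be checked with a small calculation. The sketch assumes that, with orthonormal components, the share of variability explained by a combination of components is its squared core value divided by the subject-related sum of squares. Under that assumption, each reported pair of core value and percentage implies a value for that sum of squares, and the three implied values should agree up to rounding. Only figures quoted in the text are used; the dictionary and variable names are introduced for illustration.

```python
# Core values and percentages of subject-related variability as quoted in the text.
reported = {
    "P1-I1-B1": (4.5, 16.8),
    "P2-I1-B2": (3.5, 10.1),
    "P1-I2-B2": (1.2, 1.2),
}

# Assumption: percentage = 100 * core_value**2 / subject-related sum of squares.
implied_totals = {
    cell: value ** 2 / (percent / 100.0)
    for cell, (value, percent) in reported.items()
}

# Ratio of the largest to the smallest implied total; a value near 1 means the
# reported figures are mutually consistent under the stated assumption.
spread = max(implied_totals.values()) / min(implied_totals.values())

for cell, total in implied_totals.items():
    print(f"{cell}: implied subject-related sum of squares = {total:.1f}")
print(f"largest / smallest implied total = {spread:.3f}")
```

Because the published core values and percentages are rounded, small differences among the implied totals are expected.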

Discussion

Subject-related variability, especially as contained in three-way interactions, often is regarded as a troublesome complexity hindering a clear understanding of the effects of a study's factors. However, such variability can be a primary focus used to understand multidimensional data in their fullest sense. The patterns of interaction among components, together with their substantive interpretation, allow an in-depth study of the relevant patterns in subject-related interaction. Furthermore, ignoring these patterns of interaction and treating the data as if each set of items were from a large number of individuals who each experienced one condition rather than from a smaller number who experienced several conditions violates the repeated measures design, perhaps prematurely assuming that subjects' variability is random, that conditions are interchangeable, or that there is no interaction among their components.

In the preliminary data of nurses' relational communication, different styles of nurses' listening behavior communicated different kinds of relational items to different types of participants. This phenomenon might occur in any study, but without the repeated measures design, it could not be detected. When such patterns are detected in preliminary data, they can be used to guide further data collection and analysis or to alter the study design as needed.

However, there are some limitations to the approach. First, it can be used only with certain kinds of repeated measures designs (i.e., only those in which a set of variables is repeatedly measured under different conditions for the same subjects). The second limitation concerns the tension between the goodness of fit of a complex model and its ease of interpretation. In the preliminary nurses' relational communication data, for instance, the 2 × 2 × 2 model explained only 29% of the residual variability, but the meaning of the various components could be roughly described. The 2 × 4 × 4 solution would have explained 38% of the residual variability without a dramatic increase in the number of degrees of freedom, but interpreting this more complex model would have been daunting.

Despite these limitations, three-mode PCA provides a method for extracting existing systematic patterns even in the presence of a large amount of unsystematic variation. It complements more traditional factor analytic approaches and is a valuable technique that nurse researchers can add to their repertoire of methods.

References

Burgoon, J. K., & Hale, J. L. (1984). The fundamental topoi of relational communication. Communication Monographs, 51, 193-214.

Gilbert, D. A. (1993). Reciprocity of involvement activities in client-nurse interactions. Western Journal of Nursing Research, 15, 674-687.

Girden, E. R. (1992). ANOVA: Repeated measures. Newbury Park, CA: Sage.

Kettenring, J. R. (1983). Components of interactions in analysis of variance models with no replications. In P. K. Sen (Ed.), Contributions to statistics: Essays in honor of Norman Lloyd Johnson (pp. 283-297). Amsterdam: North-Holland/Elsevier.

Kroonenberg, P. M. (1983). Three-mode principal-component analysis: Theory and applications. Leiden, The Netherlands: DSWO Press.

Kroonenberg, P. M. (1994). The TUCKALS line: A suite of programs for three-way data analysis. Computational Statistics and Data Analysis, 18, 73-96.

Kroonenberg, P. M. (1997). 3 WAYPACK user's manual: A package of three-way programs (technical report). Leiden, The Netherlands: Department of Education, Leiden University. Information on software: https://www.universiteitleiden.nl/en/social-behavioural-sciences/~kroonenb.

Tucker, L. R. (1966). Some mathematical notes on three-mode factor analysis. Psychometrika, 31, 279-311.

Van Eeuwijk, F. A. (1995). Linear and bilinear models for the analysis of multienvironment trials: I. An inventory of models. Euphytica, 84, 1-7.

Winer, B. J. (1971). Statistical principles in experimental design (2nd ed.). New York: McGraw-Hill.
