3WAYPACK THREE-MODE SOFTWARE

The Three-Mode Company
Pieter M. Kroonenberg
Leiden Institute of Education and Child Studies
Leiden University
Leiden, The Netherlands

===============================================================================

3WAYPACK

In this leaflet we introduce 3WAYPACK, a suite of programs for three-way data analysis. The TUCKALS and TRILIN programs analyze three-way data via three-mode component models. INTERFACE3 is a completely menu-driven interface for running the analysis programs on Windows machines. (The planned extensions with other programs mentioned in section 9 have now been included in the package.) PREPROC3 and POSTPROC are programs to preprocess three-way data and to process the output of the three-way programs, respectively.

Document date: 17 November 2003 (still incomplete)

===============================================================================

TUCKALS2 & TUCKALS3 & TRILIN
Programs for three-mode data analysis

1. Introduction
===============

The TUCKALS and TRILIN programs are designed to perform three-way data analysis, primarily component analyses. The basic Tucker models (analyzed with the TUCKALS programs) and their estimation were developed by Tucker (1966), and improved estimation procedures were devised by Kroonenberg and De Leeuw (1980), and Weesie and Van Houwelingen (1983). The former technique is fully described in Kroonenberg (1983a); an annotated bibliography is given in Kroonenberg (1983b), and a theoretical overview in Kroonenberg (1992).

Whereas TUCKALS3 is designed to reduce all three modes, in TUCKALS2 components are computed for only two of the three modes, so that the third mode retains its original order. This version of the technique was developed by Tucker (1972), building on his earlier work (Tucker, 1966).

The PARAFAC model (analyzed with the TRILIN program) was independently developed by Harshman (1970; Harshman & Lundy, 1984a,b) and Carroll and Chang (1970).

2. Three-way data
=================

Three-mode data analysis deals with three-way data which can be classified by three (or fewer) kinds of entities (called modes), say subjects, variables, and occasions. These terms should be considered generic rather than specific. Three-way data can be arranged into a three-dimensional block or array X. The three ways will be called A, B, and C, respectively. The orders of X are I, J, and K (upper case), and i, j, and k (lower case) are the indices for the elements of the respective modes.

A three-way array can be seen as composed of two-mode submatrices called slices, and of one-mode submatrices (or vectors), called fibers. The two-way submatrices will be referred to as frontal slices, horizontal slices, and lateral slices. The fibers will be called rows, columns, and tubes. Throughout this text X(k) will denote the k-th of K frontal slices of X.

The matrices of component loadings are named after the modes they refer to, but as usual, the names of vectors and matrices are printed in bold face. Thus A is the component matrix for Mode A and so on. The core array of the Tucker3 model (see below) is denoted by G, and the extended core array of the Tucker2 model (see below) by H.

The terminology presented here is largely based on Harshman and Lundy (1984a, b). The only difference lies in the choice of A and B. Unfortunately, (initially) Harshman and Lundy call Mode B what is called Mode A here, and vice versa.

3. Characteristics of input data
================================

The programs included in 3WAYPACK are primarily geared towards metric three-way three-mode data, which are fully crossed with respect to all modes. In version 5 missing data procedures are included in all three programs.

The programs may also be used for three-way two-mode data, such as multiple covariance matrices or (double-centred) (dis)similarity matrices, thereby allowing INDSCAL and IDIOSCAL analyses (Carroll & Chang, 1970, 1972).
In the latter case it is implicitly assumed that the dissimilarities are equal to squared distances rather than ordinary distances. If this is unacceptable, corrections should be made before the analysis proper.

Three-way interactions from analysis of variance or loglinear analyses may also be used as input. With special rescaling, not yet included in 3WAYPACK, three-way correspondence analysis can be performed as well (Carlier & Kroonenberg, 1996).

There are no specific provisions in the programs for nonmetric data, such as optimal scaling for handling ordinal or nominal data.

4. Data manipulation
====================

Several centrings can be performed in the programs, primarily on frontal slices of the three-way matrix, such as centring rows, columns or frontal slices, and normalisation of frontal and lateral slices, but the programs are not specifically geared towards comprehensive data manipulation. In practice, the centring options suffice for most data sets, especially as all desired centrings can be performed by transposing the data matrix. Centring on three modes at the same time is seldom necessary.

However, for full data manipulation, the program PREPROC3 as incorporated in INTERFACE3, or Harshman and Lundy's PARAFAC program (Harshman & Lundy, 1984a,b) can be used; they contain most of the desired options for centring and normalisation. The latter program also includes an (iterative) normalisation procedure for simultaneously normalising two or three modes.

5. Mathematical models
======================

5.1 TUCKALS3
------------

This program handles the Tucker3 model, in which orthonormal components are computed for each of the three modes. The weights for combinations of components of the three modes are computed as well. From the components the core array G is computed, which has orders equal to the number of components of each mode, i.e. PxQxR.
The model is formally described as

    z(i,j,k) = Sum(p=1..P) Sum(q=1..Q) Sum(r=1..R) a(i,p) b(j,q) c(k,r) g(p,q,r) + e(i,j,k),

where i=1,...,I, j=1,...,J, and k=1,...,K; P, Q, and R are the numbers of components in each mode, and A = (a(i,p)), B = (b(j,q)), and C = (c(k,r)) are the component matrices of the first, second, and third mode, respectively. G = (g(p,q,r)) is the PxQxR core array, and E = (e(i,j,k)) is the three-mode matrix with errors of approximation.

A matrix formulation of the model is

    Z(k) = A H(k) B' + E(k),

in which E(k) is the k-th frontal slice of E, and the H(k), the individual characteristic matrices, are equal to a linear combination of the R frontal slices, G(r), of the core array, i.e.

    H(k) = Sum(r=1..R) c(k,r) G(r).

When, instead of direct fitting of the original data, indirect fitting is used for cross-product or covariance matrices, A and B will mostly become identical or sign-permuted versions of each other, and the matrix C has in that case a strong similarity to the compromise matrix in STATIS (Lavit, Escoufier, Sabatier, & Traissac, 1994).

5.2 TUCKALS2 (integrated in the TUCKALS3 part of the program)
-------------------------------------------------------------

This program handles the Tucker2 model, in which orthonormal components are computed for two of the three modes. The weights for combinations of components of the first two modes are computed as well, for each of the elements of the third mode. Together they form the extended core array H, which has orders equal to the number of components of two of the modes times the size of the third mode, i.e. PxQxK.

The model is formally described as

    z(i,j,k) = Sum(p=1..P) Sum(q=1..Q) a(i,p) b(j,q) h(p,q,k) + e(i,j,k),

where i=1,...,I, j=1,...,J, and k=1,...,K; P and Q are the numbers of components for the first two modes, and A = (a(i,p)) and B = (b(j,q)) are the component matrices of the first and second mode, respectively. H = (h(p,q,k)) is the PxQxK extended core array, and E = (e(i,j,k)) is the three-mode matrix with errors of approximation.
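
The relation between the two element-wise models can be checked numerically. The NumPy sketch below is an illustration only and is not code from 3WAYPACK; the array sizes and the random matrices are arbitrary assumptions made for the example. It builds the structural part of a Tucker3 model, forms the extended core array with h(p,q,k) = Sum(r=1..R) c(k,r) g(p,q,r), and confirms that the Tucker2 sum and the slice-wise product A H(k) B' give the same values.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K = 5, 4, 3   # orders of the three modes (arbitrary example sizes)
P, Q, R = 2, 2, 2   # numbers of components (arbitrary example sizes)

# Random stand-ins for the component matrices and the core array.
# Orthonormality is not imposed: the identities below hold regardless.
A = rng.standard_normal((I, P))
B = rng.standard_normal((J, Q))
C = rng.standard_normal((K, R))
G = rng.standard_normal((P, Q, R))

# Tucker3 structural part: sum over p, q, r of a(i,p) b(j,q) c(k,r) g(p,q,r)
Z_tucker3 = np.einsum("ip,jq,kr,pqr->ijk", A, B, C, G)

# Extended core array: h(p,q,k) = sum over r of c(k,r) g(p,q,r)
H = np.einsum("kr,pqr->pqk", C, G)

# Tucker2 structural part: sum over p, q of a(i,p) b(j,q) h(p,q,k)
Z_tucker2 = np.einsum("ip,jq,pqk->ijk", A, B, H)

# Matrix form, one frontal slice at a time: Z(k) = A H(k) B'
Z_slices = np.stack([A @ H[:, :, k] @ B.T for k in range(K)], axis=2)

print(np.allclose(Z_tucker3, Z_tucker2), np.allclose(Z_tucker2, Z_slices))
```

In other words, a Tucker3 structure is a Tucker2 structure whose extended core array is restricted to linear combinations of R frontal core slices; with real data the error term e(i,j,k) would be added to each of these sums.
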
A matrix formulation of the model is

    Z(k) = A H(k) B' + E(k),

in which the H(k) are the (unrestricted) individual characteristic matrices and E(k) is the k-th frontal slice of E.

When, instead of direct fitting of the original data, indirect fitting is used for cross-product or covariance matrices, A and B will mostly become identical or sign-permuted versions of each other, and the matrices H(k) will in general be symmetric, possibly with sign inversions. The Tucker2 model is then identical to the IDIOSCAL model of Carroll & Chang (1970, 1972).

5.3 TRILIN
----------

This program handles the Parafac model, in which components are computed for each of the three modes, but each component of each mode is only associated with one single component of the other modes. The weights for each combination of components of the three modes are computed as well.

The model is formally described as

    z(i,j,k) = Sum(s=1..S) a(i,s) b(j,s) c(k,s) + e(i,j,k),

where i=1,...,I, j=1,...,J, and k=1,...,K; S is the number of components common to all modes, and A = (a(i,s)), B = (b(j,s)), and C = (c(k,s)) are the component matrices of the first, second, and third mode, respectively. E = (e(i,j,k)) is the three-mode matrix with errors of approximation.

When, instead of direct fitting of the original data, indirect fitting is used for cross-product or covariance matrices, A and B will mostly become identical or sign-permuted versions of each other. When the input frontal slices are symmetric similarity matrices, the component matrices will generally be symmetric as well, and TRILIN will fit the INDSCAL model (Carroll & Chang, 1970).

5.4 General remarks
-------------------

The three-way models are essentially non-stochastic and data-analytic. The Tucker models suffer from rotational indeterminacy of the components, but this indeterminacy allows for an unrestricted, easy-to-fit model.
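
The rotational indeterminacy of the Tucker3 model can be illustrated numerically. In the sketch below (illustrative NumPy code with arbitrary sizes and random data; the helper names are hypothetical and nothing here is taken from 3WAYPACK or its ROTATE program), orthonormal component matrices are post-multiplied by arbitrary orthonormal matrices S, T, and U, and the core array is counter-rotated with the same matrices. The structural part of the model does not change, and the rotated components remain orthonormal.

```python
import numpy as np

rng = np.random.default_rng(1)
I, J, K, P, Q, R = 6, 5, 4, 3, 2, 2   # arbitrary example sizes


def random_orthonormal(n_rows, n_cols, rng):
    """Return an n_rows x n_cols matrix with orthonormal columns (via QR)."""
    q, _ = np.linalg.qr(rng.standard_normal((n_rows, n_cols)))
    return q


def tucker3(A, B, C, G):
    """Structural part: sum over p, q, r of a(i,p) b(j,q) c(k,r) g(p,q,r)."""
    return np.einsum("ip,jq,kr,pqr->ijk", A, B, C, G)


A = random_orthonormal(I, P, rng)
B = random_orthonormal(J, Q, rng)
C = random_orthonormal(K, R, rng)
G = rng.standard_normal((P, Q, R))

# Arbitrary orthonormal transformations, one per mode
S = random_orthonormal(P, P, rng)
T = random_orthonormal(Q, Q, rng)
U = random_orthonormal(R, R, rng)

# Transform the components ...
A_rot, B_rot, C_rot = A @ S, B @ T, C @ U
# ... and apply the inverse (here: transposed) transformations to the core
G_rot = np.einsum("pa,qb,rc,pqr->abc", S, T, U, G)

same_fit = np.allclose(tucker3(A, B, C, G),
                       tucker3(A_rot, B_rot, C_rot, G_rot))
print(same_fit)
```

The same argument applies to the Tucker2 model, with transformations of A and B only and a counter-rotation of each slice H(k).
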
The indeterminacy implies that, after a solution is obtained, the orthonormal solution may be transformed in several ways without loss of fit, if the appropriate inverse transformations are applied to the core array as well. Within 3WAYPACK such transformations may be performed via the program ROTATE, one of the POSTPROC procedures.

The components of the Parafac model cannot be transformed without loss of fit to the data, because the model is an identified one. Generally the components will be correlated. Due to the restrictive character of the model, degeneracies are not uncommon, i.e. no solution exists for the requested number of components. The program will clearly warn the user when this is the case (for details on degeneracy see Harshman & Lundy, 1984a).

6. Optimization algorithm
=========================

6.1 Tucker3 model
-----------------

The estimation of the Tucker3 model is achieved via an alternating least squares algorithm which minimizes a loss function. The minimization problem can be reduced by solving first for G, and substituting the solution into the loss function. This revised loss function can be minimized by cyclically estimating A for fixed B and C, followed by B for fixed C and A, and then C for fixed A and B, etc. Each subproblem is an eigenvalue-eigenvector problem of a dimension equal to the number of components for the mode in question, and it can be handled efficiently by using a Jacobi procedure embedded in Bauer-Rutishauser's simultaneous iteration method (for details, see Kroonenberg & De Leeuw, 1980), or in a Gram-Schmidt orthonormalisation procedure (see Kroonenberg, Ten Berge, Brouwer, & Kiers, 1989).

6.2 Tucker2 model
-----------------

The estimation of the Tucker2 model is achieved via an alternating least squares algorithm which minimizes a loss function. The minimization problem can be reduced by solving first for H, and substituting the result into the loss function.
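
To make the reduction concrete: for fixed orthonormal A and B, the least-squares solution for the extended core array is H(k) = A' Z(k) B, and substituting it leaves a loss equal to the total sum of squares of the data minus the sum of squares of H. The NumPy sketch below alternates eigenvector updates of A and B on this reduced loss. It is an illustration under stated assumptions, not the 3WAYPACK implementation: the function names are hypothetical, the starting configuration is a simple choice made for the example, and a full eigendecomposition (numpy.linalg.eigh) is swapped in for the Bauer-Rutishauser and Gram-Schmidt procedures mentioned in section 6.1.

```python
import numpy as np


def leading_eigenvectors(M, n):
    """Eigenvectors of the symmetric matrix M for its n largest eigenvalues."""
    _, vecs = np.linalg.eigh(M)       # eigh returns ascending eigenvalues
    return vecs[:, ::-1][:, :n]


def tucker2_als(Z, P, Q, n_iter=100, tol=1e-10):
    """Fit z(i,j,k) ~ sum_pq a(i,p) b(j,q) h(p,q,k) with orthonormal A and B."""
    total = np.sum(Z ** 2)
    # Starting B (assumption of this sketch): eigenvectors of sum_k Z(k)'Z(k)
    B = leading_eigenvectors(np.einsum("ijk,ilk->jl", Z, Z), Q)
    previous = np.inf
    for _ in range(n_iter):
        # A for fixed B: eigenvectors of sum_k Z(k) B B' Z(k)'
        ZB = np.einsum("ijk,jq->iqk", Z, B)
        A = leading_eigenvectors(np.einsum("iqk,lqk->il", ZB, ZB), P)
        # B for fixed A: eigenvectors of sum_k Z(k)' A A' Z(k)
        AZ = np.einsum("ijk,ip->pjk", Z, A)
        B = leading_eigenvectors(np.einsum("pjk,plk->jl", AZ, AZ), Q)
        # Solve for H given A and B, then evaluate the reduced loss
        H = np.einsum("ip,ijk,jq->pqk", A, Z, B)   # H(k) = A' Z(k) B
        loss = total - np.sum(H ** 2)
        if previous - loss < tol:
            break
        previous = loss
    return A, B, H, loss


rng = np.random.default_rng(2)
Z = rng.standard_normal((6, 5, 4))    # arbitrary example data
A, B, H, loss = tucker2_als(Z, P=2, Q=2)

# The reduced loss should agree with the explicit residual sum of squares
residual = Z - np.einsum("ip,jq,pqk->ijk", A, B, H)
print(np.isclose(loss, np.sum(residual ** 2)))
```

Because H is eliminated analytically, only A and B are iterated; the extended core array is recovered from them at any point in the iterations.
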
Analogous to the Tucker3 case, this loss function can be minimized by cyclically estimating A for fixed B, followed by B for fixed A, and then A for fixed B again, etc. Each subproblem is an eigenvalue-eigenvector problem of a dimension equal to the number of components for the mode in question.

6.3 Parafac model
-----------------

The estimation of the Parafac model is also achieved via alternating least squares, using regression procedures. The program has the option to request orthonormal components, nonnegative components, or unrestricted components.

6.4 Initialisation
------------------

To start iterations for the Tucker models, solutions obtained via Tucker's Method I are used, which will already provide the solution if an exact solution exists. As in virtually all problems of this kind, only convergence to a local minimum is assured. However, the specific initial configuration has been shown to steer the algorithm in the proper direction. The general impression is that local minima do not form a serious problem.

The TRILIN program uses random starts as a default, but the option to use Tucker's Method I is available as well. The program can be requested to make multiple starts with different random matrices to assess the stability and quality of the solution. A request may be made to make the selection of the best analysis automatic. The TRILIN algorithm is far more sensitive to local minima than the TUCKALS algorithms, so that several starts are advisable. In most cases at least three starts should be tried, and more when more components are requested.

7. Results
==========

The primary output of the programs consists of the following elements:

1. Information on the overall fit of the model;
   (Provisional) calculation of degrees-of-freedom;
   Several partitionings of the overall fit by variables, subjects, and occasions, and by the component combinations via the extended or full core array (TUCKALS2, TUCKALS3);
2. Components scaled in several ways;
3. Core array scaled in several ways (also from TRILIN components);

Supplementary information

4. Input data check;
5. Removed means and normalisation factors;
6. Initial configurations;
7. Iteration history;
8. Missing data handling and estimation;
9. Optional convergence acceleration;
10. Optional automatic selection of best analysis (TRILIN);
11. Residuals, fitted data;
12. Joint (bi)plots (TUCKALS2, TUCKALS3);
13. Component transformations (TUCKALS2, TUCKALS3);
14. Distances (inner products) of points in joint plots;
15. Many other plots;
16. Restrictions to orthonormal or nonnegative components (TRILIN);
17. Extensive degeneracy check (TRILIN);
18. External configurations can be read in to restart analyses, to evaluate results from other studies, or to evaluate component spaces after transformation;
19. All analyses with all possible component combinations less than or equal to the supplied numbers of components;
20. Simultaneous rotation of core and components (ROCOCO);
21. Output of HTML files for browsers;
22. Publication quality plots made by GnuPlot.

8. Some technical information
=============================

The programs are written in portable Fortran 90. The present versions have been compiled with MS Fortran 5.0 and MS Power Station and they run in a Windows environment.

There are two different ways of specifying the input parameters for the present versions of the programs. The most attractive one is via the user interface INTERFACE3 (IF3). This is a fully menu-driven program which handles input specification, data preprocessing, execution, output inspection, and output processing to calculate residuals, perform transformations of the component matrices, and produce joint plots. It also allows for editing output files with the user's favourite editor. A full description of IF3 is given in the 3WAYPACK User's Manual.
The other method of specifying input parameters is by using the antiquated Fortran input specification based on entering numbers in specific columns on specific records. Thus the analysis programs can also function fully as stand-alone programs.

9. Realised recent developments
===============================

* Inclusion of MIXCLUS3, a program for performing the three-way mixture method of clustering (Basford & McLachlan, 1985).
* Inclusion of the T3COVAR programs, which use a faster algorithm for the Tucker3 models, with as special features the input of multivariable-multioccasion matrices and the handling of data sets with one very large mode (Kiers & Krijnen, 1991; Kiers, Kroonenberg, & Ten Berge, 1992).
* Inclusion of ANACOR3, three-mode correspondence analysis, developed in co-operation with the late Andre Carlier.
* Inclusion of Procrus, a Procrustes analysis program based on work by Commandeur.
* Inclusion of Simultaneous Component Analysis, developed by Timmerman and Kiers.

10. Documentation
=================

10.1 User's guide
-----------------

***************************************************************
No new user's guide for the present version of the programs is available as yet.
***************************************************************

Kroonenberg, P.M. (1996). 3WAYPACK User's manual. A package of three-way programs. Leiden: Leiden Institute of Education and Child Studies, Leiden University.
Kroonenberg, P.M. (1996). 3WAYPACK Menu structure. Leiden: Department of Education, Leiden University.

10.2 Technical references
-------------------------

Basford, K.E., & McLachlan, G.J. (1985). The mixture method of clustering applied to three-way data. Journal of Classification, 2, 109-125.
Brouwer, P., & Kroonenberg, P.M. (1991). Some notes on the diagonalization of the extended three-mode core array. Journal of Classification, 8, 93-98.
Carlier, A., & Kroonenberg, P.M. (1996). Decompositions and biplots in three-way correspondence analysis. Psychometrika, 61, 355-373.
Carroll, J.D., & Chang, J.J. (1970). Analysis of individual differences in multidimensional scaling via an N-way generalization of "Eckart-Young" decomposition. Psychometrika, 35, 283-319.
Carroll, J.D., & Chang, J.J. (1972). A generalization of INDSCAL allowing IDIOsyncratic reference systems as well as an analytic approximation to INDSCAL. Paper presented at the Spring Meeting of the Psychometric Society, Princeton, New Jersey, March.
Harshman, R.A. (1970). Foundations of the PARAFAC procedure: Models and methods for an "explanatory" multi-mode factor analysis. UCLA Working Papers in Phonetics, 16, 1-84.
Harshman, R.A., & Kroonenberg, P.M. (1989). Overlooked solutions to Cattell's parallel proportional profiles problem: A perspective on three-mode analysis. Technical report. Leiden Institute of Education and Child Studies, Leiden University.
Harshman, R.A., & Lundy, M.E. (1984a). The PARAFAC model for three-way factor analysis and multidimensional scaling. In H.G. Law, C.W. Snyder Jr., J.A. Hattie, & R.P. McDonald (Eds.), Research methods for multimode data analysis (pp. 122-215). New York: Praeger.
Harshman, R.A., & Lundy, M.E. (1984b). Data preprocessing and the extended PARAFAC model. In H.G. Law, C.W. Snyder Jr., J.A. Hattie, & R.P. McDonald (Eds.), Research methods for multimode data analysis (pp. 216-284). New York: Praeger.
Kiers, H.A.L., & Krijnen, W.P. (1991). An efficient algorithm for PARAFAC of three-way data with large numbers of observational units. Psychometrika, 56, 147-152.
Kiers, H.A.L., Kroonenberg, P.M., & Ten Berge, J.M.F. (1992). An efficient algorithm for TUCKALS3 on data with large numbers of observational units. Psychometrika, 57, 415-422.
Kroonenberg, P.M. (1983a). Three-mode principal component analysis. Theory and applications. Leiden: DSWO Press.
Kroonenberg, P.M. (1983b). Annotated bibliography of three-mode factor analysis. British Journal of Mathematical and Statistical Psychology, 36, 81-113.
Kroonenberg, P.M. (1992). Three-mode component models. Statistica Applicata, 4, 619-634.
Kroonenberg, P.M., & De Leeuw, J. (1980). Principal component analysis of three-mode data by means of alternating least squares algorithms. Psychometrika, 45, 69-97.
Kroonenberg, P.M., Ten Berge, J.M.F., Brouwer, P., & Kiers, H. (1989). Gram-Schmidt versus Bauer-Rutishauser in alternating least-squares algorithms for three-way data. Computational Statistics Quarterly, 4, 81-87.
Lavit, Ch., Escoufier, Y., Sabatier, R., & Traissac, P. (1994). The ACT (STATIS method). Computational Statistics and Data Analysis.
Ten Berge, J.M.F., De Leeuw, J., & Kroonenberg, P.M. (1987). Some additional results on principal component analysis of three-mode data by means of alternating least squares algorithms. Psychometrika, 52, 183-191.
Tucker, L.R. (1966). Some mathematical notes on three-mode factor analysis. Psychometrika, 31, 279-311.
Tucker, L.R. (1972). Relations between multidimensional scaling and three-mode factor analysis. Psychometrika, 37, 3-27.
Weesie, H.M., & Van Houwelingen, J.C. (1983). GEPCAM User's Manual. Utrecht: Institute of Mathematical Statistics, University of Utrecht.

===============================================================================

H O W   T O   O R D E R

The software can be obtained by completing the order form. With respect to the software it is imperative to sign the contract that is part of the order form. After sending the order form you will receive an invoice for the total amount, but advance payment would be appreciated. Unfortunately, Dutch customers have to pay 17.5% BTW over the listed prices. Reduced prices are for educational and governmental institutions only; all others should pay the full price.

If possible, please remit in Euros to account 530510642 of "Leiden University c.s. Fac. Sociale Wetenschappen" at the ABN-AMRO bank, Stationsweg, Postbus 66, 2300 AB Leiden, The Netherlands; Swiftcode: ABN ANL 2A; IBAN: NL21ABNA0530510642.
Please note that banking charges should be borne by the customer. Payments should quote the following code: SAP 2403 014 455 (P. M. Kroonenberg).

Alternatively, pay by credit card (supplying the necessary information), or remit in Euros or US dollars, using an International Money Order or a Bank Draft, made out to "Leiden University, c.s. Fac. Sociale Wetenschappen, Wassenaarseweg 52, 2333 AK Leiden", also quoting the code above.

Our VAT code: NL001935549B01

Order forms and payment should be sent to:

    The Three-Mode Company
    P. M. Kroonenberg
    Leiden Institute of Education and Child Studies, Leiden University
    Wassenaarseweg 52
    2333 AK Leiden, The Netherlands
    Tel. *-31-71-5273446
    Fax *-31-71-5273945
    e-mail: kroonenb at fsw.leidenuniv.nl

===============================================================================

3WAYPACK

3WAYPACK is a package of analysis programs designed to handle three-way data in various forms. Three-way data typically arise when a sample of subjects is measured on several variables under a number of conditions. Of course, the terms subjects, variables, and conditions are only generic, and are different in different fields of application. For instance, the subjects might be blue crabs and the variables chemical elements found in different tissues of the body of a crab, as in a study by Gemperline et al. (1992; Analytical Chemistry).

The principal analysis methods in 3WAYPACK are three-way generalisations of component analysis, but other programs are included as well. The programs can also be used for a simple singular value decomposition of a single two-way matrix, for a weighted (or replicated) principal component analysis, the individual differences scaling models INDSCAL and IDIOSCAL, clustering, Procrustes analysis, and several other models.

The three-way analysis programs are written in Fortran 90.
The complete distribution contains TUCKALS3, TUCKALS2 and Three-mode correspondence analysis in one program, TRILIN (Parafac model), Three-mode mixture method of clustering, Simultaneous component analysis, and Procrustes analysis. The present versions come with a special interface written in Pascal using conventional memory, called INTERFACE3 (IF3). This interface has a menu structure to facilitate the preparation of the input, running the program, and inspecting the output. The three-way programs can, however, also be used as stand-alone programs.

The present versions include missing data handling plus several other enhancements compared to previous versions. Preprocessing of the input data, such as centring and normalisation, and postprocessing of the output, such as rotating components, constructing joint plots, and inspecting residuals, can be performed within the framework of 3WAYPACK. Browser (HTML) output is available, as well as publication-quality graphics through Gnuplot.

The software will be supplied as executable programs compiled with MS Power Station. Input and output examples are supplied, and a user manual (for an older version of the program) is available via the website of the Three-Mode Company.

Persons who need to handle very large data sets requiring very large amounts of memory, or who only have access to computers with operating systems other than DOS or Windows, should contact the author for special versions of the programs. Versions for UNIX and the Macintosh are under development; please check with the author.

===============================================================================

The Three-Mode Company
P.M. Kroonenberg
Leiden Institute of Education and Child Studies
Leiden University

O R D E R   F O R M

SOFTWARE:

* 3WayPack (complete), latest version                  Euro 200 / Euro 600  _____
  Shipping, handling costs, and departmental levies    Euro  40 / Euro 120  _____
* 3WayPack (Tucker programs only), latest version      Euro 100 / Euro 300  _____
  Shipping, handling costs, and departmental levies    Euro  20 / Euro  60  _____
* DUTCH customers please add 17.5% BTW                                      _____
* Information about availability on other operating systems than Windows
  (please specify) ___________________________

Please note that it is imperative to sign the contract when ordering software!
Reduced prices are for educational and governmental institutions only.

PLEASE SPECIFY THE OPERATING SYSTEM UNDER WHICH YOU WOULD LIKE TO USE THE THREE-MODE PROGRAMS:

DOCUMENTATION:

* Copies of the manuals are available for free from the website of the Three-Mode Company, in particular
  + 3WAYPACK USER'S MANUAL
  + 3WAYPACK MENU STRUCTURE
  + P. M. Kroonenberg, Three-mode principal component analysis: Theory and applications (3rd printing + errata). Leiden: DSWO Press, 1983

===============================================================================

The Three-Mode Company
P.M. Kroonenberg
Leiden Institute of Education and Child Studies
Leiden University

C O N T R A C T

The programs will be supplied under the following conditions:

1. Buyer acquires the right to use the programs. The programs remain property of the authors. Buyer is not allowed to distribute programs to a third party without written consent of the authors.
2. All programs will be delivered via e-mail or a dedicated web page, unless requested otherwise.
3. Buyer will take care that in a publication of results obtained through the use of one of the programs, the name of the program and its author will be referred to, and copies of such publications will be sent to the author of the programs.
4. The programs will be supplied as soon as possible, mostly within a week after receipt of the order. Within one month after receipt of the programs, the total charge will be paid.

Place, date  _______________________

Signature    _______________________

Name         _____________________________________________________

Institution  _____________________________________________________

Address      _____________________________________________________

===============================================================================

Leiden Institute of Education and Child Studies, Leiden University

Wassenaarseweg 52, 2333 AK Leiden, The Netherlands

Tel. *-31-71-5273446/5273434 (secr.); fax *-31-71-5273945

E-mail: kroonenb at fsw.leidenuniv.nl

Created: 4-04-1997

Last Updated: 17-11-2003