Three-Mode Abstracts, Part B
Bader, B. W., & Kolda, T. G. (2006).
Efficient MATLAB computations with sparse and factored tensors.
Sandia Report, Sandia National Laboratories, Albuquerque, New Mexico and Livermore, California.
In this paper, the term tensor refers simply to a multidimensional or N-way
array, and we consider how specially structured tensors allow for efficient storage
and computation. First, we study sparse tensors, which have the property
that the vast majority of the elements are zero. We propose storing sparse tensors
using coordinate format and describe the computational efficiency of this
scheme for various mathematical operations, including those typical to tensor
decomposition algorithms. Second, we study factored tensors, which have the
property that they can be assembled from more basic components. We consider
two specific types: a Tucker tensor can be expressed as the product of a core
tensor (which itself may be dense, sparse, or factored) and a matrix along each
mode, and a Kruskal tensor can be expressed as the sum of rank-1 tensors. We
are interested in the case where the storage of the components is less than the
storage of the full tensor, and we demonstrate that many elementary operations
can be computed using only the components. All of the efficiencies described
in this paper are implemented in the Tensor Toolbox for MATLAB.
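The coordinate format mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the Tensor Toolbox implementation (which is written in MATLAB); the function names to_coo and ttv_mode3 and the small example tensor are hypothetical, chosen only to show that a tensor-times-vector product needs to visit the stored nonzeros and nothing else.

```python
# Coordinate (COO) storage for a sparse 3-way tensor: keep only the nonzeros.
# All names are illustrative assumptions, not Tensor Toolbox functions.

def to_coo(dense):
    """Return a dict mapping (i, j, k) -> value for every nonzero entry."""
    coo = {}
    for i, slab in enumerate(dense):
        for j, row in enumerate(slab):
            for k, value in enumerate(row):
                if value != 0:
                    coo[(i, j, k)] = value
    return coo

def ttv_mode3(coo, shape, vec):
    """Multiply the tensor by a vector along the third mode.

    Only stored nonzeros are visited, so the work is proportional to the
    number of nonzeros rather than to the size of the full tensor.
    """
    n_i, n_j, _ = shape
    out = [[0.0] * n_j for _ in range(n_i)]
    for (i, j, k), value in coo.items():
        out[i][j] += value * vec[k]
    return out

# A 2 x 2 x 3 tensor with two nonzero entries.
dense = [
    [[0, 0, 3], [0, 0, 0]],
    [[0, 5, 0], [0, 0, 0]],
]
coo = to_coo(dense)
result = ttv_mode3(coo, (2, 2, 3), [1.0, 2.0, 4.0])
```

Here the twelve-element tensor is stored as two coordinate-value pairs, and the product touches only those two pairs.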
Bader, B. W., & Kolda, T. G. (2007).
Efficient MATLAB computations with sparse and factored tensors.
SIAM Journal on Scientific Computing, 30, 205-231.
In this paper, the term tensor refers simply to a multidimensional or
N-way array, and we consider how specially structured tensors allow for
efficient storage and computation. First, we study sparse tensors, which
have the property that the vast majority of the elements are zero. We
propose storing sparse tensors using coordinate format and describe the
computational efficiency of this scheme for various mathematical
operations, including those typical to tensor decomposition algorithms.
Second, we study factored tensors, which have the property that they
can be assembled from more basic components. We consider two specific
types: A Tucker tensor can be expressed as the product of a core tensor
(which itself may be dense, sparse, or factored) and a matrix along
each mode, and a Kruskal tensor can be expressed as the sum of rank-1
tensors. We are interested in the case where the storage of the
components is less than the storage of the full tensor, and we
demonstrate that many elementary operations can be computed using only
the components. All of the efficiencies described in this paper are
implemented in the Tensor Toolbox for MATLAB.
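As a rough illustration of computing "using only the components", the Python sketch below evaluates the squared norm of a three-way Kruskal tensor from its weights and factor vectors, and checks it against a brute-force sum over all entries. The helper names and the small example factors are assumptions made for this illustration; they do not come from the paper or from the Tensor Toolbox.

```python
from itertools import product

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

def kruskal_entry(weights, A, B, C, i, j, k):
    """Entry (i, j, k) of the sum of weighted rank-1 tensors a_r o b_r o c_r."""
    return sum(w * a[i] * b[j] * c[k] for w, a, b, c in zip(weights, A, B, C))

def kruskal_norm_sq(weights, A, B, C):
    """Squared norm computed from the components only.

    Uses ||X||^2 = sum over r, s of w_r w_s (a_r.a_s)(b_r.b_s)(c_r.c_s),
    so the full tensor is never formed.
    """
    total = 0.0
    for r, s in product(range(len(weights)), repeat=2):
        total += (weights[r] * weights[s]
                  * dot(A[r], A[s]) * dot(B[r], B[s]) * dot(C[r], C[s]))
    return total

# Two rank-1 components of a 2 x 2 x 3 tensor (A[r] is the r-th factor vector).
weights = [2.0, 1.0]
A = [[1, 0], [1, 1]]
B = [[1, 2], [0, 1]]
C = [[1, 1, 0], [0, 1, 1]]

factored_norm_sq = kruskal_norm_sq(weights, A, B, C)
full_norm_sq = sum(
    kruskal_entry(weights, A, B, C, i, j, k) ** 2
    for i, j, k in product(range(2), range(2), range(3))
)
```

The factored computation needs only inner products between factor vectors, which is where the storage and time savings would come from when the full tensor is large.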
Baerwald, T.J. (1976).
The emergence of a new "downtown". The Geographical
Review, 68, 308-318.
A geographical application of T3 to changes in
land use over time with land use classes, time periods and
districts as modes. No numerical results presented.
Bagozzi, R. P., & Yi, Y. J. (1990).
Assessing method variance in multitrait multimethod matrices - the case
of self reported affect and perceptions at work.
Journal of Applied Psychology, 75, 547-560.
Spector (1987) concluded that there was little evidence of method variance
in multitrait-multimethod data from 10 studies of self-reported affect and perceptions at work,
but Williams, Cote, and Buckley (1989) concluded that method variance was prevalent. We extended
these studies by examining several important but often neglected issues in assessing method variance.
We describe a direct-product model that can represent multiplicative method effects and propose that
model assumptions, individual parameters, and diagnostic indicators, as well as overall model fits,
be carefully examined. Our reanalyses indicate that method variance in these studies is more prevalent
than Spector concluded but less prevalent than Williams et al. asserted. We also found that methods
can have multiplicative effects, supporting the claim made by Campbell and O'Connell (1967, 1982).
Bailey, R.A., & Rowley, C.A. (1993).
Maximal rank of an element of a tensor product. Linear Algebra
and Its Applications, 182, 1-7.
Upper bounds are given for the maximal rank of an element of the tensor
product of three vector spaces.
Baird, I. S., Sudharsan, D., & Thomas, H. (1988).
Addressing temporal change in strategic groups analysis: a three-mode factor analysis approach.
Journal of Management, 14, 425-439.
The advantages of using 3-mode factor (TMF) analysis to capture
temporal effects when formulating strategic groups are examined. TMF analysis
is proposed as a procedure to permit joint consideration of temporal changes in
strategy and temporal variations in the importance of strategic criterion variables
when analyzing the strategic group structure within industries. The TMF demonstration
for strategic grouping across time is based upon the 46 firms in the computer industry
for whom financial data are available on COMPUSTAT tapes for 1977-1981.
The empirical data set has the dimensionality of a 46 X 6 X 5 matrix (46
firms arrayed on 6 financial risk variables over a time period of 5
years). The financial mode analysis enables the financial risk
dimensions to be synthesized into 3 factors: 1. investor treatment
orientation, 2. liquidity management sophistication, and 3. debt
aversion. The time-based analysis identified 3 periods of strategic
change in the industry. Financial risk traits of the 6 strategic groups
identified in the firm-mode analysis may be examined with reference to
the core matrix.
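Driemodale faktoranalyse in een differentieelpsychologisch
onderzoek naar de beoordeling van abstracte
schilderijen [Three-mode factor analysis in a differential-psychological
study of the judgment of abstract paintings]. Nederlands Tijdschrift voor de Psychologie,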
T3 and its possible rotations are
discussed at a conceptual level. The model is illustrated with an
analysis of 15 abstract (non-figurative) paintings, scored on 10
bipolar (semantic) scales by 34 subjects. The relation between
neuroticism and extraversion (measured independently), and the
resulting factors were analysed using the core matrix.
Barbieri, P., Andersson, C.A., Massart, D.L., Predonzani,
S., Adami, G., & Reisenhofer, E. (1999a).
Modeling bio-geochemical interactions in the
surface waters of the Gulf of Trieste by
three-way principal component analysis (PCA).
Analytica Chimica Acta, 398, 227-235.
Data of temperature, salinity, dissolved oxygen,
nutrients and chlorophyll measured on samples of
surface seawater and collected monthly during 2
years in different sites of the Gulf of Trieste
are modeled by means of three-way principal
component analysis (PCA). Missing values are
handled using an expectation maximization
algorithm, regression or substitution with random
numbers, depending on their origin.
Physicochemical parameters are described by three
different components that explain the effect of
the river input on the seawater pattern, the
effect of temperature, and metabolic-catabolic
activity of the phytoplankton, respectively. One
spatial component accounts for the gradient of
influence of the estuarine waters in the gulf,
and three temporal components characterize three
main seasonal conditions. Anomalous situations,
generated by meteoclimatic events, are
Barbieri, P., Adami, G., & Reisenhofer, E. (1999b).
Searching for a 3-way model of spatial and seasonal variations in the chemical composition of
Annali Di Chimica, 89, 639-648.
A procedure is described for the search of a three-way principal components model,
characterizing a data set concerning the spatial and temporal distribution of the physico-chemical
parameters which govern the composition of waters collected from springs, ponds and rivers of the
Karst of Trieste. Ten physico-chemical parameters were determined for eleven sampling sites and eleven
sampling times. A graphic method was applied in order to find the number of components in each of the
three ways of the model, explaining a relatively high quantity of variation of the data, with a
limited number of components, i.e. with descriptive parsimony, and generating interpretable factors.
The examination of 125 possible Tucker3 models, having from 1 to 5 components in each of the ways,
allowed us to identify the model having two components in each of the three ways as the one satisfying
the desired criteria. The chance of reducing to a simpler PARAFAC model has been successfully explored,
and two trilinear components were then computed. The first one is mainly related to a spatial factor
conditioning the considered waters, while the second is related to a seasonal factor.
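The count of 125 candidate models follows from allowing 1 to 5 components in each of the three ways (5 x 5 x 5). The Python sketch below enumerates those candidates for a 10 x 11 x 11 data array and attaches a simple size measure to each one. The size measure (elements in the three component matrices plus the core array) is an assumption made here for illustration; it is not the graphic selection method used by the authors.

```python
from itertools import product

def tucker3_size(shape, ranks):
    """Number of stored elements in a Tucker3 model.

    Counts the three component matrices plus the core array. This is a
    crude complexity measure, not a degrees-of-freedom calculation.
    """
    (n_i, n_j, n_k), (p, q, r) = shape, ranks
    return n_i * p + n_j * q + n_k * r + p * q * r

# 10 parameters x 11 sites x 11 sampling times, as in the abstract.
shape = (10, 11, 11)

# Every combination of 1 to 5 components in each of the three ways.
candidates = list(product(range(1, 6), repeat=3))
sizes = {ranks: tucker3_size(shape, ranks) for ranks in candidates}
```

In practice each candidate would also need a fit value (explained variation) so that fit can be weighed against size; that fitting step is outside this sketch.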
Barbieri, P., Adami, G., Predonzani, S., Reisenhofer, E., & Massart, D. L. (1999c).
Survey of environmental complex systems: pattern recognition of physicochemical
data describing coastal water quality in the Gulf of Trieste.
Journal of Environmental Monitoring, 1, 39-74.
A data set reporting temperature, salinity, dissolved oxygen, nitrogen as
ammonia, nitrite and nitrate, silicate, chlorophyll a and phaeopigment values, determined
in seawaters sampled during two years with a monthly frequency in 16 stations in the Gulf
of Trieste, and at different depths of the water column, has been studied. In order to
find synthetic descriptors useful for following the spatial and temporal variations of
biogeochemical phenomena occurring in the considered ecosystem, the data set has been
factorized using principal component analysis. A graphical display of scores, by means
of boxplots and biplots, helped in the interpretation of the data set. The first factor
conditioning the system is related to the input of freshwater from the estuary of the
Isonzo River and to the stratification of the seawater (thermohaline discontinuity), while
the second and third components describe interactions between biological activity,
nutrients and physicochemical parameters; typical spring and autumn phytoplankton blooms
were identified, in addition to an exceptional winter bloom conditioned by anomalous
meteorological/climatic conditions. The fourth principal component explains the reducing
activity of seawaters, which often increases when the decomposition of organic matter is
relevant. The simple linear model proposed, and the related graphs, are shown to be
useful tools for monitoring the main features of such a complex dynamic environmental
system. The outlined approach to the considered complex data structure presents in a
cognitively easy way (graphical outputs) the significant variations of the data, and
allows for a detailed interpretation of the results of the monitoring campaign. Temporal
and spatial effects are outlined, as well as those related to the depth in the water
Barbieri, P., Adami, G., Piselli, S., Gemiti, F., & Reisenhofer, E. (2002a).
A three-way principal factor analysis for assessing the time
variability of freshwaters related to a municipal water supply.
Chemometrics and Intelligent Laboratory Systems, 62, 89-100.
Chemical analyses (total hardness, HARD; dissolved oxygen, DO; chlorides; sulfates; nitrates; nitrites; ammonia;
orthophosphates; and UV-absorbing organic constituents, UV-ORG), physical data (turbidity, TURB; temperature, TEMP;
conductivity, COND), and biological monitors (total and faecal coliforms, FAEC; faecal streptococci, STREPTO) constitute the
15 parameters, monitored with monthly frequency in the space of 4 years on freshwaters sampled at seven sites in a karstic area
of northeastern Italy. The data set was used for a three-way principal factor analysis aimed at exploring the pattern of
information about the environmental quality of the monitored freshwaters, since four wells are feeding the municipal water
supply of the Province of Trieste, and the other water courses can influence them. The selected three-way (3,3,2) model uses
three components for describing the analytical parameters, three for temporal variations and two for spatial variations. The
method optimising the ‘variance of squares’ of the core elements has permitted a simple and meaningful interpretation of the
Tucker-3 solution. The procedure succeeded in decomposing the overall temporal variation in three parts, thus highlighting
nonperiodic critical events, a periodic seasonal component and a constant term. The seasonality has been confirmed by the
examination of the autocorrelation function of the second temporal component. An environmental interpretation and an estimate
of the relative relevance of phenomena conditioning the considered water body, detected by the multiway analysis, have been
proposed.
Barbieri, P., Adami, G., Piselli, S., Gemiti, F., & Reisenhofer, E. (2002b).
A three-way principal factor analysis for assessing the time variability of
freshwaters related to a municipal water supply. Chemometrics and Intelligent
Laboratory Systems, 60, 89-100.
Chemical analyses and biological monitors constitute the 15 parameters, monitored
with monthly frequency in the space of 4 years on freshwaters sampled at seven
sites in a karstic area of northeastern Italy. The data set was used for a
three-way principal factor analysis aimed at exploring the pattern of information
about the environmental quality of the monitored freshwaters, since four wells
are feeding the municipal water supply of the Province of Trieste, and the
other water courses can influence them. The selected three-way (3,3,2) model
uses three components for describing the analytical parameters, three for
temporal variations and two for spatial variations. The method optimising
the 'variance of squares' of the core elements has permitted a simple and
meaningful interpretation of the Tucker-3 solution. The procedure succeeded
in decomposing the overall temporal variation in three parts, thus highlighting
nonperiodic critical events, a periodic seasonal component and a constant term.
The seasonality has been confirmed by the examination of the autocorrelation
function of the second temporal component. An environmental interpretation
and an estimate of the relative relevance of phenomena conditioning the
considered water body, detected by the multiway analysis, have been proposed.
Baronti, S., Casini, A., Lotti, F., & Porcinai, S. (1997).
Principal component analysis of visible and near-infrared multispectral images
of works of art.
Chemometrics and Intelligent Laboratory Systems, 39, 103-114.
Principal component analysis (PCA) was applied to a very simple case
of a tempera panel painted with four known pigments (cinnabar, malachite, yellow ochre
and chromium oxide). The four pigments were spread pure as well as diluted with carbon
black (5% w/w, 10% w/w) thus creating 12 homogeneous areas of the same size. The panel
was imaged by a Vidicon camera in the visible and near-infrared regions (420-1550 nm)
resulting in a set of 29 images. PCA was applied by taking various subsets of the
input data. From the analysis of this simple and predictable case study some guidelines
are synthesized and proposed for the application to actual works of art. Results are
presented for the painted panel. Preliminary results are also reported for Luca
Signorelli's "Predella della Trinita". The multivariate image analysis results in
the visible and near-infrared regions show that it is possible to use the multispectral
image data in order to get a segmentation and a classification of painted zones by
pigments with different chemical composition or physical properties.
Barré, A., & Fichet, B. (1985).
Analyse des correspondances et rotations procustéennes
représentation hiérarchique et ordres compatibles.
Statistiques et Analyse de Données, 10, 16-26.
Contingency tables depending on time define the evolution of items for
variables. We study this evolution by means of a correspondence factor
analysis and a hierarchical classification at each time. Moreover,
Procrustes analysis and the search of a common order on units for the
hierarchies help in comparing the graphical
Bartussek, D. (1973).
Zur Interpretation der Kernmatrix
in der dreimodalen Faktorenanalyse von R.L. Tucker [On the interpretation
of the core-matrix in the three-mode factor analysis of R.L. Tucker].
Psychologische Beiträge, 15, 169-184.
After a rather clear exposition of T3, generalizing from PCA on
two-mode matrices, B. proposes to scale the component matrices
such that the components have lengths equal to the corresponding
eigenvalues. These eigenvalues are themselves adjusted by division
through the total number of elements in the other two modes. The
reciprocal scaling is performed for the core matrix elements.
These elements become in this way independent of the size of the
sum of squares of the components and may therefore be interpreted
as 'classical' factor scores. In the same sense the elements of
the components correspond to 'classical' factor loadings rather
than being just elements of orthogonal eigenvectors.
Standardization of the raw data and interpretation of T3 results
by comparing them with external variables are discussed as
Bartussek, D. (1980).
Die dreimodale Faktorenanalyse als Methode zur
Bestimmung von EEG-Frequenzbändern [Trimodal factor analysis as a method
of determining EEG frequency bands]. In S. Kubicki, W.M.
Herrmann & G. Laudahn (Eds.), Faktorenanalyse und
Variablenbildung aus dem Elektroenzephalogramm (pp. 15-26).
Stuttgart, FRG: Gustav Fischer Verlag.
T3 is outlined, its relation to Cattell's (1966) data box is
indicated, and the interpretation of the core matrix for EEG
data is discussed. Also included is a discussion of the
subject and situation selection, the norming of the EEG
frequency spectra to be calculated, the standardization of the
spectrum values and the choice of a time basis for the
Bartussek, D. & Gräser, H. (1980).
Ergebnisse dreimodaler Faktorenanalysen von
EEG-Frequenzspektren [Results of three-mode factor analyses of EEG
frequency spectra]. In S. Kubicki, W.M. Herrmann & G. Laudahn
(Eds.), Faktorenanalyse und Variablenbildung aus dem
Elektroenzephalogramm (pp. 79-87). Stuttgart, FRG: Gustav Fischer Verlag.
The results of two unpublished studies are reported. Of 40
students 30 values of the frequency spectrum for six activity
situations were measured in 2 ways. T3 was performed on data
(40x30x12) standardized per spectral value over all
student/situation combinations. Frequencies and situations were
varimaxed. Special attention was paid to the interpretation of the
core matrix and the effects of standardization. In the other study
3 spectral values were collected from 20 subjects in 24
situations. Data standardized as above. Frequencies were
varimaxed; the situations and subjects were obliquely rotated.
Again detailed attention to the core matrix.
Basford, K.E., Federer, W.T., & Miles-McDermott, N.J. (1987).
Illustrative examples of clustering using the mixture method and two
comparable methods from SAS. Computational Statistics
Quarterly, 4, 219-233.
The technique of clustering uses the measurements on a set of elements to
identify clusters or groups of elements such that there is relative homogeneity
within the groups and heterogeneity between the groups. Under the mixture
approach to clustering, the elements are assumed to be a sample from a mixture
of several populations in various proportions. In addition to the formal
the practical application to two real data sets is considered, with the
function in each underlying population assumed to be Normal. To provide a basis
for comparison, two SAS clustering methods with similar assumptions are
considered. The data are analyzed using: KMM, SAS (CLUSTER) - Ward's
method, and SAS (CLUSTER) - EML method; the results are discussed.
Basford, K.E., & Kroonenberg, P.M. (1989).
An investigation of multi-attribute genotype response across
environments using three-mode principal component analysis.
The usefulness of three-mode principal component analysis to explore multi-attribute
genotype-environment interaction is investigated. The technique
a general description of the underlying patterns present in the data in
interactions of the three quantities involved. As an example, data from an
Australian experiment on the breeding of soybean lines are treated in
Basford, K.E., Kroonenberg, P.M., & DeLacy, I.H. (1991).
Three-way methods for multiattribute genotype by environment
data: An illustrated partial survey. Field Crops Research,
Several ordination and clustering techniques are discussed with respect
to their usefulness in analysing multiattribute genotype×environment data.
The methods are described and illustrated by application to data from the
Australian Cotton Cultivar Trials (ACCT), a series of regional variety
trials designed to investigate cotton (Gossypium hirsutum (L.)) lines in
several locations each year. Techniques applicable to three-way data are
necessary to assess these lines using yield and lint-quality data. By the
choice of complementary methods, it is possible to make both global and
detailed statements about the relative performance of the cotton lines.
These techniques can enhance the researcher's ability to make informed
decisions about the genotype×environment data collected from these trials
using simultaneous analysis of the attributes of interest.
Basford, K.E., Kroonenberg, P.M., DeLacy, I.H., & Lawrence, P.K.
Multiattribute evaluation of regional cotton variety trials.
Theoretical and Applied Genetics, 79, 225-234.
Two multivariate techniques applicable to three-way data are described
to analyse the data: the mixture maximum likelihood method of clustering and
three-mode principal component analysis. Applied together, the methods
each other's usefulness in interpreting the information on the line
patterns across the locations. The methods provide a good integration of
responses across environments of the entries for the different attributes
trials, and a less subjective, relatively easy to apply and interpret
of describing the patterns of performance and associations in complex
multiattribute and multilocation trials. This should lead to more
among lines in such trials.
Basford, K.E., & McLachlan, G.J. (1985a).
Estimation of allocation rates in a cluster analysis context.
Journal of the American Statistical Association, 80,
A sample of multivariate observations is assumed to be drawn from a mixture of a given
number of underlying populations. The mixture likelihood approach to clustering is used to
allocate each individual in the sample to its population of origin on the basis of the
estimated posterior probabilities of population membership. Estimation of the correct
allocation rate is considered for individual populations as well as for the overall mixture
by averaging functions of the maximum of these posterior probabilities. The estimates of the
correct allocation rates provide a means of assessing the performance of the mixture
approach to clustering. The bootstrap technique is investigated for its effectiveness in
reducing the bias of the estimates so obtained. Results are reported for three real data
sets and a simulation study. It is demonstrated that the proposed estimates generally
provide useful information on the unobservable allocation rates of the mixture approach.
Encouraging results are obtained for the bootstrap method of bias correction applied to the
estimates of the individual and overall allocation rates.
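The idea of estimating the overall correct allocation rate from the maximum posterior probabilities can be sketched in Python as follows. This is a simplified illustration under stated assumptions: a univariate two-component normal mixture with known parameters, the plain average of the maxima as the estimate, and no bootstrap bias correction. The function names and the sample values are hypothetical and are not taken from the paper.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def posteriors(x, proportions, means, sds):
    """Posterior probabilities of population membership for one observation."""
    joint = [p * normal_pdf(x, m, s) for p, m, s in zip(proportions, means, sds)]
    total = sum(joint)
    return [j / total for j in joint]

def estimated_allocation_rate(sample, proportions, means, sds):
    """Average, over the sample, of the largest posterior probability.

    Each individual is allocated to the population with the largest
    posterior, and that posterior is taken as the estimated chance that
    the allocation is correct.
    """
    maxima = [max(posteriors(x, proportions, means, sds)) for x in sample]
    return sum(maxima) / len(maxima)

# Hypothetical mixture parameters and observations (assumed, for illustration).
proportions = [0.5, 0.5]
means = [-2.0, 2.0]
sds = [1.0, 1.0]
sample = [-2.1, -1.8, 0.1, 1.9, 2.3]

rate = estimated_allocation_rate(sample, proportions, means, sds)
```

Because the same data are normally used both to fit the mixture and to compute the posteriors, an estimate of this kind tends to be optimistic, which is the bias the bootstrap is meant to reduce.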
Basford, K.E., & McLachlan, G.J. (1985b).
Likelihood estimation with normal mixture models. The Journal
of the Royal Statistical Society, 34, 282-289.
Considered are some of the problems associated with likelihood estimation in the context of
a mixture of multivariate normal distributions. In mixture models, the likelihood equation
usually has multiple roots and so there is the question of which root to choose. In the case
of equal covariance matrices the choice of root is straightforward in the sense that the
maximum likelihood estimator exists and is consistent. However, an example is presented to
demonstrate that the adoption of a homoscedastic normal model in the presence of some
heteroscedasticity can considerably influence the likelihood estimates, in particular of the
mixing proportions, and hence the consequent clustering of the sample at hand.
Basford, K.E., & McLachlan, G.J. (1985c).
The mixture method of clustering applied to three-way data.
Journal of Classification, 2, 109-125.
This article shows that by appropriate specification of the underlying model, the mixture
maximum likelihood approach to clustering can be applied in the context of a three-way
table. It is illustrated using a soybean data set which consists of multiattribute
measurements on a number of genotypes each grown in several environments. Although the
problem is set in the framework of clustering genotypes, the technique is applicable to
other types of three-way data sets.
Batchelder, W. H., Kumbasar, E., & Boyd, J. P. (1997).
Consensus analysis of three way social network data.
Journal of Mathematical Sociology, 22, 29-58.
Three way social network data occurs when every actor in a social network
generates a digraph of the entire network. This paper presents a statistical model based on
cultural consensus analysis for aggregating these separate digraphs into a single consensus
digraph. In addition, the model allows estimation of separate hit and false alarm rates for
each actor that can vary within each actor in different regions of the digraph. Several
standard signal detection models are used to interpret the hit and false alarm parameters in
terms of knowledge and response bias. A published three way data set by Kumbasar, Romney, and
Batchelder (American Journal of Sociology, 1994) is analyzed, and the model reveals that both
response bias and knowledge decrease with distance from ego.
Baunsgaard, D., Andersson, C. A., Arndal, A., & Munck, L. (2000a).
Multi-way chemometrics for mathematical separation of fluorescent colorants and
colour precursors from spectrofluorimetry of beet sugar and beet sugar thick juice
as validated by HPLC analysis.
Food Chemistry, 70, 113-121.
In previous analyses of colour impurities in processed sugar, a multi-way
chemometric model, CANDECOMP-PARAFAC (CP), has been used to model fluorescence
excitation-emission landscapes of sugar samples. Four fluorescent components were found,
two of them tyrosine and tryptophan, correlating to important quality and process parameters.
In this paper HPLC analyses are used to chemically verify and extend the CP models of sugar.
Thick juice, an intermediate in the sugar production, was analysed by size exclusion HPLC.
Tyrosine and tryptophan were confirmed as constituents in thick juice. Colorants were found
to be high molecular weight compounds. Fluorescence landscapes on collected column fractions
were modelled by the CP model and seven fluorophores were resolved. Apart from tyrosine and
tryptophan, four of the fluorophores were identified as high molecular weight compounds, three
of them possible Maillard reaction polymers, whereas the seventh component resembled a
polyphenolic compound. It is concluded that the relevance of CP for mathematical separation of
fluorescence landscapes has been justified on two levels by HPLC; firstly as a screening method
of fluorophores in complex samples and secondly as a confirmation of peak purity in chromatographic
Baunsgaard, D., Munck, L., & Norgaard, L. (2000b).
Analysis of the effect of crystal size and color distribution on
fluorescence measurements of solid sugar using chemometrics.
Applied Spectroscopy, 54, 1684-1689.
Fluorescence from sugar crystal samples has previously been
used to obtain information about factory imprint and sugar quality. Solid-phase
fluorescence has potential as a fast screening method, but the spectra are
highly influenced by the measurement geometry and sugar crystal sample. The aim
of the present study was to examine how the fluorescence measurements are
related to the sugar crystals for a better understanding of both. Initially,
five sugar samples of varied composition were sieved into five crystal size
fractions. Fluorescence excitation-emission landscapes of the fractions were
measured with solid transmission and reflection techniques and in solution.
The transmission fluorescence was quenched at ultraviolet wavelengths, and
light scatter highly influenced the reflection fluorescence. Principal component
analysis (PCA) showed that large crystals favored the transmission fluorescence,
whereas smaller crystals improved the reflection fluorescence measurements.
The multi-way method PARAFAC (parallel factor analysis) was used to resolve
spectra of individual components from the fluorescence landscapes. Transmission
and solution components had similar spectral profiles at higher wavelengths,
characterizing a colorant and a colorant intermediate. The resolved components
of the reflection data were very influenced by scatter. Color predictions based
on a few significant wavelength variables equaled the model results of
full-spectrum models using partial least-squares regression (PLS). The variables
corresponded to wavelength maxima of the resolved colorants and ultraviolet
wavelengths characterizing colorant precursors.
Baunsgaard, D., Norgaard, L., & Godshall, H. A. (2000c).
Fluorescence of raw cane sugars evaluated by chemometrics.
Journal of Agricultural and Food Chemistry, 48, 4955-4962.
In a fluorescence study of raw cane sugar samples, two-way and
three-way chemometric methods have been used to extract information about the
individual fluorophores in the sugar from fluorescence excitation-emission landscapes.
A sample set of 47 raw sugar samples representing a varied selection was analyzed,
and three individual fluorophores with (275, 350) nm, (340, 420) nm, and (390, 460)
nm as their approximate excitation and emission maxima were found. The spectral
profiles of the fluorophores were estimated with the three-way decomposition model
PARAFAC. Two-way principal component analysis (PCA) of unfolded fluorescence
landscapes confirmed the PARAFAC results and showed patterns of samples related to
time of storage. Partial least squares (PLS) calibration models of color at 420 nm
had a high model error due to the very high color range of the raw sugars, but
variable selection performed on the fluorescence data revealed that all three
fluorophores were correlated to color. The (275, 350) nm fluorophore is considered
as a color precursor to the color developed on storage and the (340, 420) nm and
(390, 460) nm fluorophores show colorant polymer characteristics.
Baunsgaard, D., Norgaard, L., & Godshall, H. A. (2001).
Specific screening for color precursors and colorants in beet and cane sugar
liquors in relation to model colorants using spectrofluorometry evaluated by
HPLC and multiway data analysis.
Journal of Agricultural and Food Chemistry, 49, 1687-1694.
A comparison was made of the fluorophores in beet thick juice
and cane final evaporator syrup, which are comparable in the
production of cane and beet sugar; that is, both represent the
final stage of syrup concentration prior to crystallization of
sugar. To further elucidate the nature of the color components
in cane and beet syrup, a series of model colorants was also
prepared, consisting of mildly alkaline-degraded fructose and
glucose and two Maillard type colorants, glucose-glycine and
glucose-lysine. Fluorescence excitation-emission landscapes
resolved into individual fluorescent components with PARAFAC
modeling were used as a screening method for colorants, and the
method was validated with size exclusion chromatography using a
diode array UV-vis detector. Fluorophores from the model
colorants were mainly located at visible wavelengths. An
overall similarity in chromatograms and absorption spectra of
the four model colorant samples indicated that the formation of
darker color was the distinguishing characteristic, rather than
different reaction products. The fluorophores obtained from the
beet and cane syrups consisted of color precursor amino acids
in the UV wavelength region. Tryptophan was found in both beet
and cane syrups. Tyrosine as a fluorophore was resolved in only
beet syrup, reflecting the higher levels of amino acids in beet
processing. In the visible wavelength region, cane syrup
colorant fluorophores were situated at higher wavelengths than
those of beet syrup, indicating formation of darker colorants.
A higher level of invert sugar in cane processing compared to
beet processing was suggested as a possible explanation for the
Baxter, D. C., & Ohman, J. (1990).
Multicomponent standard additions and partial least-squares modeling: A multivariate calibration
approach to the resolution of spectral interferences in graphite-furnace atomic-absorption spectrometry.
Spectrochimica Acta Part B - Atomic Spectroscopy, 45, 481-491.
Spectral interferences in graphite furnace atomic absorption spectrometry (GFAAS)
represent a considerable problem, often making the direct determination of certain elements in
specific matrices impossible, particularly when continuum source background correction is employed.
The possibilities to resolve such spectral interferences mathematically by applying multivariate
calibration have been investigated. Resolution is achieved using multi-component standard additions
(the so-called generalised standard addition method or GSAM) combined with partial least squares
(PLS) modelling. This multivariate calibration method, PLS-GSAM, is described and its use illustrated
by application to the GFAAS determination of gold in the presence of cobalt at the 242.8 nm wavelength,
where severe spectral interference problems are observed using continuum source background correction.
Two requirements for the successful application of PLS-GSAM are that the sample constituent causing
the spectral interference is known and that its concentration can be increased by standard additions.
It is shown that more accurate results are obtained by PLS-GSAM than by conventional (single-component)
standard addition methods.
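For orientation, the conventional single-component standard addition that PLS-GSAM is compared against can be sketched as below. This is only a minimal illustration and not the authors' procedure: the function name `standard_addition` and the numbers are hypothetical, and a straight-line response with no spectral interference is assumed.

```python
def standard_addition(added, signal):
    """Estimate the analyte concentration in the original sample.

    Fits signal = intercept + slope * added by ordinary least squares
    and returns intercept / slope, i.e. the magnitude of the x-intercept.
    Assumes a linear response and no interfering component.
    """
    n = len(added)
    mean_x = sum(added) / n
    mean_y = sum(signal) / n
    sxx = sum((x - mean_x) ** 2 for x in added)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(added, signal))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept / slope


# Hypothetical data: concentration added to the sample and measured absorbance.
added = [0.0, 1.0, 2.0, 3.0]
signal = [0.10, 0.15, 0.20, 0.25]
estimate = standard_addition(added, signal)
```

When an interferent contributes to the signal at the same wavelength, this single-line extrapolation becomes biased, which is the situation the multi-component approach in the abstract addresses.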
Bechmann, I.E. (1997).
Second-order data by flow injection analysis with
spectrophotometric diode-array detection and
incorporated gel-filtration chromatographic column.
Talanta, 44, 585-591.
A flow injection analysis (FIA) system furnished
with a gel-filtration chromatographic column and
with photodiode-array detection was used for the
generation of second-order data. The system
presented is a model system in which the analytes
are blue dextran, potassium hexacyanoferrate(III)
and heparin. It is shown that the rank of the
involved sample data matrices corresponds to the
number of chemical components present in the
sample. The PARAFAC (parallel factor analysis)
algorithm combined with multiple linear regression
and the tri-PLS (tri-linear partial least-squares
regression), which allows unknown substances to be
present in the sample, are implemented for FIA
systems and it is illustrated how these three-way
algorithms can handle spectral interferents. The
prediction ability of the two methods for pure
two-component samples and also the prediction
ability in the presence of unknown interferents
are satisfactory. However, the predictions
obtained by tri-PLS are slightly better than those
obtained using the PARAFAC regression algorithm.
Beckmann, C. F., & Smith, S. M. (2005).
Tensorial extensions of independent component analysis for multisubject FMRI analysis.
NeuroImage, 25, 294-311.
We discuss model-free analysis of multisubject or multisession FMRI
data by extending the single-session probabilistic independent component
analysis model (PICA; Beckmann and Smith, 2004. IEEE Trans.
on Medical Imaging, 23 (2) 137-152) to higher dimensions. This results
in a three-way decomposition that represents the different signals and
artefacts present in the data in terms of their temporal, spatial, and
subject-dependent variations. The technique is derived from and
compared with parallel factor analysis (PARAFAC; Harshman and
Lundy, 1984. In Research methods for multimode data analysis,
chapter 5, pages 122-215. Praeger, New York). Using simulated data
as well as data from multisession and multisubject FMRI studies we
demonstrate that the tensor PICA approach is able to efficiently and
accurately extract signals of interest in the spatial, temporal, and
subject/session domain. The final decompositions improve upon
PARAFAC results in terms of greater accuracy, reduced interference
between the different estimated sources (reduced cross-talk), robustness
(against deviations of the data from modeling assumptions and
against overfitting), and computational speed. On real FMRI
'activation' data, the tensor PICA approach is able to extract plausible
activation maps, time courses, and session/subject modes as well as
provide a rich description of additional processes of interest such as
image artefacts or secondary activation patterns. The resulting data
decomposition gives simple and useful representations of multisubject/
multisession FMRI data that can aid the interpretation and optimization
of group FMRI studies beyond what can be achieved using model-based
Bécue-Bertaut, M. (1992).
In A. Rizzi, M. Vichi, & H.-H. Bock (Eds.) Advances in data
classification (pp. 457-464). Berlin: Springer.
Methods are presented for analysing open-ended answers in surveys having the form of
three-way arrays, such as comparing answers given to the same questions in different surveys,
and examining individuals as described by their answers to several open questions. In essence
the methods come down to replicate use of standard correspondence analysis of different
varieties of tables constructed from the original three-way array.
Software routines have been developed.
Bécue-Bertaut, M., & Pagès, J. (2004).
A principal axes method for comparing contingency tables: MFACT.
Computational Statistics & Data Analysis, 45, 481-503.
A new methodology is introduced for comparing the structures of several contingency tables.
The latter, built up from different samples or populations, present the same rows and different
columns (or vice versa). This methodology combines some aspects of principal axes methods
(global maximum dispersion axes), canonical correlation techniques (canonical dispersion axes)
and Procrustes analysis (superimposed representations) but takes into account the particularities
of contingency tables in order to extend correspondence analysis to multiple contingency tables.
Two main problems arise: the differences between the margins of the common dimension and the
need for balancing the influence of the different tables in global processing. A study of the four
structures induced on Spanish regions by mortality causes (by gender) and by age distribution
(by gender), in conjunction, will illustrate the methodology.
Beffy, J.L. (1992).
Application de l'analyse en composantes principales à
trois modes pour l'étude physico-chimique d'un
écosystème lacustre d'altitude:
Perspectives en écologie. [Application of three-mode principal
component analysis for a chemo-physical study of the ecosystem of a
high altitude lake: Perspectives in ecology]. Revue Statistique
Appliquée, 40 (1), 37-56.
Physical and chemical parameters were measured by CHACORNAC (1986) in
an oligotrophic high-mountain lake. The presence of a thick ice-cover
during eight months involves a complex spatio-temporal evolution of the parameters.
The application of three-mode principal component analysis brings to the fore
the interaction between spatial and temporal patterns of these parameters.
The results obtained on this particular case allow to discuss a more intensive use in
Begin, M., Roff, D. A., & Debat, V. (2004).
The effect of temperature and wing morphology on quantitative genetic variation in the cricket
Gryllus firmus, with an appendix examining the statistical properties of the Jackknife-MANOVA
method of matrix comparison.
Journal of Evolutionary Biology, 17, 1255-1267.
We investigated the effect of temperature and wing morphology on the quantitative genetic variances
and covariances of five size-related traits in the sand cricket, Gryllus firmus. Micropterous and
macropterous crickets were reared in the laboratory at 24, 28 and 32 °C. Quantitative genetic
parameters were estimated using a nested full-sib family design, and (co)variance matrices were compared
using the T method, Flury hierarchy and Jackknife-MANOVA method. The results revealed that the mean
phenotypic value of each trait varied significantly among temperatures and wing morphs, but temperature
reaction norms were not similar across all traits. Micropterous individuals were always smaller than
macropterous individuals while expressing more phenotypic variation, a finding discussed in terms of
canalization and life-history trade-offs. We observed little variation between the matrices of among-family
(co)variation corresponding to each combination of temperature and wing morphology, with only one
matrix of six differing in structure from the others. The implications of this result are discussed with
respect to the prediction of evolutionary trajectories.
Beh, E. J., & Davy, P. J. (1998).
Partitioning Pearson's chi-squared statistic for a completely ordered three-way contingency table.
Journal of Marketing Research, 11, 156-163.
The paper presents a partition of the Pearson chi-squared statistic
for triply ordered three-way contingency tables. The partition invokes orthogonal
polynomials and identifies three-way association terms as well as each combination
of two-way associations. This partition provides information about the structure of
each variable by identifying important bivariate and trivariate associations in terms
of location (linear), dispersion (quadratic) and higher order components. The
significance of each term in the partition, and each association within each term can
also be determined. The paper compares the chi-squared partition with the log-linear models
of Agresti (1994) for multi-way contingency tables with ordinal categories, by generalizing
the model proposed by Haberman (1974).
Belk, R.W. (1974).
An exploratory assessment of situational effects in
buyer behavior. Journal of Marketing Research,
The variance in selected purchase decisions was explored as a
function of consumption and purchase context. T3 was used for
data from 100 subjects in 10 situations with 10 snack
products, and in 9 situations with 11 meat products. Solutions
were obtained via Tucker's Method III. Both situation and
product spaces were varimax rotated. The same data were also
analysed with a three-way mixed effects analysis of variance
Belk, R.W. (1979).
Gift-giving behavior. In J.N. Sheth (Ed.),
marketing, Vol. 2 (pp. 95-126). Greenwich, CT: JAI Press Inc.
As part of a larger study 12 characteristics in each of 15
gift-giving situations were rated by 219 respondents. The
components were varimax rotated, and the two person components
were analysed using the core matrix. One component matrix and the
core matrix are presented in detail.
Bell, T. S., Dirks, D. D., & Carterette, E. C. (1989).
Interactive factors in consonant confusion patterns. Journal of the Acoustical Society of America, 85, 339-346.
Confusion patterns among English consonants were examined using log-linear modeling
techniques to assess the influence of low-pass filtering, shaped noise, presentation level, and
consonant position. Ten normal-hearing listeners were presented consonant-vowel (C-V) and
vowel-consonant (V-C) syllables containing the vowel /a/. Stimuli were presented in quiet and
in noise, and were either filtered or broadband. The noise was shaped such that the effective
signal level in each 1/3-octave band was equivalent in quiet and noise listening conditions. Three
presentation levels were analyzed corresponding to the overall rms level of the combined
speech stimuli. Error patterns were affected significantly by presentation level, filtering, and
consonant position as a complex interaction. The effect of filtering was dependent on
presentation level and consonant position. The effects stemming from the noise were less
pronounced. Specific confusions responsible for these effects were isolated, and an acoustical
interaction is suggested stressing the spectral characteristics of the signals and their
modification by presentation level and filtering.
Beltrán, J. L., Ferrer, R., & Guiteras, J. (1998a).
Parallel factor analysis of partially resolved chromatographic data. Determination
of polycyclic aromatic hydrocarbons in water samples. Journal of Chromatography A, 802, 263-275.
A procedure, based on parallel factor analysis (PARAFAC), has been used for the analysis of polycyclic aromatic
hydrocarbons in water samples. The chromatographic system has been set to obtain short-time chromatograms containing
several unresolved peaks. The detection system consisted of a fast-scanning fluorescence spectra detector, which allowed
three-dimensional data – where retention time, emission wavelengths and fluorescence intensity were represented – to be
obtained. The procedure has been applied to spiked tap water samples with good results.
Beltrán, J. L., Ferrer, R., & Guiteras, J. (1998b).
Multivariate calibration of polycyclic aromatic hydrocarbon mixtures from
excitation-emission fluorescence spectra. Analytica Chimica Acta, 373, 311-319.
The excitation–emission fluorescence spectra (EEM) of mixtures of 10 polycyclic aromatic hydrocarbons
(PAHs) have been analyzed using different multivariate calibration procedures (partial least squares regression, PLSR; and
parallel factor analysis, PARAFAC). The compounds studied were anthracene, benz[a]anthracene, benzo[a]pyrene,
chrysene, phenanthrene, fluoranthene, fluorene, naphthalene, perylene and pyrene.
Beltrán, J. L., Guiteras, J., & Ferrer, R. (1998c).
Three-way multivariate calibration procedures applied to high-performance liquid chromatography
coupled with fast-scanning fluorescence spectrometry detection. Determination of polycyclic aromatic
hydrocarbons in water samples.
Analytical Chemistry, 70, 1949-1955.
Three-way partial least-squares and n factor parallel factor analysis have been
compared for the analysis of polycyclic aromatic hydrocarbons in water samples. Data were obtained
with a chromatographic system set to record short-time chromatograms containing several unresolved
peaks. The detection system consisted of a fast-scanning fluorescence spectra detector, which allows
one to obtain three-dimensional data, where retention time, emission wavelengths, and fluorescence
intensity are represented. The combined use of a multivariate calibration method and the
three-dimensional data obtained from the HPLC-FSFS system allows resolution of closely eluting
compounds, thus making a complete separation unnecessary. The procedure has been applied to tap
water samples (spiked at 0.10 and 0.20 µg L-1 levels) with good results, similar to those obtained
with a HPLC system with a conventional fluorescence detector.
Bendtsen, A. B., Glarborg, P., & Dam-Johansen, K. (2001).
Visualization methods in analysis of detailed chemical kinetics modelling.
Computers & Chemistry, 25, 161-170.
Sensitivity analysis, principal component analysis of the
sensitivity matrix, and rate-of-production analysis are useful tools in
interpreting detailed chemical kinetics calculations. This paper deals with
the practical use and communication of sensitivity analysis and the related
methods. Some limitations of sensitivity analysis, originating
from the mathematical concept (e.g. first-order or brute force methods) or from
the software-specific implementation of the method, are discussed. As
supplementary tools to the current methods, three novel visual tools for
analysis of detailed chemical kinetics mechanisms are introduced: (a) scaled
sensitivity analysis which is especially suited for studying initiation
reactions where the span of reaction rates is high; (b) automated generation
of reaction pathway plots which provides an immediate graphical illustration
of the chemical processes occurring; (c) explorative (or chemometric) analysis
of accumulated rate of progress matrices which assist in the identification of
reaction subsets. The application of these tools is demonstrated by analysing
NOx-enhanced oxidation of methane at 700-1200 K.
Benito, M., & Peña, D. (2005).
A fast approach for dimensionality reduction with image data.
Pattern Recognition, 38, 2400-2408.
An important objective in image analysis is dimensionality reduction. The most often used data-exploratory technique with
this objective is principal component analysis, which performs a singular value decomposition on a data matrix of vectorized
images. When considering array data or a tensor instead of a matrix, the high-order generalization of PCA for computing
principal components offers multiple ways to decompose tensors orthogonally. As an alternative, we propose a new method
based on the projection of the images as matrices and show that it leads to a better reconstruction of images than previous
Bennani Dosse, M. (1995).
Positionnement multidimensionnel d'un tableau à 3 voies.
Revue Statistique Appliquée, 43(4), 63-75.
We propose a new Multidimensional scaling method which represents 3-way
tables where all of the three ways are of equal consideration from the point of exploratory
data analysis and whose elements are interpreted as measures of proximity. The treatment of
real data is performed to illustrate the proposed methods.
Benninghoff, L., von Czarnowski, D., Denkhauw, E., & Lemke, K. (1997).
Analysis of human tissues by total reflection X-ray fluorescence: application of
chemometrics for diagnostic cancer recognition.
Spectrochimica Acta Part B Atomic Spectroscopy, 52, 1039-1046.
For the determination of trace element distributions of more than 20
elements in malignant and normal tissues of the human colon, tissue samples (approx. 400
mg wet weight) were digested with 3 ml of nitric acid (sub boiled quality) by use of an
autoclave system. The accuracy of measurements has been investigated by using certified
materials. The analytical results were evaluated by using a spreadsheet program to give
an overview of the element distribution in cancerous samples and in normal colon tissues.
A further application, cluster analysis of the analytical results, was introduced to
demonstrate the possibility of classification for cancer diagnosis. To confirm the
results of cluster analysis, multivariate three-way principal component analysis was
performed. Additionally, microtome frozen sections (10 µm) were prepared from the same
tissue samples to compare the analytical results, i.e. the mass fractions of elements,
according to the preparation method and to exclude systematic errors depending on the
inhomogeneity of the tissues.
Bentler, P.M. & Lee, S.-Y. (1978).
Statistical aspects of a
factor analysis model. Psychometrika, 43, 343-352.
A special case of Bloxom's version (1968) of T3 is developed
statistically. A distinction is made between fixed and random
modes. Parameter matrices are associated with the fixed modes,
while no parameters are associated with the mode representing
random observation vectors. Estimation by a weighted least
squares method based upon Gauss-Newton. Example based upon
self-report and peer-report measures (see also B. & L.,
Bentler, P.M. & Lee, S.-Y. (1979).
A statistical development of
three-mode factor analysis. British Journal of Mathematical
& Statistical Psychology, 32,
B & L consider a factor analytic random vector version of T3.
The parameters of the model are associated with two fixed
modes and the covariance matrix of the random vectors. Their
approach brings three-mode FA into the realm of structural
equation models. Their model does not treat all three modes
symmetrically as Tucker (1966) and Kroonenberg & De Leeuw
(1980) do. With B & L's model a confirmatory approach to T3 is
possible, and standard errors and a goodness-of-fit statistic
become available. B & L's model has some similarity to the
treatment of T3 by Tucker (1966) through his method III. The
model is illustrated by a multi-trait multi-method matrix
Bentler, P.M., Poon, W.-Y., & Lee, S.-Y. (1988).
Generalized multimode latent variable models: Implementation by
standard programs. Computational Statistics & Data Analysis
Three-mode models in factor analysis have not been used very frequently due in part to their
mathematical, statistical, and computational complexity. It is shown that
computer programs such as LISREL and EQS can be used to estimate and test such models. The
models are generalized to permit more complex measurement structures, as well as to allow
linear structural regressions among the latent variables. These generalized multimode models
can be similarly easily computationally implemented. An example is used to
Bentler, P.M., & Weeks, D.G. (1979).
Interrelations among models for the analysis of moment structures
Research, 14, 169-186.
Factor analysis in several populations, covariance structure models,
factor analysis, structural equation systems with measurement model, and
analysis of covariance with measurement model are all shown to be
specializations of a general moment structure model by Bentler. Some
new structured linear models are also described that
may be considered either generalizations or special cases of existing
Bernstein, A.L. & Wicker, F.W. (1969).
A three-mode factor
the concept of novelty. Psychonomic Science, 14, 291-
A rather simplistic inquiry into the concept of 'novelty'
using the unscaled scores of 30 students on an 18 item
semantic differential type scale with 10 realistic and
unrealistic animals. T3 on cross-products. No serious
Berrueta, L.A., Fernandez, L.A., & Vicente, F. (1991).
Fluorim: A computer program for the automated
data collection and treatment using commercial
Computers & Chemistry, 15, 307-312.
A general purpose program that uses a PC microcomputer has
been designed for the control of experiments in
fluorimetry. This program allows the digital collection of
the main types of scans that a commercial
spectrofluorimeter can perform: emission, excitation and
synchronic; as well as the measurement of the fluorescent
intensity as a function of time. The program also allows a
series of operations with previously stored spectra. These
operations include screen and paper plots of the spectra,
calculation of linear combinations of them and the nine
first derivatives, as well as their relative maxima and
Bertero, H. D., de la Vega, A. J., Correa, G., Jacobsen, S. E., & Mujica, A. (2004).
Genotype and genotype-by-environment interaction effects for grain yield and grain
size of quinoa (Chenopodium quinoa Willd.) as revealed by pattern analysis of
international multi-environment trials.
Field Crops Research, 89, 299-318.
The size and nature of the genotype (G) and genotype-by-environment (GxE)
interaction effects for grain yield, its physiological determinants, and grain size
exhibited by the Andean grain crop quinoa at low latitudes were examined in a
multi-environment trial involving a diverse set of 24 cultivars tested in 14 sites
under irrigation across three continents. These environments included a wide latitudinal
(from 21°30'N to 16°21'S), altitudinal (from 5 to 3841 m a.s.l.) and temperature
(average daily temperatures during crop cycle varied from 9 to 22.1 °C) range; while
average daily photoperiods exhibited a smaller variation, from 11.2 to 12.8 h. The GxE
interaction to G component of variance ratio was 4:1 and 1:1 for grain yield and
grain size, respectively. Two-mode pattern analysis of the environment-standardised
matrix of grain yield revealed four genotypic groups of different response pattern
across environments. This clustering, which separates cultivars from mid-altitude
valleys of the northern Andes, northern altiplano, southern altiplano and sea level,
showed a close correspondence with adaptation groups previously proposed. The results
of the genotype clustering can be used to choose genotypes of contrasting relative
performance across environments for further studies aimed at assessing the opportunity
to select for broad or specific adaptation. Classification of sites for grain yield
grossly discriminated between cold highland sites, tropical valleys of moderate
altitude, and warmer, low altitude sites. As expected from the size of the GxE interaction
component, no single genotype group showed consistently superior grain yield across all
environment groups. The G and GxE interaction effects observed for the duration of the
crop cycle had a major influence on the average cultivar performance and on the form of
GxE interactions observed for total above-ground biomass and grain yield. Although
different environment types showed contrasting effects on the physiological attributes
underlying grain yield variation among cultivars, it was observed that good average
performance and broad adaptation could come from the combination of medium–late maturity
and high harvest index. Correlation analysis revealed no association between the average
cultivar responses for grain yield and grain size. Three-mode pattern analysis has also
shown no association between the GxE interaction effects for both traits. Both observations
indicate that simultaneous progress for grain yield and grain size can be expected from
Beylkin, G., & Mohlenkamp, M. J. (2005).
Algorithms for numerical analysis in high dimensions.
SIAM Journal on Scientific Computing, 26, 2133-2159.
Nearly every numerical analysis algorithm has computational complexity that scales
exponentially in the underlying physical dimension. The separated representation, introduced previously,
allows many operations to be performed with scaling that is formally linear in the dimension.
In this paper we further develop this representation by
(i) discussing the variety of mechanisms that allow it to be surprisingly efficient;
(ii) addressing the issue of conditioning;
(iii) presenting algorithms for solving linear systems within this framework; and
(iv) demonstrating methods for dealing with antisymmetric functions, as arise in the multiparticle
Schrödinger equation in quantum mechanics.
Numerical examples are given.
Bezemer, E., & Rutan, S. C. (2001a).
Multivariate curve resolution with non-linear fitting of kinetic profiles.
Chemometrics and Intelligent Laboratory Systems, 59, 19-31.
This paper describes the incorporation of a hard modeling step based on a kinetic
model, into a soft modeling multivariate curve resolution technique. The soft modeling technique
allows for the determination of the retention and spectral profiles from overlapped components
while the hard modeling step allows for the simultaneous prediction of the rate constants
of the various steps in the reaction pathway. The program uses standard MATLAB functions for
determining the solutions of the differential equations as well as for finding the optimal
rate constants to describe the kinetic profiles. The kinetic model is entered by a set of
command line parameters and can describe any order chemical reaction with multiple reaction
pathways. This paper uses simulated first- and second-order reaction data as well as real
data to characterize the performance of the program. The algorithm is able to resolve overlapped
retention and spectral profiles and predict the rate constants for the reaction.
Bezemer, E., & Rutan, S. C. (2001b).
Study of the Hydrolysis of a Sulfonylurea Herbicide Using Liquid Chromatography with Diode Array
Detection and Mass Spectrometry by Three-Way Multivariate Curve Resolution-Alternating Least Squares.
Analytical Chemistry, 73, 4403-4409.
This research is focused on the development of a novel,
automated chemometric method for obtaining relevant
chemical information from time-course measurements of
an evolving chemical system. This paper describes an
investigation of the hydrolysis of Ally, which is a sulfonylurea
herbicide. The hydrolysis of this compound is
observed at different pHs and temperatures by reversed-phase
liquid chromatography using a diode array detector.
The data are analyzed using a three-way, multivariate
curve resolution technique. Of special interest was the
application of a closure constraint in the kinetic dimension
followed by the determination of the rate constants
for each step of the pathway by using a differential
equation solver and nonlinear fitting of the data.
Bezemer, E., & Rutan, S.C. (2002).
Three-way alternating least squares using three-dimensional tensors in MATLAB.
Chemometrics and Intelligent Laboratory Systems, 60, 239-251.
This paper describes an improved three-way alternating least-squares multivariate
curve resolution algorithm that makes use of the recently introduced multi-dimensional
arrays of MATLAB. Multi-dimensional arrays allow for a convenient way to apply
chemically sound constraints, such as closure, in the third dimension. The program
is designed for kinetic studies on liquid chromatography with diode array
detection but can be used for other three-way data analysis. The program is
tested with a large number of synthetic data sets and its flexibility is demonstrated,
especially when non-trilinear data sets are fit. In this case, the algorithm finds
a solution with a better fit than direct trilinear decomposition (DTD). When
trilinear data are used, the optimal fit is not as good as when a direct
decomposition method is used. Most real data sets, however, have some degree
of non-trilinearity. This makes this method a better choice to analyze non-trilinear,
three-way data than direct trilinear decomposition.
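As a rough illustration of what a closure constraint on concentration profiles means, the sketch below rescales each row of a concentration matrix so that the component concentrations sum to a fixed total. It assumes the common mass-balance form of closure and applies it to a two-way matrix for simplicity; it is written in Python rather than MATLAB, and the function name `apply_closure` and the data are hypothetical, not taken from the program described in the paper.

```python
def apply_closure(conc, total=1.0):
    """Rescale each row so that its concentrations sum to `total`.

    `conc` is a list of rows (one per time point), each holding the
    current concentration estimates of all components. Rows that sum
    to zero are returned unchanged to avoid division by zero.
    """
    closed = []
    for row in conc:
        row_sum = sum(row)
        if row_sum == 0:
            closed.append(list(row))
        else:
            closed.append([total * c / row_sum for c in row])
    return closed


# Hypothetical intermediate estimates from one alternating least-squares step.
profiles = [
    [0.9, 0.2, 0.0],
    [0.5, 0.4, 0.3],
    [0.1, 0.3, 0.8],
]
closed = apply_closure(profiles, total=1.0)
```

In an alternating least-squares loop a step of this kind would typically be applied to the concentration estimates after each least-squares update, before the spectral profiles are re-estimated.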
Bezemer, E., & Rutan, S. C. (2003).
Evaluation of synthetic liquid chromatography-diode array detection-mass spectrometry
data for the determination of enzyme kinetics.
Analytica Chimica Acta, 490, 17-29.
In this paper, we investigate the accuracy and precision of the results from diode
array detector (DAD) data and mass spectrometry (MS) data as obtained subsequent to
chromatographic separations using computer simulations. Special attention was given
to simulations of multiple injections from a developing enzymatic reaction. These
simulations result in three-way LC-DAD-MS kinetic data; LC-DAD and LC-MS data were
also evaluated independently in this investigation. The noise characteristics of
the MS detector prevent accurate determination of the individual reaction rate constants
by the analysis method. Using the data from the DAD in combination with the MS detector
results in improved estimation of the rate constants. The results also indicate that
the higher resolving power of the MS information compensates for the lower signal-to-noise
ratio in these data, compared to DAD data.
Bezemer, E., & Rutan, S. C. (2006).
Analysis of three- and four-way data using multivariate curve resolution-alternating least squares with global multi-way kinetic fitting.
Chemometrics and Intelligent Laboratory Systems, 81, 82-93.
This paper demonstrates a novel implementation of an alternating least squares (ALS) algorithm for resolving three- and four-way data.
Computer-simulated multi-way data are studied as well as the multi-way data obtained from typical kinetic experiments observed using liquid
chromatography with diode array detection (LC-DAD) and UV–visible spectroscopy. Each data set is analyzed using this new multi-way ALS
algorithm, not only providing estimates of the spectral profiles (and retention profiles in the case of LC-DAD measurements) for each of the
components involved, but also simultaneously estimating the rate constants for the reaction steps at different experimental conditions using a
global kinetic analysis. However, when the reaction conditions do not require that all the rate constants are identical for each experiment, as is the
case when the reactions are observed at different temperatures, the data analysis still benefits from the common information present in the data,
such as spectral and retention profiles, as well as a common reaction mechanism.
Bharati, M. H., & MacGregor, J. F. (1998).
Multivariate image analysis for real-time process monitoring and control.
Industrial & Engineering Chemistry Research, 37, 4715-4724.
Information from on-line imaging sensors has great potential for the
monitoring and control of spatially distributed systems. The major difficulty lies in
the efficient extraction of information from the images in real-time, information such
as the frequencies of occurrence of specific features and their locations in the process
or product space. This paper uses multivariate image analysis (MIA) methods based on
multiway principal component analysis to decompose the highly correlated data present
in multispectral images. The frequencies of occurrence of certain features in the
image, regardless of their spatial locations, can be easily monitored in the space of
the principal components (PC). The spatial locations of these features in the original
image space can then be obtained by transposing highlighted pixels from the PC space
into the original image space. In this manner it is possible to easily detect and
locate (even very subtle) features from real-time imaging sensors for the purpose of
performing statistical process control or feedback control of spatial processes. Due
to the current lack of availability of such multispectral sensors in industrial processes,
the concepts and potential of this approach are illustrated using a sequence of multispectral
images obtained from a LANDSAT satellite, as it passes over a certain geographical
region of the earth's surface.
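The unfold, decompose, and map-back steps described in this abstract can be illustrated with a minimal sketch. It assumes NumPy, a small synthetic multispectral image, and a plain threshold on the first principal-component score; the function name mia_feature_mask, the threshold value, and the test image are hypothetical and are not taken from the paper.

```python
import numpy as np

def mia_feature_mask(image, threshold):
    """Flag pixels whose first-PC score lies far from the image average.

    image: array of shape (rows, cols, channels).
    Returns a boolean mask of shape (rows, cols).
    """
    rows, cols, channels = image.shape
    # Unfold the three-way image into a (pixels x channels) matrix.
    X = image.reshape(rows * cols, channels)
    Xc = X - X.mean(axis=0)
    # Principal components via SVD of the mean-centred matrix.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    t1 = Xc @ Vt[0]
    # Select pixels in score space, then fold the selection back into
    # image space. abs() removes the arbitrary sign of the component.
    return (np.abs(t1) > threshold).reshape(rows, cols)

# Hypothetical 20 x 20 image with 4 channels and one small spectral feature.
rng = np.random.default_rng(0)
image = np.ones((20, 20, 4)) + 0.01 * rng.standard_normal((20, 20, 4))
image[5:8, 10:14] += np.array([0.0, 2.0, 0.0, -0.5])
mask = mia_feature_mask(image, threshold=1.0)
```

The mask is computed without using pixel positions, so the feature is first detected in score space and only afterwards located in the image, which mirrors the two stages the abstract describes.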
Bhattacharya, P., & Mukherjee, N.P. (1994).
On the representation of uncertain information by multidimensional
arrays. IEEE Transactions on Systems, Man, and Cybernetics,
A multidimensional approach is introduced to the representation of
information in conjunction with the Dempster-Shafer theory. A
array, called a transition array, is defined, which stores the joint
the occurrences of a set of variables taking values in different sets.
array, it is shown how to compute the information regarding the
occurrences of the variables as certain matrix products.
Bhonske, J. B., Wang, Z., Tamamura, H., Fujii, N., Peiper, S. C., & Trent, J. O. (2005).
A simple, automated quasi-4D-QSAR, quasi-multi way PLS approach to develop
highly predictive QSAR models for highly flexible CXCR4 inhibitor cyclic pentapeptide ligands using scripted
common molecular modeling tools.
QSAR & Combinatorial Science, 24, 620-630.
A methodology for developing highly predictive (r² > 0.9) 3D-QSAR models (q² > 0.7)
based on sixteen flexible CXCR4 cyclic pentapeptide inhibitors is reported. The effective
automated use of common molecular modeling tools such as Macromodel and Sybyl is
demonstrated. The recently developed multi-way Partial Least Square (PLS) approach for
discovering the bioactive conformers and alignment was used in a quasi-multi-way PLS
approach. Twenty-five conformers for each compound were generated by Monte Carlo
conformational searches and alignments (seventy five in total) were based on the
templates from the three most active compound conformers. These were aligned in Sybyl
Molecular Databases and Sybyl Molecular Spreadsheets. All repetitive tasks were
automated by use of simple Unix shell, python and Sybyl Programming Language (SPL)
scripts. This efficient protocol furnished three 3D-QSAR models with q² values of 0.714,
0.734 and 0.657 and predictive r² values of 0.951, 0.990, and 0.956 respectively. The best
3D-QSAR model predicted the biological activities of nine test compounds from all
activity ranges within 0.5 log units.
Bieber, S. L. (1986).
A hierarchical approach to multigroup factorial invariance.
Journal of Classification, 3, 113-134.
A procedure is presented which permits the analysis of factor analytic
problems in which several groups exist. The analysis incorporates a
hierarchical scheme of searching for factorial invariance and is an extension
of Meredith's (1964) Method One procedure. By overlaying a contextual
frame of reference on a traditional factor analysis solution, it is possible to
use this technique to examine structural similarity and dissimilarity between
groups. The procedure is exhibited in an example and in addition a comparison
is made to discriminant analysis.
Bijlsma, S. (2000).
Estimating Rate Constants of Chemical Reactions using Spectroscopy.
Unpublished doctoral thesis, University of Amsterdam, The Netherlands.
General introduction; Theory of two-way methods; theory of three-way methods;
quality assessment of reaction rate constant estimates; description of datasets
and experimental set-up; applications of two-way methods; applications of three-
way methods; comparison between two-way and three-way methods; the use of
constraints in classical curve resolution; general conclusions and future
Bijlsma, S., Boelens, H. F. M., & Smilde, A. K. (2001).
Determination of rate constants in second-order kinetics using UV-visible
Applied Spectroscopy, 55, 77-83.
A general method for estimating reaction rate constants of
chemical reactions using ultraviolet-visible (UV-vis) spectroscopy is presented.
The only requirement is that some of the chemical components involved be
spectroscopically active. The method uses the combination of spectroscopic
measurements and techniques from numerical mathematics and chemometrics.
Therefore, the method can be used in cases where a large spectral overlap of
the individual reacting absorbing species is present. No knowledge about molar
absorbances of individual reacting absorbing species is required for
quantification. The reaction rate constants and the individual spectra of the
reacting absorbing species of the two-step consecutive reaction of
3-chlorophenylhydrazonopropane dinitrile with 2-mercaptoethanol were estimated
simultaneously from UV-vis recorded spectra in time. The results obtained were
Bijlsma, S., Boelens, H. F. M., Hoefsloot, H. C. J., & Smilde, A. K. (2002).
Constrained least squares methods for estimating reaction rate constants
from spectroscopic data.
Journal of Chemometrics, 16, 28-40.
Model errors, experimental errors and instrumental noise influence
the accuracy of reaction rate constant estimates obtained from spectral data recorded
in time during a chemical reaction. In order to improve the accuracy, which can be
divided into the precision and bias of reaction rate constant estimates, constraints
can be used within the estimation procedure. The impact of different constraints on the
accuracy of reaction rate constant estimates has been investigated using classical curve
resolution (CCR). Different types of constraints can be used in CCR. For example, if pure
spectra of reacting absorbing species are known in advance, this knowledge can be used
explicitly. Also, the fact that pure spectra of reacting absorbing species are non-negative
is a constraint that can be used in CCR. Experimental data have been obtained from UV-vis
spectra taken in time of a biochemical reaction. From the experimental data, reaction rate
constants and pure spectra were estimated with and without implementation of constraints in
CCR. Because only the precision of reaction rate constant estimates could be investigated
using the experimental data, simulations were set up that were similar to the experimental
data in order to additionally investigate the bias of reaction rate constant estimates. From
the results of the simulated data it is concluded that the use of constraints does not result
self-evidently in an improvement in the accuracy of rate constant estimates. Guidelines for
using constraints are given.
Bijlsma, S., Louwerse, D.J., & Smilde, A.K. (1999).
constants and pure UV-vis spectra of a two-step reaction using trilinear models.
Journal of Chemometrics, 13, 311-329.
This paper describes the estimation of reaction rate constants and pure species
UV-vis spectra of the consecutive reaction of 3-chlorophenylhydrazonopropane
dinitrile with 2-mercaptoethanol. The reaction rate constants were estimated
from the UV-vis measurements of the reacting system using the generalized rank
annihilation method (GRAM) and the Levenberg-Marquardt/PARAFAC (LM-PAR)
algorithm. Both algorithms can be applied in cases where the contribution of
different species in the mixture spectra is of exponentially decaying character.
From a single two-way array, two two-way data sets are formed by means of
splitting such that there is a constant time lag between the two two-way data
sets. By stacking these two two-way data sets, the reaction rate constants can
be estimated very easily from the third dimension. GRAM, which is fast and non-
iterative, decomposes the trilinear structure using a generalized eigenvalue
problem (GEP). The iterative algorithm LM-PAR consists of a combination of the
Levenberg-Marquardt algorithm and alternating least squares steps of the PARAFAC
model using GRAM results as a set of initial starting values. Pure spectra of
the absorbing species were estimated and compared with their measured pure
spectra. LM-PAR performed the best, giving the lowest relative fit error.
However, the relative fit error obtained with GRAM was acceptable. Since a lot
of measurements are based on exponentially decaying functions, GRAM and LM-PAR
can have many applications in chemistry.
Bijlsma, S., Louwerse, D.J., Windig, W., & Smilde, A.K. (1998).
estimation of rate constants using on-line SW-NIR and trilinear models.
Analytica Chimica Acta, 376, 339-355.
In this paper, two algorithms are presented to estimate reaction rate constants
from on-line short-wavelength near-infrared (SW-NIR) measurements. These can be
applied in cases where the contribution of the different species in the mixture
spectra is of exponentially decaying character. From a single two-dimensional
data set two two-way data sets are formed by splitting the original data set
such that there is a constant time lag between the two two-way data sets. Next,
a trilinear structure is formed by stacking these two two-way data sets into a
three-way array. In the first algorithm, based on the generalized rank
annihilation method (GRAM), the trilinear structure is decomposed by solving a
generalized eigenvalue problem (GEP). Because GRAM is sensitive to noise it
leads to rough estimations of reaction rate constants. The second algorithm (LM-
PAR) is an iterative algorithm, which consists of a combination of the
Levenberg-Marquardt algorithm and alternating least squares steps of the
parallel factor analysis (PARAFAC) model using the GRAM results as initial
values. Simulations and an application to a real data set showed that both
algorithms can be applied to estimate reaction rate constants in case of extreme
spectral overlap of different species involved in the reacting system.
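The time-lag splitting idea in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it assumes NumPy, noise-free simulated data in which each component's contribution decays as a single exponential, and a GRAM-style reduction to an ordinary eigenproblem in the SVD subspace of the first slab. The function name gram_rate_constants and all numerical values are hypothetical.

```python
import numpy as np

def gram_rate_constants(X, lag, dt, n_components):
    """Estimate decay rate constants from a (time x wavelength) matrix.

    The matrix is split into two slabs separated by a constant time lag.
    If X1 = C S^T, then X2 = C L S^T with L = diag(exp(-k * lag * dt)),
    so the eigenvalues of the projected problem give exp(-k * lag * dt).
    """
    X1, X2 = X[:-lag], X[lag:]
    U, s, Vt = np.linalg.svd(X1, full_matrices=False)
    U, s, Vt = U[:, :n_components], s[:n_components], Vt[:n_components]
    # Project X2 onto the subspace of X1; dividing by s scales the columns.
    M = U.T @ X2 @ Vt.T / s
    eigvals = np.linalg.eigvals(M)
    return np.sort(-np.log(eigvals.real) / (lag * dt))

# Hypothetical simulation: two overlapping Gaussian bands, two decays.
dt = 0.5
t = np.arange(60) * dt
k_true = np.array([0.05, 0.3])
C = np.exp(-np.outer(t, k_true))
wl = np.linspace(0.0, 1.0, 40)
S = np.stack([np.exp(-((wl - 0.4) / 0.15) ** 2),
              np.exp(-((wl - 0.6) / 0.15) ** 2)], axis=1)
X = C @ S.T
k_est = gram_rate_constants(X, lag=5, dt=dt, n_components=2)
```

With noise added, estimates from this non-iterative step would be rougher, which is why the abstract uses them only as starting values for the iterative LM-PAR algorithm.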
Bijlsma, S., & Smilde, A. K. (2000).
Estimating reaction rate constants from a two-step reaction: a
comparison between two-way and three-way methods.
Journal of Chemometrics, 14, 541-560.
In this paper, two different spectral datasets are used in order to estimate reaction rate constants using different
algorithms. Dataset 1 consists of short-wavelength near-infrared (SW-NIR) spectra taken in time of the two-step
epoxidation of 2,5-di-tert-butyl-1,4-benzoquinone using tert-butyl hydroperoxide and Triton B catalyst. This
dataset showed moderate reproducibility. Dataset 2 consists of UV-VIS recorded spectra of the consecutive
reaction of 3-chlorophenylhydrazonopropane dinitrile with 2-mercaptoethanol. This dataset showed good
reproducibility. Two-way and three-way methods were used in order to estimate the reaction rate constants for
both datasets. For the SW-NIR dataset the lowest standard deviations for the reaction rate constants were
obtained with a two-way method. The lowest standard deviations for the reaction rate constant estimates for the
UV-VIS dataset were obtained with a two-way method which uses spectral information that is known in advance.
In this case the pure spectrum of two reacting absorbing species is known in advance and this information was
used by the two-way method. For one two-way method and a few three-way methods which do not use spectral
information that is known in advance, pure spectra of the reacting absorbing species of the UV-VIS dataset were
estimated which showed excellent agreement with the recorded pure spectra. The pure spectra of the reacting
absorbing species for the SW-NIR dataset were not estimated, because it was not possible to record the real pure
spectra of these species. For both spectral datasets, quality assessment has been performed using a jackknife
Bloxom, B. (1968).
A note on invariance in three-mode factor analysis.
Psychometrika, 33, 347-350.
B. proposes a 'true' factor analysis variant of T3, where the
derived factor scores, the scores of the subjects on the
combination variables and the errors are random variables
rather than matrices of parameters for a finite number of
individuals (see also Bentler & Lee, 1978, 1979). Conditions
for the invariance across subpopulations for the factor
pattern matrices, the core matrix and the residual covariance
matrix are discussed.
Tucker's three-mode factor analysis model. In H.G. Law, C.W.
Snyder Jr, J.A. Hattie & R.P. McDonald (Eds.),
Research methods for multimode data analysis
(pp. 104-121). New York: Praeger.
Bocci, L., Vicari, D., & Vichi, M. (2006).
A mixture model for the classification of three-way proximity data.
Computational Statistics & Data Analysis, 50, 1625-1654.
Large data sets organized into a three-way proximity array are generally difficult to comprehend
and specific techniques are necessary to extract relevant information.
The existing classification methodologies for dissimilarities between objects collected in different
occasions assume a unique common underlying classification structure. However, since the objects’
clustering structure often changes along the occasions, the use of a single classification to reconstruct
the taxonomic information frequently appears quite unrealistic.
The methodology proposed here models the dissimilarities in a likelihood framework. The goal is
to identify a (secondary) partition of the occasions in homogeneous classes and, simultaneously, a
(primary) consensus partition of the objects within each of such classes. Furthermore, a class-specific
dimensionality reduction operator is also included which allows to identify classes of occasions such
that the within-class variability is minimized.
The model is formalized as a finite mixture of multivariate normal distributions and solved by a
numerical method based on ECM strategy.
Influence functions and outlier detection under the common
principal components model: A robust approach.
Biometrika, 89, 861-875.
The common principal components model for several groups of multivariate observations
assumes equal principal axes but different variances along these axes among the
groups. Influence functions for plug-in and projection-pursuit estimates under a common
principal component model are obtained. Asymptotic variances are derived from them.
Outlier detection is possible using partial influence functions.
Boik, R.J. (1990).
A likelihood ratio test for three-mode singular values: Upper percentiles
and an application to three-way ANOVA. Computational Statistics
& Data Analysis, 10, 1-9.
This paper considers the rank-1 three-mode model for an
n2 × n3 × n1 matrix, Y. In vector form, the model is
y = (v1 ⊗ v2 ⊗ v3)z + e, where y = vec(Y),
vj is an nj × 1 vector of parameters, vj'vj = 1 for j = 1, 2, 3,
and e ~ N(0, σ²I). The likelihood ratio test of H0: z = 0 is
given and, employing a Jacobi polynomial expansion, upper percentiles of
the distribution of the test statistic are computed. As an illustration, the
results are applied
to the problem of testing additivity in unreplicated three-way
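The vector form of a rank-1 three-mode model, a Kronecker product of three unit-length vectors scaled by a single scalar, can be made concrete with a small sketch. It uses plain Python lists; the helper names, the example vectors, and the element ordering of the vectorised array (last index running fastest) are assumptions for illustration and need not match the paper's vec convention. The error term is omitted.

```python
def kron(a, b):
    """Kronecker product of two vectors given as lists."""
    return [x * y for x in a for y in b]

def normalise(v):
    """Scale a vector to unit Euclidean length."""
    norm = sum(x * x for x in v) ** 0.5
    return [x / norm for x in v]

def rank1_vec(v1, v2, v3, z):
    """Return the systematic part (v1 kron v2 kron v3) * z as a flat list."""
    return [z * w for w in kron(v1, kron(v2, v3))]

# Hypothetical unit-length parameter vectors and scale.
v1 = normalise([1.0, 2.0])
v2 = normalise([3.0, 4.0, 0.0])
v3 = normalise([1.0, 1.0])
z = 2.5
y = rank1_vec(v1, v2, v3, z)

# Element (i, j, k) of the 2 x 3 x 2 array sits at i * 6 + j * 2 + k.
i, j, k = 1, 1, 0
element = y[i * 6 + j * 2 + k]
```

Because each vector has unit length, the sum of squares of the systematic part equals the square of the scalar, so a zero scalar removes the whole rank-1 term.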
Boik, R.J., & Marasinghe, M.G. (1989).
Analysis of nonadditive multiway classifications. Journal of the
American Statistical Association, 84, 1059-1064.
This article considers the problems of testing additivity and
estimating σ² in
unreplicated multiway classifications. To model nonadditivity and jointly
σ², the interaction parameter space must be restricted;
otherwise the model is
saturated. The parameterization used is a multiway extension of the two-
interaction model of Mandel (1971) and Johnson and Graybill (1972a). An
exact test of ? = 0
is constructed and an estimator of σ² is proposed that can be
interaction has been detected. The test is an approximation to the
likelihood ratio test
(LRT) of H0: ? = 0. Selected percentiles of the null
distribution are given for three-way
classifications. For large ??, a transformation of the test statistic is
shown to be
approximately distributed as a noncentral F and can be used to compute
the power of the test.
The test and estimator are illustrated on a data set.
Bolton, B. (1988).
Multivariate approaches to human learning. In J.R. Nesselroade & R.B.
Handbook of multivariate experimental psychology. Perspectives on
differences., pp. 789-819. Plenum Press, New York, NY.
A comprehensive, mathematically precise learning
formulation, called structured learning theory (SLT), is outlined.
The relation between factor analysis and learning and the analysis
of generalized learning curves by factor analysis are discussed,
as well as the three-mode factor analysis of learning
Bolck, A., Smilde, A. K., & Bruins, C. H. P. (1999).
Monitoring aged reversed-phase high performance liquid chromatography columns.
Chemometrics and Intelligent Laboratory Systems, 46, 1-12.
In this paper, a new approach for the quality assessment of routinely used
reversed-phase high performance liquid chromatography columns is presented. A used column is
not directly considered deteriorated when changes in retention occur. If attention is paid to
the type and magnitude of the changes, columns often can still be used. Therefore, columns have
to be monitored at regular time points. This means that, in the first place, a few well chosen
measurements have to be done on the used column. With statistical techniques, Hotelling's T²
statistic in combination with three-way analysis, the type and magnitude of changes in retention
then can be detected. The type of changes can be divided in hydrophobicity changes, selectivity
changes and both hydrophobicity and selectivity changes. This paper describes the approach in theory,
completed with examples. At the end, a strategy for monitoring during routine use is proposed, which
is visualized in a monitoring scheme.
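Hotelling's statistic, mentioned in this abstract, measures how far a new observation lies from the mean of reference measurements, taking their covariance into account. The sketch below is restricted to two variables so that it needs only plain Python. The function name, the reference values, and the reading of the two variables as retention descriptors of a column are hypothetical; the three-way modelling step of the paper is not reproduced.

```python
def hotelling_t2(x, reference):
    """Hotelling's T-squared of a 2-variable observation x.

    reference: list of (a, b) pairs measured under normal conditions.
    Uses the sample covariance matrix with divisor n - 1.
    """
    n = len(reference)
    m0 = sum(r[0] for r in reference) / n
    m1 = sum(r[1] for r in reference) / n
    s00 = sum((r[0] - m0) ** 2 for r in reference) / (n - 1)
    s11 = sum((r[1] - m1) ** 2 for r in reference) / (n - 1)
    s01 = sum((r[0] - m0) * (r[1] - m1) for r in reference) / (n - 1)
    det = s00 * s11 - s01 * s01
    d0, d1 = x[0] - m0, x[1] - m1
    # Quadratic form d' S^{-1} d with the 2 x 2 inverse written out.
    return (s11 * d0 * d0 - 2.0 * s01 * d0 * d1 + s00 * d1 * d1) / det

# Hypothetical reference measurements of two positively correlated descriptors.
reference = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]
along = hotelling_t2((3.5, 3.5), reference)    # shift along the correlation
against = hotelling_t2((3.5, 1.5), reference)  # shift against the correlation
```

Both test points are equally far from the mean in Euclidean terms, but the one that breaks the correlation pattern gets the larger value. Separating such types of change is the kind of distinction the monitoring scheme relies on.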
Booksh, K. S. (around 1997).
Three-way calibration with hyphenated data.
Department of Chemistry and Biochemistry, Arizona State University.
Three-way calibration methods, such as the generalized rank annihilation method
(GRAM) and parallel factor analysis (PARAFAC), are becoming increasingly prevalent
tools to solve analytical challenges. The main advantage of three-way calibration is
estimation of analyte concentrations in the presence of unknown, uncalibrated spectral
interferents. These methods also permit the extraction of analyte, and often interferent,
spectral profiles from complex and uncharacterized mixtures. In this tutorial a theoretical
and practical overview throughout the progression of three-way calibration methods from
the simplest rank annihilation factor analysis (RAFA) to the more flexible PARAFAC is
presented. Extensions of many three-way methods are covered to highlight the paradigm's
flexibility to solve particular analytical calibration problems.
Booksh, K., Henshaw, J.M., Burgess, L.W., & Kowalski, B.R.
A second-order standard addition method with
application to calibration of a
kinetics-spectroscopic sensor for quantitation of
Journal of Chemometrics, 9, 263-282.
Presented here is an algorithm for analysis of
second order data by the method of standard
additions. The method of standard additions is
applicable when matrix effects make traditional
calibration unreliable. The algorithm employs a
generalized eigenproblem to mathematically
separate the instrument response of the analyte
from the instrument response of any interfering
species. A scheme for determining the eigenvectors
(and hence the concentration estimate) that
uniquely correspond to the analyte of interest is
given. These eigenvectors can readily be
distinguished from any eigenvector that
corresponds to the spectrum of the interferents or
both the interferents and analyte. The stability
of the estimated analyte concentration is verified
by Monte Carlo simulations. The algorithm is
applied to the analysis of trichloroethylene in
samples that have matrix effects caused by an
interaction with chloroform.
Booksh, K.S., & Kowalski, B.R. (1994a).
Comments on the DATa ANalysis (DATAN) algorithm
and rank annihilation factor analysis for the
analysis of correlated spectral data.
Journal of Chemometrics, 8, 287-292.
It is shown that the data analysis (DATAN)
algorithm can be expressed in terms of rank
annihilation factor analysis (RAFA). Subsequent
advances in RAFA are applied to DATAN to eliminate
the problems and restrictions associated with
DATAN. The extension of DATAN in terms of the
trilinear decomposition algorithm is discussed.
Booksh, K.S., & Kowalski, B.R. (1997).
Calibration method choice by comparison of model
basis functions to the theoretical instrumental
Analytica Chimica Acta, 348, 1-9.
Sorting through the large array of calibration
methods available for first and second order
calibration is often a daunting task for initiates
into the field of chemometrics. Justifying the
selected method as the most appropriate one is
even more difficult. Presented here is a
justification for calibration method selection
based on matching the model employed in the
calibration method with the instrumental response
function. This is applied to the disparate types
of nonlinearities found in both first and second
order calibration. Matching the calibration method
to the instrumental response function is employed
to parse the decision making process for choosing
between branches in the first order parsimony
tree. The different types of nonlinearities
present in second order data and their
implications on calibration model selection are
Booksh, K.S., Lin, Z.H., Wang, Z.Y., & Kowalski, B.R. (1994).
Extension of trilinear decomposition method with
an application to the flow probe sensor.
Analytical Chemistry, 66, 2561-2569.
The trilinear decomposition algorithm (TLD) is a
method for calibration of second-order
instrumentation (e.g., LC-UV). This method, like
the generalized rank annihilation method (GRAM),
estimates the intrinsic profiles (e.g., spectra
and chromatograms) of each component present in
each sample by solving an eigenvector/eigenvalue
problem. The relative concentration of each
component between the samples is found by the
least squares fitting of the intrinsic profiles to
the instrument response of the samples. The
advantage the TLD algorithm has over GRAM is the
ability to analyze data from multiple samples
simultaneously. The previously published algorithm
provided unreliable calibration estimates when
imaginary eigenvectors were included in the
solution of the eigenproblem. An improved TLD
algorithm is presented to correct this problem.
The TLD algorithm is also extended to provide
reliable calibration in the case where the
instrument response to analyte concentration is
nonlinear. This extension assumes the intrinsic
profiles of the analyte are identical at all
analyte concentrations. The improved and extended
TLD algorithm is demonstrated on two simulated
data sets as well as the flow optrode analysis of
Pb(II) and Cd(II).
Booksh, K.S., Muroski, A.R., & Myrick, M.L. (1996).
Single-measurement excitation/emission matrix
spectrofluorometer for determination of
hydrocarbons in ocean water. 2. Calibration and
quantitation of naphthalene and styrene.
Analytical Chemistry, 68, 3539-3544.
An excitation/emission matrix imaging
spectrofluorometer was employed for quantitation
of two fluorescent compounds, naphthalene and
styrene, contained in ocean water exposed to
gasoline. Multidimensional parallel factor
(Parafac) analysis models were used to resolve the
naphthalene and styrene fluorescence spectra from
a complex background signal and overlapping
spectral interferents not included in the
calibration set. Linearity was demonstrated over two
orders of magnitude for determination of
naphthalene with a detection limit of eight parts per
billion. Similarly, nearly two orders of magnitude
of linearity was demonstrated in the determination
of styrene with an 11 ppb limit of detection.
Furthermore, the synthesis of the EEM
spectrofluorometer and the Parafac analysis for
unbiased prediction of naphthalene and styrene
concentration in mixture samples containing
uncalibrated spectral interferents was
Bonnet, N., & Zahm, J. M. (1998).
Analysis of image sequences in fluorescence videomicroscopy of stationary objects.
Cytometry, 31, 217-228.
Fluorescence videomicroscopy allows the temporal behavior of
biological specimens to be studied at the cellular level. We describe two types of
methods that can be used for extracting quantitative information from image sequences:
the modelling approach, which is mainly local, and multivariate statistical analysis,
which provides a global approach. The potentials for use of these two methods are
illustrated through a simulation example and actual examples dealing with the study
of chloride secretion by airway epithelial cells. We define some guidelines for
making a choice between these two approaches, bearing in mind that a blend of these
two methodologies is also possible.
Boqué, R., Larrechi, M. S., & Rius, F. X. (1999a).
Multivariate detection limits with fixed probabilities of error.
Chemometrics and Intelligent Laboratory Systems, 45, 397-408.
In this paper, a new approach to calculate multivariate detection limits (MDL) for the commonly used
inverse calibration model is discussed. The derived estimator follows the latest recommendations of the International
Union of Pure and Applied Chemistry (IUPAC) concerning the detection capabilities of analytical methods. Consequently,
the new approach: (a) is based on the theory of hypothesis testing and takes into account the probabilities of false
positive and false negative decisions, and (b) takes into account all the different sources of error, both in
calibration and prediction steps, which affect the final result. The MDL is affected by the presence of other analytes
in the sample to be analysed; therefore, it has a different value for each sample to be tested and so the proposed
approach attempts to find whether the concentration derived from a given response can be detected or not at the fixed
probabilities of error. The estimator has been validated with and applied to real samples analysed by NIR spectroscopy.
Boqué, R., Ferré, J., Faber, N. M., & Rius, F. X. (2002).
Limit of detection estimator for second-order bilinear calibration.
Analytica Chimica Acta, 451, 313-321.
A new approach is developed for estimating the limit of detection in second-order
bilinear calibration with the generalized rank annihilation method (GRAM). The
proposed estimator is based on recently derived expressions for prediction variance
and bias. It follows the latest IUPAC recommendations in the sense that it concisely
accounts for the probabilities of committing both types I and II errors, i.e.
false positive and false negative declarations, respectively. The estimator has
been extensively validated with simulated data, yielding promising results. (C)
2002 Elsevier Science B.V. All rights reserved.
Boqué, R., & Smilde, A.K. (1999b).
Monitoring and diagnosing batch processes with multiway covariates regression
models. AIChE Journal, 45, 1504-1520.
Multivariate statistical procedures for
monitoring the behavior of batch processes are
presented. A new type of regression, called
multiway covariates regression, is used to find
the relationship between the process variables
and the quality variables of the final product.
The three-way structure of the batch process data
is modeled by means of a Tucker3 or a PARAFAC
model. The only information needed is a
historical data set of past successful batches.
Subsequent new batches can be monitored using
multivariate statistical process control charts.
In this way the progress of the new batch can be
tracked and possible faults can be easily
detected. Further detailed information from the
process can be obtained by interrogating the
underlying model. Diagnostic tools, such as
contribution plots of each of the variables to
the observed deviation, are also developed.
Finally, on-line predictions of the final quality
variables can be monitored, providing an
additional tool to see whether a particular batch
will produce an out-of-spec product. These ideas
are illustrated using simulated and real data of
a batch polymerization reaction.
Borg, I., & Lingoes, J.C. (1978).
What weight should weights have in individual differences scaling?
Quality and Quantity, 12, 223-237.
This paper reanalyzes some data collected by Green and Rao (1972) via
The results are compared with those produced by INDSCAL which is a)
the most popular procedure, and b) also the method of analysis chosen
by Green and Rao (1972).
Borg, I., & Lingoes, J. (1987).
Multidimensional Similarity Structure Analysis. New York: Springer-
Verlag. (Review by P.M. Kroonenberg)
Chapter 20 (Individual differences models) contains a discussion of three-way
data analysis. In particular various aspects of Procrustes analysis for several
configurations are presented and the PINDIS and INDSCAL procedures are explained
in some detail. (see also Lingoes & Borg, 1978)
Borgatti, S. P., & Everett, M. G. (1992).
Regular blockmodels of multiway, multimode matrices.
Social Networks, 14, 91-120.
Blockmodels are used to collapse redundant elements in a system
in order to clarify the patterns of relationships among the
elements. Traditional blockmodels define redundancy in terms of
structural equivalence. This choice serves many analytic
purposes very well, but is inadequate for others. In
particular, role systems would be better modeled by blockmodels
based on regular equivalence. The first goal of this paper is
to generalize blockmodels to incorporate both structural and
regular equivalence. Another limitation of traditional
blockmodels is that they are defined only for (collections of)
2-way 1-mode adjacency matrices. This excludes common datasets
such as actor-by-event, actor-by-organization, item-by-use and
case-by-variable matrices. It also excludes 3-way data such as
actor-by-actor-by-time or subject-by-verb-by-object matrices.
The second goal of this paper is to define blockmodels for
multiway, multimode matrices in general. In so doing, we also
shift the focus of attention away from the blocking of actors
(or other entities) and toward the blocking of ties (or
Bouroche, J.-M. & Dussaix, A.-M. (1975).
three-way data analysis. Metra, 14, 299-319.
A method called 'Double PCA' is proposed for the analysis of
three-way data, say subjects x variables x time points. First
PCA is performed on the variables x time points matrix
averaged over subjects to assess general trends. Then per time
point PCA is performed over the subjects x variables matrix
centred per variable. Finally four different procedures are
discussed to obtain a 'best' common subject-space for all time
points. Plots showing the 'trajectory' of each subject in the
common space are given. Illustrated with a study of the French
Bove, G., & Di Ciaccio, A. (1989).
Comparisons among three factorial methods for analysing
three-mode data. In R. Coppi & S. Bolasco (Eds.), Multiway
data analysis (pp. 103-113). Amsterdam: Elsevier.
Methods for analysing three-mode data are compared by focusing on their
and models. Some interesting results concerning the different adopted
distances in the
geometrical representations of the variables are also obtained. An
application to French
census data is provided to emphasize the differences.
Bove, G., & Di Ciaccio, A. (1994).
A user-oriented overview of multiway methods and software.
Computational Statistics & Data Analysis,
This paper provides a brief overview of the most widely known methods and
software dealing with multiway data. The main features are described
applicative capabilities, in order to make the choices of users easier.
Bramston, P., Snyder Jr, C.W., Leah, J.A. & Law, H.G. (1983).
Assessment of assertiveness in the intellectually handicapped.
Multivariate Experimental Clinical Research, 6,
Earlier work in the structural analysis of self-reported difficulty in
indicated that individuals differed in terms of a two-facet model -
response type (positive
vs. negative assertiveness) by referents (close vs. distant interpersonal
study replicated the individual differences structure for an
sample, thus extending the generalizability of that model. However,
although the dimensions
were found in three different methods of assessment, self-report,
behavioral rating, and role
play, little agreement was found between the methods in accounting for
Additionally, there were hints that the four interaction dimensions of
actually reflect different difficulty positions on a non-linear
unidimensional scale of
assertiveness. Using a Rasch model to derive the single scale, role play
and self-report were
significantly correlated in their assessments, but the correlation was
not very great. It was
hypothesized that method differences might reflect legitimately different
close-distant referent raters.
Bridgman, R.P., Snyder Jr, C.W., & Law, H.G. (1981).
Individual differences in conceptual behaviour following manipulated
controllability. Personality and Individual Differences, 2, 197-
The present study examined the influence of manipulated controllability on
the intrinsic individual differences among 30 female undergraduates in a
disjunctive conceptual behavior recovery task. Three-mode factor analysis
was used to explore the process variability in a multivariate time-series
Results indicate that intrinsic task processes were altered by the
pretreatment, but the nature of the impact reflected substantial individual
differences in reaction.
Bro, R. (1995).
Algorithm for finding an interpretable simple neural network solution using PLS.
Journal of Chemometrics, 9, 423-430.
This communication describes the combination of a feedforward neural
network (NN) with one hidden neuron and partial least squares (PLS) regression. Through
training of the neural network with an algorithm that is a combination of a modified
simplex, PLS and certain numerical restrictions, one gains an NN solution that has
several feasible properties: (i) as in PLS the solution is qualitatively interpretable;
(ii) it works faster than or comparably with ordinary training algorithms for neural
networks; (iii) it contains the linear solution as a limiting case. Another very
important aspect of this training algorithm is the fact that outlier detection as in
ordinary PLS is possible through loadings, scores and residuals. The algorithm is used
on a simple non-linear problem concerning fluorescence spectra of white sugar solutions.
Bro, R. (1996).
Multiway calibration. Multilinear PLS. Journal of
Chemometrics, 10, 47-61.
A new multiway regression method called N-way partial least squares
(N-PLS) is presented. The emphasis is on the three-way PLS version
(tri-PLS), but it is shown how to extend the algorithm to higher orders.
The developed algorithm is superior to unfolding methods, primarily owing
to a stabilization of the decomposition. This potentially gives increased
interpretability and better predictions. The algorithm is fast compared
with e.g. PARAFAC, because it consists of solving eigenvalue problems. An
example of the developed algorithm taken from the sugar industry is shown
and compared with unfold-PLS. Fluorescence excitation-emission matrices
(EEMs) are measured on white sugar solutions and used to predict the ash
content of the sugar. The predictions are comparable by the two methods,
but there is a clear difference in the interpretability of the two
solutions. Also shown is a simulated example of EEMs with very noisy
measurements and a low relative signal from the analyte of interest. The
predictions from unfold-PLS are almost twice as bad as from tri-PLS
despite the large number of samples (125) used in the
Bro, R. (1997).
PARAFAC. Tutorial & applications. Chemometrics and Intelligent
Laboratory Systems, 38, 149-171.
This paper explains the multi-way decomposition method PARAFAC and its
use in chemometrics. PARAFAC is a generalization of PCA to higher order
arrays, but some of the characteristics of the method are quite
different from the ordinary two-way case. An important advantage of
using multi-way methods instead of unfolding methods is that the
estimated models are very simple in a mathematical sense, and therefore
more robust and easier to interpret. The applications presented include
subjects as: Analysis of variance by PARAFAC, a five-way application
of PARAFAC, PARAFAC with half the elements missing, PARAFAC constrained
to positive solutions and PARAFAC for regression as in principal
Bro, R. (1998).
Multi-way Analysis in the Food Industry. Unpublished doctoral thesis,
University of Amsterdam, Amsterdam, The Netherlands.
Introduction; Multi-way analysis; How to read this thesis.
2. Multi-way data
Introduction; Unfolding; Rank of multi-way arrays.
3. Multi-way models
Introduction; The Khatri-Rao product; Parafac; Parafac2; Paratuck2; Tucker
models; Multilinear partial least squares regression; Summary.
Introduction; Alternating least squares; Parafac; Parafac2; Paratuck2; Tucker
models; Multilinear partial least squares regression; Improving alternating
least squares algorithm; Summary.
What is validation; Preprocessing; Which model to use; Number of components;
Checking convergence; Degeneracy; Assessing uniqueness; Influence and residual
analysis; Assessing robustness; Frequent problems and questions;
Introduction; Constraints; Alternating least squares revisited; Algorithms;
Introduction; Sensory analysis of bread; Comparing regression models; Rank-
deficient spectral FIA data; Exploratory study of sugar production; Enzymatic
activity; Modeling chromatographic retention time shifts.
Bro, R. (1999).
Exploratory study of sugar production using fluorescence spectroscopy and multi-way analysis.
Chemometrics and Intelligent Laboratory Systems, 46, 133-147.
This paper is concerned with the possibility of obtaining chemically meaningful models of
complicated processes by the use of fluorescence spectroscopy screening and the unique
parallel factor analysis (Parafac) model. The second-order nature of fluorescence excitation
emission data and the fact that the Parafac model has no rotational indeterminacy mean that
in certain cases, it is possible to decompose complex mixture signals into contributions
from individual chemical components. Relating the thus obtained information to, e.g.,
important quality parameters, it is possible to analyze, understand, predict and monitor the
quality based on a chemical foundation. The proposed approach thus gives a direct link
between process analytical chemistry and multivariate statistical process control.
Bro, R. (2003).
Multivariate calibration - What is in chemometrics for the analytical chemist?
Analytica Chimica Acta, 500, 185-194.
Chemometrics has been used for some 30 years but there is still need for
disseminating the potential benefits to a wider audience. In this paper, we claim that
proper analytical chemistry (1) must in fact incorporate a chemometric approach and (2)
that there are several significant advantages of doing so. In order to explain this, an
indirect route will be taken, where the most important benefits of chemometric methods
are discussed using small illustrative examples. Emphasis will be on multivariate data
analysis (for example calibration), whereas other parts of chemometrics such as
experimental design will not be treated here. Four distinct aspects are treated in
detail: noise reduction; handling of interferents; the exploratory aspect and the
possible outlier control. Additionally, some new developments in chemometrics are
Bro, R., & Andersson, C.A. (1998).
Improving the speed of multiway algorithms. Part II: Compression.
Chemometrics and Intelligent Laboratory Systems, 42, 105-113.
In this paper an approach is developed for compressing a multiway array prior to estimating
a multilinear model with the purpose of speeding up the estimation. A method is developed
which seems very well-suited for a rich variety of models with optional constraints on the
factors. It is based on three key aspects: (1) a fast implementation of a Tucker3
algorithm, which serves as the compression method, (2) the optimality theorem of the
CANDELINC model, which ensures that the compressed array preserves the original variation
maximally, and (3) a set of guidelines for how to incorporate optional constraints. The
compression approach is tested on two large data sets and shown to speed up the estimation
of the model up to 40 times. The developed algorithms can be downloaded from
Bro, R., Andersson, C.A., & Kiers, H.A.L. (1999).
PARAFAC2 - Part II. Modeling chromatographic data with retention time shifts.
Journal of Chemometrics, 13, 295-309.
This paper offers an approach for handling retention time shifts in resolving
chromatographic data using the PARAFAC2 model. In Part I of this series an
algorithm for PARAFAC2 was developed and extended to N-way arrays. It was
discussed that the PARAFAC2 model has a number of attractive features. It is
unique under mild conditions though it puts fewer restrictions on the data than
the well-known PARAFAC1 model. This has important implications for the modeling
of chromatographic data in which retention time shifts can be regarded as a
violation of the assumption of parallel proportional profiles underlying the
PARAFAC1 model. The PARAFAC2 model does not assume parallel proportional elution
profiles, but only that the matrix of elution profiles preserves its 'inner-
product structure' from sample to sample. This means that the cross-products of
the matrix holding the elution profiles in its columns remain constant. Here an
application using chromatographic separation based on the molecular size of
thick juice samples from the beet sugar industry illustrates the benefit of
using the PARAFAC2 model.
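The cross-product condition in this abstract can be illustrated with a small numerical sketch. Everything below is made up for illustration: `base` is a hypothetical profile matrix, and each sample's profile matrix is formed as an orthogonal rotation of it, which is one simple way (assumed here, not taken from the paper) to obtain profiles that differ between samples while their cross-products stay the same.

```python
import math

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def rotation(theta):
    """2 x 2 orthogonal rotation matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def cross_product(f):
    """F'F for a profile matrix F with the profiles in its columns."""
    return matmul(transpose(f), f)

# Hypothetical base profiles (2 points x 2 components), invented values.
base = [[1.0, 0.5], [0.2, 2.0]]

# One profile matrix per sample: P_k * base with P_k orthogonal.
profiles = [matmul(rotation(t), base) for t in (0.0, 0.4, 1.1)]
cross_products = [cross_product(f) for f in profiles]

for f, cp in zip(profiles, cross_products):
    print("profiles:", f, "cross-product:", cp)
```

The printed profile matrices differ from sample to sample, whereas the printed cross-products agree up to rounding, which is the kind of variation the PARAFAC2 model tolerates and the PARAFAC1 model does not.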
Bro, R., & De Jong, S. (1997).
A fast non-negativity-constrained least squares algorithm. Journal
of Chemometrics, 11, 392-401.
In this paper a modification of the standard algorithm for
nonnegativity-constrained linear least squares regression
is proposed. The algorithm is specifically designed for use in
multiway decomposition methods like PARAFAC and N-mode principal
component analysis. In those methods the typical situation is that
there is a high ratio between the number of objects and variables
in the regression problems solved. Furthermore, very similar regression
problems are solved many times during iterative procedures used. The
algorithm proposed is based on the de facto standard algorithm
NNLS by Lawson and Hanson, but modified to take advantage of the special
characteristics of iterative algorithms involving repeated use of
nonnegativity constraints. The principle behind the NNLS algorithm is
described in detail and a comparison is made between this standard
algorithm and the new algorithm called FNNLS (fast NNLS).
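As a rough illustration of why repeated, similar regression problems with many objects and few variables can be handled cheaply, the sketch below runs a Lawson-Hanson style active-set iteration on the cross-products X'X and X'y only, so the long dimension is touched once. That the cross-products are the right quantities to reuse is an assumption made for this sketch and is not stated in the abstract; the function names are hypothetical and this is not the published FNNLS code.

```python
def solve(a, b):
    """Solve the small dense system a z = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))) / m[r][r]
    return z

def nnls_from_cross_products(xtx, xty, tol=1e-10, max_iter=200):
    """Active-set non-negative least squares using only X'X and X'y."""
    n = len(xty)
    x = [0.0] * n
    passive = []  # indices currently allowed to be positive

    def ls_on_passive():
        # Unconstrained least squares restricted to the passive set.
        sub = [[xtx[i][j] for j in passive] for i in passive]
        z = solve(sub, [xty[i] for i in passive])
        s = [0.0] * n
        for i, v in zip(passive, z):
            s[i] = v
        return s

    for _ in range(max_iter):
        # Negative gradient of the least squares loss at the current x.
        w = [xty[i] - sum(xtx[i][j] * x[j] for j in range(n)) for i in range(n)]
        free = [i for i in range(n) if i not in passive]
        if not free or max(w[i] for i in free) <= tol:
            break
        passive.append(max(free, key=lambda i: w[i]))
        s = ls_on_passive()
        while passive and min(s[i] for i in passive) <= 0:
            # Step towards s only as far as feasibility allows.
            alpha = min(x[i] / (x[i] - s[i]) if x[i] > s[i] else 0.0
                        for i in passive if s[i] <= 0)
            x = [xi + alpha * (si - xi) for xi, si in zip(x, s)]
            passive = [i for i in passive if x[i] > tol]
            s = ls_on_passive()
        x = s
    return x

# Invented toy data: 3 objects, 2 variables.
x_rows = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
y = [2.0, 1.0, -2.0]
xtx = [[sum(r[i] * r[j] for r in x_rows) for j in range(2)] for i in range(2)]
xty = [sum(r[i] * v for r, v in zip(x_rows, y)) for i in range(2)]
coefficients = nnls_from_cross_products(xtx, xty)
print(coefficients)
```

For this toy problem the unconstrained solution has a negative second coefficient, so the constrained fit keeps only the first variable. In an alternating least squares loop the cross-products would be formed once per update and the small solver called on them.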
Bro, R., & Heimdal, H. (1996).
Enzymatic browning of vegetables. Calibration and analysis of variance
by multiway methods. Chemometrics and Intelligent Laboratory
Systems, 34, 85-102.
This paper describes the chemometrical aspects of an investigation
of the enzymatic browning of vegetables. Enzymatic browning is caused
by polyphenol oxidase, PPO. Kinetic UV/VIS spectra and experimental
design variables of PPO incubated samples are used for predicting
enzymatic activity and substrate consumption. The mathematical models
used are multiway PLS (N-PLS) and fiveway PARAFAC. Both methods are
available from the Internet in MATLAB code. Throughout, the results of the
multiway methods are compared to competing methods (PLS, PCR, Tucker,
feedforward neural networks, locally weighted regression, ANOVA and
others). The result of the investigation is that the multiway methods
have clear advantages with respect to predictions and interpretability,
both mathematically and technologically.
Bro, R., & Jakobsen, M. (2002).
Exploring complex interactions in designed data using GEMANOVA. Color changes in fresh
beef during storage. Journal of Chemometrics, 16, 294-304.
Data from a severely reduced experimental design are investigated in order
to obtain detailed information on important factors affecting the changes in quality of meat
during storage under different conditions. It is possible to model the response, meat color,
using traditional ANOVA (analysis of variance) techniques, but the exploratory and explanatory
value of this model is somewhat restricted owing to the number of factors and the fact that
several interactions exist. For those reasons, it is not possible to visualize the model in a
simple way and therefore not possible to have a clear overview of the total variation in the data.
Using a recently suggested alternative to traditional analysis of variance, GEMANOVA (generalized
multiplicative ANOVA), it is possible to analyze the data effectively and obtain a more interpretable
solution that enables a simple overview of the whole sampling domain. Whereas traditional analysis
of variance typically seeks a model with main effects and as few and simple interactions and
cross-products as possible, the GEMANOVA model seeks to describe the data primarily by means of
higher-order interactions, albeit in a straightforward way. The two approaches are thus complementary.
It is shown that the GEMANOVA model is simple to interpret, primarily because the GEMANOVA structure
is in agreement with the nature of the data. It is shown that the GEMANOVA model used is
mathematically unique, which leads to attractive simplified ways of interpreting the model. The
results presented are the first published results, where the GEMANOVA model is not simply equivalent
to an ordinary PARAFAC model, thus taking full advantage of the additional structural power of
GEMANOVA. A new algorithm for fitting the GEMANOVA model is developed and is available from the
Bro, R., & Kiers, H. A. L. (2003a).
A new efficient method for determining the number of components in PARAFAC models.
Journal of Chemometrics, 17, 274-286.
A new diagnostic called the core consistency diagnostic (CORCONDIA) is suggested
for determining the proper number of components for multiway models. It applies
especially to the parallel factor analysis (PARAFAC) model, but also to other models
that can be considered as restricted Tucker3 models. It is based on scrutinizing
the 'appropriateness' of the structural model based on the data and the estimated
parameters of gradually augmented models. A PARAFAC model (employing dimension-wise
combinations of components for all modes) is called appropriate if adding other
combinations of the same components does not improve the fit considerably. It is
proposed to choose the largest model that is still sufficiently appropriate. Using
examples from a range of different types of data, it is shown that the core consistency
diagnostic is an effective tool for determining the appropriate number of components
in e.g. PARAFAC models. However, it is also shown, using simulated data, that the
theoretical understanding of CORCONDIA is not yet complete. Copyright (C) 2003
John Wiley & Sons, Ltd.
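The abstract does not give a formula. A commonly quoted form of the diagnostic compares a least squares Tucker3 core, computed from the data and the PARAFAC loadings, with the superdiagonal core of ones that a perfectly appropriate PARAFAC model would give. The sketch below assumes that form and assumes the core has already been computed elsewhere; the function name and the example cores are hypothetical.

```python
def core_consistency(core):
    """Percentage agreement between an F x F x F core and the superidentity.

    Assumed form: 100 * (1 - sum((g - t)^2) / sum(t^2)), where t is 1 on the
    superdiagonal and 0 elsewhere, so that sum(t^2) equals F.
    """
    f = len(core)
    deviation = 0.0
    for i in range(f):
        for j in range(f):
            for k in range(f):
                target = 1.0 if i == j == k else 0.0
                deviation += (core[i][j][k] - target) ** 2
    return 100.0 * (1.0 - deviation / f)

# Invented two-component cores: an ideal one and one with an off-diagonal entry.
ideal = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
perturbed = [[[1.0, 0.0], [0.5, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]

print(core_consistency(ideal))
print(core_consistency(perturbed))
```

Under this form, values near 100 suggest that extra combinations of components add little, and the value drops as off-superdiagonal core elements grow.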
Bro, R., Nielsen, H. H., Stefánsson, G. & Skara, T. (2002).
A phenomenological study of ripening of salted herring. Assessing homogeneity
of data from different countries and laboratories.
Journal of Chemometrics, 16, 81-88.
Data from ripening experiments of herring carried out at three Nordic
fishery research institutions in the period 1992-1995 were collected and analyzed by
multivariate analysis. The experiments were carried out at different times, with
different stocks as raw material, using different types of treatments and analyzed
in different laboratories. The question considered here is whether these data can be
assumed to be one homogeneous set of data pertaining to ripening of salted herring or
whether data from different labs, stocks, etc. must be considered independently. This
is of importance for further research into ripening processes with these and similar
data. It is shown in this paper that all data can be considered as one homogeneous
data set. This is verified using resampling where latent structures are compared
between different sample sets. This is done indirectly by testing regression models,
that have been developed on one sample set, on other sample sets. It is also done
directly by monitoring the deviation in latent structure observed between different
sample sets. No formal statistical test is developed for whether samples can be
assumed to stem from the same population. Although this can easily be envisioned, it
was exactly the need for a more intuitive and visual test that prompted this work,
developing different exploration tools that visually make it clear how well the data
can be assumed to derive from the same population. Subsequently analyzing the data as
one homogeneous group provides new information about factors that govern the ripening
of salted herring and can be used in new strategic research as well as in industrial
Bro, R., Rinnan, A., & Faber, N. K. M. (2005).
Standard error of prediction for multilinear PLS 2. Practical implementation in fluorescence spectroscopy.
Chemometrics and Intelligent Laboratory Systems, 75, 69-76.
In Part 1 of this series, a new simplified expression was derived for estimating sample-specific standard error of prediction in inverse
multivariate regression. The focus was on the application of this expression in multilinear partial least squares (N-PLS) regression, but its
scope is more general. In this paper, the expression is applied to a fluorescence spectroscopic calibration problem where N-PLS regression is
appropriate. Guidelines are given for how to cope in practice with the main assumptions underlying the proposed methodology. The sample-specific
uncertainty estimates yield coverage probabilities close to the stated nominal value. Similar results were obtained for standard (i.e.,
linear) PLS regression and principal component regression on data rearranged to ordinary two-way matrices. The two-way results highlight
the generality of the proposed expression.
Bro, R., & Smilde, A. K. (2003b).
Centering and scaling in component analysis.
Journal of Chemometrics, 17, 16-33.
Bro, R. & Sidiropoulos, N.D. (1998).
Least squares algorithms under unimodality and non-negativity
Journal of Chemometrics, 12, 223-247.
In this paper a least squares method is developed for minimizing
||Y - XB'||^2 over the matrix B subject to the
constraint that the columns of B are unimodal. This method
is directly applicable in curve resolution and in improving
stability when unimodality is known to be a valid assumption.
Unimodality least squares regression turns out to be no more
difficult than two simple Kruskal monotone regressions. The
method is useful in and exemplified with two- and multiway
methods (such as PARAFAC and PARATUCK2) based upon least squares
regression solving problems in chromatography and flow injection
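The link between unimodal fitting and monotone regression mentioned in this abstract can be sketched for a single profile. The code below is a brute-force illustration and not the authors' algorithm: for every possible position of the peak it fits a non-decreasing sequence to the left part and a non-increasing sequence to the right part with the pool-adjacent-violators method, then keeps the split with the smallest residual sum of squares. All function names are hypothetical.

```python
def isotonic_increasing(y):
    """Least squares non-decreasing fit by pooling adjacent violators."""
    blocks = []  # each block holds [sum of values, number of values]
    for v in y:
        blocks.append([v, 1])
        # Merge while the previous block mean exceeds the last block mean.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit

def isotonic_decreasing(y):
    """Least squares non-increasing fit, via the reversed sequence."""
    return isotonic_increasing(y[::-1])[::-1]

def unimodal_fit(y):
    """Best unimodal least squares fit found by trying every split point."""
    best, best_sse = None, float("inf")
    for split in range(len(y) + 1):
        fit = isotonic_increasing(y[:split]) + isotonic_decreasing(y[split:])
        sse = sum((a - b) ** 2 for a, b in zip(y, fit))
        if sse < best_sse:
            best, best_sse = fit, sse
    return best

# Invented noisy profile with one dominant peak.
noisy = [1.0, 3.0, 2.0, 5.0, 4.0, 1.0]
print(unimodal_fit(noisy))
```

Each split produces a sequence that rises and then falls, and every unimodal sequence is feasible for at least one split, so the search returns a least squares unimodal fit. It costs on the order of n squared operations, which is acceptable for a sketch but slower than a dedicated algorithm.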
Bro, R., Sidiropoulos, N. D., & Smilde, A. K. (2002).
Maximum likelihood fitting using ordinary least squares algorithms.
Journal of Chemometrics, 16, 387-400.
In this paper a general algorithm is provided for maximum likelihood fitting of
deterministic models subject to Gaussian-distributed residual variation (including any type of
non-singular covariance). By deterministic models is meant models in which no distributional
assumptions are valid (or applied) on the parameters. The algorithm may also more generally be
used for weighted least squares (WLS) fitting in situations where either distributional assumptions
are not available or other than statistical assumptions guide the choice of loss function. The
algorithm to solve the associated problem is called MILES (Maximum likelihood via Iterative Least
squares EStimation). It is shown that the sought parameters can be estimated using simple least
squares (LS) algorithms in an iterative fashion. The algorithm is based on iterative majorization
and extends earlier work for WLS fitting of models with heteroscedastic uncorrelated residual
variation. The algorithm is shown to include several current algorithms as special cases. For
example, maximum likelihood principal component analysis models with and without offsets can be
easily fitted with MILES. The MILES algorithm is simple and can be implemented as an outer loop in
any least squares algorithm, e.g. for analysis of variance, regression, response surface modeling, etc.
Several examples are provided on the use of MILES.
Bro, R., Smilde, A.K., & De Jong, S. (2001).
On the difference between low-rank and subspace approximation: improved model
for multi-linear PLS Regression. Chemometrics and Intelligent Laboratory Systems,
While both Tucker3 and PARAFAC models can be viewed as latent variable models
extending principal component analysis (PCA) to multi-way data, most
fundamental properties of PCA do not extend to both models. This has practical
importance, which will be explained in this paper. The fundamental difference
between the PARAFAC and the Tucker3 model can be viewed as the difference between
so-called low-rank and subspace approximation of the data. This
insight is used to pose a modification of the multi-linear partial least squares
regression (N-PLS) model. The modification is found by exploiting the basic
properties of PLS and of multi-way models. Compared to the current prevalent
implementation of N-PLS, the new model provides a more reasonable fit to the
independent data and exactly the same predictions of the dependent variables.
Thus, the reason for introducing this improved model is not to obtain better
predictions, but rather the aim is to improve the secondary aspect of PLS:
the modeling of the independent variables. The original version of N-PLS has some
built-in problems that are easily circumvented with the modification suggested
here. This is of importance, for example, in process monitoring, outlier
detection and also, implicitly, for jackknifing of model parameters. Some
examples are provided to illustrate some of these points.
Bro, R., Workman Jr, J.J., Mobley, P.R. & Kowalski, B. (1997).
Review of chemometrics applied to spectroscopy: 1985-95, Part 3
- Multi-way analysis.
Applied Spectroscopy Reviews, 32, 237-261.
I. INTRODUCTION. A. Multi-way data; B. Important technology; C.
Software; D. Books, reviews, and tutorials.
II. MULTI-WAY MODELS AND ALGORITHMS. A. PARAFAC/GRAM; B. N-mode
PCA; C. Other models; D. Preprocessing.
III. APPLICATIONS OF MULTI-WAY METHODS. A. Mass spectrometry; B.
UV/Visible spectroscopy; C. Fluorescence; D. Other.
Bro, R., Berg, van den, F., Thybo, A., Andersen, C. M., Jorgensen, B. M., & Andersen, H. (2002).
Multivariate data analysis as a tool in advanced quality monitoring in the food production chain.
Trends in Food Science & Technology, 13, 235-244.
This paper summarizes some recent advances in mathematical modeling of relevance in advanced quality
monitoring in the food production chain. Using chemometrics - multivariate data analysis - it is illustrated how to tackle
problems in food science more efficiently and, moreover, solve problems that could not otherwise be handled before.
The different mathematical models are all exemplified by food related subjects to underline the generic use of the
models within the food chain. Applications will be given from meat storage, vegetable characterization, fish quality
monitoring and industrial food processing, and will cover areas such as analysis of variance, monitoring and handling
of sampling variation, calibration, exploration/data mining and hard modeling.
Brouwer, P., & Kroonenberg, P.M. (1991).
Some notes on the diagonalization of extended three-mode core matrices.
Journal of Classification, 8, 93-98.
We extend previous results of Kroonenberg and de Leeuw (1980) and Kroonenberg (1983, Ch. 5)
on transformations of the extended core matrix of the Tucker2 model (Kroonenberg and de Leeuw
1980). In particular, it is shown that non-singular transformations to diagonalize the core
matrix will lead to PARAFAC solutions (Harshman 1970; Harshman and Lundy 1984), if such
Brown, S.D. (1998).
Information and data handling in chemistry and chemical engineering: the
state of the field from the perspective of chemometrics.
Computers and Chemical Engineering, 23, 203-216.
The basic trends in current research of chemometrics are reviewed from the
perspective of soft modelling. Included is a discussion of the role of second-order and higher
calibration methods which make use of three-mode and multimode data. Advantages and limitations
of these methods are indicated. The logical connections between efforts made to improve or extend
chemometric methods and defects inherent in soft modelling are identified and briefly
Browne, M.W. (1989).
Relationships between an additive model and a multiplicative
model for multitrait-multimethod matrices. In R. Coppi & S.
Bolasco (Eds.), Multiway data analysis (pp. 507-520).
An additive model and a multiplicative model for multitrait-multimethod correlation matrices
are described and are related to the Campbell-Fiske conditions. Approximations are provided
for the parameters of the multiplicative model in terms of the parameters of the additive
model and situations in which the two models coincide are considered. An example where both
models are fitted to the same correlation matrix is provided.
Burdick, D.S. (1995).
Tutorial. An introduction to tensor products with applications to
multiway data analysis. Chemometrics and Intelligent Laboratory
Systems, 28, 229-237.
The concepts of tensor algebra and vector space geometry provide a unifying framework for
multilinear data analysis which simplifies notation and leads to economy of thought. Avoiding
too much abstraction too soon in defining tensor products makes these concepts accessible.
Examples are given of the use of tensor algebra in the analysis of bilinear and trilinear
models arising in fluorescence spectroscopy.
Burdick, D.S., Tu, X.M., McGown, L.B., & Millican, D.W. (1990).
Resolution of multicomponent fluorescent mixtures by analysis of
the excitation-emission-frequency array. Journal of
Chemometrics, 4, 15-28.
Fluorescence lifetime provides a third independent dimension of information for the
resolution of total luminescence spectra of multicomponent mixtures. The incorporation of
this parameter into the excitation-emission matrix (EEM) by the phase modulation technique
results in a three-dimensional excitation-emission-frequency array (EEFA). Multicomponent
analysis based on the three-dimensional EEFA brings a qualitative change for the resolved
spectra, i.e. individual spectra can be uniquely resolved, which is impossible with any
two-dimensional analysis. In this paper we present a method for analyzing the EEFA. We show
mathematically that with the three-dimensional analysis of the EEFA individual spectra and
lifetimes can be obtained. Our algorithm is developed in mathematical detail and is
demonstrated by its application to a two-component mixture.
Burgess, L.W. (1995).
Sensors and actuators B - Chemical, 29, 10-15.
Many chemical sensors based on fiber optics and absorption spectroscopy have been reported in
applications ranging from biomedical and environmental monitoring to industrial process
control. In these diverse applications, the analyte can be probed directly, by measuring its
intrinsic absorption, or by incorporating some transduction mechanism such as a reagent
chemistry to enhance sensitivity and selectivity. Physical and performance requirements are
placed on a device depending on its intended use. In applications such as chemical process
monitoring, survivability and the assurance of the long-term quality of the analytical data are
paramount. The above needs have resulted in devices that now employ multivariate data analysis,
complex sampling interfaces, and reagent renewal mechanisms. The response from such systems can
provide information not only about target analyte(s), but can also signal the presence of
interferences, and may potentially be used to follow sensor degradation. Examples are given of
devices currently being investigated along with a discussion of some of the remaining material,
chemical, and optical challenges.
Burnham, A.J., Macgregor, J.F. & Viveros, R. (1999).
A statistical framework for multivariate latent
variable regression methods based on maximum
Journal of Chemometrics, 13, 49-65.
A statistical framework is developed to contrast methods used for parameter estimation for a
latent variable multivariate regression (LVMR) model. This model involves two sets of
variables, X and Y, both with multiple variables and sharing a common latent
structure with additive random errors. The methods contrasted are partial least squares (PLS)
regression, principal component regression (PCR), reduced rank regression (RRR) and canonical
co-ordinate regression (CCR). The framework is based on a constrained maximum likelihood
analysis of the model under assumptions of multivariate normality. The constraint is that the
estimates of the latent variables are restricted to be linear functions of the X
variables, which is the form of the estimates for the methods being contrasted. The resulting
framework is a continuum regression that goes from RRR to PCR depending on the ratio of error
variances in the X and Y spaces. PLS does not arise as a member of the continuum;
however, the method does offer some insight into why PLS would work well in practice. The
constrained maximum likelihood result is also compared with the unconstrained maximum
likelihood analysis to investigate the impact of the constraint. The results are illustrated on
a simulated example.
Burnham, A. J., Viveros, R., & Macgregor, J. F. (1996).
Frameworks for latent multivariate regression.
Journal of Chemometrics, 10, 31-45.
A set of frameworks for latent variable
multivariate regression method is developed. The
first two of these frameworks describe the
objective functions satisfied by the latent
variables chosen in canonical coordinates
regression (CCR), reduced rank regression (RRR)
and SIMPLS. These frameworks show the methods as a
natural progression from CCR (maximizing
correlation) to SIMPLS (maximizing covariance) via
RRR (which is an intermediate method). These
frameworks are unique in that they look at these
methods in terms of latent variables in both the
X- and Y-spaces. This adds insight to the nature
of the latent variables being chosen. These
frameworks are then extended to include PLS for
latent variables beyond the first component. This
new framework provides a detailed description of
the objective function satisfied by PLS latent
variables for the multivariate case. It also
includes CCR, RRR and SIMPLS, allowing comparisons
between the methods. A further framework suggests
a new method, undeflated PLS (UDPLS), which adds
insight to the effect of the deflation process on
PLS. The impact of the objective functions on each
of the methods is illustrated on real data from a
mineral sorting plant.
Bylund, D., Danielsson, R., Malmquist, G., & Markides, K. E. (2002).
Chromatographic alignment by warping and dynamic programming as a pre-processing
tool for PARAFAC modelling of liquid chromatography-mass spectrometry data.
Journal of Chromatography A, 961, 237-244.
Solutes analysed with LC-MS are characterised by their retention times
and mass spectra, and quantified by the intensities measured. This highly selective
information can be extracted by multiway modelling. However, for full use and
interpretability it is necessary that the assumptions made for the model are valid. For
PARAFAC modelling, the assumption is a trilinear data structure. With LC-MS, several factors,
e.g. non-linear detector response and ionisation suppression may introduce deviations from
trilinearity. The single largest problem, however, is the retention time shifts not related
to the true sample variations. In this paper, a time warping algorithm for alignment of LC-MS
data in the chromatographic direction has been examined. Several refinements have been
implemented and the features are demonstrated for both simulated and real data. With moderate
time shifts present in the data, pre-processing with this algorithm yields approximately
trilinear data for which reasonable models can be made.
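The general idea of aligning two chromatographic traces by dynamic programming can be sketched with textbook dynamic time warping. This is a generic illustration under that assumption; it does not include the refinements examined in the paper, and the function name and the two traces are invented.

```python
def dtw_path(a, b):
    """Textbook dynamic time warping between two 1-D traces.

    Returns the accumulated squared difference along the best warping path
    and the path itself as a list of (index in a, index in b) pairs.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Allowed moves: diagonal match, or stretch one of the traces.
            cost[i][j] = d + min(cost[i - 1][j - 1],
                                 cost[i - 1][j],
                                 cost[i][j - 1])
    # Walk back from the end to recover the alignment.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path

# Invented traces: the same peak, shifted by one time point.
reference = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
shifted = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0]
total, path = dtw_path(reference, shifted)
print(total, path)
```

The recovered path maps each time point of one trace to one or more points of the other. Resampling a trace along such a path is one way a shifted chromatogram could be brought onto a common time axis before a trilinear model is fitted.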
Institute of Education and Child Studies |
The Three-Mode Company |
Faculty of Social and Behavioural Sciences, Leiden University
The Three-Mode Company, Leiden, The Netherlands
First version : 12/02/1997;