IV. STATISTICAL ANALYSIS
A. Statistical Models and Data Analysis
1. Linear Models (Regression,
Analysis of Variance, Analysis of Covariance)
- Afifi, A.A., & Clark, V. Computer-Aided Multivariate
Analysis, 3rd ed. London: Chapman and Hall, 1996.
The book describes statistical techniques that are useful
for researchers and presents them in a way understandable to people
who have limited knowledge of statistics. The book describes the basic
models behind all the analyses and discusses the underlying assumptions.
Each chapter includes a discussion of computer programs that are suitable
for performing a particular analysis. The authors also discuss data
entry, database management, data screening, and data transformations,
as well as multivariate data analysis. This edition also contains
a new chapter on log-linear analysis of multiway frequency tables.
- Cohen, J., & Cohen, P. Applied Multiple Regression/Correlation
Analysis for the Behavioral Sciences, 2nd ed. Hillsdale: Lawrence
Erlbaum Associates, 1983.
The book is intended to be a text for courses in multiple
regression/correlation (MRC) methods. It offers a thorough treatment
of the techniques and an integrated conceptual system on these methods.
Part I discusses basics. Chapter 1 introduces multiple regression/correlation
analysis as a general data-analytic system. Chapters 2 and 3 provide
detailed discussions on bivariate and multiple regression/correlation.
Chapter 4 considers sets of independent variables as units of analysis.
Part II deals with use of different measurement scales in MRC. It
also discusses the handling of missing data and interaction. Part
III discusses various applications of MRC: causal models in MRC, ANCOVA
via MRC, repeated measurement and matched subjects designs, and MRC and
other multivariate methods.
- Cook, R.D., & Weisberg, S. Residuals and Influence
in Regression. London: Chapman and Hall, 1982.
The book contains a comprehensive account of diagnostic
methods for detecting inadequacies in the fit statistics in data analyses
based on linear regression models. The authors also extend their discussion
of the methods to more complicated problems. The use of graphical
procedures is emphasized, and most techniques discussed are illustrated
in over 35 examples with more than 50 figures. The book is written
at an intermediate level and is appropriate for those who are familiar
with, or currently are learning, the methods of standard linear regression.
- Draper, N.R., & Smith, H. Applied Regression
Analysis, 3rd ed. New York: John Wiley & Sons, 1998.
The classic text on regression analysis offers a clear,
thorough presentation of concepts and applications as well as a complete,
easily accessible introduction to the fundamentals of regression analysis.
The reader only requires basic knowledge of elementary statistics.
The text focuses on the fitting and checking of both linear and nonlinear
regression models, using small and large data sets, with pocket calculators
or computers. This latest edition of the book features separate chapters
on multicollinearity, generalized linear models, mixture ingredients,
geometry of regression, robust regression, and resampling procedures.
- Elashoff, J.D. Analysis of covariance: A delicate instrument.
American Educational Research Journal 6:383-401, 1969.
This article gives a clear and comprehensive exposition
of the technique of ANCOVA. It discusses what it really does and does
not do, the advantages and limitations of the technique, and the conditions
the data must satisfy for covariance analysis to be a valid technique.
- Hawkins, D.M. Identification of Outliers. London:
Chapman and Hall, 1980.
The book is a comprehensive and integrated assessment
of the methods of identifying statistical outliers in a mass of data.
It brings together existing findings in the field, and it emphasizes
the optimal procedures. The author describes new concepts and test
procedures and compares them to those that they supersede. An extremely
useful feature of the book is its tables of fractiles of a comprehensive
set of optimal outlier test statistics, some of which are defined
for the first time.
- Huitema, B.E. The Analysis of Covariance and Alternatives.
New York: John Wiley & Sons, 1980.
The book provides research workers in the behavioral
and biological sciences with an applied and comprehensive treatment
of the analysis of covariance and other alternative procedures. It
is divided into three parts. Part I includes a brief review of statistical
inference through simple analysis of variance and regression, followed
by a diagrammatic illustration of the underlying rationale of ANCOVA.
Then the general linear regression approach to ANOVA and ANCOVA is
introduced. This part also deals with multiple comparison procedures,
assumptions, design and interpretation problems. Part II covers varieties
of ANCOVA, such as procedures that deal with multiple covariates,
nonlinearity, multiple factors, and multiple dependent variables.
Part III presents alternatives to standard parametric ANCOVA (e.g.,
rank ANCOVA, Johnson-Neyman techniques, and true-score ANCOVA) and
standardized change score analysis.
- Morrison, D.F. Applied Linear Statistical Methods.
Englewood Cliffs, NJ: Prentice Hall, 1983.
Linear statistical inference encompasses the fitting
of lines and planes by least squares, the analysis of variance for
experimental designs, correlation, traditional multivariate analysis,
and some of the time series analyses. The book covers some techniques
from these areas with the emphasis on the underlying assumptions,
mathematical models, and applications of the methods.
- Morrison, D.F. Multivariate Statistical Methods,
3rd ed. New York: McGraw-Hill, 1990.
The book is written to provide investigators in the
life and behavioral sciences with an elementary source for multivariate
techniques. It is also meant to be a textbook for graduate courses
on multivariate statistical methods. The book provides a review of
essential univariate statistical concepts in the first chapter, and
the second chapter covers matrix algebra. Chapter 3 discusses standard
results on multinormal distribution, the estimation of its parameters,
and correlation analysis. All these are essential background for the
understanding of multivariate statistics. Later chapters deal with
various multivariate analyses, from hypothesis testing of means and MANOVA
to discriminant functions. Chapter 7 deals with inferences from covariance
matrices. Chapters 8 and 9 deal with principal components and factor
- Neter, J., Kutner, M.H., Nachtsheim, C.J., & Wasserman,
W. Applied Linear Statistical Models, 4th ed. Chicago: Irwin,
A key feature of this book is its presentation of application
of the linear statistical models in regression, analysis of variance,
and experimental design. The same notation is used for all three areas.
The notion of a general linear statistical model, in the context of
regression models, is carried over to analysis of variance and experimental
design models to bring out their relation to regression models. In
addition to more conventional topics in regression, ANOVA, and experimental
designs, the authors include topics that are often slighted, though
important in practice, such as the model-building process for regression
and the use of indicator variables. The book also emphasizes residual analysis
and other diagnostics for examining the appropriateness of a statistical
model, as well as remedial measures one can use when the model is
- Pedhazur, E.J. Multiple Regression in Behavioral
Research, 3rd ed. New York: Harcourt Brace College Publishers,
The major focus of the book is multiple regression and
its application. It also includes chapters on other multivariate analytic
techniques. Part I of the text, chapters 1 to 8, introduces the foundation
of multiple regression. It covers simple regression and diagnostics
in regression analysis, matrix operations, and partial and semipartial
correlation. Part II deals with the use of multiple regression in
explanatory research. Chapters 9, 10, and 13 address the analyses
of designs with continuous variables. Chapters 11 and 12 describe
how to deal with categorical variables in multiple regression, in
particular the use of MR for ANOVA designs. Chapters 14 and 15 deal
with attribute-treatments-interaction (ATI) and ANCOVA designs. The
book also has chapters on multilevel analysis, logistic regression,
structural equation models, discriminant analysis, and canonical analysis.
- Tabachnick, B.G., & Fidell, L.S. Using Multivariate
Statistics. New York: HarperCollins, 1996.
The book discusses multivariate statistical techniques.
The authors discuss considerations involved in determining the most
appropriate technique, screening data for compliance, preparing follow-up
analyses, and preparing the results for journal publication. Each
chapter deals with each technique's specific research questions, assumptions,
and limitations. Small and large sample examples, special topics,
and results are included in each chapter. Topics covered in the book
include multiple regression, canonical correlation, multiway frequency
analysis, analysis of covariance, factor analysis, structural equation
modeling, and logistic regression.
2. Linear Models
(Mixed-Effects and Variance Components Models, Random Coefficient Models,
and Hierarchical/Multilevel Models)
- Bryk, A.S., & Raudenbush, S.W. Hierarchical
Linear Models: Applications and Data Analysis Methods. Newbury
Park, CA: Sage Publications, 1992.
The introductory text explicates the theory and use
of hierarchical linear models (HLM), through the use of working examples
and lucid explanations. The presentation remains reasonably nontechnical
by focusing on three general research purposes: improved estimation
of effects within an individual unit, estimating and testing hypotheses
about cross-level effects, and partitioning of variance and covariance
components among levels. The volume describes the use of both two- and
three-level models in organizational research, studies of individual
development, and meta-analysis applications, and concludes with a formal
derivation of the statistical methods used in the book.
- Goldstein, H., & McDonald, R.P. A general model
for the analysis of multilevel data. Psychometrika 53(4):455-467,
The authors developed a general model for the analysis
of multivariate multilevel data structures. They include special cases
like repeated measures designs, multiple matrix samples, multilevel
latent variable models, multiple time series, and variance and covariance
- Kreft, I.G.G., & De Leeuw, E.D. Introducing
Multilevel Modeling. Thousand Oaks, CA: Sage Publications, 1998.
The authors introduce the multilevel modeling approach
to researchers in social sciences. The book covers practical issues
and potential problems of doing multilevel analyses, and the authors'
approach is user-oriented, keeping formal mathematics and statistics
to a minimum. Other key features of the book include the use of worked
examples using real data sets, analyzed using the latest computer
package for multilevel modeling.
- Longford, N.T. Random Coefficient Models. Oxford:
Clarendon Press, 1993.
The book presents an elementary and systematic introduction
to modeling of between-cluster variation, with emphasis on substantive
interpretation. The text contains details of computational methods
for estimation with random coefficient models, as well as a number
of examples. Other than the basic random coefficient model, the book
also deals with models with multiple layers of nesting, measurement
error models for multilevel data, and a generalization of the random
coefficient models to the class of generalized linear models.
- Raudenbush, S.W. Educational applications of hierarchical
linear models: A review. Journal of Educational Statistics
This paper reviews use of hierarchical linear models
to deal with multilevel data in educational research. It discusses
the estimation of both within- and between-group parameters in these
models and reviews estimation theory and application of such models.
Also, the logic of these methods is extended beyond the paradigmatic
case to include research domains as diverse as panel studies, meta-analysis,
and classical test theory. Estimation theory is reviewed from Bayes
and empirical Bayes viewpoints, and the examples considered involve
data sets with two levels of hierarchy.
- Raudenbush, S.W., & Chan, W.S. Application of a
hierarchical linear model to the study of adolescent deviance in an
overlapping cohort design. J Consult Clin Psychol 61:941-951,
The paper illustrates the use of the hierarchical linear
models in assessing the psychometric properties of an instrument for
studying change, compares the adequacy of linear and curvilinear growth
models, controls for time invariant and time-varying covariates, and
links overlapping cohorts of data. The authors employ data on attitudes
toward deviance during adolescence.
- Searle, S.R., Casella, G., & McCulloch, C.E. Variance
Components. New York: Wiley, 1992.
The book is written for research workers who have interests
in the use of mixed models and variance components for statistically
analyzing data. For students the book is suitable for linear models
courses that include something on mixed models, variance components,
and prediction. The introductory chapter of the book describes fixed,
random, and mixed models and uses nine examples to illustrate them.
The following chapters describe the history of variance component estimation
and different methods of estimation in one-way classification with
or without balanced data. Chapters 4 and 5 deal with ANOVA estimation
in general. Chapter 6 covers ML and REML estimation, and chapter 7
describes the prediction of random effects using best prediction (BP),
best linear prediction (BLP), and best linear unbiased prediction
(BLUP). Chapters 8 through 12 of the book cover specialized topics,
such as computation of ML and REML estimates; Bayes estimation and
hierarchical models; and binary and discrete data.
3. Generalized Linear Models
(Log-Linear Models and Logit Models)
- Agresti, A. The Analysis of Ordinal Categorical
Data. New York: John Wiley, 1984.
The book provides an introduction to basic descriptive
and inferential methods for categorical data and gives thorough coverage
of later developments. Special emphasis is placed on interpretation
and application of methods including an integrated comparison of the
available strategies for analyzing ordinal data. The book also discusses
implementation of methods using computer packages such as SAS, SPSSx,
- Agresti, A. Categorical Data Analysis. New York:
The book describes the most important methods of analyzing
categorical data. It offers a unified presentation of modeling using
generalized linear models and emphasizes loglinear and logit modeling
techniques. Some special topics covered in the book include methods
for repeated measurement data, prescriptions for how ordinal variables
should be treated differently than nominal variables, derivations
of basic asymptotic and fixed-sample-size inferential methods, and
discussion of exact small sample procedures.
- Andersen, E.B. The Statistical Analysis of Categorical
Data, 3rd ed. Berlin: Springer-Verlag, 1994.
The book can be used as a textbook for a graduate course
in categorical data analysis. Topics covered include statistical
inference in categorical data analysis; log-linear models and generalized
linear models; two-way, three-way, and multidimensional contingency
tables; incomplete tables; separability and collapsibility; the logit
model and logistic regression analysis; models for the interactions;
correspondence analysis; latent structure analysis; and latent class
models. The treatment of statistical methods for categorical data
is based on development of models and on derivation of parameter
estimates, test quantities, and diagnostics for model departures.
All the introduced methods are illustrated by data sets and accompanied
- Bishop, Y.M.M., Fienberg, S.E., & Holland, P.W.
Discrete Multivariate Analysis: Theory and Practice. Cambridge,
MA: MIT Press, 1975.
The text deals with analyses of discrete multivariate
data, in particular those in the form of cross-classifications. Through
the presentation of parametric models, sampling schemes, basic theory,
practical examples, and advice on computation, the book serves as
a ready reference for various users. The authors start with a chapter
on structural models, then move on to chapters on maximum likelihood estimation
and methods of goodness of fit. They also talk about practical aspects
of model fitting and topics like incomplete tables, improved multinomial
estimators, asymptotic methods, Markov models, and some other procedures
useful for analyzing discrete multivariate data.
- Cliff, N. Ordinal Methods for Behavioral Data Analysis.
Mahwah, NJ: Lawrence Erlbaum Associates, Inc., 1996.
The book treats ordinal methods in an integrated way
rather than as a compendium of unrelated methods. It emphasizes that
the ordinal quantities are highly meaningful in their own right, not
just as stand-ins for more traditional correlations or analyses of
variance. In fact, since the ordinal statistics have desirable descriptive
properties of their own, the book treats them parametrically, rather
than nonparametrically. The author discusses how ordinal statistics
can be applied in a much wider set of research situations than has
usually been thought, and shows that they can often come closer to
answering the researcher's primary questions than traditional
- Clogg, C.C., & Shihadeh, E.S. Statistical Models
for Ordinal Variables. Thousand Oaks, CA: Sage Publications, 1994.
The book deals with the latest developments in methods
for analyzing ordinal data. It incorporates ordinal and even numerical
information into the classical log-linear analysis of multidimensional
contingency tables. It also builds on methods introduced by Goodman,
Haberman, Fienberg, and Clogg, and it presents them in a unifying
framework. The authors stress that the book is geared toward the
applications of new models and methods for analysis of ordinal variables
in the social sciences.
- DeMaris, A. A tutorial in logistic regression. Journal
of Marriage and the Family 57:956-968, 1995.
This article discusses some major uses of the logistic
regression model in social data analysis. To illustrate the use of
the technique, the author compares it to linear regression. He begins
with a discussion of the modeling of a binary dependent variable and
then shows the modeling of polytomous dependent variables, considering
cases in which the values are alternately unordered, then ordered.
Techniques are illustrated throughout using data from the 1993 General
Social Survey (GSS).
- Everitt, B.S. The Analysis of Contingency Tables,
2nd ed. London: Chapman and Hall, 1992.
This book gives a comprehensive account of the analysis
of contingency tables, written at a level suitable for the applied
researcher. In this new edition more material is included such as
logistic regression models for tables with ordered categories and
for response variables with more than two categories. A brief account
is also given on correspondence analysis, a recently developed technique.
The methods of analysis described in this book are relevant to research
workers and graduate students dealing with data from surveys, particularly
in the areas of psychiatry, social sciences, and psychology.
- Fleiss, J.L. Statistical Methods for Rates and Proportions.
New York: John Wiley, 1981.
The book is concerned with the analysis of categorical
data, with emphasis on applications to health sciences. It covers
theoretical and practical issues related to rates and proportions,
such as related probability theory, assessing significance in a fourfold
table, sample size determination, and randomization. The author then
discusses three different sampling methods and their analysis, namely
naturalistic or cross-sectional studies, prospective and retrospective
studies, and controlled comparative studies. Other topics covered
include analysis of data from matched samples, comparison of proportions
from many samples, combining evidence from fourfold tables, measurement
and control of misclassification errors, and standardization of rates.
- Goodman, L.A. Analyzing Qualitative/Categorical
Data. Cambridge, MA: Abt Books, 1978.
The book consists of a collection of papers written
by Leo Goodman, who led the early development of log-linear models.
It covers readings on both log-linear models and latent-structure
analysis. It has five parts: (1) the logit model; (2) the general
log-linear model; (3) Davis on Goodman's approach; (4) latent
structure and scaling models; and (5) some extensions to the Goodman
- Goodman, L.A. The Analysis of Cross-Classifications
Having Ordered Categories. Cambridge, MA: Harvard University Press.
The book is a collection of papers written by Leo Goodman
on the analysis of ordinal data. It also includes work by Cliff Clogg,
which describes the applications of association models (chapter 8)
and the analysis of multiway cross-classifications having ordered
categories (chapter 9). Chapters 1 to 4 of the book deal with the
use of log-linear models in three different contexts: the analysis
of the joint distribution in a cross-classification, the analysis
of dependence, and the analysis of association. Chapters 5 and 6 develop
further analysis of association, in comparison to earlier models developed
by Karl Pearson and R.A. Fisher. Chapters 7, 8, and 9 provide examples
of application of association models.
- Hagenaars, J.A. Categorical Longitudinal Data.
Newbury Park, CA: Sage, 1990.
This book focuses on the analysis of categorical data
obtained at a few discrete points in time, and the log-linear model occupies
a central position in the book. Special attention is paid to log-linear
models with latent variables. After a short introductory chapter on
the types of analyses of social change, chapter 2 describes the essential
features in log-linear models, and chapter 3 talks about log-linear
models with latent variables. Chapters 4 to 7 form the core of the
book, and they touch on panel analysis and trend and cohort analysis.
At the end of the book, chapter 8 summarizes the author's main arguments
and presents several computer programs that implement ideas in the
- Hosmer, D.W., Jr., & Lemeshow, S. Applied Logistic
Regression. New York: Wiley, 1989.
The book is the first focused introduction to the logistic regression
model. The model is developed from a linear regression point of view,
rather than by means of contingency tables.
Emphasis is placed on effective modeling strategies, including variable
selection methods and the interpretation and presentation of results.
The book also covers topics like logistic regression diagnostics.
It further discusses the application of the method with different
sampling models and its use in matched case-control studies. The last
chapter is devoted to special topics on polytomous logistic regression
and the application of logistic regression to survival data.
- Lindsey, J.K. Modelling Frequency and Count Data.
Oxford: Clarendon Press, 1995.
The book is structured around the distinction between
independent events occurring to different individuals, resulting in
frequencies, and repeated events occurring to the same individuals,
yielding counts. It presents standard as well as more recently developed
models of categorical data. The author also demonstrates that much
of modern statistics can be seen as special cases of categorical data
models; both generalized linear models and proportional hazards models
can be fitted as log-linear models. More specialized topics such as
Markov chains, over-dispersion, and random effects are also covered.
- Sobel, M.E. The analysis of contingency tables. In:
Arminger, G., Clogg, C.C., & Sobel, M.E., eds. A Handbook for
Statistical Modeling in the Social and Behavioral Sciences. New
York: Plenum, 1992, pp. 252-303.
The chapter discusses in detail various forms of log-linear
models. It starts with a brief history of the development of log-linear
models. Section 2 uses several examples to introduce the reader to
the log-linear model. Section 3 discusses the use of the odds ratio
as a measure of association in two-way and three-way tables. Section
4 introduces models for the two-way table, Section 5 extends the discussion
to three-way tables, and Section 6 takes up the case of higher-way
tables. Section 7 discusses estimation theory for the models. Section
8 discusses residuals and model selection procedures, and Section
9 discusses computer programs that can be used to fit the models considered
- von Eye, A., & Clogg, C.C. Categorical Variables
in Developmental Research: Methods of Analysis. San Diego: Academic
The volume presents methods for analysis of categorical
data in developmental research. The book covers a broad range of methods,
concepts, and approaches. It is divided into four sections: (1) measurement
and repeated observations of categorical data; (2) catastrophe theory;
(3) latent class and log-linear models; and (4) applications.
- von Eye, A., & Niedermeier, K.E. Statistical
Analysis of Longitudinal Categorical Data in the Social and Behavioral
Sciences: An Introduction with Computer Illustrations. Mahwah,
NJ: Lawrence Erlbaum Associates, 1999.
The book provides a comprehensive resource for analyzing
a variety of categorical data. It emphasizes the application of many
recent advances in longitudinal categorical statistical methods. Each
chapter provides basic methodology, helpful applications, examples
using data from all fields of the social sciences, computer tutorials,
and exercises. After defining categorical data and reviewing the basics
of log-linear modeling, the book examines log-linear modeling for
repeated observations, chi-square partitioning, prediction analysis,
and configural frequency analysis.
4. Latent Variable Models
- Bartholomew, D.J. Latent Variable Models and Factor
Analysis. London: Griffin, 1987.
The book offers a unified treatment of latent variable
models, which include factor analysis, latent class, and latent trait
analysis. After earlier chapters describe these models, chapter 4
of the book discusses common elements of these models and the sufficiency
principle. Later chapters deal with models and methods for binary
and polytomous data.
- Comrey, A.L., & Lee, H.B. A First Course in
Factor Analysis, 2nd ed. Hillsdale, NJ: Lawrence Erlbaum Associates,
The book introduces readers to the theory and practice
of factor analysis. Step by step, the authors first describe the derivation
and assumptions of the factor analytic model. Then they examine various
methods of factor extraction and rotations. Chapters 8 and 9 describe
various designs in factor analysis. Chapter 10 deals with the interpretation
and application of the analytic results. The authors then illustrate
the use of factor analysis in one of their projects. The last three
chapters of the book deal with confirmatory factor analysis, structural
equation models, and computer programs that handle factor analysis.
- Graham, J.W., & Collins, N.L. Controlling correlational
bias via confirmatory factor analysis of MTMM data. Multivariate
Behav Res 26:607-629, 1992.
The two commonly used CFA analyses carried out on MTMM
data either average the various measures of each trait or estimate
only relationships between MTMM traits and the outside variables.
The authors show that both methods produce equally highly biased parameter
estimates when the actual correlations between MTMM method factors
and the outside variables are substantial. An algebraic explanation
and a simulated data illustration are given for the bias due to misspecification.
Also, the problem is illustrated with a brief empirical example. Implications
for applied research are discussed.
- Gorsuch, R.L. Factor Analysis, 2nd ed. Hillsdale,
NJ: Lawrence Erlbaum Associates, 1983.
The text is meant to be both a textbook for graduate
students and a reference on factor analysis. It focuses on
when and how to use the technique. Derivations of the mathematical
models used in factor analysis are given. After the introduction in
chapter 1, chapters 2, 3, 6, 8, 9, and 11 cover exploratory factor
analysis, chapter 7 discusses the use of canonical correlations to
test hypotheses, and chapters 12 and 16 discuss the relevance of scoring
techniques and replicability for all multivariate techniques. Chapters
17 and 18 provide a final overview of when each of the multivariate
techniques should be used.
- Long, J.S. Confirmatory Factor Analysis: A Preface
to LISREL. Beverly Hills, CA: Sage Publications, 1983.
The book presents the basic CFA equations and assumptions.
It provides a thorough discussion of identification in such models
and compares various methods of statistical estimation, including
unweighted least squares, generalized least squares, and maximum likelihood
methods. The author focuses on two basic applications of the CFA,
the first a general discussion of its application to the multimethod-multitrait
model, and the second a discussion of a specific model of psychological
disorders. The theoretical advantages of the confirmatory over the
exploratory model are emphasized and demonstrated.
- McDonald, R.P. Factor Analysis and Related Methods.
Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers, 1985.
The author introduces the readers to the technique of
factor analysis in a nonmathematical manner. The book begins with
a chapter that covers basic and key concepts of common factor analysis,
followed by chapters on exploratory common factor analysis and confirmatory
factor analysis. In chapters 4 and 7, the author extends his discussion
to models for linear structural relations and item response theory.
In chapters 5 and 6, he deals with the problem of factor scores and
the problems of relationships between factor analyses.
5. Latent Variable Models
(Structural Equation Modeling)
- Arminger, G., & Schoenberg, R.J. Pseudo maximum
likelihood estimation and a test for misspecification in mean and
covariance structure models. Psychometrika 54:409-425, 1989.
The paper discusses the use of pseudo maximum likelihood
in structural equation modeling when mean and covariance structure
models are misspecified. In that case the assumption of multivariate
normality of the variables is violated and the ML method of estimation
is no longer valid. At the same time, the LR, score, and Wald test statistics
will not converge to a central chi-square distribution. The authors also propose
a Hausman-type test against this form of misspecification.
- Bentler, P.M. Comparative fit indexes in structural
models. Psychol Bull 107:238-246, 1990.
The author proposes the use of a coefficient that summarizes
the relative reduction in the noncentrality parameters of two nested
models. Two estimators of the coefficient yield new normed (CFI)
and nonnormed (FI) fit indexes. CFI avoids the underestimation
of fit often noted in small samples for Bentler and Bonett's
(1980) normed fit index (NFI). FI is a linear function of Bentler
and Bonett's NNFI that avoids the extreme underestimation and
overestimation often found in NNFI. The author provides an example
that illustrates the behavior of these indexes under conditions of
correct specification and misspecification.
- Bollen, K.A. Total, direct, and indirect effects in
structural equation models. In: Clogg, C., ed. Sociological Methodology
1987. San Francisco: Jossey-Bass, 1987, pp. 37-68.
In this paper, Bollen reviews the decomposition of effects
in structural equation models. He also clarifies the definition
of total effects and the alternative meanings of specific indirect
effects and the techniques of calculating them. The paper also proposes
a general definition of specific effects, a definition that includes
the effects transmitted by any path or combination of paths in a model.
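The decomposition of effects can be sketched for a simple case. The
example below assumes a recursive path model (no feedback loops) whose
direct effects are stored in a square matrix, with entry [i][j] holding
the direct effect of variable j on variable i. Under that assumption the
total effects are the sum of the powers of the direct-effect matrix, and
the indirect effects are the total effects minus the direct effects. The
function names, the three-variable model, and the coefficients are
hypothetical and are not taken from the paper.

```python
def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def decompose_effects(direct):
    """Return (total, indirect) effect matrices for a recursive model.

    direct[i][j] is the direct effect of variable j on variable i.
    For a recursive model, powers of the matrix vanish after n - 1 steps,
    so the finite sum direct + direct^2 + ... gives the total effects.
    """
    n = len(direct)
    total = [row[:] for row in direct]
    power = [row[:] for row in direct]
    for _ in range(n - 1):
        power = mat_mul(power, direct)  # effects carried by longer paths
        total = [[total[i][j] + power[i][j] for j in range(n)]
                 for i in range(n)]
    indirect = [[total[i][j] - direct[i][j] for j in range(n)]
                for i in range(n)]
    return total, indirect


# Hypothetical model: x1 -> x2 (0.5), x2 -> x3 (0.4), x1 -> x3 (0.3).
direct_effects = [
    [0.0, 0.0, 0.0],
    [0.5, 0.0, 0.0],
    [0.3, 0.4, 0.0],
]
total_effects, indirect_effects = decompose_effects(direct_effects)
```

In this made-up model the indirect effect of x1 on x3 is the product of
the two coefficients along the path through x2, and the total effect adds
the direct coefficient to it. Nonrecursive models would instead require a
matrix inverse and a convergence condition, which this sketch does not cover.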
- Bollen, K.A. Structural Equations with Latent Variables.
New York: John Wiley & Sons, Inc., 1989.
The book offers a comprehensive treatment of structural
equation (LISREL) models. It can be used as an introduction as well
as a more advanced reference. The author begins by discussing simple
models and then builds toward the general model. The text treats procedures
such as path analysis, recursive and nonrecursive models, classical
econometrics, and confirmatory factor analysis as special cases of
a common model.
- Bollen, K.A., & Long, J.S. Testing Structural
Equation Models. Thousand Oaks, CA: Sage Publications, 1993.
The book focuses on testing model fit and respecification
in structural equation modeling. Chapters in the book were written
by researchers in the field who have played major roles in shaping
the debate over the two major steps in SEM. These authors wrote about
the use and evaluation of different kinds of goodness-of-fit indices
in SEM models. The book also includes chapters that discuss model
selection and power evaluation in SEM.
- Browne, M.W. Asymptotically distribution-free methods
for the analysis of covariance structures. Br J Math Stat Psychol
The author derives methods for obtaining tests of fit
of structural models for covariance matrices and estimator standard
errors that are asymptotically distribution-free. He also provides
modifications to standard normal theory tests and standard errors,
which make them applicable to the wider class of elliptical distributions.
The proposed methods were examined by conducting a random sampling
- Hayduk, L.A. Structural Equation Modeling with LISREL:
Essentials and Advances. Baltimore: The Johns Hopkins University
The book introduces structural equation modeling to
readers who have no experience with the technique. In the first four
chapters, the author goes over basic statistical concepts and skills
readers need for understanding SEM. Chapters 5 and 6 deal with estimation
and testing of model fit. Chapters 7, 8, 9, and 10 discuss how to interpret
LISREL results and how to fit multiple-group models and means models.
- Hoyle, R.H., ed. Structural Equation Modeling: Concepts,
Issues, and Applications. Thousand Oaks, CA: Sage Publications,
The edited book includes chapters on major aspects of
the structural equation modeling approach to research design and data
analysis. Various authors contributed chapters that cover the following
topics: basic concepts and fundamental issues in SEM; model specification
and related issues; estimation and testing in SEM; SEM with nonnormal
variables; evaluating model fit; statistical power in SEM; objectivity
and reasoning in science and SEM; and various applications of SEM.
- Hoyle, R.H., & Smith, G.T. Formulating clinical
research hypotheses as structural equation models: A conceptual overview.
J Consult Clin Psych 62(3):429-440, 1994.
In the article, the authors provide a conceptual overview
of the strategies and issues associated with formulating and evaluating
various clinical research hypotheses as structural equation models.
The paper begins with a sketch of the structural equation modeling
approach to research design and data analysis. Then a series of clinical
research hypotheses in structural equation modeling terms are presented.
The authors conclude with a section on inferring causality from structural
- MacCallum, R.C., Roznowski, M., & Necowitz, L.B.
Model modifications in covariance structure analysis: The problem
of capitalization on chance. Psychol Bull 111(3):490-504, 1992.
The paper discusses in detail the issue of model modification
and explores it empirically through sampling studies using two large
sets of data. The process of model modification, which is used commonly
to improve model fit, is data-driven and so the results from such
procedures are susceptible to capitalization on chance characteristics
of the data. The authors found that over repeated samples, model modifications
may be very inconsistent and cross-validation results may behave erratically.
The authors recommend the use of alternative a priori models.
- Marcoulides, G.A., & Schumacker, R.E. Advanced
Structural Equation Modeling: Issues and Techniques. Mahwah, NJ:
Lawrence Erlbaum Associates, 1996.
The book introduces the latest issues and developments
in SEM techniques. The topics selected include models for multitrait-multimethod
(MTMM) matrix analysis, nonlinear structural equation
models, cross-domain analyses of change over time, structural time
series models, bootstrapping techniques in the analysis of mean and
covariance structure, limited information estimators, dealing with
incomplete data, problems with equivalent models, and an evaluation
of incremental fit indices.
- Maruyama, G.M. Basics of Structural Equation Modeling.
Thousand Oaks, CA: Sage Publications, 1998.
The book describes the logic underlying structural equation
modeling approaches, describes how SEM approaches relate to techniques
like regression and factor analysis, analyzes the strengths and shortcomings
of SEM as compared to alternative methodologies, and explores the
various methodologies for analyzing structural equation models.
Throughout the book, the author uses a single data set to demonstrate
a variety of techniques ranging from path analysis to panel analysis
to confirmatory analysis to latent variable structural equation modeling.
- Schumacker, R.E., & Lomax, R.G. A Beginner's
Guide to Structural Equation Modeling. Mahwah, NJ: Lawrence Erlbaum
The book introduces students and researchers to the
technique of structural equation modeling. The authors focus on the
conceptual steps involved in analyzing theoretical models, including
theory- or research-driven model specification, parameter estimation,
model testing, interpretation of fit indices, and respecification
of the model. Two popular software packages, EQS5 and LISREL8-SIMPLIS, are
used in data examples throughout the book.
- Sörbom, D. Structural equation models with structured
means. In: Jöreskog, K.G., & Wold, H., eds. Systems Under Indirect
Observation: Causality, Structure, Prediction. Vol. 1. Amsterdam:
North-Holland, 1982, pp. 183-195.
The paper discusses how to handle multiple group analysis
in LISREL. Such an approach allows one to compare groups of individuals,
e.g., to compare the means for certain constructs among these groups.
The paper uses the Head Start Summer Program data to illustrate the
analysis and describes the general model and its estimation in detail.
- Sörbom, D. Model modification. Psychometrika
The paper discusses the formulation of the "modification
index" in the LISREL program, which can be used as a guide in the
search for a "better" model in covariance structure analysis.
In statistical terms, the proposed index measures how much we will
be able to reduce the discrepancy between model and data, as defined
by a general fit function, when one parameter is added or freed or
when one equality constraint is relaxed. The index also takes into
account changes in all the parameters of the model when one particular
parameter is freed.
6. Classification (Cluster Analysis)
- Arabie, P., Hubert, L.J., & De Soete, G. Clustering
and Classification. Singapore: World Scientific, 1996.
The edited book deals with theories and applications
on classification. It is a compendious scholarly review of the field
by some of its eminent contributors. The chapter on "combinatorial
data analysis" includes the field of clustering apart from probabilistic
approaches. It is followed by chapters on "hierarchical models,"
"complexity theory," and "neural networks." Later
chapters cover topics on clustering validation by simulation, statistical
inference on cluster analysis, cluster analysis in Japan, and clustering
and multidimensional scaling in Russia. The last chapters of the book
review work on two-way clustering of 0-1 data and the fitting of tree
models and network models.
- Blashfield, R.K., & Aldenderfer, M.S. The methods
and problems of cluster analysis. In: Nesselroade & Cattell, eds.
Handbook of Multivariate Experimental Psychology. New York:
Plenum Press, 1988, pp. 447-473.
The chapter is an overview of cluster analysis. After
briefly describing the history of the development of the technique,
the authors go into detail about the various cluster analysis methods
that are commonly used by researchers. Then the authors discuss the
concept of similarity and conclude their chapter with some unresolved
problems of cluster analysis and future direction of research in the
- Everitt, B. Cluster Analysis. New York: Halsted
The text provides a nonmathematical account of the techniques
of cluster analysis. After reviewing the general purpose of conducting
cluster analysis, the choice of variable, and the measurement of similarity
and distance, the author reviews some of the clustering techniques,
followed by a discussion of the problems of cluster analysis and an
empirical investigation of some methods of cluster analysis. The author
concludes by comparing the advantages and disadvantages of different
techniques and makes suggestions on using clustering techniques in
- Bergman, L.R. You can't classify all of the people
all of the time. Multivariate Behavioral Research 23:425-441,
When performing a classification study, it is sometimes
a sound strategy not to classify all subjects but to leave a residue
of unclassified entities to be analyzed separately. Starting from
an interactional paradigm, theoretical reasons for this approach are
given. The method RESIDAN, which uses a residue, is presented. It
is argued that the concept of antitype (rare pattern) has theoretical
significance and could be studied within the presented framework.
- Bergman, L.R. A pattern-oriented approach to studying
individual development: Snapshots and processes. In: Cairns, R.B.,
Bergman, L.R., & Kagan, J., eds. The Individual as a Focus
in Developmental Research. New York: Sage Publications, 1996.
The implications of a person-oriented perspective for
the study of individual development are discussed and various methodological
solutions are suggested. Cluster analysis procedures are emphasized,
and both a direct longitudinal approach and a cross-sectional approach
followed by linking of the results at adjacent time points are presented.
The program LICUR was used, and steps for using it are described.
LISREL is also used in the paper to analyze the data, and the results
are compared with those from LICUR.
- Bergman, L.R., & Wangby, M. The teenage girl: Patterns
of self-reported adjustment problems and some correlates. International
Journal of Methods in Psychiatric Research 5:171-188, 1995.
This article presents a pattern approach to the study
of teenage girls' adjustment problems, analyzing data concerning
519 15-year-old girls included in the Swedish longitudinal research
program, "Individual Development and Adjustment." The girls'
profiles, as given by five self-reported adjustment problem indicators,
are analyzed within a cluster analytic framework using the RESIDAN
rationale, with due attention being paid to outliers and the importance
of identifying a residue. Twelve clusters are identified. Some general
features of the pattern approach are discussed.
- Kaufman, L., & Rousseeuw, P.J. Finding Groups
in Data: An Introduction to Cluster Analysis. New York: John Wiley
& Sons, 1990.
This is an applied book on cluster analysis for general
users or people who do not have a strong mathematical or statistical
background. The first chapter of the book introduces the main approaches
to clustering. Chapters 2 to 4 discuss partitioning methods. Chapters
5 to 7 cover hierarchical techniques.
- Milligan, G.W. An examination of the effect of six
types of error perturbation on fifteen clustering algorithms. Psychometrika
An evaluation of several clustering methods was conducted.
Artificial clusters that exhibited properties of internal coherence
and external isolation were constructed. The true cluster structure
was subsequently hidden by six types of error-perturbation. The results
indicate that the hierarchical methods were differentially sensitive
to the type of error perturbation. In addition, generally poor recovery
performance was obtained when random seed points were used to start
the K-means algorithms. However, two alternative starting procedures
for the nonhierarchical methods produced greatly enhanced cluster
recovery and were found to be robust with respect to all of the types
of error examined.
- Milligan, G.W. A review of Monte Carlo tests of cluster
analysis. Multivariate Behavioral Research 16:379-407, 1981.
A review of Monte Carlo validation studies of clustering
algorithms is presented. Several validation studies have tended to
support the view that Ward's minimum variance hierarchical method
gives the best recovery of cluster structure. However, a more complete
review of the validation literature on clustering indicates that other
algorithms may provide better recovery under a variety of conditions.
Applied researchers are cautioned concerning the uncritical selection
of Ward's method for empirical research. Alternative explanations
for the differential recovery performance are explored, and recommendations
are made for future Monte Carlo experiments.