The Science of Drug Abuse and Addiction from the National Institute on Drug Abuse Archives of the National Institute on Drug Abuse web site
Assessing the Impact of Childhood Interventions
on Subsequent Drug Use
Chi-Ming Kam
Linda M. Collins

Annotated Bibliography on Research Methods

Kam & Collins

A. Statistical Models and Data Analysis

1. Linear Models (Regression, Analysis of Variance, Analysis of Covariance)

  1. Afifi, A.A., & Clark, V. Computer-Aided Multivariate Analysis, 3rd ed. London: Chapman and Hall, 1996.

The book describes statistical techniques that are useful for researchers and presents them in a way understandable to people who have limited knowledge of statistics. The book describes the basic model behind each analysis and discusses the underlying assumptions. Each chapter includes a discussion of computer programs that are suitable for performing a particular analysis. The authors also discuss data entry, database management, data screening, and data transformations, as well as multivariate data analysis. This edition also contains a new chapter on log-linear analysis of multiway frequency tables.

  1. Cohen, J., & Cohen, P. Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences, 2nd ed. Hillsdale: Lawrence Erlbaum Associates, 1983.

The book is intended to be a text for courses in multiple regression/correlation (MRC) methods. It offers a thorough treatment of the techniques and an integrated conceptual system for these methods. Part I discusses basics. Chapter 1 introduces multiple regression/correlation analysis as a general data-analytic system. Chapters 2 and 3 provide detailed discussions of bivariate and multiple regression/correlation. Chapter 4 considers sets of independent variables as units of analysis. Part II deals with the use of different measurement scales in MRC. It also discusses the handling of missing data and interaction. Part III discusses various applications of MRC: causal models in MRC, ANCOVA via MRC, repeated measurement and matched subjects designs, and MRC in relation to other multivariate methods.

  1. Cook, R.D., & Weisberg, S. Residuals and Influence in Regression. London: Chapman and Hall, 1982.

The book contains a comprehensive account of diagnostic methods for detecting inadequacies in the fit statistics in data analyses based on linear regression models. The authors also extend their discussion of the methods to more complicated problems. The use of graphical procedures is emphasized, and most techniques discussed are illustrated in over 35 examples with more than 50 figures. The book is written at an intermediate level and is appropriate for those who are familiar with, or currently are learning, the methods of standard linear regression.

  1. Draper, N.R., & Smith, H. Applied Regression Analysis, 3rd ed. New York: John Wiley & Sons, 1998.

The classic text on regression analysis offers a clear, thorough presentation of concepts and applications as well as a complete, easily accessible introduction to the fundamentals of regression analysis. The reader only requires basic knowledge of elementary statistics. The text focuses on the fitting and checking of both linear and nonlinear regression models, using small and large data sets, with pocket calculators or computers. This latest edition of the book features separate chapters on multicollinearity, generalized linear models, mixture ingredients, geometry of regression, robust regression, and resampling procedures.

  1. Elashoff, J.D. Analysis of covariance: A delicate instrument. American Educational Research Journal 6:383-401, 1969.

This article gives a clear and comprehensive exposition of the technique of ANCOVA. It discusses what the technique really does and does not do, its advantages and limitations, and the conditions the data must satisfy for covariance analysis to be valid.

  1. Hawkins, D.M. Identification of Outliers. London: Chapman and Hall, 1980.

The book is a comprehensive and integrated assessment of the methods of identifying statistical outliers in a mass of data. It brings together existing findings in the field, and it emphasizes the optimal procedures. The author describes new concepts and test procedures and compares them to those that they supersede. An extremely useful feature of the book is its set of tables of fractiles for a comprehensive range of optimal outlier test statistics, some of which are defined for the first time.

  1. Huitema, B.E. The Analysis of Covariance and Alternatives. New York: John Wiley & Sons, 1980.

The book provides research workers in the behavioral and biological sciences with an applied and comprehensive treatment of the analysis of covariance and alternative procedures. It is divided into three parts. Part I includes a brief review of statistical inference through simple analysis of variance and regression, followed by a diagrammatic illustration of the underlying rationale of ANCOVA. Then the general linear regression approach to ANOVA and ANCOVA is introduced. This part also deals with multiple comparison procedures, assumptions, and design and interpretation problems. Part II covers varieties of ANCOVA, such as procedures that deal with multiple covariates, nonlinearity, multiple factors, and multiple dependent variables. Part III presents alternatives to standard parametric ANCOVA (e.g., rank ANCOVA, Johnson-Neyman techniques, and true-score ANCOVA) and standardized change score analysis.

  1. Morrison, D.F. Applied Linear Statistical Methods. Englewood Cliffs, NJ: Prentice Hall, 1983.

Linear statistical inference encompasses the fitting of lines and planes by least squares, the analysis of variance for experimental designs, correlation, traditional multivariate analysis, and some of the time series analyses. The book covers some techniques from these areas with the emphasis on the underlying assumptions, mathematical models, and applications of the methods.

  1. Morrison, D.F. Multivariate Statistical Methods, 3rd ed. New York: McGraw-Hill, 1990.

The book is written to provide investigators in the life and behavioral sciences with an elementary source for multivariate techniques. It is also meant to be a textbook for graduate courses on multivariate statistical methods. The first chapter reviews essential univariate statistical concepts, and the second chapter covers matrix algebra. Chapter 3 discusses standard results on the multinormal distribution, the estimation of its parameters, and correlation analysis. All of this is essential background for understanding multivariate statistics. Later chapters deal with various multivariate analyses, from hypothesis testing of means and MANOVA to discriminant functions. Chapter 7 deals with inferences from covariance matrices. Chapters 8 and 9 deal with principal components and factor analysis.

  1. Neter, J., Kutner, M.H., Nachtsheim, C.J., & Wasserman, W. Applied Linear Statistical Models, 4th ed. Chicago: Irwin, 1996.

A key feature of this book is its presentation of applications of linear statistical models in regression, analysis of variance, and experimental design. The same notation is used for all three areas. The notion of a general linear statistical model, in the context of regression models, is carried over to analysis of variance and experimental design models to bring out their relation to regression models. In addition to more conventional topics in regression, ANOVA, and experimental designs, the authors include topics that are often slighted, though important in practice, such as the model-building process for regression and the use of indicator variables. The book also emphasizes residual analysis and other diagnostics for examining the appropriateness of a statistical model, as well as remedial measures one can use when the model is not appropriate.

  1. Pedhazur, E.J. Multiple Regression in Behavioral Research, 3rd ed. New York: Harcourt Brace College Publishers, 1997.

The major focus of the book is multiple regression and its application. It also includes chapters on other multivariate analytic techniques. Part I of the text, chapters 1 to 8, introduces the foundation of multiple regression. It covers simple regression and diagnostics in regression analysis, matrix operation, and partial and semipartial correlation. Part II deals with the use of multiple regression in explanatory research. Chapters 9, 10, and 13 address the analyses of designs with continuous variables. Chapters 11 and 12 describe how to deal with categorical variables in multiple regression, in particular the use of MR for ANOVA designs. Chapters 14 and 15 deal with attribute-treatments-interaction (ATI) and ANCOVA designs. The book also has chapters on multilevel analysis, logistic regression, structural equation models, discriminant analysis, and canonical analysis.

  1. Tabachnick, B.G., & Fidell, L.S. Using Multivariate Statistics. New York: HarperCollins, 1996.

The book discusses multivariate statistical techniques. The authors discuss considerations involved in determining the most appropriate technique, screening data for compliance, preparing followup analyses, and preparing the results for journal publication. Each chapter deals with one technique's specific research questions, assumptions, and limitations. Small and large sample examples, special topics, and results are included in each chapter. Topics covered in the book include multiple regression, canonical correlation, multiway frequency analysis, analysis of covariance, factor analysis, structural equation modeling, and logistic regression.


2. Linear Models (Mixed-Effects and Variance Components Models, Random Coefficient Models, and Hierarchical/Multilevel Models)

  1. Bryk, A.S., & Raudenbush, S.W. Hierarchical Linear Models: Applications and Data Analysis Methods. Newbury Park, CA: Sage Publications, 1992.

The introductory text explicates the theory and use of hierarchical linear models (HLM), through the use of working examples and lucid explanations. The presentation remains reasonably nontechnical by focusing on three general research purposes—improved estimation of effects within an individual unit, estimating and testing hypotheses about cross-level effects, and partitioning of variance and covariance components among levels. The volume describes use of both two- and three-level models in organizational research and studies of individual development and meta-analysis applications and concludes with a formal derivation of the statistical methods used in the book.

  1. Goldstein, H., & McDonald, R.P. A general model for the analysis of multilevel data. Psychometrika 53(4):455-467, 1988.

The authors developed a general model for the analysis of multivariate multilevel data structures. They include special cases like repeated measures designs, multiple matrix samples, multilevel latent variable models, multiple time series, and variance and covariance component models.

  1. Kreft, I.G.G., & de Leeuw, J. Introducing Multilevel Modeling. Thousand Oaks, CA: Sage Publications, 1998.

The authors introduce the multilevel modeling approach to researchers in the social sciences. The book covers practical issues and potential problems of doing multilevel analyses, and the authors' approach is user-oriented, keeping formal mathematics and statistics to a minimum. Other key features of the book include worked examples based on real data sets, analyzed using the latest computer package for multilevel modeling.

  1. Longford, N.T. Random Coefficient Models. Oxford: Clarendon Press, 1993.

The book presents an elementary and systematic introduction to modeling of between-cluster variation, with emphasis on substantive interpretation. The text contains details of computational methods for estimation with random coefficient models, as well as a number of examples. Other than the basic random coefficient model, the book also deals with models with multiple layers of nesting, measurement error models for multilevel data, and a generalization of the random coefficient models to the class of generalized linear models.

  1. Raudenbush, S.W. Educational applications of hierarchical linear models: A review. Journal of Educational Statistics 13(2):85-116, 1988.

This paper reviews use of hierarchical linear models to deal with multilevel data in educational research. It discusses the estimation of both within- and between-group parameters in these models and reviews estimation theory and application of such models. Also, the logic of these methods is extended beyond the paradigmatic case to include research domains as diverse as panel studies, meta-analysis, and classical test theory. Estimation theory is reviewed from Bayes and empirical Bayes viewpoints, and the examples considered involve data sets with two levels of hierarchy.

  1. Raudenbush, S.W., & Chan, W.S. Application of a hierarchical linear model to the study of adolescent deviance in an overlapping cohort design. J Consult Clin Psychol 61:941-951, 1993.

The paper illustrates the use of the hierarchical linear models in assessing the psychometric properties of an instrument for studying change, compares the adequacy of linear and curvilinear growth models, controls for time invariant and time-varying covariates, and links overlapping cohorts of data. The authors employ data on attitudes toward deviance during adolescence.

  1. Searle, S.R., Casella, G., & McCulloch, C.E. Variance Components. New York: Wiley, 1992.

The book is written for research workers who have interests in the use of mixed models and variance components for statistically analyzing data. For students the book is suitable for linear models courses that include something on mixed models, variance components, and prediction. The introductory chapter of the book describes fixed, random, and mixed models and uses nine examples to illustrate them. The following chapters describe the history of variance component estimation and different methods of estimation in one-way classification with or without balanced data. Chapters 4 and 5 deal with ANOVA estimation in general. Chapter 6 covers ML and REML estimation, and chapter 7 describes the prediction of random effects using best prediction (BP), best linear prediction (BLP), and best linear unbiased prediction (BLUP). Chapters 8 through 12 of the book cover specialized topics, such as computation of ML and REML estimates; Bayes estimation and hierarchical models; and binary and discrete data.


3. Generalized Linear Models (Log-Linear Models and Logit Models)

  1. Agresti, A. The Analysis of Ordinal Categorical Data. New York: John Wiley, 1984.

The book provides an introduction to basic descriptive and inferential methods for categorical data and gives thorough coverage of later developments. Special emphasis is placed on interpretation and application of methods including an integrated comparison of the available strategies for analyzing ordinal data. The book also discusses implementation of methods using computer packages such as SAS, SPSSx, and GLIM.

  1. Agresti, A. Categorical Data Analysis. New York: Wiley, 1990.

The book describes the most important methods of analyzing categorical data. It offers a unified presentation of modeling using generalized linear models and emphasizes loglinear and logit modeling techniques. Some special topics covered in the book include methods for repeated measurement data, prescriptions for how ordinal variables should be treated differently than nominal variables, derivations of basic asymptotic and fixed-sample-size inferential methods, and discussion of exact small sample procedures.

  1. Andersen, E.B. The Statistical Analysis of Categorical Data, 3rd ed. Berlin: Springer-Verlag, 1994.

The book can be used as a textbook for a graduate course in categorical data analysis. Topics covered include statistical inference in categorical data analysis; log-linear models and generalized linear models; two-way, three-way, and multidimensional contingency tables; incomplete tables; separability and collapsibility; the logit model and logistic regression analysis; models for the interactions; correspondence analysis; latent structure analysis; and latent class models. The treatment of statistical methods for categorical data is based on development of models and on derivation of parameter estimates, test quantities, and diagnostics for model departures. All the introduced methods are illustrated by data sets and accompanied by exercises.

  1. Bishop, Y.M.M., Fienberg, S.E., & Holland, P.W. Discrete Multivariate Analysis: Theory and Practice. Cambridge, MA: MIT Press, 1975.

The text deals with analyses of discrete multivariate data, in particular those in the form of cross-classifications. Through the presentation of parametric models, sampling schemes, basic theory, practical examples, and advice on computation, the book serves as a ready reference for various users. The authors start with a chapter on structural models, then move on to chapters on maximum likelihood estimation and methods of assessing goodness of fit. They also discuss practical aspects of model fitting and topics like incomplete tables, improved multinomial estimators, asymptotic methods, Markov models, and some other procedures useful for analyzing discrete multivariate data.

  1. Cliff, N. Ordinal Methods for Behavioral Data Analysis. Mahwah, NJ: Lawrence Erlbaum Associates, Inc., 1996.

The book treats ordinal methods in an integrated way rather than as a compendium of unrelated methods. It emphasizes that the ordinal quantities are highly meaningful in their own right, not just as stand-ins for more traditional correlations or analyses of variance. In fact, since the ordinal statistics have desirable descriptive properties of their own, the book treats them parametrically, rather than nonparametrically. The author discusses how ordinal statistics can be applied in a much wider set of research situations than has usually been thought, and shows that they can often come closer to answering the researcher’s primary questions than traditional ones can.

  1. Clogg, C.C., & Shihadeh, E.S. Statistical Models for Ordinal Variables. Thousand Oaks, CA: Sage Publications, 1994.

The book deals with the latest developments in methods for analyzing ordinal data. It incorporates ordinal and even numerical information into the classical log-linear analysis of multidimensional contingency tables. It also builds on methods introduced by Goodman, Haberman, Fienberg, and Clogg, and it presents them in a unifying framework. The authors stress that the book is geared toward the applications of new models and methods for analysis of ordinal variables in the social sciences.

  1. DeMaris, A. A tutorial in logistic regression. Journal of Marriage and the Family 57:956-968, 1995.

This article discusses some major uses of the logistic regression model in social data analysis. To illustrate the use of the technique, the author compares it to linear regression. He begins with a discussion of the modeling of a binary dependent variable and then shows the modeling of polytomous dependent variables, considering cases in which the values are alternately unordered, then ordered. Techniques are illustrated throughout using data from the 1993 General Social Survey (GSS).

  1. Everitt, B.S. The Analysis of Contingency Tables, 2nd ed. London: Chapman and Hall, 1992.

This book gives a comprehensive account of the analysis of contingency tables, written at a level suitable for the applied researcher. In this new edition more material is included such as logistic regression models for tables with ordered categories and for response variables with more than two categories. A brief account is also given on correspondence analysis, a recently developed technique. The methods of analysis described in this book are relevant to research workers and graduate students dealing with data from surveys, particularly in the areas of psychiatry, social sciences, and psychology.

  1. Fleiss, J.L. Statistical Methods for Rates and Proportions. New York: John Wiley, 1981.

The book is concerned with the analysis of categorical data, with emphasis on applications to health sciences. It covers theoretical and practical issues related to rates and proportions, such as related probability theory, assessing significance in a fourfold table, sample size determination, and randomization. The author then discusses three different sampling methods and their analysis, namely naturalistic or cross-sectional studies, prospective and retrospective studies, and controlled comparative studies. Other topics covered include analysis of data from matched samples, comparison of proportions from many samples, combining evidence from fourfold tables, measurement and control of misclassification errors, and standardization of rates.

  1. Goodman, L.A. Analyzing Qualitative/Categorical Data. Cambridge, MA: Abt Books, 1978.

The book consists of a collection of papers written by Leo Goodman, who led the early development of log-linear models. It covers readings on both log-linear models and latent-structure analysis. It has five parts: (1) the logit model; (2) the general log-linear model; (3) Davis on Goodman’s approach; (4) latent structure and scaling models; and (5) some extensions to the Goodman system.

  1. Goodman, L.A. The Analysis of Cross-Classifications Having Ordered Categories. Cambridge, MA: Harvard University Press.

The book is a collection of papers written by Leo Goodman on the analysis of ordinal data. It also includes work by Cliff Clogg, which describes the applications of association models (chapter 8) and the analysis of multiway cross-classifications having ordered categories (chapter 9). Chapters 1 to 4 of the book deal with the use of log-linear models in three different contexts: the analysis of the joint distribution in a cross-classification, the analysis of dependence, and the analysis of association. Chapters 5 and 6 develop further analysis of association, in comparison to earlier models developed by Karl Pearson and R.A. Fisher. Chapters 7, 8, and 9 provide examples of application of association models.

  1. Hagenaars, J.A. Categorical Longitudinal Data. Newbury Park, CA: Sage, 1990.

This book focuses on the analysis of categorical data obtained at a few discrete points in time, and the log-linear model occupies a central position in the book. Special attention is paid to log-linear models with latent variables. After a short introductory chapter on the types of analyses of social change, chapter 2 describes the essential features of log-linear models, and chapter 3 covers log-linear models with latent variables. Chapters 4 to 7 form the core of the book, and they address panel analysis and trend and cohort analysis. At the end of the book, chapter 8 summarizes the author's main arguments and presents several computer programs that implement ideas in the book.

  1. Hosmer, D.W., Jr., & Lemeshow, S. Applied Logistic Regression. New York: Wiley, 1989.

The book is the first focused introduction to the logistic regression model. The model is developed from a linear regression point of view, rather than by means of contingency tables. Emphasis is placed on effective modeling strategies, including variable selection methods and the interpretation and presentation of results. The book also covers topics like logistic regression diagnostics. It further discusses the application of the method with different sampling models and its use in matched case-control studies. The last chapter is devoted to special topics: polytomous logistic regression and the application of logistic regression to survival data.

  1. Lindsey, J.K. Modelling Frequency and Count Data. Oxford: Clarendon Press, 1995.

The book is structured around the distinction between independent events occurring to different individuals, resulting in frequencies, and repeated events occurring to the same individuals, yielding counts. It presents standard as well as more recently developed models of categorical data. The author also demonstrates that much of modern statistics can be seen as special cases of categorical data models; both generalized linear models and proportional hazards models can be fitted as log-linear models. More specialized topics such as Markov chains, over-dispersion, and random effects are also covered.

  1. Sobel, M.E. The analysis of contingency tables. In: Arminger, G., Clogg, C.C., & Sobel, M.E., eds. A Handbook for Statistical Modeling in the Social and Behavioral Sciences. New York: Plenum, 1992, pp. 252-303.

The chapter discusses in detail various forms of log-linear models. It starts with a brief history of the development of log-linear models. Section 2 uses several examples to introduce the reader to the log-linear model. Section 3 discusses the use of the odds ratio as a measure of association in two-way and three-way tables. Section 4 introduces models for the two-way table, Section 5 extends the discussion to three-way tables, and section 6 takes up the case of higher-way tables. Section 7 discusses estimation theory for the models. Section 8 discusses residuals and model selection procedures, and section 9 discusses computer programs that can be used to fit the models considered herein.

  1. von Eye, A., & Clogg, C.C. Categorical Variables in Developmental Research: Methods of Analysis. San Diego: Academic Press, 1996.

The volume presents methods for analysis of categorical data in developmental research. The book covers a broad range of methods, concepts, and approaches. It is divided into four sections: (1) measurement and repeated observations of categorical data; (2) catastrophe theory; (3) latent class and log-linear models; and (4) applications.

  1. von Eye, A., & Niedermeier, K.E. Statistical Analysis of Longitudinal Categorical Data in the Social and Behavioral Sciences: An Introduction with Computer Illustration. Mahwah, NJ: Lawrence Erlbaum Associates, 1999.

The book provides a comprehensive resource for analyzing a variety of categorical data. It emphasizes the application of many recent advances of longitudinal categorical statistical methods. Each chapter provides basic methodology, helpful applications, examples using data from all fields of the social sciences, computer tutorials, and exercises. After defining categorical data and reviewing the basics of log-linear modeling, the book examines log-linear modeling for repeated observations, chi-square partitioning, prediction analysis, and configural frequency analysis.


4. Latent Variable Models (Factor Analysis)

  1. Bartholomew, D.J. Latent Variable Models and Factor Analysis. London: Griffin, 1987.

The book offers a unified treatment of latent variable models, which include factor analysis, latent class, and latent trait analysis. After earlier chapters describe these models, chapter 4 of the book discusses common elements of these models and the sufficiency principle. Later chapters deal with models and methods for binary and polytomous data.

  1. Comrey, A.L., & Lee, H.B. A First Course in Factor Analysis, 2nd ed. Hillsdale, NJ: Lawrence Erlbaum Associates, 1992.

The book introduces readers to the theory and practice of factor analysis. Step by step, the authors first describe the derivation and assumptions of the factor analytic model. Then they examine various methods of factor extraction and rotations. Chapters 8 and 9 describe various designs in factor analysis. Chapter 10 deals with the interpretation and application of the analytic results. The authors then illustrate the use of factor analysis in one of their projects. The last three chapters of the book deal with confirmatory factor analysis, structural equation models, and computer programs that handle factor analysis.

  1. Graham, J.W., & Collins, N.L. Controlling correlational bias via confirmatory factor analysis of MTMM data. Multivariate Behav Res 26:607-629, 1992.

The two commonly used CFA analyses carried out on MTMM data either average the various measures of each trait or estimate only relationships between MTMM traits and the outside variables. The authors show that both methods produce equally highly biased parameter estimates when the actual correlations between MTMM method factors and the outside variables are substantial. An algebraic explanation and a simulated data illustration are given for the bias due to misspecification. Also, the problem is illustrated with a brief empirical example. Implications for applied research are discussed.

  1. Gorsuch, R.L. Factor Analysis, 2nd ed. Hillsdale, NJ: Lawrence Erlbaum Associates, 1983.

The text is meant to be both a textbook for graduate students and a reference on factor analysis. It focuses on when and how to use the technique. Derivations of the mathematical models used in factor analysis are given. After the introduction in chapter 1, chapters 2, 3, 6, 8, 9, and 11 cover exploratory factor analysis, chapter 7 discusses the use of canonical correlations to test hypotheses, and chapters 12 and 16 discuss the relevance of scoring techniques and replicability for all multivariate techniques. Chapters 17 and 18 provide a final overview of when each of the multivariate techniques should be used.

  1. Long, J.S. Confirmatory Factor Analysis: A Preface to LISREL. Beverly Hills, CA: Sage Publications, 1983.

The book presents the basic CFA equations and assumptions. It provides a thorough discussion of identification in such models and compares various methods of statistical estimation, including unweighted least squares, generalized least squares, and maximum likelihood methods. The author focuses on two basic applications of CFA: the first is a general discussion of its application to the multimethod-multitrait model, and the second is a discussion of a specific model of psychological disorders. The theoretical advantages of the confirmatory over the exploratory model are emphasized and demonstrated.

  1. McDonald, R.P. Factor Analysis and Related Methods. Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers, 1985.

The author introduces the readers to the technique of factor analysis in a nonmathematical manner. The book begins with a chapter that covers basic and key concepts of common factor analysis, followed by chapters on exploratory common factor analysis and confirmatory factor analysis. In chapters 4 and 7, the author extends his discussion to models for linear structural relations and item response theory. In chapters 5 and 6, he deals with the problem of factor scores and the problems of relationships between factor analyses.


5. Latent Variable Models (Structural Equation Modeling)

  1. Arminger, G., & Schoenberg, R.J. Pseudo maximum likelihood estimation and a test for misspecification in mean and covariance structure models. Psychometrika 54:409-425, 1989.

The paper discusses the use of pseudo maximum likelihood estimation in structural equation modeling when mean and covariance structure models are misspecified. When the assumption of multivariate normality of the variables is violated, the ML method of estimation is no longer valid, and the LR, score, and Wald test statistics do not converge to a central chi-square distribution. The authors also propose a Hausman-type test against this form of misspecification.

  1. Bentler, P.M. Comparative fit indexes in structural models. Psychol Bull 107:238-246, 1990.

The author proposes a coefficient that summarizes the relative reduction in the noncentrality parameters of two nested models. Two estimators of the coefficient yield a new normed fit index (CFI) and a new nonnormed fit index (FI). CFI avoids the underestimation of fit often noted in small samples for Bentler and Bonett’s (1980) normed fit index (NFI). FI is a linear function of Bentler and Bonett’s nonnormed fit index (NNFI) that avoids the extreme underestimation and overestimation often found in NNFI. The author provides an example that illustrates the behavior of these indexes under conditions of correct specification and misspecification.

  1. Bollen, K.A. Total, direct, and indirect effects in structural equation models. In: Clogg, C., ed. Sociological Methodology 1987. San Francisco: Jossey-Bass, 1987, pp. 37-68.

In this paper, Bollen reviews the decomposition of effects in structural equation models. The paper clarifies the definition of total effects, the alternative meanings of specific indirect effects, and the techniques for calculating them. It also proposes a general definition of specific effects, one that includes the effects transmitted by any path or combination of paths in a model.

  1. Bollen, K.A. Structural Equations with Latent Variables. New York: John Wiley & Sons, Inc., 1989.

The book offers a comprehensive treatment of structural equation (LISREL) models. It can be used as an introduction as well as a more advanced reference. The author begins by discussing simple models and then builds toward the general model. The text treats procedures such as path analysis, recursive and nonrecursive models, classical econometrics, and confirmatory factor analysis as special cases of a common model.

  1. Bollen, K.A., & Long, J.S. Testing Structural Equation Models. Thousand Oaks, CA: Sage Publications, 1993.

The book focuses on testing model fit and respecification in structural equation modeling. Chapters in the book were written by researchers in the field who have played major roles in shaping the debate over the two major steps in SEM. These authors wrote about the use and evaluation of different kinds of goodness-of-fit indices in SEM models. The book also includes chapters that discuss model selection and power evaluation in SEM.

  1. Browne, M.W. Asymptotically distribution-free methods for the analysis of covariance structures. Br J Math Stat Psychol 37:62-83, 1984.

The author derives methods for obtaining tests of fit of structural models for covariance matrices and estimator standard errors that are asymptotically distribution-free. He also provides modifications to standard normal theory tests and standard errors, which make them applicable to the wider class of elliptical distributions. The proposed methods were examined by conducting a random sampling experiment.

  1. Hayduk, L.A. Structural Equation Modeling with LISREL: Essentials and Advances. Baltimore: The Johns Hopkins University Press, 1987.

The book introduces structural equation modeling to readers who have no experience with the technique. In the first four chapters, the author goes over the basic statistical concepts and skills readers need for understanding SEM. Chapters 5 and 6 deal with estimation and tests of model fit. Chapters 7, 8, 9, and 10 discuss how to interpret LISREL results and how to fit multiple-group models and means models.

  1. Hoyle, R.H., ed. Structural Equation Modeling: Concepts, Issues, and Applications. Thousand Oaks, CA: Sage Publications, 1995.

The edited book includes chapters on major aspects of the structural equation modeling approach to research design and data analysis. Various authors contributed chapters that cover the following topics: basic concepts and fundamental issues in SEM; model specification and related issues; estimation and testing in SEM; SEM with nonnormal variables; evaluating model fit; statistical power in SEM; objectivity and reasoning in science and SEM; and various applications of SEM.

  1. Hoyle, R.H., & Smith, G.T. Formulating clinical research hypotheses as structural equation models: A conceptual overview. J Consult Clin Psych 62(3):429-440, 1994.

In the article, the authors provide a conceptual overview of the strategies and issues associated with formulating and evaluating various clinical research hypotheses as structural equation models. The paper begins with a sketch of the structural equation modeling approach to research design and data analysis. Then a series of clinical research hypotheses in structural equation modeling terms are presented. The authors conclude with a section on inferring causality from structural equation models.

  1. MacCallum, R.C., Roznowski, M., & Necowitz, L.B. Model modifications in covariance structure analysis: The problem of capitalization on chance. Psychol Bull 111(3):490-504, 1992.

The paper discusses the issue of model modification in detail and explores it empirically through sampling studies using two large sets of data. The process of model modification, which is commonly used to improve model fit, is data-driven, so the results of such procedures are susceptible to capitalization on chance characteristics of the data. The authors found that over repeated samples, model modifications may be very inconsistent and cross-validation results may behave erratically. The authors recommend the use of alternative a priori models.

  1. Marcoulides, G.A., & Schumacker, R.E. Advanced Structural Equation Modeling: Issues and Techniques. Mahwah, NJ: Lawrence Erlbaum Associates, 1996.

The book introduces the latest issues and developments in SEM techniques. The topics selected include models for multitrait-multimethod (MTMM) matrix analysis, nonlinear structural equation models, cross-domain analyses of change over time, structural time series models, bootstrapping techniques in the analysis of mean and covariance structure, limited information estimators, dealing with incomplete data, problems with equivalent models, and an evaluation of incremental fit indices.

  1. Maruyama, G.M. Basics of Structural Equation Modeling. Thousand Oaks, CA: Sage Publications, 1998.

The book describes the logic underlying structural equation modeling approaches, explains how SEM approaches relate to techniques like regression and factor analysis, analyzes the strengths and shortcomings of SEM as compared to alternative methodologies, and explores the various methodologies for analyzing structural equation models. Throughout the book, the author uses a single data set to demonstrate a variety of techniques ranging from path analysis to panel analysis to confirmatory factor analysis to latent variable structural equation modeling.

  1. Schumacker, R.E., & Lomax, R.G. A Beginner's Guide to Structural Equation Modeling. Mahwah, NJ: Lawrence Erlbaum Associates, 1996.

The book introduces students and researchers to the technique of structural equation modeling. The authors focus on the conceptual steps involved in analyzing theoretical models, including theory- or research-driven model specification, parameter estimation, model testing, interpretation of fit indices, and respecification of the model. Two popular software packages—EQS5 and LISREL8-SIMPLIS—are used in data examples throughout the book.

  1. Sörbom, D. Structural equation models with structured means. In: Jöreskog, K.G., & Wold, H., eds. Systems Under Indirect Observation: Causality, Structure, Prediction. Vol. 1. Amsterdam: North-Holland, 1982, pp. 183-195.

The paper discusses how to handle multiple group analysis in LISREL. Such an approach allows one to compare groups of individuals, e.g., to compare the means for certain constructs among these groups. The paper uses the Head Start Summer Program data to illustrate the analysis and describes the general model and its estimation in detail.

  1. Sörbom, D. Model modification. Psychometrika 54(3):371-384, 1989.

The paper discusses the formulation of the "modification index" in the LISREL program, which can be used as a guide in the search for a "better" model in covariance structure analysis. In statistical terms, the proposed index measures how much the discrepancy between model and data, as defined by a general fit function, can be reduced when one parameter is added or freed or when one equality constraint is relaxed. The index also takes into account changes in all the parameters of the model when one particular parameter is freed.
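
As a rough illustration of the comparative fit index described in the Bentler (1990) entry above, the sketch below computes CFI from the chi-square statistics and degrees of freedom of a target model and a baseline (null) model, using the commonly cited form in which noncentrality is estimated as chi-square minus degrees of freedom, floored at zero. The function names and the numerical values are hypothetical and serve only as an example; Bentler and Bonett's NFI is included for comparison.

```python
def noncentrality(chi_square: float, df: int) -> float:
    """Estimated noncentrality: chi-square minus degrees of freedom, floored at zero."""
    return max(chi_square - df, 0.0)


def comparative_fit_index(chi_sq_model: float, df_model: int,
                          chi_sq_baseline: float, df_baseline: int) -> float:
    """CFI = 1 - d_model / max(d_model, d_baseline), where d is the estimated noncentrality."""
    d_model = noncentrality(chi_sq_model, df_model)
    d_baseline = noncentrality(chi_sq_baseline, df_baseline)
    denominator = max(d_model, d_baseline)
    if denominator == 0.0:
        # Neither model shows estimated misfit, so fit is treated as perfect.
        return 1.0
    return 1.0 - d_model / denominator


def normed_fit_index(chi_sq_model: float, chi_sq_baseline: float) -> float:
    """Bentler and Bonett's NFI: proportional reduction in chi-square from the baseline."""
    return (chi_sq_baseline - chi_sq_model) / chi_sq_baseline


# Hypothetical values, not taken from any of the cited works.
cfi = comparative_fit_index(chi_sq_model=36.0, df_model=24,
                            chi_sq_baseline=480.0, df_baseline=36)
nfi = normed_fit_index(chi_sq_model=36.0, chi_sq_baseline=480.0)
print(f"CFI = {cfi:.3f}, NFI = {nfi:.3f}")
```

With these made-up inputs the CFI is higher than the NFI, because the NFI does not adjust for degrees of freedom.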

6. Classification (Cluster Analysis)

  1. Arabie, P., Hubert, L.J., & De Soete, G. Clustering and Classification. Singapore: World Scientific, 1996.

The edited book deals with theories and applications on classification. It is a compendious scholarly review of the field by some of its eminent contributors. The chapter on "combinatorial data analysis" includes the field of clustering apart from probabilistic approaches. It is followed by chapters on "hierarchical models," "complexity theory," and "neural networks." Later chapters cover topics on clustering validation by simulation, statistical inference on cluster analysis, cluster analysis in Japan, and clustering and multidimensional scaling in Russia. The last chapters of the book review work on two-way clustering of 0-1 data and the fitting of tree models and network models.

  1. Blashfield, R.K., & Aldenderfer, M.S. The methods and problems of cluster analysis. In: Nesselroade, J.R., & Cattell, R.B., eds. Handbook of Multivariate Experimental Psychology. New York: Plenum Press, 1988, pp. 447-473.

The chapter is an overview of cluster analysis. After briefly describing the history of the development of the technique, the authors go into detail about the various cluster analysis methods that are commonly used by researchers. Then the authors discuss the concept of similarity and conclude their chapter with some unresolved problems of cluster analysis and future direction of research in the field.

  1. Everitt, B. Cluster Analysis. New York: Halsted Press, 1980.

The text provides a nonmathematical account of the techniques of cluster analysis. After reviewing the general purpose of conducting cluster analysis, the choice of variable, and the measurement of similarity and distance, the author reviews some of the clustering techniques, followed by a discussion of the problems of cluster analysis and an empirical investigation of some methods of cluster analysis. The author concludes by comparing the advantages and disadvantages of different techniques and makes suggestions on using clustering techniques in practice.

  1. Bergman, L.R. You can't classify all of the people all of the time. Multivariate Behavioral Research 23:425-441, 1988.

When performing a classification study, it is sometimes a sound strategy not to classify all subjects but to leave a residue of unclassified entities to be analyzed separately. Starting from an interactional paradigm, theoretical reasons for this approach are given. The method RESIDAN, which uses a residue, is presented. It is argued that the concept of antitype (rare pattern) has theoretical significance and could be studied within the presented framework.

  1. Bergman, L.R. A pattern-oriented approach to studying individual development: Snapshots and processes. In: Cairns, R.B., Bergman, L.R., & Kagan, J., eds. The Individual as a Focus in Developmental Research. New York: Sage Publications, 1996.

The implications of a person-oriented perspective for the study of individual development are discussed and various methodological solutions are suggested. Cluster analysis procedures are emphasized, and both a direct longitudinal approach and a cross-sectional approach followed by linking of the results at adjacent time points are presented. The program LICUR is used, and steps for using it are described. LISREL is also used in the paper to analyze the data, and the results are compared with those of LICUR.

  1. Bergman, L.R., & Wangby, M. The teenage girl: Patterns of self-reported adjustment problems and some correlates. International Journal of Methods in Psychiatric Research 5:171-188, 1995.

This article presents a pattern approach to the study of teenage girls’ adjustment problems, analyzing data concerning 519 15-year-old girls included in the Swedish longitudinal research program, "Individual Development and Adjustment." The girls’ profiles, as given by five self-reported adjustment problem indicators, are analyzed within a cluster analytic framework using the RESIDAN rationale, with due attention being paid to outliers and the importance of identifying a residue. Twelve clusters are identified. Some general features of the pattern approach are discussed.

  1. Kaufman, L., & Rousseeuw, P.J. Finding Groups in Data: An Introduction to Cluster Analysis. New York: John Wiley & Sons, 1990.

This is an applied book on cluster analysis for general users or people who do not have a strong mathematical or statistical background. The first chapter of the book introduces the main approaches to clustering. Chapters 2 to 4 discuss partitioning methods. Chapters 5 to 7 cover hierarchical techniques.

  1. Milligan, G.W. An examination of the effect of six types of error perturbation on fifteen clustering algorithms. Psychometrika 45(3):325-342, 1980.

An evaluation of several clustering methods was conducted. Artificial clusters that exhibited properties of internal coherence and external isolation were constructed. The true cluster structure was subsequently hidden by six types of error-perturbation. The results indicate that the hierarchical methods were differentially sensitive to the type of error perturbation. In addition, generally poor recovery performance was obtained when random seed points were used to start the K-means algorithms. However, two alternative starting procedures for the nonhierarchical methods produced greatly enhanced cluster recovery and were found to be robust with respect to all of the types of error examined.

  1. Milligan, G.W. A review of Monte Carlo tests of cluster analysis. Multivariate Behavioral Research 16:379-407, 1981.

A review of Monte Carlo validation studies of clustering algorithms is presented. Several validation studies have tended to support the view that Ward’s minimum variance hierarchical method gives the best recovery of cluster structure. However, a more complete review of the validation literature on clustering indicates that other algorithms may provide better recovery under a variety of conditions. Applied researchers are cautioned concerning the uncritical selection of Ward’s method for empirical research. Alternative explanations for the differential recovery performance are explored, and recommendations are made for future Monte Carlo experiments.
