Assessing the Impact of Childhood Interventions
on Subsequent Drug Use
E. Jane Costello
Tonya D. Armstrong
Alaattin Erkanli

Report on the Developmental Epidemiology of Comorbid Psychiatric and Substance Use Disorders

Costello, Armstrong & Erkanli

Part 2. Data sets for further study



The basic information required for the study of comorbidity is not difficult to collect, and any study that collects information on more than one diagnosis has already done so. What the researchers may not have done, however, is publish their data in a form that makes the calculation of comorbidity rates possible. The required quantities are laid out in detail elsewhere (Angold et al., 1999), but basically consist of (1) the base rates of disorders X and Y; (2) the prevalence of X given Y and of X given not-Y (X|Y and X|not-Y); and (3) the prevalence of Y given X and of Y given not-X (Y|X and Y|not-X). Thus, a simple 2x2 table cross-classifying the presence and absence of X and Y provides the data needed. However, most researchers, although they have the data readily available to construct such tables, have seen no reason to publish them in this form.
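
As an illustration of how the quantities listed above follow from a single 2x2 table, here is a minimal Python sketch. The class name, function name, and cell counts are hypothetical and chosen only for this example; they do not come from any of the studies discussed. The odds ratio is included as one common summary of association, and the sketch assumes no cell is empty.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from a 2x2 cross-classification of disorders X and Y."""
    both: int      # X present, Y present
    x_only: int    # X present, Y absent
    y_only: int    # X absent, Y present
    neither: int   # X absent, Y absent

    @property
    def n(self) -> int:
        """Total number of subjects in the table."""
        return self.both + self.x_only + self.y_only + self.neither


def comorbidity_summary(t: TwoByTwo) -> dict:
    """Return base rates, conditional prevalences, and the odds ratio.

    Assumes every cell is non-zero; empty cells would need special handling.
    """
    return {
        "base_rate_x": (t.both + t.x_only) / t.n,
        "base_rate_y": (t.both + t.y_only) / t.n,
        "x_given_y": t.both / (t.both + t.y_only),
        "x_given_not_y": t.x_only / (t.x_only + t.neither),
        "y_given_x": t.both / (t.both + t.x_only),
        "y_given_not_x": t.y_only / (t.y_only + t.neither),
        "odds_ratio": (t.both * t.neither) / (t.x_only * t.y_only),
    }


# Hypothetical counts, for illustration only.
table = TwoByTwo(both=30, x_only=70, y_only=50, neither=850)
summary = comorbidity_summary(table)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Because every quantity is derived from the same four counts, publishing those counts is enough for another researcher to recompute any of them.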

At a more complex level, one needs to account for other comorbidities in examining the one of interest. Thus, one needs to account for comorbidities among psychiatric disorders in examining comorbidity of psychiatric disorders and substance use/abuse/dependence. This requires that the data be laid out in the form of a multiplex table (a set of 2x2 tables, one for each level of the other disorders) of the kind needed to calculate, for example, a Mantel-Haenszel chi-square. Once again, researchers have the data necessary for this purpose; they just don't normally present them in this format. Similarly, to examine risk factors and correlates of various types of comorbidity, one needs the basic 2x2 tables broken out by age, sex, race/ethnicity, poverty, etc., and possibly by all of these simultaneously. Problems with working from the published data increase when studies use complex sampling designs rather than simple random sampling. It then becomes difficult to combine reports unless the relevant variance estimates are also available. Many studies also have the potential to examine comorbidity longitudinally using repeated data waves. Meta-analysis of longitudinal data is more complex, but still possible given the power of recent analytic software.
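
To make the stratified calculation concrete, the following Python sketch computes a Mantel-Haenszel pooled odds ratio and the corresponding chi-square (one degree of freedom, without continuity correction) from a set of 2x2 tables. The function name and all counts are hypothetical. The sketch also assumes simple random sampling within each stratum, so it does not address the complex-design problem noted above.

```python
from typing import Sequence, Tuple

# Each stratum is (a, b, c, d):
#   a = X and Y present, b = X only, c = Y only, d = neither.
Stratum = Tuple[int, int, int, int]


def mantel_haenszel(strata: Sequence[Stratum]) -> Tuple[float, float]:
    """Return (pooled odds ratio, Mantel-Haenszel chi-square with 1 df).

    The chi-square omits the continuity correction. Each stratum must
    contain at least two subjects.
    """
    or_num = or_den = 0.0
    observed = expected = variance = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        or_num += a * d / n
        or_den += b * c / n
        observed += a
        # Expected count and variance of cell a under no association,
        # conditional on the stratum's margins.
        expected += (a + b) * (a + c) / n
        variance += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    chi_square = (observed - expected) ** 2 / variance
    return or_num / or_den, chi_square


# Hypothetical counts: association between disorders X and Y,
# stratified by a third disorder.
strata = [
    (10, 20, 30, 440),  # third disorder absent
    (20, 30, 20, 130),  # third disorder present
]
pooled_or, mh_chi2 = mantel_haenszel(strata)
print(f"pooled OR = {pooled_or:.2f}, chi-square = {mh_chi2:.2f}")
```

The only inputs are the stratum-level cell counts, which is why publishing the stratified tables, and not just summary rates, makes this kind of reanalysis possible.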

In this section we (1) review the data sets that have potential for this activity, (2) suggest some key questions that could be addressed, and (3) suggest an approach to answering these questions.


Sources of information

The three main sources of information about potentially useful data sets were (1) the National Institutes of Health's database of currently and previously funded grants (Computer Retrieval of Information on Scientific Projects [CRISP]), (2) the literature review described earlier, and (3) personal contact with researchers, especially those in other countries. The Principal Investigator (PI) of each study was identified and an e-mail address was sought for each one. Over 60 studies were identified that might be able to provide relevant data.

Type of information

The goal was to collect information to answer three kinds of questions about each data set: (a) Does it meet the basic requirement for comorbidity analyses? (b) If so, what are the characteristics of the data set relevant to these core requirements (sample size, etc.)? and (c) Does the data set have other characteristics that would make it valuable for additional analyses (e.g., repeated measures, risk and protective factors)?

The basic requirements were those discussed earlier; we were mainly interested in representative population samples, with reliably collected DSM or ICD diagnoses, and enough information to permit a determination of any substance use, substance abuse, and substance dependence, separately for alcohol, tobacco, and other drugs. Beyond these basic data, we were interested in knowing what other information might be available across several data sets. We were also interested in the potential for analyses using (1) longitudinal data, (2) different race/ethnic groups, (3) a range of putative risk and protective factors, (4) information on treatment for drug abuse or psychiatric disorders, and on the effectiveness of treatment, and (5) a range of "real world" outcomes, such as school dropout, arrest, incarceration, unwanted pregnancy, or suicide. However, these were not criteria for inclusion in the list of useful studies, but rather additional information for exploring what kinds of analysis might be possible.


A form (Appendix D) was sent out as an e-mail attachment to everyone on the list of PIs. Responses were collected in a summary chart.



Table 3 presents a summary of the potentially usable data sets on which information has been collected so far. This is an ongoing project; as new studies reach an analyzable stage they can be added to the list. At this point we can say that at least 16 studies, collecting information on at least 17,000 children and adolescents, contain the minimum necessary data (psychiatric diagnoses, substance use and abuse, onset dates, demographic and risk factor data). What is even more important for NIDA's purposes is that most of these are panel studies, with repeated assessments of the same subjects. This provides the opportunity to examine the timing and precipitators of the onset of drug use, and progressions from use to abuse, prospectively, in large, ethnically diverse samples of children and adolescents.

All the studies include approximately equal numbers of male and female subjects. Several studies contain sizable samples of minority participants. There are more than 3,500 African American youth contributing over 11,000 person-observations, and 2,600 Hispanic participants contributing some 6,000 person-observations. However, data on American Indians (N = 450, person-observations = 2,000), Asians, and other minorities in the United States are sparse.

All the data sets contain information on a range of correlates and risk factors such as age, sex, school performance, urban/rural residence, family income, family structure and functioning, and neighborhood and community resources, although not all studies contain all the variables. A few provide information on service use for drug and mental health problems.


Data to examine the development of drug abuse comorbidity have already been collected on some 17,000 children and adolescents. With repeated assessments in many studies these data sets provide over 84,000 person-observations. At a very rough estimate, the dozen usable data sets have cost Federal and other agencies at least $60 million over the years since the early 1970s, when the first of these studies began. However, few of them have used their data to address the specific question that NIDA wants answered (exceptions are Costello et al., 1999; Newman, Silva, & Stanton, 1996). Additionally, the combined strength of this resource has certainly not been exploited to address this issue.

There are different methods of using data from multiple sources. Meta-analyses of the type used in the first part of this report are one approach. A second is for researchers to carry out cooperative projects, in which they agree to carry out parallel studies using common sets of variables (e.g., Costello, 1998). A third approach is to combine the relevant variables from each study into a common data set. Programs for data analysis are much more flexible than was the case even a few years ago, and any or all of these approaches might be feasible, depending on the questions to be answered. Inevitably, problems would arise and considerable expertise would be needed to use any of these approaches.
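
As a rough illustration of the first approach, the sketch below pools odds ratios from several studies using fixed-effect inverse-variance weighting on the log scale. This is one common meta-analytic technique and is not necessarily the method used in the first part of this report. The function names and the study tables are hypothetical, and the variance formula assumes simple random sampling; studies with complex sampling designs would need design-based variance estimates instead.

```python
import math
from typing import Sequence, Tuple

# One study's 2x2 table: (X and Y, X only, Y only, neither).
Table = Tuple[int, int, int, int]


def log_odds_ratio(table: Table) -> Tuple[float, float]:
    """Return (log odds ratio, Woolf variance) for one 2x2 table.

    Assumes no empty cells and simple random sampling.
    """
    a, b, c, d = table
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pool_fixed_effect(tables: Sequence[Table]) -> Tuple[float, float, float]:
    """Return the pooled odds ratio with an approximate 95% interval."""
    weighted_sum = total_weight = 0.0
    for table in tables:
        log_or, var = log_odds_ratio(table)
        weight = 1.0 / var  # more precise studies get more weight
        weighted_sum += weight * log_or
        total_weight += weight
    pooled = weighted_sum / total_weight
    half_width = 1.96 * math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled),
        math.exp(pooled - half_width),
        math.exp(pooled + half_width),
    )


# Hypothetical tables from three studies, for illustration only.
studies = [(30, 70, 50, 850), (12, 40, 25, 423), (45, 90, 80, 1285)]
pooled_or, ci_low, ci_high = pool_fixed_effect(studies)
print(f"pooled OR = {pooled_or:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

A fixed-effect model treats all studies as estimating the same underlying odds ratio. If the studies differ substantially in age range, measures, or population, a random-effects model or a pooled analysis of the harmonized raw data may be more appropriate.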

Clearly there are many questions that further analysis of existing data will not answer. The core question of this conference, the impact of early treatment on later drug abuse, needs answering in new studies with different designs. But it would be helpful to be able to base those new studies on a firm foundation of knowledge about prevalence, comorbidity, and development.



American Psychiatric Association (1994). Diagnostic and Statistical Manual of Mental Disorders Fourth Edition (DSM-IV). Washington, DC: American Psychiatric Press, Inc.

Angold, A., & Costello, E. J. (1995). The Child and Adolescent Psychiatric Assessment (CAPA). Psychological Medicine, 25, 739-753.

Angold, A., Costello, E. J., & Erkanli, A. (1999). Comorbidity. Journal of Child Psychology and Psychiatry, 40, 57-87.

Berkson, J. (1946). Limitations of the application of fourfold table analysis to hospital data. Biometrics Bulletin, 2, 47-52.

Costello, A. J., Edelbrock, C., Kalas, R., Kessler, M. D., & Klaric, S. H. (1982). The National Institute of Mental Health Diagnostic Interview Schedule for Children (DISC). Rockville, MD: National Institute of Mental Health.

Costello, E. J., & Angold, A. (1995). Epidemiology. In J. March (Ed.), Anxiety Disorders in Children and Adolescents (pp. 109-124). New York, NY: Guilford Press.

Costello, E. J., Angold, A., Burns, B. J., Stangl, D. K., Tweed, D. L., Erkanli, A., & Worthman, C. M. (1996). The Great Smoky Mountains Study of Youth: Goals, designs, methods, and the prevalence of DSM-III-R disorders. Archives of General Psychiatry, 53, 1129-1136.

Costello, E. J., Erkanli, A., Federman, E., & Angold, A. (1999). Development of psychiatric comorbidity with substance abuse in adolescents: Effects of timing and sex. Journal of Clinical Child Psychology, 28, 298-311.

Kaplow, J. B., Curran, P. J., & Costello, E. J. (2001). The prospective relation between dimensions of anxiety and the initiation of adolescent alcohol use. Journal of Clinical Child Psychology, 30, 316-326.

Kessler, R. C., McGonagle, K. A., Zhao, S., Nelson, C. B., Hughes, M., Eshleman, S., Wittchen, H. U., & Kendler, K. S. (1994). Lifetime and 12-month prevalence of DSM-III-R psychiatric disorders in the United States: Results from the National Comorbidity Survey. Archives of General Psychiatry, 51, 8-19.

Loeber, R., & Keenan, K. (1994). Interaction between conduct disorder and its comorbid conditions: Effects of age and gender. Clinical Psychology Review, 14, 497-523.

Newman, D. L., Silva, P. A., & Stanton, W. R. (1996). Psychiatric disorder in a birth cohort of young adults: Prevalence, comorbidity, clinical significance, and new case incidence from ages 11 to 21. Journal of Consulting and Clinical Psychology, 64, 552-562.

Offord, D. R., Boyle, M. H., Szatmari, P., Rae-Grant, N. I., Links, P. S., Cadman, D. T., Byles, J. A., Crawford, J. W., Blum, H. M., Byrne, C., Thomas, H., & Woodward, C. A. (1987). Ontario child health study: I. Methodology. Archives of General Psychiatry, 44, 826-831.

Robins, E., & Guze, S. B. (1970). Establishment of diagnostic validity in psychiatric illness: Its application to schizophrenia. American Journal of Psychiatry, 126, 107-111.

SAMHSA (1993). National Household Survey on Drug Abuse: Population Estimates 1992. Rockville, MD: U.S. Department of Health and Human Services.

Windle, M., & Davies, P. T. (1999). Depression and heavy alcohol use among adolescents: Concurrent and prospective relations. Development and Psychopathology, 11, 823-844.
