PROCEEDINGS
1 Division of Biostatistics, Indiana University School of Medicine, 1050 Wishard Blvd, RG 4101, Indianapolis, IN 46260, USA; and
2 Unilever Research, Port Sunlight, Wirral, UK;
* corresponding author, bkatz{at}iupui.edu
| ABSTRACT |
|---|
KEY WORDS: statistical analysis, caries diagnostics, clinical trials
| INTRODUCTION |
|---|
Some caries trials have incorporated radiographic assessment in addition to visual exam into the calculation of the caries increment. Decisions are made at the surface level, where a surface is considered carious if either measure shows evidence of a lesion. Although not generally thought of as a score based on multiple outcomes, this combined increment is a caries index that uses the maximum value of the available methods at each surface. The additional information provided by x-rays or other imaging methods presented few statistical challenges. One of the few formal analyses of multiple outcomes in caries trials was the use of multivariate analysis of covariance to analyze data from four caries trials (Geary et al., 1992). However, measures of dental health other than caries (including plaque, gingivitis, and calculus) were of interest in this analysis.
The caries process is not a step function where surfaces or teeth transition instantly from sound to cavitation. Nor can visual exams or radiographic methods with several categories instead of two adequately describe the process. Caries is a more gradual disease process, with demineralization and remineralization occurring over time. Caries lesions occur when demineralization is the dominant process. As a better reflection of the biology, the new caries diagnostic tools attempt to measure the caries process on a continuum in a quantitative manner. Thus, they yield continuous rather than ordinal or dichotomous results for each surface. In addition, these methods lead to unbalanced data, since they are usually available for only a subset of the surfaces of interest. One possibility for incorporating these data into the traditional analysis of covariance method for caries trials would be to dichotomize the continuous measures and then classify the surface as carious if any of the measures (including visual exam) was above its threshold. However, establishing useful cut-off values for these new measures is a difficult process, and dichotomizing results in a large amount of lost information. The use of the continuous data from these new methods holds the most promise for increasing the efficiency of future caries trials.
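The dichotomization option described above can be sketched in a few lines of Python. This is only an illustration: the measure names and cut-off values below are hypothetical placeholders, not values used in any trial.

```python
# Hypothetical cut-offs, for illustration only; not taken from the trial.
THRESHOLDS = {"visual": 0.5, "method_a": 10.0, "method_b": 0.3}


def is_carious(readings, thresholds=THRESHOLDS):
    """Return True if any available measure exceeds its cut-off.

    `readings` maps a measure name to its value for one surface.
    Measures not taken on that surface are absent or set to None,
    which reflects the unbalanced data described in the text.
    """
    return any(
        value is not None and value > thresholds[name]
        for name, value in readings.items()
    )
```

Every continuous reading collapses to a single yes/no value per surface, which is the loss of information noted above.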
There has been a variety of proposed methods for analyzing clinical trials with multiple outcomes, although not specifically for caries trials. Pocock (1997) looked at many of the proposed methods and discussed some of the associated practical and statistical issues. Although there are several possible classifications, we have chosen to characterize the methods broadly into four main themes: (1) Define each of the diagnostic measures as a primary or secondary outcome; (2) perform tests for each measure but formally control the type one error rate; (3) combine the data from the multiple outcomes into a single global test; and (4) construct a combined endpoint or index based on all of the methods.
The first approach depends on a pre-specified decision rule to determine the "success" of the trial. If there is only a single primary outcome, then the success of the trial depends solely on that measure. Other pre-specified decision rules based on combinations of primary and secondary outcomes are also possible (Chi, 1998). The second method is to adjust the p-values for the individual tests (e.g., Bonferroni) or to control the type I error rate in some other way (Cook and Farewell, 1996; Zhang et al., 1997). We will not consider either of these approaches further, since they can easily yield different conclusions for each outcome, leading to uncertain interpretation of the results and/or conflict between the sponsor and the regulatory agencies.
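For illustration only, the Bonferroni adjustment mentioned above can be sketched as follows. The p-values are invented, and the function name is our own; the point is simply that separately tested outcomes can lead to different conclusions.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Invented p-values for three diagnostic measures (illustration only).
raw = [0.01, 0.04, 0.5]
adjusted = bonferroni(raw)
# One flag per outcome: is it significant at the 0.05 level after adjustment?
significant = [p < 0.05 for p in adjusted]
```

In this made-up case only the first outcome remains significant after adjustment, so the three measures do not give a single answer about the trial.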
O'Brien (1984) presented three global tests for multiple outcomes and showed that, for situations where group differences would be expected to be in the same direction for all measures, they had superior power to the traditional multivariate analyses. A brief description of the rank-sum, ordinary least-squares (OLS), and general least-squares (GLS) procedures follows.
For the rank-sum test, the value for each subject is ranked for each outcome, and then the ranks are summed across outcomes within subject. For large sample sizes, this sum can then be compared among groups by standard parametric analyses (e.g., ANCOVA). The OLS and GLS procedures require that the outcomes be transformed to a common scale. For continuous measures, this is usually accomplished by converting each value to a Z-score. The OLS method is equivalent to a repeated-measures analysis of variance, where the outcomes are the repeated factor. This yields equal weights for all of the outcomes. The GLS procedure is similar, except that the outcomes do not receive equal weights. The weights are computed from the inverse of the sample covariance matrix. The OLS and GLS tests are equivalent when the outcomes are equally correlated.
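The three procedures can be sketched as per-subject composite scores that would then be compared among groups by a standard analysis. The code below is a minimal sketch under stated assumptions, not the published procedure: all function names are our own, Z-scores use the overall mean and standard deviation rather than a pooled within-group estimate, the GLS weights are taken as the row sums of the inverse sample correlation matrix, and the group comparison itself is omitted.

```python
from statistics import mean, stdev


def average_ranks(values):
    """Rank values from 1 to n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def rank_sum_scores(data):
    """data[s][k] is outcome k for subject s; returns each subject's rank sum."""
    ranked = [average_ranks(list(col)) for col in zip(*data)]
    return [sum(col[s] for col in ranked) for s in range(len(data))]


def z_scores(data):
    """Convert every outcome to a Z-score (overall mean and SD: a simplification)."""
    stats = [(mean(col), stdev(col)) for col in zip(*data)]
    return [[(x - m) / s for x, (m, s) in zip(row, stats)] for row in data]


def ols_scores(data):
    """Equal weights: the plain sum of the Z-scores for each subject."""
    return [sum(row) for row in z_scores(data)]


def correlation_matrix(z):
    """Sample correlation matrix computed from Z-scored data."""
    n, k = len(z), len(z[0])
    return [[sum(r[a] * r[b] for r in z) / (n - 1) for b in range(k)]
            for a in range(k)]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def gls_weights(corr):
    """Unequal weights: the row sums of the inverse correlation matrix."""
    return solve(corr, [1.0] * len(corr))


def gls_scores(data):
    """Weighted sum of Z-scores using the GLS weights."""
    z = z_scores(data)
    w = gls_weights(correlation_matrix(z))
    return [sum(wk * zk for wk, zk in zip(w, row)) for row in z]
```

When the outcomes are equally correlated, `gls_weights` returns identical weights for every outcome, so the GLS composite is just a rescaled OLS composite, consistent with the equivalence stated above.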
More recently, several other global tests have been proposed. One such method proposes treating the outcome measures as correlated data and then using generalized estimating equations (GEE) to perform a global analysis (Liang and Zeger, 1986; Lefkopoulou and Ryan, 1993). In addition, an approximate likelihood ratio test (Tang et al., 1989) and a multivariate linear mixed-model approach (Sammel et al., 1999) have also been developed. A simple method suggested by Wittes is to use the maximum value of the multiple outcome measures as the value for each subject and then perform a standard analysis (Follmann, 1995). One thing that all of these methods have in common, with the notable exception of the rank-sum test, is that they require that the outcomes have the same scale. Since this is rarely the case, each outcome is transformed to a common scale. The mix of ordinal and continuous data in caries trials makes this an important issue.
The construction of a caries index could possibly be accomplished with the use of multivariate statistical methods such as factor analysis. However, we have chosen to develop candidate indices by extending current practice and combining data from different methods at the surface level. In the remainder of this paper, we will illustrate several of the global tests and construct some candidate indices using data from a recently completed caries trial.
| MATERIALS & METHODS |
|---|
Caries Indices
Surface data for each measure were transformed as described above. Surface scores were then calculated by two methods: the average (AVE) and the maximum (MAX) of the available data. In each case, the index for each subject was the mean of all of the surface scores.
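A minimal sketch of the two index calculations follows. It assumes that each surface is represented by a list of already-transformed readings, with `None` marking a diagnostic method that was not available for that surface; the function names and data layout are illustrative only.

```python
from statistics import mean


def surface_score(values, method):
    """Combine the available transformed readings for one surface.

    method is "MAX" (maximum) or "AVE" (average). Returns None when
    no diagnostic method was available for the surface.
    """
    available = [v for v in values if v is not None]
    if not available:
        return None
    return max(available) if method == "MAX" else mean(available)


def subject_index(surfaces, method):
    """Mean of the surface scores over all scored surfaces of one subject."""
    scores = [surface_score(values, method) for values in surfaces]
    return mean(s for s in scores if s is not None)


# Invented readings for three surfaces (illustration only).
example = [[0.5, None, 1.5], [-1.0, None, None], [0.0, 2.0, 1.0]]
max_index = subject_index(example, "MAX")
ave_index = subject_index(example, "AVE")
```

Because MAX keeps the largest available reading on each surface, it can never be smaller than AVE for the same subject.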
Hybrid Method
We also developed analyses that created 4 partial mouth indices per person and then analyzed these with global tests. Two groupings of teeth and surfaces were used. The first grouping was based on caries susceptibility and used the following groups of teeth: (1) anterior teeth, (2) premolars, (3) first molars, and (4) second molars. The second grouping was based on the availability of the diagnostic methods and used the following groups: (1) anterior teeth (visual exam only), (2) posterior smooth surfaces (visual exam only), (3) posterior occlusal surfaces (all methods), and (4) posterior approximal surfaces (all methods except ECM). When more than one diagnostic method was available within a grouping, an index was created based on the maximum or the average of the transformed data. Thus, 16 hybrid analyses were performed (2 groupings, 2 index calculation methods, and 4 global tests).
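The partial mouth indices can be sketched in the same spirit. The snippet below assumes that each surface carries a group label and a list of transformed readings (`None` where a method was unavailable), and that a group's index is the mean of its surface scores, mirroring the whole-mouth indices; this layout and the group labels are our assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean


def partial_mouth_indices(surfaces, method="MAX"):
    """Return one index per group for a single subject.

    surfaces is a list of (group, values) pairs. Each surface is scored
    by the maximum ("MAX") or average ("AVE") of its available readings,
    and the group index is the mean of those surface scores.
    """
    by_group = defaultdict(list)
    for group, values in surfaces:
        available = [v for v in values if v is not None]
        if available:
            score = max(available) if method == "MAX" else mean(available)
            by_group[group].append(score)
    return {group: mean(scores) for group, scores in by_group.items()}


# Invented readings (illustration only): visual exam alone on anterior
# teeth, several methods on posterior occlusal surfaces.
example = [
    ("anterior", [0.2]),
    ("anterior", [0.4]),
    ("occlusal", [1.0, None, 3.0]),
    ("occlusal", [0.0, 2.0, 1.0]),
]
max_indices = partial_mouth_indices(example, "MAX")
ave_indices = partial_mouth_indices(example, "AVE")
```

The resulting per-group values would then play the role of the multiple outcomes in a global test.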
| RESULTS |
|---|
Caries Indices
The two caries indices, MAX and AVE, were both approximately bell-shaped and slightly skewed. Each was analyzed by ANCOVA with baseline value, age, and design strata as covariates. For the original data, MAX showed significant differences between the products, but AVE did not. This is essentially because MAX used the CFX value when caries was detected and did not average it with the diagnostics that did not show a difference. For the augmented data, both indices showed a statistically significant difference and yielded smaller p-values and greater effect sizes than the individual diagnostic methods.
Hybrid Methods
For the original data, the MAX method of combining the measures was nearly significant for both groupings and for all the global tests (Table 3). For the augmented data, a significant product difference was achieved in all 16 combinations. The two methods for grouping surfaces showed similar results. However, from a statistical standpoint, the method based on the availability of the diagnostic methods deals with the unbalanced data issues in a more logical manner.
| DISCUSSION |
|---|
Simulation studies comparing some of the global tests have shown that the GLS, which is the most flexible, tends to perform at least as well as the other tests in most situations and better in some (O'Brien, 1984). The rank-sum test has the major advantage that no transformations are needed, and descriptive statistics can be presented on the original scales. However, as with most non-parametric analyses, if the assumptions of a parametric approach can be met, the rank-sum test will be less powerful.
The caries indices may have great appeal, since they are essentially an extension of current practice. This is particularly true of the MAX index. One drawback of the new indices is that the resulting scale has no biological interpretation and is dependent on the baseline distribution of each measure. Perhaps in the future, a standard transformation could be used across trials. This would result in comparable numbers. Finally, the hybrid method performed as well as the others and exhibited more homogeneous results across the analysis methods. This may indicate that it is more robust, but it is difficult to draw conclusions from a single trial. The hybrid method does allow the investigator to examine different areas of the mouth but is the most computationally unwieldy. Still, it can handle the unbalanced data in the fairest way.
The major goal of adding new diagnostic tests to a caries trial is to increase our ability to detect differences among treatments. Analysis of the augmented data clearly shows that all of these methods are able to increase the power of a clinical caries trial if the diagnostic methods are an accurate and precise measure of the caries process.
| REFERENCES |
|---|
Caplan DJ, Slade GD, Biesbrock AR, Bartizek RD, McClanahan SF, Beck JD (1999). A comparison of increment and incidence density analyses in evaluating the anticaries effects of two dentifrices. Caries Res 33:16-22.
Chesters RK, Pitts NB, Matuliene G, Kvedariene A, Huntington E, Bendinskaite R, et al. (2002). An abbreviated caries clinical trial design validated over 24 months. J Dent Res 81:637-640.
Chi GYH (1998). Multiple testings: multiple comparisons and multiple endpoints. Drug Information J 32:1347S-1362S.
Cook RJ, Farewell VT (1996). Multiplicity considerations in the design and analysis of clinical trials. J R Statist Soc A 159:93-110.
Fleiss JL (1984). Assessing treatment effects in caries clinical trials using ordered categorical data. J Dent Res 63(Spec Iss):778-782.
Follmann D (1995). Multivariate tests for multiple endpoints in clinical trials. Statist Med 14:1163-1175.
Geary DN, Huntington E, Gilbert RJ (1992). Analysis of multivariate data from four dental clinical trials. J R Statist Soc A 155:77-89.
Grainger DJ, Lehnhoff RW, Bollmer BW, Zacherl WA (1984). Analysis of covariance in dental caries clinical trials. J Dent Res 63(Spec Iss):766-772.
Hujoel PP, Isokangas PJ, Tiekso J, Davis S, Lamont RJ, DeRouen TA, et al. (1994). A re-analysis of caries rates in a preventive trial using Poisson regression models. J Dent Res 73:573-579.
Kingman A (1984). Stratification methods in caries clinical trials. J Dent Res 63(Spec Iss):773-777.
Lefkopoulou M, Ryan L (1993). Global tests for multiple binary outcomes. Biometrics 49:975-988.
Liang KY, Zeger SL (1986). Longitudinal data analysis using generalized linear models. Biometrika 73:13-22.
O'Brien PC (1984). Procedures for comparing samples with multiple endpoints. Biometrics 40:1079-1087.
Pocock SJ (1997). Clinical trials with multiple outcomes: a statistical perspective on their design, analysis, and interpretation. Controlled Clin Trials 18:530-545.
Sammel M, Lin X, Ryan L (1999). Multivariate linear mixed models for multiple outcomes. Statist Med 18:2479-2492.
Tang DI, Gnecco C, Geller NL (1989). An approximate likelihood ratio test for a normal mean vector with nonnegative components with application to clinical trials. Biometrika 76:577-583.
Zhang J, Quan H, Ng J, Stepanavage ME (1997). Some statistical methods for multiple endpoints in clinical trials. Controlled Clin Trials 18:204-221.