J Dent Res 83(Spec Iss C):C99-C102, 2004
© 2004 International and American Associations for Dental Research
Using Survival Methodologies in Demonstrating Caries Efficacy
A. Hannigan
Department of Mathematics and Statistics, University of Limerick, Ireland; ailish.hannigan{at}ul.ie
ABSTRACT
Exploiting recent advances in statistical methods, particularly for correlated intra-subject data, could increase the efficiency of caries clinical trials. Methods of analysis using the tooth surface as the unit should be investigated. Whole-mouth measures such as the DMFS increment ignore the variation in the number of surfaces at risk between subjects and within a subject over time. The use of "survival time" for each surface as the outcome measure (i.e., the time from the start of the trial to when a surface is recorded as decayed or filled) is proposed. Data from caries clinical trials could be described as clustered survival data, where clustering of tooth surfaces exists such that survival times within the same cluster or subject are correlated. Advances in the analysis of clustered survival data, such as the use of marginal models with robust variance estimators, have recently been exploited in the analysis of caries clinical trials. The analysis produced results similar to those achieved by conventional DMFS-based analysis. The results using survival analysis are easily interpreted: for example, the median survival time of tooth surfaces in female subjects using a toothpaste with a higher level of fluoride (1500 ppm F) is 1.07 times the median survival time of surfaces in female subjects using toothpaste with less fluoride (1000 ppm F). Further research is required to investigate whether survival analysis is a more sensitive method of analysis, i.e., whether causative factors can be identified with fewer subjects than with the conventional method of analysis.
KEY WORDS: statistical analysis, multivariate survival data, efficiency
INTRODUCTION
The objective of caries clinical trials is to test prophylactic agents for the prevention of dental caries. This objective is usually achieved by comparison of the net increase in the mean number of decayed, missing, or filled (DMF) surfaces or teeth for subjects in the study groups. The current outcome for caries clinical trials is therefore subject- and not surface-based. Worthington (1984) stated that because the units of recording in clinical trials (the tooth or surface) are not independent within a given subject, each individual's caries increment over a period of time must be used as the unit of analysis. The past decade, however, has seen many advances in both statistical methods and computing resources which enable standard statistical methods to be adapted for dependence of observations within the same subject (DeRouen et al., 1991). Exploiting these advances could increase the efficiency of caries clinical trials.
Most attempts at handling dependent data from clinical trials in dental research come from the area of periodontal research. The unit of analysis is taken to be the site and not the subject, because it is recognized that periodontal disease activity and the outcome from the disease may vary significantly among sites and subjects, and over time. The trend in periodontal research has been to look at statistical methods such as generalized estimating equations (GEE) and multi-level models rather than condense the information from all sites within a subject into a whole-mouth measure (DeRouen et al., 1995).
The caries process is similar to periodontal disease progression, in that caries onsets within a subject are correlated due to shared host factors. The number of surfaces at risk varies within a subject over time and also varies between subjects. Whole-mouth measures such as the DMFS increment ignore this variation and may not fully exploit the surface-based data collected during a caries clinical trial. If the DMFS increment is to be replaced as the outcome measure of three-year caries clinical trials, it needs to be replaced by something which is easily understood by non-statisticians. Time has always been identified as an important variable in caries clinical trials: time at risk for each surface, eruption time, and the length of treatment. For the outcome measure of a caries clinical trial, the use of "survival time" for each surface, i.e., the time from the start of the trial to when a surface is recorded as decayed or filled, would allow both patients and dentists to clearly understand the benefits from different products and oral hygiene habits.
SURVIVAL ANALYSIS
Survival analysis is the analysis of data that correspond to the time from a well-defined time origin until the occurrence of a particular event or end-point. In medical research, the end-point may be the death of the patient, and so the resulting data are literally survival times. There is one feature of survival times that makes them unsuitable for analysis by other statistical methods, which is that the event of interest is rarely observed in all subjects. In the context of caries clinical trials, many surfaces remain sound throughout the length of the trial, with no decay being recorded. The survival time for these surfaces is unknown, except that it is longer than the length of the trial itself. Such survival times are said to be censored, to indicate that the period of observation was cut off before the event of interest occurred. Subjects are also followed for various lengths of time, i.e., they may leave the study before the end. These withdrawals also lead to censored observations.
There are three types of censoring. "Right censoring" is where the actual survival time is unknown but is known to be greater than a certain period of time. For example, if a surface remains sound for the length of the trial, its survival time is unknown but is known to be longer than the length of the trial. "Left censoring" is where the actual survival time of the individual is less than that observed, e.g., surfaces examined one year after the start of the trial that are found to be carious. The actual survival time is less than 12 months. "Interval censoring" is where individuals are known to have experienced failure within an interval of time, e.g., a surface known to have survived between two and three years. Clinical trials, where subjects are examined only at pre-scheduled visits, often give rise to interval-censored data.
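The three censoring types can be illustrated by recording each surface's survival time as a pair of bounds. The following is a minimal sketch, assuming annual examinations at 12, 24, and 36 months; the function name and the exam schedule are hypothetical and are not taken from the trial described here.

```python
import math

def censoring_interval(exam_times, first_decayed_index):
    """Return (lower, upper) bounds on the survival time of one surface.

    exam_times: scheduled examination times in months, in ascending order.
    first_decayed_index: index of the first exam at which the surface was
        recorded as decayed or filled, or None if it stayed sound.
    """
    if first_decayed_index is None:
        # Right-censored: still sound at the last examination.
        return (exam_times[-1], math.inf)
    if first_decayed_index == 0:
        # Left-censored: already carious at the first follow-up examination.
        return (0.0, exam_times[0])
    # Interval-censored: failure occurred between two consecutive examinations.
    return (exam_times[first_decayed_index - 1], exam_times[first_decayed_index])

exams = [12, 24, 36]  # hypothetical annual examinations (months)
sound_throughout = censoring_interval(exams, None)
carious_at_year_one = censoring_interval(exams, 0)
failed_in_year_three = censoring_interval(exams, 2)
print(sound_throughout, carious_at_year_one, failed_in_year_three)
```

Storing a lower and an upper bound for every surface lets all three cases share one representation, which is the form that interval-censored survival procedures typically expect.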
SURVIVAL ANALYSIS IN DENTAL RESEARCH
Survival analysis has been widely used in dental research to evaluate such diverse subjects as longevity of restorations, oral implants, and the natural history of caries. For example, Feigal et al. (2000) tested treatment effects and potential risk factors for sealant failure using survival analysis and found that variables that affected success differed between occlusal and buccal/lingual sealants. Eckert et al. (2001) investigated the survival of oral implants over time and the relationship between implant survival and explanatory variables such as implant location and history of tobacco use. Hujoel et al. (1998) evaluated changes in the 45-year tooth survival probabilities for Norwegian males. Survival analysis was also used by Larmas et al. (1995) to analyze longitudinal data collected from rural health centers in Finland. The time investigated was the time between eruption and first fillings in subjects up to 18 years of age. Filling placement curves were calculated for each tooth and were concluded to be sensitive indicators of oral health at both individual and population levels.
Most of these applications of survival analysis gave rise to univariate survival data, where a single time is recorded for each subject/sealant location, or a separate analysis is carried out for each tooth type. Standard survival analysis techniques, such as the use of life tables (Table 1), Kaplan-Meier estimates, and the Cox regression model, are used to evaluate the probability of survival over time and how the survival experience depends on the values of one or more explanatory variables.
Table 1. Life Table Estimate of the Survivor Function for the Occlusal Surface of the Lower Right Second Molar in a Three-year Caries Clinical Trial
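A life-table estimate of the survivor function can be sketched as follows. This is an illustrative version of the standard actuarial method, in which surfaces censored during an interval are counted as at risk for half of it; the counts in the example are hypothetical and are not the values in Table 1.

```python
def life_table(n_start, events, censored):
    """Actuarial life-table estimate of the survivor function.

    n_start: number of surfaces at risk at the start of the trial.
    events[j], censored[j]: surfaces failing / censored during interval j.
    Returns the estimated probability of surviving to the end of each interval.
    """
    at_risk = n_start
    survival = 1.0
    estimates = []
    for d, c in zip(events, censored):
        # Censored surfaces are assumed to be at risk for half the interval.
        effective = at_risk - c / 2.0
        if effective > 0:
            survival *= 1.0 - d / effective
        estimates.append(survival)
        at_risk -= d + c
    return estimates

# Hypothetical counts for three yearly intervals (illustration only).
estimates = life_table(100, events=[10, 10, 10], censored=[0, 10, 0])
print([round(s, 3) for s in estimates])
```

Each entry is the product of the conditional survival probabilities up to that interval, so withdrawals reduce the number at risk without being counted as failures.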
MULTIVARIATE SURVIVAL DATA
The application of survival analysis techniques to caries clinical trials has only recently been investigated (Hannigan et al., 2001), perhaps because of the requirement to combine the complex data collected in these trials, containing multiple observations from each subject, repeated over different time points, into one dataset for the evaluation of a prophylactic agent for the prevention of dental caries. Caries clinical trials, where data are collected on up to 140 tooth surfaces over a three-year period, give rise to multivariate survival data with multiple events per subject. Modeling multivariate survival data is a relatively new field in statistics (Hougaard, 2000). A major issue is intrasubject correlation. If the units of analysis, i.e., tooth surfaces, do not behave independently, an inflated Type I error will result when significance tests are carried out on the dataset, and a statistically significant result may be found where there is no difference between the groups being compared. The inflated Type I error is a result of the underestimation of the variance of the test statistic. Chuang et al. (2001) compared three models for estimating survival of dental implants: (1) randomly selecting one implant per patient; (2) utilizing all implants, assuming independence among implants from the same subject; and (3) utilizing all implants, assuming dependence among implants from the same subject. They concluded that, to obtain statistically valid variances, they should adjust for the dependence among implants from the same subject.
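The size of the variance underestimation can be illustrated with a simplified case that is not taken from the original analysis: the mean of a measurement taken on m surfaces in each of n subjects, assuming a common variance and the same correlation between any two surfaces in a subject. The symbols below are introduced only for this illustration.

```latex
\operatorname{Var}(\bar{Y})
  = \frac{\sigma^{2}}{nm}\,\bigl[\,1 + (m-1)\,\rho\,\bigr],
\qquad
\text{naive variance (independence assumed)} = \frac{\sigma^{2}}{nm}
```

Under these assumptions, with m = 140 surfaces and a modest correlation of 0.05, the bracketed factor is 1 + 139 × 0.05 = 7.95, so a naive standard error would be too small by a factor of about 2.8. Survival models are more complicated than a simple mean, but the direction of the bias is the same.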
The most common approaches to modeling multivariate survival data include (Therneau and Grambsch, 2000):
- using the time to the first event in the patient as the survival time, ignoring the multiplicity (while this approach is easy to interpret, information may be wasted);
- conditional models, such as the frailty model, which include a random per-subject effect; and
- the marginal models approach.
FRAILTY MODELS
Conditional models for multivariate survival data induce a correlation structure between related survival times through an unobserved random parameter. The most widely used conditional model is the frailty model. The idea is that individuals or families may have different frailties, and those who are most frail will die sooner than the others (Therneau and Grambsch, 2000). An application of frailty models to caries data was considered by Härkänen et al. (2000) with 240 subjects, where the dependence of lifetimes between the teeth of a subject was accounted for by introducing subject-specific frailty parameters. Härkänen et al. assumed that risk factors such as genetic and environmental factors were represented by this parameter. Variability in the values of this parameter across different subjects gave rise to "extra-binomial" variance.
The EM algorithm is generally used as the estimation tool for frailty models, but the algorithm is slow, and proper variance estimates require further computation. Apart from Stata 7 (Stata Corp., College Station, TX 77845, USA), which estimates some parametric survival models with individual-level frailty, no implementation has appeared in any of the more widely available software packages. Given the large number of individuals in a typical caries clinical trial, the use of frailty models may not be computationally practical.
MARGINAL MODELS
Marginal modeling is a term used for an approach where the effect of explanatory variables is estimated based on consideration of the marginal distribution. The dependence of observations is not of interest, but after the modeling, the variability of the regression coefficient estimators is determined by a procedure that accounts for the dependence between the observations (Hougaard, 2000). The advantage of marginal models is their flexibility and the availability of software to implement this approach in packages such as SAS (SAS Institute Inc., Cary, NC 27512-8000, USA). One version of this approach is the independence working model approach, where the analysis is carried out in three steps:
- Decide on a model.
- Fit the data as an ordinary model, ignoring the possible correlation.
- Replace the standard variance estimator with one that is adjusted for the possible correlation.
This approach is similar to generalized estimating equations (GEE), where standard regression models are adapted to handle dependent data by adjustment for a correlation structure. A "working" correlation matrix for the responses from each subject is specified. GEE methodology then uses a robust variance estimator to estimate the variance of the regression coefficients.
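The three steps of the independence working model can be sketched for the simplest possible model, an overall mean. This is an illustration only, assuming simulated data with a hypothetical shared subject effect; in a survival model the residuals below would be replaced by each observation's score contribution, but the logic of summing within a subject before squaring is the same.

```python
import random
from collections import defaultdict

def mean_with_robust_variance(values, clusters):
    """Mean of `values` with naive and cluster-robust (sandwich) variances.

    Step 1: the model is a single overall mean.
    Step 2: the mean is estimated as usual, ignoring the correlation.
    Step 3: the robust variance sums residuals within each cluster
            (subject) before squaring, so within-subject correlation counts.
    """
    n = len(values)
    mean = sum(values) / n
    residuals = [v - mean for v in values]
    naive_var = sum(r * r for r in residuals) / (n - 1) / n
    cluster_sums = defaultdict(float)
    for r, c in zip(residuals, clusters):
        cluster_sums[c] += r
    robust_var = sum(s * s for s in cluster_sums.values()) / n ** 2
    return mean, naive_var, robust_var

# Simulated data: 200 subjects, 10 surfaces each (hypothetical values).
random.seed(1)
values, clusters = [], []
for subject in range(200):
    shared = random.gauss(0.0, 1.0)  # hypothetical shared host factor
    for _ in range(10):
        values.append(shared + random.gauss(0.0, 1.0))
        clusters.append(subject)

mean, naive_var, robust_var = mean_with_robust_variance(values, clusters)
print(naive_var ** 0.5, robust_var ** 0.5)
```

In this simulation the robust standard error is noticeably larger than the naive one, because observations from the same subject share a common component that the naive formula treats as independent information.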
VARIANCE ESTIMATORS
There are several options for robust variance estimation in the marginal models approach to multivariate survival data. Wei et al. (1989) modeled the marginal distribution of each survival time variable using a Cox proportional hazards model, where no particular structure of dependence among survival times on each subject was imposed. The regression parameters were estimated by maximum likelihood, and a robust variance estimate was proposed which was consistent for the asymptotic variance of the estimates from the Cox model. Lipsitz and Parzen (1996) proposed a "one-step" jackknife estimator of variance that is asymptotically equivalent to the estimator of Wei et al. (1989) but saves computational time for large datasets and is easier to obtain from available computer packages. Lipsitz et al. (1994) also proposed a "one-step" jackknife estimator of variance for the analysis of clustered survival data using parametric models. This was based on the sandwich estimator of White (1982), who examined what happened to the properties of maximum likelihood estimators if one does not assume that the probability model is correctly specified. Lipsitz et al. (1994) suggested that if the number of clusters is very large, a grouped jackknife estimator could save computational time.
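The idea behind a cluster jackknife can be sketched as follows. Note that this is the plain delete-one-cluster jackknife with a full re-estimate for each omitted subject and the conventional (g - 1)/g scaling, not the "one-step" approximation of Lipsitz et al. (1994), whose exact form is not reproduced here; the function name and the toy data are hypothetical.

```python
from statistics import mean

def cluster_jackknife_variance(values, clusters, estimator):
    """Delete-one-cluster jackknife variance of estimator(values).

    Each cluster (subject) is left out in turn, the estimate is recomputed
    on the remaining data, and the spread of these estimates is scaled up.
    """
    ids = sorted(set(clusters))
    g = len(ids)
    leave_one_out = []
    for omitted in ids:
        kept = [v for v, c in zip(values, clusters) if c != omitted]
        leave_one_out.append(estimator(kept))
    centre = sum(leave_one_out) / g
    return (g - 1) / g * sum((e - centre) ** 2 for e in leave_one_out)

# Toy data: three subjects with two surfaces each (illustration only).
toy_values = [1, 2, 3, 4, 5, 6]
toy_clusters = [0, 0, 1, 1, 2, 2]
jackknife_var = cluster_jackknife_variance(toy_values, toy_clusters, mean)
print(jackknife_var)
```

Because whole subjects are removed rather than single surfaces, the resulting variance reflects between-subject variation and does not rely on independence within a subject. A grouped version would omit several subjects at a time to reduce the number of refits.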
APPLICATION OF THE MARGINAL MODELS APPROACH TO MULTIVARIATE SURVIVAL DATA FROM A CARIES CLINICAL TRIAL
Previous work by this author and co-workers (Hannigan et al., 2001) converted the data from a three-year caries clinical trial carried out in North Wales, UK, between 1989 and 1992 (O'Mullane et al., 1997) into interval-censored survival data. An accelerated failure time model was fitted to the data by means of a standard survival analysis procedure in SAS. The log-logistic distribution, which allows the rate of decay to change direction, i.e., to increase or decrease over time, provided the best fit for the data. Under the assumption that surfaces within an individual behave independently of one another, a multiple regression model was fitted to the data by maximum likelihood. The "one-step" grouped jackknife estimator of variance proposed by Lipsitz et al. (1994) was used to calculate standard errors for the regression coefficients. All tests of significance were carried out on the log scale with the use of the fitted regression coefficients and the corresponding jackknife estimates of standard errors.
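For reference, a log-logistic accelerated failure time model is commonly written in the following form. The notation (i for subject, j for surface, and the symbols α and γ) is an assumption made for this illustration and may differ from the parameterization used in the original analysis.

```latex
\log T_{ij} = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + \sigma\,\varepsilon_{ij},
\qquad \varepsilon_{ij} \sim \text{standard logistic}

S(t \mid \mathbf{x}) = \frac{1}{1 + (t/\alpha)^{\gamma}},
\qquad
h(t \mid \mathbf{x}) = \frac{(\gamma/\alpha)\,(t/\alpha)^{\gamma-1}}{1 + (t/\alpha)^{\gamma}},
\qquad
\alpha = \exp(\mathbf{x}^{\top}\boldsymbol{\beta}),\quad \gamma = 1/\sigma
```

In this form the median survival time is α. When γ > 1 the hazard rises to a peak and then declines, and when γ ≤ 1 it decreases throughout, which is the sense in which the rate of decay can change direction.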
The previously reported findings for this clinical trial with the use of the conventional method of analysis stated that the mean three-year clinical-only DMFS increment for subjects using the 1500-ppm-F toothpaste was 3.93 compared with 4.19 for those using the 1000-ppm-F toothpastes, a statistically significant 6.2% difference (p < 0.05). There was no significant difference between the mean DMFS increment for those using paste with or without the agent trimetaphosphate (TMP) added. Subjects who claimed to brush more frequently or who claimed not to use a tumbler to rinse after toothbrushing also had lower three-year DMFS increments.
The results from survival analysis showed that fluoride level, baseline caries and calculus, brushing frequency, rinsing method, and the interaction between fluoride level and gender all had statistically significant effects on the survival time of tooth surfaces (Table 2). The positive parameter for fluoride level means that surfaces in subjects using toothpaste with the higher level of fluoride (1500 ppm F) have longer survival times than those using toothpaste with the lower level of fluoride (1000 ppm F). For example, the median survival time of tooth surfaces in female subjects using toothpaste with the higher level of fluoride (1500 ppm F) is 1.07 times the median survival time of surfaces in female subjects using toothpaste with the lower level (1000 ppm F). Other explanatory variables that had a significant positive effect on survival time were brushing more frequently, using any method of rinsing except a tumbler, and having calculus at baseline. The negative parameter for baseline caries means that the higher the baseline caries value, the shorter the survival time of surfaces in a subject. The parameter estimate for the fluoride * gender interaction term is negative, indicating that the beneficial effect of using paste with the higher level of fluoride is less for males than it is for females.
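The link between a regression coefficient on the log scale and a ratio of median survival times can be sketched as follows. The intercept and scale values are hypothetical and are not taken from Table 2; the fluoride coefficient is simply set to the value that would reproduce the reported ratio of 1.07 for one subgroup.

```python
import math

def loglogistic_survival(t, linear_predictor, scale):
    """S(t) for a log-logistic AFT model: log T = linear_predictor + scale * error."""
    alpha = math.exp(linear_predictor)  # median survival time
    return 1.0 / (1.0 + (t / alpha) ** (1.0 / scale))

def median_survival(linear_predictor):
    """Median survival time implied by the linear predictor."""
    return math.exp(linear_predictor)

# Hypothetical values for illustration only, not taken from Table 2.
intercept, scale = 4.0, 0.8
beta_fluoride = math.log(1.07)  # log-scale effect that yields a ratio of 1.07

median_low = median_survival(intercept)
median_high = median_survival(intercept + beta_fluoride)
time_ratio = median_high / median_low
print(round(time_ratio, 2))
```

In an accelerated failure time model the ratio of medians depends only on the difference in linear predictors, so it equals the exponential of the combined log-scale effect regardless of the intercept or scale chosen.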
Table 2. Parameter Estimates, Standard Errors, and Corresponding p-values for Variables in Accelerated Failure Time Model Based on Log-logistic Distribution
The survival curves illustrating the effects of gender and fluoride level on the survival rates for the occlusal surfaces of the 4 first molars, for a 12-year-old subject using toothpaste without TMP added and with a baseline DMFS value of 10, show the beneficial effect of the higher level of fluoride on the survival rates for surfaces in females (Fig.). No benefit was observed among males, probably due to the relatively large proportion of males who claimed to brush less than once a day on average (32% compared with 12% of females).

Figure. Survival curves illustrating the probability of survival over time of males and females using toothpaste with different levels of fluoride (1000 and 1500 ppm F).
FURTHER RESEARCH
The marginal model approach may be more suitable than the frailty model for multivariate survival data from caries clinical trials, since most of the source of variability is due to different anatomic susceptibilities of the tooth surfaces rather than to the difference in subject-specific frailty. Computationally, the marginal model approach is easier to implement with standard software packages. This approach has already been used for a caries clinical trial by this author (Hannigan et al., 2001), but further research is required to investigate if it is a more sensitive method of analysis, i.e., whether causative factors can be identified with fewer subjects than with the conventional method of analysis. The methodology also needs to be applied to other caries clinical trials so that it can be determined if the results are consistent with those of the conventional method of analysis. The assumption of a parametric model may not be suitable for all datasets, but an SAS macro is available for fitting non-parametric models to interval-censored survival data. Lipsitz and Parzen (1996) have also developed a jackknife estimator of variance for non-parametric models.
ADVANTAGES OF SURVIVAL ANALYSIS
The use of survival time as the outcome measure is an approach that can be understood by everyone: dentists, patients, and researchers. It has an intuitive appeal, particularly given the importance of time in caries research: eruption time, time to decay, and length of exposure to treatment. The results of the analysis are easily interpreted with the use of survival curves and median survival times. The results could also be presented as three-year survival rates for each surface, similar to the five-year survival rates in cancer clinical trials.
Because the methodology proposed here is surface-based, it uses all the data collected in the trial, including data collected at the intermediate exams and data from subjects who are not present at the final examination. Comparison of survival analysis with the conventional method of analysis could be carried out by their application to randomly chosen subsets of a given clinical trial, with the results noted and the procedure repeated several times, so that power estimates can be calculated. This investigation would be repeated for a range of sample sizes. Survival analysis based on a sample of size n may be as powerful as the conventional method of analysis based on a sample of size Kn for some value of K > 1. If K = 2, for instance, it could be concluded that survival analysis based on a sample size of 1000 is as powerful as the conventional method of analysis based on a sample size of 2000. If it is shown that survival analysis is a more sensitive method of analysis than the current method, the number of subjects required for caries clinical trials, and hence the corresponding costs, could be reduced substantially.
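The subsampling comparison described above could be organized along the following lines. This is a sketch under stated assumptions: the trial data are simulated, and the `z_test` function is a hypothetical stand-in for whichever analysis (conventional or survival-based) is being evaluated.

```python
import random
from statistics import mean, variance

def estimate_power(subjects, test, n, repeats=200, seed=0):
    """Share of random subsets of size n in which `test` rejects the null."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(repeats):
        subset = rng.sample(subjects, n)
        if test(subset):
            rejections += 1
    return rejections / repeats

def z_test(subset, critical=1.96):
    """Hypothetical stand-in analysis: two-group comparison of subject outcomes."""
    a = [v for g, v in subset if g == 0]
    b = [v for g, v in subset if g == 1]
    if len(a) < 2 or len(b) < 2:
        return False
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return abs(mean(a) - mean(b)) / se > critical

# Simulated trial of 2000 subjects (group, outcome); values are illustrative.
sim = random.Random(42)
trial = [(g, sim.gauss(-0.3 * g, 1.0)) for g in (0, 1) for _ in range(1000)]

power_small = estimate_power(trial, z_test, n=200)
power_large = estimate_power(trial, z_test, n=800)
print(power_small, power_large)
```

Running the same loop with two different analysis functions over a range of subset sizes would give the pairs of power curves needed to read off a value of K, the ratio of sample sizes at which the two methods achieve equal power.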
SUMMARY
Exploiting recent advances in statistical methods, particularly for correlated intra-subject data, could increase the efficiency of caries clinical trials. Methods of analysis using the tooth surface as the unit should be investigated. Data from caries clinical trials could be analyzed as multivariate survival data with the use of a marginal models approach. This method has been applied to a caries clinical trial, but further research is required to investigate if it is a more sensitive method of analysis.
ACKNOWLEDGMENTS
The author acknowledges Prof. O'Mullane (Oral Health Services Research Centre, University College Cork), Prof. Barry (Department of Mathematics and Statistics, University of Limerick), and Unilever Dental Research for data, financial support, and guidance for the original project on which this paper is based.
FOOTNOTES
Presented at the International Consensus Workshop on Caries Clinical Trials, Glasgow, Scotland, January 7-10, 2002.
REFERENCES
Chuang SK, Tian L, Wei LJ, Dodson TB (2001). Kaplan-Meier analysis of dental implant-survival: a strategy for estimating survival with clustered observations. J Dent Res 80:2016-2020.
DeRouen TA, Mancl L, Hujoel P (1991). Measurements of associations in periodontal diseases using statistical methods for dependent data. J Periodontal Res 26:218-229.
DeRouen TA, Hujoel PP, Mancl LA (1995). Statistical issues in periodontal research. J Dent Res 74:1731-1737.
Eckert SE, Meraw SJ, Weaver AL, Lohse CM (2001). Early experience with wide-platform mk II implants. Part I: Implant survival. Part II: Evaluation of risk factors involving implant survival. Int J Oral Maxillofacial Implants 16:208-216.
Feigal RJ, Musherure P, Gillespie B, Levy-Polack M, Quelhas I, Hebling J (2000). Improved sealant retention with bonding agents: a clinical study of two-bottle and single-bottle systems. J Dent Res 79:1850-1856.
Hannigan A, O'Mullane DM, Barry D, Schafer F, Roberts AJ (2001). A re-analysis of a caries clinical trial by survival analysis. J Dent Res 80:427-431.
Härkänen T, Virtanen JI, Arjas E (2000). Caries on permanent teeth: a non-parametric Bayesian analysis. Scand J Statist 27:577-588.
Hougaard P (2000). Analysis of multivariate survival data. New York: Springer-Verlag.
Hujoel PP, Löe H, Anerud A, Boysen H, Leroux BG (1998). Forty-five-year tooth survival probabilities among men in Oslo, Norway. J Dent Res 77:2020-2027.
Larmas MA, Virtanen JI, Bloigu RS (1995). Timing of first restorations in permanent teeth: a new system for oral health determination. J Dent 23:347-352.
Lipsitz SR, Parzen M (1996). A jackknife estimator of variance for Cox regression for correlated survival data. Biometrics 52:291-298.
Lipsitz SR, Dear KB, Zhao L (1994). Jackknife estimators of variance for parameter estimates from estimating equations with applications to clustered survival data. Biometrics 50:842-846.
O'Mullane DM, Kavanagh D, Ellwood RP, Chesters RK, Schäfer F, Huntington E, et al. (1997). A 3-year clinical trial of a combination of trimetaphosphate and sodium fluoride in silica toothpastes. J Dent Res 76:1776-1781.
Therneau TM, Grambsch PM (2000). Modeling survival data: extending the Cox model. New York: Springer-Verlag.
Wei LJ, Lin DY, Weissfeld L (1989). Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. J Am Stat Assoc 84:1065-1073.
White H (1982). Maximum likelihood estimation of misspecified models. Econometrica 50:1-25.
Worthington H (1984). Statistical methodology for clinical trials of caries prophylactic agents: current knowledge. Int Dent J 34:278-284.