1 [email protected] 1 Clinical Trial Protocol Critical statistical issues: a regulatory view
2 Documentation: http://ferran.torres.name/docencia/mtcr
3
4 Regulatory Guidances
–ICH E9: Statistical Principles for Clinical Trials
–CPMP/EWP/908/99: Points to Consider on Multiplicity Issues in Clinical Trials (Apr 2003)
–CPMP/EWP/2863/99: Points to Consider on Adjustment for Baseline Covariates (Nov 2003)
–CPMP/2330/99: Points to Consider on Application with 1) Meta-analyses and 2) One Pivotal Study (May 2001)
–CPMP/EWP/2158/99: Guideline on the Choice of a Non-Inferiority Margin (Jan 2006)
–CPMP/EWP/482/99: Points to Consider on Switching between Superiority and Non-inferiority (Feb 2001)
–CPMP/EWP/1776/99: Points to Consider on Missing Data (Jan 2002)
–CHMP/EWP/83561/05: Guideline on Clinical Trials in Small Populations (Feb 2007)
–CHMP/EWP/2459/02: Reflection Paper on Methodological Issues in Confirmatory Clinical Trials with Flexible Design and Analysis Plan
5 Today’s talk is on statistics
7 Statistical Considerations
8 The role of statistics: “Thus statistical methods are no substitute for common sense and objectivity. They should never aim to confuse the reader, but instead should be a major contributor to the clarity of a scientific argument.” Pocock SJ. The role of statistics. Br J Psychiat 1980; 137:188-190
9 Key statistical issues
–Multiplicity
–Subgroups: interaction & confounding
–Superiority and non-inferiority (and Δ)
–Adjustment by covariates
–Missing data
–Others: interim analyses; meta-analysis vs one pivotal study; flexible designs
10 MULTIPLICITY
11 “To say it colloquially, torture the data until they speak...” Lancet 2005; 365: 1591–95
12 Torturing data… –Investigators examine additional endpoints, manipulate group comparisons, do many subgroup analyses, and undertake repeated interim analyses. –Investigators should report all analytical comparisons implemented. Unfortunately, they sometimes hide the complete analysis, handicapping the reader’s understanding of the results. Lancet 2005; 365: 1591–95
13 Design, Conduct, Results
14 Multiplicity: $K$ independent hypotheses $H_{01}, H_{02}, \ldots, H_{0K}$; $S$ = number of significant results ($p < \alpha$). $\Pr(S \ge 1 \mid H_{01} \cap H_{02} \cap \cdots \cap H_{0K} = H_{0\cdot}) = 1 - \Pr(S = 0 \mid H_{0\cdot}) = 1 - (1 - \alpha)^{K}$
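The growth of this probability with the number of tests can be checked numerically. The following is a minimal sketch, assuming K independent tests that all use the same level alpha; the function name `familywise_error_rate` is illustrative and does not come from the slides.

```python
def familywise_error_rate(k: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive among k independent tests.

    Implements 1 - (1 - alpha)^k, valid only under independence and when
    every null hypothesis is true.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    return 1.0 - (1.0 - alpha) ** k


if __name__ == "__main__":
    # Tabulate the error rate for a few illustrative numbers of tests.
    for k in (1, 2, 5, 10, 20, 50):
        print(f"K = {k:2d}   Pr(S >= 1 | H0) = {familywise_error_rate(k):.3f}")
```

With correlated endpoints the inflation is smaller than this formula suggests, so the values are best read as an upper reference rather than an exact figure.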
15 Some examples
16 Multiplicity: Bonferroni correction (simplified version) –K tests with a global significance level of α –Each test can be tested at the α/K level. Example: –5 independent tests –Global level of significance = 5% –Each test should be tested at the 1% level: 5%/5 = 1%
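The simplified Bonferroni rule amounts to one division and one comparison per test. A minimal sketch follows; the function names and the example p-values are hypothetical and serve only as illustration.

```python
def bonferroni_level(alpha: float, k: int) -> float:
    """Per-test significance level under the simple Bonferroni correction."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return alpha / k


def bonferroni_reject(p_values, alpha=0.05):
    """Return one True/False decision per p-value, each tested at alpha / K."""
    level = bonferroni_level(alpha, len(p_values))
    return [p <= level for p in p_values]


if __name__ == "__main__":
    # Hypothetical p-values for 5 tests; each is compared with 0.05 / 5.
    p_values = [0.003, 0.02, 0.04, 0.009, 0.30]
    print(bonferroni_level(0.05, len(p_values)))
    print(bonferroni_reject(p_values))
```

In the hypothetical example only the p-values at or below the per-test level count as significant, even though four of the five are below 0.05.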
17 But this is the simplified version for the general public
18 Cautionary Example: RCT to treat rheumatoid arthritis (Basic Clin Med 1981, 15: 445). Several end-points repeated at various timepoints and various subdivisions. 48 of these gave p-values < 0.05. But… expect 5% of 850 = 850/20 = 42.5 => so finding 48 is not very impressive
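The arithmetic of the cautionary example can be reproduced, and extended with a tail probability. This is a sketch only: the binomial calculation assumes the 850 comparisons are independent, which the slide does not claim and which repeated end-points in one trial would not satisfy. The function names are illustrative.

```python
from math import comb


def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Expected number of p < alpha results when every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least(k: int, n: int, p: float) -> float:
    """Pr(X >= k) for X ~ Binomial(n, p), computed as 1 - Pr(X <= k - 1).

    ASSUMPTION: independent tests; this is not stated in the source.
    """
    lower_tail = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - lower_tail


if __name__ == "__main__":
    print(expected_false_positives(850))   # expected count under the null
    print(prob_at_least(48, 850, 0.05))    # chance of 48 or more by luck alone
```

Printing the tail probability alongside the expected count gives a rough sense of how unremarkable 48 nominally significant results are among 850 comparisons.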
19 Some strategies to deal with multiple comparisons
20 Handling Multiplicity in Variables Scenario 1: One Primary Variable –Identify one primary variable; other variables are secondary –Trial is positive if and only if the primary variable shows significant (p < 0.05), positive results
22 Handling Multiplicity in Variables Scenario 2: Divide Type I Error –Identify two (or more) co-primary variables –Divide the 0.05 experiment-wise Type I error over these co-primary variables, e.g., 0.04 for the 1st and 0.01 for the 2nd co-primary variable –Trial is positive if at least one of the co-primary variables shows significant, positive results
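Splitting the Type I error across co-primary variables can be sketched as below. The 0.04/0.01 split comes from the slide; the function names and the p-values in the example are hypothetical, and the sketch checks only significance, not the direction of the effect.

```python
def split_alpha_decisions(p_values, alphas, total_alpha=0.05):
    """Test each co-primary variable at its own pre-assigned share of alpha."""
    if len(p_values) != len(alphas):
        raise ValueError("need exactly one alpha share per p-value")
    if sum(alphas) > total_alpha + 1e-12:
        raise ValueError("alpha shares exceed the experiment-wise level")
    return [p <= a for p, a in zip(p_values, alphas)]


def trial_is_positive(p_values, alphas, total_alpha=0.05):
    """The trial is positive if at least one co-primary variable is significant."""
    return any(split_alpha_decisions(p_values, alphas, total_alpha))


if __name__ == "__main__":
    alphas = [0.04, 0.01]      # split used on the slide
    p_values = [0.03, 0.02]    # hypothetical results
    print(split_alpha_decisions(p_values, alphas))
    print(trial_is_positive(p_values, alphas))
```

The shares have to be fixed in the protocol; choosing them after seeing the p-values would defeat the purpose of the split.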
23 Handling Multiplicity in Variables Scenario 3: Sequentially Rejective Procedure –Identify n co-primary variables, e.g., n = 3 –Order the obtained p-values: interpret the variable with the smallest p-value at the 0.05/3 level; if significant, then interpret the variable with the 2nd smallest p-value at the 0.05/2 level; if significant, then interpret the variable with the largest p-value at the 0.05 level. –The test procedure stops when a test is not significant.
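A minimal sketch of Holm's sequentially rejective procedure, which tests the smallest p-value at alpha/n, the next smallest at alpha/(n-1), and so on up to the largest at alpha, stopping at the first non-significant test. The function name and the example p-values are hypothetical.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure; returns one decision per input p-value."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    reject = [False] * n
    for rank, idx in enumerate(order):
        # rank 0 is tested at alpha / n, rank 1 at alpha / (n - 1), ...
        if p_values[idx] <= alpha / (n - rank):
            reject[idx] = True
        else:
            break  # stop at the first non-significant test
    return reject


if __name__ == "__main__":
    print(holm_reject([0.012, 0.030, 0.020]))  # hypothetical p-values
    print(holm_reject([0.020, 0.030, 0.040]))  # hypothetical p-values
```

In the second hypothetical example every p-value is below 0.05, yet nothing is rejected because the smallest one already fails its 0.05/3 threshold.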
24 Handling Multiplicity in Variables Scenario 4: Hierarchy –Prespecify a hierarchy among n co-primary variables –All tested at the same level: interpret the 1st variable at the 0.05 level; if significant, then interpret the 2nd variable at the 0.05 level; if significant, then interpret the 3rd variable at the 0.05 level, … The test procedure stops when a test is not significant. –Trial is positive if the first co-primary variable shows a significant, positive result
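The hierarchical (fixed-sequence) scheme can be sketched in a few lines. This assumes the p-values are supplied in the prespecified order; the function name and example values are hypothetical.

```python
def fixed_sequence_reject(p_values_in_order, alpha=0.05):
    """Test hypotheses in their prespecified order, each at the full alpha.

    Once one test fails, all later hypotheses are left unrejected,
    whatever their p-values.
    """
    decisions = []
    stopped = False
    for p in p_values_in_order:
        if not stopped and p <= alpha:
            decisions.append(True)
        else:
            stopped = True
            decisions.append(False)
    return decisions


if __name__ == "__main__":
    # Hypothetical p-values, listed in the order fixed in the protocol.
    print(fixed_sequence_reject([0.01, 0.20, 0.001]))
```

The third hypothetical variable has a very small p-value but cannot be claimed, because the second variable in the hierarchy was not significant.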
25 Secondary Variables: Secondary variables can be claimed if and only if –the primary variable shows significant results, and –the comparisons related to the secondary variables are protected under the same Type I error rate as the primary variable. Procedures similar to those already discussed can be used to protect the Type I error
26 Handling Multiplicity in Treatments: Procedures similar to those for handling multiplicity in variables apply. Additional procedures are available, mainly geared to very specific settings of the statistical hypotheses. –Dunnett, Scheffé, REGW, Williams …
27 SUBGROUPS
28 Subgroups: Indiscriminate subgroup analyses pose serious multiplicity concerns. Problems reverberate throughout the medical literature. Even after many warnings, some investigators doggedly persist in undertaking excessive subgroup analyses. Lancet 2000; 355: 1033–34; Lancet 2005; 365: 1657–61
29 Interaction [Figure: treatment difference for age < 45 years vs age >= 45 years; labels d = 5%, d = 0.7%, d = 11.5%]
30 Confounding factors [Figure: treatment difference for non-smokers vs smokers; labels d = 6%, d = 0%]
31 Subgroups & Simpson’s Paradox
32 Subgroups & Simpson’s Paradox (cont.)
33 Subgroups. ISIS-2: Vascular death by star signs
                  Gemini/Libra           Other star signs
                  Aspirin   Placebo      Aspirin   Placebo
Vascular death    150       147          654       868
Total             1357      1442         7228      7157
Rate              11.1%     10.2%        9.0%      12.1%
                  p=0.42045, d=-0.9      p
34 Changes from ISIS-2 results. Lancet 2005; 365: 1657–61
35 “The answer to a randomized controlled trial that does not confirm one’s beliefs is not the conduct of several subanalyses until one can see what one believes. Rather, the answer is to re-examine one’s beliefs carefully.” –BMJ 1999; 318: 1008–09.
36 Lancet 2005; 365: 1657–61
37 The question is NOT: ‘Is the treatment effect in this subgroup statistically significantly different from zero?’ BUT: ‘Are there any differences in the treatment effect between the various subgroups?’ The correct statistical procedure is either a test of heterogeneity or a test for interaction
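A test for interaction between two subgroups can be sketched as a z-test on the difference between the two subgroup effects. The counts below are the ISIS-2 star-sign figures quoted in this talk; the function names, the use of the risk difference (placebo minus aspirin) and the normal approximation with unpooled variances are assumptions made for illustration, not necessarily the analysis used in the original publication.

```python
from math import erfc, sqrt


def risk_difference(events_ctrl, n_ctrl, events_trt, n_trt):
    """Return (d, variance) with d = control risk - treated risk."""
    p_c = events_ctrl / n_ctrl
    p_t = events_trt / n_trt
    variance = p_c * (1 - p_c) / n_ctrl + p_t * (1 - p_t) / n_trt
    return p_c - p_t, variance


def interaction_z_test(effect_a, effect_b):
    """Compare two independent subgroup effects, each given as (d, variance).

    Returns the z statistic and its two-sided normal p-value.
    """
    d_a, var_a = effect_a
    d_b, var_b = effect_b
    z = (d_a - d_b) / sqrt(var_a + var_b)
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


if __name__ == "__main__":
    # ISIS-2 vascular deaths: (placebo events, placebo n, aspirin events, aspirin n)
    gemini_libra = risk_difference(147, 1442, 150, 1357)
    other_signs = risk_difference(868, 7157, 654, 7228)
    print(gemini_libra[0], other_signs[0])
    print(interaction_z_test(gemini_libra, other_signs))
```

The star-sign grouping was chosen precisely to show how data-driven subgroups mislead, so even an interaction statistic that looks large here carries little weight without pre-specification and biological plausibility.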
38 Subgroups: Recommendations –1) Examine the global effect –2) Test for the interaction –3) Plan adjustments for confirmatory analyses –4) Some points which increase the credibility: pre-specification, biologic plausibility
39 Lancet 2005; 365: 176–86
40 MULTIPLE INSPECTIONS
41 Interim Analyses in the CDP [Figure: Coronary Drug Project Mortality Surveillance; Z value (-2 to +2) by month of follow-up, 0 to 100 (Month 0 = March 1966, Month 100 = July 1974)] Circulation. 1973;47:I-1 http://clinicaltrials.gov/ct/show/NCT00000483
42 Lancet 2005; 365: 1657–61
43 CPMP/EWP/482/99: PtC on Switching between Superiority and Non-Inferiority & CPMP/EWP/2158/99: PtC on the Choice of Delta
44 NON-INFERIORITY TRIALS: WHY THEY ARE NEEDED. Legal implications. Methodological implications. Ethical and practical limitations on the use of placebo. Practical limitations on showing superiority over an active control. Need for comparative information. Possible added value.
47 Power-based approach (classical test + power calculation)
51 Lancet 2001; 356: 1668-75
53 Added value. Dosing: once daily. Route: oral. Safety: adverse events. Special populations: elderly, paediatrics. Interactions
54 Equivalence trials: bioequivalence trials (generic vs marketed product). Our product is not worse and may offer other advantages (safety, dosing convenience …) –Non-inferiority
55 SUPERIORITY STUDY [Figure: 95% CI for the difference d on an axis through d < 0 (negative effect), d = 0 (no difference) and d > 0 (positive effect); axis ends labelled “Test better” and “Control better”]
56 INTERVAL ESTIMATION (SUPERIORITY STUDY): statistically significant [Figure: 95% CI relative to d < 0 (negative effect), d = 0 (no difference) and d > 0 (positive effect); axis ends labelled “Test better” and “Control better”]
57 INTERVAL ESTIMATION (SUPERIORITY STUDY): statistically significant with P = 0.05 (right at the limit) [Figure: 95% CI relative to d < 0 (negative effect), d = 0 (no difference) and d > 0 (positive effect); axis ends labelled “Test better” and “Control better”]
58 EQUIVALENCE STUDY [Figure: axis for d through d < 0 (negative effect), d = 0 (no difference) and d > 0 (positive effect); region of clinical equivalence between -Δ and +Δ] Delta (Δ): the largest difference without clinical relevance, or the smallest difference with clinical relevance
59 EQUIVALENCE [Figure: axis marked -Δ, 0, +Δ; example intervals labelled “Equivalence” and “No equivalence”]
60 THERAPEUTIC NON-INFERIORITY [Figure: axis marked -Δ and 0; example intervals labelled “Non-inferiority” and “No non-inferiority”; axis ends labelled “Test better” and “Control better”]
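The interval-based reading of superiority, equivalence and non-inferiority can be sketched as a comparison of confidence limits with 0 and with the margin. This assumes d = test minus control with positive values favouring the test treatment, a two-sided confidence interval, and a prespecified margin delta > 0; the function name and the numbers in the example are hypothetical.

```python
def interpret_ci(lower: float, upper: float, delta: float) -> dict:
    """Classify a confidence interval for d = test - control.

    ASSUMPTIONS: d > 0 favours the test treatment; delta is the
    prespecified margin (largest clinically irrelevant difference).
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")
    return {
        "superior": lower > 0,                             # CI entirely above 0
        "non_inferior": lower > -delta,                    # CI entirely above -delta
        "equivalent": lower > -delta and upper < delta,    # CI inside (-delta, +delta)
    }


if __name__ == "__main__":
    # Hypothetical intervals on an arbitrary scale, margin delta = 2.0
    print(interpret_ci(-1.0, 3.0, delta=2.0))
    print(interpret_ci(0.5, 1.5, delta=2.0))
    print(interpret_ci(-3.0, 1.0, delta=2.0))
```

Note that the three labels are not mutually exclusive: an interval lying above 0 but inside the margin is superior, non-inferior and equivalent at once, which is why the clinical relevance of the margin matters as much as the test result.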
61 [Figure: comparison of B, A and P; labels 30%, 1/2 ?, 1/3 ?]
64 [Figure: label 30%]
65 RANDOMIZATION & COVARIATES
66 Adjustment: The objective should be not to compensate for imbalance (that is the role of randomisation) but to improve precision. Avoid adjusting for post-randomization variables. In an RCT, never use this widespread strategy: “adjust for any baseline variable that is significant (at the 5% or 10% level)”
67 Stratification: A priori. It may be desirable to have treatment groups balanced with respect to prognostic or risk factors (covariates). For large studies, randomization “tends” to give balance. For smaller studies a better guarantee may be needed. Useful only to a limited extent (especially for small trials): avoid too many variables (i.e. many empty or partly filled strata)
68 Testing for “baseline homogeneity”: All observed differences are known with certainty to be due to chance. We must not test for it: there is no alternative hypothesis whose truth can be supported by such a test. If significant, the estimator is still unbiased. Balance: –Decreases the variance and increases the power. –It has no effect on type I error.
69 Observed imbalance… NEVER justifies post hoc adjustment: –Randomization is more important –The treatment effect is unbiased without adjustment (randomization) –The Type I error level already accounts for “chance error” –Post hoc: data-driven analyses –Multiplicity issues: allowing a post hoc adjustment increases the type I error
70 Adjusted Analyses: ‘When the potential value of an adjustment is in doubt, it is often advisable to nominate the unadjusted analysis as the one for primary attention, the adjusted analysis being supportive.’
71 Covariate adjustment: A priori definition. The appearance of baseline imbalances does NOT justify adjustment per se: –More weight is given to randomisation –Danger of post hoc analyses –Multiplicity. As a general strategy, adjustment for significant baseline variables (e.g., p
72 MISSING DATA
73 Ex: LOCF & linear extrapolation [Figure: ADAS-Cog score (higher = worse, lower = better) against time, 0 to 18 months; LOCF compared with linear regression, with the resulting bias marked]
74 Ex: Early drop-out due to AE [Figure: ADAS-Cog score (higher = worse, lower = better) against time, 0 to 18 months; placebo and active arms] Bias: favours active
75 Ex: Early drop-out due to lack of efficacy [Figure: ADAS-Cog score (higher = worse, lower = better) against time, 0 to 18 months; placebo and active arms] Bias: favours placebo
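76 Drop-outs and missing data [Figure: patients randomised (RND) to A or B and followed from baseline through visit 1 and visit 2 to the last visit; the arms differ (≠) in the frequency of drop-outs]
77 Drop-outs and missing data [Figure: patients randomised (RND) to A or B and followed from baseline through visit 1 and visit 2 to the last visit; the arms differ (≠) in the timing of drop-outs]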
78 Handling of MD (missing data)
Methods for imputation:
–Many techniques
–No gold standard for every situation
–In principle, all methods may be valid, from simple to more complex: LOCF, worst case, “mean methods”, multiple imputation
–But their appropriateness has to be justified
Statistical approaches less sensitive to MD:
–Mixed models
–Survival models
They assume no relationship between treatment and the missing outcome, and generally this cannot be assumed.
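Two of the simple single-imputation methods named above, LOCF and worst case, can be sketched as follows. Missing visits are represented by `None`; the function names, the ADAS-Cog-like scores and the worst value of 70 are all hypothetical choices made for illustration.

```python
def locf(values):
    """Last observation carried forward over missing (None) visits.

    Visits that are missing before any observation stay None.
    """
    filled = []
    last_seen = None
    for value in values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


def worst_case(values, worst):
    """Replace every missing visit with a prespecified worst value."""
    return [worst if value is None else value for value in values]


if __name__ == "__main__":
    # Hypothetical scores by visit (higher = worse); drop-out after visit 2.
    scores = [20, 22, None, None]
    print(locf(scores))
    print(worst_case(scores, worst=70))
```

In a progressive condition, LOCF freezes a patient at the last observed score, so an early drop-out looks better than continued follow-up would have shown; this is the kind of bias the ADAS-Cog examples in this talk illustrate, and it is why the choice of method has to be justified.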
79 CONCLUSION
82 2007
83 Thank you for your attention! http://ferran.torres.name/docencia/mtcr