Median incremental cost-effectiveness ratio (ICER)

Author: Percival Morgan

1 Median incremental cost-effectiveness ratio (ICER) Heejung Bang

2 Background: Cost-effectiveness analysis (CEA) is an economic evaluation that examines both the costs and the health outcomes of alternative strategies, and it has been applied in various fields. - PubMed finds >57,000 articles for ‘CEA’. The incremental cost-effectiveness ratio (ICER) has served as the most popular methodology in CEA. Despite the extreme popularity of this method and some advantages of other measures of central tendency, statistical measures other than the mean have seldom been used for cost or effectiveness in CEA.

3 CHANGE? As far as I know, people started to use the median ICER in 2007: Bishai et al. (2007). The cost effectiveness of antiretroviral treatment strategies in resource-limited settings. AIDS. Nadler et al. (2009) reported mean & median ICERs together.

4 CEA is often conducted from RCTs (or observational studies) - as a secondary outcome. CEA is a key part of “Comparative Effectiveness Research.” CEA is particularly interesting when a (new) treatment is more effective and costs more. Yet the International Society for Pharmacoeconomics and Outcomes Research Task Force (2005) recommended that CEA be performed even when clinical effectiveness fails to be demonstrated – agree, but this poses a mathematical challenge (more later)!

5 Some more background: Less known stories? But it does not take long to realize that few fields/methods are as controversial as CEA/ICER. Interesting publications: Neumann (2004). Why Don't Americans Use Cost-Effectiveness Analysis? Birch and Gafni (2006). Information Created to Evade Reality (ICER). Two editorials in Ann Int Med (2008): - A Menu without Prices - Cost-Effectiveness Information: Yes, It's Important, but Keep It Separate, Please!

6 Some more background: Reality? Published articles may be rejected by certain journals in part because of a reviewer's disagreement with the CEA methodology. As studies that use CEA continue to proliferate, this conflict is sure to reach a flash point. Journal editors may be forced to moderate methodological disagreements. (Hoch 1999)

7 Why Medians? Cost data are often (extremely) skewed. The median may represent TYPICAL data better. In effectiveness evaluation, ‘median’ survival time and nonparametric methods (e.g., Kaplan-Meier, log-rank) are considered the norm. ‘Total cost’ can be directly derived from the mean, not from the median or other central measures. Also, the mean may better reflect the ‘societal perspective’. Median ~= individual/payer’s perspective?? The median is the norm in housing price and income statistics (No mean or SD in Amstat annual salary survey!)

8 Facts about Mean vs. Median: For a symmetric distribution, mean=median. Mean and median are the two extremes of the trimmed mean (mean=0% trimming and median=100% trimming). --- a trimmed-mean or median Consumer Price Index is often reported. mean(X+Y)=mean(X)+mean(Y) & mean(X-Y)=mean(X)-mean(Y) vs., in general, median(X+Y)≠median(X)+median(Y) & median(X-Y)≠median(X)-median(Y), except for sorted data.
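A minimal sketch of the additivity point above, using made-up numbers (the lists `x` and `y` are hypothetical and not from the talk). It shows that the mean of a sum equals the sum of the means, that the medians do not add for these data, and that they agree again once both variables are sorted the same way.

```python
from statistics import mean, median

# Hypothetical paired observations, chosen only for illustration.
x = [1, 2, 10]
y = [10, 1, 2]
sums = [a + b for a, b in zip(x, y)]                 # [11, 3, 12]

# Means are additive (the gap is zero up to floating-point rounding).
mean_gap = mean(sums) - (mean(x) + mean(y))

# Medians are not additive here: 11 - (2 + 2) = 7.
median_gap = median(sums) - (median(x) + median(y))

# If both variables are sorted the same way, the medians line up again.
xs, ys = sorted(x), sorted(y)
sorted_sums = [a + b for a, b in zip(xs, ys)]        # [2, 4, 20]
sorted_gap = median(sorted_sums) - (median(xs) + median(ys))  # 4 - 4 = 0
```

This is why a median ICER has to be built from four separately computed medians rather than from the median of per-patient differences.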

9 If the natural logarithm of a variable X is normally distributed with mean=μ and SD=σ, then the mean of X is exp(μ+σ²/2) and the median of X is exp(μ). It has been reported that, in practice, t-tests on log-transformed costs frequently use a wrong null hypothesis that does not account for skewness (Zhou et al. 1997).
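A small sketch of the two formulas above, with hypothetical parameter values. Two log-normal cost distributions with the same μ but different σ share a median yet have different means. This may be the kind of situation the Zhou et al. remark has in mind: equality on the log scale need not carry over to the means.

```python
import math

def lognormal_mean(mu, sigma):
    """Mean of X when log(X) ~ Normal(mu, sigma)."""
    return math.exp(mu + sigma ** 2 / 2)

def lognormal_median(mu):
    """Median of X when log(X) ~ Normal(mu, sigma); sigma plays no role."""
    return math.exp(mu)

# Hypothetical (mu, sigma) pairs for two cost distributions.
group_a = (5.0, 0.5)
group_b = (5.0, 1.5)

# Same mu, hence the same median.
same_median = lognormal_median(group_a[0]) == lognormal_median(group_b[0])

# Ratio of means is exp((1.5**2 - 0.5**2) / 2) = exp(1).
mean_ratio = lognormal_mean(*group_b) / lognormal_mean(*group_a)
```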

10 Mean vs. median: with censoring. The mean (survival time) can be estimated only if the largest observation is uncensored – remember the warning in proc lifetest in SAS! --- common remedy: use the restricted mean. The median can be estimated as long as the survival function (e.g., KM curve) reaches 0.5. Mean/median survival time ≠ mean/median follow-up time. Almost all we know in standard regression is mean regression (e.g., proc reg).

11 Purposes of this talk: 1. To propose a median-based ICER. 2. To provide systematic recommendations about inferential procedures. Remark: the same methods are equally applicable to the mean-based ICER. 3. To demonstrate mean- vs. median-based analyses with real RCT data. Disclaimer: No new statistical methods are proposed here.

12 Basic concepts of CEA: intuitive. Let Mi and Ei denote cost and effectiveness measures, respectively, for the i-th group, i=1 (new), 2 (standard).
If E1>E2 and M1<M2: New intervention is more effective and less costly.
If E1>E2 and M1>M2: New intervention is more effective and more costly.
If E1<E2 and M1<M2: New intervention is less effective and less costly.
If E1<E2 and M1>M2: New intervention is less effective and more costly.
If E1=E2, then compare M1 and M2. (Cost minimization)
If M1=M2, then compare E1 and E2.

13 Cost-effectiveness Plane (Black 1990)

14 Mean-based ICER (standard approach) Remark: for simple presentation, we will let ICER represent both the population parameter and its estimate.

15 ICER is defined as the ratio of the change in costs of two competing strategies (e.g., new vs. standard intervention such as placebo or the best available alternative treatment) to the change in effectiveness of these two strategies. Effectiveness is measured as a clinically meaningful event experienced by a patient such as survival time, quality-adjusted life years (QALY), or symptom-reduced days, where a larger value of effectiveness generally implies a better outcome.

16 Median ICER

17 Methods: The median ICER is constructed from the standard ICER formula by replacing the means with the medians. Nonparametric bootstrap methods are proposed for estimating confidence intervals (CIs). -- the asymptotic joint distribution of the difference of median costs and the difference of median effectiveness measures is inherently complicated (the mean has more choices – more later). -- depending on which quadrants of the CE plane the bootstrap replicates lie in, different bootstrap methods should be adopted (note: 2 dimensions!).
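The formulas on the two ICER slides did not survive in this transcript. Based on the verbal definitions (ratio of the change in costs to the change in effectiveness, with medians substituted for means), they presumably take the following form. The bar and tilde notation is an assumption introduced here; the group indices follow the Mi, Ei convention of slide 12.

```latex
\mathrm{ICER}_{\text{mean}}
  = \frac{\bar{M}_1 - \bar{M}_2}{\bar{E}_1 - \bar{E}_2},
\qquad
\mathrm{ICER}_{\text{median}}
  = \frac{\tilde{M}_1 - \tilde{M}_2}{\tilde{E}_1 - \tilde{E}_2}
```

Here a bar denotes a group mean and a tilde a group median, with group 1 the new intervention and group 2 the standard.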

18 Notes: As a ratio statistic, the ICER is neither a sufficient nor an unbiased statistic. This ratio statistic is often highly skewed (b/c the numerator is a difference of two skewed variables and the denominator is often small).

19 CE plane: Oftentimes, the ICER value alone can be misleading or insufficient; different data can yield an identical ICER, for example, 2/2 = -2/-2. The CE plane can be intuitive and informative – it should always accompany numerical analyses.

20 CIs for mean-based ICER (revisited): The existence of many methodological proposals for the construction of CIs for the ICER demonstrates the difficult and/or problematic aspects of the ICER parameter. Google Scholar finds >20 methodology or review/comment papers for “confidence interval ICER” – not that common in statistics! General consensus: Bootstrap percentile (BP) and Fieller’s methods are recommended (but a caveat is needed! – more later).

21 How to construct CIs for ICERmedian
Step 1: Sample n1 effectiveness-cost pairs ‘with replacement’ from the n1 pairs in the treatment 1 arm. Compute the median effectiveness and the median cost.
Step 2: Repeat Step 1 for treatment 2.
Step 3: Compute ICERmedian by plugging in the sample medians (4 numbers).
Step 4: Repeat Steps 1-3 B times (say, B≥1,000) to generate B bootstrap replicates.
Step 5: Plot each pair of values of the numerator and the denominator of ICERmedian from the bootstrap replicates on the CE plane.
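Steps 1-4 can be sketched roughly as follows. This is an illustrative outline only, not the code used for the talk: the function name, the `(effectiveness, cost)` tuple layout, and the `seed` argument are assumptions. Each replicate is returned as a (difference in median effectiveness, difference in median cost) pair so that it can be plotted on the CE plane in Step 5. The ratio itself is left to the caller because the denominator can be zero.

```python
import random
from statistics import median

def bootstrap_median_icer(arm1, arm2, B=1000, seed=0):
    """Bootstrap replicates for a median-based ICER.

    arm1, arm2: lists of (effectiveness, cost) pairs, arm1 = new treatment.
    Returns B pairs (d_eff, d_cost) of median differences, arm1 minus arm2.
    """
    rng = random.Random(seed)
    replicates = []
    for _ in range(B):
        # Steps 1-2: resample whole pairs with replacement within each arm.
        s1 = rng.choices(arm1, k=len(arm1))
        s2 = rng.choices(arm2, k=len(arm2))
        # Step 3: four sample medians give the denominator and numerator.
        d_eff = median(e for e, _ in s1) - median(e for e, _ in s2)
        d_cost = median(c for _, c in s1) - median(c for _, c in s2)
        replicates.append((d_eff, d_cost))
    return replicates
```

Resampling pairs, rather than costs and effects separately, keeps the within-patient correlation between cost and effectiveness intact in every replicate.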

22 In Step 5, 1) One quadrant only: If the bootstrap replicates lie in the SE or NW quadrant only, a CI may not be necessary. Instead, we may conclude that the new treatment is preferable for the SE quadrant and not preferable for the NW quadrant. If all bootstrap replicates lie in the NE quadrant, we can form the CI by the standard BP method (e.g., for a 95% CI, select the 2.5th and 97.5th %tiles). The same procedure can be used for the SW quadrant but the interpretation is different; in the SW quadrant, the higher the ICER is, the more desirable the new treatment is.

23 In Step 5, 2) Two quadrants: If the bootstrap replicates lie in the NE and SE quadrants, or the NW and SW quadrants, then the standard BP method still works. However, if the ICER replicates lie in the NE and NW quadrants, or the SE and SW quadrants, then the ordinary percentile method can be invalid. Instead, the ‘re-ordered BP method’ should be used (Wang & Zhao 2008).

24 Re-ordered BP method: The ICER replicates are ordered from the negative largest (smallest in absolute value) to the negative smallest (largest in absolute value) in a descending order, followed by the positive largest to the positive smallest (e.g., -1, -2, -5, …, -97, -500, -1000, 999, 864, …, 4, 2, 1). Then we may select the 2.5th and 97.5th percentiles from these ‘re-ordered’ ICERs. In doing so, discontinuous open intervals that include +∞ or -∞ are allowed. Note: This method is better understood in the CE plane.
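The re-ordering described above can be sketched as follows. The function names, the nearest-rank rule for picking percentiles, and the choice to place an exactly zero ICER with the positive values are assumptions made for illustration; Wang & Zhao (2008) should be consulted for the authoritative procedure.

```python
def reorder_icers(icers):
    """Negatives in descending order, then positives in descending order."""
    neg = sorted((r for r in icers if r < 0), reverse=True)    # -1, -2, ..., -1000
    pos = sorted((r for r in icers if r >= 0), reverse=True)   # 999, ..., 2, 1
    return neg + pos

def reordered_percentile_limits(icers, level=0.95):
    """Pick the lower and upper percentile points from the re-ordered list."""
    ordered = reorder_icers(icers)
    n = len(ordered)
    alpha = (1 - level) / 2
    # Nearest-rank positions; other percentile conventions are possible.
    lower = ordered[round(alpha * (n - 1))]
    upper = ordered[round((1 - alpha) * (n - 1))]
    return lower, upper

# The slide's own illustration, shuffled, comes back in the stated order.
example = [4, -500, 1, -2, 999, -1, 864, -97, 2, -1000, -5]
ordered_example = reorder_icers(example)
```

When the lower limit is negative and the upper limit positive, the retained replicates are those at or beyond the limits in absolute size, which corresponds to the discontinuous interval through ±∞ mentioned on the slide.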

25 In Step 5, 3) Three or four quadrants: When the bootstrap samples occupy 3 or 4 quadrants of the CE plane, which may happen when the cost difference and/or the effect difference are not significantly different from zero, a CI needs to be formed based not only on the value of the ICER but also on the signs of its numerator and denominator. The wedge method can be used in this case (Obenchain 1999).
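The three cases of Step 5 amount to a decision rule on which quadrants the bootstrap replicates occupy. A rough sketch is below. The function names and returned labels are hypothetical, replicates are assumed to be (effect difference, cost difference) pairs, and points lying exactly on an axis are assigned to a quadrant by an arbitrary convention. Two diagonal quadrants (NE with SW, or NW with SE) are not discussed on the slides, so the sketch flags them instead of guessing.

```python
def quadrant(d_eff, d_cost):
    """Quadrant of the CE plane: x = effect difference, y = cost difference."""
    if d_eff > 0:
        return "NE" if d_cost > 0 else "SE"
    return "NW" if d_cost > 0 else "SW"

def choose_ci_method(replicates):
    """Map the set of occupied quadrants to the approach named on the slides."""
    occupied = {quadrant(de, dc) for de, dc in replicates}
    if occupied in ({"SE"}, {"NW"}):
        return "dominance: CI may not be necessary"
    if occupied in ({"NE"}, {"SW"}, {"NE", "SE"}, {"NW", "SW"}):
        return "standard bootstrap percentile"
    if occupied in ({"NE", "NW"}, {"SE", "SW"}):
        return "re-ordered bootstrap percentile"
    if len(occupied) >= 3:
        return "bootstrap wedge"
    return "not covered by the slides"
```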

26 Bootstrap wedge method: First, we calculate the polar angle of the observed ICER. The 100(1-2α)% CI is formed by going 50(1-α)% of the ICER angle clockwise and 50(1-α)% of the ICER angle counter-clockwise from the observed ICER. Again, open intervals or angle-based wedge methods may be less intuitive numerically but could be best understood when they are presented in the CE plane.
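One simplified reading of the angle-based idea is sketched below. It is not Obenchain's exact algorithm. It computes the polar angle of the observed point, measures each replicate's angular offset from it, and moves outward in each direction until a chosen share of all replicates is enclosed on that side. The function name, the half-of-`level`-per-side bookkeeping, and the absence of any axis rescaling are all assumptions; polar angles depend on the units of the two axes.

```python
import math

def wedge_limits(observed, replicates, level=0.95):
    """Angular wedge around the observed (d_eff, d_cost) point, in radians.

    Returns (lower_angle, upper_angle), found by enclosing roughly level/2
    of all replicates clockwise and level/2 counter-clockwise.
    """
    theta0 = math.atan2(observed[1], observed[0])
    offsets = []
    for d_eff, d_cost in replicates:
        d = math.atan2(d_cost, d_eff) - theta0
        # Wrap the offset into [-pi, pi) so direction is unambiguous.
        offsets.append((d + math.pi) % (2 * math.pi) - math.pi)
    clockwise = sorted(-d for d in offsets if d < 0)    # nearest first
    counter = sorted(d for d in offsets if d >= 0)      # nearest first
    k = int(level / 2 * len(offsets))                   # replicates per side
    down = clockwise[min(k, len(clockwise)) - 1] if clockwise and k > 0 else 0.0
    up = counter[min(k, len(counter)) - 1] if counter and k > 0 else 0.0
    return theta0 - down, theta0 + up
```

The two returned angles describe rays from the origin of the CE plane. Converting them to ICER slopes with `math.tan` is meaningful only when the wedge does not cross the vertical axis, which is one reason the slide suggests reading the result on the plane itself.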

27 Five cost-effectiveness regions (Obenchain 1999)

28 Example 1: Schizophrenia trial. An international trial to compare olanzapine vs. haloperidol as treatment of schizophrenia. N=1,996 patients who met DSM III revised criteria for schizophrenia-related disorders entered the study at 174 sites across 17 countries during [...]. US patients were considered for the CEA to avoid issues of cost conversions and different medical practice patterns in different countries.

29 All costs were calculated from pricing algorithms that used 1996 price lists for drugs and medical services. Effectiveness was defined as ‘responder days’. The standard CEA based on ICERmean was performed previously (Obenchain et al. 1999). In their and our analyses, the same 812 (548 in olanzapine and 264 in haloperidol) patients who had both cost and effectiveness measures from at least their first post-baseline visit were included.

30 Summary statistics and CEA

32 Example 2: Prostate cancer trial. Mitoxantrone and Prednisone (M+P) vs. Prednisone (P) were tested as treatment of symptomatic hormone-resistant prostate cancer (Tannock et al. 1996). N=161 hormone-refractory patients were randomized. Cost data were obtained from 114 patients at the 3 largest centers and used in the CEA (Bloomfield et al. 1998). Costs from hospital admissions, outpatient visits, therapies and palliative care based on retrospective chart review were included and measured in Canadian dollars. For the effectiveness measure, quality-adjusted survival based on the European Organization for Research and Treatment of Cancer QLQ-C30 was used --- no censoring!

33 Summary statistics and CEA

35 Example 3: Simulated data. We simulate another common scenario in clinical trials: a new treatment is more effective but costs more, by either mean or median. Effectiveness data (say, QALY) ~ N(mean=6, SD=1.2) for trt A & N(7, 1.5) for trt B. Cost data ~ log N (mean = subject’s QALY-1, SD = 1) for trt A & log N (mean = subject’s QALY-1.8, SD = 1.1) for trt B. This scenario generated symmetric effectiveness data (thus yielding similar mean and median) and highly skewed cost data, where correlation(cost, effect) ~= 0.47.
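The data-generating scheme above could be reproduced along these lines. The sample size, the seed, and the reading of the log N parameters as the mean and SD on the log scale are assumptions; the slide does not state them. The helper name `simulate_arm` is hypothetical.

```python
import random
from statistics import mean, median

def simulate_arm(n, eff_mean, eff_sd, shift, log_sd, rng):
    """Return n (QALY, cost) pairs; log(cost) ~ Normal(QALY - shift, log_sd)."""
    pairs = []
    for _ in range(n):
        qaly = rng.gauss(eff_mean, eff_sd)
        cost = rng.lognormvariate(qaly - shift, log_sd)
        pairs.append((qaly, cost))
    return pairs

rng = random.Random(2024)   # arbitrary seed, for reproducibility only
n = 500                     # assumed per-arm sample size
trt_a = simulate_arm(n, 6.0, 1.2, 1.0, 1.0, rng)
trt_b = simulate_arm(n, 7.0, 1.5, 1.8, 1.1, rng)

# Symmetric effectiveness (mean close to median), skewed cost (mean above median).
summary = {
    name: {
        "eff_mean": mean(e for e, _ in arm),
        "eff_median": median(e for e, _ in arm),
        "cost_mean": mean(c for _, c in arm),
        "cost_median": median(c for _, c in arm),
    }
    for name, arm in (("A", trt_a), ("B", trt_b))
}
```

Tying each subject's log-cost to that subject's QALY is what induces the positive cost-effect correlation mentioned on the slide.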

36 Summary statistics and CEA

38 Show Me the Data: Some journal editors, Rossner et al. (2007), asked whether Thomson Scientific would consider providing the median in addition to the mean. TS responded: It’s an interesting suggestion…. The median would typically be much lower than the mean. There are other statistical measures to describe the nature of the citation frequency distribution skewness, but the median is probably not the right choice. Rossner et al. replied: but it can’t hurt to provide the community with measures other than the mean.

39 Conclusion/Discussion: ICERmean & ICERmedian should be considered together as ‘complementary’ tools in CEA for informed decisions, acknowledging the pros and cons of each method. Mean & median are NOT competing – they estimate different parameters and address different questions. If mean- & median-based CEA provide different results, our confidence about the CE of a treatment may need to be adjusted accordingly. ICER is meant to be informative and to aid in decision making, not to determine the decision in itself (Gardiner et al. 2000).

40 Accepting more than one method/perspective would be an important step in evidence-based medicine that may make CEA more widely used in real-world settings. - In such an important field of study, more than 1 measure/perspective is needed. - Median-based methods may be way overdue! As noted, ICERmean & ICERmedian can yield different results/conclusions but we do not think this is a problem; indeed, different results may mean additional information obtained out of the same dataset rather than confusion or inconsistency. Different results can be more useful than blind application of only one method, and an uninformative answer can be better than a misleading one (Jiang et al. 2000).

41 Take home question: If treatment/product A costs more by mean but costs less by median than B, what is your decision? Are you comfortable when only the mean is presented and the data are (highly) skewed?

42 Funding and Acknowledgement: R01 HL096575. Source: NIH/NHLBI. Title: A unified approach for cost-effectiveness analysis. We thank Drs. Obenchain and Willan for providing data and Ms. Ya-lin Chiu for help in programming.