KNOW YOUR VITAL STATISTICS 



Year : 2020  Volume : 68  Issue : 3  Page : 654-656
Why There Are So Many Measures of Effect size (Relative Merits and Demerits)?
Kameshwar Prasad
Department of Neurology, All India Institute of Medical Sciences, New Delhi, India
Date of Web Publication: 6-Jul-2020
Correspondence Address: Prof. Kameshwar Prasad, All India Institute of Medical Sciences, New Delhi - 110 029, India
Source of Support: None, Conflict of Interest: None
DOI: 10.4103/0028-3886.289025
How to cite this article: Prasad K. Why There Are So Many Measures of Effect size (Relative Merits and Demerits)? Neurol India 2020;68:654-6
There have been four previous papers on effect size in the series. It is natural to ask the question: why do we have so many measures of effects? Life would be easy if there were only one measure which all of us can easily understand. But this is not so, and this article in the series attempts to present the rationale for so many measures of effects. Specifically, what are the merits and demerits of the various effect measures?
One obvious reason for more than one effect measure is that some outcome data are dichotomous (e.g., death/survival, treatment success/failure and so on), whereas others are continuous numbers (e.g., blood pressure or intracranial pressure). Therefore, effects on outcomes have to be expressed using different measures, for example, mean difference for continuous outcome data and risk difference for dichotomous outcomes. But then the question is: why do we have so many measures for dichotomous outcomes besides risk difference, for example, risk ratio, odds ratio, the number needed to treat, hazard ratio and others? This is briefly explained below with examples. I start with an example of how a brain CT result is viewed differently by different specialists.
Different people may look at the same result for different purposes and in different ways. I am sure you have seen this many times in a patient's investigation results. Consider a brain CT of a patient suspected to have a hemorrhagic stroke. The emergency physician looks at the CT to see if it confirms his suspicion. If yes, he refers the case to a neurologist. The neurologist looks at it to determine whether the site is typical of a hypertensive bleed and calculates the volume of the hematoma to assess the prognosis and inform the patient or relatives about it. If he thinks that the prognosis with medical treatment alone is not favorable, he refers the case to a neurosurgeon, who tries to determine whether the hematoma is operable and what the benefit vs. risk of surgery is for this hematoma (there are many other issues, but let us use only the above for illustration). Similarly, the results of a study have many clients. Each client has a different purpose in mind when he looks at the results.
Let us consider a study that compared the treatment of stroke in “stroke unit” versus “general medical ward.” The results have many clients with different objectives in mind.
 Hospital administration wants to know the cost-effectiveness of a stroke unit as compared with the current policy of treating stroke patients in the general medical ward
 The physician wants to know how much benefit it would offer to his patients. He sees different kinds of patients. Some patients are young with few risk factors and mild stroke, say with a 2% risk of institutionalisation. Some patients are old with many risk factors and severe stroke, with (say) a 90% risk of death or dependence. The physician wants to know the benefit in each type of patient.
Accordingly, we want effect measures which –
 are easy to understand
 give some idea about cost-effectiveness
 are meaningfully applicable to all kinds of patients
 convey the same effect size, irrespective of whether you measure the unfavourable (e.g., death) or the favourable (e.g., survival) outcome.
Number Needed to Treat (NNT)
Let us say the results of a study showed that 50% of stroke patients treated in the general medical ward (hereinafter called “ward”) were institutionalized, whereas only 25% were so in the group treated in a stroke unit for, say, one month (assume both had similar mortality). Thus, the stroke unit made a difference of −25% (25% − 50%), yielding an NNT of 4 (1 ÷ 0.25). This means four patients need to be treated for one month in a stroke unit, instead of in the general medical ward, to avoid one institutionalization.
The hospital administrator calculates how much treatment in a stroke unit would cost over a period of one month. If on average it is $5,000 per patient, he can easily see that $20,000 needs to be spent to prevent one institutionalization. He can easily compare the cost of institutionalization vs. stroke unit treatment and take a decision. Thus, one advantage of NNT is that it gives a quick (even if dirty) idea about cost-effectiveness.
It has another advantage. This becomes apparent when we are dealing with a very low frequency of outcome. For example, mammography programs for 7 years reduce the incidence of death from breast cancer from 0.08% to 0.02%, a difference of 0.06%. NNT turns out to be about 1,667 (1 ÷ 0.0006 = 1,666.7, rounded up). This means about 1,667 women need to have regular mammography for 7 years to prevent one death from breast cancer. You may find saying this easier and more understandable than a difference of 0.06%. Thus, NNT helps to convert a small decimal into a round figure, which can be easily understood and also pronounced (saves your tongue). Thus two advantages of NNT are:
 provides an easy way to get an idea of cost-benefit
 more easily understood by policymakers and physicians.
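The NNT arithmetic in the stroke and mammography examples can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `number_needed_to_treat` is hypothetical, risks are assumed to be entered as proportions (0 to 1), and the $5,000 figure is the assumed monthly cost per patient from the text.

```python
import math


def number_needed_to_treat(risk_treated: float, risk_control: float) -> float:
    """Return NNT = 1 / |risk difference|, with risks given as proportions."""
    risk_difference = risk_treated - risk_control
    if risk_difference == 0:
        raise ValueError("NNT is undefined when the two risks are equal")
    return 1.0 / abs(risk_difference)


# Stroke example: 25% institutionalized in the stroke unit vs. 50% in the ward.
stroke_nnt = number_needed_to_treat(0.25, 0.50)

# Assumed cost from the text: $5,000 per patient for one month in a stroke unit.
cost_per_patient = 5_000
cost_per_event_avoided = stroke_nnt * cost_per_patient

# Mammography example: 0.02% vs. 0.08% risk of death from breast cancer.
mammography_nnt = number_needed_to_treat(0.0002, 0.0008)

# NNT is conventionally rounded up to the next whole patient.
print(stroke_nnt, cost_per_event_avoided, math.ceil(mammography_nnt))
```

The unrounded mammography value is about 1,666.7; rounding up rather than down avoids overstating the benefit.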
But NNT has disadvantages too:
(i) For example, if you are communicating with a patient and say that four patients need to be treated to save one additional patient, the patient might ask: Am I likely to be the one saved or among the three dead? In other words, NNT is not easily interpretable by an individual patient.
Risk difference (RD) or ARR (Absolute risk reduction)
Risk difference has four merits:
 Easy to calculate and interpret: You have to do only a subtraction, and RD tells you how much difference the intervention could make
 It is symmetrical, that is, it conveys the same effect whether you measure the favourable or the unfavourable outcome. In the stroke example, if you measured a favourable outcome like “going home,” the difference would still be the same: 50% went home in the “ward” group and 75% in the stroke unit group, a difference of 25%, which is the same in magnitude as earlier
 It helps in the calculation of NNT
 The fourth merit of RD is that its confidence interval can be calculated even when no patient had the outcome of interest in either group, for example, when no patient died in either group (the method is beyond the scope of this article).
However, it has some demerits:
 Sometimes, it is too small to be pronounced and interpreted easily (e.g., see mammography example above)
 It cannot apply equally to all types of patients. Consider the two patients with acute stroke, one mild and one severe. You might think (though it is not correct) that the severe patient's risk of death/institutionalisation would come down to 65% (90 − 25), but what about the mild patient, whose risk is only 2% to begin with? How can a stroke unit make a difference of 25% when the total risk is 2%? This illustrates the difficulty in applying the RD (or ARR) from the study data to individual patients. (RR, however, is equally applicable in both cases; see below.)
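Both the symmetry of RD and its difficulty with low-risk patients can be checked numerically. The sketch below uses the stroke figures from the text; the function names are hypothetical, and the "naive" calculation is exactly the incorrect reasoning described above, shown only to make the problem visible.

```python
def risk_difference(risk_treated: float, risk_control: float) -> float:
    """Risk difference (treated minus control), risks as proportions."""
    return risk_treated - risk_control


# Symmetry: same magnitude for the unfavourable and the favourable outcome.
rd_institutionalized = risk_difference(0.25, 0.50)  # unfavourable outcome
rd_going_home = risk_difference(0.75, 0.50)         # favourable outcome


def naive_risk_after_treatment(baseline_risk: float, rd: float) -> float:
    """Incorrectly carry the study's RD over to any baseline risk."""
    return baseline_risk + rd


# Severe patient (90% baseline risk): a plausible-looking answer near 65%.
severe_patient_risk = naive_risk_after_treatment(0.90, rd_institutionalized)

# Mild patient (2% baseline risk): a negative "risk", which is impossible.
mild_patient_risk = naive_risk_after_treatment(0.02, rd_institutionalized)

print(abs(rd_institutionalized), abs(rd_going_home))
print(round(severe_patient_risk, 2), round(mild_patient_risk, 2))
```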
Risk ratio or relative risk (RR)
It has the merit of applicability to all kinds of patients. For example, in the stroke example RR would be 25%/50% = 0.5 (=50%). That means risk with treatment in a stroke unit is 50% of that with treatment in the general medical ward. Thus, it would be 45% (half of 90) with the stroke unit treatment in the severe case, whereas it would be 1% (half of two per cent) in the mild case. RR easily applies to both.
However, the demerit of RR is that it is not symmetrical. Above you have seen that the stroke unit halves the risk of the unfavorable outcome. If you measure the favorable outcome here (like “going home”), you might expect the stroke unit to double its rate; but no. With 75% going home in the stroke unit group and 50% in the ward treatment group, the RR of “going home” is 75%/50% = 1.5, rather than two. The other demerit, which you might have noticed, is that it does not sound right to say the risk of “going home.” Going home is a favorable outcome, and risk is a rather loaded concept which sounds awkward in association with a favourable outcome.
To summarise, the merits of RR are:
 Applicability to all kinds of patients
 Easier to interpret than the odds ratio
Its demerits are:
 Asymmetry: If there is a 10% risk of death in the experimental group and 40% in the control group, RR = 0.25, that is, RRR = 1 − 0.25 = 0.75, or a 75% relative risk reduction. If we counted survival, the “risk” of survival in the experimental group would be 90% and in the control group 60%; RR = 1.5, which means a relative increase in survival of 50%. You can see that one way it is 75%, the other way 50%. This is the asymmetry
 Lack of neutrality: “Risk of survival” sounds awkward. Risk sounds all right only for unfavourable outcomes, not favourable ones. So this is not a neutral concept
 There is no way to calculate the confidence interval of RR when there are zero events in both treatment groups, for example, no death in either of the two groups in a study.
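The applicability and the asymmetry of RR described in this section can be reproduced with the figures already given in the text. The following is a minimal sketch; the function name `risk_ratio` is hypothetical and risks are assumed to be proportions.

```python
def risk_ratio(risk_treated: float, risk_control: float) -> float:
    """Risk ratio (relative risk): treated risk divided by control risk."""
    return risk_treated / risk_control


# Stroke example: RR of institutionalization, stroke unit vs. ward.
rr_institutionalized = risk_ratio(0.25, 0.50)

# Applicability: the same RR scales any baseline risk sensibly.
severe_patient_risk = 0.90 * rr_institutionalized  # 90% baseline
mild_patient_risk = 0.02 * rr_institutionalized    # 2% baseline

# Asymmetry: 10% vs. 40% risk of death, then the same data counted as survival.
rr_death = risk_ratio(0.10, 0.40)
rr_survival = risk_ratio(0.90, 0.60)
relative_reduction_in_death = 1 - rr_death        # about 0.75
relative_increase_in_survival = rr_survival - 1   # about 0.50

print(round(severe_patient_risk, 2), round(mild_patient_risk, 2))
print(round(relative_reduction_in_death, 2), round(relative_increase_in_survival, 2))
```

The same data give a 75% relative change when deaths are counted and a 50% relative change when survivors are counted.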
Odds ratio
The merits of OR are that:
 Like RR, it is applicable to all kinds of patients, irrespective of their level of risk without the treatment.
 It is not a loaded concept. It's neutral. Odds of going home sounds as appropriate as odds of institutionalization or death. Just as odds of winning or losing both sound acceptable.
 It is symmetrical. In the stroke example, the odds of institutionalization in the “stroke unit” group are 25:75 = 1/3, whereas in the “general ward” group they are 50:50 = 1. The odds ratio for institutionalization with stroke unit vs general ward is therefore (1/3) ÷ 1 = 1/3. Now, let us see what happens if we measure the odds of going home. These are 75:25 = 3 in the stroke unit group and 50:50 = 1 in the general ward group. Therefore, the odds ratio of going home is 3 ÷ 1 = 3. Thus the odds of institutionalization with stroke unit care are 1/3 of those with a general ward, and the odds of going home with a stroke unit are three times those with a general ward. The symmetry is clear: no matter what you measure, the favorable or the unfavorable outcome, it gives the same impression.
 The fourth merit of OR is that it can be used in one of the commonest forms of adjusted analysis (logistic regression). In contrast, RD or RR cannot be obtained directly from logistic regression.
 It has certain mathematical properties that make it a favoured measure for some statistical calculations, including meta-analysis.
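The symmetry of the odds ratio can be verified with the stroke example's risks (25% institutionalized in the stroke unit, 50% in the ward). This is a minimal sketch with hypothetical function names; the point it illustrates is that the OR for the favourable outcome is the reciprocal of the OR for the unfavourable one.

```python
def odds(risk: float) -> float:
    """Convert a risk (proportion) to odds: risk / (1 - risk)."""
    return risk / (1.0 - risk)


def odds_ratio(risk_treated: float, risk_control: float) -> float:
    """Odds ratio: odds in the treated group over odds in the control group."""
    return odds(risk_treated) / odds(risk_control)


# Unfavourable outcome: institutionalization (25% vs. 50%).
or_institutionalized = odds_ratio(0.25, 0.50)

# Favourable outcome: going home (75% vs. 50%), the complement of the above.
or_going_home = odds_ratio(0.75, 0.50)

# Symmetry: the two odds ratios are reciprocals, so their product is 1.
print(round(or_institutionalized, 3), round(or_going_home, 3))
print(round(or_institutionalized * or_going_home, 3))
```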
The demerits of OR are that:
 It is a difficult concept to understand and interpret for health professionals
 If interpreted like RR, it overestimates the treatment effect. OR and RR are similar only when the event rates in the control and experimental groups are 10% or less, or when both ratios are close to one.
 As with RR, there is no way to calculate the confidence interval around OR when there are zero events in both treatment arms. In this situation, only RD lends itself to the calculation of a C.I.
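To see how far OR drifts from RR as events become more common, one can hold RR fixed and vary the control-group risk. The sketch below assumes a fixed RR of 0.5 (as in the stroke example) and three illustrative control risks of 2%, 10% and 50%; these inputs are chosen for illustration, not taken from any study.

```python
def odds(risk: float) -> float:
    """Convert a risk (proportion) to odds: risk / (1 - risk)."""
    return risk / (1.0 - risk)


relative_risk = 0.5  # assumed fixed treatment effect
control_risks = (0.02, 0.10, 0.50)  # illustrative baseline risks

odds_ratios = []
for control_risk in control_risks:
    treated_risk = control_risk * relative_risk
    odds_ratios.append(odds(treated_risk) / odds(control_risk))

# The OR stays close to the RR at low risk and moves further from 1 as risk rises.
for control_risk, or_value in zip(control_risks, odds_ratios):
    print(f"control risk {control_risk:.0%}: RR = {relative_risk}, OR = {or_value:.3f}")
```

At a 2% control risk the OR is within about 0.01 of the RR, whereas at 50% it is roughly 0.33 against an RR of 0.5, which is why reading an OR as if it were an RR exaggerates the effect when events are common.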
Which measure of effect size should a clinician use?
It is generally enough to know the RD, NNT, RR and RRR associated with treatment. OR may be treated as RR when events in the experimental and control groups are 10% or less. Otherwise, there are formulae (available on the Internet) to convert OR into RR or NNT.^{[1],[2],[3],[4],[5],[6],[7],[8]}
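The article does not say which conversion formulae it has in mind. One conversion that follows directly from the definitions of odds and risk is RR = OR / (1 − p0 + p0 × OR), where p0 is the control-group risk. The sketch below assumes that p0 is known (it cannot be recovered from the OR alone), and the function names are hypothetical.

```python
def treated_risk_from_or(odds_ratio: float, control_risk: float) -> float:
    """Treated-group risk implied by an odds ratio and a known control risk."""
    return odds_ratio * control_risk / (1.0 - control_risk + odds_ratio * control_risk)


def or_to_rr(odds_ratio: float, control_risk: float) -> float:
    """Convert OR to RR for a given control-group risk (assumed known)."""
    return treated_risk_from_or(odds_ratio, control_risk) / control_risk


def or_to_nnt(odds_ratio: float, control_risk: float) -> float:
    """Convert OR to NNT via the implied risk difference."""
    risk_difference = treated_risk_from_or(odds_ratio, control_risk) - control_risk
    if risk_difference == 0:
        raise ValueError("NNT is undefined when OR equals 1")
    return 1.0 / abs(risk_difference)


# Stroke example: OR of institutionalization = 1/3, control (ward) risk = 50%.
rr_from_or = or_to_rr(1 / 3, 0.50)
nnt_from_or = or_to_nnt(1 / 3, 0.50)

print(round(rr_from_or, 3), round(nnt_from_or, 1))
```

With the stroke figures, this recovers the RR of 0.5 and the NNT of 4 obtained earlier from the raw percentages.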
Financial support and sponsorship
Nil.
Conflicts of interest
There are no conflicts of interest.
References
1.  Guyatt G, Rennie D, editors. User's guides to the medical literature: A manual for Evidence-based Clinical Practice. Chicago: AMA Press; 2002. Available from: www.ama-assn.org.
2.  Laupacis A, Sackett DL, Roberts RS. An assessment of clinically useful measures of the consequences of treatment. N Engl J Med 1988;318:1728-33.
3.  Laupacis A, Sackett DL, Roberts RS. An assessment of clinically useful measures of the consequences of treatment. N Engl J Med 1988;318:1728-33.
4.  Malenka DJ, Baron JA, Johansen S, Wahrenberger JW, Ross JM. The framing effect of relative and absolute risk. J Gen Intern Med 1993;8:543-8.
5.  Naylor CD, Chen E, Strauss B. Measured enthusiasm: Does the method of reporting trial results alter perceptions of therapeutic effectiveness? Ann Intern Med 1992;117:916-21.
6.  Peto R, Pike MC, Armitage P, Breslow NE, Cox DR, Howard SV, et al. Design and analysis of randomized clinical trials requiring prolonged observation of each patient. II. Analysis and examples. Br J Cancer 1977;35:1-39.
7.  Cummings P. The relative merits of risk ratios and odds ratios. Arch Pediatr Adolesc Med 2009;163:438-45. doi: 10.1001/archpediatrics.2009.31.
8.  Gallis JA, Turner EL. Relative measures of association for binary outcomes: Challenges and recommendations for the global health researcher. Ann Glob Health 2019;85:137. Published 2019 Nov 20. doi: 10.5334/aogh.2581.





