KNOW YOUR VITAL STATISTICS
Year : 2020 | Volume : 68 | Issue : 2 | Page : 472-473
Some More Measures of Effect Size
Deepti Vibha, Kameshwar Prasad
Department of Neurology, All India Institute of Medical Sciences (AIIMS), New Delhi, India
Date of Web Publication: 15-May-2020
Department of Neurology, All India Institute of Medical Sciences, Sri Aurobindo Marg, Ansari Nagar, Ansari Nagar East, New Delhi - 110 029
Source of Support: None, Conflict of Interest: None
How to cite this article: Vibha D, Prasad K. Some More Measures of Effect Size. Neurol India 2020;68:472-3.
» The Effect Size for Variables Measured on a Continuous Scale (Mean and Standardized Mean Difference)
When an outcome is measured on a continuous scale and we need to know the effect size, we can approach this in several ways. This is especially important if we want to determine whether the outcome in the intervention group was better than in the control group. Typically, we compare the means of the two groups. The measure may be a simple difference of the means, or a difference in the measure before and after the intervention. The following note explains when each measure should be chosen and how it should be interpreted when reading a paper.
» Mean Difference
Let us take the example of the National Institutes of Health Stroke Scale (NIHSS), which is used to assess severity in patients with stroke. The score ranges from 0 (best score, no deficit) to 42 (severe stroke). If we want to compare the NIHSS at 24 h after stroke between patients who received intravenous thrombolysis (intervention arm) and those who received standard medical treatment (control arm), how can we do it?
The simplest way would be to take the mean NIHSS (at 24 h) of patients in the intervention arm and in the control arm. The difference between these two means is called the mean difference. Its advantages are that it is simple to compute and to understand. However, it does not take several factors into consideration:
- Is the difference meaningful?
- Will the difference change with the change in sample size?
- Would the difference between the baseline NIHSS and the 24-h NIHSS not be more meaningful (a paired-sample difference)?
- What would be an adequate sample size to detect a clinically meaningful difference?
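The computation itself can be made concrete with a minimal Python sketch (the NIHSS scores below are invented for illustration, not taken from any trial):

```python
from statistics import mean

# Hypothetical 24-h NIHSS scores (illustrative values only).
intervention = [4, 6, 3, 8, 5, 7, 2, 6]   # thrombolysis arm
control = [9, 7, 10, 6, 8, 11, 7, 9]      # standard treatment arm

# The mean difference is simply one group mean minus the other.
mean_diff = mean(intervention) - mean(control)
print(f"Mean difference (intervention - control): {mean_diff:.2f}")
```

A negative value here simply reflects that the intervention arm's mean score is lower (better) than the control arm's; the questions above about meaningfulness and sample size remain unanswered by this single number.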
» Standardized Mean Difference
One limitation of the mean difference is that the treatment effect may be made to look small or large depending upon the unit used. For example, a treatment may reduce intracranial pressure by 1 cm, which may equally be expressed as a difference of 10 mm, giving a novice in the field the impression of an apparently large effect. Another limitation of the mean difference arises when conducting a meta-analysis of trials that have used different scales for measuring outcomes; for memory, for example, several different scales are in use, and combining them in a meta-analysis is problematic.
To overcome these limitations for an outcome measured on a continuous scale, some researchers estimate the difference in the mean outcome values of two groups, such as a treated group and a control group, and divide that difference by the standard deviation (SD) of the outcome values; this converts the estimated effect to SD units. This has been called a standardized mean difference (SMD), and it has three variations: (1) Glass's method, the difference divided by the SD for the control group; (2) Cohen's method, the difference divided by the pooled SD for both groups; and (3) Hedges' method, which modifies Cohen's method to remove bias that may arise in small samples.
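The three variants can be sketched as follows (a rough Python implementation; the Hedges correction uses the common 1 − 3/(4n − 9) approximation to the exact bias factor):

```python
import math
from statistics import mean, stdev

def glass_delta(treated, ctrl):
    """Glass's method: mean difference divided by the control group's SD."""
    return (mean(treated) - mean(ctrl)) / stdev(ctrl)

def cohens_d(treated, ctrl):
    """Cohen's method: mean difference divided by the pooled SD of both groups."""
    n1, n2 = len(treated), len(ctrl)
    pooled_sd = math.sqrt(((n1 - 1) * stdev(treated) ** 2 +
                           (n2 - 1) * stdev(ctrl) ** 2) / (n1 + n2 - 2))
    return (mean(treated) - mean(ctrl)) / pooled_sd

def hedges_g(treated, ctrl):
    """Hedges' method: Cohen's d with a small-sample bias correction."""
    n = len(treated) + len(ctrl)
    return (1 - 3 / (4 * n - 9)) * cohens_d(treated, ctrl)
```

On the same data, the three variants generally give different numbers, because each uses a different SD in the denominator; Glass's method is preferred when the treatment itself may change the variability, since only the control SD is used.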
The most commonly used is Cohen's d, which is interpreted, somewhat arbitrarily, as a small effect size at 0.2, medium at 0.5, and large at 0.8 or more. A major limitation of the SMD is that it is not clinically meaningful: an SMD of 0.4 for reduction in intracranial pressure (ICP) means little to a neurologist or neurosurgeon. Therefore, the SMD needs to be converted back into natural units for meaningful interpretation. So, taking the NIHSS example forward, what would be a meaningful difference at 24 h?
- A difference of >4 between the groups?
- A difference of >2 from the baseline in the intervention group as compared to the control group?
- In addition, how would this effect size compare with changing the sample size of your study?
The mean difference and the variance are the two quantities required to compute the sample size in the first place. So, at the stage of hypothesis testing and planning of the study, a rough estimate of the effect size and the variance will determine the sample size. You may have noticed papers in which the difference in outcomes is not meaningful, yet the P value is significant; this can happen with a large sample size. So, even if we think that an NIHSS improvement of 1 point does not mean much, a study with a sufficiently large sample can still show a significant P value for that improvement.
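This dependence on sample size can be sketched with the usual normal-approximation formula for comparing two means, n = 2(z_α + z_β)²(SD/Δ)² per group. The sketch below fixes two-sided α = 0.05 and 80% power, and the NIHSS differences and SD are invented for illustration:

```python
import math

def n_per_group(delta, sd):
    """Approximate patients needed per group to detect a mean difference
    `delta` for an outcome with standard deviation `sd`, using
    n = 2 * (z_alpha + z_beta)^2 * (sd / delta)^2 with fixed
    z_alpha = 1.96 (two-sided alpha = 0.05) and z_beta = 0.84 (80% power)."""
    return math.ceil(2 * (1.96 + 0.84) ** 2 * (sd / delta) ** 2)

# A clinically meaningful 4-point NIHSS difference (SD assumed 6) needs a
# modest trial; a trivial 1-point difference needs a far larger one, which
# is exactly how trivial differences end up with significant P values.
print(n_per_group(4, 6))
print(n_per_group(1, 6))
```

Halving the difference to be detected quadruples the required sample size, so very large trials can declare statistical significance for differences no clinician would act on.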
Although the P value tells us whether the effect is likely to be due to chance, the effect size determines the magnitude of the effect and helps us judge its meaningfulness.
The standardized value has the advantage that it can be used across different studies for the same outcome. Cohen suggested that d = 0.2 be considered a 'small' effect size, 0.5 a 'medium' effect size, and 0.8 a 'large' effect size. This means that if two groups' means do not differ by 0.2 SD or more, the difference is trivial, even if it is statistically significant.
To illustrate, consider the results of the dataset in [Table 1]: the mean difference is 16.29 − 13.96 = 2.33. With 24 patients in each arm, is this difference meaningful? Note that the mean difference may take a negative or positive value depending upon the order of the groups. The P value here is not significant.
We see that the Statistical Package for the Social Sciences (SPSS) statistical software does not give us the effect size. However, if we take the standardized value of the NIHSS at 24 h and run the same analysis, we get the effect size instead of the mean difference.
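The standardization step itself is simple. In the sketch below (NIHSS values invented for illustration), each score is scaled against the combined sample's mean and SD, so the mean difference of the standardized scores comes out in SD units, an SMD-like quantity, rather than in raw NIHSS points:

```python
from statistics import mean, stdev

# Hypothetical 24-h NIHSS values (illustrative only).
intervention = [12, 14, 10, 15, 13]
control = [16, 15, 18, 14, 17]

# Standardize every score against the combined sample's mean and SD.
combined = intervention + control
m, s = mean(combined), stdev(combined)
z_int = [(v - m) / s for v in intervention]
z_ctl = [(v - m) / s for v in control]

# The mean difference of the standardized scores is now in SD units.
smd_like = mean(z_int) - mean(z_ctl)
print(f"Standardized mean difference: {smd_like:.2f}")
```

Because both groups are divided by the same SD, this is the same idea as the SMD described above, just obtained by transforming the data first and then running the ordinary comparison of means.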
There are many freely available online calculators which can also provide all effect size measures, and one is shown in [Figure 1].
|Figure 1: The mean, standard deviation, and sample size information required to give standardized effect size|
Financial support and sponsorship
Conflicts of interest
There are no conflicts of interest.
» References
Cohen J. Statistical Power Analysis for the Behavioral Sciences. Hillsdale (NJ): Erlbaum; 1988.
Grice JW, Barrett PT. A note on Cohen's overlapping proportions of normal distributions. Psychol Rep 2014;115:741-7.