Improving Medical Statistics and Interpretation of Clinical Trials
www.improvingmedicalstatistics.com
_____________________________________________________
Subgroup Analysis - Potential Limitations
Inappropriate subgroup analysis can lead to ludicrous results
Dr. Peter Sleight and colleagues have written insightful comments on the limitations of subgroup analysis (1). Dr. Sleight and the ISIS-2 trial investigators performed a subgroup analysis of patients in the ISIS-2 trial by astrological sign to show how unreliable subgroup analysis can be.
This subgroup analysis suggested that the treatment was quite effective, with a statistically significant benefit, for all patients except those born under the sign of Gemini or Libra.
The difference in outcome by astrological sign was naturally an artifact and would not be reproducible in subsequent studies. This was the point of their analysis.
ISIS-2 Trial: Details of the Trial and the Subgroup Analysis by Astrological Sign
The large ISIS-2 trial (2) involved about 17,000 patients. The beneficial effect of aspirin for patients having a heart attack was very substantial and roughly equal to the effect of streptokinase (a powerful clot-dissolving medication). Both were life-saving medications. The trial result for aspirin was highly statistically significant (2p <.00001), with much less than a 1/1000 chance of these findings being the result of chance.
The ISIS-2 investigators note: "When in a trial with a clearly positive overall result, many subgroup analyses are considered, false negative results in some particular subgroups must be expected."
The ISIS-2 authors then give as an example that “subdivision of the patients in ISIS-2 with respect to their astrological birth sign appears to indicate that for persons born under Gemini or Libra, there was a slightly adverse effect of aspirin on mortality (9% increase, SD 13; NS), while for patients born under all other astrological signs there was a striking beneficial effect (28% reduction, SD 5; 2p <0.00001).”
The subgroup analysis suggesting that patients born under Gemini and Libra had an adverse effect rather than a beneficial effect with aspirin did not reflect a true relationship. These patients would be expected to benefit from aspirin to the same degree as the rest of the group.
Subgroup analysis can lead to findings that are incorrect.
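The astrological sign result can be reproduced in spirit with a small simulation. The following is a minimal sketch, not the ISIS-2 analysis itself: it assumes a treatment that lowers mortality by the same amount in every one of 12 arbitrary subgroups, and counts how often at least one subgroup nevertheless shows no apparent benefit. The mortality rates, subgroup size, and function names are illustrative assumptions, not figures taken from the trial.

```python
import random


def simulate_trial(rng, n_groups=12, n_per_arm=700,
                   p_control=0.12, p_treated=0.095):
    """Simulate one trial split into n_groups arbitrary subgroups.

    The true treatment effect is identical in every subgroup
    (assumed mortality 12% on control vs 9.5% on treatment).
    Returns a list of (deaths_treated, deaths_control) per subgroup.
    """
    results = []
    for _ in range(n_groups):
        deaths_treated = sum(rng.random() < p_treated for _ in range(n_per_arm))
        deaths_control = sum(rng.random() < p_control for _ in range(n_per_arm))
        results.append((deaths_treated, deaths_control))
    return results


def count_groups_without_benefit(results):
    """Count subgroups where the treated arm had as many or more deaths."""
    return sum(1 for treated, control in results if treated >= control)


def fraction_of_trials_with_anomaly(n_trials=200, seed=0):
    """Fraction of simulated trials with at least one 'no benefit' subgroup."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        if count_groups_without_benefit(simulate_trial(rng)) > 0:
            hits += 1
    return hits / n_trials


if __name__ == "__main__":
    frac = fraction_of_trials_with_anomaly()
    print(f"Trials with at least one apparently unhelped subgroup: {frac:.0%}")
```

Under these assumptions a substantial share of simulated trials contain at least one subgroup that appears not to benefit, even though the treatment works equally well for everyone by construction. Any such subgroup is produced by chance alone.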
Validity of subgroup analysis:
The view of this website is that subgroup analysis can be useful, but its validity tends to be inversely related to the number of subgroups analyzed. A study is not immune to an incorrect subgroup finding simply because the subgroups were prespecified. This is particularly the case if there is a large number of prespecified subgroup analyses. (If 20 subgroup analyses are prespecified and each is tested at the P=.05 level, about one of them is expected to show a false result by chance alone.)
Part of the benefit of prespecified subgroup analysis is that there are necessarily fewer such analyses than the almost unlimited number of ways to subdivide the data in a post hoc analysis after the trial results have been obtained.
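The arithmetic behind the "20 subgroup analyses" remark can be sketched as follows. This is a simplified illustration that assumes the analyses are independent and that no true subgroup differences exist; the function names are hypothetical.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of chance 'significant' results among n_tests."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance that at least one of n_tests independent analyses is falsely significant."""
    return 1.0 - (1.0 - alpha) ** n_tests


for n_tests in (1, 5, 10, 20, 50):
    print(
        f"{n_tests:>2} analyses: "
        f"expected false positives = {expected_false_positives(n_tests):.2f}, "
        f"P(at least one) = {prob_at_least_one_false_positive(n_tests):.2f}"
    )
```

With 20 analyses at the .05 level, the expected number of false results is 1, and the chance of seeing at least one is about 64% under these assumptions. Real subgroup analyses are usually correlated, so the exact figures will differ, but the direction of the effect is the same: more analyses, more spurious findings.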
What is the reliability of a finding for a small subgroup of a trial who unexpectedly have a different outcome from the rest of the group?
In particular, if a given therapy has a highly significant and strongly beneficial effect for the group as a whole, a subgroup analysis that produces an unexpected finding that certain subgroups do not benefit is frequently incorrect. It is more likely that the unexpected subgroup finding, which runs counter to the overall finding, is simply not valid. As pointed out by Dr. Sleight, it is more reliable to assume that the subgroup actually had the same outcome as the overall group.
Particularly vulnerable to error is the post hoc analysis in which a cutoff value is derived in retrospect from trial data and is then said to separate the responders from the nonresponders. Years ago, this type of analysis of the CARE study data by prominent investigators was said to indicate that treatment was not helpful in patients with coronary disease (blocked arteries) whose initial LDL cholesterol levels were below 124 mg/dL. This was later clearly shown to be erroneous. Similarly, a different post hoc subgroup analysis concerning which patients with a cardiomyopathy (weak heart muscle) benefit from an implantable defibrillator led to erroneous conclusions by the Medicare (CMS) administration.
A subgroup analysis that differs from the overall group outcome is more likely to be true if it involves a large subgroup and there is a very limited number of prespecified analyses. Even then, the subgroup findings are best treated as a starting point for subsequent clinical trials to confirm or refute, rather than as a definitive result.
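One reason a large subgroup is more trustworthy is that its estimate is more precise. The sketch below uses the usual standard error of a difference between two proportions to show how the uncertainty grows as the subgroup shrinks. The mortality rates and subgroup sizes are illustrative assumptions, not trial data, and the function name is hypothetical.

```python
import math


def risk_difference_se(n_per_arm, p_treated, p_control):
    """Standard error of the difference in mortality between two equal-sized arms."""
    return math.sqrt(
        p_treated * (1.0 - p_treated) / n_per_arm
        + p_control * (1.0 - p_control) / n_per_arm
    )


# Assumed mortality: 9.5% on treatment vs 12% on control (true difference 2.5 points).
P_TREATED, P_CONTROL = 0.095, 0.12

for n_per_arm in (8500, 4000, 700, 100):
    se = risk_difference_se(n_per_arm, P_TREATED, P_CONTROL)
    half_width = 1.96 * se  # approximate 95% confidence interval half-width
    print(
        f"{n_per_arm:>5} patients per arm: "
        f"95% CI is roughly +/- {100 * half_width:.1f} percentage points"
    )
```

Because the standard error scales with one over the square root of the subgroup size, a subgroup with a hundredth of the patients has an interval about ten times as wide. In a small subgroup the interval can easily span both benefit and harm even when the true effect is the same as in the whole trial.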
For an excellent look at potential limitations of subgroup analysis, the following articles are recommended:
1. Sleight P. Debate: Subgroup analyses in clinical trials: fun to look at - but don't believe them! Curr Control Trials Cardiovasc Med. 2000; 1(1): 25-27.
2. ISIS-2 (Second International Study of Infarct Survival) Collaborative Group. Lancet 1988; ii: 349-360 (pages of interest 356-357).
copyright 2005 www.improvingmedicalstatistics.com