Treatment Action Group (TAG): Volume 7, Issue 2 - March 2000
Contents
#1 Time for 10-year trials?
#2 Observational Databases Promise to Solve Clinical Trials Lag
#3 Observational Cohort Study Research Possibilities In Perspective
| #1 | Behemoth Eradication's Ebb Finds Researchers Wrestling Again with Clinical Outcome Concerns |
Time for 10-year trials?
Immediately following the Division of AIDS January workshop on long-term clinical research in HIV infection, the workshop attendees were invited to attend a short statistical symposium. The half-day meeting of minds was to focus on the methodological issues of these long-term clinical endpoint studies. Along with a handful of other community representatives, Mark Harrington was in attendance. He prepared this report.
It was early in the year 2000, three and one half years after the introduction and widespread adoption of HAART. Prospects for eradication of the virus with HAART alone had receded to the vanishing point, and the leaders of the National Institutes of Health (NIH)'s AIDS research effort had decided that the long-term effectiveness of HAART merited some focused attention from clinical researchers.
The NIAID Division of AIDS was also trying to decide how to cut up the adult AIDS clinical research pie. It had re-funded the Adult ACTG at $80 million per year for five years, but had deferred consideration of two other applicants, the CPCRA and the Veterans' Administration (VA) Network, while trying to develop its own long-term effectiveness research agenda. The January 2000 workshop, held at Bethesda's Holiday Inn, was designed to focus on the scientific needs and methodological concerns of this kind of research.
Statistical Methodology Satellite Symposium
Among the key questions:
"This [is] an extremely high priority for the NIH," stated Killen. "Getting a research program in place which can do this is of the highest priority right now. We need to not be paralyzed by fear of not getting it exactly right. We're moving into new methodological territory so we need a lot of help."
Michael Hughes, director of the Pediatric AIDS Clinical Trials Group Statistics & Data Analysis Center (SDAC) at Harvard, discussed ways that randomized, controlled trials could address these long-term questions. AIDS-related mortality in adults and children in clinical trials has dropped from over 5% in 1995 to less than 1% in 1999. "So to think about mortality (or even morbidity) endpoints means we're talking about a very long duration of trials. If these trends persist, they might have to be 5-6 years long to have any real meaning; ten years is more likely to be relevant."
How big would these trials have to be? It depends on the magnitude of the effect we're looking for. To see a two-fold (100%) difference in events between two arms, with 90% power to detect a reduction in events from 10% to 5% over five years, we'd need 1,200 patients; to detect a reduction from 5% to 2.5% we'd need 2,500. Would we want a trial which only showed a benefit in a small proportion of patients, e.g., the ten or five percent mentioned above? Or would we want to know the outcome for the larger population? If so, the trial would have to be larger and longer. Importantly, there is also the possibility of early transient effects, as was seen with early AZT in ACTG 019 and Concorde. Hughes also raised other issues related to long-term randomized clinical trials:
Hughes stated that, within the ACTG, losses to follow-up are typically in the range of 5% per year. Thus, losses to follow-up occur more quickly than do study endpoints. How could a large, long-term randomized clinical trial avoid such disparities? [One possibility would be to conduct the study at primary care sites, or in a captive network such as the VA system.] Flexible study designs will be key. Patients should be allowed to change strategies or regimens during follow-up. Size requirements increase if event rates rise over time. One could also envision nested trial designs in which second randomized regimens could be related to starting regimens [e.g., protease inhibitor sparing vs. protease inhibitor containing, with cross-over, cf. ACTG 384 and CPCRA 059].
Hughes stated that we need some event-based randomized clinical trials to underpin cohort-based evaluations. Marker changes and clinical effects of protease inhibitors are comparatively large; we're probably interested in much smaller differences between strategies. In response to a question, Hughes stated that "I don't think it's useful to do studies to detect things which will only affect 5-10% of people."
Jim Neaton of the University of Minnesota said that "50% effects (10%, 5%) are totally implausible for the designs you showed. We need trials with not 1,200 subjects but 1,200 events. A combination of larger sample size and longer follow-up."
Victor DeGruttola of the adult ACTG SDAC pointed out that industry or product-specific studies could be nested within strategy comparisons, as is being done in ACTG 384 and FIRST.
Tom Fleming of the University of Washington discussed sample size. Fleming clarified that if we were looking for a 100% improvement (i.e., a relative risk of 2.0), we could get by with 88 events. If we were looking for a 20% improvement (a relative risk of 1.2), we'd need 1,200 events. "These are important achievable differences... 
We need large numbers followed long-term."
Neaton replied, "I don't think a 50% effect is plausible. Is 20% plausible? 1,200 is kind of low. 20% is a very meaningful effect, but I question whether we'll see it in a when-to-start trial. Another way to approach this would be to design trials with interconnected designs and a plan to combine them: non-identical trials, designed in an interlocking way to answer different parts of the puzzle."
Peter Peduzzi of the Veterans' Administration suggested borrowing the large simple trial concept to get some efficiency: get many patients enrolled as soon as possible. "When trials go on too long, patients drop out and physicians lose interest. Embed substudies to study mechanisms. Ensure the design is flexible."
Next Alvaro Muñoz, chief statistician for the Multicenter AIDS Cohort Study (MACS), addressed the potential role of observational cohort studies in determining long-term effectiveness. Randomized clinical trials, being randomized, are used to determine efficacy of various treatments or strategies. Observational cohort studies can supplement and complement randomized trials by providing additional information on individual and population effectiveness.
[Figure: Relative Hazard of AIDS by Duration of Infection in MACS Seroconverters]
In the conversation which followed, the general consensus was that observational cohorts are useful to complement or supplement (but cannot replace) randomized trials, and that randomized trials are needed in particular to eliminate bias and to tease out treatment effects more modest than the dramatic short-term impact of HAART.
The rest of the afternoon was taken up by two presentations about the use of mathematical models in randomized clinical trials and observational cohort studies. For those attending who were not mathematically super-sophisticated, this portion of the symposium was relatively incoherent, and its relevance to the topic at hand was unclear.
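For readers who want to check the arithmetic, the sample-size figures quoted by Hughes and Fleming can be approximated with two textbook formulas. The sketch below is illustrative only: the report does not say how the speakers did their calculations, so the two-sided 5% significance level, the 1:1 allocation, the normal-approximation formula for comparing two proportions, and Schoenfeld's approximation for the number of events are all assumptions, and the function names are invented for this example. Losses to follow-up are ignored.

```python
import math
from statistics import NormalDist


def _z(p):
    """Standard normal quantile."""
    return NormalDist().inv_cdf(p)


def patients_for_two_proportions(p_control, p_treated, alpha=0.05, power=0.90):
    """Total patients (two equal arms) to detect p_control vs. p_treated.

    Assumes a two-sided test and the simple normal approximation,
    with no allowance for drop-outs or cross-overs.
    """
    z = _z(1 - alpha / 2) + _z(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    per_arm = z ** 2 * variance / (p_control - p_treated) ** 2
    return 2 * math.ceil(per_arm)


def events_for_relative_risk(relative_risk, alpha=0.05, power=0.90):
    """Total events needed, using Schoenfeld's approximation (1:1 allocation)."""
    z = _z(1 - alpha / 2) + _z(power)
    return math.ceil(4 * z ** 2 / math.log(relative_risk) ** 2)


if __name__ == "__main__":
    print(patients_for_two_proportions(0.10, 0.05))   # Hughes: "1,200 patients"
    print(patients_for_two_proportions(0.05, 0.025))  # Hughes: "2,500"
    print(events_for_relative_risk(2.0))              # Fleming: "88 events"
    print(events_for_relative_risk(1.2))              # Fleming: "1,200 events"
```

Under these assumptions the functions give roughly 1,160 and 2,420 patients, and 88 and roughly 1,260 events, which is close to the rounded figures quoted at the symposium. The event-based formula also makes Neaton's point concrete: what drives power is the number of events, not the number of subjects, so low event rates translate directly into larger or longer trials.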
Amy Justice of the Veterans' Administration brought the conversation down to earth, commenting that, "I've been to three meetings at which these issues have been discussed over the last six months. It's not randomized clinical trials vs observational cohort studies, it's how much do we want to spend, how large are the effect sizes we want to see? If it's two- to four-fold, observational databases are likely to be OK. If it's 20%, I don't see how anyone in this room can be confident that they've measured all the relevant covariates. How long do we have to follow them? Are there going to be differences in event rates early or late? These questions we need to answer."
Victor DeGruttola pointed out that, "Having devoted several years of my life to the design of a study of 'tight' versus 'loose' virological control, only to see it bite the dust, I certainly admit these questions are important. But the most difficult aspect of this science isn't the statistics but formulating clinical questions for studies in which doctors and patients are willing to enroll. People have to be willing to live with these questions through several cycles of technological change."
Amy Justice added, "Part of what randomization is trying to buy us is a simple design so that clinicians can understand the results. If they don't understand the results, they're not going to believe them. So for God's sake let's keep it simple: clinical endpoints."
In summary, Steve Self of the Fred Hutchinson Cancer Research Center in Seattle said, "There is no substitute for randomized trials to give us answers to strategy questions that can give us an even 20% effect."
Tom Fleming added, "There's so much we don't know regarding the disease process, multiple endpoints, intended and unintended intervention effects, ancillary care effects. We need to have all these types of studies. In particular, now, large-scale randomized trials: that's what are lacking in the landscape... 
These need to be long-term studies and large studies... I'm an advocate of large simple trials-in cardiology they've established themselves as very important tools. But, given the complexities in HIV disease, these trials are not going to be as simple. The length of these trials... I'd like them to be revealing some insights at 5-7 years. But for front-line therapy, I want to know ten years. Those will be expensive trials. The size and length of these studies will be study specific. If important insights emerge over five to seven years-or if there's a [r]evolution in clinical care-we need to be sure the question we've formulated will remain relevant over time." In summary, we need:
| #2 | ART Real World Observational Databases Promise to Solve Clinical Trials Lag, But Experts Warn All Data Not Created Equal |
| 'Built-in bias' Four years into the era of HAART-for-life, as the feasibility of eradication recedes to the vanishing point, the Division of AIDS at the National Institutes of Health (NIH) sponsored a workshop on "long-term effectiveness research" in HIV disease. If the disease is going to be chronic, how can researchers make it manageable? Over-optimistic assumptions about adherence, eradication, and toxicity governed initial recommendations to "Hit early, hit hard." As a plethora of new and bizarre side effects became apparent, the wisdom of hitting early became somewhat eroded-especially as the immune system displayed a greater ability to reconstitute itself than had previously been expected. Hence the new interest in hitherto heretical questions such as "When to start?" and "How to change?" antiretroviral therapy. Some notes from the workshop follow. NIAID Director Anthony S. Fauci opened the workshop, saying "I'm here because it's important. Over 200,000 Americans with HIV are unaware of their infection, and 40,000 become infected each year. The burden of treatment [in the USA] that's ahead of us is going to be greater than all we've treated to date. If we don't need to treat everyone every day-e.g., with structured treatment interruptions (STIs)-it's possible we could treat some people in developing countries abroad who wouldn't otherwise be able to be treated..." John Bartlett, Fauci's co-chair on the HHS Antiviral Guidelines Committee, summarized his view of the data standards used in developing and updating the treatment guidelines. "For the Guidelines, we always use randomized, controlled trials. Sample size is unspecified. The duration of trials is between 24-48 weeks. For analysis, we use intent-to-treat and on treatment. The endpoints we look at include viral load <50, <500, and a separate analysis for those entering with viral loads over 100,000 copies/mL. We also look at adverse drug events, and tolerance. 
We'd like a comparison to the regimens in the preferred category."
"Simply put, what we do in the clinic is different than what we say in the Guidelines. Among the main concerns with current guidelines: long-term outcomes are not available from most of the studies; initiation of "When to start?" is arbitrary; the need for individualization of regimens; the outstanding quandary of protease-sparing initial regimens; the benefit of partial viral suppression vs. the rapid squandering of future therapeutic options; the continuing threat of drug resistance issues."
"Things in this field change with great speed. In most areas of medicine, the average time for clinical trial results to affect practice is 10 years; in HIV it is two years. Clinical trials have a participant bias. Fifty to eighty percent of individuals in clinical trials achieve a viral load below 50 copies/mL; in clinical practice this rate is 20-40%. Enrollment and retention are problems in randomized clinical trials. This is largely driven by Medicaid policies, which are state-specific. Where you have a good Medicaid, you have a disincentive for enrollment; look at Maryland, Massachusetts, New York. Moreover, none of the big trials to date has assessed cost."
No one in Bartlett's clinic is now on a single protease inhibitor (as part of a 4+ drug combo), with the possible exception of nelfinavir. Bartlett pointed out that, using MACS data (Mellors, Ann Intern Med 1997), there was little difference in the incidence of AIDS at 3-5 years between those with CD4 over 500 or between 350-500. Similarly, the Swiss HIV cohort did not see major differences in progression between injecting drug users and others who started HAART later, and early starters (Junghans, AIDS 1999).
Gwen Scott of University of Miami School of Medicine spoke about the Pediatric Guidelines. 
There is a sub-population of children whose immune systems are preserved, and it is not known whether they should be started on antiretroviral therapy (ART). Other important questions on the pediatric front include: Which combination therapy can best preserve growth and development? What is the mechanism of neurologic disease in children and how can it be prevented? How should resistance testing be best interpreted and incorporated into clinical practice? What is the role of Cesarean section in women with undetectable viral load on HAART in preventing perinatal transmission? Trip Gulick of the Cornell ACTU pointed out that in the AIDS field, a three-year study is considered "long-term." [Actually, in HIV infection a one-year study is considered long-term.] Clinical trial demographics do not always represent clinic populations, and study outcomes do not always match clinic outcomes. Carlton Hogan of the CPCRA Statistical Center gave an activist perspective. He assumes that most patients initiating therapy in the next five years will be able to get their plasma viral load beneath the limit of quantitation-at least in the short-term. Therefore, drug side-effects are likely to outweigh AIDS progression events. Hogan also predicts we won't have many novel drug classes or compounds, and that combination ART will still be necessary. What events are relevant? AIDS opportunistic infections (OIs) have decreased dramatically since the advent of potent combination therapy. Most future acute OI incidents are likely to be in individuals with barriers to care or unaware of their HIV seropositivity. These groups will be difficult to study. In patients on antiretroviral therapy, rates of adverse events-some life-threatening (e.g., cardiovascular)-are climbing. These may well become more common than OIs themselves. 
We cannot imagine (much less predict) the potential consequences of 10-20 years of antiretroviral therapy, complicated by interactions with cardiovascular drugs and aging. Other questions not raised in this workshop: when/if to stop antiretroviral therapy? when/if to restart? what to do with patients with discordant viral load and CD4 responses to antiretroviral therapy?
Melanie Thompson of the Atlanta CPCRA unit cited some pithy quotes: "A physician is a person who pours drugs of which she knows little into a body of which she knows less. A doctor is someone who kills you today to prevent you from dying tomorrow."
Randomized, Controlled Trials vs. Observational Cohort Studies
Saag asked how the ACTG site will get the adverse event and hospitalization data if it's not providing primary care. Using data from the UAB cohort, Saag pointed out the complexities of current care. Among 143 patients whose viral load rose to over 5,000 copies/mL, there were 1,067 "regimen events," 242 unique regimens, and 107 unique regimen sequences. Every patient experienced a virtually unique regimen. Eighty-nine regimens included at least four drugs. Median time on each regimen was four months, with most changes due to toxicity.
Lawrence Friedman of the National Heart, Lung & Blood Institute (NHLBI) discussed analogous experiences from cardiovascular disease. He focused on coronary heart disease and selected risk factors. Similar to HIV infection, cardiovascular disease is chronic, takes decades to develop, and has many interacting risk factors. There are many ways for adverse consequences to develop: heart failure, aneurysm, embolism, plaque rupture, and others. Further, there are many interventions: lower LDL, raise HDL, stop smoking, lower blood pressure, modify diet, increase exercise, give antiplatelet drugs or antioxidants. Treatment approaches include anticoagulants, ACE-inhibitors, beta blockers, defibrillators, bypass surgery, angioplasty, heart transplant, and others. 
In 1972 the NHLBI recommended reducing blood pressure; in 1985 the institute recommended reducing cholesterol. Both of these approaches were initially based on risk evaluation from observational cohort data, without much data from large, randomized clinical trials to go by. Later the statin trials produced results, proving that lowering cholesterol saves lives; however, there were some other not-so-happy examples. Anti-arrhythmic therapy was recommended based on observational studies. Yet the randomized clinical trials showed that, rather than prolonging life, the anti-arrhythmics actually increased mortality. Similarly, observational cohort studies showed a high level of cardiovascular benefit from estrogen replacement therapy, while randomized clinical trials are, if anything, showing a negative effect.
Friedman looked at the salt controversy. Should you reduce salt intake to reduce hypertension? Observational data are inconsistent. Most randomized clinical trials have been small and short. The few larger, longer-term trials show quite modest effects on blood pressure. No trials have looked at clinical outcomes. Meta-analyses show statistically significant but quite modest differences. Some advocate reducing salt only in high-risk individuals, since the effect on blood pressure is minimal for most, and salt reduction reduces quality of life. Others favor a population strategy, claiming that even small shifts could have great public health importance. NHLBI held a workshop, which waffled: "The evidence that salt intake contributes to high blood pressure continues to increase. Americans take too much salt. The population strategy could affect cardiovascular mortality as much or more than a high-risk strategy."
Friedman concluded by stating that cardiovascular medicine has been revolutionized by trials. Not all important trials have been done. Randomized clinical trials were interpreted along with observational cohort study (OCS) data. 
Caroline Sabin of the Royal Free Hospital in London further discussed the value and limitations of randomized clinical trials versus OCS. We have randomized clinical trials with clinical endpoints, randomized clinical trials with surrogate marker data, and OCS, case control studies, etc. Before HAART, progression was faster, clinical endpoints were possible within a 2-3 year time frame, treatment options were limited, drop-outs were 'relatively simple' to deal with, and results were easier to interpret. Even then, however, clinical endpoint trials seemed to take too long. Surrogate marker-based trials were then adopted based on the biology of disease and the need to speed up 'answers.' First CD4 cell counts were used, then plasma viral load.
There are some benefits to observational studies. They reflect routine clinical practice, look at lots of regimens, sometimes have longer follow-up, and are perceived to be quicker. They are still too small to look at rarer, more complex regimens or adverse events, however, and treatment comparisons can come with built-in bias. Compare patients starting regimen A or B without randomization: Do they really share the same prognosis? What determines the choice of regimen A and regimen B? If the choice between A and B is made based on subjective clinical or laboratory evaluation, as it often should be, then the treatment outcomes will necessarily reflect these differences as well as any differences between A and B.
Sabin showed data, generated earlier by Andrew Phillips, comparing three randomized clinical trials and three observational cohort studies looking at the relative benefits of AZT vs. AZT/ddI and AZT/ddC. Here, the three randomized clinical trials (ACTG 175, CPCRA 007, and Delta) and the three observational cohort studies (EuroSIDA, the French Hospital Cohort and the Swiss Cohort) showed comparable results. The databases agreed. This was nice. 
However, in a second comparison (using ACTG 320 as the randomized clinical trial and the same cohort studies), ACTG 320, EuroSIDA and the Swiss Cohort all agreed. But the gargantuan French Hospital Cohort study (N = 70,000) went in exactly the opposite direction, suggesting that you'd be better off adding just 3TC, not 3TC and indinavir, to your AZT. Has the database given us the wrong answer to the correct question, the correct answer to a different question, or what?
Observational cohort studies are also of limited use unless the strategies are in current practice. They can be quick if retrospective data exist, but they will take nearly the same amount of time as a randomized clinical trial if data must be collected prospectively. OCS are also confounded by amount of prior treatment received. For example, the earliest individuals to go on HAART are likely to have been on mono or dual nucleoside therapy the longest, whereas later HAART starters probably had less prior nucleoside exposure. No surprise, then, that the late HAART group experienced greater benefit than the earlier group. With pulsed therapy studies, because pulsed therapy is a recent-to-emerge strategy, OCS data will, like randomized clinical trial data, need to be collected prospectively.
Finally, when can we justifiably rely on OCS data? Only in cases where the treatment effect is large? Where many independent, well-run OCS are consistent? When there is no apparent confounding in studies? Or when the confounding occurs in opposite directions in different studies? On balance, long-term studies are essential to consider clinical events and toxicities and long-term surrogate marker values. If randomized clinical trials are feasible and ethical, we should do them.
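Sabin's "built-in bias" can be made concrete with a small worked example. The sketch below uses a hypothetical cohort whose numbers are invented purely for illustration; they are not taken from any study named in this report. By construction regimens A and B are equally effective, but clinicians are assumed to give B more often to patients with a poor prognosis, so a naive, unrandomized comparison makes B look worse.

```python
# Hypothetical cohort, invented for illustration only.
# Within each prognosis stratum the event risk is identical for A and B,
# so any difference in the pooled comparison is produced by who gets which regimen.
COHORT = [
    # (stratum, regimen, patients, event risk)
    ("good prognosis", "A", 800, 0.05),
    ("good prognosis", "B", 200, 0.05),
    ("poor prognosis", "A", 200, 0.20),
    ("poor prognosis", "B", 800, 0.20),
]


def risk(regimen, stratum=None):
    """Expected event risk for a regimen, pooled or within one stratum."""
    rows = [r for r in COHORT
            if r[1] == regimen and (stratum is None or r[0] == stratum)]
    events = sum(n * p for _, _, n, p in rows)
    patients = sum(n for _, _, n, _ in rows)
    return events / patients


# Naive comparison, ignoring prognosis: B appears to double the risk.
crude_rr = risk("B") / risk("A")

# Comparison within each prognosis stratum: no difference between A and B.
stratified_rr = {s: risk("B", s) / risk("A", s)
                 for s in ("good prognosis", "poor prognosis")}

if __name__ == "__main__":
    print(f"crude risk ratio B vs. A: {crude_rr:.3f}")
    for stratum, rr in stratified_rr.items():
        print(f"{stratum}: risk ratio B vs. A = {rr:.3f}")
```

In this toy cohort the pooled risk ratio is a little over 2 while the risk ratio within each stratum is 1. Stratifying repairs the comparison here only because prognosis was recorded; randomization is what balances prognostic factors that nobody measured.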
| #3 | OCS Redux International Statistical Panel Puts Observational Cohort Study Research Possibilities Into Perspective, Identifies Limitations |
'Blunt instrument'
One of the most frequent objections to carrying out randomized, controlled trials to answer questions about long-term effectiveness is that observational cohort studies could provide the answers faster and more cheaply, and serve as a closer proxy for real-world clinical experience. The National Institutes of Health funds many observational cohort studies such as the men-only Multicenter AIDS Cohort Study (MACS), the Women's Interagency HIV Study (WIHS), and the ALIVE study of injection drug users. The Centers for Disease Control and Prevention also funds some observational cohorts such as the Adult & Adolescent Spectrum of Disease (AASD), in over 23,000 HIV-infected patients, and the HIV Outpatient Study (HOPS), in about 5,000. Researchers, both cohort-based and trial-based, provided a number of explanations as to why observational studies will be inadequate to tease out the answers to "When to start?" "What to start with?" "When to change?" and "What to change to?" Here is a collection of quotes and paraphrases from the Division of AIDS Satellite Statistical Symposium, held on January 11, 2000.
Copyright © 2000 - Treatment Action Group (TAG), 200 East 10th Street, #60, New York, NY 10003, phone: (212) 260-0300, fax: (212) 260-8561. All rights reserved. No part of this publication may be copied or reproduced in any form or by any means without the written permission of Treatment Action Group. Original formatting by Joel Beard, Web Manager, http://aidsinfonyc.org. Used with permission.