If real-world endpoints are indeed accurate proxies for clinical trial endpoints, real-world endpoints should start to look increasingly similar to the clinical trial endpoints as more stringent criteria are applied, said Jeff Allen, president and CEO of Friends of Cancer Research.
Friends has published a set of common definitions for real-world endpoints, including overall survival, progression-free survival, and other non-traditional endpoints. The project focuses on patients with advanced non-small cell lung cancer who received immune checkpoint inhibitors.
“By running these analyses in parallel with 10 different partners, we’re able to identify different data characteristics, such as the histological distribution within each dataset, that can influence the outcomes measured,” Allen said.
“[The collaboration] required an upfront agreement from all the participants to provide a high level of transparency regarding the type and level of detail of data they had available to align on common definitions.”
“We have also been cognizant of how disease setting, practice patterns, and specific treatment regimens would impact the meaningfulness of the real-world endpoints that we used and tried to account for those possible differences in our framework.”
The pilot project is ongoing: the collaboration will next apply clinical trial inclusion/exclusion criteria to isolate clinical trial “eligible” real-world patients and compare the outcomes of the two groups.
Allen spoke with Matthew Ong, associate editor of The Cancer Letter.
Matthew Ong: In a nutshell, could you describe the Friends effort—and FDA’s role—in defining the utility of real-world endpoints? Also, what would this mean for oncologists and patients?
Jeff Allen: The RWE Pilot 2.0 project used our existing framework, developed during RWE Pilot 1.0, to assess several frontline treatment regimens in real-world patients with advanced non-small cell lung cancer.
Friends worked with 10 health care research organizations, FDA, and NCI to develop the pilot methodology and endpoint definitions used in Pilot 2.0. FDA was instrumental in providing expertise throughout the entirety of the project, including its development.
Ultimately, the results of this project will be informative to oncologists and patients by filling evidence gaps about the performance of medical products used in a real-world setting, including populations that may not have been represented in clinical trials.
It has also helped to characterize how several metrics that are readily obtained from electronic health data (such as time-to-treatment discontinuation) correlate with more traditional clinical measures like tumor progression or survival.
What are your primary considerations in developing a real-world endpoints framework that is consistent and meaningful?
JA: Currently, a significant challenge for the field is validating real-world endpoints that can be extracted from real-world data independent of the data source.
Sensitivity analyses, which validate real-world endpoints by testing the ability of the endpoints to detect changes within a population, are important internal controls that we are including in ongoing analysis.
We have also been cognizant of how disease setting, practice patterns, and specific treatment regimens would impact the meaningfulness of the real-world endpoints that we used and tried to account for those possible differences in our framework.
That said, our future efforts will explore the use of the framework in a different disease setting to determine its broader applicability.
What was the process for creating this set of common definitions for real-world endpoints?
JA: There were a lot of in-depth discussions with the groups concerning each variable to be extracted and endpoint generated.
It required an upfront agreement from all the participants to provide a high level of transparency regarding the type and level of detail of data they had available to align on common definitions.
All in all, this was a very collaborative process that demonstrates the importance of the work and commitment of the participating partners.
How do you ensure that these common definitions can be used to generate evidence that might be substantially equivalent to—or that approximates—evidence created based on conventional endpoints?
JA: RWE Pilot 1.0 compared the correlation of real-world endpoints with the conventional endpoint of overall survival (OS).
In Pilot 2.0, we went back to the original pilot to identify outliers in the data and variations in how each group interpreted the endpoints to refine and standardize the real-world endpoints definitions.
What have you learned, so far, using these common definitions, about the real-world outcomes for aNSCLC patients treated with frontline therapies?
JA: Although not necessarily surprising, real-world patients treated with PD-(L)1 therapy are generally older and less healthy than clinical trial patients.
We have also observed some interesting differences in practice patterns across the data sets, particularly in the distribution of drugs used across treatment groups, which we will be exploring in the coming months.
By running these analyses in parallel with 10 different partners, we’re able to identify different data characteristics, such as the histological distribution within each dataset, that can influence the outcomes measured.
This will be important for future applications of real-world data to consider reporting so that variations can be better understood.
Within this project, what are some of the challenges with using different sources of data to provide information on treatment outcomes? What works, and what hasn’t worked?
JA: In general, this is one of the biggest challenges associated with real-world data: establishing protocols and definitions that are applicable across different data sources. Data sets differ both in their source (electronic health records, claims, or some combination of the two) and in the granularity of the data they contain.
In some cases, the alignment had to be on the intent of the definition, not the definition itself. For example, the protocol for identifying patients with complete records looks very different depending on whether you are working with EHR or claims-based data.
What would Pilot Project 3.0 be focusing on? Will your partners be stratifying patient cohorts to study treatment effect and outcomes at a more granular level?
JA: There was a wealth of data collected during Pilot 2.0 that can still be analyzed to provide further insights. Of specific interest to the group was the idea of stratifying patient populations by PD-(L)1 status to compare impact on outcomes.
Over the coming months, we will be pursuing this and looking into other questions, such as using the framework in other disease settings and applying clinical trial inclusion/exclusion criteria to the real-world populations to validate endpoints.
In 2020, we hope to bring the 10 partner organizations back together to present these additional analyses for public discussion.
We’ve also been exploring with several additional data partners how real-world evidence and the endpoints characterized thus far may be leveraged, alongside a variety of other clinical and health care system endpoints, to measure treatment effectiveness, toxicity management strategies, and acute service utilization rates (e.g., ER visits or hospitalizations) to inform value assessments and quality of care.
As you get closer to being able to validate real-world endpoints and benchmark them against clinical trials, what is the best-case scenario?
JA: Ongoing work in this pilot is to apply clinical trial inclusion/exclusion criteria to real-world populations in order to isolate clinical trial “eligible” real-world patients and compare the outcomes of the two groups.
If real-world endpoints are accurate proxies for clinical trial endpoints, we should see real-world endpoints become increasingly similar to the clinical trial endpoints as we apply increasingly stringent criteria.
While our goal isn’t to make real-world studies mirror clinical trials, this may be an important internal validation step to increase confidence in the data quality and conclusions being drawn from a broader real-world dataset.
It could also support the use of less strict clinical trial eligibility criteria, making trial results more broadly applicable to real-world populations.