The world of oncology clinical trials is on the cusp of a revolution. A recent report by Phesi, a clinical development analytics company, concluded that the current volumes of contextualized patient data are sufficient to completely modernize the landscape.
As oncology becomes increasingly biomarker driven, clinical trials must also evolve with a precision approach. The report suggests that data-led clinical development opens up the potential to generate digital twins, optimize protocol design, and make targeted site selections – all improving trial efficiency.
We connected with Gen Li, Founder and CEO of Phesi, to learn more about the future of data-led studies.
What has prevented the clinical research ecosystem from adopting a data-led precision approach to oncology clinical development until now?
In the past it hasn’t always been easy for clinical research professionals to access real-world data systematically – but this is changing thanks to the development of sophisticated data and analytics platforms that bring together multiple real-world and patient data sources.
The clinical research space is also traditionally conservative – understandably so, given the need for strict regulatory compliance. Clinical teams will often default to the tactics they have used successfully in the past, for example going to investigator sites they already know rather than using new data to find more appropriate investigators for the current trial.
That said, there are several examples where a data-led, precision approach has already been transformative for oncology clinical trials. One is the precision profiling of breast cancer patients through biomarkers such as HER2, PIK3CA, BRCA1 and BRCA2. Meanwhile, the successful KEYNOTE trials for the immunotherapy Keytruda harnessed real-world data and performance scores to aid investigator site and patient targeting.
The real question is how such successes can be repeated across the entire sector, and how senior decision makers can be encouraged to sponsor such initiatives.
How does the scale of real-world data now available translate directly into actionable improvements in trial design or patient outcomes?
We’ve identified three main pillars where better use of available real-world data can have a significant positive impact on oncology trial design and patient outcomes:
Patient profiling: Misalignment between the target patient population and the actual population recruited in a study can lead to enrollment delay, avoidable protocol amendments and even trial termination. By analyzing real-world patient data, sponsors can create a digital patient profile that gives them a statistical view of patient attributes for the patient population specifically targeted by each individual protocol – such as demographics, outcomes and concomitant medications. This guides researchers in ensuring that trials are designed with the relevant patients in mind.
Protocol design: Research reveals that oncology trial protocols are often overcomplicated, with high numbers of outcome measures. This both increases patient burden and causes recruitment difficulties, while not all the data collected are even used in the final submission. Comparing a trial’s protocol against real-world data and historic trial protocol designs allows sponsors to assess whether their number of outcome measures is too far above average. This helps ensure protocol design is optimized against the medical, scientific and commercial objectives.
Investigator site selection: A digital patient profile aligned with a specific protocol design also identifies the physicians caring for and treating the targeted patient population. This information informs the patient access score (PAS), which measures the probability that an investigator site can access the patient population defined in the protocol. A PAS substantiated with contextualized data improves precision in investigator site selection above and beyond historical clinical trial experience. By accurately assessing patient enrollment performance in this way, sponsors can realize higher enrollment rates and shortened enrollment cycle times.
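To make the protocol design comparison concrete, here is a minimal Python sketch of checking a planned outcome-measure count against historical protocols. It is an illustration only: the function name `outcome_measure_check`, the z-score threshold of 2.0, and the historical counts are all hypothetical assumptions, not Phesi's method or data.

```python
from statistics import mean, stdev


def outcome_measure_check(planned_count, historical_counts, z_threshold=2.0):
    """Compare a planned outcome-measure count with historical protocols.

    Hypothetical illustration: flags the protocol when its count sits more
    than `z_threshold` sample standard deviations above the historical mean.
    """
    avg = mean(historical_counts)
    spread = stdev(historical_counts)
    # Guard against a zero spread (all historical protocols identical).
    z_score = (planned_count - avg) / spread if spread > 0 else 0.0
    return {
        "historical_mean": avg,
        "z_score": z_score,
        "flag": z_score > z_threshold,
    }


# Made-up outcome-measure counts from comparable historical protocols.
historical = [8, 10, 12, 9, 11, 10, 13, 9]

result = outcome_measure_check(24, historical)
print(result["historical_mean"], round(result["z_score"], 2), result["flag"])
```

A flagged protocol is a prompt for review rather than a verdict; a high count may still be justified by the medical, scientific or commercial objectives.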
How do you define a digital twin in the context of oncology R&D, and how are these models validated for clinical relevance?
Digital twins can be developed to replicate a specific kind of patient, for example, a particular subtype of lung cancer patients receiving standard-of-care treatment. Their construction is informed by real-world patient data; patient selection for a twin mirrors clinical trial processes to ensure they are validated for clinical relevance. That means using the eligibility criteria defined by a protocol to “recruit” patients from databases, then building digital versions of baseline, efficacy, and safety outcomes based on real-world data.
Digital twins can help the industry overcome longstanding challenges in patient recruitment, the ethical issues associated with placebo arms, and the impact these constraints have on cycle and approval times.
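The “recruitment” step described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the `PatientRecord` fields, the eligibility rules in `is_eligible`, and the six records are all invented for the example, and a real digital twin would draw on far richer baseline, efficacy and safety data.

```python
from dataclasses import dataclass
from statistics import mean, median


@dataclass
class PatientRecord:
    # Hypothetical fields; real-world data sources will differ.
    patient_id: str
    age: int
    histology: str
    ecog: int                   # ECOG performance status
    on_standard_of_care: bool
    pfs_months: float           # progression-free survival


def is_eligible(r: PatientRecord) -> bool:
    """Made-up inclusion/exclusion criteria standing in for a protocol."""
    return (
        18 <= r.age <= 75
        and r.histology == "adenocarcinoma"
        and r.ecog <= 1
        and r.on_standard_of_care
    )


def build_twin_cohort(records):
    """'Recruit' records from a database using the protocol's criteria."""
    return [r for r in records if is_eligible(r)]


def summarize(cohort):
    """Aggregate baseline and efficacy attributes of the recruited cohort."""
    if not cohort:
        return {"n": 0}
    return {
        "n": len(cohort),
        "mean_age": mean(r.age for r in cohort),
        "median_pfs_months": median(r.pfs_months for r in cohort),
    }


# Invented records for illustration only.
records = [
    PatientRecord("P1", 62, "adenocarcinoma", 1, True, 6.0),
    PatientRecord("P2", 80, "adenocarcinoma", 0, True, 5.0),   # too old
    PatientRecord("P3", 55, "squamous", 1, True, 4.0),         # wrong histology
    PatientRecord("P4", 58, "adenocarcinoma", 0, True, 8.0),
    PatientRecord("P5", 49, "adenocarcinoma", 2, True, 3.0),   # ECOG too high
    PatientRecord("P6", 66, "adenocarcinoma", 1, False, 7.0),  # not on standard of care
]

cohort = build_twin_cohort(records)
summary = summarize(cohort)
print(summary)
```

Keeping the eligibility rules in one function mirrors the idea that the same protocol criteria used to screen trial participants also define who enters the twin cohort.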
Studies have shown that digital twins offer real potential to replace standard-of-care comparator arms to streamline the implementation of clinical trials and dramatically reduce patient burden.
The FDA recognizes digital twins as in silico models and has included guidance tools like the ENRICHMENT Playbook in its Regulatory Science Tools Catalog. The EMA has also set regulatory precedent for allowing digital twin arms and its 2025–2028 workplan pledges review of digital twin data as part of future regulatory decision-making.
The regulatory environment for digital twin cohorts is still developing, but the major agencies have given clear signals that they are keen to make them a standard part of new drug assessments.
Could you elaborate on the ways clinical data science can improve efficiencies in trial site selection and investigator matching?
Applying the PAS described above can help improve enrollment by avoiding underperforming sites: it gives sponsors a deep understanding of the target patient population and builds a “data bridge” between an investigator site and relevant patients. Identifying the relevant investigator sites with the highest enrollment potential has significant benefits. More than 40 percent of trial sites are non-enrolling; applying that data bridge could reduce the number to less than 10 percent.
With a comprehensive patient view from the outset, sponsors can then optimize trial design and reduce both patient burden and investigator site burden. This allows sponsors to select investigator sites and countries with far higher precision and relevance.
Today, sponsors often use simple ranked lists of all available sites for a trial. The issue with this approach is that it loses nuance in matching investigators with patients. Plus, if every sponsor is going to the same “highest ranked” investigator sites, then those sites will be oversubscribed. Using NSCLC trials as an example, our analysis showed that top investigator sites are recruiting patients for more than 50 trials in a competitive market like the US, making it impossible for them to contribute a meaningful number of patients. Clinical data science can add that missing nuance and identify the investigator sites able to recruit patients for a trial against specific inclusion and exclusion criteria.
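The oversubscription problem can be illustrated with a small Python sketch that compares a raw ranking with one adjusted for competing trials. The scoring rule here (eligible patients divided by one plus the number of competing trials) is a deliberately crude, hypothetical stand-in and is not the PAS; the site names and figures are invented.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    eligible_patients: int   # patients matching inclusion/exclusion criteria
    competing_trials: int    # other trials recruiting the same population


def adjusted_capacity(site: Site) -> float:
    """Hypothetical score: eligible patients shared across all active trials."""
    return site.eligible_patients / (1 + site.competing_trials)


# Invented sites for illustration only.
sites = [
    Site("Site A", eligible_patients=500, competing_trials=50),
    Site("Site B", eligible_patients=120, competing_trials=3),
    Site("Site C", eligible_patients=60, competing_trials=0),
]

# A simple ranked list looks only at raw patient volume.
raw_ranking = [
    s.name for s in sorted(sites, key=lambda s: s.eligible_patients, reverse=True)
]

# The adjusted ranking accounts for how many trials compete for those patients.
adjusted_ranking = [
    s.name for s in sorted(sites, key=adjusted_capacity, reverse=True)
]

print("raw:", raw_ranking)
print("adjusted:", adjusted_ranking)
```

In this toy example the largest site drops to last place once its 50 competing trials are taken into account, which is the kind of nuance a single ranked list hides.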
What role can real-world patient data play in closing the expertise gap for biomarker-driven trials?
The key is ensuring that the right trial is paired with the right investigator sites globally. As mentioned, there are huge amounts of historical real-world data available to enable sponsors to have an informed, granular view of the expertise of each particular investigator site they might be considering. That includes viewing which indications and medicines an investigator has worked on before, and which modalities.
This is particularly important for biomarker-specific cancer clinical trials. Aided by clinical data science, we can assess whether an investigator is familiar with those biomarker-defined patients, and whether their site has adequate infrastructure, such as laboratory capabilities, to process various bio-samples. It also allows sponsors to see historical recruitment success, and speed of recruitment and enrollment compared to peers. These insights empower trial planners to confidently assess whether that investigator will be appropriate for a trial involving a specific biomarker.
How do you see laboratory medicine professionals integrating with AI-augmented real-world data tools in the near future?
A major benefit will be in patient profiling: AI-augmented real-world data tools will allow those professionals to build digital patient profiles that give a statistical view of patient attributes for each indication – such as demographics (including ethnicity), comorbidities, outcomes and concomitant medications, among many other key variables associated with patient safety and efficacy analysis. This gives them an up-to-date view of changes in standard of care and patient pathways.
What future innovations do you foresee at the intersection of AI, digital twins, and laboratory diagnostics that could further reshape how we run oncology trials?
We’re excited to see digital twins become increasingly recognized by the regulatory community as a validated way to overcome challenges with placebo arms and to reduce patient burden. It’s still early days, but we’re certain to see an increasing number of trials harnessing digital twins as the space grows into wider mainstream acceptance.
Moreover, the value of digital twins goes beyond replacing control arms. We can use digital twins to better interpret results from single-arm trials or trials with small samples. Digital twins can also be used to detect early signals in double-blind trials without unblinding, avoiding traditional interim analysis. And we can use digital twins as historical controls in regulatory filings. The list of potential applications goes on.
Ultimately, taking real-world data and proactively using AI and clinical data science to address the longstanding issues with oncology clinical trials will empower sponsors to get treatments to patients faster and without the huge burdens often associated with such trials.