Estimating overdiagnosis is hard—we need all the help we can get

Over the past decade there has been much consternation about overdiagnosis, the detection of cancers that would never have been diagnosed in the patient's lifetime without screening.

The growing awareness of the problem of overdiagnosis has gone hand in hand with a collective sobering up about early detection.

The reality is that the size of the population that stands to benefit is much smaller than the population exposed to potential harm. Only a small minority of those screened for any given cancer have their lives extended by the test, since only a small minority would have died of that cancer.

At the same time, every person screened is exposed to potential harms from overdiagnoses and false-positive tests.

This does not automatically imply that screening should be abandoned, but it does lead to a nuanced perspective regarding benefit vs. harm. This, in turn, creates a need to quantify harm-benefit tradeoffs and brings us to the challenge of estimating overdiagnosis.

The frequency of overdiagnosis is extraordinarily difficult to assess. To know whether any screen-detected case has been overdiagnosed would require ignoring the diagnosis, withholding treatment, and waiting to determine whether other-cause death precedes clinical (symptomatic) detection.

Since this practically never happens, we are left with two methods for estimating overdiagnosis: excess incidence and lead-time modeling. Both were covered in a recent commentary (Kramer BS and Prorok PC, “A comparison of study designs for estimating overdiagnosis in cancer screening,” The Clinical Cancer Letter (CCL), May 3).

The first method recognizes that an overdiagnosed case is a true excess diagnosis—a cancer detected only because of the screening test. This reasoning leads to an excess-incidence formulation for capturing the extent of overdiagnosis: subtract the incidence without screening from the incidence with screening to yield the incidence of overdiagnosis.
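
In code, the formulation amounts to a single subtraction. The sketch below uses hypothetical rates purely to fix ideas; obtaining a defensible value for the second input is, as discussed next, the hard part.

```python
# A minimal sketch of the excess-incidence calculation.
# All rates are hypothetical, chosen only for illustration.

def excess_incidence(rate_with_screening: float, rate_without_screening: float) -> float:
    """Incidence of overdiagnosis = incidence with screening
    minus incidence without screening."""
    return rate_with_screening - rate_without_screening

incidence_screened = 130.0    # hypothetical: per 100,000 person-years with screening
incidence_background = 100.0  # hypothetical: per 100,000 person-years without screening

excess = excess_incidence(incidence_screened, incidence_background)
print(f"Excess incidence: {excess:.0f} per 100,000 person-years")
print(f"Share of screen-era incidence: {excess / incidence_screened:.0%}")
```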

The excess-incidence method is simple in principle, but studies (e.g., Duffy SW and Parmar D, “Overdiagnosis in breast cancer screening: the importance of length of observation period and lead time,” Breast Cancer Research, May 2013) have shown that it is difficult to get right in practice.

Indeed, many published excess-incidence estimates are inflated and have contributed to alarmist media headlines about overdiagnosis such as “The Great Prostate Mistake” (New York Times, April 2010) and “It’s Time to Rethink Cancer Early Detection,” with a table titled “Fatal Retraction” (Wall Street Journal, Sept. 2014).

Why is it so hard to get excess incidence right in practice?

There are several reasons. In the population setting (breast or prostate cancer in the U.S., for example), incidence with screening is generally available, but the background incidence without screening is not.

Attempts to impute this may ultimately amount to guesstimates that cannot be verified or defended. In the trial setting, the control group provides the background incidence, but the trial design, measure of excess incidence, and follow-up duration must satisfy very specific conditions in order to avoid a result that is provably biased (Gulati R, Feuer EJ, Etzioni R, “Conditions for valid empirical estimates of cancer overdiagnosis in randomized trials and population studies,” American Journal of Epidemiology, July 2016).

The recent CCL commentary concurred that the best chance of a valid result comes from cumulative excess incidence in a trial with a stop-screen design and adequate follow-up after screening stops.
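
A toy stop-screen trial, with invented case counts rather than real data, shows why follow-up length is decisive: while screening runs, the screened arm piles up early diagnoses; once it stops, the control arm catches up, and the cumulative excess settles at the true number of overdiagnoses only if follow-up is long enough.

```python
# Toy stop-screen trial with invented yearly case counts (not real data).
# Screening runs in years 1-4; the control arm catches up afterwards.
screen_arm  = [150, 150, 150, 150, 50, 60, 65, 75]
control_arm = [100] * 8   # stable clinical incidence without screening

cum_screen, cum_control = 0, 0
for year, (s, c) in enumerate(zip(screen_arm, control_arm), start=1):
    cum_screen += s
    cum_control += c
    excess = cum_screen - cum_control
    print(f"Year {year}: cumulative excess = {excess:>3} "
          f"({excess / cum_screen:.0%} of screen-arm diagnoses)")
```

Stopping this invented trial at year 4 would put the excess at a third of screen-arm diagnoses; by year 8 it has settled at 6%, the overdiagnosis fraction built into the example.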

In other settings, corrections have been proposed that may reduce bias (e.g., Ripping TM et al, “Quantifying overdiagnosis in cancer screening: A systematic review to evaluate the methodology,” JNCI, Oct. 2017).

We agree with the CCL commentary that excess incidence is a useful idea and can, in ideal settings, produce a valid result. But we do not agree with its dismissal of the second method for overdiagnosis estimation, lead-time modeling. Before explaining why we believe that lead-time modeling has value, we provide some background about this approach.

Lead time is defined as the time by which diagnosis is advanced by screening. This, in turn, is inextricably linked to the underlying disease latency. Lead-time modeling first estimates the lead time distribution. Then, overdiagnosis is estimated as the proportion of individuals whose lead time is longer than their time to death from other causes, which can be estimated from life tables.
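
Given the two ingredients, the second step reduces to comparing two random quantities, as in the Monte Carlo sketch below. The exponential distributions are stand-ins chosen for illustration, not a recommended model.

```python
# A Monte Carlo sketch of the lead-time definition of overdiagnosis:
# a screen-detected case is overdiagnosed if its lead time exceeds the
# time to death from other causes. Both distributions are invented
# placeholders (exponentials), not fitted to any data.
import random

random.seed(1)

MEAN_LEAD_TIME = 5.0     # years; hypothetical
MEAN_OTHER_CAUSE = 15.0  # years; stand-in for a draw from a life table

n = 100_000
overdiagnosed = sum(
    random.expovariate(1 / MEAN_LEAD_TIME) > random.expovariate(1 / MEAN_OTHER_CAUSE)
    for _ in range(n)
)
print(f"Estimated overdiagnosis fraction: {overdiagnosed / n:.1%}")
# With two exponentials the answer is exact: 5 / (5 + 15) = 25%.
```

In real applications the lead-time distribution comes from the fitted model and other-cause survival from life tables; the comparison itself is unchanged.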

Lead-time modeling uses excess incidence, but indirectly. A seminal study published in the early 1990s (Feuer EJ and Wun LM, “How much of the recent rise in breast cancer incidence can be explained by increases in mammography utilization?” American Journal of Epidemiology, Dec. 1992) showed that the increase in incidence expected under screening ties directly to the lead time—given a specific pattern of screening dissemination, longer lead times produce a more pronounced rise in incidence than shorter lead times. In principle, therefore, we should be able to learn about lead time and disease latency from excess incidence under screening.
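
A toy simulation, with all parameters invented, makes the mechanism concrete: with a steady flow of cases surfacing clinically, introducing screening pulls each diagnosis forward by its lead time, and the longer the mean lead time, the larger the early surge in observed incidence.

```python
# A toy illustration of the Feuer-Wun connection: the rise in incidence
# after screening begins depends on the lead time. Parameters invented.
import random

random.seed(2)

def first_year_diagnoses(mean_lead_time, years=10, cases_per_year=1000):
    """Clinical diagnoses arrive uniformly over `years`; screening starts
    at year 0 and advances each diagnosis by an exponential lead time."""
    count = 0
    for _ in range(years * cases_per_year):
        clinical_time = random.uniform(0, years)
        lead_time = random.expovariate(1 / mean_lead_time)
        if max(0.0, clinical_time - lead_time) < 1.0:  # diagnosed in year 1
            count += 1
    return count

for mean_lt in (1.0, 3.0, 6.0):
    print(f"Mean lead time {mean_lt:.0f}y -> year-1 diagnoses: "
          f"{first_year_diagnoses(mean_lt)} (baseline ~1000/yr)")
```

This toy model ignores screening schedules, test sensitivity, and dissemination patterns, all of which matter in a real analysis, but the direction of the effect is the point.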

In fact, the connection between excess incidence and disease latency has a rich history in both the cancer and HIV literatures, and was harnessed in many studies of HIV-tested cohorts to predict the size of the AIDS epidemic in the 1980s and 1990s (e.g., Brookmeyer R, “Reconstruction and future trends of the AIDS epidemic in the United States,” Science, July 1991).

So, the notion of using excess incidence to inform about disease latency is well established. Why, then, is lead-time modeling dismissed for estimating overdiagnosis? As the CCL commentary explains, this arises from concerns about the assumptions made by the models.

The majority of published modeling studies make a simplifying assumption about the shape of the disease latency distribution: they do not explicitly include a non-progressive or indolent fraction of cases, which would have effectively infinite latency. This raises a concern that lead-time estimates may be substantially biased, which becomes grounds for some to dismiss the entire approach.

It is true that models inevitably simplify disease biology and that the estimated lead-time distribution will be different if a fraction of non-progressive cancers is allowed. But the model-based procedure for estimating overdiagnosis is agnostic to whether the lead time is infinite or simply very long.

If the assumed family of distributions allows for a fraction of lengthy lead times, and the models are identifiable (they can be uniquely estimated on the basis of the available data), the results should permit approximation of the overdiagnosis frequency or at least provide a sense of whether it is likely to be non-trivial.
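
A small sketch illustrates the point. The two parameterizations below are invented: one includes an explicit 10% indolent (infinite-lead-time) fraction, the other a single longer-tailed distribution, and both feed into the overdiagnosis calculation identically.

```python
# A sketch comparing an explicit indolent fraction with a single
# longer-tailed lead-time distribution. All parameters are invented.
import random

random.seed(3)

def overdiagnosis_fraction(indolent_frac, mean_lead_time,
                           mean_other_cause=15.0, n=200_000):
    """Fraction of cases whose lead time outlasts other-cause survival."""
    count = 0
    for _ in range(n):
        time_to_other_death = random.expovariate(1 / mean_other_cause)
        if random.random() < indolent_frac:
            count += 1  # indolent case: infinite lead time, always overdiagnosed
        elif random.expovariate(1 / mean_lead_time) > time_to_other_death:
            count += 1
    return count / n

print(f"10% indolent + mean 3y lead time: {overdiagnosis_fraction(0.10, 3.0):.1%}")
print(f"No indolent, mean 5y lead time:   {overdiagnosis_fraction(0.00, 5.0):.1%}")
```

Both invented settings yield roughly 25% overdiagnosis: the estimate does not hinge on whether long lead times are literally infinite. Telling such structures apart from data is, of course, the identifiability question.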

In practice, modeling studies should examine multiple shapes for the lead-time distribution as part of a thorough sensitivity analysis. Identifiability of the estimates should ideally be confirmed, even though this can be quite challenging and dataset-dependent (Ryser MD et al, “Identification of the fraction of indolent tumors and associated overdiagnosis in breast cancer screening trials,” American Journal of Epidemiology, Jan. 2019). Uncertainty in the estimates should be quantified and potential sources of bias considered.

Not least, model descriptions should be transparent and methods reproducible. It is understandable that skepticism about models surfaces when these steps are not followed. But there are thoughtful modeling analyses that have provided useful insights about overdiagnosis in both trial and population settings.

In conclusion, both excess incidence and lead-time modeling have the potential to produce misleading results. But today we have a better understanding than ever before of the circumstances under which the two methods can be trusted. We need to bring this understanding to how we estimate and report the extent of overdiagnosis. Regardless of the estimation approach, high-quality efforts should be recognized. Where feasible, it can be illuminating to compare multiple approaches (Etzioni R et al, “A reality check for overdiagnosis estimates associated with breast cancer screening,” JNCI, Dec. 2014).

Well-founded cancer screening policies and properly informed patient decisions are at stake.

Ruth Etzioni
Division of Public Sciences, Fred Hutchinson Cancer Research Center; Center for Early Detection Advanced Research, Knight Cancer Institute
Roman Gulati
Senior statistical analyst, Division of Public Sciences, Fred Hutchinson Cancer Research Center