Three years after launching a project to automate the
process of collecting cancer data from select pathology laboratories and
reporting the data to cancer registries, the National Cancer Institute
reports that it has just passed the halfway point in reaching its goal.
The project is part of the NCI’s Surveillance, Epidemiology,
and End Results, or SEER, program, in which cancer data are collected
routinely from designated cancer registries in the United States. Public
and private entities then use those data to derive trends in cancer incidence,
mortality, and patient survival; conduct cancer-related studies; and help
state governments and other organizations predict their resource needs
for managing the disease and conducting epidemiological studies.
The SEER program now collects and publishes cancer
incidence and survival data from 14 population-based cancer registries
and four supplemental registries in the U.S., which together cover about
26 percent of the U.S. population. In total, SEER has data on more than
3 million in situ and invasive cancer cases, and about 350,000 cases are
added each year within the SEER coverage areas, says Carol Kosary, a statistician
in the NCI’s Division of Cancer Control and Population Sciences. That’s
more than double the annual volume of cases that was being plugged into
the data bank just two years ago, she says.
The SEER registries routinely collect data on patient
demographics, primary tumor site, morphology, stage at diagnosis, first
course of treatment, and follow-up, Kosary told CAP TODAY. It is the only
comprehensive source of population-based cancer information in the United
States that includes stage of cancer at the time of diagnosis and survival
rates within each stage, she says.
Steve Peace, a public health analyst with the SEER
program, says it has been the "primary means of measuring the national
burden of cancer" and that the epidemiologic benefits are powerful. "SEER
is often where changes in cancer incidence and death rates are first detected,
stimulating additional epidemiologic investigation to reveal the cause,"
he says.
Historically, most, if not all, of the laboratories
participating in the SEER program have relied on manual processes to identify
cancer cases and forward reportable data to the cancer registries involved
in the SEER program, Kosary says. That changed about three years ago when
the NCI signed a five-year, annually renewable contract with Artificial
Intelligence in Medicine Inc., or AIM, a Toronto-based medical informatics
engineering firm, to begin automating those manual processes with its
E-Path software.
"We’re building the infrastructure to improve
cancer data collection," Kosary explains. "It’s a five-year piece of work
that we’re trying to get accomplished, and we’ve just entered year three.
Within the five years we’re attempting to get electronic pathology reporting
installed in 80 percent to 85 percent of our 207 labs. We’re probably
a bit more than halfway there."
That plan will ultimately create a massive public-use
database with real-time, complete, and accurate information on cancer
incidence, mortality, and survival, something unattainable with manual
data gathering and reporting. Most of the cancer cases the NCI registers
in the SEER program—in the high 90 percent range—are found
in pathology reports by hospital tumor registrars, Kosary says. Generally,
registrars do this by hand, an inefficient process that often results
in missed cases.
Programs like E-Path do away with the inefficiency,
says Peter Brueckner, MD, AIM’s chief executive officer. "E-Path goes
through the electronic pathology records and automatically identifies
those cancer cases that are reportable through lexical analyses of the
text of the reports," Dr. Brueckner told CAP TODAY. "Then it automatically
sends these cases to the registries, where they need not be transcribed.
They enter right into the system."
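As a rough illustration of the kind of lexical screening Dr. Brueckner describes, the sketch below scans report text for cancer terms and skips mentions that follow a negation phrase. It is a minimal sketch, not E-Path's method: the term lists, the negation cues, and the function name `is_reportable` are all hypothetical and were chosen only for this example.

```python
import re

# Hypothetical term lists for illustration only; not E-Path's actual lexicon.
REPORTABLE_TERMS = ["carcinoma", "adenocarcinoma", "melanoma", "lymphoma", "sarcoma"]
NEGATION_CUES = ["no evidence of", "negative for", "without evidence of"]


def is_reportable(report_text: str) -> bool:
    """Return True if any sentence names a reportable term that is not negated."""
    text = report_text.lower()
    # Split into rough sentences so a negation only affects its own sentence.
    for sentence in re.split(r"[.;\n]+", text):
        for term in REPORTABLE_TERMS:
            match = re.search(r"\b" + re.escape(term) + r"\b", sentence)
            if not match:
                continue
            # Ignore the mention if a negation cue appears before it.
            prefix = sentence[:match.start()]
            if any(cue in prefix for cue in NEGATION_CUES):
                continue
            return True
    return False


print(is_reportable("Invasive ductal carcinoma, grade 2."))
print(is_reportable("Benign nevus. Negative for melanoma."))
```

A simple word search like this would miss many phrasings and flag others wrongly, which is presumably why a production lexicon has to weigh more than individual words.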
Automating cancer data identification, collection,
and reporting frees up busy tumor registrars to do other things, an important
benefit because of how hard it is to find and employ qualified tumor registrars,
Kosary says. "It allows us to use them in other areas of cancer registration
other than wading through massive volumes of pathology reports."
Dr. Brueckner says they also find that the quality and
timeliness of the reporting are far better: "quality in terms of accuracy
and completeness of reporting and timeliness in that this takes place
in a day or two rather than several months when it is done manually."
E-Path, which AIM introduced about eight years ago,
has evolved to the point today where it accurately detects about 99 percent
of all reportable cancer cases during its automated review of pathology
reports, Dr. Brueckner says. That compares with a sensitivity that may
be as low as 70 percent in a manual process, he says. And the software
is about 98 percent specific, meaning only about 2 percent of nonreportable
cases are false-positives: cases the software identifies as being reportable
when they are not.
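The two rates can be made concrete with a short calculation. The sketch below uses the standard definitions of sensitivity and specificity; the case counts are hypothetical, picked only so that the results line up with the percentages quoted above, and are not figures from AIM or the NCI.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of truly reportable cases that were flagged as reportable."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Share of nonreportable cases that were correctly left unflagged."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical batch: 1,000 reportable and 9,000 nonreportable reports.
automated_sensitivity = sensitivity(true_positives=990, false_negatives=10)
manual_sensitivity = sensitivity(true_positives=700, false_negatives=300)
automated_specificity = specificity(true_negatives=8820, false_positives=180)

print(f"Automated sensitivity: {automated_sensitivity:.0%}")
print(f"Manual sensitivity:    {manual_sensitivity:.0%}")
print(f"Automated specificity: {automated_specificity:.0%}")
```

In this made-up batch, the manual process would miss 300 reportable cases where the automated review misses 10, while the automated review would wrongly flag 180 of the 9,000 nonreportable reports.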
CAP member Murray Treloar, MD, chief of laboratory
and genetic services for Lakeridge Health Corp. in Oshawa, Ontario, has
experienced these high sensitivity and specificity rates firsthand. Though
his laboratory is not participating in the SEER project, he has been using
E-Path for about five years, beginning as a beta site for the software
when it was introduced. Before that, his laboratory had been using an
"error-prone" manual process to identify and report cancer data it was
required to submit to the Ontario cancer registry. "We used a manual process
of culling them on a monthly basis," Dr. Treloar explains. "Pathologists
were reminded regularly that they were supposed to indicate cases that
needed to be sent into the registry. The secretarial staff also were on
alert to recognize cases. A senior secretary spent one to two days at
the end of every month collating cases. Then we packaged and mailed [the
data] to the registry."
After installing E-Path, the laboratory’s time commitment
devoted to fulfilling its reporting requirements "went to zero," Dr. Treloar
says. "And we had a much better capture rate [of cancer data]. We were
missing about 30 percent of reportable cases."
The NCI’s Steve Peace says the lexicon that AIM created
and cultivated for E-Path has matured into a highly sophisticated tool.
"It started out with a basic word-search capability. It did an automatic
search of the pathology reports for words and word-string combinations
to match for cancer cases. It has evolved into a much more sophisticated
lexicon that looks at not just combinations of words and word strings
but a number of other different factors that more accurately identify
what we need."
"It does a lot of the ‘thinking’ to identify
the cancer cases that need to be reported and incorporated in the [SEER]
database," Peace says.
For example, E-Path can make sense of complex pathology
reports that are produced in different ways by different pathologists,
he explains. "The software works through all of those different variations
to come to an accurate conclusion, whether the word is spoken in XYZ format
or YZX format," he says.
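One simple way to tolerate the word-order differences Peace mentions is to compare sets of words rather than exact phrases. The sketch below is an assumption about how such matching could work, not a description of E-Path; the helper names `tokens` and `matches_concept` and the sample concept are hypothetical.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase the text and return its distinct alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def matches_concept(report_phrase: str, concept_words: set[str]) -> bool:
    """True if every word of the concept appears in the phrase, in any order."""
    return concept_words <= tokens(report_phrase)


# Hypothetical concept, expressed as an unordered set of words.
concept = {"invasive", "ductal", "carcinoma"}

print(matches_concept("Invasive ductal carcinoma", concept))
print(matches_concept("Carcinoma, ductal, invasive type", concept))
print(matches_concept("Ductal hyperplasia", concept))
```

The first two phrasings match the same concept even though the words arrive in a different order; the third does not, because two of the required words are absent.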
Although they are not built into E-Path, the CAP’s
cancer protocols and short-form checklists are playing an indirect but
important role in limiting the variations the E-Path software has to accommodate
as the laboratories participating in the SEER project report their data,
Peace says. Many U.S. labs, including those participating in the SEER
project, use these structured reports to ensure that their cancer data
identification, collection, and reporting processes are standardized,
Peace says. "AIM software and the electronic reporting system take advantage
of those protocols to allow more sophisticated searching and identification
of cases and the components of the pathology reports for the things we
need to collect."
One of the main challenges the SEER program has had
to contend with since implementing its plan to automate cancer data collection
and reporting has been a lack of uniformity in collection and reporting
methods among the laboratories and cancer registries. The NCI has met
that challenge by laying the groundwork for standardized electronic pathology
reporting for cancer surveillance, Peace says.
"As a leader in electronic reporting, we’ve
had to try to set those standards and work through the problems and difficulties
that come with doing that," he says. "CAP’s protocols have helped with
that in terms of establishing more consistent structures and formats for
reporting cancer data."
Dr. Brueckner and CAP representatives are discussing
incorporating the cancer protocols and checklists into E-Path, which would
entail expanding the software’s lexicon to include the language used in
the protocols. "Having a standard makes our life easier, so we would welcome
it. It would allow for more complete and more consistent data collection,"
says Dr. Brueckner.
The SEER program’s automated disease surveillance is
one of many applications for the CAP’s terminology SNOMED CT, and talks
are ongoing between AIM and the CAP about incorporating SNOMED CT into
E-Path. AIM now uses the International Classification of Diseases for
Oncology, Third Edition (ICD-O-3), to automatically encode cancer cases
in E-Path. ICD-O-3 is embedded within SNOMED CT either through a one-to-one
integration of the morphology concepts or through a mapping of the ICD-O-3
topography concepts to appropriate codes in SNOMED CT.
Within the next 12 months, NCI expects to have all
of the SEER regions in some phase of implementation, says Peace. "Our
program is trying to make sure that E-Path reporting as a component of
all of the electronic reporting processes is in place in the overall SEER
program. It’s been a gradual process, but when you look at the larger
scope of things, [the program] has actually moved quite quickly."
Tony Sullivan is a writer in Wheaton, Ill.