If W. Edwards Deming was right that variation is the enemy of quality, things should be getting far friendlier in the quality arena for predictive breast cancer testing.
Like its predecessor for HER-2, the new American Society of Clinical Oncology-CAP guideline for estrogen receptor and progesterone receptor immunohistochemistry seeks to stamp out the common causes of variation in ER/PgR testing known to undermine its accuracy.
The guideline’s ultimate goal is to help physicians do a better job of identifying those patients with breast cancer who may well benefit from endocrine therapy.
That group, says Elizabeth H. Hammond, MD, co-lead author of the guideline, will include patients whose tumors have one percent or more invasive breast cancer cells with any degree of nuclear staining for ER or PgR antibodies. “One percent is about the lowest level that any pathologist can detect,” says Dr. Hammond, professor of pathology at Intermountain Healthcare, Salt Lake City.
To put that new threshold into perspective, Dr. Hammond notes that in some places a 10 percent or even a 30 percent threshold has been used for reporting a positive result.
In accordance with the new guideline, says Patrick Fitzgibbons, MD, the pathologist must report the percentage of cells that are immunoreactive, and an interpretation of positive or negative. Historically, the two haven’t always been reported. Some pathologists would report the percentage of cells staining but not indicate whether the cancer was ER- or PgR-positive or -negative, says Dr. Fitzgibbons, a co-author of the guideline and pathologist at St. Jude Medical Center, Fullerton, Calif. Similarly, some pathologists would report only that the tumor was positive or negative without reporting the percentages. “In some cases, tumors would be improperly classified as ER-negative when five percent or more of the cells were immunoreactive,” he says.
Under the guideline, pathologists are also required to report the average intensity of the staining (weak, moderate, strong) compared with the assay’s external controls. Pathologists have the option of adding a composite score (Allred, H, or Quick scores), which is calculated based on the percentage of immunoreactive breast cancer cell nuclei and the intensity of the staining.
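The required report elements can be pictured as a small record. The Python sketch below is illustrative only. The class name `ReceptorReport`, its field names, and the wording of `summary` are hypothetical and do not come from the guideline; only the one percent threshold and the three intensity labels are taken from the description above.

```python
from dataclasses import dataclass
from typing import Optional

# Threshold and intensity labels as described in the article; all names
# and the report wording are hypothetical illustrations.
POSITIVE_THRESHOLD_PERCENT = 1.0
INTENSITIES = ("weak", "moderate", "strong")


@dataclass(frozen=True)
class ReceptorReport:
    marker: str                       # "ER" or "PgR"
    percent_staining: float           # percent of invasive tumor nuclei staining
    average_intensity: Optional[str]  # None when no nuclei stain

    def __post_init__(self) -> None:
        if not 0 <= self.percent_staining <= 100:
            raise ValueError("percent_staining must be between 0 and 100")
        if (self.average_intensity is not None
                and self.average_intensity not in INTENSITIES):
            raise ValueError(f"average_intensity must be one of {INTENSITIES}")

    @property
    def interpretation(self) -> str:
        """Positive when at least one percent of nuclei stain, at any intensity."""
        if self.percent_staining >= POSITIVE_THRESHOLD_PERCENT:
            return "positive"
        return "negative"

    def summary(self) -> str:
        """Give the percentage and the interpretation together, never just one."""
        intensity = self.average_intensity or "none"
        return (f"{self.marker}: {self.interpretation}, "
                f"{self.percent_staining:g}% of nuclei staining, "
                f"average intensity {intensity}")


if __name__ == "__main__":
    print(ReceptorReport("ER", 5, "weak").summary())
```

Keeping the interpretation as a value derived from the percentage, rather than a separately typed field, is one way to avoid the mismatch Dr. Fitzgibbons describes, in which a tumor with five percent staining was reported as negative.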
With the one percent threshold for ER/PgR positivity now in place, one might wonder: Why not just treat everyone? Based on a large body of level-one evidence, says Johns Hopkins oncologist Antonio Wolff, MD, co-lead author of the ER/PgR guideline, patients with ER- and PgR-negative disease appear to get no benefit from endocrine therapy.
The evidence in support of the one percent cutoff comes from research by Washington University pathologist D. Craig Allred, MD, one of the guideline’s co-authors and a member of its steering committee, and colleagues (Harvey JM, et al. J Clin Oncol. 1999;17:1474–1481). Guideline co-author David Hicks, MD, says the researchers took tumors tested by IHC’s predecessor, ligand-binding assays, and performed ER testing on them using IHC. “They found that if the Allred score... was below three, the benefit from endocrine therapy appeared to be lost.” A score of three equates to either one to 10 percent of cells showing weak staining or as few as one percent of invasive tumor cells showing moderate staining, says Dr. Hicks, professor of pathology and laboratory medicine at the University of Rochester (NY) Medical Center. This has been simplified to one percent or more of tumor cells staining for ER as an indication that the patient is a candidate for endocrine therapy.
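For readers who want to see how such a composite score is put together, here is a minimal Python sketch of the Allred calculation as it is commonly published: a proportion score from 0 to 5 plus an intensity score from 0 to 3. The bin edges (under one percent, one to 10 percent, up to one-third, up to two-thirds, above two-thirds) are recalled from the general literature, not from this article, and how a value falling exactly on a bin edge is assigned is an assumption made here. The function names are hypothetical.

```python
INTENSITY_SCORE = {"weak": 1, "moderate": 2, "strong": 3}


def proportion_score(percent: float) -> int:
    """Map the percent of staining nuclei to a 0-5 proportion score.

    Bin edges follow the commonly published Allred scheme; the handling
    of values exactly on an edge is an assumption of this sketch.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent == 0:
        return 0
    if percent < 1:
        return 1
    if percent <= 10:
        return 2
    if percent <= 100 / 3:
        return 3
    if percent <= 200 / 3:
        return 4
    return 5


def allred_score(percent: float, intensity: str) -> int:
    """Total score: proportion score plus intensity score (0, or 2 through 8)."""
    ps = proportion_score(percent)
    if ps == 0:
        return 0  # no staining nuclei, so there is no intensity to add
    return ps + INTENSITY_SCORE[intensity]


def below_benefit_cutoff(percent: float, intensity: str) -> bool:
    """True when the score is below three, the level Dr. Hicks cites."""
    return allred_score(percent, intensity) < 3


if __name__ == "__main__":
    print(allred_score(5, "weak"))        # 1-10 percent, weak staining
    print(allred_score(0.5, "moderate"))  # under one percent, moderate staining
```

Both example calls land on a total of three, which matches the two routes to a score of three that Dr. Hicks describes, allowing for the rounding of "under one percent" to "as few as one percent."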
The guideline advises testing the ER/PgR status of all newly diagnosed invasive breast cancers and recurrent disease (testing for ductal carcinoma in situ is optional).
Testing recurrent breast cancer is important for more than confirming the status of the original tumor. An ER-negative cancer can turn ER-positive, but that is an uncommon occurrence, oncologist Kent Osborne, MD, said in a presentation last fall at CAP ’09. Dr. Osborne recalls caring for a patient years ago whose breast cancer had tested ER-negative on three separate occasions. “With lots of intervening therapies the patient converted to positive and had a nice remission on hormonal therapy,” says Dr. Osborne, a guideline co-author and director of the Lester and Sue Smith Breast Center at Baylor College of Medicine, Houston. An ER-positive tumor becoming negative is a more common occurrence; with intervening tamoxifen therapy it happens 20 percent to 25 percent of the time, Dr. Osborne told CAP TODAY.
The guideline addresses another major cause of inaccuracy, which stems from improper specimen handling in the preanalytical phase. In a CAP ’09 presentation on the guideline, Dr. Hammond reported that a study she and colleagues conducted at Intermountain Healthcare’s core laboratory used the day of the week surgery was performed as a proxy for specimen fixation times. As hypothesized, the study showed that ER and PgR test results varied by day of the week patients had surgery and by hospital. Saturday cases showed the highest rate of ER-negative results, with Friday coming in second (Nkoy FL, et al. Arch Pathol Lab Med. 2010;134:606–612).
“It’s unlikely,” Dr. Hammond says, “that the analytical part of the test was different, as the ER tests from all hospitals were done on the same equipment with the same staff.” What was different was how the tissue was handled before it arrived at the lab, says Dr. Hammond, referring to when the breast tissue sample was placed into fixative—and, probably, in some cases, how.
To produce optimal testing accuracy, the guideline recommends breast specimens be placed in 10 percent neutral buffered formalin no later than one hour—but ideally much sooner—after being removed from the patient. This tissue collection time should be recorded on the tissue specimen requisition. The fixation duration, which should be recorded on the pathology report, should be at least six hours and not longer than 72 hours. The fixation time requirements also apply to needle core biopsies.
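These time limits lend themselves to a simple automated check at accessioning. The Python sketch below assumes a lab records three timestamps per specimen: removal from the patient, placement in formalin, and the end of fixation. The function name, parameter names, and message wording are hypothetical; the one-hour, six-hour, and 72-hour limits are the ones stated above.

```python
from datetime import datetime, timedelta
from typing import List

# Limits as stated in the article.
MAX_TIME_TO_FIXATION = timedelta(hours=1)
MIN_FIXATION_DURATION = timedelta(hours=6)
MAX_FIXATION_DURATION = timedelta(hours=72)


def fixation_issues(removed_at: datetime,
                    in_fixative_at: datetime,
                    fixation_ended_at: datetime) -> List[str]:
    """Return a list of timing problems; an empty list means none were found."""
    if not removed_at <= in_fixative_at <= fixation_ended_at:
        raise ValueError("timestamps must be in chronological order")

    issues = []
    time_to_fixation = in_fixative_at - removed_at
    fixation_duration = fixation_ended_at - in_fixative_at

    if time_to_fixation > MAX_TIME_TO_FIXATION:
        issues.append("time to fixation exceeded one hour")
    if fixation_duration < MIN_FIXATION_DURATION:
        issues.append("fixation shorter than six hours")
    if fixation_duration > MAX_FIXATION_DURATION:
        issues.append("fixation longer than 72 hours")
    return issues


if __name__ == "__main__":
    # Hypothetical specimen: 18 minutes to formalin, 24 hours of fixation.
    removed = datetime(2010, 4, 6, 8, 0)
    print(fixation_issues(removed,
                          removed + timedelta(minutes=18),
                          removed + timedelta(minutes=18, hours=24)))
```

A check like this only works if the times are written down in the first place, which is why the recording requirements matter as much as the limits themselves.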
Unfortunately, complying with the time-to-fixation requirement isn’t always as simple as having people in the OR place a specimen into fixative and send it to the lab. “Most people in remote locations often send a lump of breast tissue in a bucket of formalin to a lab,” Dr. Hammond says. Yet “formalin only penetrates tissue at the rate of one mm per hour.” If the laboratory receives a large piece of tissue containing tumor, it’s going to take time for the formalin to penetrate the tissue before fixation can even begin.
In such cases, Dr. Hicks says, “you have to cut into the tissue so the tumor is exposed to formalin.” That “means having a pathologist or pathologist assistant available to deal with the specimen right away, or alternatively—and this is a less desirable option—training OR personnel to properly cut the specimen so the formalin can penetrate, while preserving the integrity of the surgical margins.”
If the person who cuts the specimen is not careful, however, he or she could compromise the specimen. Thus, Dr. Hicks questions whether breast surgery should be done at remote locations where it’s not possible to incise the tumor and begin fixation in a timely manner.
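To see why an uncut specimen is a concern, the one-millimeter-per-hour figure Dr. Hammond cites can be turned into rough arithmetic. The Python sketch below treats penetration as a constant rate from the nearest exposed surface, which is a deliberate simplification made here, not a model from the guideline. The function names and the example numbers are hypothetical.

```python
# Rate quoted by Dr. Hammond; treating it as constant is a simplifying
# assumption of this sketch.
FORMALIN_MM_PER_HOUR = 1.0


def hours_for_formalin_to_reach(depth_mm: float,
                                rate_mm_per_hour: float = FORMALIN_MM_PER_HOUR) -> float:
    """Hours for formalin to travel from the nearest exposed surface to the tumor."""
    if depth_mm < 0:
        raise ValueError("depth_mm cannot be negative")
    if rate_mm_per_hour <= 0:
        raise ValueError("rate_mm_per_hour must be positive")
    return depth_mm / rate_mm_per_hour


def effective_delay_hours(minutes_to_formalin: float, depth_mm: float) -> float:
    """Hours from removal until formalin first reaches the tumor."""
    if minutes_to_formalin < 0:
        raise ValueError("minutes_to_formalin cannot be negative")
    return minutes_to_formalin / 60 + hours_for_formalin_to_reach(depth_mm)


if __name__ == "__main__":
    # Hypothetical specimen placed in formalin 18 minutes after removal.
    print(effective_delay_hours(18, 20))  # tumor 20 mm below the surface, uncut
    print(effective_delay_hours(18, 0))   # tumor exposed by incision
```

Under these assumptions, a tumor sitting 20 mm inside an intact specimen waits roughly 20 hours for formalin to arrive even when the container was filled promptly, while an incised specimen is limited only by how quickly it reached the formalin.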
Dr. Fitzgibbons is optimistic that the majority of freestanding imaging centers and other places will be able to find a way to meet the fixation requirements. He acknowledges, though, that a few remote settings might have difficulty complying with the guideline.
While the CAP’s Laboratory Accreditation Program will enforce essential aspects of the ER/PgR guideline, starting with the 2010 edition of the accreditation checklists (scheduled to be published in June), a lab that can’t get its clients to comply with recording the interval to fixation will not be at risk of losing accreditation. Says Stephen Sarewitz, MD, chair of the CAP Checklists Committee: The “CAP IHC and Surgical Pathology committees felt it would be difficult for labs to enforce the time frame for putting the specimen into fixative after it’s removed from the patient.”
That’s not to say that labs and pathologists shouldn’t do their best to educate their clients and hospitals about the fixation guideline. At Intermountain Healthcare, such efforts have worked out well, Dr. Hammond said in her CAP ’09 talk. For one, the breast surgeon talked to the operating room staff about why the interval to fixation is so important. If the grossing room staff receives a specimen without the times recorded, the pathologist assistant calls into the operating room to request the information. “It only takes doing that a few times until the OR staff start writing down the time.” The lab’s reminders have reduced the time to fixation from a mean of less than an hour to 18 minutes.
Having the fixation-related “time points,” Dr. Hicks says, “could be invaluable in troubleshooting” unexpected results, such as an ER-negative, low-grade infiltrating ductal carcinoma. “We could then look and see if a delay from collection to the start of fixation might be responsible for a false-negative test result.”
Dr. Allred offers this: “In the extreme cases where tissue wasn’t fixed promptly enough, you can identify the poorly preserved tissue under the microscope when using a routine stain [and] not just IHC stains.” The guideline also talks about using normal breast cells in the sample as internal positive controls. “Negative normal cells are possible but rare,” Dr. Allred says, so that “raises a red flag.”
“If the tumor is negative,” Dr. Fitzgibbons says, “and the normal ductal cells are negative or only very weakly positive, that could be either improper tissue fixation or the assay is too weak.”
Repeating the test in that situation is a reasonable option, says Dr. Hammond, though if a stronger assay is used, you’re moving outside the parameters of your testing and controls. A repeat can still catch a simple failure, such as the automated instrument not applying the drop of antibody to the test.
If repeating the test doesn’t do the trick, then what? Dr. Hammond recommends repeating the test on another tissue block. If there’s no more tissue available, “the pathologist should report that he or she doesn’t know if the tumor is positive or negative and maybe the clinician treats the patient anyway. If I were the patient, I’d want a trial of tamoxifen or an aromatase inhibitor. Or I’d ask for another biopsy to be done, if possible.”
The guideline calls for the laboratory to include a note in the report to alert the clinician to unexpected negatives or positives. For example, Dr. Hicks says it’s rare to see a PgR-positive, ER-negative tumor. These were found perhaps under five percent of the time when testing with older antibodies, he says. As antibodies improved and became more sensitive, that percentage dropped to less than one percent, meaning that with the older antibodies, the ER was truly positive but there was a problem with the assay’s sensitivity. And using real-time PCR, which is even more accurate, “a PgR-positive, ER-negative tumor is a very rare event.”
Dr. Hicks’ observation that results vary with a test’s sensitivity points to another cause of potentially incorrect results: a test that has not been properly validated. The guideline stresses that validation must involve comparison with a test that’s been clinically validated in studies linking test results with patient outcomes.
One option for validation, according to the guideline, is to use the single FDA-cleared kit (Dako). In fact, Dr. Hammond says, “It would be ideal if the only tests pathologists used were FDA-approved and completely standardized tests.” That would make the lives of pathologists so much easier, she says, because “they would not have to go through a rigorous process of defining which antibody and what protocol to use, as they’ve had to do with ER and PgR for years.” But each laboratory still has to demonstrate that the kits work like they’re supposed to in their hands, Dr. Allred says, adding, “Many times these kits are misused.”
Another alternative is to use a ligand-binding assay, the test that IHC replaced. The LBA is the test used in the 1999 study by Dr. Allred and colleagues that provides the basis for recommending the one percent cutoff. Daniel Visscher, MD, a guideline co-author and professor and director of surgical pathology at the University of Michigan Health System, says he guesses the “gold standard” for ER/PgR would actually be the old LBA. “But you would have to have unfixed, frozen tumor to do that, and most people don’t have that.”
The evaluation of ER can also be performed by assays that assess messenger RNA, Dr. Hicks says. The Oncotype DX (Genomic Health) 21-gene recurrence score (RS) assay includes ER and PgR as part of the RS score, and the results for these single genes are now reported separately from the recurrence score. Published comparison studies between measures of ER and PgR at the level of protein by local IHC and the gene transcript by central RT-PCR testing have shown discordance rates of nine percent and 12 percent, respectively. Given the lack of published data correlating the ER and PgR individual measures within the 21-gene RS assay with clinical outcome, Dr. Hicks says, the guideline committee concluded it was premature to recommend these individual measures for assay standardization and validation.
Labs can also rely on another lab’s testing as a source of validation, if that lab’s assay complies with the ER/PgR guideline.
So how many tests will a lab have to do as part of the initial validation? Five won’t cut it, says Dr. Fitzgibbons, who notes that CAP Surveys indicate that “in many cases, labs have not done a true validation study or, if they have, they have assessed an insufficient number of cases—for example, five, which we think is insufficient for predictive marker assays.” Dr. Fitzgibbons is co-author of a paper on validating ER/PgR, which will be published in the June 2010 issue of Archives of Pathology & Laboratory Medicine. The paper is referenced in the guideline as providing additional information on test validation.
If a lab has an ER/PgR assay that was in use clinically before the guideline was released and has used it on at least 200 specimens, it may use those results in a validation study, according to the validation paper.
Labs initially validating any laboratory-developed or -modified test must validate 40 negative and 40 positive specimens. Ten of those specimens should be weakly positive, defined as 10 percent or less. The validation paper says the laboratory should not include more than 20 specimens in the same test run.
Labs doing the initial validation of an FDA-approved or -cleared assay have to do half that many, five of which must be weakly positive. Or they can use the verification procedure the package insert specifies.
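The case counts for the two kinds of assay can be checked mechanically before a validation study starts. The Python sketch below assumes each candidate specimen comes with a percent-staining result from the reference assay. It counts a specimen as positive at one percent or more and as weakly positive at one to 10 percent; the lower bound on "weakly positive" is an assumption carried over from the positivity threshold, since the text above gives only the upper bound. All names are hypothetical.

```python
from typing import Dict, Iterable

POSITIVE_THRESHOLD_PERCENT = 1.0
WEAK_POSITIVE_MAX_PERCENT = 10.0

# Minimum case counts as described in the article.
REQUIRED_CASES = {
    "laboratory_developed": {"positive": 40, "negative": 40, "weakly_positive": 10},
    "fda_cleared": {"positive": 20, "negative": 20, "weakly_positive": 5},
}


def validation_shortfalls(reference_percents: Iterable[float],
                          assay_type: str) -> Dict[str, int]:
    """Return how many more cases of each kind are needed (empty if none)."""
    required = REQUIRED_CASES[assay_type]
    counts = {"positive": 0, "negative": 0, "weakly_positive": 0}
    for percent in reference_percents:
        if percent >= POSITIVE_THRESHOLD_PERCENT:
            counts["positive"] += 1
            # Weakly positive cases also count toward the positive total.
            if percent <= WEAK_POSITIVE_MAX_PERCENT:
                counts["weakly_positive"] += 1
        else:
            counts["negative"] += 1
    return {kind: required[kind] - counts[kind]
            for kind in required if counts[kind] < required[kind]}


if __name__ == "__main__":
    # Hypothetical set: 40 negatives, 10 weak positives, 30 strong positives.
    candidate_set = [0] * 40 + [5] * 10 + [80] * 30
    print(validation_shortfalls(candidate_set, "laboratory_developed"))
```

The sketch covers only the case mix. It does not address the limit on specimens per test run or the level of agreement the assays must reach.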
Dr. Fitzgibbons admits that the numbers of tests the paper specifies are “somewhat arbitrary.” “But we believe,” he says, “these represent a practical number of cases that will allow labs to be reasonably certain their assays are performing as expected. The agreement between assays specified in the recommendations has been achieved in clinical trials.”
Labs should not confuse the initial test validation with ongoing assay assessment.
“You don’t have to repeat the validation study unless there’s a major change in the assay—for example, a change in the primary antibody or the antigen retrieval or detection systems,” Dr. Fitzgibbons says.
Labs must validate new antibody preparations separately, Dr. Hammond says. In particular, she cautions that “new rabbit monoclonal antibodies against ER and PgR should be used at higher dilutions than their mouse monoclonal counterparts.” If they aren’t, the laboratory may get a false-positive result. In a recent clinical trial (BIG98), a rabbit monoclonal antibody against PgR was used, which led to a falsely positive result in 30 percent of the patients in the trial.
Labs that use a fixative other than 10 percent neutral buffered formalin will also have to validate use of that fixative, even if they are using the FDA-cleared kit, Dr. Hammond says. “Under CLIA, in order to use the test and render a result, the lab would have to validate that it’s getting the same result with the alcohol fixative” as it would using neutral buffered formalin, or NBF.
“That’s doable in many situations,” Dr. Allred says, “but I think most people will just say, ‘Let’s change our fixative to 10 percent NBF.’”
The validation paper also discusses labs using trend analysis as an ongoing monitor to flag a pattern that may show all is not well with the lab’s ER results. Says Dr. Fitzgibbons, “About 80 percent of breast cancers are ER-positive, so an ER-negative rate of 30 percent or higher suggests something is wrong with your assay.” (He credits Dr. Hicks with suggesting that trend analysis be included.) Trend analysis is also a useful means of problem solving when searching for the causes of variation in results—for example, individual pathologists, day of the week.
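A minimal Python sketch of this kind of trend analysis follows. It groups results by any label a lab chooses, such as day of the week or pathologist, and flags groups whose ER-negative rate reaches the 30 percent level Dr. Fitzgibbons mentions. The function names and the sample data are hypothetical, and a real monitor would also need a minimum case count per group before raising a flag.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Alert level taken from Dr. Fitzgibbons' remark in the article.
ER_NEGATIVE_ALERT_RATE = 0.30


def er_negative_rates(cases: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """cases: (group label, ER-positive?) pairs. Returns ER-negative rate per group."""
    totals = defaultdict(int)
    negatives = defaultdict(int)
    for group, er_positive in cases:
        totals[group] += 1
        if not er_positive:
            negatives[group] += 1
    return {group: negatives[group] / totals[group] for group in totals}


def flagged_groups(cases: Iterable[Tuple[str, bool]],
                   alert_rate: float = ER_NEGATIVE_ALERT_RATE) -> List[str]:
    """Groups whose ER-negative rate is at or above the alert rate."""
    rates = er_negative_rates(cases)
    return sorted(group for group, rate in rates.items() if rate >= alert_rate)


if __name__ == "__main__":
    # Hypothetical data: (day of surgery, ER-positive?)
    sample = ([("Tuesday", True)] * 8 + [("Tuesday", False)] * 2
              + [("Saturday", True)] * 6 + [("Saturday", False)] * 4)
    print(er_negative_rates(sample))
    print(flagged_groups(sample))
```

Grouping by day of surgery mirrors the Intermountain Healthcare study described earlier, in which the weekday served as a stand-in for fixation time.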
The guideline calls for proficiency testing. “Starting in 2011,” says Checklists Committee chair Dr. Sarewitz, “CAP-accredited labs will be required to participate in PT for ER/PgR testing,” if they offer the testing. “It will be like the HER-2-required PT to ensure lab performance in that area.”
The guideline delves into the biomarkers themselves, including progesterone receptor, which does add something to the predictive picture. “If you are ER-positive but PgR-negative,” Dr. Hicks says, “you are going to have a lot more aggressive disease than if you are ER- and PgR-positive.” The combination of a positive ER and PgR predicts a better response to hormonal treatment, Dr. Allred adds.
Dr. Wolff says patients with a positive PgR only are still candidates for endocrine therapy. The guideline itself says, “There is evidence that the small proportion of patients with ER-negative/ PgR-positive disease may respond to endocrine therapy.”
On one point, everyone agrees: Current ER/PgR testing, regardless of how accurate it is, doesn’t tell the whole story. “For all practical purposes,” says Dr. Osborne, “having a correct positive ER or PgR result doesn’t guarantee that the hormonal therapy will work.”
And, says University of Michigan oncologist Daniel Hayes, MD, another co-author of the guideline, the benefit seen from hormonal therapy seems to level off at about 30 percent to 50 percent positivity. “A patient is not more likely to benefit if her tumor has 100 percent positive cells versus 50 versus 30.”
Current testing and the guideline focus only on ER alpha, leaving the “other” ER, named beta, out of the testing loop. And given that there are the alpha and beta forms of the ER gene, Dr. Hammond says, “theoretically one could be switched on and the other off, and theoretically it would be useful to test for both.” The guideline committee, however, discussed that at length, “and the consensus was there’s no value in testing for [ER beta] at this point,” based on the evidence, she says.
Dr. Osborne points out that there are six or seven papers showing that high levels of ER beta are associated with an increased chance for a response to tamoxifen. “The cumulative data,” he says, “are very suggestive that ER beta is a predictor of tamoxifen benefit, but there is no level-one evidence to support the development of a clinical test at this time.” Yet “almost every study shows that high ER beta in the face of an ER alpha-positive tumor is indicative of a more endocrine-responsive tumor, whereas low ER beta is associated with tamoxifen resistance.”
Dr. Hicks is of the view that more evidence is needed before using ER beta clinically. ER alpha is “mainly in the nucleus,” whereas ER beta can be found in the membrane of the cell or the nucleus. What’s so interesting, he says, “is that ER can signal in the nucleus or at the membrane.” And “as we begin to understand [the pathways] better, it might give us clues about endocrine treatment resistance and how to overcome it.”
Dr. Hayes says there are other theories about how to boost the predictive power of ER/PgR testing. “Some people have suggested looking at the ER/PgR status of micrometastases, but to do that you would do a bone marrow.” He notes that about 20 percent of newly diagnosed patients with breast cancer will have micromets in their bone marrow. Some of those patients have recurrences but many do not. “So it’s not clear what’s going on,” he says. Others have suggested looking instead at circulating tumor cells and performing ER/PgR testing on those. “Those are all Star War ideas,” Dr. Hayes says, “none of which have come close to panning out. So right now we are left with what’s in the guideline paper, which is criteria for doing [IHC testing] on the breast tumor.”
And those guidelines aren’t written in stone. Dr. Visscher calls them “living documents”—they’ll be adjusted as new high-level evidence becomes available.
“One thing we learned with HER-2 is that with this kind of testing,” he says, “we are still on a learning curve. So I would view the ER/PgR guidelines as more of the starting point than the end point.”
Karen Lusky is a writer in Brentwood, Tenn. At press time, the ASCO-CAP guideline was scheduled to be posted by mid-April at www.archivesofpathology.org.