Arraying the data
Although tissue microarrays have great potential to sharply reduce the time and cost involved in conducting what would otherwise be lengthy and expensive research, there is a catch: They pose great challenges in terms of capturing, organizing, updating, exchanging, and analyzing the data they produce.
"You could make the case that the information-management requirements of the data (numbers, text, and images) from a single tissue microarray over time rival those of a small pathology department," says Bruce Friedman, MD, professor of pathology and director of clinical support information systems, University of Michigan Health System, Ann Arbor.
Inspired by revolutionary work in genetics research, tissue microarrays build on a pre-existing technology known as gene expression microarrays, or gene arrays. Gene arrays allow researchers to "extract DNA and put thousands of tiny little dots on a chip, ending up with a high-throughput system, which, thanks in large part to automation, can do multiple tests in parallel," explains Dr. Friedman.
A seminal article in the journal Nature Medicine (Kononen J, et al. 1998;4:844-847) first described the related technology of tissue microarrays. The article was authored by a team of researchers led by Olli Kallioniemi, MD, PhD, at the National Human Genome Research Institute at the National Institutes of Health.
The NIH group came up with the ingenious idea of taking as many as 1,000 paraffin blocks of samples from archived tissues, punching out small cores from each block, and re-embedding those cores in a new paraffin block. This results in a tissue array of up to 1,000 different tissue samples. It’s like taking a cross-section of 1,000 different specimens and capturing it in one small location and in a way that makes it easy to study.
"You can then obtain a section of the new block and stain for, say, tumor antigens," explains Dr. Friedman. "And what you have is also a high-throughput array, such that any testing you perform is 500 times faster and more efficient."
"This is an exciting technology," says Jules Berman, MD, PhD, "one that involves using slides containing up to a thousand specimens and performing large research studies at once and in miniature." Dr. Berman is program director for pathology informatics in the National Cancer Institute’s Cancer Diagnosis Program.
What does this dramatic increase in productivity mean? Among other possibilities, it means identifying possible cancer biomarkers and checking their validity in many samples, across many cases, very quickly. "If you’re looking for a tumor marker," says Dr. Friedman, "and you’re using an automated system so you don’t have to laboriously inspect every core manually but can scan for specific staining patterns, a microarray allows you to, in effect, run hundreds of experiments simultaneously."
It also means pathologists have a way to build on gene array research. The two technologies, gene arrays and tissue microarrays, complement one another. "Once genes of interest are identified in a gene array study," says Dr. Berman, "their value as biomarkers can be tested using tissue microarrays."
Dr. Berman offers the following hypothetical example. A molecular biologist may perform gene array studies on several prostate cancers for which history has been obtained that allows the researcher to categorize each tumor as clinically indolent or clinically aggressive. Analysis of the gene array expression patterns may identify a few genes that have expression patterns that are different in the two clinically distinct types of prostate cancers. Once candidate genes are identified, the researcher may test their biologic relevance in a tissue microarray containing large numbers of samples of prostate cancers with well-characterized clinical information. By examining the expression of candidate genes in a tissue microarray, the researcher can validate or invalidate the hypothesis that the gene’s expression is a marker of the tumor’s clinical behavior. In this scenario, the gene array is used as a discovery, or hypothesis-generating, tool, and the tissue microarray is used to test the hypotheses generated by the gene array study.
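The validation step in the example above can be sketched in a few lines of Python. This is only an illustration, assuming each core on the tissue microarray is scored as marker-positive or marker-negative and each tumor is already classified as aggressive or indolent; the function name and the choice of Fisher's exact test are assumptions made here, not a method described by Dr. Berman.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Hypothetical layout: rows are aggressive / indolent tumors,
    columns are marker-positive / marker-negative cores.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x positives in row 1
        # when the row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is no more likely than the observed one.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
```

A small p-value would support the hypothesis that the candidate gene's expression tracks clinical behavior; a large one would count against it. A real study would also need to account for multiple candidate genes and for repeated cores from the same patient.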
"One of the planned projects for the NCI’s Cooperative Prostate Cancer Tissue Resource will be to prepare prostate cancer tissue microarrays that can be used in just these kinds of studies," says Dr. Berman.
A need for standards
Though the community of academic and corporate laboratories working with tissue microarrays numbers only a hundred or so, it is growing rapidly. The absence of standards to allow the data generated to be exchanged or even assembled into a single file at any institution, however, could limit the potential of this technology.
Researchers working with tissue microarrays operate more or less independently, using a variety of different formats, procedures, and data-collection and organization strategies to complete their studies. And they do not typically make special attempts to share their results, other than in a traditional manner. "They go forward, do their research, come up with some sort of observation based on the slide and the data and, hopefully, get published," says Dr. Berman. "Their publications contain only summary data. Their actual measurements are never shared."
Achieving agreement on simple standards will greatly enhance the value of this technology for collaborative research projects that would otherwise stop at the traditional research boundaries. "What we want to be able to do," says Dr. Berman, "is exchange the data, share it, and merge it into large datasets, to distribute multiple copies of one tissue microarray to multiple labs supporting a wide range of studies, and then access all the data from different labs." (See "Collaborating over the Web," page 62.)
The standards needed are, in many ways, basic. "We have to agree on very simple things, like how we’re going to identify a file as a tissue microarray file, the name of its creator, the date it was created or modified, where the clinical data starts and stops and how it is organized, where the image data starts and stops, how to identify specimens uniquely while maintaining patient confidentiality, things like that," Dr. Berman says.
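To make the list concrete, here is a rough Python sketch of what such a file might carry. Everything in it is hypothetical: the JSON layout, the field names, the "tissue-microarray" marker, and the use of a keyed hash to produce specimen identifiers are assumptions chosen for illustration, not part of any agreed standard.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass, field
from typing import List

FILE_TYPE = "tissue-microarray"  # hypothetical marker that identifies the file type


def pseudonymize(specimen_id: str, key: bytes) -> str:
    """Derive a stable identifier that does not expose the original ID.

    Assumes the key is kept private by the institution that holds the specimens.
    """
    digest = hmac.new(key, specimen_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


@dataclass
class Core:
    specimen: str   # pseudonymous specimen identifier
    row: int        # position of the core on the array
    column: int
    diagnosis: str  # example clinical field


@dataclass
class TmaFile:
    creator: str
    created: str    # ISO 8601 date
    modified: str   # ISO 8601 date
    clinical_data: List[Core] = field(default_factory=list)
    image_data: List[str] = field(default_factory=list)  # references to image files
    file_type: str = FILE_TYPE

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)


def is_tma_file(text: str) -> bool:
    """Return True if the text parses as JSON and carries the file-type marker."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError:
        return False
    return isinstance(doc, dict) and doc.get("file_type") == FILE_TYPE
```

Keeping the clinical data and the image data in separately named sections answers the "where it starts and stops" question directly, and the keyed hash lets the same specimen carry the same identifier in every file without revealing the patient-linked ID.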
A parallel might be drawn, Dr. Berman suggests, with the informal process that shaped the standards underlying a much earlier and simpler technology, that of writing a traditional letter. At some point when this medium was becoming established as a prevalent means of communication, many simple conventions became generally accepted: Writers would put the date in the upper left or right corner, then, a bit lower, the name of the person to whom the letter was addressed, followed by the address of the recipient, a salutation, the body of the letter, and so on, to the close. The evolution of these formatting conventions made the technology of the letter more easily and widely understood and thus more useful. But in the case of tissue microarrays, the challenge is understandably more complex.
In the world of gene arrays, by contrast, standards are less of an issue, in part because large manufacturers, Affymetrix being the largest, make the arrays most researchers use. "Because there is a standard producer of these arrays, there are certain database schemas, certain formats associated with the Affymetrix arrays, that we already understand," says Thomas Wu, MD, PhD, a bioinformatics scientist at Genentech.
Database tools are key
Another crucial area in which work must be done to fulfill the promise of tissue microarrays is database design. "The major problem is going to be specimen and information tracking, and most pathology laboratories do not have database systems that can deal with it," says Kenneth Hillan, FRCS, MRCPath, director of pathology at Genentech. "A lot of thought and energy will have to be devoted to the design of databases, both for putting samples into the tissue array, and then for tracking the data that comes out of the tissue array."
In some ways, the data-management challenge for tissue microarrays is even more complex than it is for gene microarrays. "For gene microarrays, the output is generally a single number, a measure of an intensity level," Dr. Wu says. "But for tissue microarrays, the output, the description of each spot on the array, tends to be much more complex; the data is much more complex." For example, each spot on a tissue microarray is a collection of many different types of cells rather than a single cDNA value, and the representation of this fact in a database must, of necessity, be more complex. This complexity is compounded because a given tissue sample may be used in many different arrays, and numerous types of studies may be conducted on any given array.
Researchers obviously must be able to do more than keep track of their data. They also need to work with it, to analyze and query it. "You also want to be able to assemble all the data, correlate it, perhaps look at it across multiple tissue microarrays, at multiple different genes or gene products," explains Dr. Hillan. "That’s a very complex thing to do, and to achieve it, you will want to use a powerful relational database." Basic desktop business software tools like Microsoft Excel and Access, commonly used in the early phases of this kind of research, soon reach the limits of their ability to handle the demands of such work.
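The relationships described here (one specimen placed on many arrays, many studies run on one array) map naturally onto relational tables. The following sketch uses Python's built-in sqlite3 module with an in-memory database; the table names, column names, and demo rows are all invented for illustration and do not reflect Genentech's design or any published schema.

```python
import sqlite3

SCHEMA = """
CREATE TABLE specimen (id INTEGER PRIMARY KEY, tissue TEXT NOT NULL);
CREATE TABLE array_block (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- A core places one specimen at one position on one array,
-- so a specimen can appear on many arrays.
CREATE TABLE core (
    id INTEGER PRIMARY KEY,
    specimen_id INTEGER NOT NULL REFERENCES specimen(id),
    array_id INTEGER NOT NULL REFERENCES array_block(id),
    row_no INTEGER NOT NULL,
    col_no INTEGER NOT NULL,
    UNIQUE (array_id, row_no, col_no)
);
-- Many studies (for example, different stains) can be run on one array.
CREATE TABLE study (
    id INTEGER PRIMARY KEY,
    array_id INTEGER NOT NULL REFERENCES array_block(id),
    marker TEXT NOT NULL
);
CREATE TABLE result (
    study_id INTEGER NOT NULL REFERENCES study(id),
    core_id INTEGER NOT NULL REFERENCES core(id),
    score INTEGER NOT NULL,
    PRIMARY KEY (study_id, core_id)
);
"""


def build_demo() -> sqlite3.Connection:
    """Create an in-memory database holding a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO specimen VALUES (?, ?)",
                     [(1, "prostate"), (2, "breast")])
    conn.executemany("INSERT INTO array_block VALUES (?, ?)",
                     [(1, "TMA-A"), (2, "TMA-B")])
    # Specimen 1 is cored onto both arrays; specimen 2 only onto TMA-A.
    conn.executemany("INSERT INTO core VALUES (?, ?, ?, ?, ?)",
                     [(1, 1, 1, 1, 1), (2, 2, 1, 1, 2), (3, 1, 2, 1, 1)])
    conn.executemany("INSERT INTO study VALUES (?, ?, ?)",
                     [(1, 1, "marker-X"), (2, 2, "marker-X"), (3, 1, "marker-Y")])
    conn.executemany("INSERT INTO result VALUES (?, ?, ?)",
                     [(1, 1, 3), (1, 2, 0), (2, 3, 2), (3, 1, 1), (3, 2, 2)])
    return conn


def scores_for_marker(conn: sqlite3.Connection, marker: str):
    """Collect every score for one marker across all arrays, by specimen."""
    rows = conn.execute(
        """
        SELECT s.id, a.name, r.score
        FROM result r
        JOIN study st ON st.id = r.study_id
        JOIN core c ON c.id = r.core_id
        JOIN specimen s ON s.id = c.specimen_id
        JOIN array_block a ON a.id = c.array_id
        WHERE st.marker = ?
        ORDER BY s.id, a.name
        """,
        (marker,),
    )
    return rows.fetchall()
```

Calling `scores_for_marker(build_demo(), "marker-X")` pulls together results for the same marker from two different arrays in a single query, which is the kind of cross-array correlation that becomes unwieldy in a spreadsheet. The sketch does not enforce that a result's core belongs to the array its study was run on; a production design would need that constraint.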
Dr. Hillan’s group at Genentech itself needs to move from these relatively simple tools to something more robust: "We have used an Excel application that was custom-built for tracking tissue array information," he says. "But we’re currently in the process, in fact have actually completed the design of, a tissue array module in our Oracle database that links to all those clinical tissue specimens that we have in our human tissue database."
Finally, flexible, high-end reporting tools will be needed. "It’s one thing having this information in a database; getting information out of that database is a separate task," says Dr. Hillan. "Tools like Crystal Reports or Brio, for example, will be critical."
A significant payoff
The expectation, over the long run, is that work with tissue microarrays will help improve clinical practice, most likely in cancer treatment first. "Ultimately what we’re going to do is look at the tumor, look at the DNA, and then direct the therapy based on the genetic constituents of that tumor, which define its biologic behavior," says Dr. Friedman.
"Let’s say you have something you think is going to be an important marker for prognosis that can predict response to therapy for a given cancer," says Dr. Berman. To investigate that hypothesis might take a decade because it involves collecting tissues from numerous patients who are at different stages of the disease: patients with precancers, patients in early stages, patients with indolent cancer forms, patients who died in less than five years, patients who resisted therapy, and so on.
The study process also would be extremely costly. To collect those tissues, conduct studies on them using the markers of interest, and determine their validity could cost millions of dollars and take years to complete. "We need to have methods that will accelerate progress toward developing diagnostic and prognostic markers that can be used in clinical laboratories," says Dr. Berman. "Tissue microarrays have enormous potential benefit in this area."
"If a researcher can work with a tissue microarray that has been designed so that it has large numbers of samples of, say, breast cancers, collected as different histologic types, with different grades of the cancer, different clinical types, different prognostic types, all on one slide, he or she can conduct an otherwise lengthy and costly study at once with a small amount of reagent," adds Dr. Berman.
"You’ve got the entire experiment on a slide, and you can start to see whether or not your new markers have the kind of expression patterns that are useful," he says. "You can make large numbers of observations very quickly on a tissue microarray, and that is one of the key benefits of this technology."
Predictions are that tissue microarrays will cost several thousand dollars, but they have the potential to save hundreds of thousands of dollars. "Collecting tissue, obtaining consent as appropriate, data acquisition, quality control, and so on, all of that is done up front, so researchers will be paying for the finished slides and associated data," says Dr. Berman. "If they can acquire a well-designed tissue microarray slide for $3,000, it might be a tremendous windfall for them."
Eric Skjei is a freelance writer in Stinson Beach, Calif.
Dr. Friedman hosts the annual AIMCL (Automated Information Management in the Clinical Laboratory) conference at which pathology informatics issues are addressed. The 2001 event, which will include sessions on standards and tissue microarrays, will be held May 30-June 1 in Ann Arbor. For more information, go to: www.pathology.med.umich.edu/education_training_and_cme.htm