College of American Pathologists

  The long, hard road to LIS data conversion


CAP Today

March 2009
Feature Story

Anne Paxton

Your business has expanded, and your current laboratory information system can’t handle the volume. Or your workflow has changed and your LIS doesn’t support it. Or perhaps the hospital you work for has decided to go with an enterprisewide solution with a different IT vendor, and it’s forcing you to switch.

Whatever the reason, a conversion of your legacy LIS data is probably in the cards, and the process is going to be much more complex than “cut and paste.” Says John H. Sinard, MD, PhD, director of pathology informatics at Yale University School of Medicine: “It’s going to require a lot of thought about what data needs to be preserved. And it’s going to take longer than you expect it to.”

In general, says Walter Henricks, MD, director of the Center for Pathology Informatics at the Cleveland Clinic, labs are more likely to have to replace their hardware before their LIS. “When you have an LIS, you’re regularly upgrading hardware platforms on an equipment replacement cycle. In addition, the LIS vendors come out with new releases or new versions that are incremental (or larger) upgrades, each one adding more code and more complexity, and eventually requiring more and more processing power and hardware support. So hardware might need to be replaced every five to six years. Typically an LIS will last a lot longer than that in a laboratory.”

In the clinical laboratory, of course, a new LIS often has to be acquired because the needs of the lab have outstripped the current system’s capacity, or the vendor will no longer support the LIS or a particular version of it. When a laboratory does convert to a new LIS, whether it’s for clinical lab data or anatomic pathology, it’s usually a huge undertaking—and converting legacy data from the old system is the fundamental challenge.

There are myriad reasons why new LISs don’t get launched on a “plug and play” basis, Dr. Henricks points out. “While there will be overlap, there will be types of data fields with names that don’t match, or that have data split up differently. Typically you need to work with the vendor to be sure those fields in the old system are mapped correctly to the new system. That’s the real work that needs to be done.”

Different methods for assigning accession numbers can alone create a problem. “Maybe in the old system there were some constraints, some decisions were made that things had to be done a certain way, and the data look kind of funny. Or maybe some of the older numbers never were converted. That may get propagated into the new system, and you may or may not have a good solution for it, but you have to be aware of it.”

Dr. Henricks recommends that laboratories start discussing conversion issues early on, because the planning questions are as important as the technical questions. “Do you want to convert the data? How much of the legacy data do you want to convert? How much will it cost? How much time do you need to factor in to accomplish it? And how much will be a ‘standard’ conversion versus a ‘custom’ that might incur additional fees?”

At Yale University, where the pathology department is separate from the laboratory medicine department, Dr. Sinard coordinated the conversion of AP data in 1999 in response to Y2K compliance concerns. That move showed that all kinds of snags can occur in conversions, he says.

For example, a highly tedious part of the process can involve translations of coding. “Most of the information stored in the databases is not stored as viewable data but as pointers to dictionaries and tables and those sorts of things. In our old system, each pathologist wasn’t specifically identified on a case, but was identified as a pointer to that entry in the pathologist dictionary. That same pathologist will have a different pointer in the new system.”

“You may think, when I’m building a new dictionary I don’t need Pathologist X because he hasn’t been here for 30 years. But if you’re bringing over specimens from 30 years ago, you do need to bring in Pathologist X.”
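The pointer translation Dr. Sinard describes can be sketched in a few lines of Python. This is only an illustration under assumed names: the dictionaries, ID values, accession number, and the functions `build_crosswalk` and `translate_case` are hypothetical, not taken from any vendor's system. The idea is to build a crosswalk from old dictionary pointers to new ones, adding entries for people who have left (to be inactivated after the conversion) so that old cases still resolve.

```python
# Hypothetical dictionaries: the old system maps pointer -> name,
# the new system maps name -> pointer, and the IDs do not agree.
old_pathologists = {17: "Pathologist X", 42: "Pathologist Y"}
new_pathologists = {"Pathologist Y": 3}  # Pathologist X left long ago


def build_crosswalk(old_dict, new_dict):
    """Map every old pointer to a new pointer.

    Names missing from the new dictionary are added to it so that
    legacy cases still resolve; the added names are returned so they
    can be marked inactive once the conversion is finished.
    """
    crosswalk = {}
    added = []
    next_id = max(new_dict.values(), default=0) + 1
    for old_id, name in sorted(old_dict.items()):
        if name not in new_dict:
            new_dict[name] = next_id
            added.append(name)
            next_id += 1
        crosswalk[old_id] = new_dict[name]
    return crosswalk, added


def translate_case(case, crosswalk):
    """Return a copy of the case with its pathologist pointer translated."""
    converted = dict(case)
    converted["pathologist_id"] = crosswalk[case["pathologist_id"]]
    return converted


crosswalk, added = build_crosswalk(old_pathologists, new_pathologists)
legacy_case = {"accession": "S79-1234", "pathologist_id": 17}
new_case = translate_case(legacy_case, crosswalk)
print(new_case, "added to dictionary:", added)
```

A lookup that raises on an unknown pointer, as `crosswalk[...]` does here, is arguably safer than a silent default, since a missing dictionary entry is exactly the kind of error that would otherwise go unnoticed.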

Some vendors supply help in conversions, but typically they bring over only the data they consider most important. “If you want to do it right, typically you’ll need custom software written, and such programs must be very carefully written and well-debugged,” Dr. Sinard cautions. It’s not going to have the chance to be seasoned and improved: “You’re only going to run it once—then you can’t use it again, and nobody else has any need for it.”

“If there’s an error in it, you’re not going to know unless you stumble across something, and if you’re bringing in more than 2 million data points, you’re going to have a whole bunch of problems.”

Those in charge of conversions may need to pay particular attention to required fields. “If the new system requires data in a certain field that the old system did not require, you have to figure out what to put there. If you put nothing, then it could either give you an error or crash the system—you don’t know how it will behave in response to that missing data element. Sometimes you have to make up dictionary entries like ‘missing data’ or ‘unknown,’ just to stick some value in those fields, then inactivate them after the conversion.”
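One way to picture the placeholder approach is the sketch below. The field names, the placeholder text, and the function `fill_required` are assumptions made for illustration; a real conversion would take its list of required fields from the new system's documentation.

```python
# Hypothetical list of fields the new system requires.
REQUIRED_FIELDS = ("specimen_type", "ordering_location")
# Hypothetical dictionary entry created only for the conversion.
PLACEHOLDER = "UNKNOWN (legacy conversion)"


def fill_required(record, required=REQUIRED_FIELDS, placeholder=PLACEHOLDER):
    """Return a copy of the record with empty required fields filled in.

    Also returns the names of the fields that were patched, so the
    conversion log shows where invented values were inserted.
    """
    filled = dict(record)
    patched = []
    for field in required:
        value = filled.get(field)
        if value is None or str(value).strip() == "":
            filled[field] = placeholder
            patched.append(field)
    return filled, patched


record = {"accession": "S85-0001", "specimen_type": "", "ordering_location": "Clinic A"}
converted, patched = fill_required(record)
print(converted["specimen_type"], patched)
```

Logging which fields were patched keeps the made-up values distinguishable from real data later, which matters if the placeholder entry is inactivated after the conversion.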

When Yale’s AP department converted from its old Cerner CoPath to the new client-server CoPath, the vendor provided software to extract the old data. “And we did use it, but it didn’t bring over any of our histology information. Their philosophy was that they needed to bring over enough information for you to reproduce a diagnostically sufficient report, and histology information wasn’t required for that.”

The pathology department, however, wanted to convert everything. “Our philosophy was we wanted the entirety of the data because we have research as well as clinical needs. For us, it was potentially important to know which cases we had done special stains on. So we wrote custom software.”

How far back should a data conversion extend? In anatomic pathology, it’s very important to know patients’ prior biopsy results and any prior malignancies they may have had. “Just as in laboratory medicine, the value of that data diminishes the further back in time you go,” Dr. Sinard notes. “And the value of results will make a difference too. For example, if it were a negative prostate biopsy, that doesn’t carry a lot of value. On the other hand, if a patient had breast cancer on a biopsy 20 years ago, that’s clearly relevant to any specimen I’m looking at today, because breast cancer is notorious for reappearing 20 to 30 years later.”

Similarly, a hematocrit from two months ago is probably irrelevant, if the clinician knows what it was yesterday. “But a genetic test for cystic fibrosis done 50 years ago is still relevant, because the patient’s genes haven’t changed.”

Some laboratories and pathology departments find it practical not to convert at all, Dr. Sinard says. They might keep the old system running for six months alongside the new one, and decide, anytime there is a new specimen, just to look it up in the old system. “After six months, the yield is probably not so important anymore, and they’ll just go by clinical history.” This may work for many institutions, but he’s skeptical that it will serve for academic institutions where there’s potential research value to all that old material. “You need it to be computerized or you’re never going to find it again.”

Sometimes, confidentiality problems can result from conversions, because there is generally an intermediate stage for the data. “Usually, you don’t remove data from the old system and insert it into the new one. You export it from the old system into holding files, and import those holding files into the new system. But people usually don’t want to delete the holding files right away in case they need them later. They contain all the data in unencrypted format, so you’ve got to be careful what you do with them. You can’t just throw the CD away or delete them; you have to purge all that data before getting rid of the equipment.”

“We’re right in the middle of converting our anatomic pathology data from one system to a new one,” says Bruce Beckwith, MD, chief of laboratory medicine at North Shore Medical Center in Salem, Mass., co-director of the Partners Healthcare pathology informatics fellowship based at Massachusetts General Hospital, and until recently chair of the CAP Informatics Committee. It’s a conversion of eight years of cases, 30,000 a year, a total of about 250,000 pathology reports, though he’s been involved in much bigger conversions as well.

North Shore, Brigham and Women’s Hospital, and other partner hospitals have stored their clinical lab data in a centralized data repository. So when they take on a new LIS, “they’ve generally decided that it’s not worth repopulating the new system with data from the old, because you can look it up in the rare cases where you need it.” In anatomic pathology, however, that’s not the case. “The prior cases are truly important, and for workflow issues you want to have them available in the new system so you can see them all in one place.”

Dr. Beckwith’s department at North Shore has been performing last-minute tweaks and final testing and validation, with the new AP LIS poised to go live this month. “We had explored a couple of options for converting the data. What we were hoping to do was a two-step conversion where we’d take a snapshot of the data in the system a month before conversion, then load it into the new system, so when we went live we’d finish all the cases started under the old system and run parallel systems just for a little while. The idea is once you start a case in one system, you want to finish it in the same system, and certain tests take a long time, so it may take a month or two.”

The vendor could not accommodate that plan, however. Instead, “when we switch over, we’re just going to turn the system on for real, and take as new anything that comes in, then just use the old system to finish up reports or cases from the past week or month that are still outstanding. And once we’ve got 99.9 percent of that done, we’ll stop with the old system, have the data converted to a generic format, then have our new vendor convert it to the new LIS.”

“So two months after ‘go live,’ we’ll have history available. But it will be painful in the interim where people are going to have to look up history in the old system.”

To Dr. Beckwith’s knowledge, there’s no standard way yet to formulate a data record, and there will continue to be proprietary differences in the ways IS vendors set records up. “There’s a ‘semi’ standard format for certain types of pathology reports,” Dr. Beckwith says, and the CAP and others are working on ways to further standardize content, “but standardization of reports at the electronic level has not really happened yet.” That’s why third-party companies offer custom programming under contract, essentially going into the system or taking whatever the current vendor provides and transforming it.

In his prior experience at Beth Israel Deaconess Medical Center, “we had a homegrown LIS. They wrote their own systems starting in the 1970s, and were able to more or less maintain it themselves. They never found commercial systems that were that much better until recently.” Now Beth Israel Deaconess is converting to a new system, but with all the data conversion being done in house.

Technical formatting issues are a chronic problem with conversions, he says. “You name it and it can crop up.” To give a simple example, one system might have only one field for the pathologist, and the other might have four fields, with names like “attending,” “resident,” or “consulting.” “Which one do you choose? What do you do with the other four? How do you do all the data mapping? Do you need three fields in the gross, seven in the microscopic, and 14 in the frozen? There’s no standardization around that, and somebody has to decide.” Another issue is how to store the data—as text or as separate fields? Should some data be quantitative, or should it be “choose one of the following five things”?
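The "somebody has to decide" step can be made explicit by writing the decisions down as a mapping table. The sketch below assumes hypothetical field names on both sides and a hypothetical function `map_record`; it is not how any particular LIS handles the problem.

```python
# Hypothetical decisions: legacy field name -> new-system field name.
FIELD_MAP = {
    "pathologist": "attending_pathologist",
    "gross": "gross_description",
    "micro": "microscopic_description",
}
# New-system fields with no legacy source, left empty on purpose.
UNMAPPED_NEW_FIELDS = ("resident", "consulting_pathologist")


def map_record(old_record):
    """Rename legacy fields according to FIELD_MAP.

    Raises KeyError if the legacy record has a field nobody has
    decided how to map, rather than dropping it silently.
    """
    undecided = set(old_record) - set(FIELD_MAP)
    if undecided:
        raise KeyError(f"no mapping decided for legacy fields: {sorted(undecided)}")
    new_record = {
        new_name: old_record[old_name]
        for old_name, new_name in FIELD_MAP.items()
        if old_name in old_record
    }
    for field in UNMAPPED_NEW_FIELDS:
        new_record[field] = None
    return new_record


mapped = map_record({"pathologist": "Dr. A", "gross": "Skin ellipse, 2 cm."})
print(mapped)
```

Keeping the mapping in one table rather than scattered through conversion code means the clinical staff who make the decisions can review them in one place.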

Decisions about data conversion can definitely limit the research that can be done, Dr. Beckwith agrees. “Research and clinical are two different beasts, and generally the systems are designed for clinical needs, not research needs.” But one thing he’s done in the past is pull data from the clinical system, de-identify the data, put it in an entirely separate database, and allow people to look through it to find cases that might be relevant to their research projects.

Another potential pitfall can be covering the cost of conversion. At North Shore, “I think we’re paying on the order of $25,000 or $30,000 to do the conversion, so it’s a noticeable expense, but if you’re paying $250,000 or $500,000 for a system, and maybe another $100,000 for servers, in terms of the scope of the project it’s not an overwhelming cost. On the other hand, if you forget about it and don’t plan for it, you are going to be stuck looking for money.”

The pattern Dr. Beckwith is seeing in data conversion is that blood bank data are getting special attention. “Blood type is always relevant, and transfusion reactions and antibodies you might have are relevant forever, so what I’m seeing is that pathology, cytology, and blood bank are getting converted, while chemistry, microbiology, and hematology data don’t get converted but just stored centrally.”

In 2004, when Henry Ford Health System in Detroit converted its old anatomic pathology module from Sunquest to a CoPath system, it chose only to convert data going back to 1991, says J. Mark Tuthill, MD, division head of pathology informatics at Henry Ford. “The data going back before the conversion cutoff is still on paper, but it’s very rare to need reports that old.”

With 15,000 employees, 600 of them in information technology, Henry Ford is unique in size and in the resources it can devote to IT. But as at most medical centers and hospitals, LIS storage capacity used to be a problem at Henry Ford because it was limited by the size of the LIS’s physical hard drive. Network storage tools have almost eliminated that problem.

“Our AP and clinical pathology information systems send data out to the hospital EMR system via an HL7 interface, so all the labs and all the AP on a patient are available in reverse chronological order. We used to be constantly monitoring our back-end database because if we started to run out of space we would have to replace the computer’s hard drive, archive data, or get a whole new computer. But now you can easily add terabytes of space to storage arrays even on the fly—just like adding a new record to a record player. It used to be a terabyte of storage was unthinkable in terms of cost; now the storage requirements are on the order of terabytes per month,” Dr. Tuthill says.

Improvements in imaging technology, however, are posing a continuing challenge. “Images for AP now have the potential to be gigabyte file sizes, so you have not just storage issues but the issue of whether your network can handle and move image files that are that large. This has to be planned for.”

One of the biggest challenges of converting Henry Ford’s old anatomic pathology reports was that they had a limited number of fields, with much of the data stored as free text. “The new system is actually a database program, so it’s structured into fields and tables and the other was much simpler and more limited.” Fortunately, Dr. Tuthill says, conversion engineers are familiar with this issue, and when they ready a “system go live” initiative, there are tricks of the trade they can apply and take advantage of.

As a result, most of the roughly six-month process of converting the data was not manual. “We used automatic scripts. The basic process is that you do the conversion on 100 cases and 99 are fine. One will have an error. Then you do a second pass where you attack all those that have errors, and the process continues iteratively. We tended to do cases in chunks of 10,000 and would typically get about 50 errors. So sometimes you could build something in the script to allow the system to fix that error, and sometimes they had to be held in a ‘remedial camp’ file where you’d come back and fix them later.”
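The iterative process Dr. Tuthill describes (convert a batch, collect the failures, script a fix where possible, and hold the rest for later) might look roughly like the following. The record layout, the failure rule, and the function names are all invented for the sketch; Henry Ford's actual scripts are not described in that detail.

```python
def convert_case(case):
    """Stand-in converter: fails on cases with no diagnosis (hypothetical rule)."""
    if not case.get("diagnosis"):
        raise ValueError("missing diagnosis")
    return {
        "accession": case["accession"],
        "final_diagnosis": case["diagnosis"].strip(),
    }


def run_pass(cases, converter):
    """Convert what we can; return (converted, failed) without stopping on errors."""
    converted, failed = [], []
    for case in cases:
        try:
            converted.append(converter(case))
        except Exception as err:
            failed.append((case, str(err)))
    return converted, failed


def convert_with_fixes(case):
    """Second-pass converter with a scripted fix for one known error pattern."""
    patched = dict(case)
    if not patched.get("diagnosis") and patched.get("report_text"):
        patched["diagnosis"] = patched["report_text"]
    return convert_case(patched)


legacy = [
    {"accession": "A1", "diagnosis": "Benign nevus "},
    {"accession": "A2", "diagnosis": "", "report_text": "Chronic gastritis"},
    {"accession": "A3", "diagnosis": ""},
]

done, errors = run_pass(legacy, convert_case)
retried, remedial = run_pass([case for case, _ in errors], convert_with_fixes)
done.extend(retried)
print(len(done), "converted;", len(remedial), "held for manual review")
```

Here the `remedial` list plays the role of the "remedial camp" file: cases that no scripted fix could repair are kept together with their error messages for someone to handle by hand.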

Of course, new systems almost always are put in place to do new things with the data. “If you have free text reports, rather than going into fields in the new database, they may just be going into a single field that contains the entire data set. So they’re not actually as highly parsed and data-based as the new information you’ll have going forward. You may not have as rich an analysis opportunity as with the data that you’re creating prospectively.” There are many ways around that, he says. “But how valuable is that legacy data five years in the future? The answer tends to be: not very, especially as regards clinical laboratory data. So how much effort should you spend?”
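As one illustration of a way around the single-field problem, a conversion script could try to split a free-text report on recognizable section headings while still keeping the whole text intact. The headings, field names, and the function `split_report` below are assumptions; real legacy reports would need their own heading list and a good deal of testing.

```python
import re

# Hypothetical headings; real legacy reports would need their own list.
SECTION_HEADINGS = ("CLINICAL HISTORY", "GROSS DESCRIPTION", "FINAL DIAGNOSIS")
_HEADING_RE = re.compile(
    r"^(%s):\s*" % "|".join(re.escape(h) for h in SECTION_HEADINGS),
    re.MULTILINE,
)


def split_report(text):
    """Split a free-text report into sections where headings are found.

    The full text is always kept under "report_text", so nothing is
    lost when a report has no recognizable headings.
    """
    sections = {"report_text": text.strip()}
    matches = list(_HEADING_RE.finditer(text))
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        key = match.group(1).lower().replace(" ", "_")
        sections[key] = text[match.end():end].strip()
    return sections


report = "GROSS DESCRIPTION: Skin ellipse.\nFINAL DIAGNOSIS: Benign nevus.\n"
parsed = split_report(report)
print(parsed["final_diagnosis"])
```

Whether this kind of parsing is worth the effort is the judgment call raised in the passage above; keeping the original text alongside any parsed sections at least makes the attempt low-risk.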

Now that the text-based LIS database has been converted, conversion from this system to another in the future should, in theory, be easier, Dr. Tuthill says. “The biggest problem with this sort of conversion is that the old LIS may not have as rich a field architecture. So as one migrates that data from the legacy system to the current system, you will have to help IT to remap some of those fields to make them fit together more appropriately.”

There are also legacy data that are not necessarily “clean,” he adds. “There may be lab codes you don’t know the meaning of anymore, maybe from a third-party hospital that’s no longer working with our health system, so there is often a legacy of ‘garbage’ that exists, whether it’s in the EMR or the LIS, and you can, by making judicious choices on how to treat this data, either make life much more difficult or much easier.”

Dr. Tuthill took the position that there was no good reason to convert legacy data on clinical tests like sodium or CBCs because those test results lose their value rapidly. “Why spend hours trying to convert ancient CBC values and map them into your modern database when in actuality the value of that retrospective information is not particularly high?” Not everybody shares that attitude, Dr. Tuthill concedes, and the data may have possible research and clinical care uses, which is particularly true of anatomic pathology data. “But if you can’t figure out what the data means, what the reference ranges were, if the lab test was normal or abnormal at the time it was done, or the sex of the patient, the value for research is also moot.”

Though there are vendors, software engineers, and computer scientists who specialize in this realm of “porting” data from legacy to modern systems, at times conversion can be literally impossible, and Dr. Tuthill has seen several examples of that.

“One thing we saw recently: Somebody wanted to convert a database but forgot the password. The vendor had gone out of business, and the data was encrypted. Guess what? You can’t help them. It could take years, or lots of money, to crack the code and access the database because it used strong encryption.”

In other cases, users can be “locked out” of data because they have an expired license to the software, or the vendor is demanding payment for access, or the application has been retired. “That’s one reason we are constantly updating and always using the current version of our systems. Because if you’re using an LIS that’s 10 years old, you are running a risk of finding yourself phased out.”

“There are a lot of ways to skin the cat,” he says, recalling that his first experience in data conversion was actually done by having people completely retype old reports into the new LIS. There are also companies that will do optical character recognition from microfiche or old paper reports. “So where there’s a will—and money—there’s a way. The question is how much money or effort do you want to spend converting data, and what data are valuable?”

The troubled economy makes that question even more pressing. Dr. Sinard says: “Right now everybody’s very concerned about spending money on anything like LIS upgrades, with everyone going toward the EMR, and they may want to wait until their hospital has chosen an EMR solution. Or, the EMR is potentially going to ease some of the lab’s IT conversion needs, and some labs may say, if I put in a new lab system, I don’t have to bring over all the old data because it’s all in the EMR. That may be true because the data will still be available to clinicians, but it may not be available to the lab. So there’s some playoff there.”

At the moment, Dr. Beckwith notes, “a lot of hospitals are still in the budget cycles established before the crisis. In my own institution we just had our capital for the year released in the last couple of weeks. So for the small things, the process won’t be terribly different. For the $500,000 or million-dollar projects, there’s going to be a different level of scrutiny. If people hadn’t yet signed on the dotted line for a new system, those big projects are not going to be happening for the next couple of years.”

In the meantime, laboratories that are undertaking a conversion need to prioritize and be strategic, Dr. Tuthill says. “You can get your data converted. Plan carefully, get yourself good software engineers who know what they’re doing, and if you take the appropriate steps, you can get the job done.”

Anne Paxton is a writer in Seattle.