New guidelines released for clinical trials incorporating AI interventions
September 12, 2020
There is consensus that rigorous clinical trials are required to establish the safety and efficacy of AI-based healthcare tools and interventions, but these new approaches require trials that are designed to address their particular features. The SPIRIT and CONSORT trial guidelines have now been extended to incorporate elements supporting AI trials. SPIRIT-AI (Standard Protocol Items: Recommendations for Interventional Trials–Artificial Intelligence) includes 15 new protocol elements that should be included in AI trial protocol descriptions, in addition to the core SPIRIT items. CONSORT-AI (Consolidated Standards of Reporting Trials–Artificial Intelligence) includes 14 new items that should be reported routinely in AI clinical trial results, in addition to the core CONSORT items. These extensions to the core guidelines are expected to promote transparency and rigor in the evaluation of new AI systems and strategies.
- Editorial. Setting guidelines to report the use of AI in clinical trials. Nat Med 2020;26:1311.
- Cruz Rivera, S., Liu, X., Chan, A. et al. Guidelines for clinical trial protocols for interventions involving artificial intelligence: the SPIRIT-AI extension. Nat Med 2020;26:1351–1363.
- Liu, X., Cruz Rivera, S., Moher, D. et al. Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: the CONSORT-AI extension. Nat Med 2020;26:1364–1374.
Automated interpretation of plasma amino acid profiles
September 4, 2020
A machine learning method is reported as a proof of concept for interpreting ion-exchange chromatography profiles of plasma amino acids to diagnose inborn errors of metabolism. Training and test data consisted of 2000 cases submitted to a clinical service in one year, additionally enriched with rare cases, and several ensemble decision tree algorithms were evaluated (random forests, weighted-subspace random forests, and extreme gradient boosted trees, XGBT). Automatic classification was compared with manual classification by two independent experts. Classification of normal vs. abnormal yielded a mean area under the precision-recall curve of > 0.94. Multiclass classification for specific diseases yielded a mean F score (weighted for recall) of about 0.78. In both cases, XGBT performed slightly better than the other algorithms. Performance against very limited external data from a proficiency testing program yielded overall accuracy values of 0.8-0.9, suggesting generalizability with performance similar to the internal test data.
- Wilkes EH, Emmett E, Beltran L, Woodward GM, Carling RS. A machine learning approach for the automated interpretation of plasma amino acid profiles. Clin Chem. 2020;66:1210-1218.
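As a rough illustration of an F score "weighted for recall", the sketch below computes an F-beta score from per-class counts and averages it across classes. The summary above does not state the beta value or the averaging scheme used in the study, so beta = 2 and unweighted (macro) averaging are assumptions made here, and all counts are hypothetical.

```python
from typing import Iterable, Tuple


def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """F-beta score from true-positive, false-positive, and false-negative counts.

    beta > 1 weights recall more heavily than precision; beta = 1 gives the F1 score.
    The default beta = 2 is an assumption, not a value taken from the study.
    """
    if tp == 0:
        return 0.0  # no true positives, so the score is defined as zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def mean_f_beta(per_class_counts: Iterable[Tuple[int, int, int]], beta: float = 2.0) -> float:
    """Unweighted mean of per-class F-beta scores, one (tp, fp, fn) tuple per class."""
    scores = [f_beta(tp, fp, fn, beta) for tp, fp, fn in per_class_counts]
    return sum(scores) / len(scores)


# Hypothetical counts for three disease classes: (tp, fp, fn)
example_counts = [(8, 2, 4), (15, 5, 1), (3, 1, 3)]
print(round(mean_f_beta(example_counts, beta=2.0), 3))
```

With beta greater than 1, a class with many missed cases (false negatives) is penalized more than one with the same number of false alarms, which suits a screening setting where missing a rare disorder is the costlier error.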
Automated cancer detection and grading in prostate core biopsies
July 29, 2020
A new study in the Lancet evaluates a deep learning system for cancer detection, grading, and evaluation of perineural invasion in whole slide images of prostate core biopsies. Discrimination for cancer was excellent on the primary test data set (>99% sensitivity, >90% specificity) and, after site-specific tuning, was also excellent on data from a second location with a different brand of slide scanner (>98% sensitivity, >97% specificity). The system was implemented as a second read and alerting system in a routine clinical workflow. In 941 sequential cases, 9% of cancer alerts led to additional cuts or stains, 4% led to third opinion requests, and one missed cancer was detected.
- Pantanowitz L, Quiroga-Garza GM, Bien L et al. An artificial intelligence algorithm for prostate cancer diagnosis in whole slide images of core needle biopsies: a blinded clinical validation and deployment study. The Lancet Digital Health. 2020;2:e407-e416.
- An editorial is also available: Janowczyk A, Leo P, Rubin MA. Clinical deployment of AI for prostate cancer diagnosis. The Lancet Digital Health. 2020;2:e383-e384.
AI analysis of routine histology predicts genomic alterations and prognosis across multiple cancer types
July 27, 2020
A deep learning system trained on H&E slides, genomic data, and survival data from 28 cancer types was able to correctly predict gene duplications, chromosomal aneuploidy, focal gene amplifications and deletions, driver gene mutations, gene expression levels, and prognosis across the cancer types from only routine, paraffin-embedded H&E stained slides. The results indicate that common histologic features are associated with these molecular changes and suggest that AI tools may be able to substantially increase the amount of information available from routine histology. A separate study found similar results in 8 cancer types and developed a system that could be implemented on mobile devices.
- Fu Y, Jung AW, Torne RV et al. Pan-cancer computational histopathology reveals mutations, tumor composition and prognosis. Nature Cancer. 2020.
- Kather JN, Heij LR, Grabsch HI et al. Pan-cancer image-based detection of clinically actionable genetic alterations. Nature Cancer. 2020.
ACR and RSNA letter to FDA recommends delay in approval of autonomous AI
June 30, 2020
The American College of Radiology (ACR) and the Radiological Society of North America (RSNA) have written a letter to the FDA recommending against approval of autonomous AI for medical diagnosis and similar tasks at this time, but supporting approval of physician-supervised AI tools. A short summary of the letter and the full 6-page letter are available online. The letter reasons that known problems with generalizability and performance changes over time make it impossible to ensure the quality and safety of autonomous AI using currently available techniques. A staged approach, in which knowledge of AI performance characteristics and management techniques is gained from experience with physician-supervised AI tools, may allow a rigorous approach to regulating autonomous AI in the future. Pathologists should follow these developments because the knowledge gained in supervised use of radiology tools will be useful in understanding analogous pathology tools, and because there may be attempts to generalize knowledge from radiologist-supervised tools directly to support autonomous pathology AI.
Molecular profiles of severe COVID-19 infections identified by artificial intelligence
June 9, 2020
Artificial intelligence was used to evaluate almost 500 biomarkers for association with severe COVID-19 infection. Several hundred biomarkers correlated with disease severity, and a random forest machine learning model using a panel of 29 serum factors accurately distinguished severe from mild infection. Read the article
Definitions of Artificial Intelligence and Machine Learning
Artificial intelligence (AI) is the ability of computer software to mimic human judgement. Current AI systems carry out only very specific tasks for which they are designed, but they may integrate large amounts of input data to carry out these tasks quickly and accurately. The current excitement about AI is focused on machine learning (ML) systems and this domain is sometimes referred to as AI/ML. AI/ML systems may be trained using defined input data sets, which may include images, to associate patterns in data with clinical contexts such as diagnoses or outcomes. Once trained, AI/ML systems are used with new data to predict diagnosis or outcome in specific cases, or carry out other useful tasks. To date, systems are limited in the range of diagnoses, predictions, and tasks covered, but can be impressively accurate within their defined scope.
- Yu K-H, Beam AL, Kohane IS. Artificial intelligence in healthcare. Nature Biomedical Engineering. 2018;2:719-731.
- Topol EJ. High-performance medicine: The convergence of human and artificial intelligence. Nature Medicine. 2019;25:44-56.
Concept of Augmented Intelligence
The American Medical Association has popularized the term Augmented Intelligence to represent the use of AI/ML as a tool to enhance rather than replace human healthcare providers. The Augmented Intelligence concept is based on studies that integrate AI/ML with human experts in a synergistic workflow that achieves higher performance than either separately. In the pathology context, Augmented Intelligence brings the computational advantages of AI/ML into the clinical and laboratory setting in the form of supportive tools that can enhance pathologists’ diagnostic capabilities by, for example, suggesting regions of interest or counting elements on a slide, or providing decision support to inform clinical judgement.
How AI/ML may be used in Pathology
Pathologists who are interested in AI/ML envision a variety of tools that may provide increased efficiency and diagnostic accuracy in the pathologist’s daily diagnostic workflow. As noted above, tools for the pathologist could scan slides to count elements such as lymph node metastases, mitoses, inflammatory cells, or pathologic organisms, presenting results at sign-out and flagging examples for review. AI/ML tools could also flag regions of interest on a slide or prioritize cases based on slide content. Studies to date have shown promise for automated detection of foci of cancer and invasion, tissue/cell quantification, virtual immunohistochemistry, spatial cell mapping of disease, novel staging paradigms for some types of tumors, and workload triaging. Future systems may be able to correlate patterns across multiple inputs from the medical record, including genomics, allowing a more comprehensive prognostic statement in the pathology report.
- Colling R, Pitman H, Oien K et al. Artificial intelligence in digital pathology: a roadmap to routine use in clinical practice. J Pathol. 2019;249:143-150.
- Campanella G, Hanna MG, Geneslaw L, et al. Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat Med. 2019;25(8):1301–1309.
- Rashidi HH, Tran NK, Betts EV, Howell LP, Green R. Artificial intelligence and machine learning in pathology: The present landscape of supervised methods. Acad Pathol. 2019;6:2374289519873088.
- Mezheyeuski A, Bergsland CH, Backman M, et al. Multispectral imaging for quantitative and compartment-specific immune infiltrates reveals distinct immune profiles that classify lung cancer patients. J Pathol. 2018;244(4):421–431.
- Wilkes EH, Rumsby G, Woodward GM. Using Machine Learning to Aid the Interpretation of Urine Steroid Profiles. Clin Chem. 2018;64:1586-1595.
- Arnaout R. Machine Learning in Clinical Pathology: Seeing the Forest for the Trees. Clin Chem. 2018;64(11):1553–1554.
- Cabitza F, Banfi G. Machine learning in laboratory medicine: waiting for the flood? Clin Chem Lab Med. 2018;56(4):516–524.
Ethical use of AI in Healthcare
The need for large sets of patient data to train AI/ML algorithms raises issues of patient consent, privacy, data security, and data de-identification in the production of AI/ML systems. There is also an ethical duty to review algorithms prior to implementation and verify their performance at deployment to ensure that they are safe, efficacious, and reliable. Recent experience has shown that subtle biases may be incorporated into training data and influence the performance of the resulting systems; these must be mitigated, and training data must reflect the diversity of the patient population that the AI/ML systems are intended to serve. A system trained on data that do not adequately represent ethnic groups, socioeconomic classes, ages, and/or sexes may generalize poorly to these patient populations in real-world settings and inadvertently exclude (or harm) these groups. The “black box” nature of some popular algorithms (not revealing the data patterns associated with particular predictions), combined with the natural proprietary orientation of system vendors, may lead to transparency problems and make independent checking of the algorithms difficult. Finally, the human resource toll of AI/ML must be considered: deskilling of the workforce through dependence on AI/ML must be mitigated, and there will be a need to repurpose job roles to adapt to increasing automation.
- Keskinbora KH. Medical ethics considerations on artificial intelligence. J Clin Neurosci. 2019;64:277-282.
- O’Sullivan S, Nevejans N, Allen C et al. Legal, regulatory, and ethical frameworks for development of standards in artificial intelligence (AI) and autonomous robotic surgery. Int J Med Robot. 2019;15:e1968.
Regulation of Artificial Intelligence and Machine Learning
The training and use of AI/ML algorithms introduces a fundamentally new kind of data analysis into the healthcare workflow that requires an appropriate regulatory framework. By virtue of their influence on pathologists and other physicians in selection of diagnoses and treatments, the outputs of these algorithms can critically impact patient care. The data patterns identified by these systems are often not exact: there is not perfect separation of classes or predictions. Thus there are analogies with sensitivity, specificity, and predictive value of other complex tests performed by clinical laboratories. However, in machine learning the patterns in data are identified by software and often are not explicitly revealed. Biases or subtle errors may be incorporated inadvertently into machine learning systems and these must be identified and mitigated prior to deployment. Naturally occurring changes in healthcare context such as case mix changes, updated tests or sample preparation, or new therapies, may also change the input data profile and reduce the accuracy of a previously well-functioning machine learning system.
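The analogy with laboratory test characteristics can be made concrete with a short worked example. The sketch below applies Bayes' rule to show how positive and negative predictive values shift when disease prevalence (one aspect of case mix) changes, even if sensitivity and specificity stay fixed. The sensitivity, specificity, and prevalence values are illustrative assumptions, not results from any study, and in practice a case mix change may also alter sensitivity and specificity themselves.

```python
from typing import Tuple


def predictive_values(sensitivity: float, specificity: float, prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) for a binary classifier at a given disease prevalence."""
    # Expected fraction of all cases falling in each cell of the 2x2 table
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    ppv = tp / (tp + fp)  # probability of disease given a positive call
    npv = tn / (tn + fn)  # probability of no disease given a negative call
    return ppv, npv


# Illustrative values only: the same classifier applied to three different case mixes
for prevalence in (0.50, 0.20, 0.05):
    ppv, npv = predictive_values(0.98, 0.97, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

The point of the example is that a system verified at one site or time may produce a different proportion of false alarms at another, which is one reason local verification and ongoing monitoring matter.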
An effective and equitable regulatory framework for machine learning in healthcare will 1) define requirements based on risk, i.e., tailored to the likelihood and magnitude of possible harm from each machine learning application, 2) require best practices for system development by vendors including bias assessment and mitigation, 3) define appropriate best practices for verification of system performance at deployment sites, i.e., local laboratories, 4) define best practices for monitoring the performance of machine learning systems over time and mitigating performance problems that develop, and 5) clearly assign responsibility for problems if and when they occur.
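Point 4, monitoring performance over time, could take many forms. One minimal sketch, assuming that each ML output can later be compared with a reference result such as the pathologist's final sign-out, is to track agreement over a rolling window of recent cases and flag the system for review when agreement drops below a locally chosen threshold. The class name, window size, and threshold below are hypothetical and are not drawn from any CAP, FDA, or CLIA requirement.

```python
from collections import deque
from typing import Optional


class RollingAgreementMonitor:
    """Tracks agreement between ML output and a reference result over recent cases."""

    def __init__(self, window: int = 100, min_agreement: float = 0.95) -> None:
        # deque with maxlen discards the oldest case once the window is full
        self.results = deque(maxlen=window)
        self.min_agreement = min_agreement  # hypothetical locally chosen threshold

    def record(self, model_label: str, reference_label: str) -> None:
        """Store whether the model agreed with the reference result for one case."""
        self.results.append(model_label == reference_label)

    def agreement(self) -> Optional[float]:
        """Fraction of cases in the window where the model agreed; None if empty."""
        if not self.results:
            return None
        return sum(self.results) / len(self.results)

    def needs_review(self) -> bool:
        """True only when the window is full and agreement is below the threshold."""
        if len(self.results) < self.results.maxlen:
            return False
        return self.agreement() < self.min_agreement


# Hypothetical usage with a small window
monitor = RollingAgreementMonitor(window=4, min_agreement=0.75)
for model, reference in [("cancer", "cancer"), ("benign", "benign"),
                         ("benign", "cancer"), ("cancer", "benign")]:
    monitor.record(model, reference)
print(monitor.agreement(), monitor.needs_review())
```

A production approach would likely track more than simple agreement (for example, sensitivity and specificity separately, or shifts in the input data), but the same windowed structure applies.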
The development of this framework is in early stages. To date, the White House has released draft guidance for regulation of artificial intelligence applications that provides a set of high-level principles to which a regulatory framework in any domain should adhere. Specific to healthcare, the FDA has released a proposal for processes leading to approval or clearance of machine learning software for use as a medical device. Neither of these proposals yet addresses best practices for local performance verification and monitoring of machine learning systems analogous to CLIA-mandated laboratory test performance requirements. The CAP regards this omission as a gap in current regulatory planning for machine learning in healthcare and is promoting the development of a more complete regulatory framework that will include guidance, approved methods, and best practices for local laboratories in deploying machine learning tools as they become available.
- Schulz WL, Durant TJS, Krumholz HM. Validation and regulation of clinical artificial intelligence. Clin Chem 2019;65:1336-1337.
- Allen TC. Regulating artificial intelligence for a successful pathology future. Arch Pathol Lab Med 2019;143(10):1175.
- Office of Management and Budget. Guidance for regulation of artificial intelligence applications. White House Memo. 2020;Jan 7:1-15.
- FDA. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD): Discussion paper and request for feedback. 2019;1-20.
The CAP is engaged in several activities targeting AI/ML. Internally, the Informatics Committee has formed a Machine Learning Working Group focused on education and technical issues, particularly those related to verification and performance monitoring. This group is sharing its technical work with the FDA. The Information Technology Leadership Committee has formed an AI Project Team to ensure coordination and alignment of AI/ML activities across the organization and to provide reports to the Board of Governors (BOG). An AI in Anatomic Pathology Work Group, reporting to the Council on Scientific Affairs, is developing use cases for AI/ML in pathology that may evolve into proficiency testing (PT) programs.
Externally, the CAP participates in several organizations, including the Alliance for Digital Pathology, a collaborative group interested in the evolution of regulatory science as it applies to digital pathology and AI. The CAP also works with the American College of Radiology Data Science Institute, a resource in understanding how radiologists are developing and using AI systems. In addition, the CAP is the Primary Secretariat to the Integrating the Healthcare Enterprise’s International Pathology and Lab Medicine domain as well as DICOM Working Group 26: Pathology. These standards organizations are developing technical profiles for incorporation of AI/ML systems into healthcare that will be available to developers of AI/ML tools and systems.