Data Quality for AI in Healthcare Whitepaper
Over the past few years, Artificial Intelligence (AI), and more specifically Machine Learning (ML), has seen rapid adoption in healthcare as a tool for diagnosis and decision-making. Such tools are intended to address the healthcare system's challenges in both processing rapidly proliferating medical findings and applying them to practice, as well as to deliver on the promise of personalized and precision medicine.
The classic computer science idiom of “garbage in, garbage out” certainly holds true for ML systems: the quality of the data used to develop and test a system has a significant impact on the quality of its output. The press has reported many cases in which poor data quality led to poor recommendations, and even to discrimination against certain patient populations because of bias in the data used to develop the system. This paper, developed by the GMLP (Good Machine Learning Practices) Working Team, attempts to describe and address these potential data quality issues.