Toward AI-Ready Medical Imaging Data
Authors
Milen Nikolov, Edilberto Amorim, J Harry Caufield, Nayoon Gim, Nomi L Harris, Jared Houghtaling, Xiang Li, Danielle Morrison, Anaïs Rameau, Jamie Shaffer, Hari Trivedi, Monica C Munoz-Torres
Abstract
Medical imaging data plays a vital role in disease diagnosis, monitoring, and clinical research discovery. Biomedical data managers and clinical researchers must navigate a complex landscape of medical imaging infrastructure, input/output tools, and data reliability workflow configurations that take months to operationalize. While standard formats exist for medical imaging data, standard operating procedures (SOPs) for data management are lacking. These data management SOPs are key for developing Findable, Accessible, Interoperable, and Reusable (FAIR) data, a prerequisite for AI-ready datasets. Members of the National Institutes of Health (NIH) Bridge to Artificial Intelligence (Bridge2AI) Standards Working Group and domain-expert stakeholders from the Bridge2AI Grand Challenges teams developed data management SOPs for the Digital Imaging and Communications in Medicine (DICOM) format. We describe novel SOPs that apply to both static and cutting-edge video imaging modalities. We emphasize the steps required for centralized data aggregation, validation, and de-identification, including a review of new defacing methods for facial DICOM scans, in anticipation of adversarial AI/ML data re-identification methods. Data management vignettes based on Bridge2AI datasets include example parameters for efficient capture of a wide spectrum of modalities, including datasets from new DICOM modalities for ophthalmology retinal scans.