A Systematic Approach of Data Collection and Analysis in Medical Imaging Research

Document Type : Research Articles


1 Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, 576104, India.

2 Department of Information Science and Engineering, NIE Institute of Technology, Mysuru, 570008, India.

3 Consultant, AI in Radiation Oncology, Bengaluru, India.

4 RTWO Healthcare Solutions, J P Nagara, Bengaluru 560078, India, 560086.

5 Software Consultant, Bengaluru, India 560100.


Background: Systematically obtaining the right image dataset for medical imaging research is a tedious task. Anatomy segmentation is the key step before radiomic features can be extracted from these images.

Objective: The purpose of the study was to segment the 3D colon from CT images and to measure smaller polyps using image processing techniques. This requires a large number of samples for statistical analysis. Our objective was to systematically classify and arrange the dataset by the parameters of interest so that empirical testing becomes easier in medical imaging research.

Materials and Methods: This paper discusses a systematic approach to data collection and analysis prior to empirical testing. The images in this research were taken from The Cancer Imaging Archive (TCIA) of the National Cancer Institute (NCI), which holds a vast collection of diagnostic-quality images for the research community. The images in the TCIA collection were acquired as per the standard protocol defined by the American College of Radiology. Patients in the age group of 50-80 years were involved in various multicenter clinical trials. The collection has more than 10 billion DICOM images of various anatomies. In this study, 300 samples (n = 300), acquired in both supine and prone positions, were considered for empirical testing. The datasets were classified by the parameters of interest before the research objectives were tested empirically, which makes dataset selection easier during testing. The images were validated for data completeness against the 2020b edition of the DICOM standard. A case study of a CT colonography dataset is discussed.

Conclusion: With this systematic approach to data collection and classification, analysis becomes easier during empirical testing.
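The classification and completeness check described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the header attributes of each series have already been read into plain dictionaries keyed by DICOM keyword. The list of required attributes, the choice of slice thickness as a second parameter of interest, and the function names are hypothetical choices made for illustration; only the PatientPosition codes (HFS, FFS, HFP, FFP) come from the DICOM standard.

```python
from collections import defaultdict

# Hypothetical set of attributes treated as required in this sketch;
# the abstract does not list the study's actual completeness criteria.
REQUIRED_KEYWORDS = (
    "PatientID",
    "SeriesInstanceUID",
    "Modality",
    "PatientPosition",
    "SliceThickness",
)

# Standard DICOM PatientPosition codes mapped to supine / prone.
POSITION_GROUPS = {
    "HFS": "supine",  # head first, supine
    "FFS": "supine",  # feet first, supine
    "HFP": "prone",   # head first, prone
    "FFP": "prone",   # feet first, prone
}


def missing_keywords(header):
    """Return the required keywords that are absent or empty in a header."""
    return [k for k in REQUIRED_KEYWORDS if header.get(k) in (None, "")]


def classify_series(headers):
    """Group complete series by (position, slice thickness).

    Returns a pair: the groups, and a mapping from the series UID of each
    incomplete header to the list of keywords it is missing.
    """
    groups = defaultdict(list)
    incomplete = {}
    for header in headers:
        missing = missing_keywords(header)
        if missing:
            uid = header.get("SeriesInstanceUID") or "<unknown>"
            incomplete[uid] = missing
            continue
        position = POSITION_GROUPS.get(header["PatientPosition"], "other")
        key = (position, float(header["SliceThickness"]))
        groups[key].append(header["SeriesInstanceUID"])
    return dict(groups), incomplete


if __name__ == "__main__":
    # Made-up headers for demonstration only, not data from the study.
    sample = [
        {"PatientID": "P1", "SeriesInstanceUID": "1.2.3", "Modality": "CT",
         "PatientPosition": "HFS", "SliceThickness": "1.25"},
        {"PatientID": "P1", "SeriesInstanceUID": "1.2.4", "Modality": "CT",
         "PatientPosition": "FFP", "SliceThickness": "1.25"},
        {"PatientID": "P2", "SeriesInstanceUID": "1.2.5", "Modality": "CT",
         "PatientPosition": "HFS"},
    ]
    groups, incomplete = classify_series(sample)
    for key, uids in sorted(groups.items()):
        print(key, uids)
    print("incomplete:", incomplete)
```

Grouping by a tuple of parameters keeps dataset selection simple during testing: a test that needs, for example, prone scans of a given slice thickness looks up a single key, and incomplete series are reported separately instead of being silently dropped.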


Main Subjects