Aim. To demonstrate the special aspects of creating neuroimaging datasets, using as an example the preparation of a dataset of brain computed tomography images with and without signs of intracranial hemorrhage.
Methods. The dataset was created according to the methodology developed by the Scientific and Practical Clinical Center for Diagnostics and Telemedicine (regulations for preparing the dataset), which comprises four stages: planning (choosing the keywords for the initial selection of studies, defining the inclusion and exclusion criteria, and identifying the source of medical information); selection (initial download of text information, namely brief patient histories and description protocols, from the Unified Radiological Information Service of the city of Moscow for 2020, anonymization of the retrieved data, and keyword analysis); labeling and verification (filling out the accompanying table with clinical and technical data, selection of studies by two radiologists, and expert verification by a neuroradiologist); and publication (online publication of the dataset and state registration).
Results. While the dataset was being created, the special aspects determined by the neuroradiological context were identified and formulated. They should be taken into account during the primary training, testing, and additional training of artificial intelligence services for diagnosing brain diseases: the use of specific terminology, the use of images with the least noise and the highest contrast, and the use of subtype ratios of the target pathology that correspond to their ratios in the population. A dataset of computed tomography images containing signs of intracranial hemorrhage was prepared. The final version of the dataset included anonymized studies of 209 patients (109 with the pathology, 100 without the pathology): DICOM images and an accompanying text table with clinical features (gender, age, type(s) and number of hemorrhages, presence or absence of concomitant pathology) and technical parameters (slice thickness and reconstruction slice thickness).
Conclusion. The special aspects of preparing datasets for training and testing neuroradiological artificial intelligence services were demonstrated.
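The keyword analysis used for the initial selection of studies can be illustrated with a minimal sketch. It assumes the anonymized description protocols are available as plain text paired with a study identifier. The keyword list, the function names, and the sample texts are hypothetical and are not taken from the methodology itself.

```python
import re

# Hypothetical keyword list: the actual keywords are not given in the source.
KEYWORDS = ["hemorrhage", "haemorrhage", "hematoma"]


def build_pattern(keywords):
    """Compile one case-insensitive pattern matching any keyword at a word start."""
    escaped = (re.escape(k) for k in keywords)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")", re.IGNORECASE)


def screen_reports(reports, keywords=KEYWORDS):
    """Return (study_id, matched_keywords) for every report with at least one hit.

    `reports` is an iterable of (study_id, report_text) pairs.
    """
    pattern = build_pattern(keywords)
    selected = []
    for study_id, text in reports:
        # Collect the distinct keywords found, lowercased and sorted.
        hits = sorted({m.group(0).lower() for m in pattern.finditer(text)})
        if hits:
            selected.append((study_id, hits))
    return selected


if __name__ == "__main__":
    # Made-up report texts, for illustration only.
    sample = [
        ("s1", "Subdural hematoma on the left."),
        ("s2", "No pathological changes."),
    ]
    for study_id, hits in screen_reports(sample):
        print(study_id, hits)
```

A keyword match only marks a study as a candidate. A phrase such as "no signs of hemorrhage" would also match, which is one reason the selected studies are still reviewed by two radiologists and verified by a neuroradiologist.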