Duke Breast Cancer MRI (Pre, Post-1 And Segments)

Processed MRI sequences from DICOM to NIFTI for pre and post contrast sequences

@kaggle.madhava20217_duke_breast_cancer_mri_nifti_pre_and_post_1_only

About this Dataset

This dataset is a processed version of the Breast Cancer MRI dataset by Duke University, converted from DICOM to NIFTI.

Updated 23 October 2023

  • Changed the saved format: now uses img.gz instead of .nii.gz
  • Fixed reorientation of DICOM to NIFTI (now in 1:1 correspondence with the originally supplied annotation boxes)
  • Segmentation masks now align more closely with the tumours
  • Pyradiomics extraction amended

Preprocessing steps involved:

  1. Converted the individual DICOM slices using SimpleITK. The resulting files use an img.gz format that Pyradiomics supports. The numpy array obtained for each sequence is laid out as [slice, height, width]. This also removed the need to alter the bounding boxes.
  2. Selected the pre-contrast and first post-contrast (Post_1) sequences for each patient.
  3. Applied Otsu thresholding to the post-1 sequences to segment the 3D volume automatically. Only the part of the segmentation that falls inside the supplied lesion bounding boxes was kept.
  4. Used the post-1 sequence and the lesion segmentation masks to extract features from the MRI sequences with Pyradiomics. Mask checking was enabled in Pyradiomics.
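
Step 3 can be sketched in plain NumPy. This is a minimal illustration and not the dataset author's code: the description does not say which Otsu implementation was used, so the helper names `otsu_threshold` and `segment_in_box`, the 256-bin histogram, the choice to compute the threshold over the whole volume, and the `(z0, z1, y0, y1, x0, x1)` box format are all assumptions made for this example. Only the [slice, height, width] layout comes from the description.

```python
import numpy as np


def otsu_threshold(values, bins=256):
    """Return an Otsu threshold for a 1D array of intensities (illustrative helper)."""
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    hist = hist.astype(np.float64)

    # Cumulative voxel counts and intensity sums for every candidate split.
    w0 = np.cumsum(hist)
    w1 = w0[-1] - w0
    s0 = np.cumsum(hist * centers)
    s1 = s0[-1] - s0

    # Between-class variance; splits that leave one class empty score 0.
    valid = (w0 > 0) & (w1 > 0)
    mu0 = np.divide(s0, w0, out=np.zeros_like(s0), where=valid)
    mu1 = np.divide(s1, w1, out=np.zeros_like(s1), where=valid)
    between = np.where(valid, w0 * w1 * (mu0 - mu1) ** 2, 0.0)

    # Use the upper edge of the last bin assigned to the darker class.
    return edges[np.argmax(between) + 1]


def segment_in_box(volume, box, bins=256):
    """Otsu-threshold a [slice, height, width] volume and keep only voxels inside `box`.

    `box` is a hypothetical (z0, z1, y0, y1, x0, x1) tuple with half-open ranges.
    """
    z0, z1, y0, y1, x0, x1 = box
    threshold = otsu_threshold(volume.ravel(), bins=bins)
    mask = np.zeros(volume.shape, dtype=np.uint8)
    # Everything outside the bounding box stays 0.
    mask[z0:z1, y0:y1, x0:x1] = volume[z0:z1, y0:y1, x0:x1] >= threshold
    return mask


if __name__ == "__main__":
    # Synthetic stand-in for a post-1 volume with one bright "lesion".
    rng = np.random.default_rng(0)
    volume = rng.normal(100.0, 5.0, size=(8, 32, 32))
    volume[2:5, 10:20, 10:20] += 200.0
    mask = segment_in_box(volume, (1, 6, 8, 22, 8, 22))
    print("voxels kept:", int(mask.sum()))
```

Clipping to the box after thresholding mirrors the order given in step 3: bright tissue elsewhere in the volume still influences the threshold but never reaches the final mask.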

Reference for the original dataset:
Saha, A., Harowicz, M.R., Grimm, L.J., Kim, C.E., Ghate, S.V., Walsh, R. and Mazurowski, M.A., 2018. A machine learning approach to radiogenomics of breast cancer: a study of 922 subjects and 529 DCE-MRI features. British Journal of Cancer, 119(4), pp.508-516.
A free version of this paper is available here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6134102/.
