Deep Neural Networks Improve Radiologists' Performance in Breast Cancer Screening

Introduction

This is an implementation of the model used for breast cancer classification as described in our paper Deep Neural Networks Improve Radiologists' Performance in Breast Cancer Screening. The implementation allows users to get breast cancer predictions by applying one of our pretrained models: a model which takes images as input (image-only) and a model which takes images and heatmaps as input (image-and-heatmaps).

  • Input images: 2 CC view mammography images of size 2677x1942 and 2 MLO view mammography images of size 2974x1748. Each image is saved as a 16-bit PNG file and is standardized separately before being fed to the models.
  • Input heatmaps: output of the patch classifier, constructed to be the same size as its corresponding mammogram. Two heatmaps are generated for each mammogram, one for the benign and one for the malignant category. The value of each pixel in both heatmaps is between 0 and 1.
  • Output: 2 predictions for each breast, the probabilities of benign and malignant findings: left_benign, right_benign, left_malignant, and right_malignant.
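
The per-image standardization mentioned in the list above can be sketched as follows. This is a minimal illustration that assumes "standardized" means subtracting the image's own mean and dividing by its own standard deviation; the function name standardize_image and the small epsilon guard are hypothetical choices, not taken from this README.

```python
import numpy as np

def standardize_image(image):
    """Return a float32 copy with zero mean and unit variance (assumed definition)."""
    image = np.asarray(image, dtype=np.float32)
    std = float(image.std())
    # Guard against constant images, where the standard deviation is zero.
    return (image - image.mean()) / max(std, 1e-5)

# A tiny stand-in for a 16-bit mammogram.
example = np.array([[0, 1000], [2000, 65535]], dtype=np.uint16)
standardized = standardize_image(example)
print(standardized.mean(), standardized.std())
```

Each image is processed independently here, so statistics from one view never influence another.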

Both models act on screening mammography exams with four standard views (L-CC, R-CC, L-MLO, R-MLO). As a part of this repository, we provide 4 sample exams (in the sample_data/images directory, with the exam list stored in sample_data/exam_list_before_cropping.pkl). The heatmap generation model and the cancer classification models are implemented in PyTorch.

Update (2019/10/26): Our paper will be published in the IEEE Transactions on Medical Imaging!

Update (2019/08/26): We have added a TensorFlow implementation of our image-wise model.

Update (2019/06/21): We have included the image-wise model as described in the paper that generates predictions based on a single mammogram image. This model slightly under-performs the view-wise model used above, but can be used on single mammogram images as opposed to full exams.

Update (2019/05/15): Fixed a minor bug that caused the output DataFrame columns (left_malignant, right_benign) to be swapped. Note that this does not affect the operation of the model.

Prerequisites

  • Python (3.6)
  • PyTorch (0.4.1)
  • torchvision (0.2.0)
  • NumPy (1.14.3)
  • SciPy (1.0.0)
  • H5py (2.7.1)
  • imageio (2.4.1)
  • pandas (0.22.0)
  • tqdm (4.19.8)
  • opencv-python (3.4.2)

License

This repository is licensed under the terms of the GNU AGPLv3 license.

How to run the code

Exam-level

Here we describe how to get predictions from the view-wise model, which is our best-performing model. This model takes 4 images, one from each view, as input and outputs predictions for each exam.

bash run.sh

will automatically run the entire pipeline and save the prediction results in csv.

We recommend running the code with a GPU (set by default). To run the code on CPU only, change DEVICE_TYPE in run.sh to 'cpu'.

If running the individual Python scripts, please include the path to this repository in your PYTHONPATH.

You should obtain the following outputs for the sample exams provided in the repository.

Predictions using image-only model (found in sample_output/image_predictions.csv by default):

index left_benign right_benign left_malignant right_malignant
0 0.0580 0.0754 0.0091 0.0179
1 0.0646 0.9536 0.0012 0.7258
2 0.4388 0.3526 0.2325 0.1061
3 0.3765 0.6483 0.0909 0.2579

Predictions using image-and-heatmaps model (found in sample_output/imageheatmap_predictions.csv by default):

index left_benign right_benign left_malignant right_malignant
0 0.0612 0.0555 0.0099 0.0063
1 0.0507 0.8025 0.0009 0.9000
2 0.2877 0.2286 0.2524 0.0461
3 0.4181 0.3172 0.3174 0.0485

Single Image

We also provide the image-wise model, which is different from, and performs worse than, the view-wise model described above. The CSV output of the view-wise model will differ from the output of the image-wise model in this section. Because the image-wise model creates predictions for each image separately, we make it public to facilitate transfer learning.

To use the image-wise model, run a command such as the following:

bash run_single.sh "sample_data/images/0_L_CC.png" "L-CC"

where the first argument is the path to a mammogram image, and the second argument is the view corresponding to that image.

You should obtain the following output based on the above example command:

Stage 1: Crop Mammograms
Stage 2: Extract Centers
Stage 3: Generate Heatmaps
Stage 4a: Run Classifier (Image)
{"benign": 0.040191903710365295, "malignant": 0.008045293390750885}
Stage 4b: Run Classifier (Image+Heatmaps)
{"benign": 0.052365876734256744, "malignant": 0.005510155577212572}
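
If the output above is captured as text, the two prediction lines can be pulled out with the standard json module. The sketch below assumes the output format is exactly as shown (a "Stage ..." header followed by one JSON line); parse_predictions is a hypothetical helper, not part of the repository.

```python
import json

# Output lines copied from the example run above.
captured_output = """Stage 4a: Run Classifier (Image)
{"benign": 0.040191903710365295, "malignant": 0.008045293390750885}
Stage 4b: Run Classifier (Image+Heatmaps)
{"benign": 0.052365876734256744, "malignant": 0.005510155577212572}"""

def parse_predictions(text):
    """Map each 'Stage ...' header to the JSON prediction printed after it."""
    results = {}
    current_stage = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Stage"):
            current_stage = line
        elif line.startswith("{") and current_stage is not None:
            results[current_stage] = json.loads(line)
    return results

predictions = parse_predictions(captured_output)
for stage, scores in predictions.items():
    print(stage, scores["benign"], scores["malignant"])
```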

Image-level Notebook

We have included a sample notebook that contains code for running the classifiers with and without heatmaps (excludes preprocessing).

Data

To use one of the pretrained models, the input is required to consist of at least four images, at least one for each view (L-CC, L-MLO, R-CC, R-MLO).

The original 12-bit mammograms are saved as rescaled 16-bit images to preserve the granularity of the pixel intensities, while still being correctly displayed in image viewers.
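
To illustrate what rescaling from 12 to 16 bits means, here is a minimal sketch of a linear mapping from the range 0..4095 to 0..65535. The exact formula used when the sample images were produced is not stated here, so the rounding rule below is an assumption, and rescale_12bit_to_16bit is a hypothetical helper.

```python
def rescale_12bit_to_16bit(value):
    """Linearly map a 12-bit intensity (0..4095) to the 16-bit range (0..65535)."""
    if not 0 <= value <= 4095:
        raise ValueError("expected a 12-bit value in [0, 4095]")
    # Integer arithmetic with rounding to nearest; 4095 maps exactly to 65535.
    return (value * 65535 + 4095 // 2) // 4095

print(rescale_12bit_to_16bit(0), rescale_12bit_to_16bit(4095))
```

Because the mapping is strictly increasing, distinct 12-bit intensities stay distinct after rescaling, which is what preserves the granularity of the pixel values.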

sample_data/exam_list_before_cropping.pkl contains a list of exam information before preprocessing. Each exam is represented as a dictionary with the following format:

{
  'horizontal_flip': 'NO',
  'L-CC': ['0_L_CC'],
  'R-CC': ['0_R_CC'],
  'L-MLO': ['0_L_MLO'],
  'R-MLO': ['0_R_MLO'],
}

We expect images from L-CC and L-MLO views to face right, and images from R-CC and R-MLO views to face left. horizontal_flip indicates whether all images in the exam are flipped horizontally from the expected orientation. Values for L-CC, R-CC, L-MLO, and R-MLO are lists of image filenames without extension and directory name.
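
As a rough illustration of the horizontal_flip flag, the sketch below mirrors an image back to the expected orientation when the exam is marked as flipped. Only the value 'NO' appears in the sample data; the value 'YES' and the helper name undo_horizontal_flip are assumptions made for this example.

```python
def undo_horizontal_flip(image_rows, horizontal_flip):
    """Mirror each row when the exam is marked as flipped; otherwise return a copy."""
    if horizontal_flip not in ("YES", "NO"):
        raise ValueError("horizontal_flip is assumed to be 'YES' or 'NO'")
    if horizontal_flip == "YES":
        return [list(reversed(row)) for row in image_rows]
    return [list(row) for row in image_rows]

# A 2x3 toy image stored as a list of rows.
tiny_image = [[1, 2, 3],
              [4, 5, 6]]
print(undo_horizontal_flip(tiny_image, "YES"))
```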

Additional information for each image is included as a dictionary. Each such dictionary has all 4 views as keys, and the values are the additional information for the corresponding view. For example, window_location, which indicates the top, bottom, left and right edges of the cropping window, is a dictionary that has 4 keys and 4 lists as values, which contain the corresponding information for the images. Additionally, rightmost_points, bottommost_points, distance_from_starting_side and best_center are added after preprocessing. Descriptions of these attributes can be found in the preprocessing section. The following is an example of exam information after cropping and extracting optimal centers:

{
  'horizontal_flip': 'NO',
  'L-CC': ['0_L_CC'],
  'R-CC': ['0_R_CC'],
  'L-MLO': ['0_L_MLO'],
  'R-MLO': ['0_R_MLO'],
  'window_location': {
    'L-CC': [(353, 4009, 0, 2440)],
    'R-CC': [(71, 3771, 952, 3328)],
    'L-MLO': [(0, 3818, 0, 2607)],
    'R-MLO': [(0, 3724, 848, 3328)]
   },
  'rightmost_points': {
    'L-CC': [((1879, 1958), 2389)],
    'R-CC': [((2207, 2287), 2326)],
    'L-MLO': [((2493, 2548), 2556)],
    'R-MLO': [((2492, 2523), 2430)]
   },
  'bottommost_points': {
    'L-CC': [(3605, (100, 100))],
    'R-CC': [(3649, (101, 106))],
    'L-MLO': [(3767, (1456, 1524))],
    'R-MLO': [(3673, (1164, 1184))]
   },
  'distance_from_starting_side': {
    'L-CC': [0],
    'R-CC': [0],
    'L-MLO': [0],
    'R-MLO': [0]
   },
  'best_center': {
    'L-CC': [(1850, 1417)],
    'R-CC': [(2173, 1354)],
    'L-MLO': [(2279, 1681)],
    'R-MLO': [(2185, 1555)]
   }
}
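
The nested layout above (view, then a list with one entry per image) can be flattened into one record per image. The sketch below uses an abbreviated copy of the example exam; iter_image_records is a hypothetical helper. In practice the exam list would presumably be read with pickle.load from the exam list file rather than written inline.

```python
VIEWS = ('L-CC', 'R-CC', 'L-MLO', 'R-MLO')

# One exam entry, abbreviated from the example above.
exam = {
    'horizontal_flip': 'NO',
    'L-CC': ['0_L_CC'],
    'R-CC': ['0_R_CC'],
    'L-MLO': ['0_L_MLO'],
    'R-MLO': ['0_R_MLO'],
    'window_location': {
        'L-CC': [(353, 4009, 0, 2440)],
        'R-CC': [(71, 3771, 952, 3328)],
        'L-MLO': [(0, 3818, 0, 2607)],
        'R-MLO': [(0, 3724, 848, 3328)],
    },
    'best_center': {
        'L-CC': [(1850, 1417)],
        'R-CC': [(2173, 1354)],
        'L-MLO': [(2279, 1681)],
        'R-MLO': [(2185, 1555)],
    },
}

def iter_image_records(exam):
    """Yield one flat record per image, pairing each filename with its metadata."""
    for view in VIEWS:
        for index, short_file_path in enumerate(exam[view]):
            yield {
                'view': view,
                'short_file_path': short_file_path,
                'window_location': exam['window_location'][view][index],
                'best_center': exam['best_center'][view][index],
            }

records = list(iter_image_records(exam))
for record in records:
    print(record['short_file_path'], record['best_center'])
```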

The labels for the included exams are as follows:

index left_benign right_benign left_malignant right_malignant
0 0 0 0 0
1 0 0 0 1
2 1 0 0 0
3 1 1 1 1
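
As a small worked example, the labels above can be placed next to the image-only predictions listed earlier. The sketch ranks the four sample exams by the larger of their two malignant probabilities. Taking the maximum over the two breasts is an illustrative choice made here, not the evaluation protocol of the paper, and four exams are far too few to draw conclusions from.

```python
COLUMNS = ("left_benign", "right_benign", "left_malignant", "right_malignant")

# Image-only predictions and labels, copied from the tables in this document.
image_only_predictions = {
    0: (0.0580, 0.0754, 0.0091, 0.0179),
    1: (0.0646, 0.9536, 0.0012, 0.7258),
    2: (0.4388, 0.3526, 0.2325, 0.1061),
    3: (0.3765, 0.6483, 0.0909, 0.2579),
}
labels = {
    0: (0, 0, 0, 0),
    1: (0, 0, 0, 1),
    2: (1, 0, 0, 0),
    3: (1, 1, 1, 1),
}

def malignancy_score(row):
    """Exam-level score: the larger of the two per-breast malignant values."""
    values = dict(zip(COLUMNS, row))
    return max(values["left_malignant"], values["right_malignant"])

ranked = sorted(
    image_only_predictions,
    key=lambda index: malignancy_score(image_only_predictions[index]),
    reverse=True,
)
for index in ranked:
    print(index, malignancy_score(image_only_predictions[index]), malignancy_score(labels[index]))
```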

Pipeline

The pipeline consists of four stages.

  1. Crop mammograms
  2. Calculate optimal centers
  3. Generate Heatmaps
  4. Run classifiers

The following variables defined in run.sh can be modified as needed:

  • NUM_PROCESSES: The number of processes to be used in preprocessing (src/cropping/crop_mammogram.py and src/optimal_centers/get_optimal_centers.py). Default: 10.

  • DEVICE_TYPE: Device type to use in heatmap generation and classifiers, either 'cpu' or 'gpu'. Default: 'gpu'

  • NUM_EPOCHS: The number of epochs to be averaged in the output of the classifiers. Default: 10.

  • HEATMAP_BATCH_SIZE: The batch size to use in heatmap generation. Default: 100.

  • GPU_NUMBER: Specify which one of the GPUs to use when multiple GPUs are available. Default: 0.

  • DATA_FOLDER: The directory where the mammograms are stored.

  • INITIAL_EXAM_LIST_PATH: The path where the initial exam list without any metadata is stored.

  • PATCH_MODEL_PATH: The path where the saved weights for the patch classifier are stored.

  • IMAGE_MODEL_PATH: The path where the saved weights for the image-only model are stored.

  • IMAGEHEATMAPS_MODEL_PATH: The path where the saved weights for the image-and-heatmaps model are stored.

  • CROPPED_IMAGE_PATH: The directory to save cropped mammograms.

  • CROPPED_EXAM_LIST_PATH: The path to save the new exam list with cropping metadata.

  • EXAM_LIST_PATH: The path to save the new exam list with best center metadata.

  • HEATMAPS_PATH: The directory to save heatmaps.

  • IMAGE_PREDICTIONS_PATH: The path to save predictions of image-only model.

  • IMAGEHEATMAPS_PREDICTIONS_PATH: The path to save predictions of image-and-heatmaps model.

Preprocessing

Run the following commands to crop mammograms and calculate information about augmentation windows.

Crop mammograms

python3 src/cropping/crop_mammogram.py \
    --input-data-folder $DATA_FOLDER \
    --output-data-folder $CROPPED_IMAGE_PATH \
    --exam-list-path $INITIAL_EXAM_LIST_PATH  \
    --cropped-exam-list-path $CROPPED_EXAM_LIST_PATH  \
    --num-processes $NUM_PROCESSES

src/cropping/crop_mammogram.py crops the mammogram around the breast and discards the background in order to improve image loading time and the time to run the segmentation algorithm. It saves each cropped image to $CROPPED_IMAGE_PATH/short_file_path.png using h5py. In addition, it adds information for each image and writes a new exam list to $CROPPED_EXAM_LIST_PATH, discarding images which it fails to crop. The optional --verbose argument prints out information about each image. The additional information includes the following:

  • window_location: location of the cropping window relative to the original DICOM image, so that the segmentation map can be cropped in the same way for training.
  • rightmost_points: rightmost nonzero pixels after the image has been correctly flipped.
  • bottommost_points: bottommost nonzero pixels after the image has been correctly flipped.
  • distance_from_starting_side: records whether a zero-valued gap is found between the edge of the image and the breast on the side where the breast starts to appear, where there should be no gap. Depending on the dataset, this value can be used to detect a wrong value of horizontal_flip.
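
To show how window_location could be reused on another array of the same size as the original image (for example a segmentation map), here is a minimal sketch. It assumes the tuple is ordered (top, bottom, left, right) as described in the Data section and that the bottom and right bounds are exclusive, as in Python slicing; crop_with_window is a hypothetical helper.

```python
def crop_with_window(rows, window_location):
    """Crop a 2D image (list of rows) with a (top, bottom, left, right) window."""
    top, bottom, left, right = window_location
    # Bounds are treated as end-exclusive, matching Python slice semantics.
    return [row[left:right] for row in rows[top:bottom]]

# 4x5 toy "segmentation map"; a real map would match the original image size.
toy_map = [[r * 10 + c for c in range(5)] for r in range(4)]
cropped = crop_with_window(toy_map, (1, 3, 0, 2))
print(cropped)
```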

Calculate optimal centers

python3 src/optimal_centers/get_optimal_centers.py \
    --cropped-exam-list-path $CROPPED_EXAM_LIST_PATH \
    --data-prefix $CROPPED_IMAGE_PATH \
    --output-exam-list-path $EXAM_LIST_PATH \
    --num-processes $NUM_PROCESSES

src/optimal_centers/get_optimal_centers.py outputs a new exam list with additional metadata to $EXAM_LIST_PATH. The additional information includes the following:

  • best_center: optimal center point of the window for each image. The augmentation windows drawn with best_center as exact center point could go outside the boundary of the image. This usually happens when the cropped image is smaller than the window size. In this case, we pad the image and shift the window to be inside the padded image in augmentation. Refer to the data report for more details.
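
The pad-and-shift idea described for best_center can be sketched along a single axis. This is only an illustration under stated assumptions: padding is split evenly between both sides, and the window is clamped to the padded axis. The actual procedure is the one in the data report; place_window_1d and the toy sizes are hypothetical.

```python
def place_window_1d(image_length, window_length, center):
    """Return (pad_before, pad_after, start) for one axis.

    `start` is the first index of the window in padded coordinates.
    """
    # Pad only when the image is shorter than the window along this axis.
    total_pad = max(window_length - image_length, 0)
    pad_before = total_pad // 2
    pad_after = total_pad - pad_before
    padded_length = image_length + total_pad
    # Ideal start puts `center` in the middle of the window.
    start = center + pad_before - window_length // 2
    # Shift the window so that it lies fully inside the padded axis.
    start = min(max(start, 0), padded_length - window_length)
    return pad_before, pad_after, start

print(place_window_1d(100, 60, 90))  # window would overflow on the far side
print(place_window_1d(50, 60, 25))   # image shorter than the window
```

Applying the same function to rows and columns separately gives a rectangular window that always fits inside the padded image.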

Heatmap Generation

python3 src/heatmaps/run_producer.py \
    --model-path $PATCH_MODEL_PATH \
    --data-path $EXAM_LIST_PATH \
    --image-path $CROPPED_IMAGE_PATH \
    --batch-size $HEATMAP_BATCH_SIZE \
    --output-heatmap-path $HEATMAPS_PATH \
    --device-type $DEVICE_TYPE \
    --gpu-number $GPU_NUMBER

src/heatmaps/run_producer.py generates heatmaps by combining predictions for patches of images and saves them in HDF5 format in $HEATMAPS_PATH using the $DEVICE_TYPE device. $DEVICE_TYPE can be either 'gpu' or 'cpu'. $HEATMAP_BATCH_SIZE should be adjusted depending on the available memory. The optional argument --gpu-number can be used to specify which GPU to use.

Running the models

src/modeling/run_model.py can provide predictions using cropped images either with or without heatmaps. When using heatmaps, please use the --use-heatmaps flag and provide the appropriate --model-path and --heatmaps-path arguments. Depending on the available memory, the optional argument --batch-size can be provided. Another optional argument, --gpu-number, can be used to specify which GPU to use.

Run image only model

python3 src/modeling/run_model.py \
    --model-path $IMAGE_MODEL_PATH \
    --data-path $EXAM_LIST_PATH \
    --image-path $CROPPED_IMAGE_PATH \
    --output-path $IMAGE_PREDICTIONS_PATH \
    --use-augmentation \
    --num-epochs $NUM_EPOCHS \
    --device-type $DEVICE_TYPE \
    --gpu-number $GPU_NUMBER

This command makes predictions only using images for $NUM_EPOCHS epochs with random augmentation and outputs averaged predictions per exam to $IMAGE_PREDICTIONS_PATH.

Run image+heatmaps model

python3 src/modeling/run_model.py \
    --model-path $IMAGEHEATMAPS_MODEL_PATH \
    --data-path $EXAM_LIST_PATH \
    --image-path $CROPPED_IMAGE_PATH \
    --output-path $IMAGEHEATMAPS_PREDICTIONS_PATH \
    --use-heatmaps \
    --heatmaps-path $HEATMAPS_PATH \
    --use-augmentation \
    --num-epochs $NUM_EPOCHS \
    --device-type $DEVICE_TYPE \
    --gpu-number $GPU_NUMBER

This command makes predictions using images and heatmaps for $NUM_EPOCHS epochs with random augmentation and outputs averaged predictions per exam to $IMAGEHEATMAPS_PREDICTIONS_PATH.
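
The averaging over augmentation epochs can be pictured with a short sketch. It assumes a plain arithmetic mean of the per-epoch probabilities for each output column; the numbers are made up for illustration and are not model output, and average_predictions is a hypothetical helper.

```python
def average_predictions(per_epoch_predictions):
    """Average a list of {column: probability} dicts, one dict per epoch."""
    if not per_epoch_predictions:
        raise ValueError("need at least one epoch of predictions")
    columns = list(per_epoch_predictions[0].keys())
    count = len(per_epoch_predictions)
    return {
        column: sum(epoch[column] for epoch in per_epoch_predictions) / count
        for column in columns
    }

# Made-up values standing in for two epochs with different random augmentations.
epochs = [
    {"left_malignant": 0.10, "right_malignant": 0.30},
    {"left_malignant": 0.20, "right_malignant": 0.50},
]
averaged = average_predictions(epochs)
print(averaged)
```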

Getting image from dicom files and saving as 16-bit png files

Dicom files can be converted into png files with the following function, which then can be used by the code in our repository (pypng 0.0.19 and pydicom 1.2.2 libraries are required).

import png
import pydicom

def save_dicom_image_as_png(dicom_filename, png_filename, bitdepth=12):
    """
    Save 12-bit mammogram from dicom as rescaled 16-bit png file.
    :param dicom_filename: path to input dicom file.
    :param png_filename: path to output png file.
    :param bitdepth: bit depth of the input image. Set it to 12 for 12-bit mammograms.
    """
    image = pydicom.read_file(dicom_filename).pixel_array
    with open(png_filename, 'wb') as f:
        writer = png.Writer(height=image.shape[0], width=image.shape[1], bitdepth=bitdepth, greyscale=True)
        writer.write(f, image.tolist())

Reference

If you found this code useful, please cite our paper:

Deep Neural Networks Improve Radiologists' Performance in Breast Cancer Screening
Nan Wu, Jason Phang, Jungkyu Park, Yiqiu Shen, Zhe Huang, Masha Zorin, Stanisław Jastrzębski, Thibault Févry, Joe Katsnelson, Eric Kim, Stacey Wolfson, Ujas Parikh, Sushma Gaddam, Leng Leng Young Lin, Kara Ho, Joshua D. Weinstein, Beatriu Reig, Yiming Gao, Hildegard Toth, Kristine Pysarenko, Alana Lewin, Jiyon Lee, Krystal Airola, Eralda Mema, Stephanie Chung, Esther Hwang, Naziya Samreen, S. Gene Kim, Laura Heacock, Linda Moy, Kyunghyun Cho, Krzysztof J. Geras
IEEE Transactions on Medical Imaging
2019

@article{wu2019breastcancer, 
    title = {Deep Neural Networks Improve Radiologists' Performance in Breast Cancer Screening},
    author = {Nan Wu and Jason Phang and Jungkyu Park and Yiqiu Shen and Zhe Huang and Masha Zorin and Stanis\l{}aw Jastrz\k{e}bski and Thibault F\'{e}vry and Joe Katsnelson and Eric Kim and Stacey Wolfson and Ujas Parikh and Sushma Gaddam and Leng Leng Young Lin and Kara Ho and Joshua D. Weinstein and Beatriu Reig and Yiming Gao and Hildegard Toth and Kristine Pysarenko and Alana Lewin and Jiyon Lee and Krystal Airola and Eralda Mema and Stephanie Chung and Esther Hwang and Naziya Samreen and S. Gene Kim and Laura Heacock and Linda Moy and Kyunghyun Cho and Krzysztof J. Geras}, 
    journal = {IEEE Transactions on Medical Imaging},
    year = {2019}
}
Issues
  • Testing on other data sets

    Hi, I was testing the model against another dataset of mammograms and was wondering whether the input image dimensions have to be exact. Your sample cropped images are (2440x3656) and (2607x3818), while ours are (1993x4396) and (2133x4906).

    P_00005, RIGHT, CC, MALIGNANT, 0.1293, 0.0123
    P_00005, RIGHT, MLO, MALIGNANT, 0.1293, 0.0123
    P_00007, LEFT, CC, BENIGN, 0.3026, 0.1753
    P_00007, LEFT, MLO, BENIGN, 0.3026, 0.1753

    As you can see, the probabilities for benign and malignant findings (respectively) are very low. Out of a dataset of 200, the model only accurately predicted about 10 of them.

    opened by Tonthatj 15
  • View the original image with heatmaps

    Hi,

    Thanks for the great contribution to mammography! I really appreciate your work. I just wonder how to view the heatmap predictions over the original image.

    I've tried to view the hdf5 file but couldn't get it quite right.

    Thanks!

    opened by ghost 7
  • prediction of a single case

    Thanks for the great contribution to mammography! I really appreciate your work. I'm using this repository for my school project, and I wonder how to use it to predict a single case (4 mammogram images) instead of 16 (the 16 images found in sample_data/images). I tried many things before asking and none of them worked: it keeps asking for the rest of the images, or raises a batch_size error.

    Thanks for reading my question

    opened by Emirbz 6
  • some images doesn't work in crop_single_mammogram.py

    Thank you for sharing your work. I'm trying to adapt your repository to our dataset.

    score_heat_list = []
    import glob

    def make_dir(name):
        if not os.path.isdir(name):
            os.makedirs(name)
            print(name, "folder has been created.")
        else:
            print("The folder already exists.")

    make_dir('save_imageheatmap_model_figure_folder')

    def json_extract_feature(json_data):
        patient = json_data['case_id']

        # read_all_data:
        """
        components
        'user id' = no
        'case_id' = split.('_')[1] = patients number
        'contour_list' = dict('image_type',dict())
        """
        temp_image_type = []
        temp_image_type1 = []
        temp_image_type2 = []
        temp_image_type3 = []

        temp_key = []

        temp_contour = []
        temp_contour1 = []
        temp_contour2 = []
        temp_contour3 = []

        for image_type in json_data['contour_list']['cancer']:
            # print(image_type)
            if image_type == 'lcc':
                temp_image_type.append(image_type)
            if image_type == 'lmlo':
                temp_image_type1.append(image_type)
            if image_type == 'rcc':
                temp_image_type2.append(image_type)
            if image_type == 'rmlo':
                temp_image_type3.append(image_type)

            for key in json_data['contour_list']['cancer'][image_type]:
                # print(key)

                for contour in json_data['contour_list']['cancer'][image_type][key]:
                    # print(contour)
                    # print(contour.get('x'))
                    # print(contour.get('y'))
                    bin_list = [contour.get('y'),contour.get('x')]
                    if image_type == 'lcc':
                        temp_contour.append(bin_list)
                    if image_type == 'lmlo':
                        temp_contour1.append(bin_list)
                    if image_type == 'rcc':
                        temp_contour2.append(bin_list)
                    elif image_type == 'rmlo':
                        temp_contour3.append(bin_list)

        return temp_image_type,temp_image_type1,temp_image_type2,temp_image_type3,temp_contour,temp_contour1,temp_contour2,temp_contour3

    from skimage import draw

    def polygon2mask(image_shape, polygon):
        """Compute a mask from polygon.

        Parameters
        ----------
        image_shape : tuple of size 2.
            The shape of the mask.
        polygon : array_like.
            The polygon coordinates of shape (N, 2) where N is
            the number of points.

        Returns
        -------
        mask : 2-D ndarray of type 'bool'.
            The mask that corresponds to the input polygon.

        Notes
        -----
        This function does not do any border checking, so that all
        the vertices need to be within the given shape.

        Examples
        --------
        >>> image_shape = (128, 128)
        >>> polygon = np.array([[60, 100], [100, 40], [40, 40]])
        >>> mask = polygon2mask(image_shape, polygon)
        >>> mask.shape
        (128, 128)
        """
        polygon = np.asarray(polygon)
        vertex_row_coords, vertex_col_coords = polygon.T
        fill_row_coords, fill_col_coords = draw.polygon(
            vertex_row_coords, vertex_col_coords, image_shape)
        mask = np.zeros(image_shape, dtype=np.bool)
        mask[fill_row_coords, fill_col_coords] = True
        return mask

    ##########################################################################
    from tqdm import tqdm
    from src.heatmaps.run_producer_single import produce_heatmaps
    import json
    from PIL import Image
    annotation_folder = r'/home/ncc/Desktop/2020_deep_learning_breastcancer/annotation_SN/'
    import pickle

    for png in tqdm(png_list[0:8]):
        print(PATH+png)
        crop_single_mammogram(PATH+png, horizontal_flip = 'NO', view = png.split('_')[1].split('.')[0],
                              cropped_mammogram_path = PATH+'cropped_image/'+png,
                              metadata_path = PATH+png.split('.')[0]+'.pkl', num_iterations = 100, buffer_size = 50)
        print(PATH+'cropped_image/'+png)
        get_optimal_center_single(PATH+'cropped_image/'+png,PATH+png.split('.')[0]+'.pkl')
        model_input = load_inputs(
            image_path=PATH+'cropped_image/'+png,
            metadata_path=PATH+png.split('.')[0]+'.pkl',
            use_heatmaps=False,
        )
        ######################################################################
        parameters = dict(
            device_type='gpu',
            gpu_number='0',

            patch_size=256,

            stride_fixed=20,
            more_patches=5,
            minibatch_size=10,
            seed=np.random.RandomState(shared_parameters["seed"]),

            initial_parameters="/home/ncc/Desktop/breastcancer/nccpatient/breast_cancer_classifier/models/sample_patch_model.p",
            input_channels=3,
            number_of_classes=4,

            cropped_mammogram_path=PATH+'cropped_image/'+png,
            metadata_path=PATH+png.split('.')[0]+'.pkl',
            heatmap_path_malignant=PATH+png.split('.')[0]+'_malignant_heatmap.hdf5',
            heatmap_path_benign=PATH+png.split('.')[0]+'_benign_heatmap.hdf5',

            heatmap_type=[0, 1],  # 0: malignant 1: benign 0: nothing

            use_hdf5="store_true"
        )
        ######################################################################

        # read annotation SN00000016_L-CC.png

        # Reading the code, the JSON file with the same name is read 4 times. This needs fixing when the code is slimmed down.
        # The annotations are relative to the original image, not the cropped image, yet the image being displayed is the cropped one.

        # print(png.split('_')[0])
        with open(PATH+png.split('.')[0]+'.pkl','rb') as f:
            location_data = pickle.load(f)
        print(location_data)
        start_point1 = list(location_data['window_location'])[0]
        endpoint1 = list(location_data['window_location'])[1]
        start_point2 = list(location_data['window_location'])[2]
        endpoint2 = list(location_data['window_location'])[3]
        print(start_point1,start_point2)
        with open(annotation_folder+'Cancer_'+png.split('_')[0]+'.json') as json_file:
            json_data = json.load(json_file)

        temp_image_type,temp_image_type1,temp_image_type2,temp_image_type3,temp_contour,temp_contour1,temp_contour2,temp_contour3 = json_extract_feature(json_data)

        import operator
        if png.split('_')[1].split('.')[0] =='L-CC':
            new_contour_list = temp_contour
        if png.split('_')[1].split('.')[0] =='L-MLO':
            new_contour_list = temp_contour1
        if png.split('_')[1].split('.')[0] =='R-CC':
            new_contour_list = temp_contour2
        if png.split('_')[1].split('.')[0] =='R-MLO':
            new_contour_list = temp_contour3

        im = Image.open(PATH+png)
        im_cropped = Image.open(PATH+'cropped_image/'+png)
        print('original image:',im.size,'cropped image:',im_cropped.size)
        new_contour = []
        for image_list in new_contour_list:
            # print('_',image_list)
            new_temp_contour =map(operator.add,image_list,reversed(list(np.array(im.size)/2)))
            new_contour.append(list(new_temp_contour))
            # print(new_contour)
        try:
            # 'window_location': (103, 2294, 0, 1041)
            img = polygon2mask(im.size[::-1],np.array(list(new_contour)))
            img_cropped = img[start_point1:endpoint1,start_point2:endpoint2]
            im = cv2.imread(PATH+png)
            im_cropped = cv2.imread(PATH+'cropped_image/'+png)
        except ValueError as e:
            img = np.zeros(im.size)

        ######################################################################
        random_number_generator = np.random.RandomState(shared_parameters["seed"])

        # random_number_generator = np.random.RandomState(shared_parameters["seed"])
        produce_heatmaps(parameters)
        image_heatmaps_parameters = shared_parameters.copy()
        image_heatmaps_parameters["view"] = png.split('_')[1].split('.')[0]
        image_heatmaps_parameters["use_heatmaps"] = True
        image_heatmaps_parameters["model_path"] = "/home/ncc/Desktop/breastcancer/nccpatient/breast_cancer_classifier/models/ImageHeatmaps__ModeImage_weights.p"

        model, device = load_model(image_heatmaps_parameters)

        model_input = load_inputs(
            image_path=PATH+'cropped_image/'+png,
            metadata_path=PATH+png.split('.')[0]+'.pkl',
            use_heatmaps=True,
            benign_heatmap_path=PATH+png.split('.')[0]+'_malignant_heatmap.hdf5',
            malignant_heatmap_path=PATH+png.split('.')[0]+'_benign_heatmap.hdf5')

        batch = [
            process_augment_inputs(
                model_input=model_input,
                random_number_generator=random_number_generator,
                parameters=image_heatmaps_parameters,
            ),
        ]

        tensor_batch = batch_to_tensor(batch, device)
        y_hat = model(tensor_batch)
        ###############################################################
        fig, axes = plt.subplots(1, 5, figsize=(16, 4))
        x = tensor_batch[0].cpu().numpy()
        axes[0].imshow(im, cmap="gray")
        axes[0].imshow(img, cmap = 'autumn', alpha = 0.4)
        axes[0].set_title("OG_Image")

        axes[1].imshow(im_cropped, cmap="gray")
        axes[1].imshow(img_cropped, cmap = 'autumn', alpha = 0.4)
        axes[1].set_title("Image")

        axes[2].imshow(x[0], cmap="gray")
        axes[2].imshow(img_cropped, cmap = 'autumn', alpha = 0.4)
        axes[2].set_title("Image")

        axes[3].imshow(x[1], cmap=LinearSegmentedColormap.from_list("benign", [(0, 0, 0), (0, 1, 0)]))
        axes[3].set_title("Benign Heatmap")

        axes[4].imshow(x[2], cmap=LinearSegmentedColormap.from_list("malignant", [(0, 0, 0), (1, 0, 0)]))
        axes[4].set_title("Malignant Heatmap")
        plt.savefig('save_imageheatmap_model_figure_folder'+'/'+png.split('.')[0]+'.png')
        ################################################################
        predictions = np.exp(y_hat.cpu().detach().numpy())[:, :2, 1]
        predictions_dict = {
            "image" : png,
            "benign": float(predictions[0][0]),
            "malignant": float(predictions[0][1]),
        }

        print(predictions_dict)
        score_heat_list.append(predictions_dict)
    

    Screenshot from 2020-11-25 11-38-47

    The attached file is a cropped mammogram produced by this code. The issue is that some mammograms don't crop well. Am I doing something wrong?

    opened by nightandweather 5
  • Fixed requirements.txt

    Pillow requirement needed an extra =, and the numpy version was updated to fix:

    pandas 1.1.1 requires numpy>=1.15.4, but you'll have numpy 1.14.5 which is incompatible.
    
    opened by JimmyWhitaker 4
  • RuntimeError: Error(s) in loading state_dict for SplitBreastModel:

    With DEVICE_TYPE = 'cpu', I am getting the error below while running the 'Stage 4a: Run Classifier (Image)' stage.

    Traceback (most recent call last):
      File "src/modeling/run_model.py", line 238, in <module>
        main()
      File "src/modeling/run_model.py", line 233, in main
        parameters=parameters,
      File "src/modeling/run_model.py", line 188, in load_run_save
        model, device = load_model(parameters)
      File "src/modeling/run_model.py", line 51, in load_model
        model.load_state_dict(torch.load(parameters["model_path"])["model"])
      File "/usr/local/envs/py3env/lib/python3.5/site-packages/torch/nn/modules/module.py", line 839, in load_state_dict
        self.__class__.__name__, "\n\t".join(error_msgs)))
    RuntimeError: Error(s) in loading state_dict for SplitBreastModel:
        Missing key(s) in state_dict: "fc1_cc.weight", "fc1_cc.bias", "fc1_mlo.weight", "fc1_mlo.bias", "output_layer_cc.fc_layer.weight", "output_layer_cc.fc_layer.bias", "output_layer_mlo.fc_layer.weight", "output_layer_mlo.fc_layer.bias".
        Unexpected key(s) in state_dict: "fc1_lcc.weight", "fc1_lcc.bias", "fc1_rcc.weight", "fc1_rcc.bias", "fc1_lmlo.weight", "fc1_lmlo.bias", "fc1_rmlo.weight", "fc1_rmlo.bias", "output_layer_lcc.fc_layer.weight", "output_layer_lcc.fc_layer.bias", "output_layer_rcc.fc_layer.weight", "output_layer_rcc.fc_layer.bias", "output_layer_lmlo.fc_layer.weight", "output_layer_lmlo.fc_layer.bias", "output_layer_rmlo.fc_layer.weight", "output_layer_rmlo.fc_layer.bias".
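
    The missing and unexpected keys in an error like this are two set differences between the parameter names the model defines and the names stored in the checkpoint. The sketch below reproduces that comparison with a subset of the key names from the traceback; compare_state_dict_keys is a hypothetical helper. With PyTorch, the two lists would presumably come from model.state_dict().keys() and torch.load(path)["model"].keys(). The names suggest the checkpoint stores separate layers per side (lcc, rcc) while the model expects shared ones (cc, mlo), but that reading is an inference from the key names and is not confirmed in the issue.

```python
def compare_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, mirroring the error message."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)      # defined by the model, absent from the file
    unexpected = sorted(checkpoint_keys - model_keys)   # present in the file, unknown to the model
    return missing, unexpected

# Subset of the key names reported in the traceback above.
model_keys = ["fc1_cc.weight", "fc1_cc.bias", "fc1_mlo.weight", "fc1_mlo.bias"]
checkpoint_keys = [
    "fc1_lcc.weight", "fc1_lcc.bias", "fc1_rcc.weight", "fc1_rcc.bias",
    "fc1_lmlo.weight", "fc1_lmlo.bias", "fc1_rmlo.weight", "fc1_rmlo.bias",
]

missing, unexpected = compare_state_dict_keys(model_keys, checkpoint_keys)
print("Missing:", missing)
print("Unexpected:", unexpected)
```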

    opened by sanjaykhobragade 4
  • Predict on DDSM

    Firstly, thanks for your great work on mammogram classification. Recently, I tried to run your model on public datasets (DDSM and CBIS-DDSM), but I found that the result is always predicted as BENIGN. Below is a sample case for your reference.

    Predicted by model (image only):

    left_benign right_benign left_malignant right_malignant
    0.2456 0.3804 0.0131 0.0716
    0.1495 0.5369 0.0180 0.1072
    0.1644 0.1658 0.0338 0.0284
    0.1821 0.3101 0.0121 0.0585

    Ground truth:

    left_benign right_benign left_malignant right_malignant
    1 1 0 0
    0 0 1 1
    0 0 1 1
    0 0 0 0

    I know about the imbalance issue described in #9, so I selected 2 obvious MALIGNANT cases and 1 obvious BENIGN case, and did all the preprocessing described in #9 and the dataset report. But the result is still predicted as BENIGN.

    So, have you evaluated the released model (in your code: breast_cancer_classifier/models/sample_image_model.p) on DDSM or INBreast? The other question I have is: what is the difference between DDSM and your private dataset? Thanks

    opened by Chunlwu 4
  • Issue with image_extension when parameter use-hdf5 is used

    There is an issue in run_producer with image_extension when use-hdf5 is added as a parameter in run.sh.

    Traceback:

    Traceback (most recent call last):
      File "src/heatmaps/run_producer.py", line 392, in <module>
        main()
      File "src/heatmaps/run_producer.py", line 388, in main
        produce_heatmaps(model, device, parameters)
      File "src/heatmaps/run_producer.py", line 344, in produce_heatmaps
        making_heatmap_with_large_minibatch_potential(parameters, model, exam_list, device)
      File "src/heatmaps/run_producer.py", line 270, in making_heatmap_with_large_minibatch_potential
        all_patches, all_cases = sample_patches(exam, parameters)
      File "src/heatmaps/run_producer.py", line 223, in sample_patches
        parameters=parameters,
      File "src/heatmaps/run_producer.py", line 240, in sample_patches_single
        parameters,
      File "src/heatmaps/run_producer.py", line 102, in ori_image_prepare
        image = loading.load_image(image_path, view, horizontal_flip)
      File "src/data_loading/loading.py", line 59, in load_image
        image = read_image_mat(image_path)
      File "src/utilities/reading_images.py", line 37, in read_image_mat
        data = h5py.File(file_name, 'r')
      File "env_nyukat/lib/python3.6/site-packages/h5py/_hl/files.py", line 312, in __init__
        fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
      File "env_nyukat/lib/python3.6/site-packages/h5py/_hl/files.py", line 142, in make_fid
        fid = h5f.open(name, flags, fapl=fapl)
      File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
      File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
      File "h5py/h5f.pyx", line 78, in h5py.h5f.open
    OSError: Unable to open file (unable to open file: name = 'sample_output/cropped_images/0_L_CC.hdf5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
    
    

    The issue seems to go away when the file extension is hard-coded here:

    import os

    def get_image_path(short_file_path, parameters):
        """
        Convert short_file_path to full file path
        """
        # Hard-code the extension (with its leading dot) instead of using image_extension
        return os.path.join(parameters['original_image_path'], short_file_path + '.png')
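
As an alternative to hard-coding a single extension, the path lookup could try several extensions and return the first file that exists. This is a minimal sketch under stated assumptions: `resolve_image_path` is a hypothetical helper, not a function in the repository, and the candidate extensions `.png` and `.hdf5` are taken from the file names that appear in this issue.

```python
import os


def resolve_image_path(directory, short_file_path, extensions=(".png", ".hdf5")):
    """Return the first existing file named short_file_path + extension.

    Hypothetical helper for illustration; extensions are tried in order.
    """
    for extension in extensions:
        candidate = os.path.join(directory, short_file_path + extension)
        if os.path.isfile(candidate):
            return candidate
    raise FileNotFoundError(
        "No file for {!r} in {!r} with extensions {}".format(
            short_file_path, directory, list(extensions)
        )
    )
```

A call such as `resolve_image_path("sample_output/cropped_images", "0_L_CC")` would then pick up a PNG file even when the HDF5 option is switched on, and would raise a descriptive error when neither file is present.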
    

    The use-hdf5 parameter was probably not intended to be used at all, but it is listed in run_producer. Keeping it does allow the script to be modified so that the heatmaps are also saved in PNG format (e.g. for visualization purposes) by adding the following here:

    saving_images.save_image_as_png(img_as_ubyte(heatmap_malignant), os.path.join(
            parameters['save_heatmap_path'][0], 
            short_file_path + '.png'
        ))
    saving_images.save_image_as_png(img_as_ubyte(heatmap_benign), os.path.join(
            parameters['save_heatmap_path'][1],
            short_file_path + '.png'
        ))
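
For readers who only want to look at a heatmap, the conversion from values in [0, 1] to 8-bit grey levels can be sketched without any third-party library. The code below is an illustration, not the repository's saving routine: `to_uint8` and `write_pgm` are hypothetical helpers, the heatmap is assumed to be a plain 2-D list of floats, and the output is written as a binary PGM file instead of PNG so that only the standard library is needed.

```python
def to_uint8(heatmap):
    """Scale a 2-D list of floats in [0, 1] to integers in [0, 255].

    Values outside [0, 1] are clipped before scaling.
    """
    return [
        [int(min(max(value, 0.0), 1.0) * 255 + 0.5) for value in row]
        for row in heatmap
    ]


def write_pgm(path, pixels):
    """Write 8-bit grey-level pixels (a 2-D list of ints) as a binary PGM (P5) file."""
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    with open(path, "wb") as handle:
        handle.write("P5\n{} {}\n255\n".format(width, height).encode("ascii"))
        for row in pixels:
            handle.write(bytes(row))


if __name__ == "__main__":
    example = [[0.0, 0.5, 1.0], [0.25, 0.75, 1.0]]
    write_pgm("heatmap_example.pgm", to_uint8(example))
```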
    

    There is a somewhat similar issue in run_model with image_extension when use-hdf5 is added as a parameter in run.sh.

    Traceback:

    Traceback (most recent call last):
      File "src/modeling/run_model.py", line 238, in <module>
        main()
      File "src/modeling/run_model.py", line 233, in main
        parameters=parameters,
      File "src/modeling/run_model.py", line 189, in load_run_save
        predictions = run_model(model, device, exam_list, parameters)
      File "src/modeling/run_model.py", line 82, in run_model
        horizontal_flip=datum["horizontal_flip"],
      File "src/data_loading/loading.py", line 59, in load_image
        image = read_image_mat(image_path)
      File "src/utilities/reading_images.py", line 37, in read_image_mat
        data = h5py.File(file_name, 'r')
      File "env_nyukat/lib/python3.6/site-packages/h5py/_hl/files.py", line 312, in __init__
        fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
      File "env_nyukat/lib/python3.6/site-packages/h5py/_hl/files.py", line 142, in make_fid
        fid = h5f.open(name, flags, fapl=fapl)
      File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
      File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
      File "h5py/h5f.pyx", line 78, in h5py.h5f.open
    OSError: Unable to open file (unable to open file: name = 'sample_output/cropped_images/0_L_CC.hdf5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
    
    

    Perhaps the safest solution is to hard-code the correct file extension here:

    loaded_image = loading.load_image(
        image_path=os.path.join(parameters["image_path"], short_file_path + ".png"),
        view=view,
        horizontal_flip=datum["horizontal_flip"],
    )
    
    opened by aisosalo 4
  • OSError: Unable to open file (unable to open file: name = 'sample_output/heatmaps/heatmap_benign/0_L_CC.hdf5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

    Hi!

    Thank you for a great project.

    I'm trying to run the project on a K80 on Google Cloud using their provided PyTorch image.

    The model fails when trying to create the heatmaps with the following error.

    I've tried creating the missing directories and reading through the source code.

    Any help would be greatly appreciated.

    Stage 1: Crop Mammograms
    Error: the directory to save cropped images already exists.
    Stage 2: Extract Centers
    Stage 3: Generate Heatmaps
    Traceback (most recent call last):
      File "src/heatmaps/run_producer.py", line 29, in <module>
        import tensorflow as tf
    ModuleNotFoundError: No module named 'tensorflow'
    Stage 4a: Run Classifier (Image)
    100%|██████████| 4/4 [00:19<00:00, 5.10s/it]
    Stage 4b: Run Classifier (Image+Heatmaps)
      0%|          | 0/4 [00:00<?, ?it/s]
    Traceback (most recent call last):
      File "src/modeling/run_model.py", line 195, in <module>
        main()
      File "src/modeling/run_model.py", line 190, in main
        parameters=parameters,
      File "src/modeling/run_model.py", line 148, in load_run_save
        predictions = run_model(model, exam_list, parameters)
      File "src/modeling/run_model.py", line 80, in run_model
        horizontal_flip=datum["horizontal_flip"],
      File "/home/birgermoell/chexnet/breast_cancer_classifier/src/data_loading/loading.py", line 72, in load_heatmaps
        benign_heatmap = load_image(benign_heatmap_path, view, horizontal_flip)
      File "/home/birgermoell/chexnet/breast_cancer_classifier/src/data_loading/loading.py", line 59, in load_image
        image = read_image_mat(image_path)
      File "/home/birgermoell/chexnet/breast_cancer_classifier/src/utilities/reading_images.py", line 37, in read_image_mat
        data = h5py.File(file_name, 'r')
      File "/opt/anaconda3/lib/python3.7/site-packages/h5py/_hl/files.py", line 312, in __init__
        fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
      File "/opt/anaconda3/lib/python3.7/site-packages/h5py/_hl/files.py", line 142, in make_fid
        fid = h5f.open(name, flags, fapl=fapl)
      File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
      File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
      File "h5py/h5f.pyx", line 78, in h5py.h5f.open
    OSError: Unable to open file (unable to open file: name = 'sample_output/heatmaps/heatmap_benign/0_L_CC.hdf5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
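
The log in this issue suggests that Stage 3 stopped at `import tensorflow`, so no heatmaps were written and Stage 4b then failed when it tried to open them. A check that runs before the pipeline could surface the missing dependency earlier. The sketch below is only an illustration: `missing_modules` is a hypothetical helper, not part of the repository, and the module names in the usage example are simply the ones that appear in tracebacks on this page, not a verified dependency list.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that the current interpreter cannot find."""
    return [
        name
        for name in module_names
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    # Assumed list, based on modules named in the tracebacks shown on this page.
    required = ["tensorflow", "h5py", "imageio"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```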

    opened by BirgerMoell 4
  • Import Error: No module named 'imageIO'

    Hi everyone, I have downloaded the files, and when I try to run run.sh the program breaks at each stage when importing reading_images.py at line 27: ImportError: No module named 'imageio'. I am doing this through the command line. To check whether it is installed, I ran pip install imageio and it said the requirement was already satisfied. I then tried to see whether the problem is the path to Python by running python3 import imageio in the same directory, but that command doesn't throw an error.
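
A report like this often comes down to pip and the script using different Python installations. The sketch below prints which interpreter is running and whether it can locate a given module; it needs to be run with the same interpreter that run.sh invokes. `describe_module` is a hypothetical helper written for this illustration.

```python
import importlib.util
import sys


def describe_module(module_name):
    """Report the running interpreter and where (if anywhere) it finds module_name."""
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        "python_version": sys.version.split()[0],
        "module": module_name,
        "found": spec is not None,
        "location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    for key, value in describe_module("imageio").items():
        print("{}: {}".format(key, value))
```

If the module is reported as not found, installing it with `python3 -m pip install imageio`, using that same interpreter, places the package where this interpreter looks for it.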

    opened by Tonthatj 3
  • Bump pillow from 6.2.2 to 8.3.2

    Bumps pillow from 6.2.2 to 8.3.2.

    Release notes

    Sourced from pillow's releases.

    8.3.2

    https://pillow.readthedocs.io/en/stable/releasenotes/8.3.2.html

    Security

    • CVE-2021-23437 Raise ValueError if color specifier is too long [hugovk, radarhere]

    • Fix 6-byte OOB read in FliDecode [wiredfool]

    Python 3.10 wheels

    • Add support for Python 3.10 #5569, #5570 [hugovk, radarhere]

    Fixed regressions

    • Ensure TIFF RowsPerStrip is multiple of 8 for JPEG compression #5588 [kmilos, radarhere]

    • Updates for ImagePalette channel order #5599 [radarhere]

    • Hide FriBiDi shim symbols to avoid conflict with real FriBiDi library #5651 [nulano]

    8.3.1

    https://pillow.readthedocs.io/en/stable/releasenotes/8.3.1.html

    Changes

    8.3.0

    https://pillow.readthedocs.io/en/stable/releasenotes/8.3.0.html

    Changes

    ... (truncated)

    Changelog

    Sourced from pillow's changelog.

    8.3.2 (2021-09-02)

    • CVE-2021-23437 Raise ValueError if color specifier is too long [hugovk, radarhere]

    • Fix 6-byte OOB read in FliDecode [wiredfool]

    • Add support for Python 3.10 #5569, #5570 [hugovk, radarhere]

    • Ensure TIFF RowsPerStrip is multiple of 8 for JPEG compression #5588 [kmilos, radarhere]

    • Updates for ImagePalette channel order #5599 [radarhere]

    • Hide FriBiDi shim symbols to avoid conflict with real FriBiDi library #5651 [nulano]

    8.3.1 (2021-07-06)

    • Catch OSError when checking if fp is sys.stdout #5585 [radarhere]

    • Handle removing orientation from alternate types of EXIF data #5584 [radarhere]

    • Make Image.__array__ take optional dtype argument #5572 [t-vi, radarhere]

    8.3.0 (2021-07-01)

    • Use snprintf instead of sprintf. CVE-2021-34552 #5567 [radarhere]

    • Limit TIFF strip size when saving with LibTIFF #5514 [kmilos]

    • Allow ICNS save on all operating systems #4526 [baletu, radarhere, newpanjing, hugovk]

    • De-zigzag JPEG's DQT when loading; deprecate convert_dict_qtables #4989 [gofr, radarhere]

    • Replaced xml.etree.ElementTree #5565 [radarhere]

    ... (truncated)

    Commits
    • 8013f13 8.3.2 version bump
    • 23c7ca8 Update CHANGES.rst
    • 8450366 Update release notes
    • a0afe89 Update test case
    • 9e08eb8 Raise ValueError if color specifier is too long
    • bd5cf7d FLI tests for Oss-fuzz crash.
    • 94a0cf1 Fix 6-byte OOB read in FliDecode
    • cece64f Add 8.3.2 (2021-09-02) [CI skip]
    • e422386 Add release notes for Pillow 8.3.2
    • 08dcbb8 Pillow 8.3.2 supports Python 3.10 [ci skip]
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 0