mmsegmentation/projects/medical/2d_image/histopathology/pcam
masaaki 30e3b49b0b
[Project] Medical semantic seg dataset: Pcam (#2684)
2023-06-25 15:44:43 +08:00

PCam (PatchCamelyon)

Description

This project supports the PatchCamelyon (PCam) dataset, which can be downloaded from here.

Dataset Overview

PatchCamelyon is an image classification dataset. It consists of 327680 color images (96 x 96px) extracted from histopathologic scans of lymph node sections. Each image is annotated with a binary label indicating presence of metastatic tissue. PCam provides a new benchmark for machine learning models: bigger than CIFAR10, smaller than ImageNet, trainable on a single GPU.

Statistic Information

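| Dataset Name | Anatomical Region | Task Type | Modality | Num. Classes | Train/Val/Test Images | Train/Val/Test Labeled | Release Date | License |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PCam | thorax | segmentation | histopathology | 2 | 327680/-/- | yes/-/- | 2018 | CC0 1.0 |

| Class Name | Num. Train | Pct. Train | Num. Val | Pct. Val | Num. Test | Pct. Test |
| --- | --- | --- | --- | --- | --- | --- |
| background | 214849 | 63.77 | - | - | - | - |
| metastatic tissue | 131832 | 36.22 | - | - | - | - |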

Note:

  • Pct. is the percentage of all pixels that belong to the given class.
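
As an illustration of the pixel percentage described in the note, here is a minimal sketch that counts class ids over a set of masks. It is not the script used to produce the table. The in-memory mask representation (nested lists) and the class ids 0 and 1 are assumptions made for this example.

```python
from collections import Counter
from typing import Dict, Iterable, List

# Hypothetical in-memory mask: a list of rows, each row a list of class ids.
Mask = List[List[int]]


def pixel_percentages(masks: Iterable[Mask],
                      class_names: Dict[int, str]) -> Dict[str, float]:
    """Return, per class, the percentage of all pixels labeled with that class."""
    counts: Counter = Counter()
    for mask in masks:
        for row in mask:
            counts.update(row)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels found in the given masks")
    return {
        name: round(100.0 * counts.get(class_id, 0) / total, 2)
        for class_id, name in class_names.items()
    }


if __name__ == "__main__":
    # Two toy 2x2 masks; ids 0/1 are assumed to mean background/metastatic tissue.
    toy_masks = [[[0, 0], [0, 1]], [[1, 1], [0, 0]]]
    print(pixel_percentages(toy_masks, {0: "background", 1: "metastatic tissue"}))
```

With real data, each mask would be loaded from a PNG file first (for example with pillow) before being counted.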

Visualization

pcam

Dataset Citation

@inproceedings{veeling2018rotation,
	title={Rotation equivariant CNNs for digital pathology},
	author={Veeling, Bastiaan S and Linmans, Jasper and Winkens, Jim and Cohen, Taco and Welling, Max},
	booktitle={International Conference on Medical image computing and computer-assisted intervention},
	pages={210--218},
	year={2018},
}

Prerequisites

  • Python v3.8
  • PyTorch v1.10.0
  • pillow (PIL) v9.3.0
  • scikit-learn (sklearn) v1.2.0
  • MIM v0.3.4
  • MMCV v2.0.0rc4
  • MMEngine v0.2.0 or higher
  • MMSegmentation v1.0.0rc5

All commands below require PYTHONPATH to include the project directory so that Python can locate the module files. From the pcam/ root directory, run the following line to add the current directory to PYTHONPATH:

export PYTHONPATH=`pwd`:$PYTHONPATH
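
To confirm that the export took effect, a small check like the following can be run from the same shell. This is only a sketch: the helper name `is_on_python_path` is hypothetical and not part of this project.

```python
import os
import sys
from pathlib import Path


def is_on_python_path(directory, search_path=None) -> bool:
    """Return True if `directory` is one of the entries Python searches for modules."""
    target = Path(directory).resolve()
    entries = sys.path if search_path is None else search_path
    # An empty entry in sys.path stands for the current working directory.
    return any(Path(entry or ".").resolve() == target for entry in entries)


if __name__ == "__main__":
    project_dir = os.getcwd()  # assumed to be the pcam/ root directory
    print(project_dir, "on module search path:", is_on_python_path(project_dir))
```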

Dataset Preparation

  • Download the dataset from here and decompress it to the path 'data/'.
  • Run the script "python tools/prepare_dataset.py" to format the data and arrange the folder structure as shown below.
  • Run the script "python ../../tools/split_seg_dataset.py" to split the dataset and generate train.txt, val.txt and test.txt. If the labels of the official validation and test sets cannot be obtained, train.txt and val.txt are generated by randomly splitting the training set.
mkdir data && cd data
pip install opendatalab
odl get PCam
mv ./PCam/raw/pcamv1 ./
rm -rf PCam
cd ..
python tools/prepare_dataset.py
python ../../tools/split_seg_dataset.py
  mmsegmentation
  ├── mmseg
  ├── projects
  │   ├── medical
  │   │   ├── 2d_image
  │   │   │   ├── histopathology
  │   │   │   │   ├── pcam
  │   │   │   │   │   ├── configs
  │   │   │   │   │   ├── datasets
  │   │   │   │   │   ├── tools
  │   │   │   │   │   ├── data
  │   │   │   │   │   │   ├── train.txt
  │   │   │   │   │   │   ├── val.txt
  │   │   │   │   │   │   ├── images
  │   │   │   │   │   │   │   ├── train
  │   │   │   │   │   │   │   │   ├── xxx.png
  │   │   │   │   │   │   │   │   ├── ...
  │   │   │   │   │   │   │   │   └── xxx.png
  │   │   │   │   │   │   ├── masks
  │   │   │   │   │   │   │   ├── train
  │   │   │   │   │   │   │   │   ├── xxx.png
  │   │   │   │   │   │   │   │   ├── ...
  │   │   │   │   │   │   │   │   └── xxx.png
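
After preparation, the layout above can be sanity-checked with a short script. The sketch below assumes that each image in images/train has a mask with the same file name in masks/train. That pairing convention is an assumption, and `unpaired_samples` is a hypothetical helper rather than part of the project's tools.

```python
import tempfile
from pathlib import Path
from typing import Set, Tuple


def unpaired_samples(data_root, split: str = "train") -> Tuple[Set[str], Set[str]]:
    """Return (images without a mask, masks without an image), by file stem."""
    root = Path(data_root)
    image_stems = {p.stem for p in (root / "images" / split).glob("*.png")}
    mask_stems = {p.stem for p in (root / "masks" / split).glob("*.png")}
    return image_stems - mask_stems, mask_stems - image_stems


if __name__ == "__main__":
    # Demonstration on a throwaway directory that mimics the layout above.
    with tempfile.TemporaryDirectory() as tmp:
        for sub, names in (("images", ["a", "b"]), ("masks", ["a"])):
            folder = Path(tmp) / sub / "train"
            folder.mkdir(parents=True)
            for name in names:
                (folder / f"{name}.png").touch()
        missing_masks, missing_images = unpaired_samples(tmp)
        print("images without mask:", missing_masks)
        print("masks without image:", missing_images)
```

On the prepared dataset, calling `unpaired_samples("data")` from the pcam/ directory should return two empty sets if the pairing assumption holds.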

Divided Dataset Information

Note: The split shown in the table below was created by us; it is not an official split.

| Class Name | Num. Train | Pct. Train | Num. Val | Pct. Val | Num. Test | Pct. Test |
| --- | --- | --- | --- | --- | --- | --- |
| background | 171948 | 63.82 | 42901 | 63.6 | - | - |
| metastatic tissue | 105371 | 36.18 | 26461 | 36.4 | - | - |
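
The counts above correspond to roughly an 80/20 train/val split of the training set. The following sketch shows one way such a random split could be produced. It is not the actual `split_seg_dataset.py`; the 0.2 ratio, the fixed seed, and the one-name-per-line file format are assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def random_split(names: Sequence[str], val_ratio: float = 0.2,
                 seed: int = 0) -> Tuple[List[str], List[str]]:
    """Shuffle sample names reproducibly and return (train, val) lists."""
    if not 0.0 < val_ratio < 1.0:
        raise ValueError("val_ratio must be strictly between 0 and 1")
    shuffled = sorted(names)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)
    num_val = int(round(len(shuffled) * val_ratio))
    return shuffled[num_val:], shuffled[:num_val]


def write_list(path: str, names: Sequence[str]) -> None:
    """Write one sample name per line (assumed format for train.txt / val.txt)."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(names) + "\n")


if __name__ == "__main__":
    # Placeholder sample names, not real PCam file names.
    train, val = random_split([f"sample_{i:03d}" for i in range(10)])
    print(len(train), "train samples,", len(val), "val samples")
```

Fixing the seed keeps the split reproducible across runs, which matters when results from different configs are compared on the same validation set.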

Training commands

To train a model on a single server with one GPU (the default):

mim train mmseg ./configs/${CONFIG_FILE}

Testing commands

To test a model on a single server with one GPU (the default):

mim test mmseg ./configs/${CONFIG_FILE} --checkpoint ${CHECKPOINT_PATH}
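
When the training and testing commands above are launched from a script, it can help to assemble them in one place. The sketch below only builds the argument list, using exactly the arguments shown above. The helper `build_mim_command`, the config name and the checkpoint path are hypothetical placeholders.

```python
import shlex
from typing import List, Optional


def build_mim_command(mode: str, config_file: str,
                      checkpoint: Optional[str] = None) -> List[str]:
    """Build the `mim train` / `mim test` argument list for an mmseg config."""
    if mode not in ("train", "test"):
        raise ValueError("mode must be 'train' or 'test'")
    command = ["mim", mode, "mmseg", config_file]
    if mode == "test":
        if checkpoint is None:
            raise ValueError("testing requires a checkpoint path")
        command += ["--checkpoint", checkpoint]
    return command


if __name__ == "__main__":
    # Placeholder paths; substitute a real config from ./configs and a real checkpoint.
    print(shlex.join(build_mim_command("train", "./configs/my_config.py")))
    print(shlex.join(build_mim_command("test", "./configs/my_config.py",
                                       checkpoint="path/to/checkpoint.pth")))
```

The returned list can be passed directly to `subprocess.run(command, check=True)`, which avoids shell quoting issues in paths.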

Checklist

  • Milestone 1: PR-ready, and acceptable to be one of the projects/.

    • Finish the code
    • Basic docstrings & proper citation
    • Test-time correctness
    • A full README
  • Milestone 2: Indicates a successful model implementation.

    • Training-time correctness
  • Milestone 3: Good to be a part of our core package!

    • Type hints and docstrings
    • Unit tests
    • Code polishing
    • Metafile.yml
  • Move your modules into the core package following the codebase's file hierarchy structure.

  • Refactor your modules into the core package following the codebase's file hierarchy structure.