
PanNuke

Open Pan-Cancer Histology Dataset

In this work we present an experimental setup for semi-automatically obtaining exhaustive nuclei labels across 19 different tissue types, and use it to construct a large pan-cancer dataset for nuclei instance segmentation and classification with minimal sampling bias. The dataset consists of 455 visual fields, of which 312 are randomly sampled from more than 20K whole slide images at different magnifications and from multiple data sources. In total, the dataset contains 216.4K labeled nuclei, each with an instance segmentation mask. We independently pursue three separate streams to create the dataset: detection, classification, and instance segmentation. Across these streams we ensemble a total of 34 models from existing public datasets, showing that learnt knowledge can be efficiently transferred to create new datasets. All three streams are validated either on existing public benchmarks or by expert pathologists, and are finally merged and validated once more to create PanNuke, a large, comprehensive pan-cancer nuclei segmentation and detection dataset.

Items
312
Classes
20000
Labels
Last updated on January 5, 2022
Licensed under Research Only