The official homepage of the COCO-Stuff dataset.

Overview

The COCO-Stuff dataset

Holger Caesar, Jasper Uijlings, Vittorio Ferrari

COCO-Stuff example annotations

Welcome to the official homepage of the COCO-Stuff [1] dataset. COCO-Stuff augments all 164K images of the popular COCO [2] dataset with pixel-level stuff annotations. These annotations can be used for scene understanding tasks like semantic segmentation, object detection and image captioning.

Highlights

  • 164K complex images from COCO [2]
  • Dense pixel-level annotations
  • 80 thing classes, 91 stuff classes and 1 class 'unlabeled'
  • Instance-level annotations for things from COCO [2]
  • Complex spatial context between stuff and things
  • 5 captions per image from COCO [2]

Research Paper

COCO-Stuff: Thing and Stuff Classes in Context
H. Caesar, J. Uijlings, V. Ferrari,
In Computer Vision and Pattern Recognition (CVPR), 2018.
[paper][bibtex]

Versions of COCO-Stuff

  • COCO-Stuff dataset: The final version of COCO-Stuff, which is presented on this page. It includes all 164K images from COCO 2017 (train 118K, val 5K, test-dev 20K, test-challenge 20K). It covers 172 classes: 80 thing classes, 91 stuff classes and 1 class 'unlabeled'. This dataset will form the basis of all upcoming challenges.
  • COCO 2017 Stuff Segmentation Challenge: A semantic segmentation challenge on 55K images (train 40K, val 5K, test-dev 5K, test-challenge 5K) of COCO. To focus on stuff, we merged all 80 thing classes into a single class 'other'. The results of the challenge were presented at the Joint COCO and Places Recognition Workshop at ICCV 2017.
  • COCO-Stuff 10K dataset: Our first dataset, annotated by 10 in-house annotators at the University of Edinburgh. It includes 10K images from the training set of COCO. We provide a 9K/1K (train/val) split to make results comparable. The dataset includes 80 thing classes, 91 stuff classes and 1 class 'unlabeled'. This was initially presented as 91 thing classes, but is now changed to 80 thing classes, as 11 classes do not have any segmentation annotations in COCO. This dataset is a subset of all other releases.

Downloads

| Filename | Description | Size |
|---|---|---|
| train2017.zip | COCO 2017 train images (118K images) | 18 GB |
| val2017.zip | COCO 2017 val images (5K images) | 1 GB |
| stuffthingmaps_trainval2017.zip | Stuff+thing PNG-style annotations on COCO 2017 trainval | 659 MB |
| stuff_trainval2017.zip | Stuff-only COCO-style annotations on COCO 2017 trainval | 543 MB |
| annotations_trainval2017.zip | Thing-only COCO-style annotations on COCO 2017 trainval | 241 MB |
| labels.md | Indices, names, previews and descriptions of the classes in COCO-Stuff | <10 KB |
| labels.txt | Machine readable version of the label list | <10 KB |
| README.md | This readme | <10 KB |

To use this dataset you will need to download the images (18+1 GB!) and annotations of the trainval sets. To download earlier versions of this dataset, please visit the COCO 2017 Stuff Segmentation Challenge or COCO-Stuff 10K.

Caffe-compatible stuff-thing maps: We suggest using the stuffthingmaps, as they provide all stuff and thing labels in a single .png file per image. Note that the .png files are indexed images, which means they store only the label indices and are typically displayed as grayscale images. To be compatible with most Caffe-based semantic segmentation methods, thing+stuff labels cover indices 0-181 and 255 indicates the 'unlabeled' or void class.
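As a rough illustration of how such a label map might be consumed, the sketch below counts pixels per label index while skipping the void index 255. The helper names `label_histogram` and `load_label_map` are hypothetical, and reading the .png with Pillow is an assumption rather than something this repository prescribes.

```python
from collections import Counter

UNLABELED = 255  # void index in the stuffthingmaps, as stated above


def label_histogram(pixels, ignore=UNLABELED):
    """Count how often each label index occurs, skipping the void index."""
    return Counter(p for p in pixels if p != ignore)


def load_label_map(path):
    """Return the label indices of an indexed .png as a flat list of ints.

    Assumption: Pillow is installed. It is imported lazily so that the
    rest of this sketch runs without it.
    """
    from PIL import Image

    with Image.open(path) as img:
        # Indexed ("P") and 8-bit grayscale ("L") images store one byte
        # per pixel, so the raw bytes are the label indices.
        if img.mode not in ("P", "L"):
            raise ValueError(f"unexpected image mode: {img.mode}")
        return list(img.tobytes())


# Toy label map, flattened row by row (values are made up).
toy = [0, 0, 255, 105, 105, 181]
hist = label_histogram(toy)
print(sorted(hist.items()))
```

For a real annotation, `label_histogram(load_label_map(path))` would give the per-class pixel counts of one image, with `path` pointing at a file unpacked from stuffthingmaps_trainval2017.zip.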

Separate stuff and thing downloads: Alternatively, you can download the separate files for stuff and thing annotations in COCO format, which are compatible with the COCO-Stuff API. Note that the stuff annotations contain a class 'other' with index 183 that covers all non-stuff pixels.
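To make the role of the 'other' class concrete, here is a minimal sketch that collapses a combined thing+stuff id into its stuff-only counterpart. It relies on the id ranges given in the Labels section (things 1-91, stuff 92-182, unlabeled 0). The function name `to_stuff_only` is hypothetical, and keeping unlabeled pixels at 0 is an assumption, not something the annotation files are confirmed to do.

```python
OTHER = 183                        # 'other' index in the stuff-only annotations
FIRST_STUFF, LAST_STUFF = 92, 182  # stuff id range from the Labels section
UNLABELED = 0                      # COCO-style id of the 'unlabeled' class


def to_stuff_only(coco_id):
    """Map a combined thing+stuff id to a stuff-only id.

    Stuff ids are kept, every thing id becomes 'other', and unlabeled
    stays unlabeled (the last point is an assumption of this sketch).
    """
    if coco_id == UNLABELED:
        return UNLABELED
    if FIRST_STUFF <= coco_id <= LAST_STUFF:
        return coco_id
    if 1 <= coco_id < FIRST_STUFF:
        return OTHER
    raise ValueError(f"id out of range: {coco_id}")


print([to_stuff_only(i) for i in (0, 1, 91, 92, 182)])
```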

Setup

Use the following instructions to download the COCO-Stuff dataset and set up the folder structure. The instructions are for Ubuntu and require git, wget and unzip. On other operating systems the commands may differ:

# Get this repo
git clone https://github.com/nightrome/cocostuff.git
cd cocostuff

# Download everything
wget --directory-prefix=downloads http://images.cocodataset.org/zips/train2017.zip
wget --directory-prefix=downloads http://images.cocodataset.org/zips/val2017.zip
wget --directory-prefix=downloads http://calvin.inf.ed.ac.uk/wp-content/uploads/data/cocostuffdataset/stuffthingmaps_trainval2017.zip

# Unpack everything
mkdir -p dataset/images
mkdir -p dataset/annotations
unzip downloads/train2017.zip -d dataset/images/
unzip downloads/val2017.zip -d dataset/images/
unzip downloads/stuffthingmaps_trainval2017.zip -d dataset/annotations/
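After unpacking, a quick sanity check of the folder structure can catch incomplete downloads. The sketch below is one way to do that. The `train2017` and `val2017` subfolder names are assumptions about what the archives contain, and `missing_dirs` is a hypothetical helper, so adjust the list if your layout differs.

```python
import os

# Assumed layout after running the commands above from the repository root.
EXPECTED_DIRS = [
    "dataset/images/train2017",
    "dataset/images/val2017",
    "dataset/annotations/train2017",
    "dataset/annotations/val2017",
]


def missing_dirs(root="."):
    """Return the expected folders that do not exist below `root`."""
    return [d for d in EXPECTED_DIRS
            if not os.path.isdir(os.path.join(root, d))]


if __name__ == "__main__":
    missing = missing_dirs(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Folder structure looks complete.")
```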

Results

Below we present results on different releases of COCO-Stuff. If you would like to see your results here, please contact the first author.

Results on the val set of COCO-Stuff:

| Method | Source | Class accuracy | Pixel accuracy | Mean IOU | FW IOU |
|---|---|---|---|---|---|
| Deeplab VGG-16 (no CRF) [4] | [1] | 45.1% | 63.6% | 33.2% | 47.6% |

Note that the results on the 10K dataset and the full dataset are not directly comparable, as different train and val images are used. Furthermore, on the full dataset we train Deeplab for 100K iterations [1], compared to 20K iterations on the 10K dataset [1b].
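For readers unfamiliar with the four metrics in these tables, the sketch below computes them from a confusion matrix using the definitions that are common in the semantic segmentation literature. It is not taken from `evaluate_performance.py`. The function name `segmentation_metrics` is hypothetical, and averaging only over classes that occur in the ground truth is an assumption of this sketch.

```python
def segmentation_metrics(confusion):
    """Compute the four metrics from a square confusion matrix.

    confusion[g][d] is the number of pixels with ground-truth class g
    that were predicted as class d.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    gt = [sum(confusion[c]) for c in range(n)]                          # pixels per true class
    pred = [sum(confusion[g][c] for g in range(n)) for c in range(n)]   # pixels per predicted class
    tp = [confusion[c][c] for c in range(n)]                            # correctly labeled pixels

    # Only classes that occur in the ground truth contribute to the averages.
    present = [c for c in range(n) if gt[c] > 0]
    if not present:
        raise ValueError("confusion matrix contains no labeled pixels")

    iou = {c: tp[c] / (gt[c] + pred[c] - tp[c]) for c in present}
    return {
        "class_accuracy": sum(tp[c] / gt[c] for c in present) / len(present),
        "pixel_accuracy": sum(tp) / total,
        "mean_iou": sum(iou.values()) / len(present),
        # Frequency-weighted IOU: each class IOU weighted by its pixel share.
        "fw_iou": sum(gt[c] / total * iou[c] for c in present),
    }


# Toy two-class example (counts are made up).
metrics = segmentation_metrics([[3, 1], [1, 5]])
print(metrics)
```

Mean IOU treats every class equally, while FW IOU gives frequent classes more weight, which is why the two can differ substantially on a dataset with a long-tailed class distribution.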

Results on the val set of the COCO 2017 Stuff Segmentation Challenge:

We show results on the val set of the challenge. Please refer to the official leaderboard for results on the test-dev and test-challenge sets. Note that these results are not comparable to other COCO-Stuff results, as the challenge only includes a single thing class 'other'.

| Method | Source | Class accuracy | Pixel accuracy | Mean IOU | FW IOU |
|---|---|---|---|---|---|
| Inplace-ABN sync | [8] | - | - | 24.9% | - |

Results on the val set of COCO-Stuff 10K:

| Method | Source | Class accuracy | Pixel accuracy | Mean IOU | FW IOU |
|---|---|---|---|---|---|
| FCN-16s [3] | [1b] | 34.0% | 52.0% | 22.7% | - |
| Deeplab VGG-16 (no CRF) [4] | [1b] | 38.1% | 57.8% | 26.9% | - |
| FCN-8s [3] | [6] | 38.5% | 60.4% | 27.2% | - |
| SCA VGG-16 [7] | [7] | 42.5% | 61.6% | 29.1% | - |
| DAG-RNN + CRF [6] | [6] | 42.8% | 63.0% | 31.2% | - |
| DC + FCN+ [5] | [5] | 44.6% | 65.5% | 33.6% | 50.6% |
| Deeplab ResNet (no CRF) [4] | - | 45.5% | 65.1% | 34.4% | 50.4% |
| CCL ResNet-101 [10] | [10] | 48.8% | 66.3% | 35.7% | - |
| DSSPN ResNet finetune [9] | [9] | 48.1% | 69.4% | 37.3% | - |
| * OHE + DC + FCN+ [5] | [5] | 45.8% | 66.6% | 34.3% | 51.2% |
| * W2V + DC + FCN+ [5] | [5] | 45.1% | 66.1% | 34.7% | 51.0% |
| * DSSPN ResNet universal [9] | [9] | 50.3% | 70.7% | 38.9% | - |

* Results not comparable as they use external data

Labels

Label Names & Indices

To be compatible with COCO, COCO-Stuff has 91 thing classes (1-91), 91 stuff classes (92-182) and 1 class "unlabeled" (0). Note that 11 of the thing classes of COCO do not have any segmentation annotations (blender, desk, door, eye glasses, hair brush, hat, mirror, plate, shoe, street sign, window). The classes desk, door, mirror and window could be either stuff or things and therefore occur in both COCO and COCO-Stuff. To avoid confusion we add the suffix "-stuff" or "-other" to those classes in COCO-Stuff. The full list of classes and their descriptions can be found here.
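The COCO-style ids above (0-182) and the indices in the .png stuff-thing maps (0-181, with 255 for 'unlabeled') differ. The sketch below converts between the two, assuming the .png indices are simply the COCO-style ids shifted down by one. That offset is inferred from the two index ranges and is not stated explicitly in this readme, so it is worth verifying against labels.txt. Both function names are hypothetical.

```python
UNLABELED_COCO = 0    # COCO-style id of 'unlabeled'
UNLABELED_PNG = 255   # void index in the .png stuff-thing maps
MAX_COCO_ID = 182     # last stuff class id


def coco_id_to_png_index(coco_id):
    """COCO-style id (0-182) -> .png index (0-181 or 255)."""
    if coco_id == UNLABELED_COCO:
        return UNLABELED_PNG
    if 1 <= coco_id <= MAX_COCO_ID:
        return coco_id - 1   # assumed offset of one
    raise ValueError(f"COCO id out of range: {coco_id}")


def png_index_to_coco_id(index):
    """.png index (0-181 or 255) -> COCO-style id (0-182)."""
    if index == UNLABELED_PNG:
        return UNLABELED_COCO
    if 0 <= index < MAX_COCO_ID:
        return index + 1     # assumed offset of one
    raise ValueError(f".png index out of range: {index}")


print(coco_id_to_png_index(92), png_index_to_coco_id(255))
```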

Label Hierarchy

This figure shows the label hierarchy of COCO-Stuff including all stuff and thing classes: COCO-Stuff label hierarchy

Semantic Segmentation Models (stuff+things)

PyTorch model

We recommend this third-party re-implementation of Deeplab v2 in PyTorch. Contrary to our Caffe model, it supports ResNet and CRFs. The authors provide setup routines and models for COCO-Stuff 164K. Please file any issues or questions on the project's GitHub page.

Caffe model

Here we provide the Caffe-based segmentation model used in the COCO-Stuff paper. However, for users not familiar with Caffe we recommend the above PyTorch model. Before using the semantic segmentation model, please set up the dataset. The commands below download and install Deeplab (incl. Caffe), download or train the model and predictions, and evaluate the performance. The results should be the same as in the table. Due to several issues, we do not provide the Deeplab ResNet101 model, but some code for it can be found in this folder.

# Get and install Deeplab (you may need to change settings)
# We use a special version of Deeplab v2 that supports CuDNN v5, but others may work as well.
git submodule update --init models/deeplab/deeplab-v2
cd models/deeplab/deeplab-v2
cp Makefile.config.example Makefile.config
make all -j8

# Create symbolic links to the images and annotations
cd models/deeplab/cocostuff/data && ln -s ../../../../dataset/images images && ln -s ../../../../dataset/annotations annotations && cd ../../../..

# Option 1: Download the initial model
# wget --directory-prefix=models/deeplab/cocostuff/model/deeplabv2_vgg16 http://calvin.inf.ed.ac.uk/wp-content/uploads/data/cocostuffdataset/deeplabv2_vgg16_init.caffemodel

# Option 2: Download the trained model
# wget --directory-prefix=downloads http://calvin.inf.ed.ac.uk/wp-content/uploads/data/cocostuffdataset/deeplab_cocostuff_trainedmodel.zip
# unzip downloads/deeplab_cocostuff_trainedmodel.zip -d models/deeplab/cocostuff/model/deeplabv2_vgg16/model120kimages/

# Option 3: Run training & test
# cd models/deeplab && ./run_cocostuff_vgg16.sh && cd ../..

# Option 4 (fastest): Download predictions
wget --directory-prefix=downloads http://calvin.inf.ed.ac.uk/wp-content/uploads/data/cocostuffdataset/deeplab_predictions_cocostuff_val2017.zip
unzip downloads/deeplab_predictions_cocostuff_val2017.zip -d models/deeplab/cocostuff/features/deeplabv2_vgg16/model120kimages/val/fc8/

# Evaluate performance
python models/deeplab/evaluate_performance.py

The table below summarizes the files used in these instructions:

| Filename | Description | Size |
|---|---|---|
| deeplabv2_vgg16_init.caffemodel | Deeplab VGG-16 pretrained model (original link) | 152 MB |
| deeplab_cocostuff_trainedmodel.zip | Deeplab VGG-16 trained on COCO-Stuff | 286 MB |
| deeplab_predictions_cocostuff_val2017.zip | Deeplab VGG-16 predictions on COCO-Stuff | 54 MB |

Note that the Deeplab predictions need to be rotated and cropped, as shown in this script.

Annotation Tool

For the Matlab annotation tool used to annotate the initial 10K images, please refer to this repository.

Misc

References

Licensing

COCO-Stuff is a derivative work of the COCO dataset. The authors of COCO do not in any form endorse this work. Different licenses apply:

Acknowledgements

This work is supported by the ERC Starting Grant VisCul. The annotations were done by the crowdsourcing startup Mighty AI, and financed by Mighty AI and the Common Visual Data Foundation.

Contact

If you have any questions regarding this dataset, please contact us at holger-at-it-caesar.com.
