Tooling for the Common Objects In 3D dataset.

CO3D: Common Objects In 3D

This repository contains a set of tools for working with the Common Objects in 3D (CO3D) dataset.

Download the dataset

The dataset can be downloaded from the following Facebook AI Research web page: https://ai.facebook.com/datasets/co3d-downloads/

Installation

This is a Python 3 / PyTorch codebase.

  1. Install PyTorch.
  2. Install PyTorch3D.
  3. Install the remaining dependencies in requirements.txt:
pip install lpips visdom tqdm

Note that the core data model in dataset/types.py is independent of PyTorch and can be imported and used with other machine-learning frameworks.
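
The data model in dataset/types.py is a set of plain Python dataclasses plus a JSON (de)serializer. Below is a minimal sketch of loading per-category frame annotations without PyTorch, assuming the layout DATASET_ROOT_FOLDER/<category>/frame_annotations.jgz (the file name and the image.path field are assumptions; check dataset/types.py in your checkout):

    import gzip
    from typing import List

    from dataset import types

    # Hypothetical category/path; substitute your own DATASET_ROOT_FOLDER.
    annotation_file = "DATASET_ROOT_FOLDER/apple/frame_annotations.jgz"

    # load_dataclass parses the JSON contents into typed dataclasses.
    with gzip.open(annotation_file, "rt", encoding="utf8") as f:
        frame_annotations = types.load_dataclass(f, List[types.FrameAnnotation])

    print(len(frame_annotations), frame_annotations[0].image.path)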

Getting started

  1. Install dependencies - See Installation above.
  2. Download the dataset (see the download link above) to a root folder DATASET_ROOT_FOLDER.
  3. In dataset/dataset_zoo.py, set the DATASET_ROOT variable to your DATASET_ROOT_FOLDER (a loading sketch follows this list):
    dataset_zoo.py:25: DATASET_ROOT = DATASET_ROOT_FOLDER
    
  4. Run eval_demo.py:
    python eval_demo.py
    
    Note that eval_demo.py runs an evaluation of a simple depth-based image rendering (DBIR) model on the same data as in the paper. Hence, the results are directly comparable to the numbers reported in the paper.
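
Once DATASET_ROOT points at your data, a split can also be loaded programmatically. The sketch below mirrors the dataset_zoo call that appears in the comments further down; treat the arguments and the returned split keys as assumptions and check dataset/dataset_zoo.py for the actual defaults:

    from dataset.dataset_zoo import dataset_zoo

    # "co3d_multisequence" and the "train" key mirror usage quoted in the
    # comments section; the category is an arbitrary example.
    datasets = dataset_zoo("co3d_multisequence", "DATASET_ROOT_FOLDER", "apple")
    train_dataset = datasets["train"]
    print(f"{len(train_dataset)} frames in the train split")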

Running tests

Unit tests can be executed with:

python -m unittest

License

The CO3D codebase is released under the BSD License.

Overview video

The following presentation of the dataset was delivered at the Extreme Vision Workshop at CVPR 2021: Overview

Comments
  • link list file is broken

    opened by orweiser-code 11
  • Automating download for remote machines

    Hi, thanks for the dataset!!

    I can download it on my local machine by clicking on the links at https://ai.facebook.com/datasets/co3d-downloads/. But this is a bit tedious for remote machines.

    Is there a recommended way for automating the download for the entire dataset onto a remote machine?

    Thanks in advance!

    opened by shubham-goel 11
  • Depth map and intrinsic

    Thanks for your amazing dataset!

    I encountered some weird results (shown below) when back-projecting the depth map to generate a point cloud. The intrinsic matrix is obtained as described by @liuyuan in issue #4, and the depth map comes directly from car/106_12650_23736/depths/frame000001.jpg.geometric.png. It seems the intrinsic matrix does not match the depth map.

    Can you give me some quick advice or references?

    (attached GIF showing the back-projection result)

    opened by jzhzhang 9
  • The near and far bounds of camera

    Thanks for releasing the dataset! I plan to train a NeRF model on CO3D using the implementation in PyTorch3D, but I am having trouble choosing the near and far bounds of the camera. Can you provide some advice on how to calculate the two values, or give an experimental reference? (One possible derivation is sketched after this comment.)

    opened by w2kun 7
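
    One possible derivation (a hedged sketch, not an official recommendation): transform the sequence point cloud into the camera's view coordinates and take its depth extent with a margin. It assumes a PyTorch3D camera and an (N, 3) tensor of world-space points:

    import torch

    def near_far_from_point_cloud(camera, points, margin=0.1):
        # points: (N, 3) world-space points; camera: a PyTorch3D CamerasBase.
        # In PyTorch3D view coordinates the camera looks down +Z, so the
        # z-coordinate is the depth.
        points_view = camera.get_world_to_view_transform().transform_points(points)
        depths = points_view[:, 2]
        near = torch.clamp(depths.min() - margin, min=1e-3)  # keep the near plane positive
        far = depths.max() + margin
        return near, far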
  • About the camera intrinsics matrix

    Hi! Thanks for your wonderful dataset! I have a question about the camera intrinsics matrix: for all the data, the principal_point is [0, 0], which is really rare for real-world cameras. Could you please explain this briefly? Thanks in advance. (A conversion sketch follows this comment.)

    opened by OasisYang 7
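
    A hedged note: the stored intrinsics appear to be in normalized device coordinates (NDC), where a centered principal point is exactly [0, 0]. The sketch below shows one common NDC-to-pixel conversion, assuming PyTorch3D's aspect-preserving NDC convention (+X left, +Y up); older CO3D/PyTorch3D releases may instead scale each axis by its own half-size, so verify against your version:

    def ndc_to_pixel_intrinsics(focal_ndc, principal_ndc, image_hw):
        # focal_ndc: (fx, fy); principal_ndc: (px, py); image_hw: (height, width).
        h, w = image_hw
        s = min(h, w) / 2.0                     # isotropic NDC-to-pixel scale
        fx = focal_ndc[0] * s
        fy = focal_ndc[1] * s
        cx = w / 2.0 - principal_ndc[0] * s     # NDC +X is left; pixel x goes right
        cy = h / 2.0 - principal_ndc[1] * s     # NDC +Y is up; pixel y goes down
        return fx, fy, cx, cy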
  • Incorrect frame-to-sequence labelling, or image files, in dataset

    Hi,

    Unless I am mistaken, there are errors in the placement of image files within the dataset. I've only observed this in the Car category, but can't rule it out elsewhere.

    For instance, consider the car sequence '336_34852_64130'. It is associated with frame indices, contiguously, from 9284 to 9385 (102 frames). However, an inspection of the image sequence reveals that about 20 of these frames come from a different sequence (see the attached screenshot for a subset; the incorrect cars begin at 9306).

    This issue does not seem to apply only to the RGB images. Depth maps and foreground probabilities (attached for frame 9307) are "correctly" linked to the RGB images, and are thus incorrect for the sequence.

    Would it be possible to look into this? Thank you

    opened by bibbygoodwin 5
  • 3D bounding boxes from point clouds

    Hello,

    Thanks for creating this awesome dataset :)

    I'm trying to make 3D bounding boxes from the point clouds using the PyTorch3D Pointclouds.get_bounding_boxes method; however, my results so far look completely off. Here is the code for transforming object point clouds into 3D bounding boxes:

    import random

    import matplotlib.pyplot as plt
    import numpy as np
    import torch

    from dataset.dataset_zoo import dataset_zoo


    def bb_vertex_from_sizes(sizes, cloud_idx=0):
        # sizes: per-cloud (3, 2) min/max extents from Pointclouds.get_bounding_boxes().
        sizes = sizes[cloud_idx]
        x_min, x_max = sizes[0]
        y_min, y_max = sizes[1]
        z_min, z_max = sizes[2]

        # Enumerate the eight corners of the axis-aligned box.
        point_0 = [x_min, y_min, z_min]
        point_1 = [x_min, y_min, z_max]
        point_2 = [x_min, y_max, z_min]
        point_3 = [x_min, y_max, z_max]
        point_4 = [x_max, y_min, z_min]
        point_5 = [x_max, y_min, z_max]
        point_6 = [x_max, y_max, z_min]
        point_7 = [x_max, y_max, z_max]

        return torch.Tensor(np.stack([point_0, point_1, point_2, point_3,
                                      point_4, point_5, point_6, point_7]))


    dataset = dataset_zoo("co3d_multisequence", "data", "cup",
                          load_point_clouds=True, test_on_train=False)
    train_ds = dataset["train"]
    n = random.randint(0, 10000)
    frame = train_ds[n]
    image = frame.image_rgb.permute(1, 2, 0).numpy()
    point_cloud = frame.sequence_point_cloud[0]

    # Project the box corners into screen space and overlay them on the image.
    bbox = bb_vertex_from_sizes(point_cloud.get_bounding_boxes())
    bbox_proj = frame.camera.transform_points_screen(bbox, image_size=image.shape[:2])
    bbox_proj = bbox_proj.int().numpy()[:, :2]

    fig, ax = plt.subplots(figsize=(15, 10))
    ax.imshow(image)
    ax.scatter(bbox_proj[:, 0], bbox_proj[:, 1])
    

    (attached renders: output_bbox, output_pcl)

    Just wanted to check in, if this at all would be possible before diving deeper into it.

    The calculated 3D vertices do not seem to be correct, even when the point clouds appear to have few outliers. I see that the point clouds sometimes do have outliers, so my approach would be to filter these out, perhaps by keeping only the points inside the object's 2D bounding box; I am not sure that is the best approach, though, and it would not fix the bounding boxes in my current implementation. (A percentile-based trim is sketched after this comment.)

    Hope you can help :)

    opened by mikkelmedm 5
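
    One simple filtering option (an assumption-laden sketch, not the dataset authors' approach) is to trim per-axis percentile outliers before computing the box:

    import torch

    def trim_outliers(points, lo=0.02, hi=0.98):
        # points: (N, 3). Keep points inside the [lo, hi] quantile box per axis.
        qlo = torch.quantile(points, lo, dim=0)
        qhi = torch.quantile(points, hi, dim=0)
        mask = ((points >= qlo) & (points <= qhi)).all(dim=1)
        return points[mask]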
  • About Figure 2

    Hi,

    Thanks for your awesome work.

    I am interested in the source data for plotting Figure 2 (left). I really like Figure 2 in the paper and would like to plot a figure in a similar style.

    Would you mind sharing the plot script?

    opened by ShoufaChen 4
  • Dataset zip files on webpage are incomplete

    Hi,

    Most of the zip files on the download page (https://ai.facebook.com/datasets/co3d-downloads/) are not the full archive and thus cannot be opened.

    I have tried downloading using the download_dataset.py script and manually from the website.

    Some of the links give HTTP 400 errors and do not download at all (e.g., broccoli, toytruck, microwave). Most of the links download an incomplete file that does not match the checksums from #12; these files cannot be opened. For example, couch is a 648 MiB file that cannot be opened. The categories that I was able to successfully download and decompress are donut, frisbee, plant, and tv.

    Is it possible to double check the links on the website or host them on an alternative file sharing site?

    Thanks in advance!

    opened by jasonyzhang 4
  • AttributeError occurred when running eval_demo.py

    Thanks for releasing the dataset!

    I tried to run eval_demo.py after installation but encountered an error. The details are as follows:

    Traceback (most recent call last):
      File "F:/Github-Projects/co3d-master/eval_demo.py", line 209, in <module>
        main()
      File "F:/Github-Projects/co3d-master/eval_demo.py", line 56, in main
        category, task=task, single_sequence_id=single_sequence_id
      File "F:/Github-Projects/co3d-master/eval_demo.py", line 110, in evaluate_dbir_for_category
        test_restrict_sequence_id=single_sequence_id,
      File "F:\Github-Projects\co3d-master\dataset\dataset_zoo.py", line 182, in dataset_zoo
        datasets[dataset] = Co3dDataset(**params)
      File "<string>", line 29, in __init__
      File "F:\Github-Projects\co3d-master\dataset\co3d_dataset.py", line 287, in __post_init__
        self._load_frames()
      File "F:\Github-Projects\co3d-master\dataset\co3d_dataset.py", line 531, in _load_frames
        zipfile, List[types.FrameAnnotation]
      File "F:\Github-Projects\co3d-master\dataset\types.py", line 132, in load_dataclass
        return _dataclass_from_dict(asdict, cls)
      File "F:\Github-Projects\co3d-master\dataset\types.py", line 152, in _dataclass_from_dict
        types = typing.get_args(typeannot)
    AttributeError: module 'typing' has no attribute 'get_args'
    

    I'm using Python 3.6.8, PyTorch 1.7.1, and PyTorch3D 0.5.0. Can you provide some advice? Thanks in advance. (A compatibility note follows this comment.)

    opened by w2kun 4
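
    A note on the error above: typing.get_args was added in Python 3.8, so the traceback is consistent with running on Python 3.6.8. A hedged compatibility sketch (the repository may simply require Python >= 3.8 instead):

    try:
        from typing import get_args, get_origin              # Python >= 3.8
    except ImportError:
        from typing_extensions import get_args, get_origin   # older interpreters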
  • Any documentation of v2?

    Hi. Thanks for great work.

    We recently tried rendering scenes from CO3D v2 and found some issues regarding the camera parameters. Although our code successfully renders scenes from CO3D v1, our model fails to render reliably on CO3D v2. We have not changed the code.

    I didn't find any documentation for v2, so I want to ask about the minor details in which CO3D v2 differs from CO3D v1. Was there any change to the camera coordinate system? Alternatively, was there any difference in the MVS step?

    opened by jeongyw12382 3
  • Wrong depth mask in the dataset

    (attached image showing an incorrect depth mask)

    Hi, many, if not all, depth masks are wrong, as shown in the attached image. I only checked the car and orange categories, but I believe the depth masks in other categories are also wrong. I also found that the depth masks in CO3D v1 are better.

    opened by zzhuolun 4
  • Statistics for v2

    Is there an updated figure (or, ideally, a table) for v2 of the dataset? (image attached)

    I'm looking to pick a category with the largest number of valid point clouds.

    opened by jatentaki 0
  • Is it possible to download a subset with depth maps and extrinsics?

    Hi,

    Thanks for your great work. Is it possible to download a subset of categories/sequences that contain accurate camera poses along with the depth maps and RGB? I downloaded the many-view test set, but most of the sequences there have no depth map, or the depth mask is wrong. I am mostly interested in training NeRF per scene.

    Thanks

    opened by PruneTruong 3
  • Is it possible to provide a user ID for each sequence?

    Hi, for each sequence, would it be possible to provide the (anonymized) ID of the user who provided/uploaded the video?

    I know this might be a big ask, but I am trying to set up an "instance retrieval" use case with the dataset. I am working under the assumption that videos uploaded by the same user have similar backgrounds, which would be useful in my project.

    Please let me know if providing such annotations would be possible. :)

    Thanks, Yash

    opened by yashbhalgat 0
  • Fixed dataset error when not loading masks

    When masks are not loaded by the Co3dDataset, fg_probability, full_path, and bbox_xywh are not set, resulting in a NameError. Fixed this by setting a default value.

    CLA Signed 
    opened by jasonyzhang 0
  • Fix camera intrinsics for crop, no resize case

    I believe there is a bug in the Co3dDataset class, in the _get_pytorch3d_camera method, in the case where the dataset is loaded with self.box_crop=True and without resizing (i.e. self.image_height is None or self.image_width is None). The bug comes from the out_size parameter not being set appropriately under these conditions.

    The code under the condition at https://github.com/facebookresearch/co3d/blob/7ee9f5ba0b87b22e1dfe92c4d2010cb14dd467a6/dataset/co3d_dataset.py#L516 returns out_size as the original image size, in cases where self.image_height is None or self.image_width is None. However, if self.box_crop=True, the out_size should be the size of the crop, not the whole image.

    Plotting the back-projected point clouds under various resize/crop settings demonstrates the error. Below, the purple point cloud is from a dataset loaded without cropping (either self.image_height=None or with some value, e.g. self.image_height=256; both correctly lead to the same back-projection, though with different numbers of points). The blue point cloud is from a cropped and resized dataset (self.box_crop=True, self.image_height=256, self.image_width=256); it back-projects to overlap appropriately with the full point cloud. The red point cloud is from the problematic combination (self.box_crop=True, self.image_height=None, self.image_width=None) and can be seen to overlap with neither of the other point clouds. (two screenshots attached)

    By contrast, the following show two views with the bugfix (the green point cloud comes from a dataset with the same crop-and-no-resize settings as the red point cloud before). The green point cloud is now fully aligned with the other back-projected point clouds. (two screenshots attached)

    I believe the no-crop-or-resize format is quite commonly used by users of the dataset, so I hope this pull request will prove useful. (The fix is sketched after this pull request.)

    Thanks for reviewing.

    CLA Signed 
    opened by bibbygoodwin 2
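
    A minimal sketch of the behavior this pull request describes (illustrative only, not the actual diff; the helper name is hypothetical):

    def effective_out_size(orig_hw, crop_hw, box_crop, image_height, image_width):
        # Resized output: the requested size wins.
        if image_height is not None and image_width is not None:
            return (image_height, image_width)
        # Cropped but not resized: the crop, not the original image, is the output size.
        if box_crop:
            return crop_hw
        # Neither cropped nor resized: the original image size.
        return orig_hw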
Owner: Facebook Research