Tooling for the Common Objects In 3D dataset.

CO3D: Common Objects In 3D

This repository contains a set of tools for working with the Common Objects in 3D (CO3D) dataset.

Download the dataset

The dataset can be downloaded from the following Facebook AI Research web page: https://ai.facebook.com/datasets/co3d-downloads/

Installation

This is a Python 3 / PyTorch codebase.

  1. Install PyTorch.
  2. Install PyTorch3D.
  3. Install the remaining dependencies in requirements.txt:
pip install lpips visdom tqdm

Note that the core data model in dataset/types.py is independent of PyTorch and can be imported and used with other machine-learning frameworks.
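
The data model in dataset/types.py is a set of plain Python dataclasses plus a JSON (de)serializer. Below is a minimal sketch of loading per-category frame annotations without PyTorch, assuming the layout DATASET_ROOT_FOLDER/<category>/frame_annotations.jgz (the file name and the image.path field are assumptions; check dataset/types.py in your checkout):

    import gzip
    from typing import List

    from dataset import types

    # Hypothetical category/path; substitute your own DATASET_ROOT_FOLDER.
    annotation_file = "DATASET_ROOT_FOLDER/apple/frame_annotations.jgz"

    # load_dataclass parses the JSON contents into typed dataclasses.
    with gzip.open(annotation_file, "rt", encoding="utf8") as f:
        frame_annotations = types.load_dataclass(f, List[types.FrameAnnotation])

    print(len(frame_annotations), frame_annotations[0].image.path)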

Getting started

  1. Install dependencies - See Installation above.
  2. Download the dataset (see the download link above) to a root folder DATASET_ROOT_FOLDER.
  3. In dataset/dataset_zoo.py, set the DATASET_ROOT variable to your DATASET_ROOT_FOLDER (a loading sketch follows this list):
    dataset_zoo.py:25: DATASET_ROOT = DATASET_ROOT_FOLDER
    
  4. Run eval_demo.py:
    python eval_demo.py
    
    Note that eval_demo.py runs an evaluation of a simple depth-based image rendering (DBIR) model on the same data as in the paper. Hence, the results are directly comparable to the numbers reported in the paper.
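
Once DATASET_ROOT points at your data, a split can also be loaded programmatically. The sketch below mirrors the dataset_zoo call that appears in the comments further down; treat the arguments and the returned split keys as assumptions and check dataset/dataset_zoo.py for the actual defaults:

    from dataset.dataset_zoo import dataset_zoo

    # "co3d_multisequence" and the "train" key mirror usage quoted in the
    # comments section; the category is an arbitrary example.
    datasets = dataset_zoo("co3d_multisequence", "DATASET_ROOT_FOLDER", "apple")
    train_dataset = datasets["train"]
    print(f"{len(train_dataset)} frames in the train split")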

Running tests

Unit tests can be executed with:

python -m unittest

License

The CO3D codebase is released under the BSD License.

Overview video

The following presentation of the dataset was delivered at the Extreme Vision Workshop at CVPR 2021: Overview

Comments
  • link list file is broken

    opened by orweiser-code 11
  • Automating download for remote machines

    Hi, thanks for the dataset!!

    I can download it on my local machine by clicking on the links at https://ai.facebook.com/datasets/co3d-downloads/. But this is a bit tedious for remote machines.

    Is there a recommended way for automating the download for the entire dataset onto a remote machine?

    Thanks in advance!

    opened by shubham-goel 11
  • Depth map and intrinsic

    Thanks for your amazing dataset!

    I encountered some weird results (shown below) when back-projecting the depth map to generate a point cloud. The intrinsic matrix is obtained as described by @liuyuan in issue #4, and the depth map comes directly from car/106_12650_23736/depths/frame000001.jpg.geometric.png. It seems the intrinsic matrix does not match the depth map.

    Can you give me some quick advice or references?

    (attached GIF showing the back-projection result)

    opened by jzhzhang 9
  • The near and far bounds of camera

    Thanks for releasing the dataset! I plan to train a NeRF model on CO3D using the implementation in PyTorch3D, but I am having trouble choosing the near and far bounds of the camera. Can you provide some advice on how to calculate the two values, or give an experimental reference? (One possible derivation is sketched after this comment.)

    opened by w2kun 7
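
    One possible derivation (a hedged sketch, not an official recommendation): transform the sequence point cloud into the camera's view coordinates and take its depth extent with a margin. It assumes a PyTorch3D camera and an (N, 3) tensor of world-space points:

    import torch

    def near_far_from_point_cloud(camera, points, margin=0.1):
        # points: (N, 3) world-space points; camera: a PyTorch3D CamerasBase.
        # In PyTorch3D view coordinates the camera looks down +Z, so the
        # z-coordinate is the depth.
        points_view = camera.get_world_to_view_transform().transform_points(points)
        depths = points_view[:, 2]
        near = torch.clamp(depths.min() - margin, min=1e-3)  # keep the near plane positive
        far = depths.max() + margin
        return near, far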
  • About the camera intrinsics matrix

    Hi! Thanks for your wonderful dataset! I have a question about the camera intrinsics matrix: for all the data, the principal_point is [0, 0], which is really rare for real-world cameras. Could you please explain this briefly? Thanks in advance. (A conversion sketch follows this comment.)

    opened by OasisYang 7
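
    A hedged note: the stored intrinsics appear to be in normalized device coordinates (NDC), where a centered principal point is exactly [0, 0]. The sketch below shows one common NDC-to-pixel conversion, assuming PyTorch3D's aspect-preserving NDC convention (+X left, +Y up); older CO3D/PyTorch3D releases may instead scale each axis by its own half-size, so verify against your version:

    def ndc_to_pixel_intrinsics(focal_ndc, principal_ndc, image_hw):
        # focal_ndc: (fx, fy); principal_ndc: (px, py); image_hw: (height, width).
        h, w = image_hw
        s = min(h, w) / 2.0                     # isotropic NDC-to-pixel scale
        fx = focal_ndc[0] * s
        fy = focal_ndc[1] * s
        cx = w / 2.0 - principal_ndc[0] * s     # NDC +X is left; pixel x goes right
        cy = h / 2.0 - principal_ndc[1] * s     # NDC +Y is up; pixel y goes down
        return fx, fy, cx, cy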
  • Incorrect frame-to-sequence labelling, or image files, in dataset

    Hi,

    Unless I am mistaken, there are errors in the placement of image files within the dataset. I've only observed this in the Car category, but can't rule it out elsewhere.

    For instance, consider the car sequence '336_34852_64130'. It is associated with frame indices, contiguously, from 9284 to 9385 (102 frames). However, an inspection of the image sequence reveals that about 20 of these frames come from a different sequence (see the attached screenshot for a subset; the incorrect cars begin at 9306).

    This issue does not seem to apply only to the RGB images. Depth maps and foreground probabilities (attached for frame 9307) are "correctly" linked to the RGB images, and are thus incorrect for the sequence.

    Would it be possible to look into this? Thank you

    opened by bibbygoodwin 5
  • 3D bounding boxes from point clouds

    Hello,

    Thanks for creating this awesome dataset :)

    I'm trying to make 3D bounding boxes from the point clouds using the PyTorch3D Pointclouds.get_bounding_boxes method; however, my results so far look completely off. Here is the code for transforming object point clouds into 3D bounding boxes:

    import random

    import matplotlib.pyplot as plt
    import numpy as np
    import torch

    from dataset.dataset_zoo import dataset_zoo


    def bb_vertex_from_sizes(sizes, cloud_idx=0):
        # sizes: per-cloud (3, 2) min/max extents from Pointclouds.get_bounding_boxes().
        sizes = sizes[cloud_idx]
        x_min, x_max = sizes[0]
        y_min, y_max = sizes[1]
        z_min, z_max = sizes[2]

        # Enumerate the eight corners of the axis-aligned box.
        point_0 = [x_min, y_min, z_min]
        point_1 = [x_min, y_min, z_max]
        point_2 = [x_min, y_max, z_min]
        point_3 = [x_min, y_max, z_max]
        point_4 = [x_max, y_min, z_min]
        point_5 = [x_max, y_min, z_max]
        point_6 = [x_max, y_max, z_min]
        point_7 = [x_max, y_max, z_max]

        return torch.Tensor(np.stack([point_0, point_1, point_2, point_3,
                                      point_4, point_5, point_6, point_7]))


    dataset = dataset_zoo("co3d_multisequence", "data", "cup",
                          load_point_clouds=True, test_on_train=False)
    train_ds = dataset["train"]
    n = random.randint(0, 10000)
    frame = train_ds[n]
    image = frame.image_rgb.permute(1, 2, 0).numpy()
    point_cloud = frame.sequence_point_cloud[0]

    # Project the box corners into screen space and overlay them on the image.
    bbox = bb_vertex_from_sizes(point_cloud.get_bounding_boxes())
    bbox_proj = frame.camera.transform_points_screen(bbox, image_size=image.shape[:2])
    bbox_proj = bbox_proj.int().numpy()[:, :2]

    fig, ax = plt.subplots(figsize=(15, 10))
    ax.imshow(image)
    ax.scatter(bbox_proj[:, 0], bbox_proj[:, 1])
    

    (attached renders: output_bbox, output_pcl)

    Just wanted to check in, if this at all would be possible before diving deeper into it.

    The calculated 3D vertices do not seem to be correct, even when the point clouds appear to have few outliers. I see that the point clouds sometimes do have outliers, so my approach would be to filter these out, perhaps by keeping only the points inside the object's 2D bounding box; I am not sure that is the best approach, though, and it would not fix the bounding boxes in my current implementation. (A percentile-based trim is sketched after this comment.)

    Hope you can help :)

    opened by mikkelmedm 5
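
    One simple filtering option (an assumption-laden sketch, not the dataset authors' approach) is to trim per-axis percentile outliers before computing the box:

    import torch

    def trim_outliers(points, lo=0.02, hi=0.98):
        # points: (N, 3). Keep points inside the [lo, hi] quantile box per axis.
        qlo = torch.quantile(points, lo, dim=0)
        qhi = torch.quantile(points, hi, dim=0)
        mask = ((points >= qlo) & (points <= qhi)).all(dim=1)
        return points[mask]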
  • About Figure 2

    Hi,

    Thanks for your awesome work.

    I am interested in the source data for plotting Figure 2 (left). I really like Figure 2 in the paper and would like to plot a figure in a similar style.

    Would you mind sharing the plot script?

    opened by ShoufaChen 4
  • Dataset zip files on webpage are incomplete

    Hi,

    Most of the zip files on the download page (https://ai.facebook.com/datasets/co3d-downloads/) are not the full archive and thus cannot be opened.

    I have tried downloading using the download_dataset.py script and manually from the website.

    Some of the links give HTTP 400 errors and do not download at all (e.g., broccoli, toytruck, microwave). Most of the links download an incomplete file that does not match the checksums from #12; these files cannot be opened. For example, couch is a 648 MiB file that cannot be opened. The categories that I was able to successfully download and decompress are donut, frisbee, plant, and tv.

    Is it possible to double check the links on the website or host them on an alternative file sharing site?

    Thanks in advance!

    opened by jasonyzhang 4
  • AttributeError occurred when running eval_demo.py

    Thanks for releasing the dataset!

    I tried to run eval_demo.py after installation but encountered an error. The details are as follows:

    Traceback (most recent call last):
      File "F:/Github-Projects/co3d-master/eval_demo.py", line 209, in <module>
        main()
      File "F:/Github-Projects/co3d-master/eval_demo.py", line 56, in main
        category, task=task, single_sequence_id=single_sequence_id
      File "F:/Github-Projects/co3d-master/eval_demo.py", line 110, in evaluate_dbir_for_category
        test_restrict_sequence_id=single_sequence_id,
      File "F:\Github-Projects\co3d-master\dataset\dataset_zoo.py", line 182, in dataset_zoo
        datasets[dataset] = Co3dDataset(**params)
      File "<string>", line 29, in __init__
      File "F:\Github-Projects\co3d-master\dataset\co3d_dataset.py", line 287, in __post_init__
        self._load_frames()
      File "F:\Github-Projects\co3d-master\dataset\co3d_dataset.py", line 531, in _load_frames
        zipfile, List[types.FrameAnnotation]
      File "F:\Github-Projects\co3d-master\dataset\types.py", line 132, in load_dataclass
        return _dataclass_from_dict(asdict, cls)
      File "F:\Github-Projects\co3d-master\dataset\types.py", line 152, in _dataclass_from_dict
        types = typing.get_args(typeannot)
    AttributeError: module 'typing' has no attribute 'get_args'
    

    I'm using Python 3.6.8, PyTorch 1.7.1, and PyTorch3D 0.5.0. Can you provide some advice? Thanks in advance. (A compatibility note follows this comment.)

    opened by w2kun 4
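
    A note on the error above: typing.get_args was added in Python 3.8, so the traceback is consistent with running on Python 3.6.8. A hedged compatibility sketch (the repository may simply require Python >= 3.8 instead):

    try:
        from typing import get_args, get_origin              # Python >= 3.8
    except ImportError:
        from typing_extensions import get_args, get_origin   # older interpreters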
  • Any documentation of v2?

    Hi. Thanks for great work.

    We recently tried rendering scenes from CO3D v2 and found some issues regarding the camera parameters. Although our code successfully renders scenes from CO3D v1, our model fails to render reliably on CO3D v2. We have not changed the code.

    I didn't find any documentation for v2, so I want to ask about the minor details in which CO3D v2 differs from CO3D v1. Was there any change to the camera coordinate system? Alternatively, was there any difference in the MVS step?

    opened by jeongyw12382 3
  • Wrong depth mask in the dataset

    (attached image showing an incorrect depth mask)

    Hi, many, if not all, depth masks are wrong, as shown in the attached image. I only checked the car and orange categories, but I believe the depth masks in other categories are also wrong. I also found that the depth masks in CO3D v1 are better.

    opened by zzhuolun 4
  • Statistics for v2

    Is there an updated figure (or, ideally, a table) for v2 of the dataset? (image attached)

    I'm looking to pick a category with the largest number of valid point clouds.

    opened by jatentaki 0
  • Is it possible to download a subset with depth maps and extrinsics?

    Hi,

    Thanks for your great work. Is it possible to download a subset of categories/sequences that contain accurate camera poses along with the depth maps and RGB? I downloaded the many-view test set, but most of the sequences there have no depth map, or the depth mask is wrong. I am mostly interested in training NeRF per scene.

    Thanks

    opened by PruneTruong 3
  • Is it possible to provide a user ID for each sequence?

    Hi, for each sequence, would it be possible to provide the (anonymized) ID of the user who provided/uploaded the video?

    I know this might be a big ask, but I am trying to set up an "instance retrieval" use case with the dataset. I am working under the assumption that videos uploaded by the same user have similar backgrounds, which would be useful in my project.

    Please let me know if providing such annotations would be possible. :)

    Thanks, Yash

    opened by yashbhalgat 0
  • Fixed dataset error when not loading masks

    When masks are not loaded by the Co3dDataset, fg_probability, full_path, and bbox_xywh are not set, resulting in a NameError. Fixed this by setting a default value.

    CLA Signed 
    opened by jasonyzhang 0
  • Fix camera intrinsics for crop, no resize case

    I believe there is a bug in the Co3dDataset class, in the _get_pytorch3d_camera method, in the case where the dataset is loaded with self.box_crop=True and without resizing (i.e. self.image_height is None or self.image_width is None). The bug comes from the out_size parameter not being set appropriately under these conditions.

    The code under the condition at https://github.com/facebookresearch/co3d/blob/7ee9f5ba0b87b22e1dfe92c4d2010cb14dd467a6/dataset/co3d_dataset.py#L516 returns out_size as the original image size, in cases where self.image_height is None or self.image_width is None. However, if self.box_crop=True, the out_size should be the size of the crop, not the whole image.

    Plotting the back-projected point clouds under various resize/crop settings demonstrates the error. Below, the purple point cloud is from a dataset loaded without cropping (either self.image_height=None or with some value, e.g. self.image_height=256; both correctly lead to the same back-projection, though with different numbers of points). The blue point cloud is from a cropped and resized dataset (self.box_crop=True, self.image_height=256, self.image_width=256); it back-projects to overlap appropriately with the full point cloud. The red point cloud is from the problematic combination (self.box_crop=True, self.image_height=None, self.image_width=None) and can be seen to overlap with neither of the other point clouds. (two screenshots attached)

    By contrast, the following show two views with the bugfix (the green point cloud comes from a dataset with the same crop-and-no-resize settings as the red point cloud before). The green point cloud is now fully aligned with the other back-projected point clouds. (two screenshots attached)

    I believe the no-crop-or-resize format is quite commonly used by users of the dataset, so I hope this pull request will prove useful. (The fix is sketched after this pull request.)

    Thanks for reviewing.

    CLA Signed 
    opened by bibbygoodwin 2
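
    A minimal sketch of the behavior this pull request describes (illustrative only, not the actual diff; the helper name is hypothetical):

    def effective_out_size(orig_hw, crop_hw, box_crop, image_height, image_width):
        # Resized output: the requested size wins.
        if image_height is not None and image_width is not None:
            return (image_height, image_width)
        # Cropped but not resized: the crop, not the original image, is the output size.
        if box_crop:
            return crop_hw
        # Neither cropped nor resized: the original image size.
        return orig_hw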
Owner: Facebook Research