Verbs in COCO (V-COCO) Dataset

This repository hosts the Verbs in COCO (V-COCO) dataset and associated code to evaluate models for the Visual Semantic Role Labeling (VSRL) task, as described in this technical report.


  1. Clone repository (recursively, so as to include COCO API).

    git clone --recursive https://github.com/s-gupta/v-coco.git
  2. This dataset builds on MS COCO; please download the MS-COCO images and annotations.

  3. The current V-COCO release uses only a subset of MS-COCO images (image IDs listed in data/splits/vcoco_all.ids). Use the following script to pick out the matching annotations from the COCO annotations, which allows faster loading in V-COCO.

    # Assume you cloned the repository to `VCOCO_DIR'
    cd $VCOCO_DIR
    # If you downloaded coco annotations to coco-data/annotations
    python script_pick_annotations.py coco-data/annotations
  4. Build coco/PythonAPI/pycocotools/_mask.so, cython_bbox.so.

    # Assume you cloned the repository to `VCOCO_DIR'
    cd $VCOCO_DIR/coco/PythonAPI/ && make
    cd $VCOCO_DIR && make
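
The annotation-picking idea in step 3 can be sketched roughly as follows. This is not the code of script_pick_annotations.py; it is a minimal illustration that assumes the standard COCO instances layout (top-level `images`, `annotations` and `categories` lists). The helper name `pick_annotations` and the toy data are hypothetical, and reading the IDs from data/splits/vcoco_all.ids is left out because that file's layout is not described here.

```python
import json


def pick_annotations(coco, keep_ids):
    """Return a COCO-style dict restricted to the images listed in keep_ids.

    Hypothetical helper for illustration; not part of the V-COCO code base.
    """
    keep = set(keep_ids)
    subset = dict(coco)  # shallow copy, so 'categories' etc. carry over unchanged
    subset["images"] = [im for im in coco["images"] if im["id"] in keep]
    subset["annotations"] = [
        ann for ann in coco["annotations"] if ann["image_id"] in keep
    ]
    return subset


# Toy stand-in for a COCO instances file (made-up values).
toy_coco = {
    "images": [{"id": 1, "file_name": "a.jpg"}, {"id": 2, "file_name": "b.jpg"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [0, 0, 5, 5]},
        {"id": 11, "image_id": 2, "category_id": 1, "bbox": [1, 1, 4, 4]},
    ],
    "categories": [{"id": 1, "name": "person"}],
}

subset = pick_annotations(toy_coco, keep_ids=[2])
print(json.dumps(subset["annotations"]))
```

Working from a smaller annotation file means less JSON to parse each time the dataset is loaded, which is the faster loading that step 3 refers to.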

Using the dataset

  1. An IPython notebook illustrating how to use the annotations in the dataset is available in V-COCO.ipynb.
  2. The current release of the dataset includes annotations as indicated in Table 1 in the paper. We are collecting role annotations for the 6 missing categories and will make them public shortly.


We provide evaluation code that computes agent AP and role AP, as explained in the paper.

In order to use the evaluation code, store your predictions as a pickle file (.pkl) in the following format:

[ {'image_id':        # the coco image id,
   'person_box':      #[x1, y1, x2, y2] the box prediction for the person,
   '[action]_agent':  # the score for action corresponding to the person prediction,
   '[action]_[role]': # [x1, y1, x2, y2, s], the predicted box for role and 
                      # associated score for the action-role pair.
   } ]
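
As a rough illustration of this format, the sketch below builds one prediction entry and writes it to a pickle file. The action name `hit`, the role name `instr`, the image id and all numeric values are placeholders chosen for the example; substitute the action and role names used by the dataset. Plain Python lists are used here, and whether the evaluation code expects NumPy arrays instead is not stated above, so treat that as an open assumption.

```python
import os
import pickle
import tempfile

# One entry per predicted person box (all values are made up).
detection = {
    "image_id": 12345,                                # hypothetical COCO image id
    "person_box": [10.0, 20.0, 110.0, 220.0],         # [x1, y1, x2, y2]
    "hit_agent": 0.9,                                 # score for action 'hit'
    "hit_instr": [120.0, 40.0, 180.0, 90.0, 0.7],     # [x1, y1, x2, y2, s]
}
detections = [detection]

# Write the list of entries to a .pkl file in a temporary directory.
det_file = os.path.join(tempfile.mkdtemp(), "detections.pkl")
with open(det_file, "wb") as f:
    pickle.dump(detections, f)

# Read it back to confirm the round trip.
with open(det_file, "rb") as f:
    loaded = pickle.load(f)
print(loaded[0]["hit_instr"])
```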

Assuming your detections are stored in det_file=/path/to/detections/detections.pkl, do

from vsrl_eval import VCOCOeval
vcocoeval = VCOCOeval(vsrl_annot_file, coco_file, split_file)
  # e.g. vsrl_annot_file: data/vcoco/vcoco_val.json
  #      coco_file:       data/instances_vcoco_all_2014.json
  #      split_file:      data/splits/vcoco_val.ids
vcocoeval._do_eval(det_file, ovr_thresh=0.5)

We introduce two scenarios for role AP evaluation.

  1. [Scenario 1] In this scenario, for test cases with missing role annotations, an agent-role prediction is correct if the action is correct, the overlap between the person boxes is >0.5, and the corresponding role is empty, e.g. [0,0,0,0] or [NaN,NaN,NaN,NaN]. This scenario fits roles that are missing due to occlusion.

  2. [Scenario 2] In this scenario, for test cases with missing role annotations, an agent-role prediction is correct if the action is correct and the overlap between the person boxes is >0.5 (the corresponding role is ignored). This scenario fits cases where the role falls outside the COCO categories.
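
The two rules can be sketched as a small decision function. This is an illustration of the wording above, not the logic in vsrl_eval.py: the function names are hypothetical, and the overlap is computed as a plain intersection-over-union on continuous coordinates, which is an assumption (the actual evaluation code may use a different box convention).

```python
import math


def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def role_is_empty(box):
    """True for an all-zero or all-NaN role box."""
    return all(v == 0 for v in box) or all(math.isnan(v) for v in box)


def correct_when_role_missing(action_ok, pred_person, gt_person,
                              pred_role, scenario, thresh=0.5):
    """Apply the Scenario 1 / Scenario 2 rule to a test case whose
    ground-truth role annotation is missing."""
    if not action_ok or iou(pred_person, gt_person) <= thresh:
        return False
    if scenario == 1:
        return role_is_empty(pred_role)  # role must be predicted as empty
    return True                          # scenario 2: role is ignored


gt = [0.0, 0.0, 10.0, 10.0]
pred = [0.0, 0.0, 10.0, 9.0]
print(correct_when_role_missing(True, pred, gt, [1.0, 1.0, 5.0, 5.0], scenario=1))
print(correct_when_role_missing(True, pred, gt, [1.0, 1.0, 5.0, 5.0], scenario=2))
```

The only difference between the scenarios is the final check: Scenario 1 penalizes a non-empty role prediction, while Scenario 2 accepts any role prediction once the action and person box match.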

  MS-COCO 2014 or 2017

    Hi @s-gupta, I have a question about the MS-COCO version. Since there are two versions, should I download MS-COCO 2014 or 2017?

    Also, where can I download the file 'instances_vcoco_all_2014.json'? Thanks!

  broken base url of http://mscoco.org/images/ in V-COCO.ipynb

    positive_index = np.where(vcoco['label'] == 1)[0]
    positive_index = np.random.permutation(positive_index)
    # the demo here loads images from the COCO website,
    # you can alternatively use your own local folder of COCO images.
    load_coco_image_from_web = True
    if load_coco_image_from_web:
        base_coco_url = 'http://mscoco.org/images/'
        from PIL import Image
        #import urllib, cStringIO
        import urllib.request
        # urlopen(...).read() returns bytes, so BytesIO is needed (StringIO only accepts str)
        from io import BytesIO
    cc = plt.get_cmap('hsv', lut=4)
    for i in range(5):
        id = positive_index[i]
        # load image
        coco_image = coco.loadImgs(ids=[vcoco['image_id'][id][0]])[0]
        if load_coco_image_from_web:
            coco_url = base_coco_url + str(coco_image['id'])
            #file = cStringIO.StringIO(urllib.urlopen(coco_url).read())
            file = BytesIO(urllib.request.urlopen(coco_url).read())
            im = np.asarray(Image.open(file))
        sy = 4.; sx = float(im.shape[1])/float(im.shape[0])*sy;
        fig, ax = subplot(plt, (1,1), (sy,sx)); ax.set_axis_off(); 
        # draw bounding box for agent
        draw_bbox(plt, ax, vcoco['bbox'][[id],:], edgecolor=cc(0)[:3])
        role_bbox = vcoco['role_bbox'][id,:]*1.
        role_bbox = role_bbox.reshape((-1,4))
        for j in range(1, len(vcoco['role_name'])):
            if not np.isnan(role_bbox[j,0]):
                draw_bbox(plt, ax, role_bbox[[j],:], edgecolor=cc(j)[:3])

  np.array(vsrl_data[i]['role_object_id']).reshape((len(vsrl_data[i]['role_name']),-1)).T KeyError: 'role_object_id'

    (vcoco) mona@goku:~/research/code/v-coco$ python demo_vcoco.py 
    added /home/mona/research/code/v-coco/coco/PythonAPI to pythonpath
    loading annotations into memory...
    Done (t=1.12s)
    creating index...
    index created!
    Traceback (most recent call last):
      File "demo_vcoco.py", line 28, in <module>
        vcoco_all = vu.load_vcoco('vcoco_train')
      File "/home/mona/research/code/v-coco/vsrl_utils.py", line 46, in load_vcoco
    KeyError: 'role_object_id'

    Here is the code:

    import __init__
    import vsrl_utils as vu
    import numpy as np
    import matplotlib
    import matplotlib.pyplot as plt
    def draw_bbox(plt, ax, rois, fill=False, linewidth=2, edgecolor=[1.0, 0.0, 0.0], **kwargs):
        for i in range(rois.shape[0]):
            roi = rois[i,:].astype(int)  # np.int was removed in NumPy 1.24; the builtin int is equivalent
            ax.add_patch(plt.Rectangle((roi[0], roi[1]),
                roi[2] - roi[0], roi[3] - roi[1],
                fill=False, linewidth=linewidth, edgecolor=edgecolor, **kwargs))
    ##def subplot(plt, (Y, X), (sz_y, sz_x) = (10, 10)):
    def subplot(plt, yx, sz = (10, 10)):
        (Y, X) = yx
        (sz_y, sz_x) = sz
        plt.rcParams['figure.figsize'] = (X*sz_x, Y*sz_y)
        fig, axes = plt.subplots(Y, X)
        return fig, axes
    # Load COCO annotations for V-COCO images
    coco = vu.load_coco()
    # Load the VCOCO annotations for vcoco_train image set
    vcoco_all = vu.load_vcoco('vcoco_train')
    for x in vcoco_all:
        x = vu.attach_gt_boxes(x, coco)

  Object category IDs align with neither *2014 nor *2017


    Since the dataset provides only object class IDs rather than object class names, I tried to get the class names from the MS-COCO annotations. However, the object class IDs do not align with the MS-COCO annotations: I get weird tuples like <snowboard, cup>. Is there a way to get the correct names?

    opened by kilickaya 9
  Need information about folder 'data/vcoco'

    Thank you @s-gupta for creating this project. I have several questions about the JSON data in the folder data/vcoco:

    1. Did you create the object classes and verb classes in that data yourself, using data from COCO?
    2. I want to train on your annotations using this project, so I must adjust your annotations to fit that project. Can you explain the meaning of the fields 'label', 'ann_id' and 'role_object_id' in each of your JSON files?

    Thank you for your help :)

    opened by khaerulumam42 0
Saurabh Gupta
