Bling's Object detection tool

chuhaojin

Last update: Nov 1, 2022

Related tags

Deep Learning BriVL-BUA-applications

Overview

BriVL for Building Applications

This repo illustrates how to build applications with the BriVL model.

This repo is re-implemented from the following projects:

Online Demo built by BriVL

This repo contains two parts:

Bounding Box Extractor: ./bbox_extractor
BriVL Feature Extractor: ./BriVL

Test this Pipeline

The test image is saved in ./bbox_extractor/feature_extractor. Run the pipeline with the following command:

python3 main.py --brivl_cfg BriVL/cfg/BriVL_cfg.yml --brivl_weights BriVL/weights/brivl-weights.pth
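The command above passes two flags. As a rough sketch of how a script could accept them, the parser below uses the flag names from the command; the help strings, the `build_parser` name, and the `required=True` choice are assumptions, since the actual parser in main.py is not shown here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the two flags used in the test command."""
    parser = argparse.ArgumentParser(description="Run the BriVL application pipeline.")
    # Flag names come from the README command; everything else is assumed.
    parser.add_argument("--brivl_cfg", required=True, help="Path to the BriVL YAML config.")
    parser.add_argument("--brivl_weights", required=True, help="Path to the BriVL checkpoint.")
    return parser


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args([
    "--brivl_cfg", "BriVL/cfg/BriVL_cfg.yml",
    "--brivl_weights", "BriVL/weights/brivl-weights.pth",
])
print(args.brivl_cfg, args.brivl_weights)
```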

Download Models

bua-caffe-frcn-r101_with_attributes.pth -> /bbox_extractor/weights
chinese-roberta-wwm-ext -> /BriVL/weights/hfl
tf_efficientnet_b5_ns-6f26d0cf.pth -> /BriVL/weights
brivl-weights.pth* -> /BriVL/weights
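Before running the pipeline, it can help to confirm that each download landed where the list above says. The sketch below assumes the listed directories are relative to the repository root; `missing_models` is a hypothetical helper, not part of the repo.

```python
from pathlib import Path
from typing import List

# Locations copied from the list above, read relative to the repo root (an assumption).
EXPECTED_MODEL_PATHS = [
    "bbox_extractor/weights/bua-caffe-frcn-r101_with_attributes.pth",
    "BriVL/weights/hfl/chinese-roberta-wwm-ext",
    "BriVL/weights/tf_efficientnet_b5_ns-6f26d0cf.pth",
    "BriVL/weights/brivl-weights.pth",
]


def missing_models(repo_root: str = ".") -> List[str]:
    """Return the expected model paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_MODEL_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    for path in missing_models():
        print(f"missing: {path}")
```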

Requirements

Python >= 3.6
PyTorch >= 1.4
CUDA >= 9.2 and cuDNN
Detectron2 <= 0.3
Transformers

Important: The version of Detectron2 should be 0.3 or below.
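The Detectron2 constraint is easy to check programmatically. The sketch below compares only the major and minor components, so any 0.3.x string would pass; that reading of "0.3 or below" is an assumption, and `version_tuple` and `detectron2_ok` are hypothetical helper names.

```python
import re
from typing import Tuple


def version_tuple(version: str) -> Tuple[int, ...]:
    """Turn a string such as '1.7.1+cu110' into a tuple of its leading integers."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def detectron2_ok(version: str) -> bool:
    """True when major.minor is 0.3 or below (assumed reading of the requirement)."""
    return version_tuple(version)[:2] <= (0, 3)


try:
    import detectron2  # only needed for the live check below

    print("Detectron2 version OK:", detectron2_ok(detectron2.__version__))
except ImportError:
    print("detectron2 is not installed")
```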

Install Pre-Built Detectron2 (Linux only)

Choose from this table to install v0.3 (Nov 2020):

CUDA	torch 1.7	torch 1.6	torch 1.5
11.0	`python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu110/torch1.7/index.html`
10.2	`python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu102/torch1.7/index.html`	`python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu102/torch1.6/index.html`	`python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu102/torch1.5/index.html`
10.1	`python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.7/index.html`	`python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.6/index.html`	`python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.5/index.html`
9.2	`python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu92/torch1.7/index.html`	`python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu92/torch1.6/index.html`	`python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu92/torch1.5/index.html`
cpu	`python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cpu/torch1.7/index.html`	`python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cpu/torch1.6/index.html`	`python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cpu/torch1.5/index.html`
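The URLs in the table follow one pattern, so the install command can be derived from the CUDA and torch versions. The sketch below encodes only the combinations that appear in the table; `wheel_index_url` and `install_command` are hypothetical helpers written for illustration.

```python
BASE_URL = "https://dl.fbaipublicfiles.com/detectron2/wheels"

# CUDA version -> torch versions listed for it in the table above.
SUPPORTED = {
    "11.0": ["1.7"],
    "10.2": ["1.7", "1.6", "1.5"],
    "10.1": ["1.7", "1.6", "1.5"],
    "9.2": ["1.7", "1.6", "1.5"],
    "cpu": ["1.7", "1.6", "1.5"],
}


def wheel_index_url(cuda: str, torch: str) -> str:
    """Build the wheel index URL for a (CUDA, torch) pair from the table."""
    if torch not in SUPPORTED.get(cuda, []):
        raise ValueError(f"no pre-built wheel listed for CUDA {cuda} with torch {torch}")
    # '10.2' becomes 'cu102'; the CPU build uses the tag 'cpu' unchanged.
    tag = "cpu" if cuda == "cpu" else "cu" + cuda.replace(".", "")
    return f"{BASE_URL}/{tag}/torch{torch}/index.html"


def install_command(cuda: str, torch: str) -> str:
    """Return the pip command in the same form as the table cells."""
    return f"python -m pip install detectron2 -f {wheel_index_url(cuda, torch)}"


print(install_command("10.2", "1.6"))
```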

More Resources

Source Code of BriVL 1.0

Model of BriVL 1.0*

Online API of BriVL 1.0

Online API of BriVL 2.0

* indicates an application is needed.

Contact

This repo is maintained by Chuhao Jin (@jinchuhao).


Comments

text_feat_extractor.py parameters

# python text_feat_extractor.py
Traceback (most recent call last):
  File "text_feat_extractor.py", line 82, in <module>
    vfe = TextFeatureExtractor(cfg_file, model_weights, gpu_id=1)
  File "text_feat_extractor.py", line 54, in __init__
    self.text_model.learnable.load_state_dict(text_model_component)
  File "/root/local/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ModuleDict:
	Unexpected key(s) in state_dict: "textencoder.backbone.bert.embeddings.position_ids".
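The error says the checkpoint holds a key that the instantiated module does not define. One possible workaround, offered here as a sketch and not as the maintainer's fix, is to drop such keys before loading. The helper below works on plain dictionaries so it runs without PyTorch; `drop_unexpected_keys` is a hypothetical name, and whether discarding `position_ids` is safe for this model is an assumption that should be verified.

```python
from typing import Dict, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def drop_unexpected_keys(
    checkpoint: Dict[str, T], expected_keys: Iterable[str]
) -> Tuple[Dict[str, T], List[str]]:
    """Split a checkpoint into entries the module expects and the names of the rest."""
    expected = set(expected_keys)
    kept = {key: value for key, value in checkpoint.items() if key in expected}
    dropped = [key for key in checkpoint if key not in expected]
    return kept, dropped


# Toy stand-ins for a real state dict; the values would normally be tensors.
checkpoint = {
    "textencoder.backbone.bert.embeddings.position_ids": [0, 1, 2],
    "textencoder.backbone.bert.embeddings.word_embeddings.weight": [0.1, 0.2],
}
expected = ["textencoder.backbone.bert.embeddings.word_embeddings.weight"]

kept, dropped = drop_unexpected_keys(checkpoint, expected)
print("dropped:", dropped)
```

With PyTorch, `expected_keys` would come from `module.state_dict().keys()` and `kept` would be passed to `module.load_state_dict(kept)`. `load_state_dict` also accepts `strict=False`, which ignores unexpected and missing keys alike, so the explicit filter is the more conservative choice.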

opened by qiao1025566574 1

tag or caption images

Thanks for your work! I want to know how to use this model to tag or caption images. As far as I can tell, this code base only provides a model to extract visual and text features. Do I need to prepare a tag list?
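One common way to tag images with a two-tower feature extractor is to embed a prepared tag list with the text encoder and rank the tags by cosine similarity to the image feature. The sketch below illustrates that idea only; the feature vectors are made-up toy values, `rank_tags` is a hypothetical helper, and nothing here is confirmed as the intended BriVL workflow.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_tags(
    image_feat: Sequence[float],
    tag_feats: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k tags with the highest similarity to the image feature."""
    scored = [(tag, cosine(image_feat, feat)) for tag, feat in tag_feats.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy 2-D features standing in for real image and text embeddings.
image_feature = [1.0, 0.0]
tag_features = {"dog": [0.9, 0.1], "car": [0.0, 1.0], "cat": [0.7, 0.7]}
print(rank_tags(image_feature, tag_features, top_k=2))
```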

opened by wwnbbd 2

[BriVL] Low recall when testing on flickr30k-cn dataset

Hi, thanks for the great work!

I tested the pretrained model for zero-shot img2text and text2img retrieval on the flickr30k-cn validation set. The bboxes are obtained as indicated in https://github.com/chuhaojin/BriVL-BUA-applications. For each image, we select only the one caption with the highest fluency score. However, recall@1 for the two tasks is only 15.93% and 13.74%, respectively. The same evaluation for ViLT reaches 73.2% and 55.0%. Have you tested on this dataset? Any comments on my results?
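For reference, recall@K in a one-caption-per-image setup like the one described can be computed from a similarity matrix as sketched below. The assumption is that query i matches candidate i, which fits the single-caption selection mentioned above; the similarity scores are toy values, and this is not the evaluation code used in the report.

```python
from typing import Sequence


def recall_at_k(similarity: Sequence[Sequence[float]], k: int = 1) -> float:
    """similarity[i][j] scores query i against candidate j; the match for query i is candidate i."""
    hits = 0
    for i, row in enumerate(similarity):
        # Indices of the k highest-scoring candidates for this query.
        top = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        hits += i in top
    return hits / len(similarity)


# Toy scores: rows are image queries, columns are caption candidates.
sim = [
    [0.9, 0.1, 0.0],
    [0.2, 0.3, 0.8],
    [0.1, 0.2, 0.7],
]
print(recall_at_k(sim, k=1), recall_at_k(sim, k=2))
```

Transposing the matrix gives the other retrieval direction (text2img from an img2text matrix, or the reverse).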

p.s. An example json file of the dataset is as follows {"sentences": [["0", "一个小男孩正在玩呼啦圈。"]], "bbox": [[78, 92, 183, 124], [179, 137, 363, 214], [68, 21, 170, 101], [73, 326, 206, 498], [338, 150, 379, 187], [0, 305, 363, 396], [105, 273, 179, 342], [30, 32, 261, 483], [89, 192, 130, 210], [12, 155, 389, 498], [173, 150, 192, 167], [17, 134, 237, 353], [10, 341, 389, 496], [90, 76, 170, 169], [29, 118, 282, 363], [17, 357, 339, 402], [129, 133, 152, 155], [6, 423, 78, 498], [97, 231, 138, 250], [74, 22, 174, 175], [165, 167, 197, 191], [34, 77, 242, 494], [316, 145, 341, 197], [33, 167, 164, 323], [294, 1, 382, 19], [199, 8, 382, 158], [15, 385, 389, 497], [1, 366, 379, 396], [179, 126, 371, 228], [204, 13, 379, 130], [57, 23, 189, 235], [59, 71, 230, 482], [55, 23, 203, 167], [44, 29, 213, 248], [61, 27, 210, 219], [32, 124, 264, 367], [44, 39, 236, 286], [18, 326, 338, 445], [198, 383, 389, 496], [61, 344, 209, 498], [95, 269, 186, 340], [46, 302, 331, 471], [19, 123, 344, 307], [11, 14, 374, 409], [31, 132, 234, 357], [20, 134, 271, 354], [16, 10, 358, 360], [32, 20, 297, 478], [39, 19, 206, 157], [2, 330, 62, 443], [29, 168, 175, 331], [153, 312, 389, 404], [2, 408, 272, 498], [0, 328, 347, 467], [317, 148, 349, 197], [35, 302, 227, 458], [38, 143, 229, 366], [11, 367, 385, 492], [191, 320, 380, 389], [323, 148, 347, 199], [61, 324, 244, 498], [79, 0, 385, 495], [47, 143, 222, 355], [6, 0, 389, 221], [0, 367, 377, 407], [0, 194, 389, 498], [103, 123, 356, 222], [14, 7, 222, 183], [20, 4, 389, 164], [0, 286, 389, 497], [14, 4, 191, 132], [21, 331, 308, 438], [59, 118, 352, 219], [70, 88, 181, 128], [0, 227, 389, 498], [4, 327, 389, 490], [0, 330, 363, 451], [15, 348, 302, 436], [126, 116, 156, 147], [48, 52, 269, 480], [17, 0, 224, 154], [34, 54, 245, 478], [8, 98, 389, 491], [24, 12, 167, 110], [17, 116, 316, 361], [32, 0, 305, 476], [4, 110, 37, 201], [48, 135, 223, 349], [14, 410, 370, 497], [38, 13, 265, 391], [51, 301, 219, 483], [54, 332, 244, 484], [22, 
127, 256, 356], [47, 172, 216, 360], [81, 92, 178, 124], [75, 82, 174, 140], [27, 150, 230, 361], [53, 20, 192, 152], [0, 269, 356, 357], [18, 2, 195, 118]], "image_id": "/export/PTM_dataset/flickr30k-cn/flickr30k-images/2954461906.jpg"} {"sentences": [["0", "妇女们正在喝酒和编织。"]], "bbox": [[74, 113, 383, 271], [451, 159, 499, 273], [6, 20, 75, 106], [5, 16, 114, 277], [0, 7, 481, 251], [434, 195, 454, 221], [353, 34, 478, 264], [217, 8, 320, 161], [287, 127, 317, 209], [376, 15, 439, 72], [28, 260, 84, 277], [163, 12, 245, 154], [333, 163, 465, 269], [115, 152, 196, 195], [147, 3, 179, 78], [440, 49, 499, 185], [293, 182, 321, 211], [198, 136, 237, 180], [241, 8, 291, 58], [325, 139, 344, 178], [394, 126, 411, 149], [2, 205, 320, 277], [1, 70, 93, 197], [210, 125, 228, 156], [123, 95, 141, 152], [146, 0, 499, 65], [162, 6, 324, 152], [167, 50, 237, 131], [16, 167, 90, 274], [51, 0, 149, 80], [0, 64, 100, 233], [111, 139, 184, 181], [385, 63, 452, 151], [230, 54, 302, 138], [378, 50, 490, 264], [18, 180, 88, 266], [54, 142, 80, 163], [65, 259, 85, 277], [6, 9, 80, 112], [162, 53, 396, 151], [177, 11, 486, 254], [397, 94, 494, 267], [121, 89, 141, 148], [5, 4, 111, 277], [165, 6, 244, 149], [423, 58, 499, 254], [336, 12, 477, 273], [338, 14, 465, 258], [83, 84, 144, 142], [119, 16, 440, 163], [293, 160, 319, 214], [9, 162, 90, 270], [9, 16, 120, 277], [441, 157, 499, 272], [111, 142, 188, 184], [164, 14, 491, 271], [15, 174, 137, 275], [7, 32, 139, 276], [5, 0, 114, 277], [347, 120, 494, 277], [4, 12, 126, 277], [213, 5, 309, 161], [429, 35, 494, 175], [88, 209, 319, 276], [140, 0, 499, 75], [222, 6, 305, 153], [6, 8, 106, 277], [340, 90, 492, 277], [108, 123, 401, 274], [95, 1, 488, 268], [434, 157, 499, 271], [347, 214, 452, 274], [114, 88, 147, 154], [157, 14, 251, 154], [48, 139, 257, 271], [194, 128, 238, 181], [80, 120, 384, 273], [169, 47, 233, 133], [170, 43, 235, 133], [346, 12, 470, 195], [54, 6, 451, 244], [12, 1, 161, 88], [67, 195, 350, 275], [345, 170, 469, 
269], [379, 23, 484, 201], [350, 213, 475, 273], [6, 13, 67, 109], [60, 85, 328, 266], [7, 2, 338, 263], [293, 127, 314, 203], [11, 11, 84, 107], [211, 13, 463, 205], [342, 79, 496, 274], [71, 15, 483, 169], [198, 132, 233, 175], [54, 104, 384, 269], [161, 9, 246, 152], [367, 181, 478, 270], [93, 1, 499, 103], [16, 190, 366, 276]], "image_id": "/export/PTM_dataset/flickr30k-cn/flickr30k-images/2314492671.jpg"}
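A record in this format can be read with the standard `json` module. The sketch below assumes one JSON object per line and boxes given as [x1, y1, x2, y2], neither of which the excerpt confirms; the sample record is shortened to two boxes, and its `image_id` is a placeholder.

```python
import json
from typing import List, Tuple

Box = Tuple[int, int, int, int]


def parse_record(line: str) -> Tuple[str, List[str], List[Box]]:
    """Parse one record into (image_id, captions, boxes)."""
    record = json.loads(line)
    # Each sentence entry is an [index, text] pair; keep only the text.
    captions = [text for _index, text in record["sentences"]]
    boxes = [tuple(box) for box in record["bbox"]]
    for x1, y1, x2, y2 in boxes:
        # Sanity check under the assumed [x1, y1, x2, y2] layout.
        if x1 > x2 or y1 > y2:
            raise ValueError(f"box corners out of order: {(x1, y1, x2, y2)}")
    return record["image_id"], captions, boxes


sample = (
    '{"sentences": [["0", "一个小男孩正在玩呼啦圈。"]], '
    '"bbox": [[78, 92, 183, 124], [179, 137, 363, 214]], '
    '"image_id": "example.jpg"}'
)
image_id, captions, boxes = parse_record(sample)
print(image_id, len(captions), len(boxes))
```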

opened by Qiulin-W 2

Owner

chuhaojin

an NLPer.

