YolactEdge: Real-time Instance Segmentation on the Edge

Haotian Liu

Last update: Jan 6, 2023


Overview

██╗   ██╗ ██████╗ ██╗      █████╗  ██████╗████████╗    ███████╗██████╗  ██████╗ ███████╗
╚██╗ ██╔╝██╔═══██╗██║     ██╔══██╗██╔════╝╚══██╔══╝    ██╔════╝██╔══██╗██╔════╝ ██╔════╝
 ╚████╔╝ ██║   ██║██║     ███████║██║        ██║       █████╗  ██║  ██║██║  ███╗█████╗  
  ╚██╔╝  ██║   ██║██║     ██╔══██║██║        ██║       ██╔══╝  ██║  ██║██║   ██║██╔══╝  
   ██║   ╚██████╔╝███████╗██║  ██║╚██████╗   ██║       ███████╗██████╔╝╚██████╔╝███████╗
   ╚═╝    ╚═════╝ ╚══════╝╚═╝  ╚═╝ ╚═════╝   ╚═╝       ╚══════╝╚═════╝  ╚═════╝ ╚══════╝

YolactEdge is the first competitive instance segmentation approach that runs on small edge devices at real-time speeds. Specifically, YolactEdge runs at up to 30.8 FPS on a Jetson AGX Xavier (and 172.7 FPS on an RTX 2080 Ti) with a ResNet-101 backbone on 550x550 resolution images. This is the code for our paper.

For a real-time demo and more samples, check out our demo video.

Installation

See INSTALL.md.

Model Zoo

We provide baseline YOLACT and YolactEdge models trained on COCO and YouTube VIS (our sub-training split, with COCO joint training).

To evaluate a model, put the corresponding weights file in the ./weights directory and run one of the following commands.

YouTube VIS models:

Method	Backbone	mAP	AGX-Xavier FPS	RTX 2080 Ti FPS	weights
YOLACT	R-50-FPN	44.7	8.5	59.8	download | mirror
YolactEdge (w/o TRT)	R-50-FPN	44.2	10.5	67.0	download | mirror
YolactEdge	R-50-FPN	44.0	32.4	177.6	download | mirror
YOLACT	R-101-FPN	47.3	5.9	42.6	download | mirror
YolactEdge (w/o TRT)	R-101-FPN	46.9	9.5	61.2	download | mirror
YolactEdge	R-101-FPN	46.2	30.8	172.7	download | mirror

COCO models:

Method	Backbone	mAP	Titan Xp FPS	AGX-Xavier FPS	RTX 2080 Ti FPS	weights
YOLACT	MobileNet-V2	22.1	-	15.0	35.7	download | mirror
YolactEdge	MobileNet-V2	20.8	-	35.7	161.4	download | mirror
YOLACT	R-50-FPN	28.2	42.5	9.1	45.0	download | mirror
YolactEdge	R-50-FPN	27.0	-	30.7	140.3	download | mirror
YOLACT	R-101-FPN	29.8	33.5	6.6	36.5	download | mirror
YolactEdge	R-101-FPN	29.5	-	27.3	124.8	download | mirror
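
To put the FPS columns in perspective, the following small sketch computes the speedup of YolactEdge over the YOLACT baseline. It only uses the R-101-FPN numbers from the YouTube VIS table above; the helper name `speedup` and the dictionary layout are illustrative and not part of the code base.

```python
# FPS values copied from the YouTube VIS table (R-101-FPN rows).
YOUTUBE_VIS_R101 = {
    "YOLACT": {"AGX-Xavier": 5.9, "RTX 2080 Ti": 42.6},
    "YolactEdge": {"AGX-Xavier": 30.8, "RTX 2080 Ti": 172.7},
}


def speedup(baseline_fps: float, optimized_fps: float) -> float:
    """Return how many times faster the optimized model runs."""
    if baseline_fps <= 0:
        raise ValueError("baseline_fps must be positive")
    return optimized_fps / baseline_fps


for device in ("AGX-Xavier", "RTX 2080 Ti"):
    ratio = speedup(
        YOUTUBE_VIS_R101["YOLACT"][device],
        YOUTUBE_VIS_R101["YolactEdge"][device],
    )
    print(f"{device}: {ratio:.1f}x faster than YOLACT")
```

On these numbers the gain is roughly 5x on the Jetson AGX Xavier and roughly 4x on the RTX 2080 Ti, at a cost of about 1 mAP.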

Getting Started

Follow the installation instructions to set up the required environment for running YolactEdge.

See instructions to evaluate and train with YolactEdge.

Colab Notebook

Try out our Colab Notebook with a live demo to learn about basic usage.

If you are interested in evaluating YolactEdge with TensorRT, we provide another Colab Notebook with TensorRT environment configuration on Colab.

Evaluation

Quantitative Results

# Convert each component of the trained model to TensorRT using the optimal settings and evaluate on the YouTube VIS validation set (our split).
python3 eval.py --trained_model=./weights/yolact_edge_vid_847_50000.pth

# Evaluate on the entire COCO validation set.
python3 eval.py --trained_model=./weights/yolact_edge_54_800000.pth

# Output a COCO JSON file for the COCO test-dev. The command will create './results/bbox_detections.json' and './results/mask_detections.json' for detection and instance segmentation respectively. These files can then be submitted to the website for evaluation.
python3 eval.py --trained_model=./weights/yolact_edge_54_800000.pth --dataset=coco2017_testdev_dataset --output_coco_json

Qualitative Results

# Display qualitative results on COCO. From here on I'll use a confidence threshold of 0.3.
python eval.py --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --display

Benchmarking

# Benchmark the trained model on the COCO validation set.
# Run just the raw model on the first 1k images of the validation set
python eval.py --trained_model=weights/yolact_edge_54_800000.pth --benchmark --max_images=1000

Notes

Inference using models trained with YOLACT

If you have a model pre-trained with YOLACT and you want to take advantage of the TensorRT features of YolactEdge, simply specify --config=yolact_edge_config in the command-line options, and the code will automatically detect and convert the model weights to be compatible.

python3 eval.py --config=yolact_edge_config --trained_model=./weights/yolact_base_54_800000.pth

Inference without Calibration

If you want to run inference without calibration, you can either run with FP16-only TensorRT optimization or without TensorRT optimization, using the corresponding configs. Refer to data/config.py for examples of such configs.

# Evaluate YolactEdge with FP16-only TensorRT optimization with '--use_fp16_tensorrt' option (replace all INT8 optimization with FP16).
python3 eval.py --use_fp16_tensorrt --trained_model=./weights/yolact_edge_54_800000.pth

# Evaluate YolactEdge without TensorRT optimization with '--disable_tensorrt' option.
python3 eval.py --disable_tensorrt --trained_model=./weights/yolact_edge_54_800000.pth
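
If you launch evaluation from a script, the three inference modes can be selected programmatically. The sketch below only assembles the command lines shown above; the function `build_eval_command` and the mode names "int8", "fp16" and "none" are hypothetical helpers for illustration, not part of eval.py.

```python
import shlex

# Hypothetical mode names mapped to the eval.py flags documented above.
# "int8" stands for the default behaviour (no extra flag).
_MODE_FLAGS = {
    "int8": [],
    "fp16": ["--use_fp16_tensorrt"],
    "none": ["--disable_tensorrt"],
}


def build_eval_command(weights, mode="int8", python="python3"):
    """Return the eval.py argument list for the chosen inference mode."""
    if mode not in _MODE_FLAGS:
        raise ValueError(f"unknown mode {mode!r}, expected one of {sorted(_MODE_FLAGS)}")
    return [python, "eval.py", *_MODE_FLAGS[mode], f"--trained_model={weights}"]


cmd = build_eval_command("./weights/yolact_edge_54_800000.pth", mode="fp16")
print(shlex.join(cmd))
# The list can be passed to subprocess.run(cmd, check=True) from the repository root.
```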

Images

# Display qualitative results on the specified image.
python eval.py --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --image=my_image.png

# Process an image and save it to another file.
python eval.py --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --image=input_image.png:output_image.png

# Process a whole folder of images.
python eval.py --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --images=path/to/input/folder:path/to/output/folder
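
The --image and --images options above accept either a single path or an input:output pair. As a rough illustration of that convention (not the actual parsing code in eval.py, which may differ), a pair can be split like this; `parse_io_spec` is a hypothetical helper name.

```python
def parse_io_spec(spec):
    """Split 'input' or 'input:output' into (input, output_or_None).

    Only the first colon is treated as the separator, so this sketch
    does not handle Windows paths that contain a drive letter.
    """
    if not spec:
        raise ValueError("empty path specification")
    if ":" in spec:
        source, destination = spec.split(":", 1)
        return source, destination
    return spec, None


print(parse_io_spec("my_image.png"))
print(parse_io_spec("input_image.png:output_image.png"))
print(parse_io_spec("path/to/input/folder:path/to/output/folder"))
```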

Video

# Display a video in real-time. "--video_multiframe" will process that many frames at once for improved performance.
# If video_multiframe > 1, then the trt_batch_size should be increased to match it or surpass it. 
python eval.py --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --video_multiframe=2 --trt_batch_size 2 --video=my_video.mp4

# Display a webcam feed in real-time. If you have multiple webcams pass the index of the webcam you want instead of 0.
python eval.py --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --video_multiframe=2 --trt_batch_size 2 --video=0

# Process a video and save it to another file. This is unoptimized.
python eval.py --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --video=input_video.mp4:output_video.mp4
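
The two rules in the comments above (the TensorRT batch size must be at least video_multiframe, and a bare number passed to --video selects a webcam) can be checked before launching a long run. This is a minimal sketch; `check_video_flags` and `video_source` are hypothetical helpers, not functions from eval.py.

```python
def check_video_flags(video_multiframe, trt_batch_size):
    """Raise if the TensorRT batch size cannot hold a multi-frame batch."""
    if video_multiframe < 1:
        raise ValueError("video_multiframe must be at least 1")
    if video_multiframe > 1 and trt_batch_size < video_multiframe:
        raise ValueError(
            f"trt_batch_size ({trt_batch_size}) should be >= "
            f"video_multiframe ({video_multiframe})"
        )


def video_source(spec):
    """Interpret a --video value: digits mean a webcam index, otherwise a path."""
    return int(spec) if spec.isdigit() else spec


check_video_flags(video_multiframe=2, trt_batch_size=2)  # passes silently
print(video_source("0"))             # webcam index
print(video_source("my_video.mp4"))  # file path
```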

Use the help option to see a description of all available command line arguments:

python eval.py --help

Training

Make sure to download the entire dataset using the commands above.

To train, grab an imagenet-pretrained model and put it in ./weights.
- For Resnet101, download resnet101_reducedfc.pth from here.
- For Resnet50, download resnet50-19c8e357.pth from here.
- For MobileNetV2, download mobilenet_v2-b0353104.pth from here.
Run one of the training commands below.
- Note that you can press ctrl+c while training and it will save an *_interrupt.pth file at the current iteration.
- All weights are saved in the ./weights directory by default with the file name <config>_<epoch>_<iter>.pth.

# Trains using the base edge config with a batch size of 8 (the default).
python train.py --config=yolact_edge_config

# Resume training yolact_edge with a specific weight file and start from the iteration specified in the weight file's name.
python train.py --config=yolact_edge_config --resume=weights/yolact_edge_10_32100.pth --start_iter=-1

# Use the help option to see a description of all available command line arguments
python train.py --help
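
Resuming with --start_iter=-1 relies on the iteration being encoded in the weight file's name. The sketch below shows how such a name could be decoded, assuming the pattern config_epoch_iteration.pth seen in the examples in this README (for instance yolact_edge_10_32100.pth), with an optional _interrupt suffix. The helper `parse_weight_name` is hypothetical and is not the parser used by train.py.

```python
import re
from pathlib import Path

# Assumed pattern: <config>_<epoch>_<iteration>, optionally followed by "_interrupt".
_WEIGHT_NAME = re.compile(
    r"^(?P<config>.+)_(?P<epoch>\d+)_(?P<iteration>\d+)(?:_interrupt)?$"
)


def parse_weight_name(path):
    """Return (config, epoch, iteration) decoded from a weight file name."""
    stem = Path(path).stem  # drop the directory and the .pth extension
    match = _WEIGHT_NAME.match(stem)
    if match is None:
        raise ValueError(f"unrecognized weight file name: {path}")
    return match["config"], int(match["epoch"]), int(match["iteration"])


print(parse_weight_name("weights/yolact_edge_10_32100.pth"))
print(parse_weight_name("weights/yolact_edge_vid_847_50000.pth"))
```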

Training on video dataset

# Pre-train the image based model
python train.py --config=yolact_edge_youtubevis_config

# Train the flow (warping) module
python train.py --config=yolact_edge_vid_trainflow_config --resume=./weights/yolact_edge_youtubevis_847_50000.pth

# Fine tune the network jointly
python train.py --config=yolact_edge_vid_config --resume=./weights/yolact_edge_vid_trainflow_144_100000.pth

Custom Datasets

You can also train on your own dataset by following these steps:

Depending on the type of your dataset, create a COCO-style (image) or YTVIS-style (video) Object Detection JSON annotation file for your dataset. The specification for this can be found here for COCO and YTVIS respectively. Note that we don't use some fields, so the following may be omitted:
- info
- licenses
- Under image: license, flickr_url, coco_url, date_captured
- categories (we use our own format for categories, see below)
Create a definition for your dataset under dataset_base in data/config.py (see the comments in dataset_base for an explanation of each field):

my_custom_dataset = dataset_base.copy({
    'name': 'My Dataset',

    'train_images': 'path_to_training_images',
    'train_info':   'path_to_training_annotation',

    'valid_images': 'path_to_validation_images',
    'valid_info':   'path_to_validation_annotation',

    'has_gt': True,
    'class_names': ('my_class_id_1', 'my_class_id_2', 'my_class_id_3', ...),

    # below is only needed for YTVIS-style video dataset.

    # whether samples all frames or key frames only.
    'use_all_frames': False,

    # the following four lines define the frame sampling strategy for the given dataset.
    'frame_offset_lb': 1,
    'frame_offset_ub': 4,
    'frame_offset_multiplier': 1,
    'all_frame_direction': 'allway',

    # 1 of K frames is annotated
    'images_per_video': 5,

    # declares a video dataset
    'is_video': True
})

Note that class IDs in the annotation file should start at 1 and increase sequentially in the order of class_names. If this isn't the case for your annotation file (like in COCO), see the field label_map in dataset_base.
Finally, in yolact_edge_config in the same file, change the value for 'dataset' to 'my_custom_dataset' or whatever you named the config object above. Then you can use any of the training commands in the previous section.
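
When the category IDs in an annotation file are not already 1, 2, 3, ... in the order of class_names, a label_map can be derived from the file's categories list. The following is a minimal sketch that assumes label_map maps each annotation category ID to a 1-based class index; the category entries and the helper `build_label_map` are made up for illustration.

```python
def build_label_map(categories):
    """Map each annotation category ID to a 1-based index in listed order."""
    return {cat["id"]: index for index, cat in enumerate(categories, start=1)}


# Hypothetical "categories" section of a COCO-style annotation file
# whose IDs are neither contiguous nor starting at 1.
categories = [
    {"id": 3, "name": "my_class_id_1"},
    {"id": 7, "name": "my_class_id_2"},
    {"id": 12, "name": "my_class_id_3"},
]

class_names = tuple(cat["name"] for cat in categories)
label_map = build_label_map(categories)

print(class_names)
print(label_map)
```

The resulting class_names and label_map values would then go into the dataset definition in data/config.py.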

Citation

If you use this code base in your work, please consider citing:

@article{yolactedge,
  author    = {Haotian Liu and Rafael A. Rivera Soto and Fanyi Xiao and Yong Jae Lee},
  title     = {YolactEdge: Real-time Instance Segmentation on the Edge (Jetson AGX Xavier: 30 FPS, RTX 2080 Ti: 170 FPS)},
  journal   = {arXiv preprint arXiv:2012.12259},
  year      = {2020},
}

@inproceedings{yolact-iccv2019,
  author    = {Daniel Bolya and Chong Zhou and Fanyi Xiao and Yong Jae Lee},
  title     = {YOLACT: {Real-time} Instance Segmentation},
  booktitle = {ICCV},
  year      = {2019},
}

Contact

For questions about our paper or code, please contact Haotian Liu or Rafael A. Rivera-Soto.

Comments

evaluating on weights trained with custom dataset

I followed the instructions to train with my own custom dataset. However, when I try to evaluate using the weights I got from training on my dataset, I get this error.

[01/11 13:41:30 yolact.eval]: Loading model...
WARNING [01/11 13:41:34 yolact.model.load]: Some parameters required by the model do not exist in the checkpoint, and are initialized as they should be: prediction_layers.0.conf_layer.weight, prediction_layers.0.conf_layer.bias, semantic_seg_conv.weight, semantic_seg_conv.bias
[01/11 13:41:34 yolact.eval]: Model loaded.
[01/11 13:41:34 yolact.eval]: Converting to TensorRT...
[01/11 13:41:34 yolact.eval]: Converting backbone to TensorRT...
[01/11 13:41:37 yolact.eval]: Converting protonet to TensorRT...
[01/11 13:41:37 yolact.eval]: Converting FPN to TensorRT...
[01/11 13:41:37 yolact.eval]: Converting PredictionModule to TensorRT...
[01/11 13:41:52 yolact.eval]: Converted to TensorRT.
Traceback (most recent call last):
  File "eval.py", line 1407, in <module>
    evaluate(net, dataset)
  File "eval.py", line 877, in evaluate
    evalimage(net, inp, out)
  File "eval.py", line 590, in evalimage
    preds = net(batch, extras=extras)["pred_outs"]
  File "/home/jetson/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/jetson/yolact_edge/yolact.py", line 1875, in forward
    outs_wrapper["pred_outs"] = self.detect(pred_outs, extras=extras)
  File "/home/jetson/yolact_edge/layers/functions/detection.py", line 78, in __call__
    result = self.detect(batch_idx, conf_preds, decoded_boxes, mask_data, inst_data, extras)
  File "/home/jetson/yolact_edge/layers/functions/detection.py", line 105, in detect
    boxes, masks, classes, scores = self.fast_nms(boxes, masks, scores, self.nms_thresh, self.top_k)
  File "/home/jetson/yolact_edge/layers/functions/detection.py", line 168, in fast_nms
    classes = classes[keep]
IndexError: too many indices for tensor of dimension 2

opened by soohunee 54
2 questions about calibrations
Do I need an annotations file for calibration?

I am using the TensorRT FP16 optimization (--use_fp16_tensorrt), but the network doesn't find even one object (without TensorRT it works perfectly). Is it possible that FP16 mode doesn't do the calibration? It looks like it from the code (https://github.com/haotian-liu/yolact_edge/blob/662d760f8b2d8b4409d385aaf172e155aaa3a3d8/utils/tensorrt.py#L38)

Thanks
opened by sdimantsd 33
RuntimeError: DataLoader worker (pid(s) 87563) exited unexpectedly
Hi, when I try to train the model with $ python3 train.py --config=yolact_edge_config --resume=weights/yolact_edge_vid_resnet50_847_50000.pth

I get the error below:

  File "/home/vahid/env/lib/python3.8/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 88428) is killed by signal: Killed.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "train.py", line 707, in <module>
    train(0, args=args)
  File "train.py", line 357, in train
    datum = next(data_loader_iter)
  File "/home/vahid/env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()
  File "/home/vahid/env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1068, in _next_data
    idx, data = self._get_data()
  File "/home/vahid/env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1034, in _get_data
    success, data = self._try_get_data()
  File "/home/vahid/env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 885, in _try_get_data
    raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
RuntimeError: DataLoader worker (pid(s) 88428) exited unexpectedly

Is there any advice ?
opened by VbsmRobotic 24

What channel size does FPN expect?

I've been trying to add a new backbone to this project for the past several days and I have several questions. Would you mind helping me out?

In my forward(x) I do the following return tuple(outs) where outs has the output of all 31 blocks of my network. In selected_layers I specify [22, 26, -1]. This returns the following error:

Given groups=1, weight of size [256, 512, 1, 1], expected input[8, 1024, 7, 7] to have 512 channels, but got 1024 channels instead

If I replaced selected_layers with just [22, 26], the network trains fine until it hits the following error:

[01/17 19:20:14 yolact.train]: eta: 0:00:00  epoch: 0  iter: 0  B: 8.873  M: 57.962  C: 21.055  S: 59.843  T: 147.732  time: 5.293  data_time: 0.000  lr: 0.000100  max_mem: 4405M
[01/17 19:20:17 yolact.eval]: Computing validation mAP (this may take a while)...

Traceback (most recent call last):
  File "train.py", line 746, in <module>
    train(0, args=args)
  File "train.py", line 642, in train
    compute_validation_map(yolact_net, val_dataset)
  File "train.py", line 734, in compute_validation_map
    eval_script.evaluate(yolact_net, dataset, train_mode=True, train_cfg=cfg)
  File "/content/radar/eval.py", line 1061, in evaluate
    preds = net(batch, extras=extras)["pred_outs"]
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/yolact_edge/yolact.py", line 1876, in forward
    outs_wrapper["pred_outs"] = self.detect(pred_outs, extras=extras)
  File "/content/yolact_edge/layers/functions/detection.py", line 74, in __call__
    conf_preds = conf_data.view(batch_size, num_priors, self.num_classes).transpose(2, 1).contiguous()
RuntimeError: shape '[1, 6958, 81]' is invalid for input of size 396900

I'm so close to getting things working but I always seem to hit a new bump in the road. Any help would be greatly appreciated!
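
The numbers in the last error message can be checked with a little arithmetic. The sketch below only divides the reported tensor size by the batch size and class count taken from the message; it does not explain why the prior counts differ, and `implied_num_priors` is a hypothetical helper.

```python
def implied_num_priors(numel, batch_size, num_classes):
    """Return how many priors a flat confidence tensor of `numel` elements holds."""
    per_prior = batch_size * num_classes
    priors, remainder = divmod(numel, per_prior)
    if remainder:
        raise ValueError(f"{numel} is not divisible by {per_prior}")
    return priors


# Values taken from: shape '[1, 6958, 81]' is invalid for input of size 396900
expected_priors = 6958
actual_priors = implied_num_priors(396900, batch_size=1, num_classes=81)

print(actual_priors)          # priors actually produced by the prediction heads
print(expected_priors * 81)   # element count the view() call was expecting
```

So the prediction heads emit 4900 priors per image while the detection layer expects 6958, which suggests the prior (anchor) layout no longer matches the feature maps once a selected layer is removed.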

opened by cyrilzakka 24

RuntimeError: t() expects a tensor with <= 2 dimensions, but self is 3D :getting this error for better mAP

First of all thanks to all developers who build the best model.

I have trained YolactEdge on a single object class. When I run inference with a trained model that has around 50 mAP, it produces predictions (very bad ones), but if the model's mAP is around 80 or higher, it throws the error below.

Traceback (most recent call last):
  File "eval.py", line 1246, in <module>
    evaluate(net, dataset)
  File "eval.py", line 894, in evaluate
    evalvideo(net, args.video)
  File "eval.py", line 777, in evalvideo
    frame_buffer.put(frame['value'].get())
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "eval.py", line 699, in prep_frame
    return prep_display(preds, frame, None, None, undo_transform=False, class_color=True)
  File "eval.py", line 167, in prep_display
    score_threshold = args.score_threshold)
  File "/home/ubuntu_pc//yolact_edge/layers/output_utils.py", line 103, in postprocess
    masks = proto_data @ masks.t()
RuntimeError: t() expects a tensor with <= 2 dimensions, but self is 3D

I tried changing the tensor from 3D to 2D, but that leads to multiple other errors.

If anyone has a solution, please share.

opened by kashzade 23
About custom dataset with TensorRT RuntimeError
Sorry to bother you. I retrained on my custom dataset to detect only the person class.

But when I used the model for evaluation, it showed the error "Expected 3 elements in a list but found 2".

Has anyone met this error before?

I also used the --use_fp16_tensorrt option, and I still get the error.

I think maybe the model is overfitting, so it can't be converted to TensorRT properly.

Is that possible?

Here's my config.py:

person_dataset = dataset_base.copy({
    'name': 'person',
    'train_info': '/home/jason/dataset/mscoco/train2017_person.json',
    'valid_info': '/home/jason/dataset/mscoco/val2017_person.json',
    'class_names': ("person",),
    'label_map': {1: 1},
})

Thanks for your reading~
opened by ntut108318099 16
Custom YOLACT model trains well but shows IndexError in case of eval.py
Hi,

Thank you for developing this amazing tool.

I recently modified the yolact.py and added one more mask so that the network predicts 2 of them (instead of 1). I am also using a custom dataset. The training works fine but eval.py is showing the following error:

  File "eval.py", line 173, in prep_display
    masks = t[3][idx]
IndexError: too many indices for tensor of dimension 3

I checked the size of idx and t[3] to debug the issue and found the following information:

idx: tensor([0, 1, 2, 3, 4])
idx.size(): torch.Size([5])
t[3].size(): torch.Size([100, 480, 640])

It seems that the error is misleading. Anyway, after a while, I realized that in your eval.py, the variable idx is simply :args.top_k. So I changed it but then the output from eval.py was shown incorrectly.

Moreover, the statement masks = t[3][idx] works fine if PredictionModuleTRTWrapper is disabled (just comment 2 lines). I should also mention that the network shows the above error in forward propagation step in eval.py. This leads me to think that the pred_layer is not set properly.

Can you please help me out? Is the input_sizes variable set properly, or does it need some modification?

Thank you so much.
opened by ravijo 15

Zero mAP and no detections on custom dataset

Dear Haotian Liu, currently I'm trying the yolact_edge trained on custom coco-like dataset for one class ('person'). Contrary to coco the image resolution is 512.

When I run evaluation with TensorRT conversion I get the following error during the protonet conversion:

[02/23 15:26:23 yolact.eval]: Converting protonet to TensorRT...
Traceback (most recent call last):
  File "eval.py", line 1241, in <module>
    convert_to_tensorrt(net, cfg, args, transform=BaseTransform())
  File "/home/oidpsv/yolact_edge/utils/tensorrt.py", line 156, in convert_to_tensorrt
    net.to_tensorrt_protonet(cfg.torch2trt_protonet_int8, calibration_dataset=calibration_protonet_dataset, batch_size=args.trt_batch_size)
  File "/home/oidpsv/yolact_edge/yolact.py", line 1565, in to_tensorrt_protonet
    self.trt_load_if("proto_net", trt_fn, [x], int8_mode, batch_size=batch_size)
  File "/home/oidpsv/yolact_edge/yolact.py", line 1534, in trt_load_if
    module = trt_fn(module, trt_fn_params)
  File "/opt/conda/lib/python3.6/site-packages/torch2trt-0.1.0-py3.6-linux-x86_64.egg/torch2trt/torch2trt.py", line 555, in torch2trto
    engine = builder.build_cuda_engine(network)
  File "/opt/conda/lib/python3.6/site-packages/torch2trt-0.1.0-py3.6-linux-x86_64.egg/torch2trt/calibration.py", line 51, in get_batch
    buffer[i].copy_(tensor)
RuntimeError: The size of tensor a (69) must match the size of tensor b (64) at non-singleton dimension 2

which can be fixed by changing line 1563 in yolact.py from x = torch.ones((1, 256, 69, 69)).cuda() to x = torch.ones((1, 256, 64, 64)).cuda().

The problem is that in this case further evaluation with TensorRT conversion gives zero mAP and processing of images provides empty result (no masks or boxes). Could you please help me? Many thanks.
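
The 69 versus 64 mismatch in the traceback is consistent with the protonet input being a feature map downsampled by a factor of 8. The sketch below works that out under a stated assumption: three stride-2 stages, each rounding the size up when halving. That assumption reproduces 69 for the default 550 input, but it is not taken from yolact.py, and `feature_map_size` is a hypothetical helper.

```python
def feature_map_size(image_size, num_halvings=3):
    """Spatial size after `num_halvings` stride-2 stages with ceiling division."""
    size = image_size
    for _ in range(num_halvings):
        size = (size + 1) // 2  # ceil(size / 2)
    return size


for image_size in (550, 512):
    side = feature_map_size(image_size)
    print(f"input {image_size} -> calibration tensor (1, 256, {side}, {side})")
```

Deriving the size from the configured input resolution, rather than hard-coding 69 or 64, would keep the calibration tensor in step with the dataset. It does not by itself explain the zero mAP reported above.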

opened by ghost 14

No output | Custom dataset | TensorRT

Hi @haotian-liu,

I'm working on a custom instance segmentation task with three classes. While I get output segmentations on my Jetson Xavier by using tag --disable_tensorrt, there's no output when I run the model on TensorRT.

I'm training a ResNet50 model on my PC and transferring the learned model to Jetson for inference.

Initially, I suspected the error is similar to issue:27 as I got IndexError Warnings when enabling TensorRT. But while debugging I found that except blocks do no harm.

My commented detection.py file:

# This try-except block aims to fix the IndexError that we might encounter when we train on custom datasets and evaluate with TensorRT enabled. See https://github.com/haotian-liu/yolact_edge/issues/27.
       try:
           classes = classes[keep]
           boxes = boxes[keep]
           masks = masks[keep]
           scores = scores[keep]

           print("Passed first Try/Except")

       except IndexError:
           from utils.logging_helper import log_once
           log_once(self, "issue_27_flatten", name="yolact.layers.detect", 
           message="Encountered IndexError as mentioned in https://github.com/haotian-liu/yolact_edge/issues/27. Flattening predictions to avoid error, please verify the outputs. If there are any problems you met related to this, please report an issue.")

           classes = torch.flatten(classes, end_dim=1)
           boxes = torch.flatten(boxes, end_dim=1)
           masks = torch.flatten(masks, end_dim=1)
           scores = torch.flatten(scores, end_dim=1)
           keep = torch.flatten(keep, end_dim=1)

           idx = torch.nonzero(keep, as_tuple=True)[0]
           print(f"\nIdx: {idx}")
           print(f"Idx_min: {idx.min()} and Idx_max: {idx.max()}")

           classes = torch.index_select(classes, 0, idx)
           boxes = torch.index_select(boxes, 0, idx)
           masks = torch.index_select(masks, 0, idx)
           scores = torch.index_select(scores, 0, idx)

       # Only keep the top cfg.max_num_detections highest scores across all classes
       scores, idx = scores.sort(0, descending=True)
       idx = idx[:cfg.max_num_detections]
       scores = scores[:cfg.max_num_detections]

       print(f"\nIdx: {idx}")
       print(f"Idx_min: {idx.min()} and Idx_max: {idx.max()}")

       try:
           print(f"\nInside second Try")

           print(f"Classes: {classes}")
           print(f"Boxes: {boxes}")

           classes = classes[idx]
           print(f"Classes updated: {classes}")

           boxes= boxes[idx]
           print(f"Boxes updated: {boxes}")
           
           masks = masks[idx]

           print(f"Scores: {scores}")
       except IndexError:
           from utils.logging_helper import log_once
           log_once(self, "issue_27_index_select", name="yolact.layers.detect", message="Encountered IndexError as mentioned in https://github.com/haotian-liu/yolact_edge/issues/27. Using `torch.index_select` to avoid error, please verify the outputs. If there are any problems you met related to this, please report an issue.")

           print(f"\nSecond Try/Except")

           classes = torch.index_select(classes, 0, idx)
           boxes = torch.index_select(boxes, 0, idx)
           masks = torch.index_select(masks, 0, idx)

           print(f"Classes updated: {classes}")
           print(f"Boxes updated: {boxes}")
           print(f"Scores: {scores}")

       return boxes, masks, classes, scores

Command that I use to run evaluation:

~/Projects/yolact_edge$ python3 eval.py --config=yolact_edge_config --trained_model=weights/yolact_edge_2115_110000.pth --score_threshold=0.3 --top_k=20 --image=./test_input/020801_2020_11_25_11_54_18.png
[01/27 15:04:38 yolact.eval]: Loading model...
[01/27 15:04:42 yolact.eval]: Model loaded.
[01/27 15:04:42 yolact.eval]: Converting to TensorRT...
[01/27 15:04:42 yolact.eval]: Converting backbone to TensorRT...
[01/27 15:04:44 yolact.eval]: Converting protonet to TensorRT...
[01/27 15:04:44 yolact.eval]: Converting FPN to TensorRT...
[01/27 15:04:44 yolact.eval]: Converting PredictionModule to TensorRT...
[01/27 15:04:55 yolact.eval]: Converted to TensorRT.
WARNING [01/27 15:04:56 yolact.layers.detect]: Encountered IndexError as mentioned in https://github.com/haotian-liu/yolact_edge/issues/27. Flattening predictions to avoid error, please verify the outputs. If there are any problems you met related to this, please report an issue.

Idx: tensor([ 0,  1,  2,  9, 11, 13, 15, 16, 18, 24, 26, 27, 30, 31, 34])
Idx_min: 0 and Idx_max: 34

Idx: tensor([ 0, 10,  1,  2, 11, 12,  5, 13,  6, 14,  7,  3,  8,  9,  4])
Idx_min: 0 and Idx_max: 14

Inside second Try
Classes: tensor([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2])
Boxes: tensor([[ 0.7628,  0.4625,  0.7711,  0.5278],
        [ 0.9843,  0.7867,  0.9961,  0.8066],
        [ 0.8758,  0.4074,  0.9116,  0.5297],
        [ 0.7709, -0.0089,  0.9846,  0.9502],
        [ 0.7592,  0.1906,  0.7628,  0.2080],
        [ 0.9848,  0.7869,  0.9971,  0.8092],
        [ 0.7709, -0.0089,  0.9846,  0.9502],
        [ 0.7628,  0.4625,  0.7711,  0.5278],
        [ 0.8758,  0.4074,  0.9116,  0.5297],
        [ 0.7592,  0.1906,  0.7628,  0.2080],
        [ 0.7592,  0.1906,  0.7633,  0.2087],
        [ 0.7709, -0.0089,  0.9846,  0.9502],
        [ 0.8753,  0.4015,  0.9084,  0.5413],
        [ 0.9848,  0.7869,  0.9971,  0.8092],
        [ 0.7628,  0.4625,  0.7711,  0.5278]])
WARNING [01/27 15:04:56 yolact.layers.detect]: Encountered IndexError as mentioned in https://github.com/haotian-liu/yolact_edge/issues/27. Using `torch.index_select` to avoid error, please verify the outputs. If there are any problems you met related to this, please report an issue.

Second Try/Except
Classes updated: tensor([0, 2, 0, 0, 2, 2, 1, 2, 1, 2, 1, 0, 1, 1, 0])
Boxes updated: tensor([[ 0.7628,  0.4625,  0.7711,  0.5278],
        [ 0.7592,  0.1906,  0.7633,  0.2087],
        [ 0.9843,  0.7867,  0.9961,  0.8066],
        [ 0.8758,  0.4074,  0.9116,  0.5297],
        [ 0.7709, -0.0089,  0.9846,  0.9502],
        [ 0.8753,  0.4015,  0.9084,  0.5413],
        [ 0.9848,  0.7869,  0.9971,  0.8092],
        [ 0.9848,  0.7869,  0.9971,  0.8092],
        [ 0.7709, -0.0089,  0.9846,  0.9502],
        [ 0.7628,  0.4625,  0.7711,  0.5278],
        [ 0.7628,  0.4625,  0.7711,  0.5278],
        [ 0.7709, -0.0089,  0.9846,  0.9502],
        [ 0.8758,  0.4074,  0.9116,  0.5297],
        [ 0.7592,  0.1906,  0.7628,  0.2080],
        [ 0.7592,  0.1906,  0.7628,  0.2080]])
Scores: tensor([9.8882e-01, 9.0431e-01, 8.8910e-01, 7.1460e-01, 5.0766e-01, 1.0031e-03,
        4.2995e-04, 3.9997e-04, 2.7969e-04, 1.7686e-04, 1.9921e-05, 1.6216e-05,
        5.4856e-06, 1.4189e-06, 1.8612e-07])

Please note: If I add --disable_tensorrt tag, I get results as the code executes in try blocks. I also ensure to remove the cached '.trt' files in ./weights folder. Can you help me here ? Thank you!

opened by smahesh2694 14

RuntimeError: non-empty 3D or 4D input tensor expected but got ndim: 4

Hi, when I run detection on an image, I get an error: RuntimeError: non-empty 3D or 4D input tensor expected but got ndim: 4. I hope someone can help me. Thanks!

opened by Yang-Changhui 12
30fps performance on RTX3070

Hi

You claimed in your paper that YolactEdge runs at 67 FPS on an RTX 2080 Ti without TensorRT. But I tested it on an RTX 3070 graphics card and obtained 30 FPS. I checked that the RTX 3070 and RTX 2080 Ti are quite similar in performance. Also, I am using a Windows machine, but I believe that is not the reason why the performance is 2x lower than what you claimed. I would appreciate your feedback.

Thanks.

opened by spacewalk01 11
Problems of Evaluation Output: "rock""showel""teeth"
I downloaded the YolactEdge R-101-FPN and some other trained models to execute the evaluation code. But as shown in the image, all the detection results are "rock","showel" and "teeth". could you give me some advice about the problem? Thanks a lot!

Environment:

OS: Ubuntu 20.04

GPU: GTX1660 ti

CUDA Version 11.4.4
opened by ChrisLong13 2
Running inference on video with pre-trained weights is stuck in "converting to TensorRT" step

Describe the bug: Running inference on video with pre-trained weights gets stuck in the "converting to TensorRT" step.

Full logs: Log while running the command: !python3 ./yolact_edge/eval.py --trained_model=./yolact_edge/weights/yolact_edge/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --video={file_path}:{output_path} --calib_images {calib}

After killing the script:

Environment: Colab Jupyter notebook from the yolact_edge GitHub repository (slightly modified to accommodate Google Drive imports and some other changes, but with the same workflow and installation procedure):

opened by marcoluis97 0
Bug in augmentation configuration

In SSDAugmentation and SSDAugmentationVideo there is an error in how the rotation augmentation is enabled. The line enable_if(cfg.augment_random_flip, RandomRot90()) should be changed to enable_if(cfg.augment_random_rot90, RandomRot90()) in both classes.

https://github.com/haotian-liu/yolact_edge/blob/3f423ede1aeac73dbf86dadfe85af9e288f7f99b/yolact_edge/utils/augmentations.py#L908

https://github.com/haotian-liu/yolact_edge/blob/3f423ede1aeac73dbf86dadfe85af9e288f7f99b/yolact_edge/utils/augmentations.py#L932

opened by domef 1
yolact_edge with deepstream?

NOT AN ISSUE BUT A QUESTION

Is there a documented way on how to run yolact_edge with deepstream?

As far as I know, one would need an .engine file, and .cpp file in order to parse the output of the model and send it to nvinfer and nvosd deepstream plugins.

Any kind of hint will be appreciated. Thanks!

opened by aurelm95 0


We present a novel, real-time, semantic segmentation network in which the encoder both encodes and generates the parameters (weights) of the decoder. Furthermore, to allow maximal adaptivity, the weights at each decoder block vary spatially. For this purpose, we design a new type of hypernetwork, composed of a nested U-Net for drawing higher level context features

182 Dec 14, 2022

FANet - Real-time Semantic Segmentation with Fast Attention

FANet Real-time Semantic Segmentation with Fast Attention Ping Hu, Federico Perazzi, Fabian Caba Heilbron, Oliver Wang, Zhe Lin, Kate Saenko , Stan Sc

42 Nov 30, 2022

This is the unofficial code of Deep Dual-resolution Networks for Real-time and Accurate Semantic Segmentation of Road Scenes. which achieve state-of-the-art trade-off between accuracy and speed on cityscapes and camvid, without using inference acceleration and extra data

Deep Dual-resolution Networks for Real-time and Accurate Semantic Segmentation of Road Scenes Introduction This is the unofficial code of Deep Dual-re

113 Dec 23, 2022

implement of SwiftNet:Real-time Video Object Segmentation

A keras-based real-time model for medical image segmentation (CFPNet-M)

DFFNet: An IoT-perceptive Dual Feature Fusion Network for General Real-time Semantic Segmentation

TCNN Temporal convolutional neural network for real-time speech enhancement in the time domain

the code used for the preprint Embedding-based Instance Segmentation of Microscopy Images.

Learning RGB-D Feature Embeddings for Unseen Object Instance Segmentation

[CVPR2021 Oral] End-to-End Video Instance Segmentation with Transformers