YolactEdge: Real-time Instance Segmentation on the Edge

Overview

██╗   ██╗ ██████╗ ██╗      █████╗  ██████╗████████╗    ███████╗██████╗  ██████╗ ███████╗
╚██╗ ██╔╝██╔═══██╗██║     ██╔══██╗██╔════╝╚══██╔══╝    ██╔════╝██╔══██╗██╔════╝ ██╔════╝
 ╚████╔╝ ██║   ██║██║     ███████║██║        ██║       █████╗  ██║  ██║██║  ███╗█████╗  
  ╚██╔╝  ██║   ██║██║     ██╔══██║██║        ██║       ██╔══╝  ██║  ██║██║   ██║██╔══╝  
   ██║   ╚██████╔╝███████╗██║  ██║╚██████╗   ██║       ███████╗██████╔╝╚██████╔╝███████╗
   ╚═╝    ╚═════╝ ╚══════╝╚═╝  ╚═╝ ╚═════╝   ╚═╝       ╚══════╝╚═════╝  ╚═════╝ ╚══════╝

YolactEdge is the first competitive instance segmentation approach that runs on small edge devices at real-time speeds. Specifically, YolactEdge runs at up to 30.8 FPS on a Jetson AGX Xavier (and 172.7 FPS on an RTX 2080 Ti) with a ResNet-101 backbone on 550x550 images. This is the code for our paper.

For a real-time demo and more samples, check out our demo video.


Installation

See INSTALL.md.

Model Zoo

We provide baseline YOLACT and YolactEdge models trained on COCO and YouTube VIS (our sub-training split, with COCO joint training).

To evaluate a model, put the corresponding weights file in the ./weights directory and run one of the following commands.

YouTube VIS models:

Method                Backbone   mAP   AGX-Xavier FPS  RTX 2080 Ti FPS  Weights
YOLACT                R-50-FPN   44.7   8.5             59.8            download | mirror
YolactEdge (w/o TRT)  R-50-FPN   44.2  10.5             67.0            download | mirror
YolactEdge            R-50-FPN   44.0  32.4            177.6            download | mirror
YOLACT                R-101-FPN  47.3   5.9             42.6            download | mirror
YolactEdge (w/o TRT)  R-101-FPN  46.9   9.5             61.2            download | mirror
YolactEdge            R-101-FPN  46.2  30.8            172.7            download | mirror

COCO models:

Method      Backbone      mAP   Titan Xp FPS  AGX-Xavier FPS  RTX 2080 Ti FPS  Weights
YOLACT      MobileNet-V2  22.1  -              15.0            35.7            download | mirror
YolactEdge  MobileNet-V2  20.8  -              35.7           161.4            download | mirror
YOLACT      R-50-FPN      28.2  42.5            9.1            45.0            download | mirror
YolactEdge  R-50-FPN      27.0  -              30.7           140.3            download | mirror
YOLACT      R-101-FPN     29.8  33.5            6.6            36.5            download | mirror
YolactEdge  R-101-FPN     29.5  -              27.3           124.8            download | mirror

Getting Started

Follow the installation instructions to set up the required environment for running YolactEdge.

See instructions to evaluate and train with YolactEdge.

Colab Notebook

Try out our Colab Notebook with a live demo to learn about basic usage.

If you are interested in evaluating YolactEdge with TensorRT, we provide another Colab Notebook that configures the TensorRT environment on Colab.

Evaluation

Quantitative Results

# Convert each component of the trained model to TensorRT using the optimal settings and evaluate on the YouTube VIS validation set (our split).
python3 eval.py --trained_model=./weights/yolact_edge_vid_847_50000.pth

# Evaluate on the entire COCO validation set.
python3 eval.py --trained_model=./weights/yolact_edge_54_800000.pth

# Output a COCO JSON file for the COCO test-dev. The command will create './results/bbox_detections.json' and './results/mask_detections.json' for detection and instance segmentation respectively. These files can then be submitted to the website for evaluation.
python3 eval.py --trained_model=./weights/yolact_edge_54_800000.pth --dataset=coco2017_testdev_dataset --output_coco_json

Qualitative Results

# Display qualitative results on COCO. From here on I'll use a confidence threshold of 0.3.
python eval.py --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --display

Benchmarking

# Benchmark the trained model on the COCO validation set.
# Run just the raw model on the first 1k images of the validation set
python eval.py --trained_model=weights/yolact_edge_54_800000.pth --benchmark --max_images=1000

Notes

Inference using models trained with YOLACT

If you have a model pre-trained with YOLACT and want to take advantage of either of the TensorRT features of YolactEdge, simply specify --config=yolact_edge_config in the command-line options, and the code will automatically detect and convert the model weights to a compatible format.

python3 eval.py --config=yolact_edge_config --trained_model=./weights/yolact_base_54_800000.pth

Inference without Calibration

If you want to run inference without calibration, you can either run with FP16-only TensorRT optimization, or without TensorRT optimization, using the corresponding configs. Refer to data/config.py for examples of such configs.

# Evaluate YolactEdge with FP16-only TensorRT optimization using the '--use_fp16_tensorrt' option (replaces all INT8 optimization with FP16).
python3 eval.py --use_fp16_tensorrt --trained_model=./weights/yolact_edge_54_800000.pth

# Evaluate YolactEdge without TensorRT optimization using the '--disable_tensorrt' option.
python3 eval.py --disable_tensorrt --trained_model=./weights/yolact_edge_54_800000.pth
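
If you prefer a config over command-line flags, calibration behavior is controlled by per-component INT8 switches in the config. A minimal sketch, using only the torch2trt_protonet_int8 field (which appears in the codebase); the analogous fields for the other components can be found in data/config.py:

my_fp16_config = yolact_edge_config.copy({
    # Hypothetical override: skip INT8 calibration for the protonet and fall
    # back to FP16 TensorRT conversion. Repeat for the other components'
    # torch2trt_*_int8 fields listed in data/config.py.
    'torch2trt_protonet_int8': False,
})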

Images

# Display qualitative results on the specified image.
python eval.py --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --image=my_image.png

# Process an image and save it to another file.
python eval.py --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --image=input_image.png:output_image.png

# Process a whole folder of images.
python eval.py --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --images=path/to/input/folder:path/to/output/folder

Video

# Display a video in real time. "--video_multiframe" will process that many frames at once for improved performance.
# If video_multiframe > 1, then trt_batch_size should be increased to match or exceed it.
python eval.py --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --video_multiframe=2 --trt_batch_size 2 --video=my_video.mp4

# Display a webcam feed in real-time. If you have multiple webcams pass the index of the webcam you want instead of 0.
python eval.py --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --video_multiframe=2 --trt_batch_size 2 --video=0

# Process a video and save it to another file. This is unoptimized.
python eval.py --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --video=input_video.mp4:output_video.mp4

Use the help option to see a description of all available command line arguments:

python eval.py --help

Training

Make sure to download the entire dataset first (see the installation instructions).

  • To train, grab an imagenet-pretrained model and put it in ./weights.
    • For Resnet101, download resnet101_reducedfc.pth from here.
    • For Resnet50, download resnet50-19c8e357.pth from here.
    • For MobileNetV2, download mobilenet_v2-b0353104.pth from here.
  • Run one of the training commands below.
    • Note that you can press ctrl+c while training and it will save an *_interrupt.pth file at the current iteration.
    • All weights are saved in the ./weights directory by default with the file name <config>_<epoch>_<iteration>.pth.
# Trains using the base edge config with a batch size of 8 (the default).
python train.py --config=yolact_edge_config

# Resume training yolact_edge with a specific weight file and start from the iteration specified in the weight file's name.
python train.py --config=yolact_edge_config --resume=weights/yolact_edge_10_32100.pth --start_iter=-1

# Use the help option to see a description of all available command line arguments
python train.py --help

Training on video dataset

# Pre-train the image based model
python train.py --config=yolact_edge_youtubevis_config

# Train the flow (warping) module
python train.py --config=yolact_edge_vid_trainflow_config --resume=./weights/yolact_edge_youtubevis_847_50000.pth

# Fine tune the network jointly
python train.py --config=yolact_edge_vid_config --resume=./weights/yolact_edge_vid_trainflow_144_100000.pth

Custom Datasets

You can also train on your own dataset by following these steps:

  • Depending on the type of your dataset, create a COCO-style (image) or YTVIS-style (video) Object Detection JSON annotation file for your dataset. The specification for this can be found here for COCO and YTVIS respectively. Note that we don't use some fields, so the following may be omitted:
    • info
    • licenses
    • Under image: license, flickr_url, coco_url, date_captured
    • categories (we use our own format for categories, see below)
  • Create a definition for your dataset under dataset_base in data/config.py (see the comments in dataset_base for an explanation of each field):
my_custom_dataset = dataset_base.copy({
    'name': 'My Dataset',

    'train_images': 'path_to_training_images',
    'train_info':   'path_to_training_annotation',

    'valid_images': 'path_to_validation_images',
    'valid_info':   'path_to_validation_annotation',

    'has_gt': True,
    'class_names': ('my_class_id_1', 'my_class_id_2', 'my_class_id_3', ...),

    # The fields below are only needed for YTVIS-style video datasets.

    # whether to sample all frames or only key frames.
    'use_all_frames': False,

    # the following four lines define the frame sampling strategy for the given dataset.
    'frame_offset_lb': 1,
    'frame_offset_ub': 4,
    'frame_offset_multiplier': 1,
    'all_frame_direction': 'allway',

    # one out of every K frames is annotated
    'images_per_video': 5,

    # declares a video dataset
    'is_video': True
})
  • Note that class IDs in the annotation file should start at 1 and increase sequentially in the order of class_names. If this isn't the case for your annotation file (as in COCO), see the label_map field in dataset_base; a sketch follows this list.
  • Finally, in yolact_edge_config in the same file, change the value of 'dataset' to 'my_custom_dataset', or whatever you named the config object above. Then you can use any of the training commands in the previous section.
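
As a concrete illustration of the label_map note above, suppose a hypothetical annotation file uses non-sequential category IDs (as COCO does). The map translates them into the sequential 1-based labels the model expects, in class_names order:

my_custom_dataset = dataset_base.copy({
    'name': 'My Dataset',
    'class_names': ('cat', 'dog', 'bird'),

    # Hypothetical annotation-file category IDs 2, 5 and 9 are mapped to
    # sequential training labels 1, 2 and 3, following class_names order.
    'label_map': {2: 1, 5: 2, 9: 3},

    # ... remaining fields as in the definition above ...
})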

Citation

If you use this code base in your work, please consider citing:

@article{yolactedge,
  author    = {Haotian Liu and Rafael A. Rivera Soto and Fanyi Xiao and Yong Jae Lee},
  title     = {YolactEdge: Real-time Instance Segmentation on the Edge (Jetson AGX Xavier: 30 FPS, RTX 2080 Ti: 170 FPS)},
  journal   = {arXiv preprint arXiv:2012.12259},
  year      = {2020},
}
@inproceedings{yolact-iccv2019,
  author    = {Daniel Bolya and Chong Zhou and Fanyi Xiao and Yong Jae Lee},
  title     = {YOLACT: {Real-time} Instance Segmentation},
  booktitle = {ICCV},
  year      = {2019},
}

Contact

For questions about our paper or code, please contact Haotian Liu or Rafael A. Rivera-Soto.

Comments
  • evaluating on weights trained with custom dataset

    I followed the instructions to train on my own custom dataset. However, when I try to evaluate using the trained weights I got from training on my dataset, I get this error.

    [01/11 13:41:30 yolact.eval]: Loading model...
    WARNING [01/11 13:41:34 yolact.model.load]: Some parameters required by the model do not exist in the checkpoint, and are initialized as they should be: prediction_layers.0.conf_layer.weight, prediction_layers.0.conf_layer.bias, semantic_seg_conv.weight, semantic_seg_conv.bias
    [01/11 13:41:34 yolact.eval]: Model loaded.
    [01/11 13:41:34 yolact.eval]: Converting to TensorRT...
    [01/11 13:41:34 yolact.eval]: Converting backbone to TensorRT...
    [01/11 13:41:37 yolact.eval]: Converting protonet to TensorRT...
    [01/11 13:41:37 yolact.eval]: Converting FPN to TensorRT...
    [01/11 13:41:37 yolact.eval]: Converting PredictionModule to TensorRT...
    [01/11 13:41:52 yolact.eval]: Converted to TensorRT.

    Traceback (most recent call last):
      File "eval.py", line 1407, in <module>
        evaluate(net, dataset)
      File "eval.py", line 877, in evaluate
        evalimage(net, inp, out)
      File "eval.py", line 590, in evalimage
        preds = net(batch, extras=extras)["pred_outs"]
      File "/home/jetson/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/jetson/yolact_edge/yolact.py", line 1875, in forward
        outs_wrapper["pred_outs"] = self.detect(pred_outs, extras=extras)
      File "/home/jetson/yolact_edge/layers/functions/detection.py", line 78, in __call__
        result = self.detect(batch_idx, conf_preds, decoded_boxes, mask_data, inst_data, extras)
      File "/home/jetson/yolact_edge/layers/functions/detection.py", line 105, in detect
        boxes, masks, classes, scores = self.fast_nms(boxes, masks, scores, self.nms_thresh, self.top_k)
      File "/home/jetson/yolact_edge/layers/functions/detection.py", line 168, in fast_nms
        classes = classes[keep]
    IndexError: too many indices for tensor of dimension 2

    usage 
    opened by soohunee 54
  • 2 questions about calibrations

    1. Do I need an annotations file for calibration?
    2. I am using TensorRT FP16 optimization (--use_fp16_tensorrt), but the network doesn't find even one object (without TensorRT it works perfectly). Is it possible that FP16 mode skips the calibration? It looks that way from the code (https://github.com/haotian-liu/yolact_edge/blob/662d760f8b2d8b4409d385aaf172e155aaa3a3d8/utils/tensorrt.py#L38)

    Thanks

    opened by sdimantsd 33
  • RuntimeError: DataLoader worker (pid(s) 87563) exited unexpectedly

    Hi, when I try to train the model with $ python3 train.py --config=yolact_edge_config --resume=weights/yolact_edge_vid_resnet50_847_50000.pth

    I run into the issue below:

    • File "/home/vahid/env/lib/python3.8/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    • _error_if_any_worker_fails()
      
    • RuntimeError: DataLoader worker (pid 88428) is killed by signal: Killed.
    • The above exception was the direct cause of the following exception:
    • Traceback (most recent call last):
    • File "train.py", line 707, in
    •  train(0, args=args)
      
    • File "train.py", line 357, in train
    •  datum = next(data_loader_iter)
      
    • File "/home/vahid/env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in next
    •  data = self._next_data()
      
    • File "/home/vahid/env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1068, in _next_data
    •  idx, data = self._get_data()
      
    • File "/home/vahid/env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1034, in _get_data
    •  success, data = self._try_get_data()
      
    • File "/home/vahid/env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 885, in _try_get_data
    •  raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
      
    • RuntimeError: DataLoader worker (pid(s) 88428) exited unexpectedly

    Is there any advice?

    opened by VbsmRobotic 24
  • What channel size does FPN expect?

    I've been trying to add a new backbone to this project for the past several days and I have several questions. Would you mind helping me out?

    In my forward(x), I return tuple(outs), where outs holds the outputs of all 31 blocks of my network. In selected_layers I specify [22, 26, -1]. This returns the following error:

    Given groups=1, weight of size [256, 512, 1, 1], expected input[8, 1024, 7, 7] to have 512 channels, but got 1024 channels instead

    If I replace selected_layers with just [22, 26], the network trains fine until it hits the following error:

    [01/17 19:20:14 yolact.train]: eta: 0:00:00  epoch: 0  iter: 0  B: 8.873  M: 57.962  C: 21.055  S: 59.843  T: 147.732  time: 5.293  data_time: 0.000  lr: 0.000100  max_mem: 4405M
    [01/17 19:20:17 yolact.eval]: Computing validation mAP (this may take a while)...
    
    Traceback (most recent call last):
      File "train.py", line 746, in <module>
        train(0, args=args)
      File "train.py", line 642, in train
        compute_validation_map(yolact_net, val_dataset)
      File "train.py", line 734, in compute_validation_map
        eval_script.evaluate(yolact_net, dataset, train_mode=True, train_cfg=cfg)
      File "/content/radar/eval.py", line 1061, in evaluate
        preds = net(batch, extras=extras)["pred_outs"]
      File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/content/yolact_edge/yolact.py", line 1876, in forward
        outs_wrapper["pred_outs"] = self.detect(pred_outs, extras=extras)
      File "/content/yolact_edge/layers/functions/detection.py", line 74, in __call__
        conf_preds = conf_data.view(batch_size, num_priors, self.num_classes).transpose(2, 1).contiguous()
    RuntimeError: shape '[1, 6958, 81]' is invalid for input of size 396900
    

    I'm so close to getting things working but I always seem to hit a new bump in the road. Any help would be greatly appreciated!

    opened by cyrilzakka 24
  • RuntimeError: t() expects a tensor with <= 2 dimensions, but self is 3D: getting this error with better mAP

    First of all, thanks to all the developers who built this great model.

    I have trained YolactEdge on a single object class. When I run inference with a trained model at around 50 mAP, it makes (very bad) predictions, but if the model's mAP is around 80 or higher, it throws the error below.

    Traceback (most recent call last):
      File "eval.py", line 1246, in <module>
        evaluate(net, dataset)
      File "eval.py", line 894, in evaluate
        evalvideo(net, args.video)
      File "eval.py", line 777, in evalvideo
        frame_buffer.put(frame['value'].get())
      File "/usr/lib/python3.6/multiprocessing/pool.py", line 644, in get
        raise self._value
      File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
        result = (True, func(*args, **kwds))
      File "eval.py", line 699, in prep_frame
        return prep_display(preds, frame, None, None, undo_transform=False, class_color=True)
      File "eval.py", line 167, in prep_display
        score_threshold = args.score_threshold)
      File "/home/ubuntu_pc//yolact_edge/layers/output_utils.py", line 103, in postprocess
        masks = proto_data @ masks.t()
    RuntimeError: t() expects a tensor with <= 2 dimensions, but self is 3D

    I tried changing the tensor from 3D to 2D, but then I get multiple other errors.

    If anyone has a solution then please share.

    opened by kashzade 23
  • About custom dataset with TensorRT RuntimeError

    Sorry to bother you. I retrained on my custom dataset to detect only the person class.

    But when I use the model to evaluate, it shows the error "Expected 3 elements in a list but found 2"...

    Has anyone met this error before?

    I also used the --use_fp16_tensorrt option, and it still gets the error.

    I think maybe it's overfitting, so it can't be converted to TensorRT properly. Is that possible?

    Here's my config.py

    person_dataset = dataset_base.copy({
        'name': 'person',

        'train_info': '/home/jason/dataset/mscoco/train2017_person.json',
        'valid_info': '/home/jason/dataset/mscoco/val2017_person.json',
        'class_names': ("person",),
        'label_map': {1: 1},
    })
    

    Thanks for your reading~

    opened by ntut108318099 16
  • Custom YOLACT model trains well but shows IndexError in case of eval.py

    Hi,

    Thank you for developing this amazing tool.

    I recently modified the yolact.py and added one more mask so that the network predicts 2 of them (instead of 1). I am also using a custom dataset. The training works fine but eval.py is showing the following error:

    File "eval.py", line 173, in prep_display
        masks = t[3][idx]
    IndexError: too many indices for tensor of dimension 3
    

    I checked the size of idx and t[3] to debug the issue and found the following information:

    idx: tensor([0, 1, 2, 3, 4])
    idx.size(): torch.Size([5])
    t[3].size(): torch.Size([100, 480, 640])
    

    It seems that the error is misleading. Anyway, after a while, I realized that in your eval.py the variable idx is simply sliced with [:args.top_k]. So I changed it, but then the output from eval.py was shown incorrectly.

    Moreover, the statement masks = t[3][idx] works fine if PredictionModuleTRTWrapper is disabled (just comment out 2 lines). I should also mention that the network shows the above error in the forward propagation step in eval.py. This leads me to think that the pred_layer is not set properly.

    Can you please help me out? Is the input_sizes variable set properly, or does it need some modification?

    Thank you so much.

    opened by ravijo 15
  • Zero mAP and no detections on custom dataset

    Dear Haotian Liu, I am currently trying yolact_edge trained on a custom COCO-like dataset for one class ('person'). Unlike COCO, the image resolution is 512.

    When I run evaluation with TensorRT conversion I get the following error during the protonet conversion:

    [02/23 15:26:23 yolact.eval]: Converting protonet to TensorRT...
    Traceback (most recent call last):
      File "eval.py", line 1241, in <module>
        convert_to_tensorrt(net, cfg, args, transform=BaseTransform())
      File "/home/oidpsv/yolact_edge/utils/tensorrt.py", line 156, in convert_to_tensorrt
        net.to_tensorrt_protonet(cfg.torch2trt_protonet_int8, calibration_dataset=calibration_protonet_dataset, batch_size=args.trt_batch_size)
      File "/home/oidpsv/yolact_edge/yolact.py", line 1565, in to_tensorrt_protonet
        self.trt_load_if("proto_net", trt_fn, [x], int8_mode, batch_size=batch_size)
      File "/home/oidpsv/yolact_edge/yolact.py", line 1534, in trt_load_if
        module = trt_fn(module, trt_fn_params)
      File "/opt/conda/lib/python3.6/site-packages/torch2trt-0.1.0-py3.6-linux-x86_64.egg/torch2trt/torch2trt.py", line 555, in torch2trto
        engine = builder.build_cuda_engine(network)
      File "/opt/conda/lib/python3.6/site-packages/torch2trt-0.1.0-py3.6-linux-x86_64.egg/torch2trt/calibration.py", line 51, in get_batch
        buffer[i].copy_(tensor)
    RuntimeError: The size of tensor a (69) must match the size of tensor b (64) at non-singleton dimension 2
    

    which can be fixed by changing line 1563 in yolact.py from x = torch.ones((1, 256, 69, 69)).cuda() to x = torch.ones((1, 256, 64, 64)).cuda().

    The problem is that in this case further evaluation with TensorRT conversion gives zero mAP and processing of images provides empty result (no masks or boxes). Could you please help me? Many thanks.
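
    For context, the hard-coded 69x69 corresponds to the P3 feature map of the default 550x550 input (three stride-2 convolutions with ceiling rounding: 550 -> 275 -> 138 -> 69; for 512 input: 512 -> 256 -> 128 -> 64). A hedged sketch of deriving the dummy-input size from the configured resolution instead of hard-coding it, assuming cfg.max_size holds the input size as elsewhere in the config:

    # Hypothetical replacement for the hard-coded dummy input in to_tensorrt_protonet:
    # ceiling-divide the configured input size by the effective stride of 8 at P3,
    # so 550 -> 69 and 512 -> 64.
    proto_hw = (cfg.max_size + 7) // 8
    x = torch.ones((1, 256, proto_hw, proto_hw)).cuda()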

    opened by ghost 14
  • No output | Custom dataset | TensorRT

    Hi @haotian-liu,

    I'm working on a custom instance segmentation task with three classes. While I get output segmentations on my Jetson Xavier by using the --disable_tensorrt flag, there's no output when I run the model with TensorRT.

    I'm training a ResNet50 model on my PC and transferring the learned model to Jetson for inference.

    Initially, I suspected the error was similar to issue #27, as I got IndexError warnings when enabling TensorRT. But while debugging I found that the except blocks do no harm.

    My commented detection.py file:

    # This try-except block aims to fix the IndexError that we might encounter when we train
    # on custom datasets and evaluate with TensorRT enabled.
    # See https://github.com/haotian-liu/yolact_edge/issues/27.
    try:
        classes = classes[keep]
        boxes = boxes[keep]
        masks = masks[keep]
        scores = scores[keep]

        print("Passed first Try/Except")
    except IndexError:
        from utils.logging_helper import log_once
        log_once(self, "issue_27_flatten", name="yolact.layers.detect",
                 message="Encountered IndexError as mentioned in https://github.com/haotian-liu/yolact_edge/issues/27. Flattening predictions to avoid error, please verify the outputs. If there are any problems you met related to this, please report an issue.")

        classes = torch.flatten(classes, end_dim=1)
        boxes = torch.flatten(boxes, end_dim=1)
        masks = torch.flatten(masks, end_dim=1)
        scores = torch.flatten(scores, end_dim=1)
        keep = torch.flatten(keep, end_dim=1)

        idx = torch.nonzero(keep, as_tuple=True)[0]
        print(f"\nIdx: {idx}")
        print(f"Idx_min: {idx.min()} and Idx_max: {idx.max()}")

        classes = torch.index_select(classes, 0, idx)
        boxes = torch.index_select(boxes, 0, idx)
        masks = torch.index_select(masks, 0, idx)
        scores = torch.index_select(scores, 0, idx)

    # Only keep the top cfg.max_num_detections highest scores across all classes
    scores, idx = scores.sort(0, descending=True)
    idx = idx[:cfg.max_num_detections]
    scores = scores[:cfg.max_num_detections]

    print(f"\nIdx: {idx}")
    print(f"Idx_min: {idx.min()} and Idx_max: {idx.max()}")

    try:
        print(f"\nInside second Try")

        print(f"Classes: {classes}")
        print(f"Boxes: {boxes}")

        classes = classes[idx]
        print(f"Classes updated: {classes}")

        boxes = boxes[idx]
        print(f"Boxes updated: {boxes}")

        masks = masks[idx]

        print(f"Scores: {scores}")
    except IndexError:
        from utils.logging_helper import log_once
        log_once(self, "issue_27_index_select", name="yolact.layers.detect",
                 message="Encountered IndexError as mentioned in https://github.com/haotian-liu/yolact_edge/issues/27. Using `torch.index_select` to avoid error, please verify the outputs. If there are any problems you met related to this, please report an issue.")

        print(f"\nSecond Try/Except")

        classes = torch.index_select(classes, 0, idx)
        boxes = torch.index_select(boxes, 0, idx)
        masks = torch.index_select(masks, 0, idx)

        print(f"Classes updated: {classes}")
        print(f"Boxes updated: {boxes}")
        print(f"Scores: {scores}")

    return boxes, masks, classes, scores

    Command that I use to run evaluation:

    ~/Projects/yolact_edge$ python3 eval.py --config=yolact_edge_config --trained_model=weights/yolact_edge_2115_110000.pth --score_threshold=0.3 --top_k=20 --image=./test_input/020801_2020_11_25_11_54_18.png
    [01/27 15:04:38 yolact.eval]: Loading model...
    [01/27 15:04:42 yolact.eval]: Model loaded.
    [01/27 15:04:42 yolact.eval]: Converting to TensorRT...
    [01/27 15:04:42 yolact.eval]: Converting backbone to TensorRT...
    [01/27 15:04:44 yolact.eval]: Converting protonet to TensorRT...
    [01/27 15:04:44 yolact.eval]: Converting FPN to TensorRT...
    [01/27 15:04:44 yolact.eval]: Converting PredictionModule to TensorRT...
    [01/27 15:04:55 yolact.eval]: Converted to TensorRT.
    WARNING [01/27 15:04:56 yolact.layers.detect]: Encountered IndexError as mentioned in https://github.com/haotian-liu/yolact_edge/issues/27. Flattening predictions to avoid error, please verify the outputs. If there are any problems you met related to this, please report an issue.
    
    Idx: tensor([ 0,  1,  2,  9, 11, 13, 15, 16, 18, 24, 26, 27, 30, 31, 34])
    Idx_min: 0 and Idx_max: 34
    
    Idx: tensor([ 0, 10,  1,  2, 11, 12,  5, 13,  6, 14,  7,  3,  8,  9,  4])
    Idx_min: 0 and Idx_max: 14
    
    Inside second Try
    Classes: tensor([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2])
    Boxes: tensor([[ 0.7628,  0.4625,  0.7711,  0.5278],
            [ 0.9843,  0.7867,  0.9961,  0.8066],
            [ 0.8758,  0.4074,  0.9116,  0.5297],
            [ 0.7709, -0.0089,  0.9846,  0.9502],
            [ 0.7592,  0.1906,  0.7628,  0.2080],
            [ 0.9848,  0.7869,  0.9971,  0.8092],
            [ 0.7709, -0.0089,  0.9846,  0.9502],
            [ 0.7628,  0.4625,  0.7711,  0.5278],
            [ 0.8758,  0.4074,  0.9116,  0.5297],
            [ 0.7592,  0.1906,  0.7628,  0.2080],
            [ 0.7592,  0.1906,  0.7633,  0.2087],
            [ 0.7709, -0.0089,  0.9846,  0.9502],
            [ 0.8753,  0.4015,  0.9084,  0.5413],
            [ 0.9848,  0.7869,  0.9971,  0.8092],
            [ 0.7628,  0.4625,  0.7711,  0.5278]])
    WARNING [01/27 15:04:56 yolact.layers.detect]: Encountered IndexError as mentioned in https://github.com/haotian-liu/yolact_edge/issues/27. Using `torch.index_select` to avoid error, please verify the outputs. If there are any problems you met related to this, please report an issue.
    
    Second Try/Except
    Classes updated: tensor([0, 2, 0, 0, 2, 2, 1, 2, 1, 2, 1, 0, 1, 1, 0])
    Boxes updated: tensor([[ 0.7628,  0.4625,  0.7711,  0.5278],
            [ 0.7592,  0.1906,  0.7633,  0.2087],
            [ 0.9843,  0.7867,  0.9961,  0.8066],
            [ 0.8758,  0.4074,  0.9116,  0.5297],
            [ 0.7709, -0.0089,  0.9846,  0.9502],
            [ 0.8753,  0.4015,  0.9084,  0.5413],
            [ 0.9848,  0.7869,  0.9971,  0.8092],
            [ 0.9848,  0.7869,  0.9971,  0.8092],
            [ 0.7709, -0.0089,  0.9846,  0.9502],
            [ 0.7628,  0.4625,  0.7711,  0.5278],
            [ 0.7628,  0.4625,  0.7711,  0.5278],
            [ 0.7709, -0.0089,  0.9846,  0.9502],
            [ 0.8758,  0.4074,  0.9116,  0.5297],
            [ 0.7592,  0.1906,  0.7628,  0.2080],
            [ 0.7592,  0.1906,  0.7628,  0.2080]])
    Scores: tensor([9.8882e-01, 9.0431e-01, 8.8910e-01, 7.1460e-01, 5.0766e-01, 1.0031e-03,
            4.2995e-04, 3.9997e-04, 2.7969e-04, 1.7686e-04, 1.9921e-05, 1.6216e-05,
            5.4856e-06, 1.4189e-06, 1.8612e-07])
    

    Please note: if I add the --disable_tensorrt flag, I get results, since the code executes the try blocks. I also make sure to remove the cached '.trt' files in the ./weights folder. Can you help me here? Thank you!

    opened by smahesh2694 14
  • RuntimeError: non-empty 3D or 4D input tensor expected but got ndim: 4

    Hi, when I run detection on an image, there is an error: RuntimeError: non-empty 3D or 4D input tensor expected but got ndim: 4. I hope someone can help me. Thanks!

    opened by Yang-Changhui 12
  • 30fps performance on RTX3070

    Hi

    Your paper claims that YolactEdge runs at 67 FPS on an RTX 2080 Ti without TensorRT. But I tested it on an RTX 3070 and obtained 30 FPS, even though the RTX 3070 and RTX 2080 Ti are quite similar in performance. I am using a Windows machine, but I don't believe that is why the performance is 2x lower than claimed. I would appreciate your feedback.

    Thanks.

    opened by spacewalk01 11
  • Problems of Evaluation Output: "rock", "showel", "teeth"

    [attached image: 000000084440]

    I downloaded the YolactEdge R-101-FPN model and some other trained models to run the evaluation code. But as shown in the image, all the detection results are labeled "rock", "showel", and "teeth". Could you give me some advice about this problem? Thanks a lot!

    Environment:

    • OS: Ubuntu 20.04
    • GPU: GTX1660 ti
    • CUDA Version 11.4.4
    opened by ChrisLong13 2
  • Running inference on video with pre-trained weights is stuck in the "converting to TensorRT" step

    Describe the bug: Running inference on video with pre-trained weights is stuck in the "converting to TensorRT" step.

    Full logs: log while running the command: !python3 ./yolact_edge/eval.py --trained_model=./yolact_edge/weights/yolact_edge/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --video={file_path}:{output_path} --calib_images {calib}

    [attached screenshot of the stuck conversion logs]

    After killing the script:

    [attached screenshot after killing the script]

    Environment: Colab Jupyter file from the yolact_edge GitHub (slightly modified to accommodate gdrive imports and some other stuff, but with the same workings and installation procedure).

    opened by marcoluis97 0
  • Bug in augmentation configuration

    In SSDAugmentation and SSDAugmentationVideo there is an error in the enabling of the rotation augmentation. The line enable_if(cfg.augment_random_flip, RandomRot90()) should be changed to enable_if(cfg.augment_random_rot90, RandomRot90()) in both classes, as sketched below.

    https://github.com/haotian-liu/yolact_edge/blob/3f423ede1aeac73dbf86dadfe85af9e288f7f99b/yolact_edge/utils/augmentations.py#L908

    https://github.com/haotian-liu/yolact_edge/blob/3f423ede1aeac73dbf86dadfe85af9e288f7f99b/yolact_edge/utils/augmentations.py#L932
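
    A sketch of the one-line fix the issue proposes, in both classes (assuming a cfg.augment_random_rot90 flag exists alongside cfg.augment_random_flip):

    # before (bug): RandomRot90 is gated on the flip flag
    enable_if(cfg.augment_random_flip, RandomRot90())
    # after (proposed fix): gate it on the rot90 flag instead
    enable_if(cfg.augment_random_rot90, RandomRot90())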

    opened by domef 1
  • yolact_edge with deepstream?

    NOT AN ISSUE BUT A QUESTION

    Is there a documented way to run yolact_edge with DeepStream?

    As far as I know, one would need an .engine file and a .cpp file in order to parse the output of the model and send it to the nvinfer and nvosd DeepStream plugins.

    Any kind of hint will be appreciated. Thanks!

    opened by aurelm95 0