FairMOT - A simple baseline for one-shot multi-object tracking

Overview

[FairMOT pipeline overview figure]

A simple baseline for one-shot multi-object tracking:

FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking,
Yifu Zhang, Chunyu Wang, Xinggang Wang, Wenjun Zeng, Wenyu Liu,
arXiv technical report (arXiv 2004.01888)

Abstract

There has been remarkable progress on object detection and re-identification in recent years, which are the core components of multi-object tracking. However, little attention has been paid to accomplishing the two tasks in a single network to improve the inference speed. The initial attempts along this path ended up with degraded results mainly because the re-identification branch is not appropriately learned. In this work, we study the essential reasons behind the failure, and accordingly present a simple baseline to address the problems. It remarkably outperforms the state-of-the-art methods on the MOT challenge datasets at 30 FPS. We hope this baseline could inspire and help evaluate new ideas in this field.

News

  • (2021.05.24) A light version of FairMOT using the yolov5s backbone is released!
  • (2020.09.10) A new version of FairMOT is released! (73.7 MOTA on MOT17)

Main updates

  • We pretrain FairMOT on the CrowdHuman dataset using a weakly-supervised learning approach.
  • To detect bounding boxes that extend outside the image, we replace the 2-channel WH head with a 4-channel head that predicts the left, top, right and bottom distances from the object center (see the sketch below).
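
As an illustration (a minimal sketch, not the repository's decoding code), a 4-channel ltrb head recovers a box from its distances to the center peak, so the box is not forced to be symmetric around the center and can reach past the image border, unlike a 2-channel WH head:

import numpy as np

def decode_ltrb(cx, cy, ltrb):
    # 4-channel head: distances from the center peak to the left, top, right and bottom edges.
    left, top, right, bottom = ltrb
    return np.array([cx - left, cy - top, cx + right, cy + bottom])  # x1, y1, x2, y2

def decode_wh(cx, cy, wh):
    # 2-channel head: the box is always symmetric around the peak.
    w, h = wh
    return np.array([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])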

Tracking performance

Results on MOT challenge test set

Dataset MOTA IDF1 IDS MT ML FPS
2DMOT15 60.6 64.7 591 47.6% 11.0% 30.5
MOT16 74.9 72.8 1074 44.7% 15.9% 25.9
MOT17 73.7 72.3 3303 43.2% 17.3% 25.9
MOT20 61.8 67.3 5243 68.8% 7.6% 13.2

All of the results are obtained on the MOT challenge evaluation server under the “private detector” protocol. We rank first among all the trackers on 2DMOT15, MOT16, MOT17 and MOT20. The tracking speed of the entire system can reach up to 30 FPS.
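
For reference, MOTA (the leading metric in the table above) is the standard CLEAR-MOT score: it accumulates false negatives, false positives and identity switches over all frames and normalizes by the total number of ground-truth objects,

\text{MOTA} = 1 - \frac{\sum_t \left(\mathrm{FN}_t + \mathrm{FP}_t + \mathrm{IDSW}_t\right)}{\sum_t \mathrm{GT}_t}

IDF1 measures how consistently identities are preserved, IDS is the number of identity switches, and MT/ML are the fractions of mostly-tracked and mostly-lost ground-truth trajectories.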

Video demos on MOT challenge test set

Installation

  • Clone this repo; we'll call the directory that you cloned ${FAIRMOT_ROOT}
  • Install dependencies. We use Python 3.8 and PyTorch >= 1.7.0:
conda create -n FairMOT
conda activate FairMOT
conda install pytorch==1.7.0 torchvision==0.8.0 cudatoolkit=10.2 -c pytorch
cd ${FAIRMOT_ROOT}
pip install cython
pip install -r requirements.txt
  • We use DCNv2_pytorch_1.7 in our backbone network (pytorch_1.7 branch). Previous versions can be found in DCNv2. (A quick build check is sketched after this list.)
git clone -b pytorch_1.7 https://github.com/ifzhang/DCNv2.git
cd DCNv2
./make.sh
  • In order to run the code for demos, you also need to install ffmpeg.
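
As referenced above, after ./make.sh finishes, a quick sanity check (a minimal sketch, assuming the pytorch_1.7 branch layout in which dcn_v2.py wraps the compiled _ext extension, and a CUDA-capable GPU) is to build and run one DCN layer from inside the DCNv2 directory:

import torch
from dcn_v2 import DCN  # raises "No module named '_ext'" if the build did not succeed

layer = DCN(64, 64, (3, 3), stride=1, padding=1).cuda()  # in/out channels, kernel size, stride, padding
x = torch.randn(2, 64, 32, 32).cuda()
print(layer(x).shape)  # expected: torch.Size([2, 64, 32, 32])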

Data preparation

  • CrowdHuman The CrowdHuman dataset can be downloaded from their official webpage. After downloading, you should prepare the data in the following structure:
crowdhuman
   |——————images
   |        └——————train
   |        └——————val
   └——————labels_with_ids
   |         └——————train(empty)
   |         └——————val(empty)
   └------annotation_train.odgt
   └------annotation_val.odgt

If you want to pretrain on CrowdHuman (we train Re-ID on CrowdHuman), you can change the paths in src/gen_labels_crowd_id.py and run:

cd src
python gen_labels_crowd_id.py

If you want to add CrowdHuman to the MIX dataset (we do not train Re-ID on CrowdHuman), you can change the paths in src/gen_labels_crowd_det.py and run:

cd src
python gen_labels_crowd_det.py
  • MIX We use the same training data as JDE in this part and we call it "MIX". Please refer to their DATA ZOO to download and prepare all the training data including Caltech Pedestrian, CityPersons, CUHK-SYSU, PRW, ETHZ, MOT17 and MOT16.
  • 2DMOT15 and MOT20 2DMOT15 and MOT20 can be downloaded from the official webpage of MOT challenge. After downloading, you should prepare the data in the following structure:
MOT15
   |——————images
   |        └——————train
   |        └——————test
   └——————labels_with_ids
            └——————train(empty)
MOT20
   |——————images
   |        └——————train
   |        └——————test
   └——————labels_with_ids
            └——————train(empty)

Then, you can change the seq_root and label_root in src/gen_labels_15.py and src/gen_labels_20.py and run:

cd src
python gen_labels_15.py
python gen_labels_20.py

to generate the labels of 2DMOT15 and MOT20. The seqinfo.ini files of 2DMOT15 can be downloaded here: [Google], [Baidu] (code: 8o0w).

Pretrained models and baseline model

  • Pretrained models

DLA-34 COCO pretrained model: DLA-34 official. HRNetV2 ImageNet pretrained model: HRNetV2-W18 official, HRNetV2-W32 official. After downloading, you should put the pretrained models in the following structure:

${FAIRMOT_ROOT}
   └——————models
           └——————ctdet_coco_dla_2x.pth
           └——————hrnetv2_w32_imagenet_pretrained.pth
           └——————hrnetv2_w18_imagenet_pretrained.pth
  • Baseline model

Our baseline FairMOT model (DLA-34 backbone) is pretrained on the CrowdHuman dataset for 60 epochs with the self-supervised learning approach and then trained on the MIX dataset for 30 epochs. The models can be downloaded here: crowdhuman_dla34.pth [Google] [Baidu, code:ggzx] [Onedrive], fairmot_dla34.pth [Google] [Baidu, code:uouv] [Onedrive]. (This is the model with which we get 73.7 MOTA on the MOT17 test set.) After downloading, you should put the baseline model in the following structure:

${FAIRMOT_ROOT}
   └——————models
           └——————fairmot_dla34.pth
           └——————...

Training

  • Download the training data
  • Change the dataset root directory 'root' in src/lib/cfg/data.json and 'data_dir' in src/lib/opts.py
  • Pretrain on CrowdHuman and train on MIX:
sh experiments/crowdhuman_dla34.sh
sh experiments/mix_ft_ch_dla34.sh
  • Only train on MIX:
sh experiments/mix_dla34.sh
  • Only train on MOT17:
sh experiments/mot17_dla34.sh
  • Finetune on 2DMOT15 using the baseline model:
sh experiments/mot15_ft_mix_dla34.sh
  • Train on MOT20: The data annotation of MOT20 is a little different from MOT17: the coordinates of the bounding boxes all lie inside the image, so we need to uncomment lines 313 to 316 in the dataset file src/lib/datasets/dataset/jde.py:
#np.clip(xy[:, 0], 0, width, out=xy[:, 0])
#np.clip(xy[:, 2], 0, width, out=xy[:, 2])
#np.clip(xy[:, 1], 0, height, out=xy[:, 1])
#np.clip(xy[:, 3], 0, height, out=xy[:, 3])

Then, we can train on the mix dataset and finetune on MOT20:

sh experiments/crowdhuman_dla34.sh
sh experiments/mix_ft_ch_dla34.sh
sh experiments/mot20_ft_mix_dla34.sh

The MOT20 model 'mot20_fairmot.pth' can be downloaded here: [Google] [Baidu, code:jmce].

  • For the ablation study, we use MIX and half of MOT17 as training data; you can use different backbones such as ResNet, ResNet-FPN, HRNet and DLA:
sh experiments/mix_mot17_half_dla34.sh
sh experiments/mix_mot17_half_hrnet18.sh
sh experiments/mix_mot17_half_res34.sh
sh experiments/mix_mot17_half_res34fpn.sh
sh experiments/mix_mot17_half_res50.sh

The ablation study model 'mix_mot17_half_dla34.pth' can be downloaded here: [Google] [Onedrive] [Baidu, code:iifa].

  • Performance on the test set of MOT17 when using different training data:
Training Data MOTA IDF1 IDS
MOT17 69.8 69.9 3996
MIX 72.9 73.2 3345
CrowdHuman + MIX 73.7 72.3 3303
  • We use CrowdHuman, MIX and MOT17 to train the light version of FairMOT using yolov5s as backbone:
sh experiments/all_yolov5s.sh

The pretrained model of yolov5s on the COCO dataset can be downloaded here: [Google] [Baidu, code:wh9h].

The model of the light version 'fairmot_yolov5s' can be downloaded here: [Google] [Baidu, code:2y3a].

Tracking

  • The default settings run tracking on the validation dataset from 2DMOT15. Using the baseline model, you can run:
cd src
python track.py mot --load_model ../models/fairmot_dla34.pth --conf_thres 0.6

to see the tracking results (76.5 MOTA and 79.3 IDF1 using the baseline model). You can also set save_images=True in src/track.py to save the visualization results of each frame.

  • For the ablation study, we evaluate on the other half of the MOT17 training set; you can run:
cd src
python track_half.py mot --load_model ../exp/mot/mix_mot17_half_dla34.pth --conf_thres 0.4 --val_mot17 True

If you use our pretrained model 'mix_mot17_half_dla34.pth', you can get 69.1 MOTA and 72.8 IDF1.

  • To get the txt results of the test set of MOT16 or MOT17, you can run:
cd src
python track.py mot --test_mot17 True --load_model ../models/fairmot_dla34.pth --conf_thres 0.4
python track.py mot --test_mot16 True --load_model ../models/fairmot_dla34.pth --conf_thres 0.4
  • To run tracking using the light version of FairMOT (68.5 MOTA on the test of MOT17), you can run:
cd src
python track.py mot --test_mot17 True --load_model ../models/fairmot_yolov5s.pth --conf_thres 0.4 --arch yolo --reid_dim 64

and send the txt files to the MOT challenge evaluation server to get the results. (You can get the SOTA result of 73+ MOTA on the MOT17 test set using the baseline model 'fairmot_dla34.pth'.) A sketch for parsing these txt files is given at the end of this section.

  • To get the SOTA results of 2DMOT15 and MOT20, run the tracking code:
cd src
python track.py mot --test_mot15 True --load_model your_mot15_model.pth --conf_thres 0.3
python track.py mot --test_mot20 True --load_model your_mot20_model.pth --conf_thres 0.3

All results on the test sets need to be evaluated on the MOT challenge server. You can see the tracking results on the training set by setting --val_motxx True and running the tracking code. We set 'conf_thres' to 0.4 for MOT16 and MOT17, and to 0.3 for 2DMOT15 and MOT20.
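
The generated txt files follow the MOTChallenge submission format, one object per line: frame, id, bb_left, bb_top, bb_width, bb_height, conf, -1, -1, -1. A minimal sketch for loading such a file, assuming that layout (the file name is a placeholder):

import csv

def load_mot_results(path):
    # Parse a MOTChallenge-style result file into (frame, track_id, x, y, w, h, score) tuples.
    rows = []
    with open(path) as f:
        for frame, track_id, x, y, w, h, score, *rest in csv.reader(f):
            rows.append((int(frame), int(track_id), float(x), float(y),
                         float(w), float(h), float(score)))
    return rows

results = load_mot_results('MOT17-01.txt')  # placeholder file name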

Demo

You can input a raw video and get an MP4 demo video by running src/demo.py:

cd src
python demo.py mot --load_model ../models/fairmot_dla34.pth --conf_thres 0.4

You can change --input-video and --output-root to get the demos of your own videos. --conf_thres can be set from 0.3 to 0.7 depending on your own videos.

Train on custom dataset

You can train FairMOT on a custom dataset by following the steps below:

  1. Generate one txt label file for each image. Each line of the txt label file represents one object, in the format: "class id x_center/img_width y_center/img_height w/img_width h/img_height". You can modify src/gen_labels_16.py to generate label files for your custom dataset (see the sketch after this list).
  2. Generate files containing image paths. The example files are in src/data/. Similar code can be found in src/gen_labels_crowd.py.
  3. Create a json file for your custom dataset in src/lib/cfg/. You need to specify the "root" and "train" keys in the json file. You can find some examples in src/lib/cfg/.
  4. Add --data_cfg '../src/lib/cfg/your_dataset.json' when training.
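
As a rough illustration of steps 1 and 3, here is a minimal, hypothetical sketch: the paths, image size, box values and dataset name are placeholders, and the cfg only shows the "root" and "train" keys, so follow the existing files in src/lib/cfg/ and src/data/ for the full layout.

import json
import os

def write_label_line(f, cls, track_id, x1, y1, w, h, img_w, img_h):
    # One object per line: class id x_center/img_width y_center/img_height w/img_width h/img_height
    xc = (x1 + w / 2) / img_w
    yc = (y1 + h / 2) / img_h
    f.write(f'{cls} {track_id} {xc:.6f} {yc:.6f} {w / img_w:.6f} {h / img_h:.6f}\n')

# Step 1: one txt label file per image (placeholder path and box values).
os.makedirs('labels_with_ids/train/seq01', exist_ok=True)
with open('labels_with_ids/train/seq01/000001.txt', 'w') as f:
    write_label_line(f, cls=0, track_id=1, x1=100, y1=200, w=50, h=120, img_w=1920, img_h=1080)

# Step 3: a minimal dataset cfg with the "root" and "train" keys (run from ${FAIRMOT_ROOT}).
cfg = {'root': '/path/to/your_dataset',
       'train': {'your_dataset': './data/your_dataset.train'}}
with open('src/lib/cfg/your_dataset.json', 'w') as f:
    json.dump(cfg, f, indent=4)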

Acknowledgement

A large part of the code is borrowed from Zhongdao/Towards-Realtime-MOT and xingyizhou/CenterNet. Thanks for their wonderful work.

Citation

@article{zhang2020fair,
  title={FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking},
  author={Zhang, Yifu and Wang, Chunyu and Wang, Xinggang and Zeng, Wenjun and Liu, Wenyu},
  journal={arXiv preprint arXiv:2004.01888},
  year={2020}
}
Comments
  • On Windows 10 I get ModuleNotFoundError: No module named '_ext'

    My lib versions

    python 3.6.8
    torch==1.4.0
    torchsummary==1.5.1
    torchvision==0.5.0
    

    I got everything else installed but I get this error on Windows 10 when running demo.py

    (pytorch) E:\Documents\Projects\FairMOT\src>python demo.py mot --load_model ../models/hrnetv2_w32_imagenet_pretrained.pth
    Traceback (most recent call last):
      File "demo.py", line 14, in <module>
        from track import eval_seq
      File "E:\Documents\Projects\FairMOT\src\track.py", line 15, in <module>
        from tracker.multitracker import JDETracker
      File "E:\Documents\Projects\FairMOT\src\lib\tracker\multitracker.py", line 12, in <module>
        from lib.models.model import create_model, load_model
      File "E:\Documents\Projects\FairMOT\src\lib\models\model.py", line 11, in <module>
        from .networks.pose_dla_dcn import get_pose_net as get_dla_dcn
      File "E:\Documents\Projects\FairMOT\src\lib\models\networks\pose_dla_dcn.py", line 16, in <module>
        from .DCNv2.dcn_v2 import DCN
      File "E:\Documents\Projects\FairMOT\src\lib\models\networks\DCNv2\dcn_v2.py", line 13, in <module>
        import _ext as _backend
    ModuleNotFoundError: No module named '_ext'
    

    Though I have the .ext and .lib files copied in the same folder as the script

    Also, python setup.py build develop didn't succeed; the error output is attached as out.txt.

    opened by zubairahmed-ai 70
  • Train on custom data

    I want to train on my custom dataset and i have some questions about that.

    • What is the root path in the cfg file?
    • What should the path for opt.data_dir be?
    • In my videos, some frames don't have any objects, so when I use gen_labels_16.py it only generates txt files for frames that have objects; I added empty txt files for the frames that don't have objects. Is my solution right? If not, how should I deal with this? My log when running with empty files:

    Screenshot from 2021-06-30 20-08-06

    opened by LeDuySon 13
  • GPU compatibility DCNv2_new not out the box

    Following just the steps in the repo, I found that building DCNv2 does not work on my machine, which has an NVIDIA Tesla K80.

    The exact error I get when running cd src && python track.py mot --load_model ../models/all_dla34.pth --conf_thres 0.6 is: RuntimeError: Not compiled with GPU support (dcn_v2_forward at /{FAIRMOT_PATH}/src/lib/models/networks/DCNv2_new/src/dcn_v2.h:35)

    opened by FlorentijnD 10
  • Problem about multi-Class multi-object tracking

    Hi, I have successfully modified FairMOT to train on and detect multiple classes, but a problem occurs with the tracking updates: the track IDs of different classes are linked together, that is, a person's track ID starts from the maximum track ID of the previous class, car (e.g. 15), rather than from 1. How can I fix this? I'm a newcomer to tracking; here is my main code:

    def update(self, im_blob, img0):
        self.frame_id += 1
    
        activated_starcks_dict = defaultdict(list)
        refind_stracks_dict = defaultdict(list)
        lost_stracks_dict = defaultdict(list)
        removed_stracks_dict = defaultdict(list)
        output_stracks_dict = defaultdict(list)
    
        width = img0.shape[1]
        height = img0.shape[0]
        inp_height = im_blob.shape[2]
        inp_width = im_blob.shape[3]
    
        c = np.array([width / 2., height / 2.], dtype=np.float32)
        s = max(float(inp_width) / float(inp_height) * height, width) * 1.0
        meta = {'c': c,
                's': s,
                'out_height': inp_height // self.opt.down_ratio,
                'out_width': inp_width // self.opt.down_ratio}
    
        ''' Step 1: Network forward, get detections & embeddings'''
        with torch.no_grad():  
            output = self.model.forward(im_blob)[-1]
    
            hm = output['hm'].sigmoid_()
            # print("hm shape ", hm.shape, "hm:\n", hm)
    
            wh = output['wh']
            # print("wh shape ", wh.shape, "wh:\n", wh)
    
            id_feature = output['id']
            id_feature = F.normalize(id_feature, dim=1)
    
            reg = output['reg'] if self.opt.reg_offset else None
            # print("reg shape ", reg.shape, "reg:\n", reg)
    
            dets, inds, cls_inds_mask = mot_decode(heatmap=hm,
                                                   wh=wh,
                                                   reg=reg,
                                                   num_classes=self.opt.num_classes,
                                                   cat_spec_wh=self.opt.cat_spec_wh,
                                                   K=self.opt.K)
    
            # id_feature = _tranpose_and_gather_feat(id_feature, inds)
            # id_feature = id_feature.squeeze(0)  # K × FeatDim
            # id_feature = id_feature.cpu().numpy()
    
            # ----- 
            cls_id_feats = []  
            for cls_id in range(self.opt.num_classes): 
                cls_inds = inds[:, cls_inds_mask[cls_id]]
    
                cls_id_feature = _tranpose_and_gather_feat(id_feature, cls_inds)  # inds: 1×128
                cls_id_feature = cls_id_feature.squeeze(0)  # n × FeatDim
                cls_id_feature = cls_id_feature.cpu().numpy() 
                cls_id_feats.append(cls_id_feature)
    
        dets = self.post_process(dets, meta)
        dets = self.merge_outputs([dets])
        # dets = self.merge_outputs(dets)[1]
    
        for cls_id in range(self.opt.num_classes):  # cls_id starts with 0
            cls_dets = dets[cls_id + 1]
    
            '''
            for i in range(0, cls_dets.shape[0]):
                bbox = cls_dets[i][0:4]
                cv2.rectangle(img0,
                              (bbox[0], bbox[1]),  # left-top point
                              (bbox[2], bbox[3]),  # right-down point
                              [0, 255, 255],  # yellow
                              2)
                cv2.putText(img0,
                            id2cls[cls_id],
                            (bbox[0], bbox[1]),
                            cv2.FONT_HERSHEY_PLAIN,
                            1.3,
                            [0, 0, 255],  # red
                            2)
            cv2.imshow('{}'.format(id2cls[cls_id]), img0)
            cv2.waitKey(0)
            '''
    
            remain_inds = cls_dets[:, 4] > self.opt.conf_thres
            cls_dets = cls_dets[remain_inds]
            cls_id_feature = cls_id_feats[cls_id][remain_inds]
    
            if len(cls_dets) > 0:
                '''Detections, tlbrs: top left bottom right score'''
                cls_detections = [STrack(STrack.tlbr_to_tlwh(tlbrs[:4]), tlbrs[4], feat, buff_size=30)
                                  for (tlbrs, feat) in zip(cls_dets[:, :5], cls_id_feature)]
            else:
                cls_detections = []
    
            ''' Add newly detected tracklets to tracked_stracks'''
            unconfirmed_dict = defaultdict(list)
            tracked_stracks_dict = defaultdict(list)  # type: key(cls_id), value: list[STrack]
            for track in self.tracked_stracks_dict[cls_id]:
                if not track.is_activated:
                    unconfirmed_dict[cls_id].append(track)
                else:
                    tracked_stracks_dict[cls_id].append(track)
    
            ''' Step 2: First association, with embedding'''
            strack_pool_dict = defaultdict(list)
            strack_pool_dict[cls_id] = joint_stracks(tracked_stracks_dict[cls_id], self.lost_stracks_dict[cls_id])
    
            # Predict the current location with KF
            # for strack in strack_pool:
            STrack.multi_predict(strack_pool_dict[cls_id])
            dists = matching.embedding_distance(strack_pool_dict[cls_id], cls_detections)
            dists = matching.fuse_motion(self.kalman_filter, dists, strack_pool_dict[cls_id], cls_detections)
            matches, u_track, u_detection = matching.linear_assignment(dists, thresh=0.7)
    
            for i_tracked, i_det in matches:
                track = strack_pool_dict[cls_id][i_tracked]
                det = cls_detections[i_det]
                if track.state == TrackState.Tracked:
                    track.update(cls_detections[i_det], self.frame_id)
                    activated_starcks_dict[cls_id].append(track)  # for multi-class
                else:
                    track.re_activate(det, self.frame_id, new_id=False)
                    refind_stracks_dict[cls_id].append(track)
    
            ''' Step 3: Second association, with IOU'''
            cls_detections = [cls_detections[i] for i in u_detection]
            r_tracked_stracks = [strack_pool_dict[cls_id][i]
                                 for i in u_track if strack_pool_dict[cls_id][i].state == TrackState.Tracked]
            dists = matching.iou_distance(r_tracked_stracks, cls_detections)
            matches, u_track, u_detection = matching.linear_assignment(dists, thresh=0.5)
    
            for i_tracked, i_det in matches:
                track = r_tracked_stracks[i_tracked]
                det = cls_detections[i_det]
                if track.state == TrackState.Tracked:
                    track.update(det, self.frame_id)
                    activated_starcks_dict[cls_id].append(track)
                else:
                    track.re_activate(det, self.frame_id, new_id=False)
                    refind_stracks_dict[cls_id].append(track)
    
            for it in u_track:
                track = r_tracked_stracks[it]
                if not track.state == TrackState.Lost:
                    track.mark_lost()
                    lost_stracks_dict[cls_id].append(track)
    
            '''Deal with unconfirmed tracks, usually tracks with only one beginning frame'''
            cls_detections = [cls_detections[i] for i in u_detection]
            dists = matching.iou_distance(unconfirmed_dict[cls_id], cls_detections)
            matches, u_unconfirmed, u_detection = matching.linear_assignment(dists, thresh=0.7)
            for i_tracked, i_det in matches:
                unconfirmed_dict[cls_id][i_tracked].update(cls_detections[i_det], self.frame_id)
                activated_starcks_dict[cls_id].append(unconfirmed_dict[cls_id][i_tracked])
            for it in u_unconfirmed:
                track = unconfirmed_dict[cls_id][it]
                track.mark_removed()
                removed_stracks_dict[cls_id].append(track)
    
            """ Step 4: Init new stracks"""
            for i_new in u_detection:
                track = cls_detections[i_new]
                if track.score < self.det_thresh:
                    continue
                track.activate(self.kalman_filter, self.frame_id)
                activated_starcks_dict[cls_id].append(track)
    
            """ Step 5: Update state"""
            for track in self.lost_stracks_dict[cls_id]:
                if self.frame_id - track.end_frame > self.max_time_lost:
                    track.mark_removed()
                    removed_stracks_dict[cls_id].append(track)
    
            # print('Ramained match {} s'.format(t4-t3))
            self.tracked_stracks_dict[cls_id] = [t for t in self.tracked_stracks_dict[cls_id] if
                                                 t.state == TrackState.Tracked]
            self.tracked_stracks_dict[cls_id] = joint_stracks(self.tracked_stracks_dict[cls_id],
                                                              activated_starcks_dict[cls_id])
            self.tracked_stracks_dict[cls_id] = joint_stracks(self.tracked_stracks_dict[cls_id],
                                                              refind_stracks_dict[cls_id])
            self.lost_stracks_dict[cls_id] = sub_stracks(self.lost_stracks_dict[cls_id],
                                                         self.tracked_stracks_dict[cls_id])
            self.lost_stracks_dict[cls_id].extend(lost_stracks_dict[cls_id])
            self.lost_stracks_dict[cls_id] = sub_stracks(self.lost_stracks_dict[cls_id],
                                                         self.removed_stracks_dict[cls_id])
            self.removed_stracks_dict[cls_id].extend(removed_stracks_dict[cls_id])
            self.tracked_stracks_dict[cls_id], self.lost_stracks_dict[cls_id] = remove_duplicate_stracks(
                self.tracked_stracks_dict[cls_id],
                self.lost_stracks_dict[cls_id])
    
            # get scores of lost tracks
            output_stracks_dict[cls_id] = [track for track in self.tracked_stracks_dict[cls_id] if track.is_activated]
    
            logger.debug('===========Frame {}=========='.format(self.frame_id))
            logger.debug('Activated: {}'.format(
                [track.track_id for track in activated_starcks_dict[cls_id]]))
            logger.debug('Refind: {}'.format(
                [track.track_id for track in refind_stracks_dict[cls_id]]))
            logger.debug('Lost: {}'.format(
                [track.track_id for track in lost_stracks_dict[cls_id]]))
            logger.debug('Removed: {}'.format(
                [track.track_id for track in removed_stracks_dict[cls_id]]))
    
        return output_stracks_dict
    
    opened by CaptainEven 10
  • ModuleNotFoundError: No module named 'cython_bbox'

    I followed all the instructions for installation and then tried this command,

    python3 demo.py mot --load_model ../models/all_dla34.pth --conf_thres 0.4

    I installed cython_bbox using pip3 in the conda environment, but I am still receiving this error.

    opened by YumnaBatool 10
  • About the Re-ID part

    Hi, sorry for bothering you. I have two minor questions for the re-id part.

    1. In /src/train/mot.py, line 64:

       if opt.id_weight > 0:
           id_head = _tranpose_and_gather_feat(output['id'], batch['ind'])
           id_head = id_head[batch['reg_mask'] > 0].contiguous()

    Here, the output seems to be [12, 512, 152, 272], which should be N*C*W*H, but after the gather it changes from [12, 152*272, 512] to [12, 128, 512]. Can I understand this as only taking the first 128 points from the "batch"?

    2. Should I understand detection and re-id as a 2-stage process? First you get the center location of the bbox from the hm head, then use the center location to get the feature for re-id?

    Hope for your kind guidance,

    best regards

    opened by XnetDA 9
  • make error!

    CUDA 10.2 or CUDA 9.0, Ubuntu 18.04

    I followed your installation instructions, but make fails with an error. What can I do to fix this?

    sh make.sh running build running build_ext building '_ext' extension /home/quh/anaconda3/bin/x86_64-conda_cos6-linux-gnu-cc -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -Wstrict-prototypes -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -pipe -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -pipe -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/quh/anaconda3/include -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /home/quh/anaconda3/include -fPIC -DWITH_CUDA -I/home/quh/pythonwork/FairMOT/src/lib/models/networks/DCNv2/src -I/home/quh/anaconda3/envs/people-detection/lib/python3.7/site-packages/torch/include -I/home/quh/anaconda3/envs/people-detection/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/quh/anaconda3/envs/people-detection/lib/python3.7/site-packages/torch/include/TH -I/home/quh/anaconda3/envs/people-detection/lib/python3.7/site-packages/torch/include/THC -I:/usr/local/cuda-10.2:/usr/local/cuda/include -I/home/quh/anaconda3/include/python3.7m -c /home/quh/pythonwork/FairMOT/src/lib/models/networks/DCNv2/src/vision.cpp -o build/temp.linux-x86_64-3.7/home/quh/pythonwork/FairMOT/src/lib/models/networks/DCNv2/src/vision.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11 cc1plus: warning: command line option '-Wstrict-prototypes' is valid for C/ObjC but not for C++ /home/quh/anaconda3/bin/x86_64-conda_cos6-linux-gnu-cc -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -Wstrict-prototypes -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -pipe -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -pipe -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/quh/anaconda3/include -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /home/quh/anaconda3/include -fPIC -DWITH_CUDA -I/home/quh/pythonwork/FairMOT/src/lib/models/networks/DCNv2/src -I/home/quh/anaconda3/envs/people-detection/lib/python3.7/site-packages/torch/include -I/home/quh/anaconda3/envs/people-detection/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/quh/anaconda3/envs/people-detection/lib/python3.7/site-packages/torch/include/TH -I/home/quh/anaconda3/envs/people-detection/lib/python3.7/site-packages/torch/include/THC -I:/usr/local/cuda-10.2:/usr/local/cuda/include -I/home/quh/anaconda3/include/python3.7m -c /home/quh/pythonwork/FairMOT/src/lib/models/networks/DCNv2/src/cpu/dcn_v2_cpu.cpp -o build/temp.linux-x86_64-3.7/home/quh/pythonwork/FairMOT/src/lib/models/networks/DCNv2/src/cpu/dcn_v2_cpu.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11 cc1plus: warning: command line option '-Wstrict-prototypes' is valid for C/ObjC but not for C++ In file included from /home/quh/pythonwork/FairMOT/src/lib/models/networks/DCNv2/src/cpu/dcn_v2_cpu.cpp:4:0: /home/quh/anaconda3/envs/people-detection/lib/python3.7/site-packages/torch/include/ATen/cuda/CUDAContext.h:5:10: fatal error: cuda_runtime_api.h: No such file or directory #include <cuda_runtime_api.h> ^~~~~~~~~~~~~~~~~~~~ compilation terminated. error: command '/home/quh/anaconda3/bin/x86_64-conda_cos6-linux-gnu-cc' failed with exit status 1

    opened by quuhua911 9
  • TrackId switching on custom dataset

    Hi @ifzhang Thanks for sharing the great work that you have done!

    I have tried training the pre-trained model all_dla34.pth on my custom dataset, which is a person-based dataset. I want to keep the encoder and decoder weights fixed and only train the last layers of the model. The change I made in the code (in src/lib/models/model.py) is:

    for param in model.parameters():
        param.requires_grad = False

    for param in model.hm.parameters():
        param.requires_grad = True
    for param in model.id.parameters():
        param.requires_grad = True
    for param in model.reg.parameters():
        param.requires_grad = True
    for param in model.wh.parameters():
        param.requires_grad = True

    I made this change on the basis that these are the 4 heads of the model and I wanted to train just those layers. Can you let me know if this is the correct approach? I have attached my training log, which seems right to me, but with the trained model there is a lot of switching of IDs in the results.

    logs

    Thanks for the help in advance!

    opened by apekshapriya 8
  • all_hrnet_v2_w18.pth

    Thanks for your excellent work. I ran the demo with the model "all_dla34.pth" using the script below and everything is OK.

    python demo.py mot --load_model ../models/all_dla34.pth --reid_dim 128 --conf_thres 0.4

    But when I use the recently uploaded model "all_hrnet_v2_w18.pth" with the following command: python demo.py mot --load_model ../models/all_hrnet_v2_w18.pth --arch hrnet_18 --reid_dim 128 --conf_thres 0.4, I receive the following error:

    Creating model... 2020-04-24 10:01:09 [INFO]: unexpected EOF, expected 6133124 more bytes. The file might be corrupted. terminate called after throwing an instance of 'c10::Error' what(): owning_ptr == NullType::singleton() || owning_ptr->refcount_.load() > 0 INTERNAL ASSERT FAILED at /pytorch/c10/util/intrusive_ptr.h:348, please report a bug to PyTorch. intrusive_ptr: Can only intrusive_ptr::reclaim() owning pointers that were created using intrusive_ptr::release(). (reclaim at /pytorch/c10/util/intrusive_ptr.h:348) frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x33 (0x7f73cc075193 in /usr/local/lib/python3.6/dist-packages/torch/lib/libc10.so) frame #1: + 0x18cd59f (0x7f73cdd9559f in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch.so) frame #2: THStorage_free + 0x17 (0x7f73ce55dba7 in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch.so) frame #3: + 0x55d65d (0x7f7416fc665d in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch_python.so) frame #4: python3() [0x54f6e6] frame #5: python3() [0x5734e0] frame #6: python3() [0x4b1a28] frame #7: python3() [0x589078] frame #8: python3() [0x5ade68] frame #9: python3() [0x5ade7e] frame #10: python3() [0x5ade7e] frame #11: python3() [0x5ade7e] frame #12: python3() [0x5ade7e] frame #13: python3() [0x5ade7e] frame #14: python3() [0x5ade7e] frame #15: python3() [0x5ade7e] frame #16: python3() [0x590669] frame #17: python3() [0x590943] frame #19: python3() [0x509d48] frame #20: python3() [0x50aa7d] frame #22: python3() [0x508245] frame #24: python3() [0x635222] frame #29: __libc_start_main + 0xe7 (0x7f74310b8b97 in /lib/x86_64-linux-gnu/libc.so.6)

    would you please tell me why?

    opened by khodabakhshih 8
  • Multitask learning with uncertainty ?

    Here, it looks like uncertainty is used for multi-task learning.

    But I cannot find where these parameters are updated by any optimizer; I only found where the loss instance is created.

    Could you point out where the uncertainty is learned during training?

    opened by kakusikun 8
  • make.sh

    Thanks for sharing your code. Last week I ran the code without any problem in Google Colab. Today I tried to run make.sh and received the following error:

    x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DWITH_CUDA -I/content/drive/My Drive/Colab Notebooks/FairMOT/src/lib/models/networks/DCNv2/src -I/usr/local/lib/python3.6/dist-packages/torch/include -I/usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.6/dist-packages/torch/include/TH -I/usr/local/lib/python3.6/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.6m -c /content/drive/My Drive/Colab Notebooks/FairMOT/src/lib/models/networks/DCNv2/src/cpu/dcn_v2_cpu.cpp -o build/temp.linux-x86_64-3.6/content/drive/My Drive/Colab Notebooks/FairMOT/src/lib/models/networks/DCNv2/src/cpu/dcn_v2_cpu.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=ext -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14 /usr/local/cuda/bin/nvcc -DWITH_CUDA -I/content/drive/My Drive/Colab Notebooks/FairMOT/src/lib/models/networks/DCNv2/src -I/usr/local/lib/python3.6/dist-packages/torch/include -I/usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.6/dist-packages/torch/include/TH -I/usr/local/lib/python3.6/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.6m -c /content/drive/My Drive/Colab Notebooks/FairMOT/src/lib/models/networks/DCNv2/src/cuda/dcn_v2_im2col_cuda.cu -o build/temp.linux-x86_64-3.6/content/drive/My Drive/Colab Notebooks/FairMOT/src/lib/models/networks/DCNv2/src/cuda/dcn_v2_im2col_cuda.o -D__CUDA_NO_HALF_OPERATORS_ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_60,code=sm_60 -std=c++14 /usr/local/cuda/bin/nvcc -DWITH_CUDA -I/content/drive/My Drive/Colab Notebooks/FairMOT/src/lib/models/networks/DCNv2/src -I/usr/local/lib/python3.6/dist-packages/torch/include -I/usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.6/dist-packages/torch/include/TH -I/usr/local/lib/python3.6/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.6m -c /content/drive/My Drive/Colab Notebooks/FairMOT/src/lib/models/networks/DCNv2/src/cuda/dcn_v2_cuda.cu -o build/temp.linux-x86_64-3.6/content/drive/My Drive/Colab Notebooks/FairMOT/src/lib/models/networks/DCNv2/src/cuda/dcn_v2_cuda.o -D__CUDA_NO_HALF_OPERATORS_ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_60,code=sm_60 -std=c++14 /content/drive/My Drive/Colab Notebooks/FairMOT/src/lib/models/networks/DCNv2/src/cuda/dcn_v2_cuda.cu(86): error: identifier "THCState_getCurrentStream" is undefined

    /content/drive/My Drive/Colab Notebooks/FairMOT/src/lib/models/networks/DCNv2/src/cuda/dcn_v2_cuda.cu(182): error: identifier "THCState_getCurrentStream" is undefined

    2 errors detected in the compilation of "/tmp/tmpxft_00000192_00000000-6_dcn_v2_cuda.cpp1.ii". error: command '/usr/local/cuda/bin/nvcc' failed with exit status 1

    would you please tell me the issue?

    opened by khodabakhshih 7
  • [Logic Bug] self.removed_stracks are REMOVED NEXT FRAME

    According to the track management logic, removed_stracks, which are the tracks meeting the condition: self.frame_id - track.end_frame > self.max_time_lost, should be removed from the lost tracks.

    However, in the current code, these removed_stracks are not taken into account when performing self.lost_tracks = sub_tracks(self.lost_tracks, self.removed_tracks).

    https://github.com/ifzhang/FairMOT/blob/4aa62976bde6266cbafd0509e24c3d98a7d0899f/src/lib/tracker/multitracker.py#L363-L366

    Say it is time t now; the logic bug above means that some of the tracks that are supposed to be removed will actually be subtracted from the list of self.lost_tracks only at time t+1.

    At time t+1, if some of these removed_tracks satisfy the condition for being reactivated, they will come back online from the list of self.lost_tracks, even though they were supposed to have been removed at time t.

    opened by ZXYFrank 0
  • AssertionError: Torch not compiled with CUDA enabled

    Hi guys, can anyone help me? I have already tried to re-install both CUDA and PyTorch, with no success. Thanks!


    sh experiments/crowdhuman_dla34.sh Using tensorboardX Fix size testing. training chunk_sizes: [4, 4] The output will be saved to /media/preto/HD6TB/FairMOT/src/lib/../../exp/mot/crowdhuman_dla34 Setting up data...

    dataset summary OrderedDict([('CrowdHuman_train01', 339565.0), ('CrowedHuman_test', 99481.0)]) total # identities: 439047 start index OrderedDict([('CrowdHuman_train01', 0), ('CrowedHuman_test', 339565.0)])

    heads {'hm': 1, 'wh': 4, 'id': 128, 'reg': 2} Namespace(K=500, arch='dla_34', batch_size=8, cat_spec_wh=False, chunk_sizes=[4, 4], conf_thres=0.4, data_cfg='../src/lib/cfg/crowdhuman.json', data_dir='/home/zyf/dataset', dataset='jde', debug_dir='/media/preto/HD6TB/FairMOT/src/lib/../../exp/mot/crowdhuman_dla34/debug', dense_wh=False, det_thres=0.3, down_ratio=4, exp_dir='/media/preto/HD6TB/FairMOT/src/lib/../../exp/mot', exp_id='crowdhuman_dla34', fix_res=True, gpus=[0, 1], gpus_str='0,1', head_conv=256, heads={'hm': 1, 'wh': 4, 'id': 128, 'reg': 2}, hide_data_time=False, hm_weight=1, id_loss='ce', id_weight=1, img_size=(1088, 608), input_h=1088, input_res=1088, input_video='../videos/MOT16-03.mp4', input_w=608, keep_res=False, load_model='../models/ctdet_coco_dla_2x.pth', lr=0.0001, lr_step=[50], ltrb=True, master_batch_size=4, mean=None, metric='loss', min_box_area=100, mse_loss=False, multi_loss='uncertainty', nID=439047, nms_thres=0.4, norm_wh=False, not_cuda_benchmark=False, not_prefetch_test=False, not_reg_offset=False, num_classes=1, num_epochs=60, num_iters=-1, num_stacks=1, num_workers=8, off_weight=1, output_format='video', output_h=272, output_res=272, output_root='../demos', output_w=152, pad=31, print_iter=0, reg_loss='l1', reg_offset=True, reid_dim=128, resume=False, root_dir='/media/preto/HD6TB/FairMOT/src/lib/../..', save_all=False, save_dir='/media/preto/HD6TB/FairMOT/src/lib/../../exp/mot/crowdhuman_dla34', seed=317, std=None, task='mot', test=False, test_hie=False, test_mot15=False, test_mot16=False, test_mot17=False, test_mot20=False, track_buffer=30, trainval=False, val_hie=False, val_intervals=5, val_mot15=False, val_mot16=False, val_mot17=True, val_mot20=False, vis_thresh=0.5, wh_weight=0.1) Creating model... Starting training... Traceback (most recent call last): File "train.py", line 98, in main(opt) File "train.py", line 61, in main trainer.set_device(opt.gpus, opt.chunk_sizes, opt.device) File "/media/preto/HD6TB/FairMOT/src/lib/trains/base_trainer.py", line 34, in set_device self.model_with_loss = DataParallel( File "/home/preto/anaconda3/envs/FairMOT/lib/python3.8/site-packages/torch/nn/modules/module.py", line 927, in to return self._apply(convert) File "/home/preto/anaconda3/envs/FairMOT/lib/python3.8/site-packages/torch/nn/modules/module.py", line 579, in _apply module._apply(fn) File "/home/preto/anaconda3/envs/FairMOT/lib/python3.8/site-packages/torch/nn/modules/module.py", line 579, in _apply module._apply(fn) File "/home/preto/anaconda3/envs/FairMOT/lib/python3.8/site-packages/torch/nn/modules/module.py", line 579, in _apply module._apply(fn) [Previous line repeated 2 more times] File "/home/preto/anaconda3/envs/FairMOT/lib/python3.8/site-packages/torch/nn/modules/module.py", line 602, in _apply param_applied = fn(param) File "/home/preto/anaconda3/envs/FairMOT/lib/python3.8/site-packages/torch/nn/modules/module.py", line 925, in convert return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking) File "/home/preto/anaconda3/envs/FairMOT/lib/python3.8/site-packages/torch/cuda/init.py", line 211, in _lazy_init raise AssertionError("Torch not compiled with CUDA enabled") AssertionError: Torch not compiled with CUDA enabled

    opened by felipearrudamoura 0
  • Why can't I get the result of MOTA 61.8 on the MOT20

    I used the model mot20_fairmot.pth provided by the author and uploaded the results to the MOT20 test server. The results on the test set do not reach the accuracy given in the paper. Why is this?

    opened by dky123456 0