A tutorial on training a DarkNet YOLOv4 model for the CrowdHuman dataset

JK Jung

Last update: Nov 10, 2022

YOLOv4 CrowdHuman Tutorial

This is a tutorial demonstrating how to train a YOLOv4 people detector using Darknet and the CrowdHuman dataset.

Setup
Preparing training data
Training on a local PC
Testing the custom-trained yolov4 model
Training on Google Colab
Deploying onto Jetson Nano

Setup

If you are going to train the model on Google Colab, you could skip this section and jump straight to Training on Google Colab.

Otherwise, to run training locally, you need to have an x86_64 PC with a decent GPU. For example, I mainly test the code in this repository using a desktop PC with:

NVIDIA GeForce RTX 2080 Ti
Ubuntu 18.04.5 LTS (x86_64)
- CUDA 10.2
- cuDNN 8.0.1

In addition, you should have OpenCV (including python3 "cv2" module) installed properly on the local PC since both the data preparation code and "darknet" would require it.

Preparing training data

For training on the local PC, I use a "608x608" yolov4 model as an example. Note that I use python3 exclusively in this tutorial (python2 might not work). Follow these steps to prepare the "CrowdHuman" dataset for training the yolov4 model.

Clone this repository.

$ cd ${HOME}/project
$ git clone https://github.com/jkjung-avt/yolov4_crowdhuman

Run the "prepare_data.sh" script in the "data/" subdirectory. It downloads the "CrowdHuman" dataset, unzips the train/val image files, and generates the YOLO txt files necessary for training. You could refer to data/README.md for more information about the dataset, and to How to train (to detect your custom objects) for an explanation of YOLO txt files.
```
$ cd ${HOME}/project/yolov4_crowdhuman/data
$ ./prepare_data.sh 608x608
```
This step could take quite a while, depending on your internet speed. When it is done, all image files and ".txt" files for training would be in the "data/crowdhuman-608x608/" subdirectory. (If interested, you could do python3 verify_txts.py 608x608 to verify the generated txt files.)

This tutorial is for training the yolov4 model to detect 2 classes of object: "head" (0) and "person" (1), where the "person" class corresponds to "full body" (including occluded body portions) in the original "CrowdHuman" annotations. Take a look at "data/crowdhuman-608x608.data", "data/crowdhuman.names", and "data/crowdhuman-608x608/" to gain a better understanding of the data files that have been generated/prepared for the training.
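To make the txt format more concrete, here is a small sketch that converts one YOLO txt line back into a pixel-space box. It assumes the usual Darknet layout of "class cx cy w h" with all four coordinates normalized to the image size. The sample line and the 800x600 image size are invented for illustration, and `parse_yolo_line` is a hypothetical helper, not a function from this repository.

```python
CLASS_NAMES = ["head", "person"]  # class ids 0 and 1, as used in this tutorial


def parse_yolo_line(line, img_w, img_h):
    """Convert 'cls cx cy w h' (normalized) to (name, x, y, w, h) in pixels.

    (x, y) is the top-left corner of the box.
    """
    cls, cx, cy, nw, nh = line.split()
    w = float(nw) * img_w
    h = float(nh) * img_h
    x = float(cx) * img_w - w / 2.0
    y = float(cy) * img_h - h / 2.0
    return CLASS_NAMES[int(cls)], x, y, w, h


# Hypothetical annotation line for an 800x600 image.
print(parse_yolo_line("1 0.500000 0.500000 0.250000 0.500000", 800, 600))
```

Because the stored values are fractions of the image size, the same txt file describes the same boxes no matter how the image is resized later.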

Training on a local PC

Continuing from steps in the previous section, you'd be using the "darknet" framework to train the yolov4 model.

Download and build "darknet" code. (NOTE to myself: Consider making "darknet" as a submodule and automate the build process?)

$ cd ${HOME}/project/yolov4_crowdhuman
$ git clone https://github.com/AlexeyAB/darknet.git
$ cd darknet
$ vim Makefile  # edit Makefile with your preferred editor (might not be vim)

Modify the first few lines of the "Makefile" as follows. Please refer to How to compile on Linux (using make) for more information about these settings. Note that, in the example below, CUDA compute "75" is for RTX 2080 Ti and "61" is for GTX 1080. You might need to modify those based on the kind of GPU you are using.

GPU=1
CUDNN=1
CUDNN_HALF=1
OPENCV=1
AVX=1
OPENMP=1
LIBSO=1
ZED_CAMERA=0
ZED_CAMERA_v2_8=0

......

USE_CPP=0
DEBUG=0

ARCH= -gencode arch=compute_61,code=[sm_61,compute_61] \
      -gencode arch=compute_75,code=[sm_75,compute_75]

......
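If your GPU is neither of the two listed, the "ARCH" lines follow a regular pattern that can be generated from the compute capability. The sketch below only reproduces the text format used in the example above. The function names are hypothetical, and you would still need to look up the correct compute capability for your own GPU.

```python
def gencode_flag(cc):
    """Build one -gencode option for a compute capability such as '75'."""
    return "-gencode arch=compute_{0},code=[sm_{0},compute_{0}]".format(cc)


def arch_line(capabilities):
    """Join several -gencode options into a Makefile 'ARCH=' assignment."""
    flags = [gencode_flag(cc) for cc in capabilities]
    return "ARCH= " + " \\\n      ".join(flags)


# '61' (GTX 1080) and '75' (RTX 2080 Ti) are the two values used in this tutorial.
print(arch_line(["61", "75"]))
```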

Then do a make to build "darknet".

$ make

When it is done, you could (optionally) test the "darknet" executable as follows.

### download pre-trained yolov4 coco weights and test with the dog image
$ wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.weights \
       -q --show-progress --no-clobber
$ ./darknet detector test cfg/coco.data cfg/yolov4-416.cfg yolov4.weights \
                          data/dog.jpg

Then copy over all files needed for training and download the pre-trained weights ("yolov4.conv.137").
```
$ cd ${HOME}/project/yolov4_crowdhuman
$ ./prepare_training.sh 608x608
```
Train the "yolov4-crowdhuman-608x608" model. Please refer to How to train with multi-GPU for how to fine-tune your training process. For example, you could specify -gpus 0,1,2,3 in order to use multiple GPUs to speed up training.
```
$ cd ${HOME}/project/yolov4_crowdhuman/darknet
$ ./darknet detector train data/crowdhuman-608x608.data \
                           cfg/yolov4-crowdhuman-608x608.cfg \
                           yolov4.conv.137 -map -gpus 0
```
When the model is being trained, you could monitor its progress on the loss/mAP chart (since the -map option is used). Alternatively, if you are training on a remote PC via ssh, add the -dont_show -mjpeg_port 8090 option so that you could monitor the loss/mAP chart on a web browser (http://{IP address}:8090/).
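If you launch training from a script, the two variants of the command (local display versus remote monitoring) could be assembled as in the sketch below. It only uses the options already mentioned in this section. The `train_command` helper is hypothetical, and the file naming for sizes other than "608x608" is an assumption based on the naming pattern seen here.

```python
def train_command(size="608x608", gpus="0", remote=False, port=8090):
    """Return the darknet training command as an argument list."""
    name = "yolov4-crowdhuman-" + size
    cmd = ["./darknet", "detector", "train",
           "data/crowdhuman-%s.data" % size,
           "cfg/%s.cfg" % name,
           "yolov4.conv.137", "-map", "-gpus", gpus]
    if remote:
        # No local window; serve the loss/mAP chart over HTTP instead.
        cmd += ["-dont_show", "-mjpeg_port", str(port)]
    return cmd


print(" ".join(train_command()))
print(" ".join(train_command(remote=True)))
```

The list form can be passed directly to `subprocess.run(...)` from within the "darknet" directory, which avoids shell quoting problems.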

As a reference, training this "yolov4-crowdhuman-608x608" model with my RTX 2080 Ti GPU takes 17~18 hours.

As another example, training the "yolov4-tiny-crowdhuman-608x608" model on the RTX 2080 Ti GPU takes less than 3 hours.

And training the "yolov4-tiny-3l-crowdhuman-416x416" model on the RTX 2080 Ti GPU takes less than 2 hours.

Testing the custom-trained yolov4 model

After you have trained the "yolov4-crowdhuman-608x608" model locally, you could test the "best" custom-trained model like this.

$ cd ${HOME}/project/yolov4_crowdhuman/darknet
$ ./darknet detector test data/crowdhuman-608x608.data \
                          cfg/yolov4-crowdhuman-608x608.cfg \
                          backup/yolov4-crowdhuman-608x608_best.weights \
                          data/crowdhuman-608x608/273275,4e9d1000623d182f.jpg \
                          -gpus 0

In addition, you could verify mAP of the "best" model like this.

$ ./darknet detector map data/crowdhuman-608x608.data \
                         cfg/yolov4-crowdhuman-608x608.cfg \
                         backup/yolov4-crowdhuman-608x608_best.weights \
                         -gpus 0

For example, I got mAP@0.50 = 0.814523 when I tested my own custom-trained "yolov4-crowdhuman-608x608" model.

 detections_count = 614280, unique_truth_count = 183365
class_id = 0, name = head, ap = 82.60%           (TP = 65119, FP = 14590)
class_id = 1, name = person, ap = 80.30%         (TP = 72055, FP = 11766)

 for conf_thresh = 0.25, precision = 0.84, recall = 0.75, F1-score = 0.79
 for conf_thresh = 0.25, TP = 137174, FP = 26356, FN = 46191, average IoU = 66.92 %

 IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
 mean average precision (mAP@0.50) = 0.814523, or 81.45 %
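The summary numbers in this log can be cross-checked from the per-class counts. The sketch below uses the standard definitions of precision, recall and F1, and assumes that the reported mAP is the plain mean of the two per-class AP values. The function name is made up for illustration.

```python
def detection_metrics(tp, fp, fn):
    """Return (precision, recall, F1) from detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# (TP, FP) per class, copied from the log above.
per_class = {"head": (65119, 14590), "person": (72055, 11766)}
unique_truth_count = 183365

tp = sum(counts[0] for counts in per_class.values())
fp = sum(counts[1] for counts in per_class.values())
fn = unique_truth_count - tp  # ground-truth boxes that were not matched

precision, recall, f1 = detection_metrics(tp, fp, fn)
print("TP = %d, FP = %d, FN = %d" % (tp, fp, fn))
print("precision = %.2f, recall = %.2f, F1-score = %.2f" % (precision, recall, f1))

# Mean of the per-class AP values (in percent) reported in the log.
mean_ap = (82.60 + 80.30) / 2
print("mAP = %.2f %%" % mean_ap)
```

The totals come out as TP = 137174, FP = 26356 and FN = 46191, which agree with the "conf_thresh = 0.25" lines in the log.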

Training on Google Colab

For training on Google Colab, I use a "416x416" yolov4 model as an example. I have put all data processing and training commands into an IPython Notebook, so training the "yolov4-crowdhuman-416x416" model on Google Colab is as simple as: (1) opening the Notebook on Google Colab, (2) mounting your Google Drive, and (3) running all cells in the Notebook.

A few words of caution before you begin running the Notebook on Google Colab:

Google Colab's GPU runtime is free of charge, but it is neither unlimited nor guaranteed. Even though the Google Colab FAQ states that "virtual machines have maximum lifetimes that can be as much as 12 hours", I often saw my Colab GPU sessions getting disconnected after 7~8 hours of non-interactive use.
If you connect to GPU instances on Google Colab repeatedly and frequently, you could be temporarily locked out (not able to connect to GPU instances for a couple of days). So I'd suggest that you connect to a GPU runtime sparingly and only when needed, and that you manually terminate the GPU sessions as soon as you no longer need them.
It is strongly advised that you read and mind Google Colab's Resource Limits.

Due to the 7~8 hour limit on GPU runtime mentioned above, you won't be able to train a large yolov4 model in a single session. That's why I chose the "416x416" model for this part of the tutorial. Here are the steps:

Open yolov4_crowdhuman.ipynb. This IPython Notebook is on my personal Google Drive. You could review it, but you could not modify it.
Make a copy of "yolov4_crowdhuman.ipynb" on your own Google Drive, by clicking "File -> Save a copy in Drive" on the menu. You should use your own saved copy of the Notebook for the rest of the steps.
Follow the instructions in the Notebook to train the "yolov4-crowdhuman-416x416" model, i.e.
- make sure the IPython Notebook has successfully connected to a GPU runtime,
- mount your Google Drive (for saving training log and weights),
- run all cells ("Runtime -> Run all" or "Runtime -> Restart and run all").
You should have a good chance of finishing training the "yolov4-crowdhuman-416x416" model before the Colab session gets automatically disconnected (expired).

Instead of opening the Colab Notebook on my Google Drive, you could also go to your own Colab account and use "File -> Upload notebook" to upload yolov4_crowdhuman.ipynb directly.

Refer to my Custom YOLOv4 Model on Google Colab post for additional information about running the IPython Notebook.

Deploying onto Jetson Nano

To deploy the trained "yolov4-crowdhuman-416x416" model onto Jetson Nano, I'd use my jkjung-avt/tensorrt_demos code to build/deploy it as a TensorRT engine. Here are the detailed steps:

On the Jetson Nano, check out my jkjung-avt/tensorrt_demos code and make sure you are able to run the standard "yolov4-416" TensorRT engine without problem. Please refer to Demo #5: YOLOv4 for details.

$ cd ${HOME}/project
$ git clone https://github.com/jkjung-avt/tensorrt_demos.git
### Detailed steps omitted: install pycuda, download yolov4-416 model, yolo_to_onnx, onnx_to_tensorrt
### ......
$ cd ${HOME}/project/tensorrt_demos
$ python3 trt_yolo.py --image ${HOME}/Pictures/dog.jpg -m yolov4-416

Download the "yolov4-crowdhuman-416x416" model. More specifically, get "yolov4-crowdhuman-416x416.cfg" from this repository and download "yolov4-crowdhuman-416x416_best.weights" file from your Google Drive. Rename the .weights file so that it matches the .cfg file.
```
$ cd ${HOME}/project/tensorrt_demos/yolo
$ wget https://raw.githubusercontent.com/jkjung-avt/yolov4_crowdhuman/master/cfg/yolov4-crowdhuman-416x416.cfg
$ cp ${HOME}/Downloads/yolov4-crowdhuman-416x416_best.weights yolov4-crowdhuman-416x416.weights
```
Then build the TensorRT (FP16) engine. Note that the "-c 2" command-line option specifies that the model detects 2 classes of objects.
```
$ python3 yolo_to_onnx.py -c 2 -m yolov4-crowdhuman-416x416
$ python3 onnx_to_tensorrt.py -c 2 -m yolov4-crowdhuman-416x416
```
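The class count passed with "-c" has to agree with the model itself. In Darknet YOLO cfg files, the convolutional layer feeding each detection head normally has (classes + 5) x 3 filters, assuming 3 anchors per head (the usual setting, though your cfg could differ). The sketch below just works out that arithmetic; `yolo_output_filters` is a hypothetical helper.

```python
def yolo_output_filters(num_classes, anchors_per_head=3):
    """Channels of the conv layer right before a YOLO detection head.

    Each anchor predicts 4 box values + 1 objectness score + one score per class.
    """
    return (num_classes + 5) * anchors_per_head


print(yolo_output_filters(2))   # 2 classes: head and person
print(yolo_output_filters(80))  # 80 classes, as in COCO
```

For 2 classes this gives 21, which is the size of the output convolution biases in the ONNX graph dump quoted in the comments further down this page.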
Test the TensorRT engine. For example, I tested it with the "Avengers: Infinity War" movie trailer. (You should download and test with your own images or videos.)
```
$ cd ${HOME}/project/tensorrt_demos
$ python3 trt_yolo.py --video ${HOME}/Videos/Infinity_War.mp4 \
                      -c 2 -m yolov4-crowdhuman-416x416
```

Contributions

@philipp-schmidt: yolov4-tiny models and training charts

Comments

Can you share your weights files?

Hi jkjung,

I have also trained my custom yolov4 on this dataset, but could not achieve good performance. Could you share your weights files for 608x608? I would like to continue my training from your work. Thanks a lot in advance!

opened by Tsings04 7
yolov4-tiny 416x416 trained weights?

Hello, good job with the yolov4 instructions and sharing weights

Do you have yolov4 tiny trained weights as well? If so, could you share it and the corresponding cfg?

opened by kadirbeytorun 6
Data Preparation

Hi, thanks for your repo. This is very helpful. However, I still don't understand the purpose of k-means in your data preparation. Does it affect the quality of the dataset?

opened by mnnurilmi 3
meaning of 608 in "prepare_data.sh" execution

When I download the data, do I specify 608 only to build a directory name that makes it easier to use later, regardless of the size of the data? So the data doesn't change according to the argument, does it? Is it automatically resized when I train the model anyway? Thank you

opened by ingbeeedd 3

error while convert to onnx

Hi, I've trained the yolov4-tiny-crowdhuman-416x416 model and am now trying to convert it to ONNX on a Jetson Nano, but I get an error:

Parsing DarkNet cfg file...
Building ONNX graph...
graph yolov4-tiny-crowdhuman-416x416 (
  %000_net[FLOAT, 1x3x416x416]
) optional inputs with matching initializers (
  %001_convolutional_bn_scale[FLOAT, 32]
  %001_convolutional_bn_bias[FLOAT, 32]
  %001_convolutional_bn_mean[FLOAT, 32]
  %001_convolutional_bn_var[FLOAT, 32]
  %001_convolutional_conv_weights[FLOAT, 32x3x3x3]
  %002_convolutional_bn_scale[FLOAT, 64]
  %002_convolutional_bn_bias[FLOAT, 64]
  %002_convolutional_bn_mean[FLOAT, 64]
  %002_convolutional_bn_var[FLOAT, 64]
  %002_convolutional_conv_weights[FLOAT, 64x32x3x3]
  %003_convolutional_bn_scale[FLOAT, 64]
  %003_convolutional_bn_bias[FLOAT, 64]
  %003_convolutional_bn_mean[FLOAT, 64]
  %003_convolutional_bn_var[FLOAT, 64]
  %003_convolutional_conv_weights[FLOAT, 64x64x3x3]
  %005_convolutional_bn_scale[FLOAT, 32]
  %005_convolutional_bn_bias[FLOAT, 32]
  %005_convolutional_bn_mean[FLOAT, 32]
  %005_convolutional_bn_var[FLOAT, 32]
  %005_convolutional_conv_weights[FLOAT, 32x32x3x3]
  %006_convolutional_bn_scale[FLOAT, 32]
  %006_convolutional_bn_bias[FLOAT, 32]
  %006_convolutional_bn_mean[FLOAT, 32]
  %006_convolutional_bn_var[FLOAT, 32]
  %006_convolutional_conv_weights[FLOAT, 32x32x3x3]
  %008_convolutional_bn_scale[FLOAT, 64]
  %008_convolutional_bn_bias[FLOAT, 64]
  %008_convolutional_bn_mean[FLOAT, 64]
  %008_convolutional_bn_var[FLOAT, 64]
  %008_convolutional_conv_weights[FLOAT, 64x64x1x1]
  %011_convolutional_bn_scale[FLOAT, 128]
  %011_convolutional_bn_bias[FLOAT, 128]
  %011_convolutional_bn_mean[FLOAT, 128]
  %011_convolutional_bn_var[FLOAT, 128]
  %011_convolutional_conv_weights[FLOAT, 128x128x3x3]
  %013_convolutional_bn_scale[FLOAT, 64]
  %013_convolutional_bn_bias[FLOAT, 64]
  %013_convolutional_bn_mean[FLOAT, 64]
  %013_convolutional_bn_var[FLOAT, 64]
  %013_convolutional_conv_weights[FLOAT, 64x64x3x3]
  %014_convolutional_bn_scale[FLOAT, 64]
  %014_convolutional_bn_bias[FLOAT, 64]
  %014_convolutional_bn_mean[FLOAT, 64]
  %014_convolutional_bn_var[FLOAT, 64]
  %014_convolutional_conv_weights[FLOAT, 64x64x3x3]
  %016_convolutional_bn_scale[FLOAT, 128]
  %016_convolutional_bn_bias[FLOAT, 128]
  %016_convolutional_bn_mean[FLOAT, 128]
  %016_convolutional_bn_var[FLOAT, 128]
  %016_convolutional_conv_weights[FLOAT, 128x128x1x1]
  %019_convolutional_bn_scale[FLOAT, 256]
  %019_convolutional_bn_bias[FLOAT, 256]
  %019_convolutional_bn_mean[FLOAT, 256]
  %019_convolutional_bn_var[FLOAT, 256]
  %019_convolutional_conv_weights[FLOAT, 256x256x3x3]
  %021_convolutional_bn_scale[FLOAT, 128]
  %021_convolutional_bn_bias[FLOAT, 128]
  %021_convolutional_bn_mean[FLOAT, 128]
  %021_convolutional_bn_var[FLOAT, 128]
  %021_convolutional_conv_weights[FLOAT, 128x128x3x3]
  %022_convolutional_bn_scale[FLOAT, 128]
  %022_convolutional_bn_bias[FLOAT, 128]
  %022_convolutional_bn_mean[FLOAT, 128]
  %022_convolutional_bn_var[FLOAT, 128]
  %022_convolutional_conv_weights[FLOAT, 128x128x3x3]
  %024_convolutional_bn_scale[FLOAT, 256]
  %024_convolutional_bn_bias[FLOAT, 256]
  %024_convolutional_bn_mean[FLOAT, 256]
  %024_convolutional_bn_var[FLOAT, 256]
  %024_convolutional_conv_weights[FLOAT, 256x256x1x1]
  %027_convolutional_bn_scale[FLOAT, 512]
  %027_convolutional_bn_bias[FLOAT, 512]
  %027_convolutional_bn_mean[FLOAT, 512]
  %027_convolutional_bn_var[FLOAT, 512]
  %027_convolutional_conv_weights[FLOAT, 512x512x3x3]
  %028_convolutional_bn_scale[FLOAT, 256]
  %028_convolutional_bn_bias[FLOAT, 256]
  %028_convolutional_bn_mean[FLOAT, 256]
  %028_convolutional_bn_var[FLOAT, 256]
  %028_convolutional_conv_weights[FLOAT, 256x512x1x1]
  %029_convolutional_bn_scale[FLOAT, 512]
  %029_convolutional_bn_bias[FLOAT, 512]
  %029_convolutional_bn_mean[FLOAT, 512]
  %029_convolutional_bn_var[FLOAT, 512]
  %029_convolutional_conv_weights[FLOAT, 512x256x3x3]
  %030_convolutional_conv_bias[FLOAT, 21]
  %030_convolutional_conv_weights[FLOAT, 21x512x1x1]
  %033_convolutional_bn_scale[FLOAT, 128]
  %033_convolutional_bn_bias[FLOAT, 128]
  %033_convolutional_bn_mean[FLOAT, 128]
  %033_convolutional_bn_var[FLOAT, 128]
  %033_convolutional_conv_weights[FLOAT, 128x256x1x1]
  %034_upsample_scale[FLOAT, 4]
  %036_convolutional_bn_scale[FLOAT, 256]
  %036_convolutional_bn_bias[FLOAT, 256]
  %036_convolutional_bn_mean[FLOAT, 256]
  %036_convolutional_bn_var[FLOAT, 256]
  %036_convolutional_conv_weights[FLOAT, 256x384x3x3]
  %037_convolutional_conv_bias[FLOAT, 21]
  %037_convolutional_conv_weights[FLOAT, 21x256x1x1]
) {
  %001_convolutional = Conv[auto_pad = 'SAME_LOWER', dilations = [1, 1], kernel_shape = [3, 3], strides = [2, 2]](%000_net, %001_convolutional_conv_weights)
  %001_convolutional_bn = BatchNormalization[epsilon = 9.99999974737875e-06, momentum = 0.990000009536743](%001_convolutional, %001_convolutional_bn_scale, %001_convolutional_bn_bias, %001_convolutional_bn_mean, %001_convolutional_bn_var)
  %001_convolutional_lrelu = LeakyRelu[alpha = 0.100000001490116](%001_convolutional_bn)
  %002_convolutional = Conv[auto_pad = 'SAME_LOWER', dilations = [1, 1], kernel_shape = [3, 3], strides = [2, 2]](%001_convolutional_lrelu, %002_convolutional_conv_weights)
  %002_convolutional_bn = BatchNormalization[epsilon = 9.99999974737875e-06, momentum = 0.990000009536743](%002_convolutional, %002_convolutional_bn_scale, %002_convolutional_bn_bias, %002_convolutional_bn_mean, %002_convolutional_bn_var)
  %002_convolutional_lrelu = LeakyRelu[alpha = 0.100000001490116](%002_convolutional_bn)
  %003_convolutional = Conv[auto_pad = 'SAME_LOWER', dilations = [1, 1], kernel_shape = [3, 3], strides = [1, 1]](%002_convolutional_lrelu, %003_convolutional_conv_weights)
  %003_convolutional_bn = BatchNormalization[epsilon = 9.99999974737875e-06, momentum = 0.990000009536743](%003_convolutional, %003_convolutional_bn_scale, %003_convolutional_bn_bias, %003_convolutional_bn_mean, %003_convolutional_bn_var)
  %003_convolutional_lrelu = LeakyRelu[alpha = 0.100000001490116](%003_convolutional_bn)
  %004_route_dummy0, %004_route = Split[axis = 1, split = [32, 32]](%003_convolutional_lrelu)
  %005_convolutional = Conv[auto_pad = 'SAME_LOWER', dilations = [1, 1], kernel_shape = [3, 3], strides = [1, 1]](%004_route, %005_convolutional_conv_weights)
  %005_convolutional_bn = BatchNormalization[epsilon = 9.99999974737875e-06, momentum = 0.990000009536743](%005_convolutional, %005_convolutional_bn_scale, %005_convolutional_bn_bias, %005_convolutional_bn_mean, %005_convolutional_bn_var)
  %005_convolutional_lrelu = LeakyRelu[alpha = 0.100000001490116](%005_convolutional_bn)
  %006_convolutional = Conv[auto_pad = 'SAME_LOWER', dilations = [1, 1], kernel_shape = [3, 3], strides = [1, 1]](%005_convolutional_lrelu, %006_convolutional_conv_weights)
  %006_convolutional_bn = BatchNormalization[epsilon = 9.99999974737875e-06, momentum = 0.990000009536743](%006_convolutional, %006_convolutional_bn_scale, %006_convolutional_bn_bias, %006_convolutional_bn_mean, %006_convolutional_bn_var)
  %006_convolutional_lrelu = LeakyRelu[alpha = 0.100000001490116](%006_convolutional_bn)
  %007_route = Concat[axis = 1](%006_convolutional_lrelu, %005_convolutional_lrelu)
  %008_convolutional = Conv[auto_pad = 'SAME_LOWER', dilations = [1, 1], kernel_shape = [1, 1], strides = [1, 1]](%007_route, %008_convolutional_conv_weights)
  %008_convolutional_bn = BatchNormalization[epsilon = 9.99999974737875e-06, momentum = 0.990000009536743](%008_convolutional, %008_convolutional_bn_scale, %008_convolutional_bn_bias, %008_convolutional_bn_mean, %008_convolutional_bn_var)
  %008_convolutional_lrelu = LeakyRelu[alpha = 0.100000001490116](%008_convolutional_bn)
  %009_route = Concat[axis = 1](%003_convolutional_lrelu, %008_convolutional_lrelu)
  %010_maxpool = MaxPool[auto_pad = 'SAME_UPPER', kernel_shape = [2, 2], strides = [2, 2]](%009_route)
  %011_convolutional = Conv[auto_pad = 'SAME_LOWER', dilations = [1, 1], kernel_shape = [3, 3], strides = [1, 1]](%010_maxpool, %011_convolutional_conv_weights)
  %011_convolutional_bn = BatchNormalization[epsilon = 9.99999974737875e-06, momentum = 0.990000009536743](%011_convolutional, %011_convolutional_bn_scale, %011_convolutional_bn_bias, %011_convolutional_bn_mean, %011_convolutional_bn_var)
  %011_convolutional_lrelu = LeakyRelu[alpha = 0.100000001490116](%011_convolutional_bn)
  %012_route_dummy0, %012_route = Split[axis = 1, split = [64, 64]](%011_convolutional_lrelu)
  %013_convolutional = Conv[auto_pad = 'SAME_LOWER', dilations = [1, 1], kernel_shape = [3, 3], strides = [1, 1]](%012_route, %013_convolutional_conv_weights)
  %013_convolutional_bn = BatchNormalization[epsilon = 9.99999974737875e-06, momentum = 0.990000009536743](%013_convolutional, %013_convolutional_bn_scale, %013_convolutional_bn_bias, %013_convolutional_bn_mean, %013_convolutional_bn_var)
  %013_convolutional_lrelu = LeakyRelu[alpha = 0.100000001490116](%013_convolutional_bn)
  %014_convolutional = Conv[auto_pad = 'SAME_LOWER', dilations = [1, 1], kernel_shape = [3, 3], strides = [1, 1]](%013_convolutional_lrelu, %014_convolutional_conv_weights)
  %014_convolutional_bn = BatchNormalization[epsilon = 9.99999974737875e-06, momentum = 0.990000009536743](%014_convolutional, %014_convolutional_bn_scale, %014_convolutional_bn_bias, %014_convolutional_bn_mean, %014_convolutional_bn_var)
  %014_convolutional_lrelu = LeakyRelu[alpha = 0.100000001490116](%014_convolutional_bn)
  %015_route = Concat[axis = 1](%014_convolutional_lrelu, %013_convolutional_lrelu)
  %016_convolutional = Conv[auto_pad = 'SAME_LOWER', dilations = [1, 1], kernel_shape = [1, 1], strides = [1, 1]](%015_route, %016_convolutional_conv_weights)
  %016_convolutional_bn = BatchNormalization[epsilon = 9.99999974737875e-06, momentum = 0.990000009536743](%016_convolutional, %016_convolutional_bn_scale, %016_convolutional_bn_bias, %016_convolutional_bn_mean, %016_convolutional_bn_var)
  %016_convolutional_lrelu = LeakyRelu[alpha = 0.100000001490116](%016_convolutional_bn)
  %017_route = Concat[axis = 1](%011_convolutional_lrelu, %016_convolutional_lrelu)
  %018_maxpool = MaxPool[auto_pad = 'SAME_UPPER', kernel_shape = [2, 2], strides = [2, 2]](%017_route)
  %019_convolutional = Conv[auto_pad = 'SAME_LOWER', dilations = [1, 1], kernel_shape = [3, 3], strides = [1, 1]](%018_maxpool, %019_convolutional_conv_weights)
  %019_convolutional_bn = BatchNormalization[epsilon = 9.99999974737875e-06, momentum = 0.990000009536743](%019_convolutional, %019_convolutional_bn_scale, %019_convolutional_bn_bias, %019_convolutional_bn_mean, %019_convolutional_bn_var)
  %019_convolutional_lrelu = LeakyRelu[alpha = 0.100000001490116](%019_convolutional_bn)
  %020_route_dummy0, %020_route = Split[axis = 1, split = [128, 128]](%019_convolutional_lrelu)
  %021_convolutional = Conv[auto_pad = 'SAME_LOWER', dilations = [1, 1], kernel_shape = [3, 3], strides = [1, 1]](%020_route, %021_convolutional_conv_weights)
  %021_convolutional_bn = BatchNormalization[epsilon = 9.99999974737875e-06, momentum = 0.990000009536743](%021_convolutional, %021_convolutional_bn_scale, %021_convolutional_bn_bias, %021_convolutional_bn_mean, %021_convolutional_bn_var)
  %021_convolutional_lrelu = LeakyRelu[alpha = 0.100000001490116](%021_convolutional_bn)
  %022_convolutional = Conv[auto_pad = 'SAME_LOWER', dilations = [1, 1], kernel_shape = [3, 3], strides = [1, 1]](%021_convolutional_lrelu, %022_convolutional_conv_weights)
  %022_convolutional_bn = BatchNormalization[epsilon = 9.99999974737875e-06, momentum = 0.990000009536743](%022_convolutional, %022_convolutional_bn_scale, %022_convolutional_bn_bias, %022_convolutional_bn_mean, %022_convolutional_bn_var)
  %022_convolutional_lrelu = LeakyRelu[alpha = 0.100000001490116](%022_convolutional_bn)
  %023_route = Concat[axis = 1](%022_convolutional_lrelu, %021_convolutional_lrelu)
  %024_convolutional = Conv[auto_pad = 'SAME_LOWER', dilations = [1, 1], kernel_shape = [1, 1], strides = [1, 1]](%023_route, %024_convolutional_conv_weights)
  %024_convolutional_bn = BatchNormalization[epsilon = 9.99999974737875e-06, momentum = 0.990000009536743](%024_convolutional, %024_convolutional_bn_scale, %024_convolutional_bn_bias, %024_convolutional_bn_mean, %024_convolutional_bn_var)
  %024_convolutional_lrelu = LeakyRelu[alpha = 0.100000001490116](%024_convolutional_bn)
  %025_route = Concat[axis = 1](%019_convolutional_lrelu, %024_convolutional_lrelu)
  %026_maxpool = MaxPool[auto_pad = 'SAME_UPPER', kernel_shape = [2, 2], strides = [2, 2]](%025_route)
  %027_convolutional = Conv[auto_pad = 'SAME_LOWER', dilations = [1, 1], kernel_shape = [3, 3], strides = [1, 1]](%026_maxpool, %027_convolutional_conv_weights)
  %027_convolutional_bn = BatchNormalization[epsilon = 9.99999974737875e-06, momentum = 0.990000009536743](%027_convolutional, %027_convolutional_bn_scale, %027_convolutional_bn_bias, %027_convolutional_bn_mean, %027_convolutional_bn_var)
  %027_convolutional_lrelu = LeakyRelu[alpha = 0.100000001490116](%027_convolutional_bn)
  %028_convolutional = Conv[auto_pad = 'SAME_LOWER', dilations = [1, 1], kernel_shape = [1, 1], strides = [1, 1]](%027_convolutional_lrelu, %028_convolutional_conv_weights)
  %028_convolutional_bn = BatchNormalization[epsilon = 9.99999974737875e-06, momentum = 0.990000009536743](%028_convolutional, %028_convolutional_bn_scale, %028_convolutional_bn_bias, %028_convolutional_bn_mean, %028_convolutional_bn_var)
  %028_convolutional_lrelu = LeakyRelu[alpha = 0.100000001490116](%028_convolutional_bn)
  %029_convolutional = Conv[auto_pad = 'SAME_LOWER', dilations = [1, 1], kernel_shape = [3, 3], strides = [1, 1]](%028_convolutional_lrelu, %029_convolutional_conv_weights)
  %029_convolutional_bn = BatchNormalization[epsilon = 9.99999974737875e-06, momentum = 0.990000009536743](%029_convolutional, %029_convolutional_bn_scale, %029_convolutional_bn_bias, %029_convolutional_bn_mean, %029_convolutional_bn_var)
  %029_convolutional_lrelu = LeakyRelu[alpha = 0.100000001490116](%029_convolutional_bn)
  %030_convolutional = Conv[auto_pad = 'SAME_LOWER', dilations = [1, 1], kernel_shape = [1, 1], strides = [1, 1]](%029_convolutional_lrelu, %030_convolutional_conv_weights, %030_convolutional_conv_bias)
  %033_convolutional = Conv[auto_pad = 'SAME_LOWER', dilations = [1, 1], kernel_shape = [1, 1], strides = [1, 1]](%028_convolutional_lrelu, %033_convolutional_conv_weights)
  %033_convolutional_bn = BatchNormalization[epsilon = 9.99999974737875e-06, momentum = 0.990000009536743](%033_convolutional, %033_convolutional_bn_scale, %033_convolutional_bn_bias, %033_convolutional_bn_mean, %033_convolutional_bn_var)
  %033_convolutional_lrelu = LeakyRelu[alpha = 0.100000001490116](%033_convolutional_bn)
  %034_upsample = Upsample[mode = 'nearest'](%033_convolutional_lrelu, %034_upsample_scale)
  %035_route = Concat[axis = 1](%034_upsample, %024_convolutional_lrelu)
  %036_convolutional = Conv[auto_pad = 'SAME_LOWER', dilations = [1, 1], kernel_shape = [3, 3], strides = [1, 1]](%035_route, %036_convolutional_conv_weights)
  %036_convolutional_bn = BatchNormalization[epsilon = 9.99999974737875e-06, momentum = 0.990000009536743](%036_convolutional, %036_convolutional_bn_scale, %036_convolutional_bn_bias, %036_convolutional_bn_mean, %036_convolutional_bn_var)
  %036_convolutional_lrelu = LeakyRelu[alpha = 0.100000001490116](%036_convolutional_bn)
  %037_convolutional = Conv[auto_pad = 'SAME_LOWER', dilations = [1, 1], kernel_shape = [1, 1], strides = [1, 1]](%036_convolutional_lrelu, %037_convolutional_conv_weights, %037_convolutional_conv_bias)
  return %030_convolutional, %037_convolutional
}
Checking ONNX model...
Traceback (most recent call last):
  File "yolo_to_onnx.py", line 955, in <module>
    main()
  File "yolo_to_onnx.py", line 945, in main
    onnx.checker.check_model(yolo_model_def)
  File "/home/fm/.local/lib/python3.6/site-packages/onnx/checker.py", line 102, in check_model
    C.check_model(protobuf_string)
onnx.onnx_cpp2py_export.checker.ValidationError: Unrecognized attribute: split for operator Split

==> Context: Bad node spec: input: "003_convolutional_lrelu" output: "004_route_dummy0" output: "004_route" name: "004_route" op_type: "Split" attribute { name: "axis" i: 1 type: INT } attribute { name: "split" ints: 32 ints: 32 type: INTS }

The only thing that I've changed is subdivisions=32 in darknet/cfg/yolov4-tiny-crowdhuman-416x416.cfg.

opened by sawk1 3

Simplify gen_txt.py

I believe gen_txt.py is doing too much work. It should not be necessary to process the dataset for different input resolutions of the network, as the yolo input coordinates are basically UV coordinates and are relative to input image dimensions. (Unless you do letterbox rescaling I guess...)

https://github.com/AlexeyAB/Yolo_mark/issues/60#issuecomment-401854885

Current implementation:

def txt_line(cls, bbox, img_w, img_h):
    """Generate 1 line in the txt file."""
    assert INPUT_WIDTH > 0 and INPUT_HEIGHT > 0
    x, y, w, h = bbox
    x = max(int(x), 0)
    y = max(int(y), 0)
    w = min(int(w), img_w - x)
    h = min(int(h), img_h - y)
    w_rescaled = float(w) * INPUT_WIDTH  / img_w
    h_rescaled = float(h) * INPUT_HEIGHT / img_h
    if w_rescaled < MIN_W or h_rescaled < MIN_H:
        return ''
    else:
        cx = (x + w / 2.) / img_w
        cy = (y + h / 2.) / img_h
        nw = float(w) / img_w
        nh = float(h) / img_h
        return '%d %.6f %.6f %.6f %.6f\n' % (cls, cx, cy, nw, nh)

Implementation from https://github.com/theAIGuysCode/OIDv4_ToolKit:

# function that turns XMin, YMin, XMax, YMax coordinates to normalized yolo format
def convert(filename_str, coords):
    os.chdir("..")
    image = cv2.imread(filename_str + ".jpg")
    coords[2] -= coords[0]
    coords[3] -= coords[1]
    x_diff = int(coords[2]/2)
    y_diff = int(coords[3]/2)
    coords[0] = coords[0]+x_diff
    coords[1] = coords[1]+y_diff
    coords[0] /= int(image.shape[1])
    coords[1] /= int(image.shape[0])
    coords[2] /= int(image.shape[1])
    coords[3] /= int(image.shape[0])
    os.chdir("Label")
    return coords

Note the lack of INPUT_WIDTH and INPUT_HEIGHT.
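To illustrate the point, the sketch below separates the two things the current `txt_line()` does: computing normalized coordinates and filtering out boxes that are too small at the network input resolution. Only the filter depends on INPUT_WIDTH and INPUT_HEIGHT. The threshold values and the sample box are made up for illustration (the real MIN_W and MIN_H in gen_txt.py are not shown in this issue), and both function names are hypothetical.

```python
MIN_W, MIN_H = 6, 6  # hypothetical thresholds, not the real gen_txt.py values


def normalized_bbox(bbox, img_w, img_h):
    """Return (cx, cy, w, h) as fractions of the image size."""
    x, y, w, h = bbox
    return ((x + w / 2.0) / img_w, (y + h / 2.0) / img_h,
            float(w) / img_w, float(h) / img_h)


def is_kept(bbox, img_w, img_h, input_w, input_h):
    """Mirror the size filter: keep the box only if it is large enough
    after rescaling the image to the network input resolution."""
    _, _, w, h = bbox
    w_rescaled = float(w) * input_w / img_w
    h_rescaled = float(h) * input_h / img_h
    return w_rescaled >= MIN_W and h_rescaled >= MIN_H


bbox = (100, 50, 20, 40)  # x, y, w, h in pixels of a 1920x1080 image
print(normalized_bbox(bbox, 1920, 1080))      # same for any input size
print(is_kept(bbox, 1920, 1080, 608, 608))    # about 6.3 px wide: kept
print(is_kept(bbox, 1920, 1080, 416, 416))    # about 4.3 px wide: dropped
```

So the txt lines that do get written are identical across resolutions. What changes with the input size is only which small boxes are left out.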

opened by philipp-schmidt 3

ERROR WHILE TESTING ON A LOCAL VIDEO FILE

While testing on a local video file, I came across this error:

double free or corruption (!prev)
Aborted (core dumped)

Command used:

python3 darknet_video.py --input videos/case1.mp4 --out_filename videos_output/case1.avi --weights ./backup/yolov4-crowdhuman-608x608_final.weights --dont_show --ext_output --config_file ./cfg/yolov4-crowdhuman-608x608.cfg --data_file ../data/crowdhuman-608x608.data --thresh 0.25

I double checked every parameter. Weights are loaded properly.

[yolo] params: iou loss: ciou (4), iou_norm: 0.07, obj_norm: 1.00, cls_norm: 1.00, delta_norm: 1.00, scale_x_y: 1.05
nms_kind: greedynms (1), beta = 0.600000
Total BFLOPS 127.248
avg_outputs = 1046494
Allocate additional workspace_size = 52.43 MB
Try to load weights: ./backup/yolov4-crowdhuman-608x608_final.weights
Loading weights from ./backup/yolov4-crowdhuman-608x608_final.weights...
seen 64, trained: 384 K-images (6 Kilo-batches_64)
Done! Loaded 162 layers from weights-file
Loaded - names_list: data/crowdhuman.names, classes = 2
videos/case1.mp4
FPS: 4

Objects:
person: 50.31% (left_x: 250 top_y: 59 width: 7 height: 33)
person: 88.72% (left_x: 259 top_y: 60 width: 9 height: 34)
FPS: 26

Objects:
person: 49.94% (left_x: 250 top_y: 59 width: 7 height: 33)
person: 88.43% (left_x: 259 top_y: 60 width: 9 height: 34)
double free or corruption (!prev)
Aborted (core dumped)

How to solve this issue?

opened by chandu1263 3
max_batches value

This is more of a question than an issue. Since the crowdhuman-608x608 dataset contains around 15000 photos for training, why is max_batches set to only 6000? The yolov4 instructions include a guideline saying that max_batches should be at least equal to the number of training images.

opened by dlavrantonis 2
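
For reference on the max_batches question above: the rule of thumb in the AlexeyAB/darknet README is, to my understanding, classes × 2000, but not less than the number of training images and not less than 6000, with steps at 80% and 90% of max_batches. The sketch below only reproduces that arithmetic and does not explain why 6000 was chosen in this repository. The function name is hypothetical, and 15000 is the approximate image count quoted in the question.

```python
def suggested_max_batches(num_classes, num_train_images):
    """Return (max_batches, steps) following the darknet README rule of thumb."""
    max_batches = max(num_classes * 2000, num_train_images, 6000)
    # Learning-rate steps at 80% and 90% of max_batches (integer arithmetic).
    steps = (max_batches * 8 // 10, max_batches * 9 // 10)
    return max_batches, steps


# Two classes and roughly 15000 training images.
print(suggested_max_batches(2, 15000))  # (15000, (12000, 13500))
```
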
Training on Xavier agx - make issue
Hello!

I tried to train on a Xavier AGX, but I ran into an error.

When I run 'make' after modifying the Makefile, I get an error like this:

chmod +x *.sh
g++ -std=c++11 -std=c++11 -Iinclude/ -I3rdparty/stb/include -DOPENCV `pkg-config --cflags opencv4 2> /dev/null || pkg-config --cflags opencv` -DGPU -I/usr/local/cuda/include/ -DCUDNN -DCUDNN_HALF -Wall -Wfatal-errors -Wno-unused-result -Wno-unknown-pragmas -fPIC -ffp-contract=fast -mavx -mavx2 -msse3 -msse4.1 -msse4.2 -msse4a -Ofast -DOPENCV -fopenmp -DGPU -DCUDNN -I/usr/local/cudnn/include -DCUDNN_HALF -fPIC -c ./src/image_opencv.cpp -o obj/image_opencv.o
g++: error: unrecognized command line option ‘-mavx’
g++: error: unrecognized command line option ‘-mavx2’
g++: error: unrecognized command line option ‘-msse3’
g++: error: unrecognized command line option ‘-msse4.1’
g++: error: unrecognized command line option ‘-msse4.2’
g++: error: unrecognized command line option ‘-msse4a’
Makefile:182: recipe for target 'obj/image_opencv.o' failed
make: *** [obj/image_opencv.o] Error 1

My AGX's environment is as follows:

NVIDIA Jetson AGX Xavier [16GB]

Jetpack 4.5.1 [L4T 32.5.1]

NV Power Mode: MODE_15W_DESKTOP - Type: 7

jetson_stats.service: active

Libraries:

CUDA: 10.2.89

cuDNN: 8.0.0.180

TensorRT: 7.1.3.0

Visionworks: 1.6.0.501

OpenCV: 4.1.1

VPI: ii libnvvpi1 1.0.15 arm64 NVIDIA Vision Programming Interface library

Vulkan: 1.2.70

I'm always grateful for your good materials.
opened by Falconpunchzz 2
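
The options rejected in the make output above (-mavx, -mavx2, -msse3 and so on) are x86 SIMD flags, which an aarch64 g++ such as the one on a Jetson AGX Xavier does not accept. In the darknet Makefile these flags are added when AVX=1, so building with AVX=0 is the likely fix. That is an assumption, since the modified Makefile is not shown. The sketch below, with a hypothetical helper name, simply decides from the machine architecture whether the AVX flags can be used at all.

```python
import platform

# Machine names under which x86 SIMD flags such as -mavx are valid at all.
X86_MACHINES = {"x86_64", "amd64", "i386", "i686"}


def can_use_avx_flags(machine=None):
    """Return True only on x86 machines; ARM boards such as Jetson need AVX=0."""
    machine = (machine or platform.machine()).lower()
    return machine in X86_MACHINES


# "aarch64" is what platform.machine() typically reports on a Jetson board.
for name in ("x86_64", "aarch64"):
    setting = "AVX may be 1" if can_use_avx_flags(name) else "AVX must be 0"
    print(name, "->", setting)
```
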
How do I create a .odgt extension file to create a custom yolov4 model?

How do I create a .odgt extension file to create a custom yolov4 model?

I am following your tutorial on google colab.

But I saw odgt extension files for the first time in my life in the dataset.

This file looks like an annotation file. Please tell me how to create an .odgt file to make a custom yolov4 model.

opened by Choikyungho9 2
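
On the .odgt question above: as far as I know, the CrowdHuman annotation .odgt files are plain text with one JSON object per line. Each object holds an image "ID" and a "gtboxes" list whose entries carry a "tag" plus "hbox" (head), "vbox" (visible body) and "fbox" (full body) boxes as [x, y, width, height] in pixels. Here is a minimal sketch of writing and reading such a file for a custom dataset, assuming that layout. The file name, image ID and box values are made up.

```python
import json
import os
import tempfile


def write_odgt(path, records):
    """Write one JSON object per line (the layout assumed for .odgt files)."""
    with open(path, "w") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def read_odgt(path):
    """Read the file back into a list of dicts, skipping blank lines."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]


# Made-up annotation for one image; boxes are [x, y, width, height] in pixels.
records = [
    {
        "ID": "my_image_0001",
        "gtboxes": [
            {
                "tag": "person",
                "hbox": [120, 40, 30, 36],
                "vbox": [100, 40, 80, 200],
                "fbox": [95, 38, 90, 220],
            }
        ],
    }
]

odgt_path = os.path.join(tempfile.mkdtemp(), "annotation_custom.odgt")
write_odgt(odgt_path, records)
loaded = read_odgt(odgt_path)
print(loaded[0]["ID"], len(loaded[0]["gtboxes"]))
```
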
Permission denied

We can't get the data in Google Colab. Could you change the permissions on your Drive for the dataset folders?

Downloading CrowdHuman_train01.zip...
Permission denied: https://drive.google.com/uc?id=134QOvaatwKdy0iIeNqA_p-xkAhkV4F8Y
Maybe you need to change permission over 'Anyone with the link'?

opened by aloosh12 2
Can't detect my model

Hello, my model does traffic sign detection. I downloaded my cfg and weights files and converted them to TensorRT following your guide, but it doesn't show the label names of the traffic signs. Could you show me how I can fix it? Thank you.

opened by QuyNguyen87 3
cannot download yolov4-crowdhuman-416x416_best.weights

Hi, I am from China. I want to download your yolov4-crowdhuman-416x416_best.weights, but in China nobody can open Google websites, so I cannot download it. Would you please send me a copy of the file, or suggest another way? Thanks.

opened by fatfishlin 9
What changes do I need to make to train with only the person class?

I get the error below while training with only the person class:

Wrong annotation: class_id = 1. But class_id should be [from 0 to 0], file: data/crowdhuman-608x608/273278,77a600008ca1359b.txt

Kindly suggest what I should do.

opened by ssbilakeri 2
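
The message above says that a label file contains class_id 1 while the cfg allows only class 0. The "classes = 2" line in the log earlier on this page suggests the generated txt files use two class ids. I am assuming here that 0 is head and 1 is person, which should be checked against crowdhuman.names. Under that assumption, a person-only dataset needs the head lines dropped and the person lines renumbered to 0. The helper name below is hypothetical.

```python
def keep_single_class(lines, keep_id=1):
    """Keep only YOLO label lines of class keep_id and renumber them to class 0."""
    kept = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if int(parts[0]) == keep_id:
            kept.append(" ".join(["0"] + parts[1:]))
    return kept


# Made-up label lines: class id, then normalized cx, cy, w, h.
sample = [
    "0 0.50 0.20 0.05 0.08",  # assumed head
    "1 0.50 0.55 0.20 0.80",  # assumed person
]
print(keep_single_class(sample))  # ['0 0.50 0.55 0.20 0.80']
```

The cfg would also need classes=1 in each [yolo] layer and filters=18, that is (classes + 5) × 3, in the convolutional layer just before each of them, following the usual darknet custom-training guideline.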

Owner

JK Jung

I am one of the NVIDIA Jetson Champions: https://developer.nvidia.com/embedded/community/jetson-champions

WHENet - ONNX, OpenVINO, TFLite, TensorRT, EdgeTPU, CoreML, TFJS, YOLOv4/YOLOv4-tiny-3L

HeadPoseEstimation-WHENet-yolov4-onnx-openvino ONNX, OpenVINO, TFLite, TensorRT, EdgeTPU, CoreML, TFJS, YOLOv4/YOLOv4-tiny-3L 1. Usage $ git clone htt

49 Sep 21, 2022

FaceAnon - Anonymize people in images and videos using yolov5-crowdhuman

Face Anonymizer Blur faces from image and video files in /input/ folder. Require

22 Nov 3, 2022

A set of tools for converting a darknet dataset to COCO format working with YOLOX

Darknet-format data → COCO. Directory structure of the darknet training data (see dataset/darknet for details): darknet ├── class.names ├── gen_config.data ├── gen_train.txt ├── gen_valid.txt └── images

148 Jan 3, 2023

This project deals with the detection of skin lesions within the ISICs dataset using YOLOv3 Object Detection with Darknet.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. Skin Lesion detection using YOLO This project deal

1 Nov 22, 2021

YOLOv4-v3 Training Automation API for Linux

This repository allows you to get started with training a state-of-the-art Deep Learning model with little to no configuration needed! You provide your labeled dataset or label your dataset using our BMW-LabelTool-Lite and you can start the training right away and monitor it in many different ways like TensorBoard or a custom REST API and GUI. NoCode training with YOLOv4 and YOLOV3 has never been so easy.

626 Dec 31, 2022

A dead simple python wrapper for darknet that works with OpenCV 4.1, CUDA 10.1

What Dead simple python wrapper for Yolo V3 using AlexeyAB's darknet fork. Works with CUDA 10.1 and OpenCV 4.1 or later (I use OpenCV master as of Jun

6 Jan 12, 2022

An Unsupervised Detection Framework for Chinese Jargons in the Darknet

An Unsupervised Detection Framework for Chinese Jargons in the Darknet This repo is the Python 3 implementation of 《An Unsupervised Detection Framewor

7 Nov 8, 2022

This repository is related to an Arabic tutorial, within the tutorial we discuss the common data structure and algorithms and their worst and best case for each, then implement the code using Python.

Data Structure and Algorithms with Python This repository is related to the Arabic tutorial here, within the tutorial we discuss the common data struc

33 Dec 2, 2022

This is the official source code for SLATE. We provide the code for the model, the training code, and a dataset loader for the 3D Shapes dataset. This code is implemented in Pytorch.

SLATE This is the official source code for SLATE. We provide the code for the model, the training code and a dataset loader for the 3D Shapes dataset.

66 Dec 26, 2022

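This is a mobilenet-yolov4-lite library that replaces the yolov4 backbone network with mobilenet and modifies the convolution composition of PANet, greatly reducing the number of parameters.

YOLOV4: You Only Look Once object detection model with the backbone replaced by the mobilenet series, implemented in Keras. Update of February 8, 2021: added the letterbox_image option; turning letterbox_image off generally improves the network's mAP.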

65 Dec 1, 2022

YOLTv4 builds upon YOLT and SIMRDWN, and updates these frameworks to use the most performant version of YOLO, YOLOv4

YOLTv4 builds upon YOLT and SIMRDWN, and updates these frameworks to use the most performant version of YOLO, YOLOv4. YOLTv4 is designed to detect objects in aerial or satellite imagery in arbitrarily large images that far exceed the ~600×600 pixel size typically ingested by deep learning object detection frameworks.

161 Jan 6, 2023

PyTorch, ONNX and TensorRT implementation of YOLOv4

4.2k Jan 1, 2023

I tried to apply the CAM algorithm to YOLOv4 and it worked.

YOLOV4: You Only Look Once object detection model implemented in PyTorch. Update of February 7, 2021: added the letterbox_image option; turning letterbox_image off greatly improves the network's mAP. Contents: Performance, Achievement

55 Dec 5, 2022

People movement type classifier with YOLOv4 detection and SORT tracking.

Movement classification The goal of this project would be movement classification of people, in other words, walking (normal and fast) and running. Yo

4 Sep 21, 2021

Object tracking implemented with YOLOv4, DeepSort, and TensorFlow.

Object tracking implemented with YOLOv4, DeepSort, and TensorFlow. YOLOv4 is a state of the art algorithm that uses deep convolutional neural networks to perform object detections. We can take the output of YOLOv4 feed these object detections into Deep SORT (Simple Online and Realtime Tracking with a Deep Association Metric) in order to create a highly accurate object tracker.

1.1k Dec 29, 2022

Vehicles Counting using YOLOv4 + DeepSORT + Flask + Ngrok

Implementing yolov4 target detection and tracking based on nao robot

Implementation of "Scaled-YOLOv4: Scaling Cross Stage Partial Network" using PyTorch framework.

A Keras implementation of YOLOv4 (Tensorflow backend)