A PyTorch-based real-time semantic segmentation model for autonomous driving

Overview

CFPNet: Channel-Wise Feature Pyramid for Real-Time Semantic Segmentation

This project contains the PyTorch implementation of the proposed CFPNet (paper).

Real-time semantic segmentation is playing an increasingly important role in computer vision, driven by the growing demand from mobile devices and autonomous driving. It is therefore essential to achieve a good trade-off among performance, model size, and inference speed. In this paper, we propose a Channel-wise Feature Pyramid (CFP) module to balance those factors. Based on the CFP module, we build CFPNet for real-time semantic segmentation, which applies a series of dilated convolution channels to extract effective features. Experiments on the Cityscapes and CamVid datasets show that the proposed CFPNet achieves an effective combination of those factors. On the Cityscapes test set, CFPNet achieves 70.1% class-wise mIoU with only 0.55 million parameters and 2.5 MB of memory. Inference can reach 30 FPS on a single RTX 2080Ti GPU (60% GPU usage) with a 1024×2048-pixel image.
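The core idea of the CFP module is to split the feature map into channel groups and process each group with a different dilation rate before merging. The following is a minimal sketch of that idea; the module name and grouping scheme here are illustrative, not the exact block from the paper:

```python
import torch
import torch.nn as nn

class ChannelwiseDilatedPyramid(nn.Module):
    """Illustrative sketch: split the channels into groups, apply a 3x3
    convolution with a different dilation rate to each group, then merge
    the groups with a residual connection. Not the exact CFP block."""
    def __init__(self, channels, dilations=(1, 2, 4, 8)):
        super().__init__()
        assert channels % len(dilations) == 0
        group = channels // len(dilations)
        self.branches = nn.ModuleList([
            nn.Conv2d(group, group, kernel_size=3,
                      padding=d, dilation=d, bias=False)
            for d in dilations
        ])
        self.bn = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        chunks = torch.chunk(x, len(self.branches), dim=1)
        out = torch.cat([b(c) for b, c in zip(self.branches, chunks)], dim=1)
        return self.act(self.bn(out) + x)  # residual merge

# quick shape check
y = ChannelwiseDilatedPyramid(64)(torch.randn(1, 64, 128, 256))
print(y.shape)  # torch.Size([1, 64, 128, 256])
```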

Installation

  • Environment: Python 3.6; PyTorch 1.0; CUDA 9.0; cuDNN v7
  • Install some packages:
pip install opencv-python pillow numpy matplotlib
  • Clone this repository
git clone https://github.com/AngeLouCN/CFPNet
  • One GPU with 11 GB of memory is needed
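
To quickly verify that the environment is set up correctly, you can run a short check like the following:

```python
import torch

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("cuDNN:", torch.backends.cudnn.version())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```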

Dataset

You need to download the two datasets, CamVid and Cityscapes, and put the files in the dataset folder with the following structure.

├── camvid
|    ├── train
|    ├── test
|    ├── val 
|    ├── trainannot
|    ├── testannot
|    ├── valannot
|    ├── camvid_trainval_list.txt
|    ├── camvid_train_list.txt
|    ├── camvid_test_list.txt
|    └── camvid_val_list.txt
├── cityscapes
|    ├── gtCoarse
|    ├── gtFine
|    ├── leftImg8bit
|    ├── cityscapes_trainval_list.txt
|    ├── cityscapes_train_list.txt
|    ├── cityscapes_test_list.txt
|    └── cityscapes_val_list.txt  
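
The *_list.txt files pair each image with its annotation. Assuming each line holds an image path and a label path separated by whitespace (check the actual list files for the exact format), a minimal PyTorch Dataset over such a list could look like this sketch:

```python
import cv2
import numpy as np
import torch
from torch.utils.data import Dataset

class SegListDataset(Dataset):
    """Minimal sketch: reads 'image_path label_path' pairs from a list
    file. The real list format and preprocessing live in the repo's
    dataset code; adjust paths and normalization accordingly."""
    def __init__(self, root, list_file):
        with open(list_file) as f:
            self.pairs = [line.split() for line in f if line.strip()]
        self.root = root

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        img_path, lbl_path = self.pairs[idx]
        img = cv2.imread(f"{self.root}/{img_path}")[:, :, ::-1]  # BGR -> RGB
        lbl = cv2.imread(f"{self.root}/{lbl_path}", cv2.IMREAD_GRAYSCALE)
        img = torch.from_numpy(img.copy()).permute(2, 0, 1).float() / 255.0
        return img, torch.from_numpy(lbl.astype(np.int64))
```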

Training

  • You can run python train.py -h to check the details of the optional arguments. In train.py, you can set the dataset, train type, number of epochs, batch size, etc.
  • Training on the Cityscapes train set:
python train.py --dataset cityscapes
  • Training on the CamVid train and val sets:
python train.py --dataset camvid --train_type trainval --max_epochs 1000 --lr 1e-3 --batch_size 16
  • During training, every 50 epochs we record the mean IoU on the train set and validation set, along with the training loss, and draw plots so you can check whether the training process is normal; a minimal plotting sketch follows below.
(plots: validation mIoU vs. epochs; training loss vs. epochs)
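
A minimal sketch of how such curves can be plotted from the recorded histories (the function and argument names here are illustrative; train.py handles this internally):

```python
import matplotlib.pyplot as plt

def plot_history(epochs, train_miou, val_miou, train_loss):
    """Plot the mIoU and loss histories recorded every 50 epochs."""
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.plot(epochs, train_miou, label="train mIoU")
    ax1.plot(epochs, val_miou, label="val mIoU")
    ax1.set_xlabel("epoch")
    ax1.set_ylabel("mIoU")
    ax1.legend()
    ax2.plot(epochs, train_loss, label="train loss")
    ax2.set_xlabel("epoch")
    ax2.set_ylabel("loss")
    ax2.legend()
    fig.savefig("training_curves.png", dpi=150)
```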

Testing

  • After training, the checkpoint will be saved in the checkpoint folder; you can use test.py to predict the results.
python test.py --dataset ${camvid, cityscapes} --checkpoint ${CHECKPOINT_FILE}
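
Under the hood, prediction amounts to loading the checkpoint, running a forward pass, and taking an argmax over the class logits. A minimal sketch with a stand-in model (the real CFPNet constructor and checkpoint path come from this repo):

```python
import torch
import torch.nn as nn

# Stand-in model: replace with the repo's CFPNet constructor and the
# real checkpoint file (typically 19 classes for Cityscapes, 11 for CamVid).
net = nn.Conv2d(3, 19, kernel_size=1)
# net.load_state_dict(torch.load("checkpoint/model.pth", map_location="cpu"))
net.eval()

with torch.no_grad():
    image = torch.randn(1, 3, 512, 1024)  # stand-in input tensor
    logits = net(image)                   # (1, num_classes, H, W)
    pred = logits.argmax(dim=1)           # per-pixel class indices
print(pred.shape)  # torch.Size([1, 512, 1024])
```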

Evaluation

  • For datasets that do not provide labels for the test set (e.g., Cityscapes), you can use predict.py to save all the output images and then submit them to the official evaluation server.
python predict.py --dataset ${camvid, cityscapes} --checkpoint ${CHECKPOINT_FILE}
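
For splits where labels are available locally (e.g., the validation set), mIoU can be computed from a confusion matrix. A minimal sketch; the repo's own evaluation code may differ in details such as the ignore label:

```python
import numpy as np

def mean_iou(preds, labels, num_classes, ignore_index=255):
    """Accumulate a confusion matrix over (prediction, label) pairs of
    integer arrays and return per-class IoU and its mean."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for p, l in zip(preds, labels):
        mask = l != ignore_index
        idx = num_classes * l[mask].astype(np.int64) + p[mask]
        cm += np.bincount(idx, minlength=num_classes ** 2
                          ).reshape(num_classes, num_classes)
    inter = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - inter
    iou = inter / np.maximum(union, 1)
    return iou, iou.mean()
```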

Inference Speed

  • You can run eval_fps.py to test the model's inference speed; pass the input image size, e.g., 1024,2048.
python eval_fps.py 1024,2048
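
A fair FPS measurement needs GPU warm-up and synchronization before reading the clock. A minimal sketch of the usual timing pattern, with a stand-in model in place of CFPNet:

```python
import time
import torch
import torch.nn as nn

net = nn.Conv2d(3, 19, 1).cuda().eval()   # stand-in; use the real CFPNet
x = torch.randn(1, 3, 1024, 2048).cuda()  # 1024x2048 input as in the paper

with torch.no_grad():
    for _ in range(10):                   # warm-up iterations
        net(x)
    torch.cuda.synchronize()              # wait for all queued kernels
    t0 = time.time()
    for _ in range(100):
        net(x)
    torch.cuda.synchronize()
    fps = 100 / (time.time() - t0)
print(f"{fps:.1f} FPS")
```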

Results

  • Results for CFPNet-V1, CFPNet-V2, and CFPNet-V3:
| Dataset    | Model     | mIoU  |
| ---------- | --------- | ----- |
| Cityscapes | CFPNet-V1 | 60.4% |
| Cityscapes | CFPNet-V2 | 66.5% |
| Cityscapes | CFPNet-V3 | 70.1% |
  • Sample results (from top to bottom: original image, CFPNet-V1, CFPNet-V2, and CFPNet-V3):
(figures: sample results; category accuracy vs. size, class accuracy vs. size; class accuracy vs. parameters, class accuracy vs. speed)

Comparison

  • Results on Cityscapes: (comparison figure)
  • Results on CamVid: (comparison figure)

Citation

If you find our work helpful, please consider citing:

@article{lou2021cfpnet,
  title={CFPNet: Channel-wise Feature Pyramid for Real-Time Semantic Segmentation},
  author={Lou, Ange and Loew, Murray},
  journal={arXiv preprint arXiv:2103.12212},
  year={2021}
}