InsightFace: 2D and 3D Face Analysis Project on MXNet and PyTorch

Deep Insight

Last update: Jan 6, 2023

Overview

InsightFace: 2D and 3D Face Analysis Project

By Jia Guo and Jiankang Deng

Top News

2021-06-05: We launched a Masked Face Recognition Challenge & Workshop at ICCV 2021.

2021-05-15: We released SCRFD, an efficient, high-accuracy face detection approach.

2021-04-18: We ranked 4th on NIST-FRVT 1:1; see the leaderboard.

2021-03-13: We released our official ArcFace PyTorch implementation; see here.

License

The code of InsightFace is released under the MIT License. There are no restrictions on either academic or commercial use.

The training data, including the annotations (and the models trained with these data), are available for non-commercial research purposes only.

Introduction

InsightFace is an open-source 2D and 3D deep face analysis toolbox, mainly based on MXNet and PyTorch.

The master branch works with MXNet 1.2 to 1.6, PyTorch 1.6+, and Python 3.x.

ArcFace Video Demo

Please click the image to watch the YouTube video. For Bilibili users, click here.

Recent Update

2021-03-09: Tips for training large-scale face recognition models, for example with millions of IDs (classes).

2021-02-21: We provide a simple face mask renderer here, which can be used as a data augmentation tool while training face recognition models.

2021-01-20: A OneFlow-based implementation of ArcFace and Partial-FC is available here.

2020-10-13: A new training method and one large training set (360K IDs) were released here by DeepGlint.

2020-10-09: We opened IFRT, a large-scale recognition test benchmark.

2020-08-01: We released lightweight facial landmark models with fast coordinate regression (106 points). See details here.

2020-04-27: InsightFace pretrained models and MS1M-Arcface are now specified as the only external training dataset for the iQIYI iCartoonFace challenge; see details here.

2020-02-21: An instant discussion group was created on QQ with group-id 711302608. For English developers, see the install tutorial here.

2020-02-16: RetinaFace can now detect faces with masks, to help against COVID-19; see details here.

2019-08-10: We achieved 2nd place at the WIDER Face Detection Challenge 2019.

2019-05-30: Presentation at cvmart.

2019-04-30: Our face detector (RetinaFace) obtains state-of-the-art results on the WiderFace dataset.

2019-04-14: We will launch a Light-weight Face Recognition challenge/workshop at ICCV 2019.

2019-04-04: ArcFace achieved state-of-the-art performance (7/109) on the NIST Face Recognition Vendor Test (FRVT) (1:1 verification) report (names: Imperial-000 and Imperial-001). Our solution is based on [MS1MV2+DeepGlintAsian, ResNet100, ArcFace loss].

2019-02-08: Please check https://github.com/deepinsight/insightface/tree/master/recognition/ArcFace for our parallel training code, which can easily and efficiently support one million identities on a single machine (8x 1080 Ti).

2018-12-13: Inference acceleration: TVM-Benchmark.

2018-10-28: Light-weight attribute model Gender-Age: about 1 MB, 10 ms on a single CPU core. Gender accuracy is 96% on the validation set, with an age MAE of 4.1.

2018-10-16: We achieved state-of-the-art performance on Trillionpairs (name: nttstar) and IQIYI_VID (name: WitcheR).

Deep Face Recognition

Introduction
Training Data
Train
Pretrained Models
Verification Results On Combined Margin
Test on MegaFace
512-D Feature Embedding
Third-party Re-implementation

Face Detection

RetinaFace
RetinaFaceAntiCov

Face Alignment

DenseUNet
CoordinateReg

Citation

Contact

Deep Face Recognition

Introduction

In this module, we provide training data, network settings and loss designs for deep face recognition. The training data includes, but is not limited to, the cleaned MS1M, VGG2 and CASIA-Webface datasets, which are already packed in MXNet binary format. The network backbones include ResNet, MobileFaceNet, MobileNet, InceptionResNet_v2, DenseNet, etc. The loss functions include Softmax, SphereFace, CosineFace, ArcFace, Sub-Center ArcFace and Triplet (Euclidean/Angular) Loss.

See the detail pages of our work ArcFace (accepted at CVPR 2019) and SubCenter-ArcFace (accepted at ECCV 2020).

Our method, ArcFace, was initially described in an arXiv technical report. With this module, a single model can reach LFW 99.83%+ and MegaFace 98%+. The module helps researchers and engineers develop deep face recognition algorithms quickly in only two steps: download the binary dataset and run the training script.

Training Data

All face images are aligned using five facial landmarks and cropped to 112x112.

Please check Dataset-Zoo for detailed information and dataset downloads.

Please check recognition/tools/face2rec2.py for how to build a binary face dataset. You can choose either MTCNN or RetinaFace to align the faces.
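As an illustration of what aligning by five landmarks to a 112x112 crop can involve, here is a minimal sketch that fits a 2D similarity transform (scale, rotation, translation) from detected landmarks to a reference template. The template coordinates and the function names `estimate_similarity` and `apply_transform` are assumptions made for this example; they are not taken from this README, and the repository's own alignment code may differ.

```python
# Reference positions (x, y) of the five landmarks inside a 112x112 crop:
# left eye, right eye, nose tip, left mouth corner, right mouth corner.
# ASSUMPTION: illustrative template values, not stated in this README.
TEMPLATE_112 = [
    (38.2946, 51.6963),
    (73.5318, 51.5014),
    (56.0252, 71.7366),
    (41.5493, 92.3655),
    (70.7299, 92.2041),
]


def estimate_similarity(src, dst):
    """Least-squares 2D similarity transform that maps src points onto dst.

    Points are treated as complex numbers, so the transform is z -> a*z + b,
    where a encodes scale and rotation and b encodes translation.
    Returns a 2x3 matrix [[a_re, -a_im, b_re], [a_im, a_re, b_im]].
    """
    n = len(src)
    s = [complex(x, y) for x, y in src]
    d = [complex(x, y) for x, y in dst]
    s_mean, d_mean = sum(s) / n, sum(d) / n
    # Closed-form least-squares solution on the centred point sets.
    num = sum((p - s_mean).conjugate() * (q - d_mean) for p, q in zip(s, d))
    den = sum(abs(p - s_mean) ** 2 for p in s)
    a = num / den
    b = d_mean - a * s_mean
    return [[a.real, -a.imag, b.real], [a.imag, a.real, b.imag]]


def apply_transform(matrix, points):
    """Apply a 2x3 affine matrix to a list of (x, y) points."""
    (m00, m01, m02), (m10, m11, m12) = matrix
    return [(m00 * x + m01 * y + m02, m10 * x + m11 * y + m12) for x, y in points]
```

The resulting 2x3 matrix has the layout expected by common image warping routines (for example OpenCV's `warpAffine`), so it could be used to produce the aligned 112x112 crop from the original image.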

Train

Install MXNet with GPU support (Python 3.X).

pip install mxnet-cu101 # choose the package that matches your installed CUDA version

Clone the InsightFace repository. We refer to the cloned insightface directory as INSIGHTFACE_ROOT.

git clone --recursive https://github.com/deepinsight/insightface.git

Download the training set (MS1M-Arcface) and place it in $INSIGHTFACE_ROOT/recognition/datasets/. Each training dataset includes at least the following six files:

    faces_emore/
       train.idx
       train.rec
       property
       lfw.bin
       cfp_fp.bin
       agedb_30.bin

The first three files are the training dataset while the last three files are verification sets.
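Before starting a long training run, it can help to confirm that the dataset directory really contains these six files. The following is a small sketch using only the Python standard library; the function name `missing_dataset_files` is hypothetical and not part of the repository.

```python
from pathlib import Path

# File names listed above for a packed training set such as faces_emore.
TRAIN_FILES = ("train.idx", "train.rec", "property")
VERIFICATION_FILES = ("lfw.bin", "cfp_fp.bin", "agedb_30.bin")


def missing_dataset_files(dataset_dir):
    """Return the expected file names that are absent from dataset_dir."""
    root = Path(dataset_dir)
    expected = TRAIN_FILES + VERIFICATION_FILES
    return [name for name in expected if not (root / name).is_file()]
```

For example, `missing_dataset_files("recognition/datasets/faces_emore")` returns an empty list when all six files are in place, and otherwise lists what still needs to be downloaded.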

Train deep face recognition models. In this part, we assume you are in the directory $INSIGHTFACE_ROOT/recognition/ArcFace/.

Place and edit config file:

cp sample_config.py config.py
vim config.py # edit the dataset path, etc.

We give some examples below. Our experiments were conducted on the Tesla P40 GPU.

(1). Train ArcFace with LResNet100E-IR.

CUDA_VISIBLE_DEVICES='0,1,2,3' python -u train.py --network r100 --loss arcface --dataset emore

It will output verification results for LFW, CFP-FP and AgeDB-30 every 2000 batches. You can check all options in config.py. This model can achieve LFW 99.83%+ and MegaFace 98.3%+.

(2). Train CosineFace with LResNet50E-IR.

CUDA_VISIBLE_DEVICES='0,1,2,3' python -u train.py --network r50 --loss cosface --dataset emore

(3). Train Softmax with LMobileNet-GAP.

CUDA_VISIBLE_DEVICES='0,1,2,3' python -u train.py --network m1 --loss softmax --dataset emore

(4). Fine-tune the above Softmax model with Triplet loss.

CUDA_VISIBLE_DEVICES='0,1,2,3' python -u train.py --network m1 --loss triplet --lr 0.005 --pretrained ./models/m1-softmax-emore,1

(5). Train with model-parallel acceleration.

CUDA_VISIBLE_DEVICES='0,1,2,3' python -u train_parall.py --network r100 --loss arcface --dataset emore

Verification results.

LResNet100E-IR network trained on MS1M-Arcface dataset with ArcFace loss:

Method	LFW(%)	CFP-FP(%)	AgeDB-30(%)
Ours	99.80+	98.0+	98.20+
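Verification benchmarks of this kind score pairs of face images and check whether a threshold on the score separates same-identity pairs from different-identity pairs. The sketch below shows that idea in its simplest form, picking the single threshold that maximises accuracy over all pairs. It is a simplified illustration under the assumption that similarity scores are already computed; it is not the repository's verification script, and the function name `best_threshold_accuracy` is hypothetical.

```python
def best_threshold_accuracy(similarities, is_same):
    """Return (accuracy, threshold) for the threshold that best separates pairs.

    similarities: one score per image pair (higher means more alike).
    is_same: True when the pair shows the same identity, else False.
    A pair is predicted "same" when its score is >= threshold.
    """
    # Candidate thresholds: every observed score, plus +inf so that
    # "predict every pair as different" is also considered.
    candidates = sorted(set(similarities)) + [float("inf")]
    best_acc, best_thr = 0.0, None
    for thr in candidates:
        correct = sum((s >= thr) == same for s, same in zip(similarities, is_same))
        acc = correct / len(similarities)
        if acc > best_acc:
            best_acc, best_thr = acc, thr
    return best_acc, best_thr
```

Choosing the threshold on the same pairs that are used for scoring gives an optimistic estimate, so a held-out split is preferable when reporting numbers.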

Pretrained Models

You can use $INSIGHTFACE_ROOT/recognition/arcface_torch/eval/verification.py to test all the pre-trained models.

Please check Model-Zoo for more pretrained models.

Verification Results on Combined Margin

A combined margin method was proposed as a function of the target logit value and the original θ:

COM(θ) = cos(m_1*θ+m_2) - m_3

To train with m_1=1.0, m_2=0.3, m_3=0.2, run the following command:

CUDA_VISIBLE_DEVICES='0,1,2,3' python -u train.py --network r100 --loss combined --dataset emore
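To make the combined margin formula concrete, here is a minimal scalar sketch that applies COM(θ) to the cosine of the target class. The function name `combined_margin_logit` is hypothetical. The sketch deliberately omits the feature scale and any special handling for the case where m_1*θ+m_2 exceeds π, both of which a real training implementation would need to address.

```python
import math


def combined_margin_logit(cos_theta, m1=1.0, m2=0.3, m3=0.2):
    """Apply COM(theta) = cos(m1*theta + m2) - m3 to a target-class cosine."""
    # Clamp to [-1, 1] so acos stays defined despite rounding error.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    return math.cos(m1 * theta + m2) - m3
```

With (m1, m2, m3) = (1, 0, 0) the function returns the input cosine unchanged; (1, 0.5, 0) and (1, 0, 0.35) correspond to the ArcFace and CosineFace rows of the results table, respectively.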

Results using MS1M-IBUG (MS1M-V1):

Method	m1	m2	m3	LFW	CFP-FP	AgeDB-30
W&F Norm Softmax	1	0	0	99.28	88.50	95.13
SphereFace	1.5	0	0	99.76	94.17	97.30
CosineFace	1	0	0.35	99.80	94.4	97.91
ArcFace	1	0.5	0	99.83	94.04	98.08
Combined Margin	1.2	0.4	0	99.80	94.08	98.05
Combined Margin	1.1	0	0.35	99.81	94.50	98.08
Combined Margin	1	0.3	0.2	99.83	94.51	98.13
Combined Margin	0.9	0.4	0.15	99.83	94.20	98.16

Test on MegaFace

Please check $INSIGHTFACE_ROOT/evaluation/megaface/ to evaluate the model accuracy on MegaFace. All aligned images are already provided.

512-D Feature Embedding

In this part, we assume you are in the directory $INSIGHTFACE_ROOT/deploy/. The input face image should generally be centre-cropped. We use RNet+ONet of MTCNN to further align the image before sending it to the feature embedding network.

Prepare a pre-trained model.
Put the model under $INSIGHTFACE_ROOT/models/. For example, $INSIGHTFACE_ROOT/models/model-r100-ii.
Run the test script $INSIGHTFACE_ROOT/deploy/test.py.

For a single cropped face image (112x112), total inference time is only 17 ms on our testing server (Intel E5-2660 @ 2.00GHz, Tesla M40, LResNet34E-IR).
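Once two 512-D embeddings have been extracted, a common way to compare them is cosine similarity. The sketch below assumes the embeddings are available as plain sequences of floats; the function names `l2_normalize` and `cosine_similarity` are hypothetical, and no decision threshold is given because this README does not specify one.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [v / norm for v in vec]


def cosine_similarity(emb_a, emb_b):
    """Cosine similarity between two embeddings of equal length."""
    if len(emb_a) != len(emb_b):
        raise ValueError("embeddings must have the same length")
    a, b = l2_normalize(emb_a), l2_normalize(emb_b)
    return sum(x * y for x, y in zip(a, b))
```

Scores close to 1 indicate that the two faces are likely the same identity, while scores near 0 or below indicate different identities; a suitable cut-off has to be chosen on validation pairs.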

Third-party Re-implementation

TensorFlow: InsightFace_TF
TensorFlow: tf-insightface
TensorFlow: insightface
PyTorch: InsightFace_Pytorch
PyTorch: arcface-pytorch
Caffe: arcface-caffe
Caffe: CombinedMargin-caffe
TensorFlow: InsightFace-tensorflow
TensorRT: wang-xinyu/tensorrtx

Face Detection

RetinaFace

RetinaFace is a practical single-stage SOTA face detector, initially introduced in an arXiv technical report and later accepted at CVPR 2020. We provide training code, training dataset, pretrained models and evaluation scripts.

Please check RetinaFace for details.

RetinaFaceAntiCov

RetinaFaceAntiCov is an experimental module to identify face boxes with masks. Please check RetinaFaceAntiCov for details.

Face Alignment

DenseUNet

Please check the Menpo Benchmark and our Dense U-Net for details. We also provide other network settings such as the classic hourglass. You can find all of the training code, training dataset and evaluation scripts there.

CoordinateReg

In contrast to heatmap-based approaches, we also provide some lightweight facial landmark models with fast coordinate regression. The input to these models is a loosely cropped face image, and the output is the landmark coordinates directly. See details at alignment-coordinateReg. Currently only pretrained models are available.

Citation

If you find InsightFace useful in your research, please consider citing the following related papers:

@inproceedings{deng2019retinaface,
  title={RetinaFace: Single-stage Dense Face Localisation in the Wild},
  author={Deng, Jiankang and Guo, Jia and Zhou, Yuxiang and Yu, Jinke and Kotsia, Irene and Zafeiriou, Stefanos},
  booktitle={arxiv},
  year={2019}
}

@inproceedings{guo2018stacked,
  title={Stacked Dense U-Nets with Dual Transformers for Robust Face Alignment},
  author={Guo, Jia and Deng, Jiankang and Xue, Niannan and Zafeiriou, Stefanos},
  booktitle={BMVC},
  year={2018}
}

@article{deng2018menpo,
  title={The Menpo benchmark for multi-pose 2D and 3D facial landmark localisation and tracking},
  author={Deng, Jiankang and Roussos, Anastasios and Chrysos, Grigorios and Ververas, Evangelos and Kotsia, Irene and Shen, Jie and Zafeiriou, Stefanos},
  journal={IJCV},
  year={2018}
}

@inproceedings{deng2018arcface,
  title={ArcFace: Additive Angular Margin Loss for Deep Face Recognition},
  author={Deng, Jiankang and Guo, Jia and Xue, Niannan and Zafeiriou, Stefanos},
  booktitle={CVPR},
  year={2019}
}

Contact

Jia Guo: guojia[at]gmail.com
Jiankang Deng: jiankangdeng[at]gmail.com

Comments

iBUG_DeepInsight

I have seen that the current top algorithm in the MegaFace challenge is iBug_DeepInsight, with an accuracy that corresponds with your latest update: 2018.02.13: We achieved state-of-the-art performance on MegaFace-Challenge-1, at 98.06

After reading your paper and the README in this repo, it seems to me that this accuracy is achieved using the cleaned/refined MegaFace dataset. Is this correct?

opened by d4nst 70
Did anyone try to use ArcLoss without alignment?

Hi all, I hope someone tried this. I have trained ResNet-18 with Softmax, Centerloss and finally ArcFace. VGG2 was used as training data, no alignment step has been done because we use another face detection approach. VGG2 test results are the following: 68% for SoftMax, about 89% for CenterLoss ( more than 20% boost) and surprisingly only 70% for ArcFace. I guess this caused because alignment step is missed and because of some reasons this step is extremely important for ArcFace and not so important for CenterLoss. But it's just an assumption. Does anyone have experience in training with no aligned images? Thank you!

opened by borisgribkov 60
How about the speed of training ?

Initially, I meet the issue of out of memory issue on TITAN X 12GB, so I change per GPU batch size from 128 to 64, so the batch_size is 64*4=256. However, the training speed is only 26 examples/sec. The version of MXNet is 1.2.0

So, I adopt the suggestions (https://github.com/deepinsight/insightface/compare/master...gaohuazuo:tested) from @gaohuazuo (https://github.com/deepinsight/insightface/issues/32) for out of memory issue. In his comments, he tested on 1080Ti x4, mxnet-cu80, r100, per GPU batch size 128. Memory 8.3G, speed 308 examples/sec.

But I followed the operations he suggested, the training speed is still very low on my server, it is only 28 examples/sec. I test on P100x4 with each 16 GB, mxnet-cu80, r100, loss_type=4, per GPU batch size 128, Memory 8.3G (I also try the setting with per GPU batch size 192, Memory 10.3G, also very low only 32 examples/sec).

Moreover, If I do not use memonger, P100x4 with each 16 GB, mxnet-cu80, r100, loss_type=4, per GPU batch size 128, the training speed is almost the same as 30 examples/sec.

Do you know how to fix the issue of speed ?
question

opened by bruinxiong 42
training 10 Million Identities with no pretrained model

I want to train my own model on 10 Million Identities. At the beginning of training, speed is 10000 samples/sec. In a few minutes, speed is 300 samples/sec. Why? GPU: V100 * 8
question partial_fc

opened by Gary-Qinghua 32
Pytorch implementation. rescale parameter in sgd

Could you please explain the meaning of the rescale parameter in your sgd implementation? https://github.com/deepinsight/insightface/blob/master/recognition/partial_fc/pytorch/sgd.py#L42

You set it to the world_size in the training here https://github.com/deepinsight/insightface/blob/master/recognition/partial_fc/pytorch/partial_fc.py#L82 and it seems like it affects gradients in both the backbone and the head.
bug partial_fc

opened by golunovas 29
Test on CPU

Hi,

How can we test pretrained models on CPU?

It gives:

RuntimeError: simple_bind error. Arguments:
data: (1, 3L, 68L, 68L)
[23:39:56] src/storage/storage.cc:118: Compile with USE_CUDA=1 to enable GPU usage

opened by MyraBaba 25
very slow when testing on megaface

Since the batch size is always 1 when testing megaface (https://github.com/deepinsight/insightface/blob/master/src/megaface/gen_megaface.py), is there any reason for setting such small batch size?

opened by twmht 22
Paid help request C++

Hi,

We have small budget to use:

We need to use model from C++ and extract futures like in test.py

Is there anyone who expert on the both C++ and mxnet to spare a few hours?

Best

opened by AnjeliquePink 21
acc is always about 0.5 using mobilenetface

testing verification..
(12000, 128)
infer time 30.116359
[lfw][6000]XNorm: 38.367005
[lfw][6000]Accuracy-Flip: 0.50000+-0.00000
testing verification..
(14000, 128)
infer time 35.065952
[cfp_fp][6000]XNorm: 38.365932
[cfp_fp][6000]Accuracy-Flip: 0.50000+-0.00000
testing verification..
(12000, 128)
infer time 30.366434
[agedb_30][6000]XNorm: 38.366582
[agedb_30][6000]Accuracy-Flip: 0.50000+-0.00000
[6000]Accuracy-Highest: 0.51533

The training script is:

CUDA_VISIBLE_DEVICES='0,1,2,3' python -u train_softmax.py --network y1 --loss-type 4 --margin-s 128 --margin-m 0.5 --per-batch-size 128 --emb-size 128 --data-dir ../datasets/faces_ms1m_112x112 --wd 0.00004 --fc7-wd-mult 10.0 --prefix ../model-mobilefacenet-128

opened by tianxingyzxq 21
How to finetune on my own datasets ?

I fine-tuned 'model-r50-am-lfw' on the 'faces_emore' dataset and got a new model, v1. But when I try to fine-tune v1 on my own dataset, I get this error:

Incompatible attr in node at 0-th output: expected [85742,512], got [12700,512]

I can also fine-tune 'model-r50-am-lfw' directly on my own dataset, and that works. How do I save the model correctly? Or how should I fine-tune on my own dataset? Thank you!

opened by maryhh 19
RuntimeError: simple_bind error. Arguments:

I have this question, can you help me? thank you very much!

Traceback (most recent call last):
  File "train_softmax.py", line 1033, in <module>
    main()
  File "train_softmax.py", line 1030, in main
    train_net(args)
  File "train_softmax.py", line 1009, in train_net
    model.fit(train_dataiter,
  File "/usr/local/lib/python2.7/dist-packages/mxnet/module/base_module.py", line 460, in fit
    for_training=True, force_rebind=force_rebind)
  File "/usr/local/lib/python2.7/dist-packages/mxnet/module/module.py", line 429, in bind
    state_names=self._state_names)
  File "/usr/local/lib/python2.7/dist-packages/mxnet/module/executor_group.py", line 264, in __init__
    self.bind_exec(data_shapes, label_shapes, shared_group)
  File "/usr/local/lib/python2.7/dist-packages/mxnet/module/executor_group.py", line 360, in bind_exec
    shared_group))
  File "/usr/local/lib/python2.7/dist-packages/mxnet/module/executor_group.py", line 638, in _bind_ith_exec
    shared_buffer=shared_data_arrays, **input_shapes)
  File "/usr/local/lib/python2.7/dist-packages/mxnet/symbol/symbol.py", line 1518, in simple_bind
    raise RuntimeError(error_msg)
RuntimeError: simple_bind error. Arguments:
data: (128, 3, 112, 112)
softmax_label: (128,)
[17:54:55] src/storage/./pooled_storage_manager.h:107: cudaMalloc failed: unknown error

Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x2ab998) [0x7fd3e2f9d998]
[bt] (1) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x2abda8) [0x7fd3e2f9dda8]
[bt] (2) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x29071ef) [0x7fd3e55f91ef]
[bt] (3) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x290a4c8) [0x7fd3e55fc4c8]
[bt] (4) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x2466d4d) [0x7fd3e5158d4d]
[bt] (5) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x24710c4) [0x7fd3e51630c4]
[bt] (6) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x247205c) [0x7fd3e516405c]
[bt] (7) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x24768f8) [0x7fd3e51688f8]
[bt] (8) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x248103a) [0x7fd3e517303a]
[bt] (9) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x2481734) [0x7fd3e5173734]

opened by zfhsky 19
Jmlr Obj file for visualization

I cant seem to convert the predicted verts into a obj file using the obj module in 3rd Party modules. Is there a simple way to save the output into a preview-able from to open it in meshlab or zbrush ?

opened by umarmuzammil 0
Identifying unique people throughout video

I want to know how many unique people are in this video. https://www.youtube.com/watch?v=XtXJSX71hhg I know it is 4, but through insight-face, I want to do it and it shows face ids around 10. I analyzed and looked at it, what happens is whenever extremely side-angled faces are detected obviously it will be nonrecognizable with the existing list of faces so it will issue a new id to that, so I tried ignoring all faces with faces.embedding_norm<20 but still the issue persists. so now how can I improve results so it can give max 5-6 id if 4 people are there?

opened by NisargCSP 0
How to download dataset/models from Baidu?

I am intending to download the Glint360 data set, but I am unable to download using Baidu. Do you mind providing the steps to do so? I am able to read chinese too, so if steps are provided in Chinese it will be good too. Thank you!! :D

opened by tanpengshi 0
implementation for 3d68 face alignment

Hi, do you have implementation for 3d68 face alignment? I found it seams the sdunet only outputs 2D landmarks. Do you use JMLR for 3d68 face alignment? Thank you:)

opened by danxia0308 0
Loss goes to NaN while using fp16 when working with arcface_torch and throws an error during validation

Hello, I am trying to train an ArcFace model on my custom dataset with 240226 IDs. Here is my config file:

from easydict import EasyDict as edict

config = edict()
config.margin_list = (1.0, 0.0, 0.4)
config.network = "r50"
config.resume = False
config.output = None
config.embedding_size = 512
config.sample_rate = 1.0
config.fp16 = True
config.momentum = 0.9
config.weight_decay = 1e-4
config.batch_size = 192
config.lr = 0.001
config.verbose = 2000
config.dali = False
config.rec = "/workspace/awi_facial_recognition/"
#config.use_pretrained = True
#config.pretrained_model_path = "/workspace/16backbone.pth"
config.num_classes = 240226
config.num_image = 12558871
config.num_epoch = 20
config.warmup_epoch = 0
config.val_targets = ['lfw', 'cfp_fp', 'agedb_30', 'calfw.bin']

After training for a couple of hours, I receive an error during validation. The error is regarding Nan values in the embedding. My loss also goes to nan before this error occurs. Can anyone please help on this?

Edit: There are several validation steps that take place before the error occurs. I sort of have an idea that there is something I need to do with either batch size or learning rate. However I would love to know some others ways I can solve it. This error occurs before a whole epoch is completed. Please help me if possible.
opened by stavyadatta 0

Owner

Deep Insight

洞见

GitHub https://insightface.ai
