#coding=utf-8
import pdb
import os, sys, random
import numpy as np
import torch
import torch.nn.functional as F
import torch.utils.data as data
from PIL import Image
from mtcnn import MTCNN
detector = MTCNN()
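## NOTE: the MTCNN detector is created once at import time and shared by every
## dataset in this file. That is fine for single-process loading; a hedged
## sketch of a lazily created detector (an assumption about multi-worker
## DataLoader use, not something the original code does):
#
#   _detector = None
#   def get_detector():
#       global _detector
#       if _detector is None:
#           _detector = MTCNN()  # each worker process builds its own copy
#       return _detector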
import operator
import cv2
## Data generators for the AFEW video dataset
class VideoDataset(data.Dataset):
def __init__(self, video_root, video_list, rectify_label=None, transform=None, csv = False):
self.imgs_first, self.index = load_imgs_total_frame(video_root, video_list, rectify_label)
self.transform = transform
def __getitem__(self, index):
path_first, target_first = self.imgs_first[index]
img_cv2 = np.array(Image.open(path_first).convert("RGB"))
face=detector.detect_faces(img_cv2)
face.sort(key=operator.itemgetter('confidence'),reverse=True)
start_c=face[0]['keypoints']['left_eye']
end_c=face[0]['keypoints']['right_eye']
n_p=face[0]['keypoints']['nose']
y_s=max(0,min(start_c[1],end_c[1])-(n_p[1]-start_c[1]))
y_s_e=min(max(start_c[1],end_c[1])+(n_p[1]-start_c[1])//2,img_cv2.shape[0])
x_s=max(0,start_c[0]-(end_c[0]-start_c[0])//2)
x_s_e=min(end_c[0]+(end_c[0]-start_c[0])//2,img_cv2.shape[1])
img_first=Image.fromarray(img_cv2[y_s:y_s_e,x_s:x_s_e,:]).convert("RGB")
if self.transform is not None:
img_first = self.transform(img_first)
return img_first, target_first, self.index[index]
def __len__(self):
return len(self.imgs_first)
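## The eye-region crop above is repeated verbatim in every __getitem__ in this
## file. A minimal sketch of a shared helper (the name crop_eye_region is
## introduced here for illustration only, it is not part of the original code);
## unlike the inline version it falls back to the full frame when MTCNN finds
## no face, which would otherwise raise an IndexError on face[0].
def crop_eye_region(img_rgb):
    faces = detector.detect_faces(img_rgb)
    if not faces:
        # no detection: return the whole frame instead of crashing
        return Image.fromarray(img_rgb).convert("RGB")
    faces.sort(key=operator.itemgetter('confidence'), reverse=True)
    kp = faces[0]['keypoints']
    left_eye, right_eye, nose = kp['left_eye'], kp['right_eye'], kp['nose']
    eye_to_nose = nose[1] - left_eye[1]
    eye_dist = right_eye[0] - left_eye[0]
    y0 = max(0, min(left_eye[1], right_eye[1]) - eye_to_nose)
    y1 = min(max(left_eye[1], right_eye[1]) + eye_to_nose // 2, img_rgb.shape[0])
    x0 = max(0, left_eye[0] - eye_dist // 2)
    x1 = min(right_eye[0] + eye_dist // 2, img_rgb.shape[1])
    return Image.fromarray(img_rgb[y0:y1, x0:x1, :]).convert("RGB")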
#
class TripleImageDataset(data.Dataset):
def __init__(self, video_root, video_list, rectify_label=None, transform=None):
self.imgs_first, self.imgs_second, self.imgs_third, self.index = load_imgs_tsn(video_root, video_list,
rectify_label)
self.transform = transform
def __getitem__(self, index):
path_first, target_first = self.imgs_first[index]
img_cv2_f = np.array(Image.open(path_first).convert("RGB"))
face=detector.detect_faces(img_cv2_f)
face.sort(key=operator.itemgetter('confidence'),reverse=True)
start_c=face[0]['keypoints']['left_eye']
end_c=face[0]['keypoints']['right_eye']
n_p=face[0]['keypoints']['nose']
y_s=max(0,min(start_c[1],end_c[1])-(n_p[1]-start_c[1]))
y_s_e=min(max(start_c[1],end_c[1])+(n_p[1]-start_c[1])//2,img_cv2_f.shape[0])
x_s=max(0,start_c[0]-(end_c[0]-start_c[0])//2)
x_s_e=min(end_c[0]+(end_c[0]-start_c[0])//2,img_cv2_f.shape[1])
img_first=Image.fromarray(img_cv2_f[y_s:y_s_e,x_s:x_s_e,:]).convert("RGB")
if self.transform is not None:
img_first = self.transform(img_first)
path_second, target_second = self.imgs_second[index]
img_cv2_s = np.array(Image.open(path_second).convert("RGB"))
face=detector.detect_faces(img_cv2_s)
face.sort(key=operator.itemgetter('confidence'),reverse=True)
start_c=face[0]['keypoints']['left_eye']
end_c=face[0]['keypoints']['right_eye']
n_p=face[0]['keypoints']['nose']
y_s=max(0,min(start_c[1],end_c[1])-(n_p[1]-start_c[1]))
y_s_e=min(max(start_c[1],end_c[1])+(n_p[1]-start_c[1])//2,img_cv2_s.shape[0])
x_s=max(0,start_c[0]-(end_c[0]-start_c[0])//2)
x_s_e=min(end_c[0]+(end_c[0]-start_c[0])//2,img_cv2_s.shape[1])
img_second=Image.fromarray(img_cv2_s[y_s:y_s_e,x_s:x_s_e,:]).convert("RGB")
if self.transform is not None:
img_second = self.transform(img_second)
path_third, target_third = self.imgs_third[index]
img_cv2_t = np.array(Image.open(path_third).convert("RGB"))
face=detector.detect_faces(img_cv2_t)
face.sort(key=operator.itemgetter('confidence'),reverse=True)
start_c=face[0]['keypoints']['left_eye']
end_c=face[0]['keypoints']['right_eye']
n_p=face[0]['keypoints']['nose']
y_s=max(0,min(start_c[1],end_c[1])-(n_p[1]-start_c[1]))
y_s_e=min(max(start_c[1],end_c[1])+(n_p[1]-start_c[1])//2,img_cv2_t.shape[0])
x_s=max(0,start_c[0]-(end_c[0]-start_c[0])//2)
x_s_e=min(end_c[0]+(end_c[0]-start_c[0])//2,img_cv2_t.shape[1])
img_third=Image.fromarray(img_cv2_t[y_s:y_s_e,x_s:x_s_e,:]).convert("RGB")
if self.transform is not None:
img_third = self.transform(img_third)
return img_first, img_second, img_third, target_first, self.index[index]
def __len__(self):
return len(self.imgs_first)
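## With the crop_eye_region sketch above, each of the three branches of
## TripleImageDataset.__getitem__ could shrink to something like this
## (illustrative only; it mirrors the original behaviour apart from the
## no-face fallback):
#
#   path_first, target_first = self.imgs_first[index]
#   img_first = crop_eye_region(np.array(Image.open(path_first).convert("RGB")))
#   if self.transform is not None:
#       img_first = self.transform(img_first)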
def load_imgs_tsn(video_root, video_list, rectify_label):
imgs_first = list()
imgs_second = list()
imgs_third = list()
with open(video_list, 'r') as imf:
index = []
for id, line in enumerate(imf):
video_label = line.strip().split()
video_name = video_label[0] # name of video
label = rectify_label[video_label[1]] # label of video
video_path = os.path.join(video_root, video_name) # video_path is the path of each video
### sample triple images within each video directory ####
img_lists = os.listdir(video_path)
img_lists.sort() # sort frame file names in ascending order
img_count = len(img_lists) # number of frames in video
num_per_part = int(img_count) // 3
if int(img_count) > 3:
for i in range(img_count):
random_select_first = random.randint(0, num_per_part)
random_select_second = random.randint(num_per_part, num_per_part * 2)
random_select_third = random.randint(2 * num_per_part, len(img_lists) - 1)
img_path_first = os.path.join(video_path, img_lists[random_select_first])
img_path_second = os.path.join(video_path, img_lists[random_select_second])
img_path_third = os.path.join(video_path, img_lists[random_select_third])
imgs_first.append((img_path_first, label))
imgs_second.append((img_path_second, label))
imgs_third.append((img_path_third, label))
else:
for j in range(len(img_lists)):
img_path_first = os.path.join(video_path, img_lists[j])
img_path_second = os.path.join(video_path, random.choice(img_lists))
img_path_third = os.path.join(video_path, random.choice(img_lists))
imgs_first.append((img_path_first, label))
imgs_second.append((img_path_second, label))
imgs_third.append((img_path_third, label))
### return video frame index #####
index.append(np.ones(img_count) * id) # id: 0 : 379
index = np.concatenate(index, axis=0)
# index = index.astype(int)
return imgs_first, imgs_second, imgs_third, index
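## load_imgs_tsn and load_imgs_total_frame expect video_list to contain one
## "<video_dir> <label>" pair per line, and rectify_label to map the label
## string to an integer class id. The mapping below is only an illustrative
## guess at a seven-class AFEW-style label set, not taken from the original:
example_rectify_label = {'Happy': 0, 'Angry': 1, 'Disgust': 2, 'Fear': 3,
                         'Sad': 4, 'Neutral': 5, 'Surprise': 6}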
def load_imgs_total_frame(video_root, video_list, rectify_label):
imgs_first = list()
with open(video_list, 'r') as imf:
index = []
video_names = []
for id, line in enumerate(imf):
video_label = line.strip().split()
video_name = video_label[0] # name of video
label = rectify_label[video_label[1]] # label of video
video_path = os.path.join(video_root, video_name) # video_path is the path of each video
### collect every frame in this video directory ####
img_lists = os.listdir(video_path)
img_lists.sort() # sort frame file names in ascending order
img_count = len(img_lists) # number of frames in video
for frame in img_lists:
# pdb.set_trace()
imgs_first.append((os.path.join(video_path, frame), label))
### return video frame index #####
video_names.append(video_name)
index.append(np.ones(img_count) * id)
index = np.concatenate(index, axis=0)
# index = index.astype(int)
return imgs_first, index
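## A hedged usage sketch for the AFEW-style datasets above (the paths, the
## transform and the batch size are placeholders, not taken from the original):
#
#   import torchvision.transforms as transforms
#   val_transform = transforms.Compose([transforms.Resize((224, 224)),
#                                       transforms.ToTensor()])
#   val_set = VideoDataset('./afew/val_frames', './afew/val_list.txt',
#                          rectify_label=example_rectify_label,
#                          transform=val_transform)
#   val_loader = data.DataLoader(val_set, batch_size=32, shuffle=False)
#   img, target, video_id = val_set[0]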
## Data generators for the CK+ dataset (ten-fold cross-validation)
class TenFold_VideoDataset(data.Dataset):
def __init__(self, video_root='', video_list='', rectify_label=None, transform=None, fold=1, run_type='train'):
self.imgs_first, self.index = load_imgs_tenfold_totalframe(video_root, video_list, rectify_label, fold, run_type)
self.transform = transform
self.video_root = video_root
def __getitem__(self, index):
path_first, target_first = self.imgs_first[index]
img_cv2 = np.array(Image.open(path_first).convert("RGB"))
face=detector.detect_faces(img_cv2)
face.sort(key=operator.itemgetter('confidence'),reverse=True)
start_c=face[0]['keypoints']['left_eye']
end_c=face[0]['keypoints']['right_eye']
n_p=face[0]['keypoints']['nose']
y_s=max(0,min(start_c[1],end_c[1])-(n_p[1]-start_c[1]))
y_s_e=min(max(start_c[1],end_c[1])+(n_p[1]-start_c[1])//2,img_cv2.shape[0])
x_s=max(0,start_c[0]-(end_c[0]-start_c[0])//2)
x_s_e=min(end_c[0]+(end_c[0]-start_c[0])//2,img_cv2.shape[1])
img_first=Image.fromarray(img_cv2[y_s:y_s_e,x_s:x_s_e,:]).convert("RGB")
if self.transform is not None:
img_first = self.transform(img_first)
return img_first, target_first, self.index[index]
def __len__(self):
return len(self.imgs_first)
class TenFold_TripleImageDataset(data.Dataset):
def __init__(self, video_root='', video_list='', rectify_label=None, transform=None, fold=1, run_type='train'):
self.imgs_first, self.imgs_second, self.imgs_third, self.index = load_imgs_tsn_tenfold(video_root,video_list,rectify_label, fold, run_type)
self.transform = transform
self.video_root = video_root
def __getitem__(self, index):
path_first, target_first = self.imgs_first[index]
img_cv2_f = np.array(Image.open(path_first).convert("RGB"))
face=detector.detect_faces(img_cv2_f)
face.sort(key=operator.itemgetter('confidence'),reverse=True)
start_c=face[0]['keypoints']['left_eye']
end_c=face[0]['keypoints']['right_eye']
n_p=face[0]['keypoints']['nose']
y_s=max(0,min(start_c[1],end_c[1])-(n_p[1]-start_c[1]))
y_s_e=min(max(start_c[1],end_c[1])+(n_p[1]-start_c[1])//2,img_cv2_f.shape[0])
x_s=max(0,start_c[0]-(end_c[0]-start_c[0])//2)
x_s_e=min(end_c[0]+(end_c[0]-start_c[0])//2,img_cv2_f.shape[1])
img_first=Image.fromarray(img_cv2_f[y_s:y_s_e,x_s:x_s_e,:]).convert("RGB")
if self.transform is not None:
img_first = self.transform(img_first)
path_second, target_second = self.imgs_second[index]
img_cv2_s = np.array(Image.open(path_second).convert("RGB"))
face=detector.detect_faces(img_cv2_s)
face.sort(key=operator.itemgetter('confidence'),reverse=True)
start_c=face[0]['keypoints']['left_eye']
end_c=face[0]['keypoints']['right_eye']
n_p=face[0]['keypoints']['nose']
y_s=max(0,min(start_c[1],end_c[1])-(n_p[1]-start_c[1]))
y_s_e=min(max(start_c[1],end_c[1])+(n_p[1]-start_c[1])//2,img_cv2_s.shape[0])
x_s=max(0,start_c[0]-(end_c[0]-start_c[0])//2)
x_s_e=min(end_c[0]+(end_c[0]-start_c[0])//2,img_cv2_s.shape[1])
img_second=Image.fromarray(img_cv2_s[y_s:y_s_e,x_s:x_s_e,:]).convert("RGB")
if self.transform is not None:
img_second = self.transform(img_second)
path_third, target_third = self.imgs_third[index]
img_cv2_t = np.array(Image.open(path_third).convert("RGB"))
face=detector.detect_faces(img_cv2_t)
face.sort(key=operator.itemgetter('confidence'),reverse=True)
start_c=face[0]['keypoints']['left_eye']
end_c=face[0]['keypoints']['right_eye']
n_p=face[0]['keypoints']['nose']
y_s=max(0,min(start_c[1],end_c[1])-(n_p[1]-start_c[1]))
y_s_e=min(max(start_c[1],end_c[1])+(n_p[1]-start_c[1])//2,img_cv2_t.shape[0])
x_s=max(0,start_c[0]-(end_c[0]-start_c[0])//2)
x_s_e=min(end_c[0]+(end_c[0]-start_c[0])//2,img_cv2_t.shape[1])
img_third=Image.fromarray(img_cv2_t[y_s:y_s_e,x_s:x_s_e,:]).convert("RGB")
if self.transform is not None:
img_third = self.transform(img_third)
return img_first, img_second, img_third, target_first, self.index[index]
def __len__(self):
return len(self.imgs_first)
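## Both ten-fold loaders below expect video_list to be grouped by fold: a
## "<k>-fold\t<count>" header line followed by <count> "<video_dir> <label>"
## lines, as the inline comments ('1-fold\t31', 'S037/006 Happy', ...) suggest.
## An illustrative fragment (the subject ids and counts are made up):
##
##     1-fold\t31
##     S037/006 Happy
##     S042/006 Happy
##     ...
##     2-fold\t29
##     S005/001 Angry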
def load_imgs_tenfold_totalframe(video_root, video_list, rectify_label, fold, run_type):
imgs_first = list()
new_imf = list()
''' Make ten-fold list '''
with open(video_list, 'r') as imf:
imf = imf.readlines()
if run_type == 'train':
fold_ = list(range(1, 11))
fold_.remove(fold) # drop the held-out test fold, e.g. fold=1: [1,...,10] -> [2,...,10]
for i in fold_:
fold_str = str(i) + '-fold' # 1-fold
for index, item in enumerate(
imf): # 0, '1-fold\t31\n' in {[0, '1-fold\t31\n'], [1, 'S037/006 Happy\n'], ...}
if fold_str in item: # 1-fold in '1-fold\t31\n'
for j in range(index + 1, index + int(item.split()[1]) + 1): # (0 + 1, 0 + 31 + 1 )
new_imf.append(imf[j]) # imf[2] = 'S042/006 Happy\n'
if run_type == 'test':
fold_ = fold
fold_str = str(fold_) + '-fold'
for index, item in enumerate(imf):
if fold_str in item:
for j in range(index + 1, index + int(item.split()[1]) + 1):
new_imf.append(imf[j])
index = []
for id, line in enumerate(new_imf):
video_label = line.strip().split()
video_name = video_label[0] # name of video
label = rectify_label[video_label[1]] # label of video; a KeyError here means the label string is missing from rectify_label
video_path = os.path.join(video_root, video_name) # video_path is the path of each video
### collect every frame in this video directory ####
img_lists = os.listdir(video_path)
img_lists.sort() # sort frame file names in ascending order
img_lists = img_lists[ - int(round(len(img_lists))) : ] # NOTE: round(len(img_lists)) == len(img_lists), so this slice keeps every frame (a no-op as written)
img_count = len(img_lists) # number of frames in video
for frame in img_lists:
imgs_first.append((os.path.join(video_path, frame), label))
### return video frame index #####
index.append(np.ones(img_count) * id)
index = np.concatenate(index, axis=0)
return imgs_first, index
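## The slice noted above keeps every frame. If the intent was to keep only the
## last part of each CK+ sequence (the frames closest to the apex expression),
## a sketch would be (the 0.3 fraction is an assumption, not from the original):
#
#   keep = max(1, int(round(len(img_lists) * 0.3)))
#   img_lists = img_lists[-keep:]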
def load_imgs_tsn_tenfold(video_root, video_list, rectify_label, fold, run_type):
imgs_first = list()
imgs_second = list()
imgs_third = list()
new_imf = list()
''' Make ten-fold list '''
with open(video_list, 'r') as imf:
imf = imf.readlines()
if run_type == 'train':
fold_ = list(range(1, 11))
fold_.remove(fold) # drop the held-out test fold, e.g. fold=1: [1,...,10] -> [2,...,10]
for i in fold_:
fold_str = str(i) + '-fold' # 1-fold
for index, item in enumerate(
imf): # 0, '1-fold\t31\n' in {[0, '1-fold\t31\n'], [1, 'S037/006 Happy\n'], ...}
if fold_str in item: # 1-fold in '1-fold\t31\n'
for j in range(index + 1, index + int(item.split()[1]) + 1): # (0 + 1, 0 + 31 + 1 )
new_imf.append(imf[j]) # imf[2] = 'S042/006 Happy\n'
if run_type == 'test':
fold_ = fold
fold_str = str(fold_) + '-fold'
for index, item in enumerate(imf):
if fold_str in item:
for j in range(index + 1, index + int(item.split()[1]) + 1):
new_imf.append(imf[j])
''' Make triple-image list '''
index = []
for id, line in enumerate(new_imf):
video_label = line.strip().split()
video_name = video_label[0] # name of video
label = rectify_label[video_label[1]] # label of video
video_path = os.path.join(video_root, video_name) # video_path is the path of each video
### sample triple images within each video directory ####
img_lists = os.listdir(video_path)
img_lists.sort() # sort frame file names in ascending order
img_lists = img_lists[ - int(round(len(img_lists))):] # NOTE: as in load_imgs_tenfold_totalframe, this slice keeps every frame (a no-op as written)
img_count = len(img_lists) # number of frames in video
num_per_part = int(img_count) // 5 # note: the three samples below are drawn only from the first three fifths of the sorted frame list
if int(img_count) > 5:
for i in range(img_count):
# pdb.set_trace()
random_select_first = random.randint(0, num_per_part)
random_select_second = random.randint(num_per_part, 2 * num_per_part)
random_select_third = random.randint(2 * num_per_part, 3 * num_per_part)
img_path_first = os.path.join(video_path, img_lists[random_select_first])
img_path_second = os.path.join(video_path, img_lists[random_select_second])
img_path_third = os.path.join(video_path, img_lists[random_select_third])
imgs_first.append((img_path_first, label))
imgs_second.append((img_path_second, label))
imgs_third.append((img_path_third, label))
else:
for j in range(len(img_lists)):
img_path_first = os.path.join(video_path, img_lists[j])
img_path_second = os.path.join(video_path, random.choice(img_lists))
img_path_third = os.path.join(video_path, random.choice(img_lists))
imgs_first.append((img_path_first, label))
imgs_second.append((img_path_second, label))
imgs_third.append((img_path_third, label))
### return video frame index #####
index.append(np.ones(img_count) * id) # id: 0 : 379
index = np.concatenate(index, axis=0)
# index = index.astype(int)
# pdb.set_trace()
return imgs_first, imgs_second, imgs_third, index
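## A hedged usage sketch for the ten-fold CK+ datasets above (paths, fold
## number and transform are placeholders, not taken from the original code):
#
#   train_set = TenFold_TripleImageDataset('./ckplus/frames', './ckplus/ten_fold_list.txt',
#                                          rectify_label=example_rectify_label,
#                                          transform=val_transform,
#                                          fold=1, run_type='train')
#   test_set = TenFold_VideoDataset('./ckplus/frames', './ckplus/ten_fold_list.txt',
#                                   rectify_label=example_rectify_label,
#                                   transform=val_transform,
#                                   fold=1, run_type='test')
#   train_loader = data.DataLoader(train_set, batch_size=32, shuffle=True)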
Can you please point out the mistake?