Face Recognition

This article explains what face recognition is and how it differs from face detection, and briefly covers the theory behind it. Face recognition is the process of identifying or verifying a person’s identity from their face: it captures, analyzes, and compares patterns in the person’s facial details. Face detection is the essential preceding step, which locates human faces in images and videos.

What is Face Detection?

In computer vision, one essential problem is detecting objects in an image automatically, without human intervention. Face detection is the special case of this problem in which the objects to detect are human faces.

What is Face Recognition?

Face recognition is a method of identifying or verifying the identity of an individual using their face. Various algorithms can do face recognition, but their accuracy varies. Here we use face embeddings, in which each face is converted into a vector; the technique used to learn these vectors is called deep metric learning.


Let me further divide this process into three simple steps for easy understanding:

1. RetinaFace (used for Face Detection):

Face detection is the first step whenever we work with faces, so we use the RetinaFace architecture to detect them. It performs three face localization tasks together in a single-shot framework: face detection, 2D face alignment, and 3D face reconstruction. All three tasks are solved under one common constraint: every point regressed for them should lie on the image plane.

The model is trained to predict simultaneously:

    • Whether the current detection is a face or not (binary classification trained using softmax)
    • 5 landmark points (eyes, nose tip, and mouth corners)
    • 3D pose of the face
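
As a minimal sketch of the first prediction above, the snippet below turns a two-class softmax output into a face / not-face decision. The candidate tuple layout, the function names, and the 0.5 threshold are assumptions made for illustration; they are not RetinaFace’s actual output format.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def face_probability(background_logit, face_logit):
    """Probability of the 'face' class given the two class logits."""
    return softmax([background_logit, face_logit])[1]


def keep_faces(candidates, threshold=0.5):
    """Keep candidates whose face probability is at least `threshold`.

    `candidates` is a list of (background_logit, face_logit, landmarks)
    tuples. This layout is hypothetical, chosen only for the example.
    """
    kept = []
    for background_logit, face_logit, landmarks in candidates:
        score = face_probability(background_logit, face_logit)
        if score >= threshold:
            kept.append((score, landmarks))
    return kept
```

A real detector would typically also apply non-maximum suppression to the surviving candidates; that step is left out here to keep the sketch short.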

2. Face Alignment:

First, a face detector is used to find a face in an image. Face alignment then handles cases where the detected face does not match the model’s expected input. Identification is already a challenging problem, so alignment is used to make the model’s job a bit easier: if the face is transformed into a canonical pose (for example, with the tip of the nose at the center of the image), the model can focus on the important information straight away.
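
One simple form of alignment is to rotate the face so that the line between the eyes becomes horizontal. The sketch below does this for the landmark coordinates only; in a full pipeline the same rotation would also be applied to the image pixels. The landmark order (left eye, right eye, nose, mouth corners) and the function names are assumptions made for illustration.

```python
import math


def eye_roll_angle(left_eye, right_eye):
    """Angle (radians) of the line joining the eyes, relative to horizontal.

    Points are (x, y) pairs in image coordinates.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.atan2(dy, dx)


def rotate_points(points, angle, center):
    """Rotate (x, y) points by `angle` radians around `center`."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    cx, cy = center
    rotated = []
    for x, y in points:
        tx, ty = x - cx, y - cy
        rotated.append((cx + tx * cos_a - ty * sin_a,
                        cy + tx * sin_a + ty * cos_a))
    return rotated


def level_eyes(landmarks):
    """Rotate landmarks about the eye midpoint so the eyes are level.

    Assumes landmarks[0] is the left eye and landmarks[1] the right eye
    as they appear in the image (left eye has the smaller x).
    """
    left_eye, right_eye = landmarks[0], landmarks[1]
    angle = eye_roll_angle(left_eye, right_eye)
    center = ((left_eye[0] + right_eye[0]) / 2.0,
              (left_eye[1] + right_eye[1]) / 2.0)
    # Rotating by the negative angle cancels the tilt of the eye line.
    return rotate_points(landmarks, -angle, center)
```

Because this is a pure rotation, distances between landmarks are preserved; scaling and translating the face to a fixed template would be a further step.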

3. ArcFace (used for Feature Extraction):

ArcFace is a state-of-the-art face recognition approach that was presented at CVPR 2019. Now that we have cropped the face out of the image, we extract features from it using face embeddings. A neural network takes an image of the person’s face as input and outputs a vector that represents the most important features of the face. In machine learning, such a vector is called an embedding, so we call this one a face embedding. How does this help in recognizing the faces of different people?

During training, the network learns to output similar vectors for faces that look similar. For example, if I have several images of my face taken at different times, some features of my face may have changed, but not by much. The vectors associated with those images are therefore similar; in short, they lie very close together in the vector space.

To verify identity, the model takes two face images as input, and the distance between their embeddings indicates how likely they are to be the same person. This distance is calculated using cosine distance.

Cosine Distance = 1 – Cosine Similarity

Cosine similarity measures the similarity between two vectors of an inner product space. It is measured by the cosine of the angle between two vectors and determines whether two vectors are pointing in roughly the same direction. It is often used to measure document similarity in text analysis.

Smaller angles between vectors produce larger cosine values, indicating greater cosine similarity. For example, when two vectors have the same orientation, the angle between them is 0 and the cosine similarity is 1. Perpendicular vectors have a 90-degree angle between them and a cosine similarity of 0.
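
For two vectors a and b, cosine similarity is their dot product divided by the product of their lengths. Here is a minimal sketch in plain Python, using only the standard library; the function names are chosen for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """Cosine distance, defined as 1 minus cosine similarity."""
    return 1.0 - cosine_similarity(a, b)
```

Calling cosine_similarity([1, 2], [2, 4]) gives a value of approximately 1 because the vectors point in the same direction, while cosine_similarity([1, 0], [0, 1]) gives 0 because they are perpendicular. Vector length does not affect the result, only direction.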

To build a complete face recognition project, we start with data gathering: we capture the person’s face, create augmented versions of the image, and save the augmented images as a NumPy array. At recognition time, we use cosine similarity to recognize the face and return the result.
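
The recognition step can be sketched as a nearest-neighbor lookup over stored embeddings. In the sketch below, the gallery layout (a name mapped to a list of embedding vectors), the function names, and the 0.5 distance threshold are all assumptions made for illustration. A real system would use embeddings produced by the feature extraction network and a threshold tuned on validation data.

```python
import math


def cosine_distance(a, b):
    """1 minus the cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def identify(query, gallery, threshold=0.5):
    """Return (name, distance) for the closest enrolled embedding.

    `gallery` maps a person's name to a list of embeddings (hypothetical
    layout). If nothing is within `threshold`, the name is None.
    """
    best_name, best_distance = None, float("inf")
    for name, embeddings in gallery.items():
        for embedding in embeddings:
            distance = cosine_distance(query, embedding)
            if distance < best_distance:
                best_name, best_distance = name, distance
    if best_distance > threshold:
        return None, best_distance
    return best_name, best_distance


# Toy 3-dimensional vectors standing in for real face embeddings.
gallery = {
    "alice": [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0]],
    "bob": [[0.0, 1.0, 0.2]],
}
name, distance = identify([1.0, 0.15, 0.0], gallery)
```

Storing several embeddings per person, for example one per augmented image, gives the lookup more than one chance to match a new photo of the same face.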

Masked Face Recognition:

During the COVID-19 pandemic, almost everyone wears a face mask, which poses a huge challenge to face recognition. Traditional face recognition systems may not recognize masked faces effectively, but removing the mask for authentication increases the risk of infection. The widespread requirement, introduced in response to the pandemic, that people wear protective face masks in public places has driven interest in how face recognition technology deals with masked faces.