Once an identity walks in front of the camera, the system runs through several face recognition steps. Strictly speaking, these are separate processes rather than sequential phases, but for easier understanding we'll treat them as phases:
1. Head and face detection: The system marks the head with a green circle and the face with a blue broken-line square. A sighting is triggered once the initial face detection is done. Head detection serves as the basis for people counting and heat maps, while face detection and sighting feed the complete face recognition process. The first phase runs on a locally dedicated PC.
2. Landmark detection and face alignment: The system places 3 dots on the detected face in the area of the eyes and nose, signaling that it has located the face and knows its orientation as a 2D surface in a 360-degree space. Face alignment matters because the system only performs face recognition on a frontal face, and alignment can compensate only for in-plane rotation and scaling in order to normalize the detected face for the subsequent steps. In other words, in this phase the system scans the face and sets up the parameters needed for recognition. The AI processing in this phase still runs on a locally dedicated server.
3. Identification: The third phase consists of embedding calculation and attributes prediction. This is where the central embedder converts the biometric data collected during the sighting into a vector of 512 numbers. Each time an identity shows up in front of a camera, embedding calculation and attributes prediction are performed for every detection of sufficient quality within the sighting, so if the subject appears 10 times, the embedder creates a new vector each time. These predictions are aggregated per sighting to get a better estimate of the embedding vector and attributes, and the sighting's central embedding vector is then used in the identification process. The central embedding is computed on a locally dedicated server and afterwards sent to a cloud database, so all vectors ever created are stored on a dedicated cloud. Once the vector reaches the cloud database, it is compared with all existing vectors there to check whether any of them has a similar value. If a similar vector exists, the new vector joins it in the embedding vector collection that represents one identity. The system judges the similarity of vectors against thresholds set beforehand. Vectors that are only slightly similar are listed as similar identities. If the new vector differs from every existing vector, the system creates a new identity.
4. Age and gender prediction: The last phase in this cycle is where the system estimates the identity's age and gender based on face analysis. Age and gender prediction is performed by a neural network, a model trained specifically for this purpose, which means it is executed independently.
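
To make the identification logic of the third phase more concrete, here is a minimal Python sketch of how per-detection embeddings might be aggregated into one sighting vector and then matched against stored identities. It is an illustration under assumptions, not the system's actual implementation: the text above does not say how aggregation or similarity is computed, so the sketch assumes a normalized mean for aggregation and cosine similarity for comparison. The threshold values and the names `aggregate_sighting`, `IdentityStore`, `MATCH_THRESHOLD` and `SIMILAR_THRESHOLD` are hypothetical, and the in-memory store merely stands in for the cloud database.

```python
import math
from dataclasses import dataclass, field

# Hypothetical values: the text only says thresholds are set beforehand.
MATCH_THRESHOLD = 0.6     # at or above: same identity
SIMILAR_THRESHOLD = 0.4   # between the two: listed as a similar identity


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def aggregate_sighting(embeddings):
    """Combine the per-detection embeddings of one sighting into one vector.

    Assumption: aggregation is the mean of unit vectors, re-normalized.
    """
    if not embeddings:
        raise ValueError("a sighting needs at least one embedding")
    unit = [l2_normalize(e) for e in embeddings]
    dim = len(unit[0])
    mean = [sum(v[i] for v in unit) / len(unit) for i in range(dim)]
    return l2_normalize(mean)


def cosine_similarity(a, b):
    """For unit-length vectors, cosine similarity is the dot product."""
    return sum(x * y for x, y in zip(a, b))


@dataclass
class Identity:
    identity_id: int
    vectors: list = field(default_factory=list)  # embedding vector collection


class IdentityStore:
    """In-memory stand-in for the cloud database of identities."""

    def __init__(self):
        self.identities = []

    def identify(self, sighting_vector):
        """Return (identity, similar_identities, is_new) for a sighting vector."""
        scored = []
        for identity in self.identities:
            best = max(cosine_similarity(sighting_vector, v)
                       for v in identity.vectors)
            scored.append((best, identity))
        scored.sort(key=lambda pair: pair[0], reverse=True)

        # Slightly similar identities: above the lower threshold, below a match.
        similar = [ident for score, ident in scored
                   if SIMILAR_THRESHOLD <= score < MATCH_THRESHOLD]

        if scored and scored[0][0] >= MATCH_THRESHOLD:
            matched = scored[0][1]
            matched.vectors.append(sighting_vector)  # join the collection
            return matched, similar, False

        # No sufficiently similar vector exists: create a new identity.
        new_identity = Identity(identity_id=len(self.identities) + 1,
                                vectors=[sighting_vector])
        self.identities.append(new_identity)
        return new_identity, similar, True
```

A call such as `IdentityStore().identify(aggregate_sighting(list_of_embeddings))` would cover one sighting end to end. Comparing against every stored vector mirrors the description above but scales linearly with the number of vectors; a production system would more likely use an approximate nearest-neighbor index, which is outside the scope of this sketch.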