Each identity is stored in a cloud database as a vector of 512 numbers. The central embedder takes the biometric data from a face and converts it into this vector. The vector is therefore a direct representation of the biometric data, not a randomly generated number.
Once the embedder has created a vector, the vector is sent to the cloud database where all vectors are stored. The database is queried for any existing vector whose value is similar to the new one. This step determines whether the detected subject is an existing identity or a new one. Data distance, together with a predefined threshold, is what the system uses to define, merge, differentiate and identify vectors.
The data distance is shown in the "Similar" column under Directory in the Vision GUI and is primarily used to compare how far apart vectors are.
In this sense, the data distance expresses the similarity or difference between vectors or, in the case of "Similar Identity", the similarity or difference between the listed identities. The closer the data distance is to 0, the more closely the vectors of the compared identities match, and the more similar the identities are. The further it is from 0, the greater the difference between the compared vectors.
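To make the idea of data distance concrete, here is a minimal sketch. The text above does not say which metric the system uses, so Euclidean distance is an assumption here, and the function name data_distance and the short 4-number vectors are purely illustrative stand-ins for the real 512-number vectors.

```python
import math

def data_distance(a, b):
    """Distance between two vectors of equal length.

    Assumption: Euclidean distance. The actual metric used by the
    system is not stated in this article.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.dist(a, b)

# Toy 4-number vectors stand in for the real 512-number vectors.
stored = [0.1, 0.2, 0.3, 0.4]
same_face = [0.1, 0.2, 0.3, 0.4]
other_face = [0.9, 0.1, 0.0, 0.5]

print(data_distance(stored, same_face))   # identical vectors: distance 0
print(data_distance(stored, other_face))  # different vectors: larger distance
```

Identical vectors give a distance of 0, and the distance grows as the vectors diverge, which matches the behavior described above.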
1. If data distance is lower than 0.6 - A new subject (face) is detected and the central embedder creates a vector from its biometric data. The new vector is compared with all existing vectors in the database. The database finds an existing vector close in value to the new one and applies the predefined threshold to decide which action to take. If the data distance between those vectors is below the threshold of 0.6, the system responds "This is the same identity".
After the response, the system automatically merges the two vectors, and in the GUI the user sees that the subject was recognized as an existing identity. The new sighting, along with its metadata, is added to the sightings of the existing identity. The threshold of 0.6 is a configurable parameter.
2. If data distance is above 0.6 - A new subject (face) is detected and the central embedder creates a vector from its biometric data. The new vector is compared with all existing vectors in the database. The database finds the existing vector closest in value to the new one and applies the predefined threshold to decide which action to take. If the data distance between those vectors is above 0.6, then the system responds "This is the new and possibly similar identity".
After the response, the system automatically creates a new identity and, if the data distance is close to 0.6, lists it under "Similar" in Directory (GUI). The sighting and all metadata go to the new identity, and the user can choose to merge the two identities manually in the GUI.
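The two cases above can be sketched as a small decision function. This is only an illustration under stated assumptions, not the system's actual implementation: Euclidean distance is assumed as the metric, the names classify, MERGE_THRESHOLD and SIMILAR_MARGIN are hypothetical, and the margin that decides what counts as "close to 0.6" is an invented value, since the article does not define it. The article also does not say what happens at exactly 0.6; the sketch treats that case as a new identity.

```python
import math

MERGE_THRESHOLD = 0.6  # configurable threshold from the text above
SIMILAR_MARGIN = 0.2   # hypothetical: how far above 0.6 still counts as "close"

def classify(new_vector, database):
    """Decide what to do with a newly created vector.

    database maps an identity id to its stored vector.
    Returns (decision, nearest_identity_id, distance).
    Assumption: Euclidean distance is the data distance.
    """
    if not database:
        return ("new", None, None)

    # Find the stored vector closest to the new one.
    nearest_id = min(database, key=lambda i: math.dist(new_vector, database[i]))
    distance = math.dist(new_vector, database[nearest_id])

    if distance < MERGE_THRESHOLD:
        # Case 1: same identity, the sighting is merged automatically.
        return ("merge", nearest_id, distance)
    if distance < MERGE_THRESHOLD + SIMILAR_MARGIN:
        # Case 2, close to the threshold: new identity listed under "Similar".
        return ("new_similar", nearest_id, distance)
    # Case 2, far from the threshold: plain new identity.
    return ("new", None, distance)

# Toy 2-number vectors stand in for the real 512-number vectors.
db = {"id-1": [0.0, 0.0], "id-2": [1.0, 1.0]}
print(classify([0.1, 0.0], db)[0])  # merge
print(classify([0.0, 0.7], db)[0])  # new_similar
print(classify([5.0, 5.0], db)[0])  # new
```

Keeping the threshold and the margin as named constants mirrors the fact that 0.6 is configurable: changing it shifts the boundary between automatic merging and creation of a new identity.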