OCR technology in identity verification




Today we are talking about OCR technology in remote identity verification solutions.

⚠️⚠️⚠️ A reminder

OCR, or Optical Character Recognition, converts the text in digital images into editable text. It can extract written information from scanned documents, JPG files, handwritten text, or printed documents. OCR captures the information in a document, extracts it, and transforms the data into a machine-readable format for further processing and analysis.

Why is OCR technology important in online identity verification?

Identity verification solutions combine different technologies, such as artificial intelligence, machine learning, OCR, and liveness detection. This combination makes it possible to automate the identity verification process and move it to the digital environment.

OCR technology is important in identity verification because it extracts the relevant data from the identity document.

Because OCR technology is part of online identity verification solutions, users only have to show their identity document to their device's camera. The OCR automatically captures all the information needed for verification, so users do not have to type anything. Without this technology, they would have to fill in their data manually.

OCR streamlines the data extraction process, resulting in reduced registration times and a smoother customer experience.

Furthermore, OCR technology is crucial for determining whether the document being verified is authentic. Combined with artificial intelligence and machine learning, it is the part of the solution responsible for extracting the data that is checked against official document templates and looked up in databases. Without a technology capable of converting the information on identity documents into a form the computer system can process, the verification process would not be possible.

The challenge of OCR technology in identity verification

OCR technology was initially designed to capture and extract data from black text on a white background.

Implementing OCR in identity verification is a challenge because the task is much more complex and variable (each country's identity documents are different): the system has to extract complex data fields in the presence of small fonts, holograms, watermarks, varied backgrounds, and so on. Beyond identifying the important information, it also has to structure and verify it.
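
To give a feel for what "verifying" extracted data can mean, here is a minimal Python sketch of one widely used consistency check: the check digit that machine-readable travel documents print next to certain fields, as defined in ICAO Doc 9303. This is only an illustration under that assumption, not a description of how any particular product verifies documents; the function names are made up for the example.

```python
def mrz_check_digit(field: str) -> int:
    """Check digit per ICAO Doc 9303: weights 7, 3, 1 repeating, sum modulo 10."""
    weights = (7, 3, 1)
    total = 0
    for i, ch in enumerate(field):
        if "0" <= ch <= "9":
            value = int(ch)
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10   # A=10, B=11, ..., Z=35
        elif ch == "<":
            value = 0                          # filler character
        else:
            raise ValueError(f"unexpected character {ch!r}")
        total += value * weights[i % 3]
    return total % 10


def field_is_consistent(field: str, printed_digit: str) -> bool:
    """True when the OCR-read field agrees with the check digit printed beside it."""
    if len(printed_digit) != 1 or not ("0" <= printed_digit <= "9"):
        return False
    return mrz_check_digit(field) == int(printed_digit)
```

For instance, a document number read as L898902C3 with printed check digit 6 passes this check, while the same field with the zero misread as the letter O fails it, which is exactly the kind of OCR slip the check is meant to catch.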

OCR technologies, as their name suggests, are based on optical character recognition: they distinguish characters (letters, numbers, symbols, etc.). To recognize those characters, however, the technology must be trained in advance.

During system training, the elements the OCR needs to recognize (for example, fonts, holograms, and backgrounds) are stored, along with the templates of the identity documents to be verified (for example, the Spanish ID).

In an identity verification process, when users capture a photo of their identity document with a mobile phone or webcam, the system must already have the official template of that document registered, along with its characters and other elements. Only then will it be able to detect what type of document it is and correctly structure the information it has to extract.
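
The two steps described above, detecting the document type from a registered template and then structuring the extracted text, can be sketched in Python as follows. Everything here is hypothetical: the template names, marker strings, and field formats are invented for illustration, and a real system would rely on layout and visual features rather than on text labels alone.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentTemplate:
    name: str      # identifier of the document type
    marker: str    # text expected somewhere on this document type
    fields: dict   # field name -> regex with exactly one capture group


# Hypothetical registered templates (not real document layouts).
TEMPLATES = [
    DocumentTemplate(
        name="example_id_card",
        marker="EXAMPLE NATIONAL ID",
        fields={
            "surname": r"^SURNAME[:\s]+(.+)$",
            "document_number": r"^DOC(?:UMENT)?\s*NO\.?[:\s]+([A-Z0-9]+)$",
            "date_of_birth": r"^DATE OF BIRTH[:\s]+(\d{2}[./-]\d{2}[./-]\d{4})$",
        },
    ),
    DocumentTemplate(
        name="example_driving_licence",
        marker="EXAMPLE DRIVING LICENCE",
        fields={
            "surname": r"^1\.\s*(.+)$",
            "licence_number": r"^5\.\s*([A-Z0-9]+)$",
        },
    ),
]


def detect_template(lines):
    """Return the first registered template whose marker appears in the OCR text."""
    text = "\n".join(lines).upper()
    for template in TEMPLATES:
        if template.marker in text:
            return template
    return None


def structure(lines):
    """Turn raw OCR lines into a record, or None if no template is registered."""
    template = detect_template(lines)
    if template is None:
        return None
    record = {"document_type": template.name}
    for field_name, pattern in template.fields.items():
        record[field_name] = None  # stays None when the field is not found
        for line in lines:
            match = re.search(pattern, line.strip(), flags=re.IGNORECASE)
            if match:
                record[field_name] = match.group(1).strip()
                break
    return record
```

Note how a document with no registered template yields no record at all: without the template, the sketch has no way to know which piece of text is which field.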

Basically, OCR technologies analyze documents and images pixel by pixel to find the characters they have learned. In that sense they work similarly to facial recognition technologies, because the system also looks for matches with the information it has stored.
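
As a toy illustration of looking for stored characters pixel by pixel, the Python sketch below slides a window across a tiny black-and-white image and compares it with a few stored glyph bitmaps. The glyph shapes, sizes, and the 0.9 threshold are assumptions made for this example; production OCR engines typically use trained models rather than exact bitmap comparison.

```python
# Stored 3x5 glyph bitmaps: '#' is an ink pixel, '.' is background.
GLYPHS = {
    "0": ["###", "#.#", "#.#", "#.#", "###"],
    "1": [".#.", "##.", ".#.", ".#.", "###"],
    "7": ["###", "..#", ".#.", ".#.", ".#."],
}
GLYPH_W, GLYPH_H = 3, 5


def match_score(image, x, glyph):
    """Fraction of pixels in the window starting at column x that equal the glyph."""
    hits = sum(
        image[row][x + col] == glyph[row][col]
        for row in range(GLYPH_H)
        for col in range(GLYPH_W)
    )
    return hits / (GLYPH_W * GLYPH_H)


def read_line(image, threshold=0.9):
    """Scan a 5-pixel-high image left to right and return the recognized characters."""
    width = len(image[0])
    text = []
    x = 0
    while x + GLYPH_W <= width:
        best_char, best_score = max(
            ((char, match_score(image, x, glyph)) for char, glyph in GLYPHS.items()),
            key=lambda pair: pair[1],
        )
        if best_score >= threshold:
            text.append(best_char)
            x += GLYPH_W   # skip past the recognized character
        else:
            x += 1         # no match here, slide one pixel to the right
    return "".join(text)


# A small image containing the characters 1, 0, 7 separated by blank columns.
IMAGE = [
    ".#..###.###",
    "##..#.#...#",
    ".#..#.#..#.",
    ".#..#.#..#.",
    "###.###..#.",
]
```

Calling read_line(IMAGE) walks the image column by column and accepts a character only where nearly every pixel agrees with a stored glyph, so a single noisy pixel is tolerated but an unknown shape is skipped.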

How we solve this challenge at Alice

At Alice we strive to understand the real problems of the technologies we develop in order to mold our solutions to reality. We have been developing technology and adapting to real problems for more than 10 years, which is why we treat every technological challenge as a dynamic element. As with any technology, the key lies in continuous improvement and adaptation to circumstances.

Specifically, in the case of OCR, we have two aces up our sleeve.

  1. We continually evaluate, analyze, and train our algorithms, allowing us to detect and extract characters with 90% accuracy. Our OCR technology can recognize more than 150 languages, more than 120 identity documents, more than 301 driving licenses, etc.
  2. We own our technology, which allows us to be very agile when making changes and incorporating new data. We can add any identity document in less than 24 hours!

If you want to know more about us, follow us on LinkedIn!
