Prediction (AI) vs Recognition (OCR)

What is a prediction, and in particular prediction confidence, and how does it differ from recognition and recognition confidence?

Introduction

When you upload a document to the Parashift Platform for Separation, Classification or Extraction, several pre-processing tasks run first to enable these functions. They include image enhancement, image conversion and auto-rotation.

Another very important pre-processing step is reading all the text and barcodes on the document (often referred to as "OCR"). At Parashift, we group all these reading steps under one central term, Recognition, whether the data is text, a barcode or something else.

Based on the original image and the text from Recognition, we can then separate documents, classify them, and extract data. For Extraction, the machine predicts one or more candidates (ideally at least one) and picks one of them as the final Extraction result; this is called Prediction.
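The selection step described above can be sketched as follows. This is a minimal illustration only: the `Candidate` class and the `pick_prediction` function are hypothetical names, and choosing the candidate with the highest confidence is an assumption, not a description of how the Parashift Platform actually selects a result.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """One possible value for a field (hypothetical structure)."""
    value: str
    confidence: float  # prediction confidence in percent, 0-100


def pick_prediction(candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the candidate with the highest confidence, or None if there are none.

    Assumption: "highest confidence wins" is used here purely for illustration.
    """
    if not candidates:
        return None
    return max(candidates, key=lambda c: c.confidence)


candidates = [
    Candidate("2022-10-16", 92.0),
    Candidate("2022-10-01", 41.5),
]
best = pick_prediction(candidates)
print(best.value if best else "no candidate")
```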

Recognition vs Prediction

Description
  Recognition: The process of reading all text and barcodes on the document (image -> text). Per document, there can be hundreds of recognition tokens; in the viewer, these are shown with a grey background.
  Prediction: The process of using the image and the recognized values to predict the data to be extracted. Per field, there can be multiple predictions, called candidates. In the Extraction validation, these are shown in a drop-down for each field.

Value
  Recognition: The text on the document, literally as it was printed on the image.
  Prediction: The predicted text, based on the recognition data and already transformed into the desired output format.

Examples
  Recognition:
  • 16 October 2022
  • 12.345,67
  • He11o
  Prediction:
  • 2022-10-16 (auto transformed date)
  • 12345.67 (auto transformed float)
  • Hello (configured transformation)

Confidence
  Recognition: How confident the machine is that it has read the text on the image correctly (in percent).
  Prediction: How confident the machine is that it has predicted the correct value (in percent).
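To make the difference between a recognized value and a predicted value concrete, the example pairs above can be reproduced with a few small transformations. This is only a sketch: the function names are hypothetical, the input formats are assumed to match the examples exactly (English month names, "." as thousands separator, "," as decimal separator), and the platform's real transformations are not shown here.

```python
from datetime import datetime


def to_iso_date(recognized: str) -> str:
    """Convert a date such as '16 October 2022' to ISO format (YYYY-MM-DD).

    Assumes English month names, i.e. the default C/English locale.
    """
    return datetime.strptime(recognized, "%d %B %Y").date().isoformat()


def to_float(recognized: str) -> float:
    """Convert a number that uses '.' for thousands and ',' for decimals."""
    return float(recognized.replace(".", "").replace(",", "."))


def fix_digit_lookalikes(recognized: str) -> str:
    """Replace digits that are easily confused with letters.

    Hypothetical stand-in for a configured transformation.
    """
    return recognized.translate(str.maketrans({"1": "l", "0": "o"}))


print(to_iso_date("16 October 2022"))
print(to_float("12.345,67"))
print(fix_digit_lookalikes("He11o"))
```

The lookalike rule is only safe for fields that are known to contain no digits, which is why such a transformation would be configured per field rather than applied everywhere.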

Validations, Warnings & Errors can be configured for both Recognition and Prediction data. This makes it possible to distinguish between a bad reading (low recognition confidence) and the machine being unsure about a prediction (low prediction confidence).
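The distinction can be illustrated with a small check that looks at both confidence values separately. The function name, the messages and the default thresholds of 80 percent are assumptions made for this sketch; they do not reflect the platform's configuration options.

```python
from typing import List


def diagnose(
    recognition_confidence: float,
    prediction_confidence: float,
    recognition_threshold: float = 80.0,  # hypothetical threshold, in percent
    prediction_threshold: float = 80.0,   # hypothetical threshold, in percent
) -> List[str]:
    """Return the issues found for one extracted field (empty list if none)."""
    issues = []
    if recognition_confidence < recognition_threshold:
        # The text itself may have been misread on the image.
        issues.append("bad reading")
    if prediction_confidence < prediction_threshold:
        # The text may be read correctly, but the machine is unsure it is the right value.
        issues.append("uncertain prediction")
    return issues


print(diagnose(55.0, 95.0))
print(diagnose(98.0, 40.0))
print(diagnose(98.0, 95.0))
```

A field that was read clearly but could belong to several places on the document would show up as an uncertain prediction only, whereas a smudged scan would typically show up as a bad reading.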