Prediction (AI) vs Recognition (OCR)

What is a prediction, and in particular prediction confidence, and how does it differ from recognition and recognition confidence?

Introduction

When a document is uploaded to the Parashift Platform for Separation, Classification or Extraction, several tasks are performed beforehand to enable these functions. These pre-processing steps include image enhancement, image conversion and auto-rotation.

Another very important pre-processing step is reading all the text and barcodes on the document (often referred to as "OCR"). At Parashift we summarize all these reading steps, be it text, barcode or other data, under one central term: Recognition.

Based on the original image and the text from Recognition, we can then separate documents, classify them and extract data. For Extraction, the machine predicts (ideally) one or more candidates and picks one of them as the final Extraction result. This is called Prediction.
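The relationship between candidates and the final Extraction result can be sketched as follows. This is only an illustration: the `Candidate` class, the `pick_final` function and the rule of choosing the highest prediction confidence are assumptions made for this example, not the platform's documented behavior or API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """Hypothetical container for one predicted value of a field."""
    value: str         # predicted value, already in the output format
    confidence: float  # prediction confidence in percent (0-100)


def pick_final(candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the candidate with the highest prediction confidence.

    Assumption for this sketch: the 'best' candidate is simply the one
    with the highest confidence. Returns None if nothing was predicted.
    """
    if not candidates:
        return None
    return max(candidates, key=lambda c: c.confidence)


# Two made-up candidates for a date field.
candidates = [
    Candidate("2022-10-16", 93.0),
    Candidate("2022-10-01", 41.5),
]
final = pick_final(candidates)
print(final.value)
```

In this sketch the remaining candidates are not discarded; they would correspond to the alternatives a user can still choose from during validation.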

 

 

Recognition

Description: The process of reading all text and barcodes on the document (image -> text). A document can contain hundreds of recognition tokens; in the viewer these are shown with a grey background.
Value: The text on the document, literally as it is printed on the image.
Examples:
  • 16 October 2022
  • 12.345,67
  • He11o
Confidence: How confident the machine is that it has read the text on the image correctly (in percent).

Prediction

Description: The process of using the image and the recognized values to predict the data to be extracted. A field can have multiple predictions, called candidates. In the Extraction validation these are shown in a drop-down for each field.
Value: A predicted text, based on the recognition data, already transformed into the desired output format.
Examples:
  • 2022-10-16 (auto-transformed date)
  • 12345.67 (auto-transformed float)
  • Hello (configured transformation)
Confidence: How confident the machine is that it predicted the correct value (in percent).
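The example values above show the difference between a recognized value (as printed) and a predicted value (in the output format). A minimal sketch of such transformations is given below. The function names, the assumed date layout, the assumed number format ('.' for thousands, ',' for decimals) and the character mapping are all hypothetical and chosen only to reproduce the three examples; they do not describe how the platform implements its transformations.

```python
from datetime import datetime


def to_iso_date(recognized: str) -> str:
    """Turn a date like '16 October 2022' into ISO format.

    Assumes an English month name (the default C locale) and the
    layout 'day month year'.
    """
    return datetime.strptime(recognized, "%d %B %Y").date().isoformat()


def to_float(recognized: str) -> float:
    """Turn '12.345,67' into a float.

    Assumes '.' is the thousands separator and ',' the decimal separator.
    """
    return float(recognized.replace(".", "").replace(",", "."))


def fix_misread_digits(recognized: str) -> str:
    """Stand-in for a configured transformation: read '1' as 'l'."""
    return recognized.replace("1", "l")


print(to_iso_date("16 October 2022"))  # 2022-10-16
print(to_float("12.345,67"))           # 12345.67
print(fix_misread_digits("He11o"))     # Hello
```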

 

Validations, Warnings & Errors can be configured for both Recognition and Prediction data. This makes it possible to distinguish between a bad reading (low recognition confidence) and the machine being unsure about a prediction (low prediction confidence).
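The distinction between the two confidences can be illustrated with a small check. This is a sketch under assumptions: the `diagnose` function, the message texts and the 80 percent threshold are invented for this example and do not reflect the platform's actual configuration options.

```python
from typing import List


def diagnose(recognition_confidence: float,
             prediction_confidence: float,
             threshold: float = 80.0) -> List[str]:
    """Report which of the two confidences (in percent) is below the threshold.

    The threshold of 80 percent is an arbitrary value for illustration.
    """
    issues = []
    if recognition_confidence < threshold:
        # The text on the image may have been read incorrectly.
        issues.append("bad reading (recognition confidence low)")
    if prediction_confidence < threshold:
        # The text was read, but the machine is unsure it picked the right value.
        issues.append("unsure prediction (prediction confidence low)")
    return issues


print(diagnose(55.0, 95.0))
print(diagnose(98.0, 40.0))
print(diagnose(98.0, 95.0))
```

The first call reports only a bad reading, the second only an unsure prediction, and the third reports nothing, which is the case where neither a warning nor an error would be needed.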