Learn how to manually validate the extraction of your documents
This article describes the "old" Extraction Validation interface, which will eventually be replaced by an updated version that is already available in beta.
After you upload a document, our automatic extraction captures as much information as possible.
Once this process is finished, the document shows up in the Extraction Overview (1), where you can open the Validation with the blue Extraction button (2).
In Extraction Validation, you can navigate through all pages of the document on the left-hand side.
- Page navigation - skim through pages and/or jump to the start or the end of your document.
- Rotation - If our preprocessing was unable to rotate your document, you can do so manually.
- Zoom in/out - If you need to take a closer look at your document, you can zoom in/out.
- Ruler - If you're validating pages full of line items, the ruler helps you keep your place on the page when you look over to the validation area.
- Help - Here you'll find a list of shortcuts to all features that come in handy when validating.
On the right-hand side, you'll find all fields that are part of the document type your document has been processed with.
The colored box and its value indicate how certain our "machine" is about its prediction:
- Green (by default 100-95%) - Extraction confidence is high enough that manual confirmation is not required
- Yellow (by default 95-30%) - Predictions are entered into each field, but they have to be confirmed manually
- Red (by default 30-0%) - Predictions are not entered; the value has to be captured and confirmed manually
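The default thresholds listed above can be read as a simple classification rule. The following is only an illustrative sketch, not the product's actual implementation: the names `Thresholds` and `classify` are hypothetical, and because the default ranges share their boundary values (95% and 30%), the sketch assumes that a boundary value falls into the higher tier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical container for the default confidence limits."""
    green: float = 0.95  # at or above this: no manual confirmation needed
    red: float = 0.30    # below this: prediction is not entered at all


def classify(confidence: float, limits: Thresholds = Thresholds()) -> str:
    """Map a confidence between 0.0 and 1.0 to a color tier.

    Assumption: boundary values (exactly 95% or 30%) belong to the
    higher tier, since the article does not say which side they fall on.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if confidence >= limits.green:
        return "green"   # accepted without manual confirmation
    if confidence >= limits.red:
        return "yellow"  # entered, but must be confirmed manually
    return "red"         # not entered, must be captured manually


if __name__ == "__main__":
    for value in (0.99, 0.60, 0.10):
        print(f"{value:.0%} -> {classify(value)}")
```

Since the thresholds are described as defaults, the sketch keeps them in a separate object so that different limits could be passed in, for example `classify(0.9, Thresholds(green=0.85))`.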
To enter a value into a field, the value must be linked to its position on the document. You create this link by drawing a box over the corresponding value on your document. This gives us the coordinates of where the value appears, which is very valuable information for our machine learning models.
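To make the idea of linking a value to its coordinates more concrete, here is a minimal sketch of how a drawn box could be represented. All names (`BoundingBox`, `FieldAnnotation`, `from_drag`) and the coordinate convention are assumptions made for illustration; the article does not describe the actual data format.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical box on a page: top-left (x0, y0), bottom-right (x1, y1)."""
    page: int
    x0: float
    y0: float
    x1: float
    y1: float

    @classmethod
    def from_drag(cls, page: int, start: Tuple[float, float],
                  end: Tuple[float, float]) -> "BoundingBox":
        # A box can be drawn in any direction, so normalize the corners.
        (sx, sy), (ex, ey) = start, end
        return cls(page, min(sx, ex), min(sy, ey), max(sx, ex), max(sy, ey))


@dataclass(frozen=True)
class FieldAnnotation:
    """Hypothetical link between a field's value and its location."""
    field_name: str
    value: str
    box: BoundingBox


if __name__ == "__main__":
    # Dragging from bottom-right to top-left still yields a normalized box.
    box = BoundingBox.from_drag(page=1, start=(120, 80), end=(40, 20))
    print(FieldAnnotation("FIELD 1", "example value", box))
```

Storing the page number together with the corners reflects that a document can span several pages; whether the real system uses pixels, points, or relative coordinates is not stated here.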
If a prediction or the validated value is present in a field, you can see where it has been found by hovering over the field itself. A blue line will lead to the validated value, which additionally has a blue outline.
Depending on the document type configuration, some fields/fieldsets can be captured multiple times by clicking "Add Item" (1).
In the above case, a fieldset of 2 fields (FIELD 1 & FIELD 2) has been added 3 times.
These fieldsets can be reordered (2) or removed (3).
Once all fields have been validated (green), the "Done" button becomes available and you can finish processing your document.