Classify documents automatically

Learn how incoming documents are automatically assigned to a document type.

Following the automatic or manual separation of documents, the appropriate document type must be determined for each document.
You can specify the document type at upload. In that case, classification is skipped and the document is assigned to the specified document type.
In addition, the Parashift Platform can classify documents automatically. Rules for recognizing the document type are derived from documents that have already been processed. For example, frequently occurring words and phrases are used to assign a document to the appropriate document type.
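The idea of deriving rules from frequently occurring words can be illustrated with a small sketch. This is not the Parashift Platform's actual implementation; the function names (`tokenize`, `build_profiles`), the sample texts, and the choice of the three most frequent words per document type are assumptions made purely for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its words (letters only, digits ignored)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def build_profiles(labeled_docs, top_n=3):
    """Collect the most frequent words per document type.

    labeled_docs is a list of (document_type, text) pairs taken from
    documents that have already been processed (hypothetical sample data).
    """
    counts = {}
    for doc_type, text in labeled_docs:
        counts.setdefault(doc_type, Counter()).update(tokenize(text))
    return {
        doc_type: [word for word, _ in counter.most_common(top_n)]
        for doc_type, counter in counts.items()
    }


# Hypothetical, already processed documents.
processed = [
    ("invoice", "Invoice number 1001 invoice date total amount"),
    ("invoice", "Invoice number 1002 total amount due"),
    ("delivery_note", "Delivery note number 55 delivery date"),
]

profiles = build_profiles(processed)
print(profiles)
```

A real system would typically also filter out very common words and weigh phrases, which this sketch leaves out.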

The classification model is continuously extended with new documents and becomes more stable once it has seen all variants of a document type. Until then, misclassifications are possible. If, for example, only German-language documents have initially been processed for the document type "invoice" and only English-language documents for "delivery notes", a German delivery note will probably be classified as an invoice, because its wording shares more with the German invoices than with the English delivery notes.
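The language effect described above can be reproduced with a toy word-overlap classifier. This is a simplified sketch under stated assumptions, not the platform's algorithm: the training texts, the `classify` function, and the overlap score are all hypothetical.

```python
import re


def tokens(text):
    """Return the set of lowercase words in the text (letters only)."""
    return set(re.findall(r"[^\W\d_]+", text.lower()))


# Hypothetical training data: invoices only in German,
# delivery notes only in English.
training = {
    "invoice": [
        "Rechnung Nummer 1001 für die Lieferung der Ware, Betrag und Datum",
    ],
    "delivery_note": [
        "Delivery note number 55 for the shipment of the goods, date and quantity",
    ],
}

# Vocabulary seen so far for each document type.
vocabulary = {
    doc_type: set().union(*(tokens(doc) for doc in docs))
    for doc_type, docs in training.items()
}


def classify(text):
    """Pick the document type whose known vocabulary overlaps most with the text."""
    words = tokens(text)
    overlap = {t: len(words & vocab) for t, vocab in vocabulary.items()}
    return max(overlap, key=overlap.get)


german_delivery_note = "Lieferschein Nummer 77 für die Lieferung der Ware, Datum und Menge"
print(classify(german_delivery_note))  # prints "invoice"
```

The German delivery note shares eight words with the German invoice and none with the English delivery note, so it is assigned to the wrong type until German delivery notes have also been processed.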

Simplified example of an automated document classification

In the example shown here, the document type "Invoice" and the document type "Delivery note" have been trained. The knowledge base for the individual document types is shown in simplified form. Invoices usually contain the term "Invoice number", whereas delivery notes usually contain the term "Delivery note number".

When a new document is uploaded, the words on the document are compared with the keywords of the trained document types. In this case, three words indicate an invoice and one word indicates a delivery note. Accordingly, the document is automatically interpreted as an invoice.
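The comparison in this simplified example can be sketched as a keyword count. This is a minimal illustration, not how the Parashift Platform is implemented. Only "invoice number" and "delivery note number" come from the example above; the remaining keywords, the sample document, and the function names `score_document` and `classify` are hypothetical.

```python
# Simplified knowledge base: keywords per trained document type.
# Only the first keyword of each list is taken from the example;
# the others are assumed for illustration.
KEYWORDS = {
    "Invoice": ["invoice number", "total amount", "payment terms"],
    "Delivery note": ["delivery note number", "delivery date", "quantity delivered"],
}


def score_document(text, keywords=KEYWORDS):
    """Count how many keywords of each document type occur in the text."""
    lowered = text.lower()
    return {
        doc_type: sum(1 for phrase in phrases if phrase in lowered)
        for doc_type, phrases in keywords.items()
    }


def classify(text, keywords=KEYWORDS):
    """Return the document type with the most keyword matches, or None if nothing matches."""
    scores = score_document(text, keywords)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Hypothetical uploaded document.
uploaded = (
    "Invoice number: 4711\n"
    "Delivery date: 2024-01-15\n"
    "Total amount: 120.00\n"
    "Payment terms: 30 days"
)

print(score_document(uploaded))  # {'Invoice': 3, 'Delivery note': 1}
print(classify(uploaded))        # Invoice
```

With three matches for "Invoice" and one for "Delivery note", the document is assigned to the invoice type. In this sketch a tie goes to whichever document type is listed first, which a production system would need to handle more carefully.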