Relationships & Structure of Documents & Fields

Introduction

The Parashift Platform is designed to extract information from any kind of document. Depending on the use case, a client can have one, two, or hundreds of document types (configurations) active, each with its own set of data points to extract. Some of these data points are shared between document types, others are unique to one.

A standardized way to output all of this captured information is therefore essential. This article describes the relationship between documents and fields and gives some examples of how best to download results via the API.

Relationships & Structure

In a nutshell, the following diagram shows how the different objects (documents and document_fields) are linked with each other.

Main takeaways

One document consists of zero or more document_fields, while each document_field is always linked to exactly one document.
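
To make the one-to-many relationship concrete, here is a minimal sketch of it as plain Python data classes. This is not the platform's actual schema: the class names and the attributes `identifier` and `value` are hypothetical and chosen only for illustration. The only parts taken from this article are that a document holds zero or more document_fields and that each document_field points back to exactly one document.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DocumentField:
    id: str
    document_id: str      # each field is linked to exactly one document
    identifier: str       # hypothetical attribute, for illustration only
    value: str = ""       # hypothetical attribute, for illustration only


@dataclass
class Document:
    id: str
    # A document may have zero or more fields.
    document_fields: List[DocumentField] = field(default_factory=list)

    def add_field(self, field_id: str, identifier: str, value: str = "") -> DocumentField:
        """Create a field that is linked back to this document."""
        new_field = DocumentField(
            id=field_id, document_id=self.id, identifier=identifier, value=value
        )
        self.document_fields.append(new_field)
        return new_field


doc = Document(id="123456")
doc.add_field("1", "invoice_number", "INV-001")
doc.add_field("2", "total_amount", "99.50")

empty_doc = Document(id="123457")  # a document without any fields is valid
```

Because every field carries its `document_id`, the link can be followed in both directions: from a document to its fields and from a field back to its document.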

Example API Calls

One document with its document_fields

GET /documents/123456/?include=document_fields

Alternative: All document_fields belonging to one document

GET /document_fields/?filter[document_id]=123456&include=document
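
The two calls above reach the same data from opposite sides of the relationship. As a rough sketch, the request paths could be assembled in Python as follows. The helper `build_path` is hypothetical and not part of any official client; it only builds the path and query string shown in this article and does not send a request.

```python
from urllib.parse import urlencode


def build_path(resource, resource_id=None, params=None):
    """Build a request path such as /documents/123456/?include=document_fields.

    Hypothetical helper for illustration; it performs no network access.
    """
    if resource_id is None:
        path = f"/{resource}/"
    else:
        path = f"/{resource}/{resource_id}/"
    if params:
        # Keep the square brackets readable instead of percent-encoding them.
        path += "?" + urlencode(params, safe="[]")
    return path


# Start from the document and pull in its fields.
by_document = build_path("documents", 123456, {"include": "document_fields"})

# Start from the fields and filter by the document they belong to.
by_fields = build_path(
    "document_fields",
    params={"filter[document_id]": 123456, "include": "document"},
)

print(by_document)
print(by_fields)
```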

One document with its document_fields and their extraction_candidates

GET /documents/123456/?include=document_fields&extra_fields[document_fields]=extraction_candidates

All the documents and their classification_candidates

GET /documents/?extra_fields[documents]=classification_candidates
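
The extra_fields parameter in the calls above is keyed by resource type. The following sketch shows one way to assemble such paths in Python. The function `documents_path` is hypothetical, and joining several names with commas is an assumption that this article does not confirm; the examples here only ever pass a single name per parameter.

```python
from urllib.parse import urlencode


def documents_path(document_id=None, include=(), extra_fields=None):
    """Build a /documents/ path with optional include and extra_fields parameters.

    Hypothetical helper for illustration; it performs no network access.
    """
    params = {}
    if include:
        # Assumption: multiple values would be comma-separated.
        params["include"] = ",".join(include)
    for resource_type, names in (extra_fields or {}).items():
        params[f"extra_fields[{resource_type}]"] = ",".join(names)

    base = "/documents/" if document_id is None else f"/documents/{document_id}/"
    if not params:
        return base
    # Keep brackets and commas readable instead of percent-encoding them.
    return base + "?" + urlencode(params, safe="[],")


one_document = documents_path(
    123456,
    include=["document_fields"],
    extra_fields={"document_fields": ["extraction_candidates"]},
)
all_documents = documents_path(
    extra_fields={"documents": ["classification_candidates"]}
)

print(one_document)
print(all_documents)
```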

One document with recognition_text (OCR)

GET /documents/123456/?extra_fields[documents]=recognition_text
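
As a sketch of how such a call might be prepared in Python using only the standard library: the base URL `https://api.example.com` is a placeholder, and the Bearer token header is an assumption about authentication that this article does not cover. Check the API documentation for the real host and authentication scheme. The request object is only constructed here, not sent.

```python
from urllib.request import Request

# Placeholder host, NOT the real API address (assumption for illustration).
BASE_URL = "https://api.example.com"


def recognition_text_request(document_id, token):
    """Prepare (but do not send) a GET request for one document's recognition_text."""
    url = f"{BASE_URL}/documents/{document_id}/?extra_fields[documents]=recognition_text"
    # Assumption: token-based Bearer authentication.
    headers = {"Authorization": f"Bearer {token}"}
    return Request(url, headers=headers, method="GET")


req = recognition_text_request(123456, "YOUR_API_TOKEN")
print(req.get_method(), req.full_url)
```

To actually send the request, the object can be passed to `urllib.request.urlopen(req)` once the real base URL and credentials are in place.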