What is a batch and do I need to know about them?
Batch attributes, optional attributes (extra_fields), relationships and example API calls
Introduction
The Parashift Platform can separate an incoming file (e.g. a 20-page PDF) into multiple smaller documents (e.g. 5 documents with 4 pages each). A "batch" keeps the relation between the originally uploaded file(s) and the documents created from them.
A document always belongs to exactly one batch, while a batch contains at least one and often several documents.
Depending on your use case, you either rely heavily on batches (you use separation) or don't need them at all (every uploaded document is already separated correctly).
Attributes
Name | Type | Writable | Details | Links |
name | string | on create | | |
status | string | no | Allowed values: pending, in_progress, done, failed | |
upload_configuration | string | on create | | |
validation_required | boolean | no | | |
created_at | datetime | no | | |
updated_at | datetime | no | | |
Optional Attributes (extra_fields)
none
Relationships (include)
Officially none.
There is no public relationship in the API between batch and document. A common use case, however, is to retrieve all documents that belong to one batch.
To do this, query the documents endpoint and filter on the batch ID, e.g.
GET /documents/?filter[batch_id]=123456
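The filter above can be assembled programmatically. The following is a minimal sketch, not an official client: the base URL `https://api.example.com` and the helper name `documents_for_batch_url` are assumptions made for illustration, so substitute the API root you actually use.

```python
from urllib.parse import urlencode

# Hypothetical base URL (an assumption); replace it with your real API root.
BASE_URL = "https://api.example.com"


def documents_for_batch_url(batch_id: str, base_url: str = BASE_URL) -> str:
    """Build the URL that lists all documents belonging to one batch."""
    # Keep the square brackets unescaped so the query matches filter[batch_id]=...
    query = urlencode({"filter[batch_id]": batch_id}, safe="[]")
    return f"{base_url}/documents/?{query}"


url = documents_for_batch_url("123456")
print(url)
```

The helper only builds the request URL; sending it (with whatever authentication your setup requires) is left to the HTTP client of your choice.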
Example API Calls
Show single batch
GET /batches/123456
response
{
  "data": {
    "id": "123456",
    "type": "batches",
    "attributes": {
      "created_at": "2022-03-10T09:38:53.532823Z",
      "name": "Batch-manual.pdf",
      "status": "in_progress",
      "tenant_id": "543",
      "updated_at": "2022-03-10T09:39:09.841364Z",
      "upload_configuration": "client",
      "validation_required": true
    }
  },
  "meta": {}
}
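A response like the one above can be mapped onto a small typed object. This is a sketch under stated assumptions: the `Batch` dataclass and the helpers `parse_timestamp` and `batch_from_resource` are hypothetical names, and only the attributes listed in the table are mapped.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Batch:
    """Hypothetical container for the batch attributes documented above."""
    id: str
    name: Optional[str]  # name can be null in list responses
    status: str
    upload_configuration: str
    validation_required: bool
    created_at: datetime
    updated_at: datetime


def parse_timestamp(value: str) -> datetime:
    # A trailing "Z" means UTC; older Python versions need an explicit offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def batch_from_resource(resource: dict) -> Batch:
    """Convert one entry of "data" into a Batch."""
    attrs = resource["attributes"]
    return Batch(
        id=resource["id"],
        name=attrs["name"],
        status=attrs["status"],
        upload_configuration=attrs["upload_configuration"],
        validation_required=attrs["validation_required"],
        created_at=parse_timestamp(attrs["created_at"]),
        updated_at=parse_timestamp(attrs["updated_at"]),
    )


# The sample response from this article, used as test input.
response_body = """
{"data": {"id": "123456", "type": "batches", "attributes": {
  "created_at": "2022-03-10T09:38:53.532823Z", "name": "Batch-manual.pdf",
  "status": "in_progress", "tenant_id": "543",
  "updated_at": "2022-03-10T09:39:09.841364Z",
  "upload_configuration": "client", "validation_required": true}},
 "meta": {}}
"""

batch = batch_from_resource(json.loads(response_body)["data"])
print(batch.id, batch.name, batch.status)
```

Parsing the timestamps into timezone-aware `datetime` objects makes it easy to compare `created_at` and `updated_at`, for example to see how long a batch has been in progress.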
List batches
GET /batches/
response
{
  "data": [
    {
      "id": "111111",
      "type": "batches",
      "attributes": {
        "created_at": "2020-11-25T19:24:01.175038Z",
        "name": null,
        "status": "in_progress",
        "tenant_id": "543",
        "updated_at": "2022-03-04T14:10:00.661514Z",
        "upload_configuration": "client",
        "validation_required": true
      }
    },
    {
      "id": "222222",
      "type": "batches",
      "attributes": {
        "created_at": "2020-12-08T09:55:03.084905Z",
        "name": null,
        "status": "done",
        "tenant_id": "543",
        "updated_at": "2021-01-07T17:07:40.399357Z",
        "upload_configuration": "client",
        "validation_required": true
      }
    }
  ],
  "meta": {}
}
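Once the list response is decoded, a typical next step is to see which batches are still being processed. The sketch below groups batch IDs by the status values listed in the attributes table. The function name `group_ids_by_status` is an assumption, and the sample data is trimmed to the fields the function reads.

```python
# Status values as listed in the attributes table of this article.
ALLOWED_STATUSES = ("pending", "in_progress", "done", "failed")


def group_ids_by_status(list_response: dict) -> dict:
    """Map each status to the IDs of the batches currently in that status."""
    groups = {status: [] for status in ALLOWED_STATUSES}
    for resource in list_response["data"]:
        status = resource["attributes"]["status"]
        # setdefault keeps the function tolerant of an unexpected status value.
        groups.setdefault(status, []).append(resource["id"])
    return groups


# Trimmed version of the list response shown above.
list_response = {
    "data": [
        {"id": "111111", "type": "batches", "attributes": {"status": "in_progress"}},
        {"id": "222222", "type": "batches", "attributes": {"status": "done"}},
    ],
    "meta": {},
}

groups = group_ids_by_status(list_response)
print(groups["in_progress"], groups["done"])
```

Pre-filling the dictionary with every allowed status means a caller can read `groups["failed"]` without first checking whether any batch failed.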
Recommended Reading
I strongly recommend the following articles, which go into detail about how a batch relates to documents, pages and especially input_files.
Also, check out our Postman API Documentation