Getting Started

Introduction

The Rossum API allows you to programmatically access and manage your organization's Rossum data and account information. The API allows you to do the following programmatically:

On this page, you will find an introduction to the API usage from a developer perspective, and a reference to all the API objects and methods.

Developer Resources

There are several other key resources related to implementing, integrating and extending the Rossum platform:

Quick API Tutorial

For a quick tutorial on how to authenticate, upload a document and export extracted data, see the sections below. If you want to skip this quick tutorial, continue directly to the Overview section.

It is a good idea to go through the introduction to the Rossum platform on the Developer Portal first to make sure you are up to speed on the basic Rossum concepts.

If you run into trouble, feel free to contact us at support@rossum.ai.

Install curl tool

Test curl is installed properly

curl https://api.elis.rossum.ai/v1
{"organizations":"https://api.elis.rossum.ai/v1/organizations","workspaces":"htt
ps://api.elis.rossum.ai/v1/workspaces","schemas":"https://api.elis.rossum.ai/v1/
schemas","connectors":"https://api.elis.rossum.ai/v1/connectors","inboxes":"http
s://api.elis.rossum.ai/v1/inboxes","queues":"https://api.elis.rossum.ai/v1/queue
s","documents":"https://api.elis.rossum.ai/v1/documents","users":"https://api.el
is.rossum.ai/v1/users","groups":"https://api.elis.rossum.ai/v1/groups","annotati
ons":"https://api.elis.rossum.ai/v1/annotations","pages":"https://api.elis.rossu
m.ai/v1/pages"}

All code samples included in this API documentation use curl, the command-line data transfer tool. On MS Windows 10, macOS and most Linux distributions, curl should already be pre-installed. If not, please download it from curl.haxx.se.

Optionally use jq tool to pretty-print JSON output

curl https://api.elis.rossum.ai/v1 | jq
{
  "organizations": "https://api.elis.rossum.ai/v1/organizations",
  "workspaces": "https://api.elis.rossum.ai/v1/workspaces",
  "schemas": "https://api.elis.rossum.ai/v1/schemas",
  "connectors": "https://api.elis.rossum.ai/v1/connectors",
  "inboxes": "https://api.elis.rossum.ai/v1/inboxes",
  "queues": "https://api.elis.rossum.ai/v1/queues",
  "documents": "https://api.elis.rossum.ai/v1/documents",
  "users": "https://api.elis.rossum.ai/v1/users",
  "groups": "https://api.elis.rossum.ai/v1/groups",
  "annotations": "https://api.elis.rossum.ai/v1/annotations",
  "pages": "https://api.elis.rossum.ai/v1/pages"
}

You may also want to install the jq tool to make curl output human-readable.

Use the API on Windows

This API documentation is written for command-line interpreters running on UNIX-based operating systems (Linux and Mac). Windows users may need to use the following substitutions when working with the API:

Character used in this documentation | Meaning/usage | Substitute character for Windows users
' | single quotes | "
" | double quotes | "" or \"
\ | continue the command on the next line | ^

Example of API call on UNIX-based OS

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"target_queue": "https://api.elis.rossum.ai/v1/queues/8236", "target_status": "to_review"}' \
  'https://api.elis.rossum.ai/v1/annotations/315777/copy'

Examples of API call on Windows

curl -H "Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03" -H "Content-Type: application/json" ^
  -d "{""target_queue"": ""https://api.elis.rossum.ai/v1/queues/8236"", ""target_status"": ""to_review""}" ^
  "https://api.elis.rossum.ai/v1/annotations/315777/copy"


curl -H "Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03" -H "Content-Type: application/json" ^
  -d "{\"target_queue\": \"https://api.elis.rossum.ai/v1/queues/8236\", \"target_status\": \"to_review\"}" ^
  "https://api.elis.rossum.ai/v1/annotations/315777/copy"

Create an account

In order to interact with the API, you need an account. If you do not have one, you can create one via our self-service portal.

Login to the account

Fill in your username and password (the login credentials for the API are the same as those you use to log into your account). Call the login endpoint to obtain a key (token) that can be used in subsequent calls.

curl -s -H 'Content-Type: application/json' \
  -d '{"username": "east-west-trading-co@elis.rossum.ai", "password": "aCo2ohghBo8Oghai"}' \
  'https://api.elis.rossum.ai/v1/auth/login'
{"key": "db313f24f5738c8e04635e036ec8a45cdd6d6b03"}

This key will be valid for the default expiration time (currently 162 hours) or until you log out of the session.

Upload a document

In order to upload a document (PDF or image) through the API, you need to obtain the id of a queue first.

curl -s -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/queues?page_size=1' | jq -r '.results[0].url'
https://api.elis.rossum.ai/v1/queues/8199

Then you can upload a document to the queue. Alternatively, you can send documents to a queue-related inbox. See upload for more information about importing files.

curl -s -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  -F content=@document.pdf 'https://api.elis.rossum.ai/v1/queues/8199/upload' | jq -r '.results[0].annotation'
https://api.elis.rossum.ai/v1/annotations/319668

Wait for document to be ready and review extracted data

As soon as a document is uploaded, it will show up in the queue and the data extraction will begin. Processing a document may take from a few seconds to several minutes. You can check the status of the annotation and wait until it changes to to_review.

curl -s -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/annotations/319668' | jq .status
"to_review"

After that, you can open the Rossum web interface elis.rossum.ai to review and confirm extracted data.
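The wait for the to_review status described above can be automated with a small polling loop. The following is a minimal Python sketch, not part of the official tooling. It assumes that get_status is a hypothetical callable you supply which returns the annotation's current status string (for example by requesting the annotation URL as in the curl call above). The poll interval and timeout values are arbitrary.

```python
import time
from typing import Callable


def wait_for_status(get_status: Callable[[], str],
                    target: str = "to_review",
                    interval_s: float = 5.0,
                    timeout_s: float = 600.0,
                    sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll get_status() until it returns `target` or the timeout elapses."""
    waited = 0.0
    while True:
        status = get_status()
        if status == target:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"status is still {status!r} after {waited:.0f} s")
        # Wait before asking again; `sleep` is injectable to ease testing.
        sleep(interval_s)
        waited += interval_s
```

Passing the sleep function as a parameter keeps the loop easy to exercise without real delays.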

Download reviewed data

Now you can export extracted data using the export endpoint of the queue. You can select XML, CSV or JSON format. For CSV, use URL like:

curl -s -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/queues/8199/export?status=exported&format=csv&id=319668'
Invoice number,Invoice Date,PO Number,Due date,Vendor name,Vendor ID,Customer name,Customer ID,Total amount,
2183760194,2018-06-08,PO2231233,2018-06-08,Alza.cz a.s.,02231233,Rossum,05222322,500.00

Logout

Finally, you can safely dispose of the token using the logout endpoint:

curl -s -X POST -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/auth/logout'
{"detail":"Successfully logged out."}

Overview

HTTP and REST

The Rossum API is organized around REST. Our API has predictable, resource-oriented URLs, and uses HTTP response codes to indicate API errors. We use built-in HTTP features, like HTTP authentication and HTTP verbs, which are understood by off-the-shelf HTTP clients.

HTTP Verbs

Call the API using the following standard HTTP methods:

We support cross-origin resource sharing, allowing you to interact securely with our API from a client-side web application. JSON is returned by API responses, including errors (except when another format is requested, e.g. XML).

Authentication

Most API endpoints require the user to be authenticated. To log in to the Rossum API, post an object with username and password fields. Login returns a new authentication key to be used in token authentication.

A token may be deleted explicitly using the logout endpoint; otherwise it expires automatically after a configured time (the default expiration time is 162 hours). The default expiration time can be lowered using the max_token_lifetime_s field. When a token expires, status 401 is returned and users are expected to log in again to obtain a new token.

Login

Login user using username and password

curl -H 'Content-Type: application/json' \
  -d '{"username": "east-west-trading-co@elis.rossum.ai", "password": "aCo2ohghBo8Oghai"}' \
  'https://api.elis.rossum.ai/v1/auth/login'
{
  "key": "db313f24f5738c8e04635e036ec8a45cdd6d6b03"
}

POST /v1/auth/login

Use token key in requests

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/organizations/406'

Login user expiring after 1 hour

curl -H 'Content-Type: application/json' \
  -d '{"username": "east-west-trading-co@elis.rossum.ai", "password": "aCo2ohghBo8Oghai", "max_token_lifetime_s": 3600}' \
  'https://api.elis.rossum.ai/v1/auth/login'
{
  "key": "ltcg2p2w7o9vxju313f04rq7lcc4xu2bwso423b3"
}

Response

Status: 200

Returns object with "key", which is an actual authentication token.
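For clients not built on curl, the same login call can be sketched in Python using only the standard library. This is an illustrative sketch under stated assumptions: the helper names (build_login_request, auth_header, login) are hypothetical and not part of any Rossum library, while the URL, the field names and the "token" header format are taken from the curl examples in this section.

```python
import json
import urllib.request

API_ROOT = "https://api.elis.rossum.ai/v1"


def build_login_request(username, password, max_token_lifetime_s=None):
    """Prepare (but do not send) a POST request to the login endpoint."""
    payload = {"username": username, "password": password}
    if max_token_lifetime_s is not None:
        # Optional: request a token lifetime shorter than the default.
        payload["max_token_lifetime_s"] = max_token_lifetime_s
    return urllib.request.Request(
        f"{API_ROOT}/auth/login",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def auth_header(key):
    """Header for token authentication on subsequent calls."""
    return {"Authorization": f"token {key}"}


def login(username, password, max_token_lifetime_s=None):
    """Send the login request and return the "key" from the JSON response."""
    request = build_login_request(username, password, max_token_lifetime_s)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["key"]
```

Separating request construction from sending makes it possible to inspect the request without contacting the server.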

Logout

Logout user

curl -X POST -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/auth/logout'
{
  "detail": "Successfully logged out."
}

POST /v1/auth/logout

Logout user, discard auth token.

Response

Status: 200

Pagination

All object list operations are paged by default, so you may need several API calls to obtain all objects of a given type.

Parameter | Default | Maximum | Description
page_size | 20 | 100 (*) | Number of results per page
page | 1 | | Page of results

(*) Maximum page size of annotation list and CSV export is 1000.
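Collecting all objects of a type therefore means walking through the pages. The Python sketch below illustrates one way to do this. It assumes that fetch_page is a hypothetical function you provide that performs the request with the given page and page_size parameters and returns the decoded JSON, and that list responses carry their items under a results key (as the jq filters in the tutorial suggest). It treats a page shorter than page_size as the last one; how the API responds to a page beyond the end is not described here, so that case is deliberately avoided where possible.

```python
from typing import Callable, Dict, Iterator, List


def iter_all(fetch_page: Callable[[int, int], Dict], page_size: int = 100) -> Iterator[dict]:
    """Yield every object of a list endpoint, one page at a time.

    fetch_page(page, page_size) is a caller-supplied (hypothetical) function
    that performs the HTTP request and returns the decoded JSON response.
    """
    page = 1
    while True:
        results: List[dict] = fetch_page(page, page_size).get("results", [])
        yield from results
        if page_size > len(results):  # a short page is assumed to be the last
            return
        page += 1
```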

Filters and ordering

List queues of workspace 7540, with locale en_US and order results by name.

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/queues?workspace=7540&locale=en_US&ordering=name'

Lists may be filtered using various attributes. Multiple attributes are combined with AND, which results in a more specific response. Please refer to the particular object description.

All filter parameters that refer to an object expect only the id (instead of the URL) of that object.

Ordering of results may be enforced by the ordering parameter and one or more keys delimited by a comma. Prefixing a key with a minus sign (-) enforces descending order.
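A filtered and ordered list URL can be composed programmatically. The following Python sketch is illustrative only: build_list_url is a hypothetical helper, and it simply joins the filters and the comma-delimited ordering keys into a query string in the style of the curl example above.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.elis.rossum.ai/v1"


def build_list_url(resource, filters=None, ordering=None):
    """Compose a list URL with AND-combined filters and an optional ordering."""
    params = dict(filters or {})
    if ordering:
        # Keys are comma-delimited; a leading "-" requests descending order.
        params["ordering"] = ",".join(ordering)
    # Keep commas literal so the ordering value stays readable.
    query = urlencode(params, safe=",")
    return f"{API_ROOT}/{resource}" + (f"?{query}" if query else "")
```

For instance, build_list_url("queues", {"workspace": 7540, "locale": "en_US"}, ["name"]) reproduces the URL used in the curl example.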

Metadata

Example metadata in a document object

{
  "id": 319768,
  "url": "https://api.elis.rossum.ai/v1/documents/319768",
  "s3_name": "05feca6b90d13e389c31c8fdeb7fea26",
  "annotations": [
    "https://api.elis.rossum.ai/v1/annotations/319668"
  ],
  "mime_type": "application/pdf",
  "arrived_at": "2019-02-11T19:22:33.993427Z",
  "original_file_name": "document.pdf",
  "content": "https://api.elis.rossum.ai/v1/documents/319768/content",
  "metadata": {
    "customer-id": "f205ec8a-5597-4dbb-8d66-5a53ea96cdea",
    "source": 9581,
    "authors": ["Joe Smith", "Peter Doe"]
  }
}

When working with API objects, it may be useful to attach some information to the object (e.g. a customer id to a document). You can store a custom JSON object in the metadata section available in most objects.

List of objects with metadata support: organization, workspace, user, queue, schema, connector, inbox, document, annotation, page.

Total metadata size may be up to 1 kB per object.
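It can be handy to check the size of a metadata object before sending it. The Python sketch below is a rough client-side estimate only. It assumes that the size is measured on the serialized JSON and reads "1 kB" conservatively as 1000 bytes; how the server actually measures the size is not specified here, and the helper names are hypothetical.

```python
import json

MAX_METADATA_BYTES = 1000  # assumption: "1 kB" read conservatively as 1000 bytes


def metadata_size(metadata: dict) -> int:
    """Size in bytes of the metadata serialized as compact UTF-8 JSON."""
    text = json.dumps(metadata, separators=(",", ":"), ensure_ascii=False)
    return len(text.encode("utf-8"))


def fits_metadata_limit(metadata: dict) -> bool:
    """True if the serialized metadata stays within the assumed limit."""
    return MAX_METADATA_BYTES >= metadata_size(metadata)
```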

Versioning

API Version is part of the URL, e.g. https://api.elis.rossum.ai/v1/users.

To allow API progress, we consider addition of a field in a JSON object to be a backward-compatible operation that may be introduced at any time. Clients are expected to deal with such changes.

In order to be notified about future API changes and improvements, please subscribe to the rossum-api-announcements group.

Dates

All date fields are represented as ISO 8601 formatted strings, e.g. 2018-06-01T21:36:42.223415Z. All returned dates are in the UTC timezone.
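Such timestamps can be converted to timezone-aware objects on the client. The Python sketch below assumes that every timestamp has the same shape as the examples in this documentation (fractional seconds followed by a literal Z); parse_api_date is a hypothetical helper name.

```python
from datetime import datetime, timezone


def parse_api_date(value: str) -> datetime:
    """Parse e.g. 2018-06-01T21:36:42.223415Z into an aware UTC datetime."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    # The trailing Z denotes UTC, so attach the UTC timezone explicitly.
    return parsed.replace(tzinfo=timezone.utc)
```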

Errors

Our API uses conventional HTTP response codes to indicate the success or failure of an API request.

Code Status Meaning
400 Bad Request Invalid input data or error from connector.
401 Unauthorized The username/password is invalid or token is invalid (e.g. expired).
403 Forbidden Insufficient permission, missing authentication, invalid CSRF token and similar issues.
404 Not Found Entity not found (e.g. already deleted).
405 Method Not Allowed You tried to access an endpoint with an invalid method.
409 Conflict Trying to change annotation not currently assigned or assigned to another user.
413 Payload Too Large The payload is too large (especially for uploaded files).
500 Internal Server Error We had a problem with the server. Try again later.
503 Service Unavailable We're temporarily offline for maintenance. Please try again later.
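A client may want to turn these codes into a coarse decision. The Python sketch below is one possible reading of the table, not prescribed behavior: the action names are hypothetical, 401 is mapped to a new login because expired tokens require one, and 500 and 503 are mapped to a later retry as the table suggests.

```python
RETRY_LATER = {500, 503}


def next_action(status_code: int) -> str:
    """Map an HTTP status code to a coarse client-side action."""
    if status_code in range(200, 300):
        return "ok"
    if status_code == 401:
        return "login_again"   # invalid credentials or an expired token
    if status_code in RETRY_LATER:
        return "retry_later"   # server-side problem or maintenance
    return "fail"              # other client errors need a corrected request
```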

Import and Export

Documents may be imported into Rossum using the REST API and email gateway. Supported file formats are PDF, PNG, and JPEG.

In order to get the best results from Rossum the documents should be in A4 format of at least 150 DPI (in case of scans/photos).

Upload

You can upload a document to the queue using upload endpoint with one or more files to be uploaded. You can also specify additional field values in upload endpoint. As soon as a document is uploaded, data extraction is started.

Upload endpoint supports basic authentication to enable easy integration with third-party systems.

Import by Email

It is also possible to send documents by email using a properly configured inbox that is associated with a queue. Users then only need to know the email address to forward emails to.

For every incoming email, Rossum extracts PDF documents and images, stores them in the queue and starts data extraction process.

Small images (up to 100x100 pixels) are ignored, see inbox for reference.

Export

In order to export extracted and confirmed data you can call export endpoint. You can specify status, time-range filters and annotation id list to limit returned results.

Upload endpoint supports basic authentication to enable easy integration with third-party systems.

Auto-split of document

It is possible to process a single PDF file that contains several invoices. Just insert a special separator page between the documents. You can print this page and insert it between documents while scanning.

Rossum will recognize a QR code on the page and split the PDF into individual documents automatically. Produced documents are imported to the queue, while the original document is set to a split state.

Automation

All imported documents are processed by the data extraction process to obtain values of fields specified in the schema. Extracted values are then available for validation in the UI.

Using per-queue automation settings, it is possible to skip manual UI validation step and automatically proceed with the export of the document.

Currently, there are three levels of automation:

Sources of field validation

Low-confidence fields are marked in the UI by an "eye" icon; we consider them to be not validated. On the API level, they have an empty validation_sources list.

Validation of a field may be introduced by various sources: data extraction confidence above a threshold, computation of various checksums (e.g. VAT rate, net amount and gross amount) or a human review. These validations are recorded in the validation_sources list. The data extraction confidence threshold may be adjusted, see validation sources for details.

AI Confidence Scores

While there are multiple ways to automatically pre-validate fields, the most prominent one is score-based validation based on AI Core Engine confidence scores.

The confidence score predicted for each AI-extracted field is stored in the rir_confidence attribute. The score is a number between 0 and 1, and is calibrated in such a way that it corresponds to the probability that a given value is correct. In other words, a field with score 0.80 is expected to be correct 4 out of 5 times.

The value of the score_threshold attribute represents the minimum score that triggers automatic validation. It can be set on the queue or individually per datapoint in the schema; the default is 0.975. Because of the meaning of the score, the threshold directly corresponds to the achieved accuracy. For example, if the score threshold for validation is set to 0.975, the expected error rate for that field is 2.5%.
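The relation between score, threshold and expected error rate can be written down in a few lines. The Python sketch below is illustrative: the function names are hypothetical, and it assumes that a score equal to the threshold already triggers validation, since the threshold is described as the minimum score.

```python
DEFAULT_SCORE_THRESHOLD = 0.975


def is_auto_validated(rir_confidence, score_threshold=DEFAULT_SCORE_THRESHOLD):
    """True if the confidence score reaches the validation threshold."""
    return rir_confidence >= score_threshold


def expected_error_rate(score_threshold=DEFAULT_SCORE_THRESHOLD):
    """Expected share of incorrect values at the given threshold."""
    return 1.0 - score_threshold
```

With the default threshold, a field scored 0.80 is not validated automatically, while a field scored 0.99 is.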

Usage report

In order to obtain an overview of Rossum usage, you can download an Excel file with basic Rossum statistics.

The statistics contain the following attributes:

Download usage statistics (January 2019).

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/manage/usage-report?from=2019-01-01&to=2019-01-31'

Excel file (xlsx) may be downloaded from https://api.elis.rossum.ai/manage/usage-report.

You may specify a date range using the from and to parameters (inclusive). If not specified, a report for the last 12 months is generated.

Document Schema

Every queue has an associated schema that specifies which fields will be extracted from documents as well as the structure of the data sent to connector and exported from the platform.

Rossum schema supports single datapoints (datapoint), lists of datapoints of the same type (multivalue) or tuples (tuple). At the topmost level, each schema consists of sections, which may either directly contain actual data (datapoints) or use nested multivalues and tuples as containers for single datapoints.

But while schema may theoretically consist of an arbitrary number of nested containers, the Rossum UI supports only certain particular combinations of datapoint types. The supported shapes are:

Schema content

Schema content consists of a list of section objects.

Common attributes

The following attributes are common for all schema objects:

Attribute Type Description Required
category string Category of an object, one of section, multivalue, tuple or datapoint. yes
id string Unique identifier of an object. yes
label string User-friendly label for an object, shown in the user interface yes
hidden boolean If set to true, the object is not visible in the user interface, but remains stored in the database and may be exported. Default is false. no

Section

Example of a section

{
    "category": "section",
    "id": "amounts_section",
    "label": "Amounts",
    "children": [...],
    "icon": ""
}

Section represents a logical part of the document, such as amounts or vendor info. It is allowed only at the top level. Schema allows multiple sections, and there should be at least one section in the schema.

Attribute Type Description Required
children list[object] Specifies objects grouped under a given section. It can contain multivalue or datapoint objects. yes
icon string The icon that appears on the left panel in the UI for a given section (not yet supported on UI).

Datapoint

A datapoint represents a single value, typically a field of a document or some global document information. Fields common to all datapoint types:

Attribute Type Description Required
type string Data type of the object, must be one of the following: string, number, date, enum, button yes
can_export boolean If set to false, datapoint is not exported through export endpoint. Default is true.
can_collapse boolean If set to true, tabular (multivalue-tuple) datapoint may be collapsed in the UI. Default is false.
rir_field_names list[string] List of references used to initialize an object value. See below for the description.
default_value string Default value used either for fields that do not use hints from DE API predictions (i.e. rir_field_names are not specified), or DE API does not return any data for the field.
constraints object A map of various constraints for the field. See Value constraints.
width integer Width of the column (in characters). Default widths are: number: 8, string: 20, date: 10, enum: 20. Only supported for table datapoints.
stretch boolean If total width of columns doesn’t fill up the screen, datapoints with stretch set to true will be expanded proportionally to other stretching columns. Only supported for table datapoints.
width_chars integer Deprecated. Use width and stretch properties instead.
score_threshold float [0;1] Threshold used to automatically validate field content based on AI confidence scores. If not set, queue.default_score_threshold is used.

The rir_field_names attribute specifies the source of the initial value of the object. List items may be:

If several list items are specified in rir_field_names, the first available value will be used.
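The "first available value" rule, combined with default_value, can be sketched in Python as follows. This is an illustration under assumptions: extracted is a hypothetical dictionary mapping extracted field names to values, a missing or None entry counts as unavailable, and initial_value is not an actual Rossum function.

```python
def initial_value(rir_field_names, extracted, default_value=None):
    """Return the first available extracted value, else the default."""
    for name in rir_field_names or []:
        value = extracted.get(name)
        if value is not None:
            return value
    # No names given, or nothing extracted for any of them.
    return default_value
```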

String type

Example string datapoint

{
    "category": "datapoint",
    "id": "invoice_id",
    "label": "Invoice ID",
    "type": "string",
    "default_value": null,
    "rir_field_names": ["invoice_id"],
    "constraints": {
        "length": {
            "exact": null,
            "max": 16,
            "min": null
        },
        "regexp": {
            "pattern": ""
        },
        "required": false
    }
}

The string datapoint does not have any type-specific attributes.

Date type

Example date datapoint

{
  "id": "item_delivered",
  "type": "date",
  "label": "Item Delivered",
  "format": "MM/DD/YYYY",
  "category": "datapoint"
}

Attributes specific to Date datapoint:

Attribute Type Description Required
format string Enforces a format for date datapoint on the UI. See Date format below for more details.

Date format supported: available tokens

Example date formats:

Number type

Example number datapoint

{
  "id": "item_quantity",
  "type": "number",
  "label": "Quantity",
  "format": "#,##0.#",
  "category": "datapoint"
}

Attributes specific to Number datapoint:

Attribute Type Description Required
format string Available choices are # ##0,#, # ##0.#, #,##0.#, # ##0, #,##0 and null. Default is # ##0.#.

Enum type

Example enum datapoint with options

{
  "id": "document_type",
  "type": "enum",
  "label": "Document type",
  "hidden": false,
  "category": "datapoint",
  "options": [
    {
      "label": "Invoice Received",
      "value": "21"
    },
    {
      "label": "Invoice Sent",
      "value": "22"
    },
    {
      "label": "Receipt",
      "value": "23"
    }
  ],
  "default_value": "21",
  "rir_field_names": []
}

Attributes specific to Enum datapoint:

Attribute Type Description Required
options object See object description below. yes

Every option consists of an object with keys:

Attribute Type Description Required
value string Value of the option. yes
label string User-friendly label for the option, shown in the UI. yes

Enum datapoint value is matched in a case insensitive mode, e.g. EUR currency value returned by the AI Core Engine is matched successfully against {"value": "eur", "label": "Euro"} option.
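Case-insensitive matching of this kind can be sketched in Python. The helper name match_option is hypothetical, and the sketch only mirrors the behavior described above; it is not the platform's implementation.

```python
def match_option(value, options):
    """Return the option whose value equals `value` ignoring case, or None."""
    wanted = value.casefold()
    for option in options:
        if option["value"].casefold() == wanted:
            return option
    return None
```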

Button type

Specifies a button shown in Rossum UI. For more details please refer to custom UI extension.

Example button datapoint

{
  "id": "show_email",
  "type": "button",
  "category": "datapoint",
  "popup_url": "http://example.com/show_customer_data"
}

Despite being a datapoint object, button currently cannot hold any value. Therefore, the set of available Button datapoint attributes is limited to:

Attribute Type Description Required
type string Data type of the object, must be one of the following: string, number, date, enum, button yes
can_export boolean If set to false, datapoint is not exported through export endpoint. Default is true.
can_collapse boolean If set to true, tabular (multivalue-tuple) datapoint may be collapsed in the UI. Default is false.
popup_url string URL of a popup window to be opened when button is pressed. yes

Value constraints

Example value constraints

{
  "id": "invoice_id",
  "type": "string",
  "label": "Invoice ID",
  "category": "datapoint",
  "constraints": {
    "length": {
      "max": 32,
      "min": 5
    },
    "required": false
  },
  "default_value": null,
  "rir_field_names": [
    "invoice_id"
  ]
}

Constraints limit the allowed values. When a constraint is not satisfied, the annotation is considered invalid and cannot be exported.

Attribute Type Description Required
length object Defines minimum, maximum or exact length for the datapoint value. By default, minimum and maximum are 0 and infinity, respectively. Supported attributes: min, max and exact
regexp object When specified, content must match a regular expression. Supported attributes: pattern
required boolean Specifies if the datapoint is required by the schema. Default value is true.
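To get a feel for how these constraints interact, they can be checked on the client side. The Python sketch below is an approximation under assumptions and not the platform's validation logic: it treats an empty value as subject only to the required constraint, applies the pattern with re.search (whether the platform requires a full match is not stated here), and ignores an empty pattern. The function name is hypothetical.

```python
import re


def constraint_violations(value, constraints):
    """Return the names of the constraints that `value` violates."""
    if value is None or value == "":
        # "required" defaults to true according to the table above.
        return ["required"] if constraints.get("required", True) else []
    problems = []
    length = constraints.get("length") or {}
    if length.get("exact") is not None and len(value) != length["exact"]:
        problems.append("length.exact")
    if length.get("min") is not None and length["min"] > len(value):
        problems.append("length.min")
    if length.get("max") is not None and len(value) > length["max"]:
        problems.append("length.max")
    pattern = (constraints.get("regexp") or {}).get("pattern")
    if pattern and re.search(pattern, value) is None:
        problems.append("regexp")
    return problems
```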

Multivalue

Example of a multivalue:

{
  "category": "multivalue",
  "id": "line_item",
  "label": "Line Item",
  "children": {
    ...
  },
  "min_occurrences": null,
  "max_occurrences": null
}

Example of a multivalue with grid row-types specification:

{
  "category": "multivalue",
  "id": "line_item",
  "label": "Line Item",
  "children": {
    ...
  },
  "grid": {
    "row_types": [
      "header", "data", "footer"
    ],
    "default_row_type": "data",
    "row_types_to_extract": [
      "data"
    ]
  },
  "min_occurrences": null,
  "max_occurrences": null
}

A multivalue is a list of datapoints or tuples of the same type. It represents a container for data with multiple occurrences (such as line items) and can contain only objects with the same id.

Attribute Type Description Required
children object Object specifying type of children. It can contain only objects with categories tuple or datapoint. yes
min_occurrences integer Minimum number of occurrences of nested objects. If condition of min_occurrences is violated corresponding fields should be manually reviewed. Minimum required value for the field is 0. If not specified, it is set to 0 by default.
max_occurrences integer Maximum number of occurrences of nested objects. All additional rows above max_occurrences are removed by extraction process. Minimum required value for the field is 1. If not specified, it is set to infinity by default.
grid object Configure magic-grid feature properties, see below.

Multivalue grid object

The multivalue grid object allows you to specify a row type for each row of the grid. For the data representation of actual grid data rows, see the Grid object description.

Attribute Type Description Default Required
row_types list[string] List of allowed row type values. ["data"] yes
default_row_type string Row type to be used by default data yes
row_types_to_extract string Types of rows to be extracted to related table data yes

For example, to distinguish two header types and a footer in the validation interface, the following row types may be used: header, subsection_header, data and footer.

Currently, data extraction classifies every row as either data or header (additional row types may be introduced in the future). We remove rows returned by data extraction that are not in row_types list (e.g. header by default) and are on the top/bottom of the table. When they are in the middle of the table, we mark them as skipped (null).
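The row handling described above (drop disallowed rows at the top and bottom, mark disallowed rows in the middle as skipped) can be illustrated with a short Python sketch. It operates on a plain list of row type strings, which is a simplification of the real grid data, and filter_rows is a hypothetical name.

```python
def filter_rows(row_types_detected, allowed=("data",)):
    """Drop disallowed rows at the table edges and mark inner ones as None."""
    rows = list(row_types_detected)
    start, end = 0, len(rows)
    # Trim disallowed rows from the top of the table.
    while end > start and rows[start] not in allowed:
        start += 1
    # Trim disallowed rows from the bottom of the table.
    while end > start and rows[end - 1] not in allowed:
        end -= 1
    # Remaining disallowed rows are in the middle: mark them as skipped.
    return [row if row in allowed else None for row in rows[start:end]]
```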

There are three visual modes, based on row_types quantity:

Tuple

Example of a tuple:

{
  "category": "tuple",
  "id": "tax_details",
  "label": "Tax Details",
  "children": [
    ...
  ],
  "rir_field_names": [
    "tax_details"
  ]
}

Container representing tabular data with related values, such as tax details. A tuple must be nested within a multivalue object, but unlike multivalue, it may consist of objects with different ids.

Attribute Type Description Required
children list[object] Array specifying objects that belong to a given tuple. It can contain only objects with category datapoint. yes
rir_field_names list[string] List of names used to initialize content from DE API predictions. If specified, the value of the first extracted field from the array is used, otherwise, no DE API initialization is done for the object.
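Because sections and tuples keep their children in a list while a multivalue holds a single child object, code that walks a schema has to handle both shapes. The Python sketch below collects all datapoint ids from a schema content; the function names are hypothetical and the traversal relies only on the category and children attributes described above.

```python
def iter_datapoints(node):
    """Yield every datapoint object nested under a schema node."""
    if node["category"] == "datapoint":
        yield node
        return
    children = node.get("children", [])
    if isinstance(children, dict):  # a multivalue holds a single child object
        children = [children]
    for child in children:
        yield from iter_datapoints(child)


def datapoint_ids(schema_content):
    """Collect datapoint ids from a schema content (a list of sections)."""
    return [dp["id"] for section in schema_content for dp in iter_datapoints(section)]
```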

Updating Schema

As a project evolves, it is common practice to enhance or change the set of extracted fields. This is done by updating the schema object.

By design, Rossum supports multiple schema versions at the same time. However, each document annotation is related to only one of those schemas. If the schema is updated, all related document annotations are updated accordingly. See preserving data on schema change below for limitations of schema updates.

In addition, every queue is linked to a schema, which is used for all newly imported documents.

When updating a schema, there are two possible approaches:

Use case 1 - Initial setting of a schema

Use case 2 - Updating attributes of a field (label, constraints, options, etc.)

Use case 3 - Adding new field to a schema, even for already imported documents.

Use case 4 - Adding new field to schema, only for newly imported documents

Use case 5 - Deleting schema field, even for already imported documents.

Use case 6 - Deleting schema field, only for newly imported documents

Preserving data on schema change

In order to transfer annotation field values properly during the schema update, a datapoint's category and schema_id must be preserved.

Supported operations that preserve fields values are:

Extracted field types

DE API currently automatically extracts the following fields at the all endpoint, subject to ongoing expansion.

Identifiers

Example of a schema with different identifiers:

[
  {
    "category": "section",
    "children": [
      {
        "category": "datapoint",
        "constraints": {
          "required": false
        },
        "default_value": null,
        "id": "invoice_id",
        "label": "Invoice number",
        "rir_field_names": [
          "invoice_id"
        ],
        "type": "string"
      },
      {
        "category": "datapoint",
        "constraints": {
          "required": false
        },
        "default_value": null,
        "format": "D/M/YYYY",
        "id": "date_issue",
        "label": "Issue date",
        "rir_field_names": [
          "date_issue"
        ],
        "type": "date"
      },
      {
        "category": "datapoint",
        "constraints": {
          "required": false
        },
        "default_value": null,
        "id": "terms",
        "label": "Terms",
        "rir_field_names": [
          "terms"
        ],
        "type": "string"
      }
    ],
    "icon": null,
    "id": "invoice_info_section",
    "label": "Basic information"
  }
]

Attr. rir_field_names Field label Description
account_num Bank Account Bank account number
bank_num Sort Code Sort code. Numerical code of the bank.
bic BIC/SWIFT Bank BIC or SWIFT code.
const_sym Constant Symbol Statistical code on payment order.
customer_id Customer Number The number by which the customer is registered in the system of the supplier.
date_due Date Due The due date of the invoice.
date_issue Issue Date Date of issue of the document.
date_uzp Tax Point Date The date of taxable event.
iban IBAN Bank account number in IBAN format.
invoice_id Invoice Identifier Invoice number.
order_id Order Number Purchase order identification.
recipient_address Recipient Address Address of the customer.
recipient_dic Recipient Tax Number Tax identification number of the customer.
recipient_ic Recipient Company ID Company identification number of the customer.
recipient_name Recipient Name Name of the customer.
recipient_vat_id Recipient VAT Number Customer VAT Number
sender_address Supplier Address Address of the supplier.
sender_dic Supplier Tax Number Tax identification number of the supplier.
sender_ic Supplier Company ID Business/organization identification number of the supplier.
sender_name Supplier Name Name of the supplier.
sender_vat_id Supplier VAT Number VAT identification number of the supplier.
spec_sym Specific Symbol Payee id on the payment order, or similar.
terms Terms Payment terms as written on the document (e.g. "45 days", "upon receipt").
var_sym Payment reference In some countries used by the supplier to match the payment received against the invoice.

Document attributes

Attr. rir_field_names Field label Description
currency Currency The currency which the invoice is to be paid in. Possible values: CZK, DKK, EUR, GBP, NOK, SEK, USD or other. May be also in lowercase.
invoice_type Invoice Type Possible values: credit_note, debit_note, tax_invoice (most typical), proforma, receipt or other.
language Language The language which the document was written in. Possible values: ces, deu, eng, fra, slk or other.

Amounts

Attr. rir_field_names Field label Description
amount_due Amount Due Final amount including tax to be paid after deducting all discounts and advances.
amount_rounding Amount Rounding Remainder after rounding amount_total.
amount_total Total Amount Subtotal over all items, including tax.
amount_paid Amount paid Amount paid already.
amount_total_base Tax Base Total Base amount for tax calculation.
amount_total_tax Tax Total Total tax amount.

Typical relations (may depend on local laws):

amount_total = amount_total_base + amount_total_tax
amount_rounding = amount_total - round(amount_total)
amount_due = amount_total - amount_paid + amount_rounding

All amounts are in the main currency of the invoice (as identified in the currency response field). Amounts in other currencies are generally excluded.
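
The typical relations above can be checked mechanically, for example inside a connector. The following is a minimal Python sketch, assuming the amounts are available as decimal strings keyed by field name. The helper name failed_amount_relations and the tolerance are illustrative assumptions, not part of the Rossum API. The rounding relation is left out because the rounding rule depends on local conventions.

```python
from decimal import Decimal


def failed_amount_relations(amounts, tolerance=Decimal("0.01")):
    """Return the names of amounts whose typical relation does not hold.

    `amounts` maps field names (amount_total, amount_total_base, ...) to
    decimal strings. Hypothetical helper, not part of the Rossum API.
    """
    a = {name: Decimal(value) for name, value in amounts.items()}
    failed = []

    # amount_total = amount_total_base + amount_total_tax
    expected_total = a["amount_total_base"] + a["amount_total_tax"]
    if abs(a["amount_total"] - expected_total) > tolerance:
        failed.append("amount_total")

    # amount_due = amount_total - amount_paid + amount_rounding
    expected_due = a["amount_total"] - a["amount_paid"] + a["amount_rounding"]
    if abs(a["amount_due"] - expected_due) > tolerance:
        failed.append("amount_due")

    return failed
```

Since the relations may depend on local laws, a failed check is probably better reported as a warning than treated as a hard error.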

Tax details table

Tax details table and breakdown by tax rates.

Attr. rir_field_names Field label Description
tax_detail_base Tax Base Sum of tax bases for items with the same tax rate.
tax_detail_rate Tax Rate One of the tax rates in the tax breakdown.
tax_detail_tax Tax Amount Sum of taxes for items with the same tax rate.
tax_detail_total Tax Total Total amount including tax for all items with the same tax rate.

Line items table

Example of a line items table:

    {
    "category": "section",
    "children": [
      {
        "category": "multivalue",
        "children": {
          "category": "tuple",
          "children": [
            {
              "category": "datapoint",
              "constraints": {
                "required": false
              },
              "default_value": null,
              "id": "item_id",
              "label": "Item Id",
              "rir_field_names": [
                "table_column_code"
              ],
              "type": "string",
              "width": 20
            },
            {
              "category": "datapoint",
              "constraints": {
                "required": true
              },
              "default_value": null,
              "id": "item_desc",
              "label": "Description",
              "rir_field_names": [
                "table_column_description"
              ],
              "type": "string",
              "stretch": true
            },
            {
              "category": "datapoint",
              "constraints": {
                "required": false
              },
              "default_value": null,
              "format": "# ##0.#",
              "id": "item_quantity",
              "label": "Quantity",
              "rir_field_names": [
                "table_column_quantity"
              ],
              "type": "number",
              "width": 15
            },
            {
              "category": "datapoint",
              "constraints": {
                "required": false
              },
              "default_value": null,
              "format": "# ##0.#",
              "id": "item_net_unit_price",
              "label": "Unit price w/o tax",
              "rir_field_names": [
                "table_column_amount_base"
              ],
              "type": "number"
            },
            {
              "category": "datapoint",
              "constraints": {
                "required": false
              },
              "default_value": null,
              "format": "# ##0.#",
              "id": "item_vat_rate",
              "label": "Tax rate",
              "rir_field_names": [
                "table_column_rate"
              ],
              "type": "number"
            },
            {
              "category": "datapoint",
              "constraints": {
                "required": false
              },
              "default_value": null,
              "format": "# ##0.#",
              "id": "item_amount_total",
              "label": "Price w tax",
              "rir_field_names": [
                "table_column_amount_total"
              ],
              "type": "number"
            }
          ],
          "id": "line_item",
          "label": "Line item",
          "rir_field_names": []
        },
        "default_value": null,
        "id": "line_items",
        "label": "Line item",
        "max_occurrences": null,
        "min_occurrences": null
      }
    ],
    "icon": null,
    "id": "line_items_section",
    "label": "Line items"
  }

DE API currently automatically extracts line item table content and recognizes row and column types as detailed below. Invoice line items come in a wide variety of different shapes and forms. The current implementation can deal with (or learn) most layouts, with borders or not, different spacings, header rows, etc. We currently make two further assumptions:

We plan to gradually remove both assumptions in the future.

Attribute rir_field_names Field label Description
table_column_code Item Code/Id Can be the SKU, EAN, a custom code (string of letters/numbers) or even just the line number.
table_column_description Item Description Line item description. Can be multi-line with details.
table_column_quantity Item Quantity Quantity of the item.
table_column_uom Item Unit of Measure Unit of measure of the item (kg, container, piece, gallon, ...).
table_column_rate Item Rate Tax rate for the line item.
table_column_tax Item Tax Tax amount for the line. Rule of thumb: tax = rate * amount_base.
table_column_amount_base Amount Base Unit price without tax. (This is the primary unit price extracted.)
table_column_amount Amount Unit price with tax. Rule of thumb: amount = amount_base + tax.
table_column_amount_total_base Amount Total Base The total amount to be paid for all the items excluding the tax. Rule of thumb: amount_total_base = amount_base * quantity.
table_column_amount_total Amount Total The total amount to be paid for all the items including the tax. Rule of thumb: amount_total = amount * quantity.
table_column_other Other Unrecognized data type.
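
The rules of thumb in the table can be combined to derive the remaining columns of a line item from its unit price, tax rate and quantity. The following is a minimal Python sketch. It assumes the tax rate is given as a percentage (for example "20" for 20 %), which the table does not state, and the helper name derive_line_item is hypothetical.

```python
from decimal import Decimal


def derive_line_item(amount_base, rate_percent, quantity):
    """Apply the rule-of-thumb relations to one line item.

    All arguments are decimal strings. Assumes `rate_percent` is a
    percentage; hypothetical helper, not part of the Rossum API.
    """
    amount_base = Decimal(amount_base)
    quantity = Decimal(quantity)
    rate = Decimal(rate_percent) / Decimal(100)

    tax = rate * amount_base    # tax = rate * amount_base
    amount = amount_base + tax  # amount = amount_base + tax
    return {
        "tax": tax,
        "amount": amount,
        # amount_total_base = amount_base * quantity
        "amount_total_base": amount_base * quantity,
        # amount_total = amount * quantity
        "amount_total": amount * quantity,
    }
```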

Document Lifecycle

Each document is submitted to Rossum within a given queue. Then it goes through a variety of states as it is processed, and eventually exported.

State Description
importing Document is being processed by the AI Core Engine for data extraction; initial state of the document.
failed_import Import failed e.g. due to a malformed document file.
to_review Initial extraction step is done and the document is waiting for user validation.
reviewing Document is undergoing validation in the user interface.
exporting Document is validated and is now awaiting the completion of connector save call.
exported Document is validated and successfully passed all hooks; this is the typical terminal state of a document.
failed_export When the connector returned an error.
postponed Operator has chosen to postpone the document instead of exporting it.
deleted When the document was deleted by the user.
purged Only metadata was preserved after a deletion.

Extensions

The Rossum platform may be extended via third-party, externally running services. These extensions may either call the Rossum API or register to receive callbacks from the Rossum platform on various occasions. Currently we support two modes of callback: Connector and Webhook.

See the Building Your Own Extension set of guides in Rossum's developer portal for an introduction to Rossum extensions.

Connector Extension

The connector component is aimed at two main use cases: applying custom business rules during extraction, and direct integration of Rossum with downstream systems. The connector component receives two types of callbacks: an on-the-fly validation callback on every update of captured data, and an on-export save callback when the document capture is finalized.

The custom business rules chiefly make use of the on-the-fly validation callback. The connector can auto-validate and transform both the initial AI-based extractions and each operator edit within the validation screen; based on the input, it can push user-visible messages and value updates back to Rossum. This allows for both simple tweaks (like verifying that two amounts sum together or transforming decimal points to thousand separators) and complex functionality like intelligent PO match.

The integration with downstream systems, on the other hand, relies mainly on the save callback. At the same moment a document is exported from Rossum, it can be imported to a downstream system. Since there are typically constraints on the captured data, these constraints can be enforced even within the validation callback.

Connectors are designed to be implemented using a push model over the common HTTPS protocol. When annotation data is changed, or when data export is triggered, the corresponding connector endpoint is called with the annotation data as the request payload.

Set up a connector

In Rossum, a connector object defines the service_url and params used to construct HTTPS requests, and an authorization_token that is passed in every request to authenticate the caller as the actual Rossum server. The token may also uniquely identify the organization when multiple Rossum organizations share the same connector server.

The authorization token is passed as an Authorization HTTP header: Authorization: secret_key {authorization_token}
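
On the connector side, the header value can be checked before any payload is processed. The following is a minimal Python sketch, assuming the header value has the form "secret_key", a space, then the token, as described above. The function name is_rossum_request is a hypothetical helper, and how the header is read depends on the web framework in use.

```python
import hmac


def is_rossum_request(authorization_header, expected_token):
    """Check an Authorization header value of the form 'secret_key <token>'.

    Hypothetical helper; `authorization_header` may be None when the
    header is missing.
    """
    scheme, _, token = (authorization_header or "").partition(" ")
    # compare_digest avoids leaking the token through timing differences.
    return scheme == "secret_key" and hmac.compare_digest(
        token.encode(), expected_token.encode()
    )
```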

To set up a connector for a queue, create a connector object with the URL of the remote service.

A connector may be related to one or more queues.

Connector API

Example data sent to connector (validate, save)

{
  "meta": {
    "document_url": "https://api.elis.rossum.ai/v1/documents/6780",
    "arrived_at": "2019-01-30T07:55:13.208304Z",
    "original_file": "https://api.elis.rossum.ai/v1/original/bf0db41937df8525aa7f3f9b18a562f3",
    "original_filename": "Invoice.pdf",
    "queue_name": "Invoices",
    "workspace_name": "EU",
    "organization_name": "East West Trading Co",
    "annotation": "https://api.elis.rossum.ai/v1/annotations/4710",
    "queue": "https://api.elis.rossum.ai/v1/queues/63",
    "workspace": "https://api.elis.rossum.ai/v1/workspaces/62",
    "organization": "https://api.elis.rossum.ai/v1/organizations/1",
    "modifier": "https://api.elis.rossum.ai/v1/users/27",
    "updated_datapoint_ids": ["197468"],
    "modifier_metadata": {},
    "queue_metadata": {},
    "annotation_metadata": {},
    "rir_poll_id": "54f6b9ecfa751789f71ddf12"
  },
  "content": [
    {
      "id": "197466",
      "category": "section",
      "schema_id": "invoice_info_section",
      "children": [
        {
          "id": "197467",
          "category": "datapoint",
          "schema_id": "invoice_number",
          "page": 1,
          "position": [916, 168, 1190, 222],
          "rir_position": [916, 168, 1190, 222],
          "rir_confidence": 0.97657,
          "value": "FV103828806S",
          "validation_sources": ["score"],
          "type": "string"
        },
        {
          "id": "197468",
          "category": "datapoint",
          "schema_id": "date_due",
          "page": 1,
          "position": [938, 618, 1000, 654],
          "rir_position": [940, 618, 1020, 655],
          "rir_confidence": 0.98279,
          "value": "12/22/2018",
          "validation_sources": ["score"],
          "type": "date"
        },
        {
          "id": "197469",
          "category": "datapoint",
          "schema_id": "amount_due",
          "page": 1,
          "position": [1134, 1050, 1190, 1080],
          "rir_position": [1134, 1050, 1190, 1080],
          "rir_confidence": 0.74237,
          "value": "55.20",
          "validation_sources": ["human"],
          "type": "number"
        }
      ]
    },
    {
      "id": "197500",
      "category": "section",
      "schema_id": "line_items_section",
      "children": [
        {
          "id": "197501",
          "category": "multivalue",
          "schema_id": "line_items",
          "children": [
            {
              "id": "198139",
              "category": "tuple",
              "schema_id": "line_item",
              "children": [
                {
                  "id": "198140",
                  "category": "datapoint",
                  "schema_id": "item_desc",
                  "page": 1,
                  "position": [173, 883, 395, 904],
                  "rir_position": null,
                  "rir_confidence": null,
                  "value": "Red Rose",
                  "validation_sources": [],
                  "type": "string"
                },
                {
                  "id": "198142",
                  "category": "datapoint",
                  "schema_id": "item_net_unit_price",
                  "page": 1,
                  "position": [714, 846, 768, 870],
                  "rir_position": null,
                  "rir_confidence": null,
                  "value": "1532.02",
                  "validation_sources": ["human"],
                  "type": "number"
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}

All connector endpoints, representing hooks at particular points in the document lifetime, are simple verbs that receive a POSTed JSON payload and may be expected to return JSON in response.

Errors

If a connector does not implement an endpoint, it may return HTTP status 404. An endpoint may fail, returning either HTTP 4xx or HTTP 5xx; for some hooks (like validate and save), this may trigger a user interface message; either the error key of a JSON response is used, or the response body itself in case it is not JSON. The connector hook save can be called in asynchronous (default) as well as synchronous mode (useful for embedded mode).

Data format

The received JSON object contains two keys, meta carrying the metadata and content carrying endpoint-specific content.

The metadata identify the concerned document, containing attributes:

Key Type Description
document_url URL document URL
arrived_at timestamp A time of document arrival in Rossum (ISO 8601)
original_file URL Permanent URL for the document original file
original_filename string Filename of the document on arrival in Rossum
queue_name string Name of the document's queue
workspace_name string Name of the document's workspace
organization_name string Name of the document's organization
annotation URL Annotation URL
queue URL Document's queue URL
workspace URL Document's workspace URL
organization URL Document's organization URL
modifier URL Modifier URL
modifier_metadata object Client data of the modifier, see metadata
queue_metadata object Client data of the queue, see metadata
annotation_metadata object Client data of the annotation, see metadata
rir_poll_id string Internal extractor processing id
updated_datapoint_ids list[string] Ids of objects that were recently modified by user

A common class of content is the annotation tree, which is a JSON object that can contain nested datapoint objects, and matches the schema datapoint tree.

Intermediate nodes have the following structure:

Key Type Description
id integer A unique id of the given node
schema_id string Reference mapping the node to the schema tree
category string One of section, multivalue, tuple
children list A list of other nodes

Datapoint (leaf) nodes contain the actual data:

Key Type Description
id integer A unique id of the given node
schema_id string Reference mapping the node to the schema tree
category string datapoint
type string One of string, date or number, as specified in the schema
value string The datapoint value, represented as a string but normalized so that it is machine-readable: ISO format for dates, a decimal for numbers
page integer A 1-based integer index of the page, optional
position list[float] List of four floats describing the x1, y1, x2, y2 bounding box coordinates
rir_position list[float] Bounding box of the value as detected by the data extractor. Format is the same as for position.
rir_confidence float Confidence (estimated probability) that this field was extracted correctly.
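
Because the annotation tree nests datapoints inside sections, multivalues and tuples, a connector typically starts by flattening it. The following is a minimal Python sketch of a recursive walk over the content list of a request. It assumes the structure shown in the example payload above, and the function names are illustrative.

```python
def iter_datapoints(nodes):
    """Yield every datapoint (leaf) node found in a list of tree nodes."""
    for node in nodes:
        if node["category"] == "datapoint":
            yield node
        else:
            # section, multivalue and tuple nodes carry a list of children
            yield from iter_datapoints(node.get("children", []))


def values_by_schema_id(content):
    """Map each schema_id to the list of values found under it.

    A list is used because a schema_id inside a multivalue may occur
    once per row.
    """
    values = {}
    for datapoint in iter_datapoints(content):
        values.setdefault(datapoint["schema_id"], []).append(datapoint["value"])
    return values
```

For the example request above, values_by_schema_id(payload["content"]) would map "invoice_number" to ["FV103828806S"].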

Hook: validate

This hook is called after the operator opens a document in the Rossum verification interface, and then every time the operator updates a field. The request path is fixed to /validate and cannot be changed.

It may:

Both the messages and the updated data are shown in the verification interface. Moreover, the messages may block export in the case of errors.

This hook should be fast as it is part of an interactive workflow.

Receives an annotation tree as content.

Example of validate response

{
  "messages": [
    {
      "content": "Invalid invoice number format",
      "id": "197467",
      "type": "error"
    }
  ],
  "operations": [
    {
      "op": "replace",
      "id": "198143",
      "value": {
        "content": {
          "value": "John",
          "position": [103, 110, 121, 122],
          "page": 1
        },
        "hidden": false,
        "options": [],
        "validation_sources": ["human"]
      }
    },
    {
      "op": "remove",
      "id": "884061"
    },
    {
      "op": "add",
      "id": "884060",
      "value": [
        {
          "schema_id": "item_description",
          "content": {
            "page": 1,
            "position": [162, 852, 371, 875],
            "value": "Bottle"
          }
        }
      ]
    }
  ],
  "updated_datapoints": [
    {
      "id": "198142",
      "value": "1532.00"
    }
  ]
}

Returns a JSON object with the following lists: messages, operations and updated_datapoints.

Key messages (required)

The message object contains attributes:

Key Type Description
id integer Optional unique id of the concerned datapoint; omit for document-wide issues
type enum One of: error, warning or info.
content string A descriptive message to be shown to the user

For example, you may use error for fatals like a missing required field, whereas info is suitable to decorate a supplier company id with its name as looked up in the suppliers database.
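
A validate response with messages can be assembled from plain dictionaries. The following is a minimal Python sketch; the helper names make_message and validate_response are hypothetical, and the attribute names follow the table above.

```python
ALLOWED_TYPES = ("error", "warning", "info")


def make_message(content, level="info", datapoint_id=None):
    """Build one message object; omit the id for document-wide issues."""
    if level not in ALLOWED_TYPES:
        raise ValueError(f"type must be one of {ALLOWED_TYPES}")
    message = {"type": level, "content": content}
    if datapoint_id is not None:
        message["id"] = datapoint_id
    return message


def validate_response(messages):
    """Wrap messages in the response body; the messages key is required."""
    return {"messages": list(messages)}
```

For instance, validate_response([make_message("Invalid invoice number format", "error", "197467")]) produces the messages part of the example response above.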

Key operations (optional)

Allows you to specify a sequence of operations that should be performed on particular datapoint objects.

To replace a datapoint value (or other supported attribute), use replace operation:

Key Type Description
op string Type of operation: replace
id integer Datapoint id
value object Updated data, format is the same as in Annotation Data. Only value, position, page, validation_sources, hidden and options attributes may be updated.

Please note that section, multivalue and tuple may not be updated.

To add a new row into a multivalue, use add operation:

Key Type Description
op string Type of operation: add
id integer Multivalue id (parent of new datapoint)
value list[object] Added row data. List of objects, format of the object is the same as in Annotation Data. schema_id attribute is required, only value, position, page, validation_sources, hidden and options attributes may be set.

The row will be appended to the current list of rows. Please note that only multivalue children datapoints may be added.

To remove a row from a multivalue, use remove operation:

Key Type Description
op string Type of operation: remove
id integer Datapoint id

Please note that only multivalue children datapoints may be removed.
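
The three operations can be produced by small builder functions. The following is a minimal Python sketch that mirrors the shapes used in the example validate response (values nested under a content key). The function names are hypothetical, and only the value attribute is set for brevity.

```python
def replace_op(datapoint_id, value):
    """Replace the value of an existing datapoint."""
    return {
        "op": "replace",
        "id": datapoint_id,
        "value": {"content": {"value": value}},
    }


def add_row_op(multivalue_id, row):
    """Append a row to a multivalue; `row` maps schema_id to value."""
    return {
        "op": "add",
        "id": multivalue_id,
        "value": [
            {"schema_id": schema_id, "content": {"value": value}}
            for schema_id, value in row.items()
        ],
    }


def remove_op(datapoint_id):
    """Remove a row (a child of a multivalue)."""
    return {"op": "remove", "id": datapoint_id}
```

The resulting objects go into the operations list of the validate response, in the order in which they should be applied.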

Key updated_datapoints (optional, deprecated)

We also support a simplified version of updates using updated_datapoints response key. It only supports updates (no add or remove operations) and is now deprecated. The updated datapoint object contains attributes:

Key Type Description
id string A unique id of the concerned datapoint, currently only datapoints of category datapoint can be updated
value string New value of the datapoint. Value is formatted according to the datapoint type (e.g. date is string representation of ISO 8601 format).
hidden boolean Toggle for hiding/showing of the datapoint, see datapoint
options list[object] Options of the datapoint -- valid only for type=enum, see enum options
position list[float] New position of the datapoint, list of four numbers.

The validate endpoint should always return a 200 OK status.

An error message returned from the connector prevents the user from exporting the document.

The hook is also called when document processing has finished (on the importing => to_review status change). It may be used to update or enhance annotation data before the user starts to validate the document. This initial validate hook is marked with the initial=true URL parameter.

Hook: save

This hook is called when the invoice transitions from the reviewing state. Connector may process the final document annotation and save it to the target system. It receives an annotation tree as content. The request path is fixed to /save and cannot be changed.

The save hook is called asynchronously (unless synchronous mode is set in the related connector object). The timeout of the save hook endpoint is 60 seconds.

The request should return an empty response with 204 No Content status for success.

For graceful failure with an error message, return a 422 Unprocessable Entity or 500 Internal Server Error status with a response in the same format as in the validate hook. If the hook fails with an HTTP error, the document transitions to the failed_export state; it is then available to the operators for manual review and requeuing to the to_review state in the user interface. Requeuing may also be done programmatically via the API, using a PATCH call to set the annotation status to to_review. Patching the annotation status to exporting triggers an export retry.
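
The status codes described above can be produced by a thin wrapper around the actual export logic. The following is a minimal, framework-independent Python sketch. The function handle_save and the export callable are hypothetical; export stands for whatever code writes the annotation to the target system and is assumed to raise an exception on failure.

```python
import json


def handle_save(payload, export):
    """Run the export and translate its outcome into (status, body).

    `payload` is the parsed request JSON with the meta and content keys;
    `export` is a caller-supplied callable (an assumption of this sketch).
    """
    try:
        export(payload["meta"], payload["content"])
    except Exception as exc:
        # Graceful failure: same message format as in the validate hook.
        body = {"messages": [{"type": "error", "content": str(exc)}]}
        return 422, json.dumps(body)
    # Success: empty response with 204 No Content.
    return 204, ""
```

Since the endpoint times out after 60 seconds, long-running exports may be better queued inside export and completed in the background.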

Webhook Extension

Webhooks are a more lightweight mechanism, used to send simple one-way notifications about specific events, typically associated with object lifecycle, to external components. A webhook is notified when a defined type of webhook event occurs.

For a description of how to create and manage webhooks, see the Webhook API.

Validating payloads from Rossum

For authorization of payloads, the shared secret method is used. When a secret token is set in webhook.config.secret, Rossum uses it to create a hash signature with each payload. This hash signature is passed along with each request in the headers as X-Elis-Signature.

The goal is to compute a hash using webhook.config.secret and the request body, and to ensure that it matches the signature produced by Rossum. Rossum uses an HMAC-SHA1 signature.

Example of webhook receiver, which verifies the validity of Rossum request

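import hashlib
import hmac

from flask import Flask, request, abort

app = Flask(__name__)

SECRET_KEY = "<Your secret key stored in webhook.config.secret>"  # never store this in code

@app.route("/test_webhook", methods=["POST"])
def test_webhook():
    # Expected signature: HMAC-SHA1 of the raw request body, keyed by the shared secret.
    digest = hmac.new(SECRET_KEY.encode(), request.data, hashlib.sha1).hexdigest()
    try:
        # The header has the form "sha1=<hexdigest>"; a missing header is treated as malformed.
        prefix, signature = request.headers.get("X-Elis-Signature", "").split("=", 1)
    except ValueError:
        abort(401, "Incorrect header format")

    # Compare as bytes in constant time to avoid leaking the digest through timing.
    if not (prefix == "sha1" and hmac.compare_digest(signature.encode(), digest.encode())):
        abort(401, "Authorization failed.")
    return "OK"  # a Flask view must return a response, not None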
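
To test such a receiver locally, it helps to produce the header value yourself. The following is a minimal Python sketch that computes a signature in the "sha1=<hexdigest>" form checked by the receiver above. The function name elis_signature is hypothetical.

```python
import hashlib
import hmac


def elis_signature(secret, body):
    """Return the X-Elis-Signature value for a raw request body (bytes)."""
    digest = hmac.new(secret.encode(), body, hashlib.sha1).hexdigest()
    return "sha1=" + digest
```

The signature must be computed over the exact bytes that are sent; re-serializing the JSON before hashing would generally change the digest.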

Webhook Events

Event Actions Description
annotation_status changed The hook is notified whenever a status change occurs.

Annotation status data format

Example data sent to webhook

{
  "action": "changed",
  "event": "annotation_status",
  "annotation": {
    "document": "https://api.elis.rossum.ai/v1/documents/314621",
    "id": 314521,
    "queue": "https://api.elis.rossum.ai/v1/queues/8236",
    "schema": "https://api.elis.rossum.ai/v1/schemas/223",
    "pages": [
      "https://api.elis.rossum.ai/v1/pages/551518"
    ],
    "modifier": null,
    "modified_at": null,
    "confirmed_at": null,
    "exported_at": null,
    "assigned_at": null,
    "status": "to_review",
    "previous_status": "importing",
    "rir_poll_id": "54f6b91cfb751289e71ddf12",
    "messages": null,
    "url": "https://api.elis.rossum.ai/v1/annotations/314521",
    "content": "https://api.elis.rossum.ai/v1/annotations/314521/content",
    "time_spent": 0,
    "metadata": {}
  },
  "document": {
    "id": 314621,
    "url": "https://api.elis.rossum.ai/v1/documents/314621",
    "s3_name": "272c2f41ae84a4f19a422cb432a490bb",
    "mime_type": "application/pdf",
    "arrived_at": "2019-02-06T23:04:00.933658Z",
    "original_file_name": "test_invoice_1.pdf",
    "content": "https://api.elis.rossum.ai/v1/documents/314621/content",
    "metadata": {}
  }
}

Key Type Description
action string Type of action that occurred in the event
event string Type of event that occurred
document object document object (attribute annotations is excluded)
annotation object annotation object (enriched with attribute previous_status)
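
A receiver usually only needs a few fields of this payload. The following is a minimal Python sketch that extracts the status transition from a raw request body, assuming the structure of the example above. The function name status_change is hypothetical.

```python
import json


def status_change(raw_body):
    """Return (annotation_url, previous_status, status), or None.

    None is returned for payloads that are not an annotation_status
    "changed" notification.
    """
    payload = json.loads(raw_body)
    if payload.get("event") != "annotation_status":
        return None
    if payload.get("action") != "changed":
        return None
    annotation = payload["annotation"]
    return annotation["url"], annotation["previous_status"], annotation["status"]
```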

Custom UI Extension

Sometimes users might want to extend the behavior of UI validation view with something special. That should be the goal of custom UI extensions.

Buttons

Currently, there are two different ways of using a custom button:

  1. Popup Button - opens a specific URL in the web browser
  2. Validate Button - triggers a standard validate call to connector

If you would like to read more about how to create a button, see the Button schema.

Popup Button opens a website completely managed by the user in a separate tab. It runs in parallel to the validation interface session in the app. Such a website can be used for any interface that will assist operators in the reviewing process.

Example Use Cases of Popup Button:

  1. opening an email linked to the annotated document
  2. creating a new item in external database according to extracted data

Communication with the Validation Interface

Although it is entirely possible to query our API from within your popup, a direct way of communicating with the validation interface may be desirable for the sake of simplicity. Such communication uses the standard browser API window.postMessage.

You will need to use window.addEventListener in order to receive messages from the validation interface:

Once the listener is in place, you can post one of supported message types:

Providing message type to postMessage lets Rossum interface know what operation user requests and determines the type of the answer which could be used to match appropriate response.

Validate button

If popup_url key is missing in button's schema, clicking the button will trigger a standard validate call to connector. In such call, updated_datapoint_ids will contain the ID of the pressed button.

Note: if you’re missing some annotation data that you’d like to receive in a similar way, do contact our support team. We’re collecting feedback to further expand this list.

Embedded Mode

In some use-cases, it is desirable to use only the per-annotation validation view of the Rossum application. Rossum may be integrated with other systems using so-called embedded mode.

In embedded mode, a special URL is constructed and then used in an iframe or popup browser window to show the Rossum annotation view. Some view navigation widgets are hidden (such as the home, postpone and delete buttons), so that the user is only allowed to update and confirm all field values.

Embedded mode workflow

The host application first uploads a document using the standard Rossum API. During this process, an annotation object is created. The host application can query the status of the annotation object using the annotation endpoint and wait for it to become to_review (ready for checking).

As soon as importing of the annotation object has finished, an authenticated user may call the start_embedded endpoint to obtain a URL that is to be included in an iframe or popup browser window of the host application. The parameters of the call are return_url and cancel_url, to which the browser is redirected when the user finishes the annotation.

The URL contains a security token that is used by the embedded Rossum application to access the Rossum API. When the checking of the document has finished, the user clicks the done button and the host application is notified about the finished annotation through the save endpoint of the connector HTTP API. By default, this call is made asynchronously, which causes a lag (up to a few seconds) between the click on the done button and the call to the save endpoint. However, it is possible to switch the calls to synchronous mode by setting the connector asynchronous toggle to false (see connector for reference).
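
The waiting step of this workflow can be sketched as a simple polling loop. The following is a minimal Python sketch in which fetch_status is a caller-supplied callable that returns the current annotation status, for example by reading the status attribute from the annotation endpoint. The function name, the timeout and the polling interval are assumptions of this sketch.

```python
import time


def wait_for_review(fetch_status, timeout=60.0, interval=2.0, sleep=time.sleep):
    """Poll until the annotation is ready for checking.

    Returns True once the status is to_review and False if the import
    failed; raises TimeoutError if neither happens within `timeout`
    seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status == "to_review":
            return True
        if status == "failed_import":
            return False
        if time.monotonic() >= deadline:
            raise TimeoutError("annotation did not reach to_review in time")
        sleep(interval)
```

Once the function returns True, the host application can call the start_embedded endpoint and open the returned URL.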

API Reference

For an introduction to the Rossum API, see Overview.

Most of the API endpoints require user to be authenticated, see Authentication for details.

Organization

Example organization object

{
  "id": 406,
  "url": "https://api.elis.rossum.ai/v1/organizations/406",
  "name": "East West Trading Co",
  "workspaces": [
    "https://api.elis.rossum.ai/v1/workspaces/7540"
  ],
  "users": [
    "https://api.elis.rossum.ai/v1/users/10775"
  ],
  "ui_settings": {},
  "metadata": {}
}

An organization is a basic unit that contains all objects required to fully use the Rossum platform.

Attribute Type Default Description Read-only
id integer Id of the organization true
name string Name of the organization (not visible in UI)
url URL URL of the organization true
workspaces list[URL] List of workspaces objects in the organization true
users list[URL] List of users in the organization true
ui_settings object {} Organization-wide frontend UI settings (e.g. locales). Rossum internal.
metadata object {} Client data, see metadata.

Create new organization

curl -s -X POST -H 'Content-Type: application/json' \
  -d '{"template_name": "UK Demo Template", "organization_name": "East West Trading Co", "user_fullname": "John Doe", "user_email": "john@east-west-trading.com", "user_password": "owo1aiG9ua9Aihai", "user_ui_settings": { "locale": "en" }, "create_key": "13156106d6f185df24648ac7ff20f64f1c5c06c144927be217189e26f8262c4a"}' \
  'https://api.elis.rossum.ai/v1/organizations/create'
{
  "organization": {
    "id": 105,
    "url": "https://api.elis.rossum.ai/v1/organizations/105",
    "name": "East West Trading Co",
    "workspaces": [
      "https://api.elis.rossum.ai/v1/workspaces/160"
    ],
    "users": [
      "https://api.elis.rossum.ai/v1/users/173"
    ],
    "ui_settings": {},
    "metadata": {}
  }
}

POST /v1/organizations/create

Create new organization and related objects (workspace, queue, user, schema, inbox).

Attribute Type Description
template_name enum Template to use for new organization (see below)
organization_name string Name of the organization. Will be also used as a base for inbox e-mail address.
user_fullname string Full user name
user_email EMAIL Valid email of the user (also used as Rossum login)
user_password string Initial user password
user_ui_settings object Initial UI settings, default: {"locale": "en"}
create_key string A key that allows to create an organization

You need a create_key in order to create an organization. Please contact support@rossum.ai to obtain one.

Selected template_name affects default schema and extracted fields. Available templates:

Template name Description
EU Demo Template VAT invoices with EU-style bank information
US Demo Template Tax invoices in the US and internationally
UK Demo Template Also best for India, Canada and Australia
CZ Demo Template Czech standard invoices
Empty Organization Template Empty organization, suitable for further customization

Response

Status: 200

Returns object with organization key and organization object value.
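
The same request can be prepared in Python using only the standard library. The following is a minimal sketch that builds the request without sending it. The function name build_create_organization_request is hypothetical, and the attribute names are taken from the table above.

```python
import json
import urllib.request

CREATE_URL = "https://api.elis.rossum.ai/v1/organizations/create"


def build_create_organization_request(template_name, organization_name,
                                      user_fullname, user_email,
                                      user_password, create_key, locale="en"):
    """Build (but do not send) the POST request creating an organization."""
    payload = {
        "template_name": template_name,
        "organization_name": organization_name,
        "user_fullname": user_fullname,
        "user_email": user_email,
        "user_password": user_password,
        "user_ui_settings": {"locale": locale},
        "create_key": create_key,
    }
    return urllib.request.Request(
        CREATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The request can then be sent with urllib.request.urlopen(request); the response body is the JSON object with the organization key described above.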

List all organizations

List all organizations

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/organizations'
{
  "pagination": {
    "total": 1,
    "total_pages": 1,
    "next": null,
    "previous": null
  },
  "results": [
    {
      "id": 406,
      "url": "https://api.elis.rossum.ai/v1/organizations/406",
      "name": "East West Trading Co",
      "workspaces": [
        "https://api.elis.rossum.ai/v1/workspaces/7540"
      ],
      "users": [
        "https://api.elis.rossum.ai/v1/users/10775"
      ],
      "ui_settings": {},
      "metadata": {}
    }
  ]
}

GET /v1/organizations

Retrieve all organization objects.

Supported filters: id, name

Supported ordering: id, name

Response

Status: 200

Returns paginated response with a list of organization objects. Usually, there would only be one organization.
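
A paginated response can be consumed by following the pagination links until none is left. The following is a minimal Python sketch, assuming that pagination.next holds the URL of the next page, or null on the last page as in the example above. get_json is a caller-supplied callable that performs the authenticated GET request and returns the parsed JSON; both names are hypothetical.

```python
def iter_all_results(first_url, get_json):
    """Yield the objects from every page of a paginated listing."""
    url = first_url
    while url:
        page = get_json(url)
        yield from page["results"]
        # Assumed to be the next page URL, or None on the last page.
        url = page["pagination"]["next"]
```

For example, list(iter_all_results("https://api.elis.rossum.ai/v1/organizations", get_json)) would collect all organization objects visible to the user.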

Retrieve an organization

Get organization object 406

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/organizations/406'
{
  "id": 406,
  "url": "https://api.elis.rossum.ai/v1/organizations/406",
  "name": "East West Trading Co",
  "workspaces": [
    "https://api.elis.rossum.ai/v1/workspaces/7540"
  ],
  "users": [
    "https://api.elis.rossum.ai/v1/users/10775"
  ],
  "ui_settings": {},
  "metadata": {}
}

GET /v1/organizations/{id}

Get an organization object.

Response

Status: 200

Returns organization object.

User

Example user object

{
  "id": 10775,
  "url": "https://api.elis.rossum.ai/v1/users/10775",
  "first_name": "John",
  "last_name": "Doe",
  "email": "john-doe@east-west-trading.com",
  "date_joined": "2018-09-19T13:44:56.000000Z",
  "username": "john-doe@east-west-trading.com",
  "groups": [
    "https://api.elis.rossum.ai/v1/groups/3"
  ],
  "organization": "https://api.elis.rossum.ai/v1/organizations/406",
  "queues": [
    "https://api.elis.rossum.ai/v1/queues/8199"
  ],
  "is_active": true,
  "last_login": "2019-02-07T16:20:18.652253Z",
  "ui_settings": {},
  "metadata": {}
}

A user object represents an individual user of Rossum. Every user is assigned to an organization.

A user can be assigned a user role (permission group): viewer, annotator, manager or admin.

A user may be assigned to one or more queues and can only access annotations from the assigned queues. This restriction does not apply to admin users, who may access annotations from all queues.

Users cannot be deleted, but they can be disabled (set is_active to false). The email field cannot be changed through the API for security reasons.
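Since users are disabled rather than deleted, a partial update of is_active is the natural tool. A minimal sketch of building such a PATCH request with the Python standard library follows; the helper name and the placeholder token are illustrative, and the request is only constructed, not sent.

```python
import json
import urllib.request

API_BASE = "https://api.elis.rossum.ai/v1"


def build_disable_user_request(user_id: int, token: str) -> urllib.request.Request:
    """Build (but do not send) a PATCH request that sets is_active to false."""
    body = json.dumps({"is_active": False}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/users/{user_id}",
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
    )


request = build_disable_user_request(10997, "YOUR_API_TOKEN")
# Passing the request to urllib.request.urlopen() would send it.
print(request.get_method(), request.full_url)
```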

Attribute Type Default Description Read-only
id integer Id of the user true
url URL URL of the user true
first_name string First name of the user
last_name string Last name of the user
email string Email of the user true
date_joined datetime Date the user joined
username string Username of the user
groups list[URL] [] List of user roles (permission groups)
organization URL Related organization
queues list[URL] [] List of queues user is assigned to.
is_active bool true Whether user is enabled or disabled
last_login datetime Date of last login
ui_settings object {} User-related frontend UI settings (e.g. locales). Rossum internal.
metadata object {} Client data, see metadata.

List all users

List all users in the organization.

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/users'
{
  "pagination": {
    "total": 1,
    "total_pages": 1,
    "next": null,
    "previous": null
  },
  "results": [
    {
      "id": 10775,
      "url": "https://api.elis.rossum.ai/v1/users/10775",
      "first_name": "John",
      "last_name": "Doe",
      "email": "john-doe@east-west-trading.com",
      "date_joined": "2018-09-19T13:44:56.000000Z",
      "username": "john-doe@east-west-trading.com",
      ...
    }
  ]
}

GET /v1/users

Retrieve all user objects.

Supported filters: id, organization, username, first_name, last_name, email, is_active, last_login, groups, queues

Supported ordering: id, username, first_name, last_name, email, last_login, date_joined

Response

Status: 200

Returns paginated response with a list of user objects.

Create new user

Create new user in organization 406

curl -s -X POST -H 'Content-Type: application/json' -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  -d '{"organization": "https://api.elis.rossum.ai/v1/organizations/406", "username": "jane@east-west-trading.com", "email": "jane@east-west-trading.com", "queues": ["https://api.elis.rossum.ai/v1/queues/8236"], "groups": ["https://api.elis.rossum.ai/v1/groups/2"]}' \
  'https://api.elis.rossum.ai/v1/users'
{
  "id": 10997,
  "url": "https://api.elis.rossum.ai/v1/users/10997",
  "first_name": "",
  "last_name": "",
  "email": "jane@east-west-trading.com",
  "date_joined": "2019-02-09T22:16:38.969904Z",
  "username": "jane@east-west-trading.com",
  "groups": [
    "https://api.elis.rossum.ai/v1/groups/2"
  ],
  "organization": "https://api.elis.rossum.ai/v1/organizations/406",
  "queues": [
    "https://api.elis.rossum.ai/v1/queues/8236"
  ],
  "is_active": true,
  "last_login": null,
  "ui_settings": {}
}

POST /v1/users

Create a new user object.

Response

Status: 201

Returns created user object.

Retrieve a user

Get user object 10997

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/users/10997'
{
  "id": 10997,
  "url": "https://api.elis.rossum.ai/v1/users/10997",
  "first_name": "Jane",
  "last_name": "Bond",
  "email": "jane@east-west-trading.com",
  "date_joined": "2019-02-09T22:16:38.969904Z",
  "username": "jane@east-west-trading.com",
  ...
}

GET /v1/users/{id}

Get a user object.

Response

Status: 200

Returns user object.

Update a user

Update user object 10997

curl -s -X PUT -H 'Content-Type: application/json' -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  -d '{"organization": "https://api.elis.rossum.ai/v1/organizations/406", "username": "jane@east-west-trading.com", "queues": ["https://api.elis.rossum.ai/v1/queues/8236"], "groups": ["https://api.elis.rossum.ai/v1/groups/2"], "first_name": "Jane"}' \
  'https://api.elis.rossum.ai/v1/users/10997'
{
  "id": 10997,
  "url": "https://api.elis.rossum.ai/v1/users/10997",
  "first_name": "Jane",
  "last_name": "",
  "email": "jane@east-west-trading.com",
  ...
}

PUT /v1/users/{id}

Update user object.

Response

Status: 200

Returns updated user object.

Update part of a user

Update first_name of user object 10997

curl -X PATCH -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"first_name": "Emma"}' \
  'https://api.elis.rossum.ai/v1/users/10997'
{
  "id": 10997,
  "url": "https://api.elis.rossum.ai/v1/users/10997",
  "first_name": "Emma",
  "last_name": "",
  ...
}

PATCH /v1/users/{id}

Update part of user object.

Response

Status: 200

Returns updated user object.

Password

For security reasons, user passwords cannot be set directly using the standard CRUD operations. Instead, the following endpoints can be used for resetting and changing passwords.

Change password

Change password of user object 10997

curl -X POST -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"new_password1": "new_password", "new_password2": "new_password", "old_password": "old_password"}' \
  'https://api.elis.rossum.ai/v1/auth/password/change'
{
  "id": 10997,
  "url": "https://api.elis.rossum.ai/v1/users/10997",
  ...
}

POST /v1/auth/password/change

Change password of current user.

Response

Status: 200

Returns user object.
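The request body for the password change endpoint carries the new password twice. The sketch below builds that body and checks client-side that both copies match; that the server rejects a mismatch is an assumption based on the field names, and the helper name is illustrative.

```python
import json


def build_password_change_payload(old_password: str, new_password: str, confirmation: str) -> str:
    """Return the JSON body for POST /v1/auth/password/change."""
    # Assumption: new_password2 is a confirmation of new_password1.
    if new_password != confirmation:
        raise ValueError("new password and its confirmation differ")
    return json.dumps(
        {
            "new_password1": new_password,
            "new_password2": confirmation,
            "old_password": old_password,
        }
    )


payload = build_password_change_payload("old_password", "new_password", "new_password")
print(payload)
```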

Reset password

Reset password of a user with username jane@east-west-trading.com

curl -X POST -H 'Content-Type: application/json' \
  -d '{"username": "jane@east-west-trading.com"}' \
  'https://api.elis.rossum.ai/v1/auth/password/reset'
{"detail": "Password reset e-mail has been sent."}

POST /v1/auth/password/reset

Reset the password of users specified by their usernames. Each user is sent an email with a verification URL leading to a web form where they can set their password.

Response

Status: 200

User Role

Example role object

{
  "id": 3,
  "url": "https://api.elis.rossum.ai/v1/groups/3",
  "name": "admin"
}

User role is a group of permissions that are assigned to the user. Permissions are assigned to individual operations on objects.

Attribute Type Default Description Read-only
id integer Id of the user role true
url URL URL of the user role true
name string Name of the user role true

There are four pre-defined roles:

Role Description
viewer Read-only user, cannot change any API object. May be useful for automated data export or auditor access.
annotator User allowed to change annotations and their datapoints, and to import documents.
manager In addition to permissions of annotator the user is also allowed to access usage-reports.
admin User can modify API objects to set-up organization (e.g. workspaces, queues, schemas)

A user can only access annotations from queues they are assigned to, except for users with the admin role, who can access any queue.

Permissions assigned to the role cannot be changed through the API.
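The groups attribute of a user holds role URLs rather than role names, so a lookup from name to URL is often handy when creating users. A minimal sketch, assuming the response shape of the GET /v1/groups endpoint; the helper name is illustrative and the sample data is abbreviated.

```python
def role_urls_by_name(groups_response: dict) -> dict:
    """Map each role name to its URL, given a decoded GET /v1/groups response."""
    return {group["name"]: group["url"] for group in groups_response["results"]}


# Abbreviated sample shaped like a GET /v1/groups response.
sample_response = {
    "pagination": {"total": 3, "total_pages": 1, "next": None, "previous": None},
    "results": [
        {"url": "https://api.elis.rossum.ai/v1/groups/1", "name": "viewer"},
        {"url": "https://api.elis.rossum.ai/v1/groups/2", "name": "annotator"},
        {"url": "https://api.elis.rossum.ai/v1/groups/3", "name": "admin"},
    ],
}

roles = role_urls_by_name(sample_response)
# The value can be placed in the "groups" list of a user payload.
new_user_groups = [roles["annotator"]]
print(new_user_groups)
```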

List all user roles

List all user roles (groups)

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/groups'
{
  "pagination": {
    "total": 3,
    "total_pages": 1,
    "next": null,
    "previous": null
  },
  "results": [
    {
      "url": "https://api.elis.rossum.ai/v1/groups/1",
      "name": "viewer"
    },
    {
      "url": "https://api.elis.rossum.ai/v1/groups/2",
      "name": "annotator"
    },
    {
      "url": "https://api.elis.rossum.ai/v1/groups/3",
      "name": "admin"
    }
  ]
}

GET /v1/groups

Retrieve all user role (group) objects.

Supported filters: name

Supported ordering: name

Response

Status: 200

Returns paginated response with a list of group objects.

Retrieve a user role

Get group object 2

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/groups/2'
{
  "url": "https://api.elis.rossum.ai/v1/groups/2",
  "name": "annotator"
}

GET /v1/groups/{id}

Get a user role object.

Response

Status: 200

Returns group object.

Workspace

Example workspace object

{
  "id": 7540,
  "name": "East West Trading Co",
  "url": "https://api.elis.rossum.ai/v1/workspaces/7540",
  "autopilot": true,
  "organization": "https://api.elis.rossum.ai/v1/organizations/406",
  "queues": [
    "https://api.elis.rossum.ai/v1/queues/8199",
    "https://api.elis.rossum.ai/v1/queues/8236"
  ],
  "metadata": {}
}

A workspace object is a container of queue objects.

Attribute Type Default Description Read-only
id integer Id of the workspace true
name string Name of the workspace
url URL URL of the workspace true
autopilot bool false Whether to automatically confirm datapoints (hide eyes) from previously seen annotations
organization URL Related organization
queues list[URL] [] List of queues that belong to the workspace true
metadata object {} Client data, see metadata.

List all workspaces

List all workspaces

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/workspaces'
{
  "pagination": {
    "total": 1,
    "total_pages": 1,
    "next": null,
    "previous": null
  },
  "results": [
    {
      "id": 7540,
      "name": "East West Trading Co",
      "url": "https://api.elis.rossum.ai/v1/workspaces/7540",
      "autopilot": true,
      "organization": "https://api.elis.rossum.ai/v1/organizations/406",
      "queues": [
        "https://api.elis.rossum.ai/v1/queues/8199",
        "https://api.elis.rossum.ai/v1/queues/8236"
      ],
      "metadata": {}
    }
  ]
}

GET /v1/workspaces

Retrieve all workspace objects.

Supported filters: id, name, organization, autopilot

Supported ordering: id, name

Response

Status: 200

Returns paginated response with a list of workspace objects.

Create a new workspace

Create new workspace in organization 406 named Test Workspace

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"name": "Test Workspace", "organization": "https://api.elis.rossum.ai/v1/organizations/406"}' \
  'https://api.elis.rossum.ai/v1/workspaces'
{
  "id": 7694,
  "name": "Test Workspace",
  "url": "https://api.elis.rossum.ai/v1/workspaces/7694",
  "autopilot": false,
  "organization": "https://api.elis.rossum.ai/v1/organizations/406",
  "queues": [],
  "metadata": {}
}

POST /v1/workspaces

Create a new workspace object.

Response

Status: 201

Returns created workspace object.

Retrieve a workspace

Get workspace object 7694

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/workspaces/7694'
{
  "id": 7694,
  "name": "Test Workspace",
  "url": "https://api.elis.rossum.ai/v1/workspaces/7694",
  "autopilot": false,
  "organization": "https://api.elis.rossum.ai/v1/organizations/406",
  "queues": [],
  "metadata": {}
}

GET /v1/workspaces/{id}

Get a workspace object.

Response

Status: 200

Returns workspace object.

Update a workspace

Update workspace object 7694

curl -X PUT -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"name": "My Workspace", "organization": "https://api.elis.rossum.ai/v1/organizations/406"}' \
  'https://api.elis.rossum.ai/v1/workspaces/7694'
{
  "id": 7694,
  "name": "My Workspace",
  "url": "https://api.elis.rossum.ai/v1/workspaces/7694",
  "autopilot": false,
  "organization": "https://api.elis.rossum.ai/v1/organizations/406",
  "queues": [],
  "metadata": {}
}

PUT /v1/workspaces/{id}

Update workspace object.

Response

Status: 200

Returns updated workspace object.

Update part of a workspace

Update name of workspace object 7694

curl -X PATCH -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"name": "Important Workspace"}' \
  'https://api.elis.rossum.ai/v1/workspaces/7694'
{
  "id": 7694,
  "name": "Important Workspace",
  "url": "https://api.elis.rossum.ai/v1/workspaces/7694",
  "autopilot": false,
  "organization": "https://api.elis.rossum.ai/v1/organizations/406",
  "queues": [],
  "metadata": {}
}

PATCH /v1/workspaces/{id}

Update part of workspace object.

Response

Status: 200

Returns updated workspace object.

Delete a workspace

Delete workspace 7694

curl -X DELETE -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/workspaces/7694'

DELETE /v1/workspaces/{id}

Delete workspace object.

Response

Status: 204

Queue

Example queue object

{
  "id": 8198,
  "name": "Received invoices",
  "url": "https://api.elis.rossum.ai/v1/queues/8198",
  "workspace": "https://api.elis.rossum.ai/v1/workspaces/7540",
  "connector": null,
  "webhooks": [],
  "schema": "https://api.elis.rossum.ai/v1/schemas/31336",
  "inbox": "https://api.elis.rossum.ai/v1/inboxes/1229",
  "users": [
    "https://api.elis.rossum.ai/v1/users/10775"
  ],
  "session_timeout": "01:00:00",
  "rir_url": "https://all.rir.rossum.ai",
  "rir_params": null,
  "counts": {
    "importing": 0,
    "split": 0,
    "failed_import": 0,
    "to_review": 2,
    "reviewing": 0,
    "exporting": 0,
    "postponed": 0,
    "failed_export": 0,
    "exported": 0,
    "deleted": 0,
    "purged": 0
  },
  "default_score_threshold": null,
  "automation_enabled": false,
  "automation_level": "never",
  "preserve_annotation_schema": false,
  "locale": "en_US",
  "metadata": {},
  "settings": {"columns": [{"schema_id": "tags"}]}
}

A queue object represents a basic organizational unit of annotations. Annotations are imported to a queue either through the REST API upload endpoint or by sending an email to a related inbox. Export is also performed on a queue, using the export endpoint.

A queue also specifies a schema for annotations and a connector.

Annotators and viewers only see queues they are assigned to.

Attribute Type Default Description Read-only
id integer Id of the queue true
name string Name of the queue
url URL URL of the queue true
workspace URL Workspace in which the queue should be placed
connector URL null Connector associated with the queue
webhooks list[URL] [] Webhooks associated with the queue
schema URL Schema which will be applied to annotations in this queue
inbox URL null Inbox for import to this queue
users list[URL] [] Users associated with this queue
session_timeout timedelta 1 hour Time before an annotation is returned from the reviewing status to to_review
rir_url URL null URL representing the particular AI Core Engine variant used for document processing. Usually https://all.rir.rossum.ai, will vary for custom model training users.
rir_params string null URL parameters to be passed to the AI Core Engine, see below
counts object Count of annotations per status true
default_score_threshold float [0;1] null Threshold used to automatically validate field content based on AI confidence scores. If not set, global default 0.975 is used.
automation_enabled bool false Toggle for switching automation on/off
automation_level string never Set level of automation, see Automation level.
preserve_annotation_schema bool false (true since February 1, 2020) Whether to preserve annotation.schema (not updating to the queue.schema whenever queue.schema is replaced)
locale string en_GB Typical originating region of documents processed in this queue specified in the locale format, see below.
metadata object {} Client data, see metadata.
settings object {} Queue settings object; contains the list of schema ids to be shown on the dashboard.

More specific AI Core Engine parameters influencing the extraction may be set using rir_params field. So far, these parameters are publicly available:

The locale field is a hint for the AI Core Engine on how to resolve some ambiguous cases during data extraction, concerning e.g. date formats or decimal separators that may depend on the locale. For example, in the US the typical date format is mm/dd/yyyy, whilst in Europe it is dd.mm.yyyy. A date such as "12. 6. 2018" will be extracted as Jun 12 when locale is en_GB, while the same date will be extracted as Dec 6 when locale is en_US.
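The ambiguity can be reproduced with a few lines of code. The sketch below only illustrates the day-first versus month-first reading described above; it is not how the AI Core Engine works, and the function name is illustrative.

```python
from datetime import date


def parse_ambiguous_date(text: str, locale: str) -> date:
    """Read a dotted date as month-first for en_US and day-first otherwise."""
    first, second, year = (int(part) for part in text.replace(" ", "").split("."))
    if locale == "en_US":
        month, day = first, second
    else:
        day, month = first, second
    return date(year, month, day)


print(parse_ambiguous_date("12. 6. 2018", "en_GB"))  # June 12
print(parse_ambiguous_date("12. 6. 2018", "en_US"))  # December 6
```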

Automation level

The queue attribute automation_level sets the circumstances under which an annotation is auto-exported after the AI-based data extraction, without validation in the UI (skipping the to_review and reviewing states).

The attribute accepts the following options:

Automation level Description
always Auto-export all documents with no validation errors. When there is an error triggered for a non-required field, such values are deleted and export is re-tried.
confident Auto-export documents with at least one validation source and no validation errors.
never Annotation is not automatically exported and must be validated in UI manually.
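A small helper can guard against typos in the level name before a queue is updated. This is a sketch only: it builds the JSON body for PATCH /v1/queues/{id} and validates the level client-side against the three options listed above; the helper name is illustrative.

```python
import json

AUTOMATION_LEVELS = ("always", "confident", "never")


def build_automation_patch(level: str) -> str:
    """Return a JSON body that sets automation_level on a queue."""
    if level not in AUTOMATION_LEVELS:
        raise ValueError(f"unknown automation level: {level!r}")
    return json.dumps({"automation_level": level})


patch_body = build_automation_patch("confident")
print(patch_body)
```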

Import a document

Upload file using a form (multipart/form-data)

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  -F content=@document.pdf \
  'https://api.elis.rossum.ai/v1/queues/8236/upload'

Upload file in a request body

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  -H 'Content-Disposition: attachment; filename=document.pdf' --data-binary @file.pdf \
  'https://api.elis.rossum.ai/v1/queues/8236/upload'

Upload file in a request body with a filename in URL

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  --data-binary @file.pdf \
  'https://api.elis.rossum.ai/v1/queues/8236/upload/document.pdf'

Upload multiple files using multipart/form-data

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  -F content=@document1.pdf -F content=@document2.pdf \
  'https://api.elis.rossum.ai/v1/queues/8236/upload'

Upload file using basic authentication

curl -u 'east-west-trading-co@elis.rossum.ai:secret' \
  -F content=@document.pdf \
  'https://api.elis.rossum.ai/v1/queues/8236/upload'

Upload file with additional field values

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  -F content=@document.pdf \
  -F values='{"upload:organization_unit":"Sales"}' \
  'https://api.elis.rossum.ai/v1/queues/8236/upload'

POST /v1/queues/{id}/upload

POST /v1/queues/{id}/upload/{filename}

Uploads a document to the queue (starting in the importing state). This creates a document object and an empty annotation object.

The file can be sent as a part of multipart/form-data or, alternatively, in the request body. Uploading multiple files is supported; the total size of the uploaded data may not exceed 40 MB.

You can also pass additional values using the values form field. These may be used to initialize datapoint values by referencing them in rir_field_names in the schema.

For example upload:organization_unit field may be referenced in a schema like this:

   {
     "category": "datapoint",
     "id": "organization_unit",
     "label": "Org unit",
     "type": "string",
     "rir_field_names": ["upload:organization_unit"],
     ...
   }

Upload endpoint also supports basic authentication to enable easy integration with third-party systems.
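The request-body variant with the filename in the URL can be sketched in Python as follows. The request is only built, not sent. Reading "40 MB" as 40 * 1024 * 1024 bytes is an assumption, as are the helper name and the placeholder token.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.elis.rossum.ai/v1"
MAX_UPLOAD_BYTES = 40 * 1024 * 1024  # assumption: "40 MB" interpreted as 40 MiB


def build_upload_request(queue_id: int, filename: str, content: bytes, token: str) -> urllib.request.Request:
    """Build a POST /v1/queues/{id}/upload/{filename} request with the file as the body."""
    if len(content) > MAX_UPLOAD_BYTES:
        raise ValueError("upload exceeds the 40 MB limit")
    # Percent-encode the filename so spaces and slashes are safe in the URL path.
    quoted_name = urllib.parse.quote(filename, safe="")
    return urllib.request.Request(
        f"{API_BASE}/queues/{queue_id}/upload/{quoted_name}",
        data=content,
        method="POST",
        headers={"Authorization": f"token {token}"},
    )


upload_request = build_upload_request(8236, "my invoice.pdf", b"%PDF-1.4 ...", "YOUR_API_TOKEN")
# Passing the request to urllib.request.urlopen() would send it.
print(upload_request.full_url)
```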

Response

Status: 200

Response contains a list of annotations and documents created. Top-level keys annotation and document are obsolete and should be ignored.

Example upload response

{
  "results": [
    {
      "annotation": "https://api.elis.rossum.ai/v1/annotations/315509",
      "document": "https://api.elis.rossum.ai/v1/documents/315609"
    },
    {
      "annotation": "https://api.elis.rossum.ai/v1/annotations/315510",
      "document": "https://api.elis.rossum.ai/v1/documents/315610"
    }
  ],
  "annotation": "https://api.elis.rossum.ai/v1/annotations/315509",
  "document": "https://api.elis.rossum.ai/v1/documents/315609"
}
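Because the top-level annotation and document keys are obsolete, client code should read only the results list. A minimal sketch of doing so, using the example response above; the helper name is illustrative.

```python
def created_pairs(upload_response: dict) -> list:
    """Return (annotation URL, document URL) pairs from an upload response."""
    # Only "results" is used; the obsolete top-level keys are ignored.
    return [(item["annotation"], item["document"]) for item in upload_response["results"]]


upload_response = {
    "results": [
        {
            "annotation": "https://api.elis.rossum.ai/v1/annotations/315509",
            "document": "https://api.elis.rossum.ai/v1/documents/315609",
        },
        {
            "annotation": "https://api.elis.rossum.ai/v1/annotations/315510",
            "document": "https://api.elis.rossum.ai/v1/documents/315610",
        },
    ],
    "annotation": "https://api.elis.rossum.ai/v1/annotations/315509",
    "document": "https://api.elis.rossum.ai/v1/documents/315609",
}

pairs = created_pairs(upload_response)
# The numeric id is the last path segment of each URL.
annotation_ids = [int(annotation.rsplit("/", 1)[1]) for annotation, _ in pairs]
print(annotation_ids)
```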

Export annotations

Download CSV file with selected columns from annotations 315777 and 315778.

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/queues/8236/export?format=csv&columns=meta_file_name,invoice_id,date_issue,sender_name,amount_total&id=315777,315778'
meta_file_name,Invoice number,Invoice Date,Sender Name,Total amount
template_invoice.pdf,12345,2017-06-01,"Peter, Paul and Merry",900.00
quora.pdf,2183760194,2018-08-06,"Quora, Inc",500.00

Download CSV file with prepend_columns and append_columns from annotations 315777 and 315778.

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/queues/8236/export?format=csv&prepend_columns=meta_file_name&append_columns=meta_url&id=315777,315778'
meta_file_name,Invoice number,Invoice Date,Sender Name,Total amount,meta_url
template_invoice.pdf,12345,2017-06-01,"Peter, Paul and Merry",900.00,https://api.elis.rossum.ai/v1/annotations/315777
quora.pdf,2183760194,2018-08-06,"Quora, Inc",500.00,https://api.elis.rossum.ai/v1/annotations/315778

Download CSV file for a specific page when downloading large amounts of data.

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/queues/8236/export?format=csv&status=exported&page=1&page_size=1000'

Download XML file with all exported annotations

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/queues/8236/export?format=xml&status=exported'
<?xml version="1.0" encoding="utf-8"?>
<export>
  <results>
    <annotation url="https://api.elis.rossum.ai/v1/annotations/315777">
      <status>exported</status>
      <arrived_at>2019-10-13T21:33:01.509886Z</arrived_at>
      <exported_at>2019-10-14T12:00:01.000133Z</exported_at>
      <document url="https://api.elis.rossum.ai/v1/documents/315877">
        <file_name>template_invoice.pdf</file_name>
        <file>https://api.elis.rossum.ai/v1/documents/315877/content</file>
      </document>
      <modifier/>
      <schema url="https://api.elis.rossum.ai/v1/schemas/31336"/>
      <metadata/>
      <content>
        <section schema_id="invoice_details_section">
          <datapoint schema_id="invoice_id" type="string" rir_confidence="0.99">12345</datapoint>
          ...
        </section>
      </content>
    </annotation>
  </results>
  <pagination>
    <next/>
    <previous/>
    <total>1</total>
    <total_pages>1</total_pages>
  </pagination>
</export>

Download JSON file with all exported annotations that were imported on October 13th 2019.

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/queues/8236/export?format=json&status=exported&arrived_at_after=2019-10-13&arrived_at_before=2019-10-14'
{
  "pagination": {
    "total": 5,
    "total_pages": 1,
    "next": null,
    "previous": null
  },
  "results": [
    {
      "url": "https://api.elis.rossum.ai/v1/annotations/315777",
      "status": "exported",
      "arrived_at": "2019-10-13T21:33:01.509886Z",
      "exported_at": "2019-10-14T12:00:01.000133Z",
      "document": {
        "url": "https://api.elis.rossum.ai/v1/documents/315877",
        "file_name": "template_invoice.pdf",
        "file": "https://api.elis.rossum.ai/v1/documents/315877/content"
      },
      "modifier": null,
      "schema": {
        "url": "https://api.elis.rossum.ai/v1/schemas/31336"
      },
      "metadata": {},
      "content": [
        {
          "category": "section",
          "schema_id": "invoice_details_section",
          "children": [
            {
              "category": "datapoint",
              "schema_id": "invoice_id",
              "value": "12345",
              "type": "string",
              "rir_confidence": 0.99
            },
            ...
          ]
        }
      ]
    }
  ]
}

GET /v1/queues/{id}/export

Export annotations from the queue in XML, CSV or JSON format.

Output format is negotiated by Accept header or format parameter. Supported formats are: csv, xml and json.

Filters

Filters may be specified to limit the annotations to be exported; all filters applicable to the annotation list are supported. Multiple filter attributes are combined with AND, which results in a more specific response.

The most common filters are a list of ids or a time period:

Attribute Description
id Id of annotation to be exported, multiple ids may be separated by a comma.
status Annotation status
modifier User id
arrived_at_before ISO 8601 timestamp (e.g. arrived_at_before=2019-11-15)
arrived_at_after ISO 8601 timestamp (e.g. arrived_at_after=2019-11-14)
exported_at_before ISO 8601 timestamp (e.g. exported_at_before=2019-11-14 22:00:00)
exported_at_after ISO 8601 timestamp (e.g. exported_at_after=2019-11-14 12:00:00)
page_size Number of the documents to be exported. Maximum value is 1000. To be used together with page attribute.
page Number of a page to be exported when using pagination. Useful for exports of large amounts of data. To be used together with the page_size attribute.
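Export URLs with several filters are easy to get wrong by hand. The sketch below assembles one from keyword arguments, joining lists of ids with commas and rejecting a page_size above the documented maximum. The helper name is illustrative and the checks are client-side conveniences, not part of the API.

```python
from urllib.parse import urlencode

API_BASE = "https://api.elis.rossum.ai/v1"
MAX_PAGE_SIZE = 1000


def build_export_url(queue_id: int, export_format: str, **filters) -> str:
    """Build a GET /v1/queues/{id}/export URL from a format and filter attributes."""
    if export_format not in ("csv", "xml", "json"):
        raise ValueError(f"unsupported format: {export_format!r}")
    if int(filters.get("page_size", 1)) > MAX_PAGE_SIZE:
        raise ValueError("page_size may not exceed 1000")
    params = {"format": export_format}
    for name, value in filters.items():
        # Several ids are sent as one comma-separated value.
        if isinstance(value, (list, tuple)):
            value = ",".join(str(item) for item in value)
        params[name] = value
    # Keep commas literal so the id list reads as in the curl examples.
    return f"{API_BASE}/queues/{queue_id}/export?{urlencode(params, safe=',')}"


print(build_export_url(8236, "csv", status="exported", page=1, page_size=1000))
print(build_export_url(8236, "json", id=[315777, 315778]))
```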

Response

Status: 200

Returns paginated response that contains annotation data in one of the following formats.

CSV

Columns included in CSV output are defined by the columns, prepend_columns and append_columns URL parameters. The prepend_columns parameter defines columns at the beginning of the row, while append_columns defines columns at the end. All of these parameters are specified by datapoint schema ids and meta-columns. By default, all fields defined in the schema are exported.

Supported meta-columns are: meta_arrived_at, meta_file, meta_file_name, meta_status, meta_url.
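Exported values may contain commas inside quoted fields, so the CSV output is best read with a proper CSV parser rather than by splitting on commas. A minimal sketch using the sample output from the export example above:

```python
import csv
import io
from decimal import Decimal

# Sample CSV body as returned by the export example with selected columns.
EXPORT_CSV = """meta_file_name,Invoice number,Invoice Date,Sender Name,Total amount
template_invoice.pdf,12345,2017-06-01,"Peter, Paul and Merry",900.00
quora.pdf,2183760194,2018-08-06,"Quora, Inc",500.00
"""

rows = list(csv.DictReader(io.StringIO(EXPORT_CSV)))
# Quoted sender names keep their embedded commas.
senders = [row["Sender Name"] for row in rows]
# Decimal avoids floating-point rounding when summing amounts.
total = sum(Decimal(row["Total amount"]) for row in rows)
print(senders, total)
```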

XML

XML format is described by XML Schema Definition queues_export.xsd.

JSON

The JSON format uses a structure similar to the XML format above.
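The content of each exported annotation is a tree of sections and datapoints. The sketch below flattens it into a mapping from schema_id to value. It assumes that every non-datapoint node keeps its descendants in a children list, as the section in the JSON example does; repeated schema ids (for example in table rows) would overwrite each other in this simple version.

```python
def collect_datapoints(nodes: list) -> dict:
    """Flatten an annotation content tree into {schema_id: value}."""
    values = {}
    for node in nodes:
        if node.get("category") == "datapoint":
            values[node["schema_id"]] = node["value"]
        else:
            # Assumption: container nodes list their descendants under "children".
            values.update(collect_datapoints(node.get("children", [])))
    return values


# Content of one annotation, abbreviated from the JSON export example.
content = [
    {
        "category": "section",
        "schema_id": "invoice_details_section",
        "children": [
            {
                "category": "datapoint",
                "schema_id": "invoice_id",
                "value": "12345",
                "type": "string",
                "rir_confidence": 0.99,
            }
        ],
    }
]

fields = collect_datapoints(content)
print(fields)
```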

List all queues

List all queues in workspace 7540 ordered by name

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/queues?workspace=7540&ordering=name'
{
  "pagination": {
    "total": 2,
    "total_pages": 1,
    "next": null,
    "previous": null
  },
  "results": [
    {
      "id": 8199,
      "name": "Receipts",
      "url": "https://api.elis.rossum.ai/v1/queues/8199",
      "workspace": "https://api.elis.rossum.ai/v1/workspaces/7540",
      ...
    },
    {
      "id": 8198,
      "name": "Received invoices",
      "url": "https://api.elis.rossum.ai/v1/queues/8198",
      "workspace": "https://api.elis.rossum.ai/v1/workspaces/7540",
      ...
    }
  ]
}

GET /v1/queues

Retrieve all queue objects.

Supported ordering: id, name, workspace, connector, webhooks, schema, inbox, locale

Filters

Attribute Description
id Id of a queue
name Name of a queue
workspace Id of a workspace
inbox Id of an inbox
connector Id of a connector
webhooks Ids of webhooks
locale Queue object locale

Response

Status: 200

Returns paginated response with a list of queue objects.

Create new queue

Create new queue in workspace 7540 named Test Queue

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"name": "Test Queue", "workspace": "https://api.elis.rossum.ai/v1/workspaces/7540", "schema": "https://api.elis.rossum.ai/v1/schemas/31336"}' \
  'https://api.elis.rossum.ai/v1/queues'
{
  "id": 8236,
  "name": "Test Queue",
  "url": "https://api.elis.rossum.ai/v1/queues/8236",
  "workspace": "https://api.elis.rossum.ai/v1/workspaces/7540",
  ...
}

POST /v1/queues

Create a new queue object.

Response

Status: 201

Returns created queue object.

Retrieve a queue

Get queue object 8198

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/queues/8198'
{
  "id": 8198,
  "name": "Received invoices",
  "url": "https://api.elis.rossum.ai/v1/queues/8198",
  "workspace": "https://api.elis.rossum.ai/v1/workspaces/7540",
  ...
}

GET /v1/queues/{id}

Get a queue object.

Response

Status: 200

Returns queue object.

Update a queue

Update queue object 8236

curl -X PUT -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"name": "My Queue", "workspace": "https://api.elis.rossum.ai/v1/workspaces/7540", "schema": "https://api.elis.rossum.ai/v1/schemas/31336"}' \
  'https://api.elis.rossum.ai/v1/queues/8236'
{
  "id": 8236,
  "name": "My Queue",
  "url": "https://api.elis.rossum.ai/v1/queues/8236",
  "workspace": "https://api.elis.rossum.ai/v1/workspaces/7540",
  ...
}

PUT /v1/queues/{id}

Update queue object.

Response

Status: 200

Returns updated queue object.

Update part of a queue

Update name of queue object 8236

curl -X PATCH -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"name": "New Queue"}' \
  'https://api.elis.rossum.ai/v1/queues/8236'
{
  "id": 8236,
  "name": "New Queue",
  "url": "https://api.elis.rossum.ai/v1/queues/8236",
  "workspace": "https://api.elis.rossum.ai/v1/workspaces/7540",
  ...
}

PATCH /v1/queues/{id}

Update part of queue object.

Response

Status: 200

Returns updated queue object.

Delete a queue

Delete queue 8236

curl -X DELETE -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/queues/8236'

DELETE /v1/queues/{id}

Delete queue object.

Response

Status: 204

Assign annotation

POST /v1/queues/{id}/next

Assign an available annotation from the queue to the calling user.

This endpoint is INTERNAL and may change in the future.

Inbox

Example inbox object

{
  "id": 1234,
  "name": "Receipts",
  "url": "https://api.elis.rossum.ai/v1/inboxes/1234",
  "queues": [
    "https://api.elis.rossum.ai/v1/queues/8199"
  ],
  "email": "east-west-trading-co-a34f3a@elis.rossum.ai",
  "email_prefix": "east-west-trading-co",
  "bounce_email_to": "bounces@east-west.com",
  "bounce_unprocessable_attachments": false,
  "bounce_deleted_annotations": false,
  "metadata": {}
}

An inbox object enables email ingestion to a related queue. The email domain is required to match the Rossum domain (e.g. elis.rossum.ai). email_prefix may be used to construct a unique email address.

Attribute Type Default Description Read-only
id integer Id of the inbox true
name string Name of the inbox (not visible in UI)
url URL URL of the inbox true
queues list[URL] List of queues that receive documents from inbox. Usually there is one inbox per queue.
email EMAIL Rossum email address (e.g. east-west-trading-co-a34f3a@elis.rossum.ai)
email_prefix string Rossum email address prefix (e.g. east-west-trading-co)
bounce_email_to EMAIL Email address to send notifications to (e.g. about failed import).
bounce_unprocessable_attachments bool false Whether to return unprocessable attachments (e.g. MS Word docx) to the sender or silently ignore them.
bounce_postponed_annotations bool false Whether to send notification when annotation is postponed.
bounce_deleted_annotations bool false Whether to send notification when annotation is deleted.
metadata object {} Client data, see metadata.
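As a rough illustration of the address rules above, the following Python sketch checks that an inbox address is on the Rossum domain and starts with the configured email_prefix. The helper name and the default domain are assumptions made for this example. The exact form of the generated suffix is not specified here, so the check only compares the beginning of the local part.

```python
ROSSUM_DOMAIN = "elis.rossum.ai"  # assumed default, taken from the examples


def inbox_address_matches(email, email_prefix, domain=ROSSUM_DOMAIN):
    """Return True if email is on the Rossum domain and starts with email_prefix."""
    local_part, sep, email_domain = email.rpartition("@")
    if not sep or email_domain.lower() != domain:
        return False
    return local_part.startswith(email_prefix)


inbox = {
    "email": "east-west-trading-co-a34f3a@elis.rossum.ai",
    "email_prefix": "east-west-trading-co",
}
print(inbox_address_matches(inbox["email"], inbox["email_prefix"]))
```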

List all inboxes

List all inboxes

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/inboxes'
{
  "pagination": {
    "total": 2,
    "total_pages": 1,
    "next": null,
    "previous": null
  },
  "results": [
    {
      "id": 1234,
      "name": "Receipts",
      "url": "https://api.elis.rossum.ai/v1/inboxes/1234",
      "queues": [
        "https://api.elis.rossum.ai/v1/queues/8199"
      ],
      "email": "east-west-trading-co-recepits@elis.rossum.ai",
      "email_prefix": "east-west-trading-co-recepits",
      "bounce_email_to": "bounces@east-west.com",
      "bounce_unprocessable_attachments": false,
      "bounce_deleted_annotations": false,
      "metadata": {}
    },
    {
      "id": 1244,
      "name": "Beta Inbox",
      "url": "https://api.elis.rossum.ai/v1/inboxes/1244",
      "queues": [
        "https://api.elis.rossum.ai/v1/queues/8236"
      ],
      "email": "east-west-trading-co-beta@elis.rossum.ai",
      "email_prefix": "east-west-trading-co-beta",
      "bounce_email_to": "bill@east-west.com",
      "bounce_unprocessable_attachments": false,
      "bounce_deleted_annotations": false,
      "metadata": {}
    }
  ]
}

GET /v1/inboxes

Retrieve all inbox objects.

Supported filters: id, name, email, email_prefix, bounce_email_to, bounce_unprocessable_attachments, bounce_postponed_annotations, bounce_deleted_annotations

Supported ordering: id, name, email, email_prefix, bounce_email_to

Response

Status: 200

Returns paginated response with a list of inbox objects.
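A paginated response can be consumed by following the next link until it is null. The sketch below shows one way to do that in Python. The fetch_page callable is a stand-in for whatever HTTP client is used, and the page 2 URL in the canned data is invented for the example.

```python
def iter_results(fetch_page, first_url):
    """Yield every object from a paginated list response.

    fetch_page is any callable that takes a URL and returns the decoded
    JSON body (a dict with "pagination" and "results" keys).
    """
    url = first_url
    while url:
        page = fetch_page(url)
        yield from page["results"]
        url = page["pagination"]["next"]  # null (None) on the last page


# Canned pages standing in for real HTTP responses (hypothetical data).
pages = {
    "https://api.elis.rossum.ai/v1/inboxes": {
        "pagination": {"total": 2, "total_pages": 2,
                       "next": "https://api.elis.rossum.ai/v1/inboxes?page=2",
                       "previous": None},
        "results": [{"id": 1234}],
    },
    "https://api.elis.rossum.ai/v1/inboxes?page=2": {
        "pagination": {"total": 2, "total_pages": 2, "next": None,
                       "previous": "https://api.elis.rossum.ai/v1/inboxes"},
        "results": [{"id": 1244}],
    },
}
inbox_ids = [inbox["id"] for inbox in iter_results(pages.get, "https://api.elis.rossum.ai/v1/inboxes")]
print(inbox_ids)
```

Passing the fetch function in keeps the loop independent of the HTTP library and easy to test.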

Create a new inbox

Create new inbox related to queue 8236 named Test Inbox

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"name": "Test Inbox", "email_prefix": "east-west-trading-co-test", "bounce_email_to": "joe@east-west-trading.com", "queues": ["https://api.elis.rossum.ai/v1/queues/8236"]}' \
  'https://api.elis.rossum.ai/v1/inboxes'
{
  "id": 1244,
  "name": "Test Inbox",
  "url": "https://api.elis.rossum.ai/v1/inboxes/1244",
  "queues": [
    "https://api.elis.rossum.ai/v1/queues/8236"
  ],
  "email": "east-west-trading-co-test-b21e3a@elis.rossum.ai",
  "email_prefix": "east-west-trading-co-test",
  "bounce_email_to": "joe@east-west-trading.com",
  "bounce_unprocessable_attachments": false,
  "bounce_postponed_annotations": false,
  "bounce_deleted_annotations": false,
  "metadata": {}
}

POST /v1/inboxes

Create a new inbox object.

Response

Status: 201

Returns created inbox object.
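For readers who prefer Python over curl, the same request could be assembled with the standard library roughly as follows. This is a sketch only: the function name is made up for the example, and the request is built but not sent.

```python
import json
import urllib.request

API_URL = "https://api.elis.rossum.ai/v1"


def build_create_inbox_request(token, name, email_prefix, bounce_email_to, queue_urls):
    """Build (but do not send) the POST /v1/inboxes request."""
    body = {
        "name": name,
        "email_prefix": email_prefix,
        "bounce_email_to": bounce_email_to,
        "queues": list(queue_urls),
    }
    return urllib.request.Request(
        API_URL + "/inboxes",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "token " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_inbox_request(
    "db313f24f5738c8e04635e036ec8a45cdd6d6b03",
    "Test Inbox",
    "east-west-trading-co-test",
    "joe@east-west-trading.com",
    ["https://api.elis.rossum.ai/v1/queues/8236"],
)
print(request.get_method(), request.full_url)
# To send it, pass the request to urllib.request.urlopen() and expect status 201.
```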

Retrieve an inbox

Get inbox object 1244

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/inboxes/1244'
{
  "id": 1244,
  "name": "Test Inbox",
  "url": "https://api.elis.rossum.ai/v1/inboxes/1244",
  "queues": [
    "https://api.elis.rossum.ai/v1/queues/8236"
  ],
  "email": "east-west-trading-co-beta@elis.rossum.ai",
  ...
}

GET /v1/inboxes/{id}

Get an inbox object.

Response

Status: 200

Returns inbox object.

Update an inbox

Update inbox object 1244

curl -X PUT -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"name": "Shiny Inbox", "email": "east-west-trading-co-test@elis.rossum.ai", "bounce_email_to": "jack@east-west-trading.com", "queues": ["https://api.elis.rossum.ai/v1/queues/8236"]}' \
  'https://api.elis.rossum.ai/v1/inboxes/1244'
{
  "id": 1244,
  "name": "Shiny Inbox",
  "url": "https://api.elis.rossum.ai/v1/inboxes/1244",
  "queues": [
    "https://api.elis.rossum.ai/v1/queues/8236"
  ],
  "email": "east-west-trading-co-test@elis.rossum.ai",
  "bounce_email_to": "jack@east-west-trading.com",
  ...
}

PUT /v1/inboxes/{id}

Update inbox object.

Response

Status: 200

Returns updated inbox object.

Update part of an inbox

Update name of inbox object 1244

curl -X PATCH -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"name": "Common Inbox"}' \
  'https://api.elis.rossum.ai/v1/inboxes/1244'
{
  "id": 1244,
  "name": "Common Inbox",
  ...
}

PATCH /v1/inboxes/{id}

Update part of inbox object.

Response

Status: 200

Returns updated inbox object.
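Since PATCH accepts a partial object, a client can send only the attributes that changed. The following Python sketch computes such a body by comparing two dictionaries. The helper name is hypothetical, and the read-only set is taken from the inbox attribute table.

```python
READ_ONLY = {"id", "url"}  # read-only inbox attributes per the attribute table


def patch_body(current, desired):
    """Return only the writable attributes whose values differ."""
    return {
        key: value
        for key, value in desired.items()
        if key not in READ_ONLY and current.get(key) != value
    }


current = {"id": 1244, "name": "Shiny Inbox", "bounce_email_to": "jack@east-west-trading.com"}
desired = {"id": 1244, "name": "Common Inbox", "bounce_email_to": "jack@east-west-trading.com"}
print(patch_body(current, desired))
```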

Delete an inbox

Delete inbox 1244

curl -X DELETE -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/inboxes/1244'

DELETE /v1/inboxes/{id}

Delete inbox object.

Response

Status: 204

Connector

Example connector object

{
  "id": 1500,
  "name": "MyQ Connector",
  "queues": [
    "https://api.elis.rossum.ai/v1/queues/8199"
  ],
  "url": "https://api.elis.rossum.ai/v1/connectors/1500",
  "service_url": "https://myq.east-west-trading.com",
  "params": "strict=true",
  "authorization_token": "wuNg0OenyaeK4eenOovi7aiF",
  "asynchronous": true,
  "metadata": {}
}

A connector is an extension of Rossum that allows you to validate and modify data during validation and to export data to an external system. A connector object is used to configure the external or internal endpoint of such an extension service. For more information see Extensions.

Attribute Type Default Description Read-only
id integer Id of the connector true
name string Name of the connector (not visible in UI)
url URL URL of the connector true
queues list[URL] List of queues that use connector object.
service_url URL URL of the connector endpoint
params string Query params appended to the service_url
authorization_token string Token sent to connector in Authorization: token {authorization_token} header to ensure connector was contacted by Rossum (displayed only to admin user).
asynchronous bool true Affects exporting: when true, confirm endpoint returns immediately and connector's save endpoint is called asynchronously later on.
metadata object {} Client data, see metadata.
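On the connector side, the authorization_token can be used to reject requests that do not come from Rossum. The Python sketch below shows one possible check of the incoming Authorization header. The function name is an assumption for this example, and how the header is obtained depends on the web framework in use.

```python
import hmac


def is_request_from_rossum(authorization_header, expected_token):
    """Check an incoming Authorization header against the connector's token."""
    if not authorization_header:
        return False
    expected = "token " + expected_token
    # compare_digest avoids leaking the position of the first mismatch through timing
    return hmac.compare_digest(authorization_header.encode("utf-8"), expected.encode("utf-8"))


print(is_request_from_rossum("token wuNg0OenyaeK4eenOovi7aiF", "wuNg0OenyaeK4eenOovi7aiF"))
```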

List all connectors

List all connectors

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/connectors'
{
  "pagination": {
    "total": 1,
    "total_pages": 1,
    "next": null,
    "previous": null
  },
  "results": [
    {
      "id": 1500,
      "name": "MyQ Connector",
      "queues": [
        "https://api.elis.rossum.ai/v1/queues/8199"
      ],
      "url": "https://api.elis.rossum.ai/v1/connectors/1500",
      "service_url": "https://myq.east-west-trading.com",
      "params": "strict=true",
      "authorization_token": "wuNg0OenyaeK4eenOovi7aiF",
      "asynchronous": true,
      "metadata": {}
    }
  ]
}

GET /v1/connectors

Retrieve all connector objects.

Supported filters: id, name, service_url

Supported ordering: id, name, service_url

Response

Status: 200

Returns paginated response with a list of connector objects.

Create a new connector

Create new connector related to queue 8199 with endpoint URL https://myq.east-west-trading.com

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"name": "MyQ Connector", "queues": ["https://api.elis.rossum.ai/v1/queues/8199"], "service_url": "https://myq.east-west-trading.com", "authorization_token":"wuNg0OenyaeK4eenOovi7aiF"}' \
  'https://api.elis.rossum.ai/v1/connectors'
{
  "id": 1500,
  "name": "MyQ Connector",
  "queues": [
    "https://api.elis.rossum.ai/v1/queues/8199"
  ],
  "url": "https://api.elis.rossum.ai/v1/connectors/1500",
  "service_url": "https://myq.east-west-trading.com",
  "params": null,
  "authorization_token": "wuNg0OenyaeK4eenOovi7aiF",
  "asynchronous": true,
  "metadata": {}
}

POST /v1/connectors

Create a new connector object.

Response

Status: 201

Returns created connector object.

Retrieve a connector

Get connector object 1500

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/connectors/1500'
{
  "id": 1500,
  "name": "MyQ Connector",
  "queues": [
    "https://api.elis.rossum.ai/v1/queues/8199"
  ],
  "url": "https://api.elis.rossum.ai/v1/connectors/1500",
  "service_url": "https://myq.east-west-trading.com",
  "params": null,
  "authorization_token": "wuNg0OenyaeK4eenOovi7aiF",
  "asynchronous": true,
  "metadata": {}
}

GET /v1/connectors/{id}

Get a connector object.

Response

Status: 200

Returns connector object.

Update a connector

Update connector object 1500

curl -X PUT -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"name": "MyQ Connector (stg)", "queues": ["https://api.elis.rossum.ai/v1/queues/8199"], "service_url": "https://myq.stg.east-west-trading.com", "authorization_token":"wuNg0OenyaeK4eenOovi7aiF"}' \
  'https://api.elis.rossum.ai/v1/connectors/1500'
{
  "id": 1500,
  "name": "MyQ Connector (stg)",
  "queues": [
    "https://api.elis.rossum.ai/v1/queues/8199"
  ],
  "url": "https://api.elis.rossum.ai/v1/connectors/1500",
  "service_url": "https://myq.stg.east-west-trading.com",
  "params": null,
  "authorization_token": "wuNg0OenyaeK4eenOovi7aiF",
  "asynchronous": true,
  "metadata": {}
}

PUT /v1/connectors/{id}

Update connector object.

Response

Status: 200

Returns updated connector object.

Update part of a connector

Update connector URL of connector object 1500

curl -X PATCH -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"service_url": "https://myq.stg2.east-west-trading.com"}' \
  'https://api.elis.rossum.ai/v1/connectors/1500'
{
  "id": 1500,
  "name": "MyQ Connector",
  "queues": [
    "https://api.elis.rossum.ai/v1/queues/8199"
  ],
  "url": "https://api.elis.rossum.ai/v1/connectors/1500",
  "service_url": "https://myq.stg2.east-west-trading.com",
  "params": null,
  "authorization_token": "wuNg0OenyaeK4eenOovi7aiF",
  "asynchronous": true,
  "metadata": {}
}

PATCH /v1/connectors/{id}

Update part of connector object.

Response

Status: 200

Returns updated connector object.

Delete a connector

Delete connector 1500

curl -X DELETE -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/connectors/1500'

DELETE /v1/connectors/{id}

Delete connector object.

Response

Status: 204

Webhook

Example webhook object

{
  "id": 1500,
  "url": "https://api.elis.rossum.ai/v1/webhooks/1500",
  "name": "Change of Status",
  "metadata": {},
  "queues": [
     "https://api.elis.rossum.ai/v1/queues/8199",
     "https://api.elis.rossum.ai/v1/queues/8191"
  ],
  "active": true,
  "events": [
    "annotation_status"
  ],
  "config": {
    "url": "https://myq.east-west-trading.com/api/hook1?strict=true",
    "secret": "secret-token",
    "insecure_ssl": false
  }
}

A webhook is an extension of Rossum that is notified when a specific event occurs. A webhook object is used to configure the external or internal endpoint of such an extension service. For more information see Extensions.

Attribute Type Default Description Read-only
id integer Id of the webhook true
name string Name of the webhook (not visible in UI)
url URL URL of the webhook true
queues list[URL] List of queues that use webhook object.
active bool If set to true, the webhook is notified.
events list[string] List of events on which the hook should be notified. For the list of events see Webhook events.
metadata object {} Client data, see metadata.
config object Configuration of the webhook.

Config attribute

Attribute Type Default Description Read-only
url URL URL of the webhook endpoint
secret string (optional) If set, it is used to create a hash signature with each payload. For more information see Validating payloads from Rossum.
insecure_ssl bool false Disable SSL certificate verification (only use for testing purposes).
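To give an idea of how a secret-based signature check typically works on the receiving side, here is a hedged Python sketch using HMAC. The digest algorithm (SHA-1 here), the hex encoding and the way the signature reaches the endpoint are assumptions made for illustration. The section Validating payloads from Rossum is the authoritative description.

```python
import hashlib
import hmac


def sign_payload(secret, body, digest=hashlib.sha1):
    """Return the hex HMAC of the raw request body, keyed by the webhook secret."""
    return hmac.new(secret.encode("utf-8"), body, digest).hexdigest()


def is_valid_signature(secret, body, received_hex, digest=hashlib.sha1):
    """Compare a received signature with the one computed locally."""
    return hmac.compare_digest(sign_payload(secret, body, digest), received_hex)


body = b'{"event": "annotation_status"}'  # made-up payload for the example
signature = sign_payload("secret-token", body)
print(is_valid_signature("secret-token", body, signature))
```

The signature has to be computed over the raw request bytes, before any JSON parsing, because re-serializing the payload may change whitespace or key order.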

List all webhooks

List all webhooks

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/webhooks'
{
  "pagination": {
    "total": 1,
    "total_pages": 1,
    "next": null,
    "previous": null
  },
  "results": [
    {
      "id": 1500,
      "url": "https://api.elis.rossum.ai/v1/webhooks/1500",
      "name": "Some Webhook",
      "metadata": {},
      "queues": [
        "https://api.elis.rossum.ai/v1/queues/8199",
        "https://api.elis.rossum.ai/v1/queues/8191"
      ],
      "active": true,
      "events": [
        "annotation_status"
      ],
      "config": {
        "url": "https://myq.east-west-trading.com/api/hook1?strict=true"
      }
    }
  ]
}

GET /v1/webhooks

Retrieve all webhook objects.

Supported filters: id, name, queues, active, config_url

Supported ordering: id, name, active, config_url, events

Response

Status: 200

Returns paginated response with a list of webhook objects.

Create a new webhook

Create new webhook related to queue 8199 with endpoint URL https://myq.east-west-trading.com

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"name": "MyQ Webhook", "queues": ["https://api.elis.rossum.ai/v1/queues/8199"], "config": {"url": "https://myq.east-west-trading.com"}, "events": []}' \
  'https://api.elis.rossum.ai/v1/webhooks'
{
  "id": 1501,
  "url": "https://api.elis.rossum.ai/v1/webhooks/1501",
  "name": "MyQ Webhook",
  "metadata": {},
  "queues": [
     "https://api.elis.rossum.ai/v1/queues/8199"
  ],
  "active": true,
  "events": [],
  "config": {
    "url": "https://myq.east-west-trading.com"
  }
}

POST /v1/webhooks

Create a new webhook object.

Response

Status: 201

Returns created webhook object.

Retrieve a webhook

Get webhook object 1500

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/webhooks/1500'
{
  "id": 1500,
  "url": "https://api.elis.rossum.ai/v1/webhooks/1500",
  "name": "Some Webhook",
  "metadata": {},
  "queues": [
     "https://api.elis.rossum.ai/v1/queues/8199",
     "https://api.elis.rossum.ai/v1/queues/8191"
  ],
  "active": true,
  "events": [
    "annotation_status"
  ],
  "config": {
    "url": "https://myq.east-west-trading.com/api/hook1?strict=true"
  }
}

GET /v1/webhooks/{id}

Get a webhook object.

Response

Status: 200

Returns webhook object.

Update a webhook

Update webhook object 1500

curl -X PUT -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"name": "MyQ Webhook (stg)", "queues": ["https://api.elis.rossum.ai/v1/queues/8199"], "config": {"url": "https://myq.stg.east-west-trading.com"}, "events": []}' \
  'https://api.elis.rossum.ai/v1/webhooks/1500'
{
  "id": 1500,
  "url": "https://api.elis.rossum.ai/v1/webhooks/1500",
  "name": "MyQ Webhook (stg)",
  "metadata": {},
  "queues": [
     "https://api.elis.rossum.ai/v1/queues/8199"
  ],
  "active": true,
  "events": [],
  "config": {
    "url": "https://myq.stg.east-west-trading.com"
  }
}

PUT /v1/webhooks/{id}

Update webhook object.

Response

Status: 200

Returns updated webhook object.

Update part of a webhook

Update webhook URL of webhook object 1500

curl -X PATCH -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"config": {"url": "https://myq.stg2.east-west-trading.com"}}' \
  'https://api.elis.rossum.ai/v1/webhooks/1500'
{
  "id": 1500,
  "url": "https://api.elis.rossum.ai/v1/webhooks/1500",
  "name": "Some Webhook",
  "metadata": {},
  "queues": [
     "https://api.elis.rossum.ai/v1/queues/8199",
     "https://api.elis.rossum.ai/v1/queues/8191"
  ],
  "active": true,
  "events": [
    "annotation_status"
  ],
  "config": {
    "url": "https://myq.stg2.east-west-trading.com"
  }
}

PATCH /v1/webhooks/{id}

Update part of webhook object.

Response

Status: 200

Returns updated webhook object.

Delete a webhook

Delete webhook 1500

curl -X DELETE -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/webhooks/1500'

DELETE /v1/webhooks/{id}

Delete webhook object.

Response

Status: 204

Schema

Example schema object

{
  "id": 31336,
  "name": "Basic Schema",
  "queues": [
    "https://api.elis.rossum.ai/v1/queues/8236"
  ],
  "url": "https://api.elis.rossum.ai/v1/schemas/31336",
  "content": [
    {
      "category": "section",
      "id": "invoice_details_section",
      "label": "Invoice details",
      "children": [
        {
          "category": "datapoint",
          "id": "invoice_id",
          "label": "Invoice number",
          "type": "string",
          "rir_field_names": [
            "invoice_id"
          ],
          "constraints": {
            "required": false
          },
          "default_value": null
        }
        ...
      ]
    },
    ...
  ],
  "metadata": {}
}

A schema object specifies the set of datapoints that are extracted from the document. For more information see Document Schema.

Attribute Type Default Description Read-only
id integer Id of the schema true
name string Name of the schema (not visible in UI)
url URL URL of the schema true
queues list[URL] List of queues that use schema object. true
content list[object] List of sections (top-level schema objects, see Document Schema for description of schema)
metadata object {} Client data, see metadata.
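Because the content attribute is a tree of sections and their children, client code often needs to walk it. The Python sketch below collects all datapoint nodes from a content list shaped like the example above. It assumes that children is either a list of nodes or a single node object; see Document Schema for the actual structure.

```python
def iter_datapoints(nodes):
    """Yield every node with category "datapoint", depth first."""
    for node in nodes:
        if node.get("category") == "datapoint":
            yield node
        children = node.get("children", [])
        # Assumption: "children" may be a list of nodes or a single node object
        if isinstance(children, dict):
            children = [children]
        yield from iter_datapoints(children)


content = [
    {
        "category": "section",
        "id": "invoice_details_section",
        "label": "Invoice details",
        "children": [
            {"category": "datapoint", "id": "invoice_id", "label": "Invoice number", "type": "string"},
        ],
    }
]
print([dp["id"] for dp in iter_datapoints(content)])
```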

Validate a schema

Validate content of schema object 33725

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"content":[{"category":"section","id":"invoice_details_section","label":"Invoice details","icon": null,"children":[{"category":"datapoint","id":"invoice_id","label":"Invoice number","type":"string","rir_field_names":["invoice_id"]}]}]}' \
  'https://api.elis.rossum.ai/v1/schemas/validate'

POST /v1/schemas/validate

Validate schema object, check for errors.

Response

Status: 200 or 400

Returns 400 and error description in case of validation failure.

List all schemas

List all schemas

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/schemas'
{
  "pagination": {
    "total": 2,
    "total_pages": 1,
    "next": null,
    "previous": null
  },
  "results": [
    {
      "id": 31336,
      "url": "https://api.elis.rossum.ai/v1/schemas/31336"
    },
    {
      "id": 33725,
      "url": "https://api.elis.rossum.ai/v1/schemas/33725"
    }
  ]
}

GET /v1/schemas

Retrieve all schema objects.

Supported filters: id, name

Supported ordering: id

Response

Status: 200

Returns paginated response with a list of schema objects.

Create a new schema

Create new empty schema

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"name": "Test Schema", "content": []}' \
  'https://api.elis.rossum.ai/v1/schemas'
{
  "id": 33725,
  "name": "Test Schema",
  "queues": [],
  "url": "https://api.elis.rossum.ai/v1/schemas/33725",
  "content": [],
  "metadata": {}
}

POST /v1/schemas

Create a new schema object.

Response

Status: 201

Returns created schema object.

Create schema from template organization

Create a new schema object from a template organization. See the available templates in organization.

Create new schema object from template organization

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"name": "Test Schema", "template_name": "EU Demo Template"}' \
  'https://api.elis.rossum.ai/v1/schemas/from_template'
{
  "name": "Test Schema",
  "id": 33726,
  "queues": [],
  "url": "https://api.elis.rossum.ai/v1/schemas/33726",
  "content": [
    {
      "id": "invoice_info_section",
      "icon": null,
      "label": "Basic information",
      "category": "section",
      "children": [
        ...
      ]
    }
  ],
  "metadata": {}
}

POST /v1/schemas/from_template

Create a new schema object from a template.

Response

Status: 201

Returns created schema object.

Retrieve a schema

Get schema object 31336

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/schemas/31336'
{
  "id": 31336,
  "name": "Basic schema",
  "queues": [
    "https://api.elis.rossum.ai/v1/queues/8236"
  ],
  "url": "https://api.elis.rossum.ai/v1/schemas/31336",
  "content": [
    {
      "category": "section",
      "id": "invoice_details_section",
      "label": "Invoice details",
      "children": [
        {
          "category": "datapoint",
          "id": "invoice_id",
          "label": "Invoice number",
          "type": "string",
          "rir_field_names": [
            "invoice_id"
          ],
          "constraints": {
            "required": false
          },
          "default_value": null
        },
        ...
      ]
    },
    ...
  ]
}

GET /v1/schemas/{id}

Get a schema object.

Response

Status: 200

Returns schema object.

Update a schema

Update content of schema object 33725

curl -X PUT -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"name":"Test Schema","content":[{"category":"section","id":"invoice_details_section","label":"Invoice details","icon": null,"children":[{"category":"datapoint","id":"invoice_id","label":"Invoice number","type":"string","rir_field_names":["invoice_id"]}]}]}' \
  'https://api.elis.rossum.ai/v1/schemas/33725'
{
  "id": 33725,
  "name": "Test Schema",
  "queues": [],
  "url": "https://api.elis.rossum.ai/v1/schemas/33725",
  "content": [
    {
      "category": "section",
      "id": "invoice_details_section",
      "label": "Invoice details",
      "children": [
        {
          "category": "datapoint",
          "id": "invoice_id",
          "label": "Invoice number",
          "type": "string",
          "rir_field_names": [
            "invoice_id"
          ],
          "default_value": null
        }
      ],
      "icon": null
    }
  ],
  "metadata": {}
}

PUT /v1/schemas/{id}

Update schema object. See Updating schema for more details about consequences of schema update.

Response

Status: 200

Returns updated schema object.

Update part of a schema

Update schema object 31336

curl -X PATCH -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"content": []}' \
  'https://api.elis.rossum.ai/v1/schemas/31336'
{
  "id": 31336,
  "name": "Test Schema",
  "queues": [
    "https://api.elis.rossum.ai/v1/queues/8236"
  ],
  "url": "https://api.elis.rossum.ai/v1/schemas/31336",
  "content": [],
  "metadata": {}
}

PATCH /v1/schemas/{id}

Update part of schema object. See Updating schema for more details about consequences of schema update.

Response

Status: 200

Returns updated schema object.

Delete a schema

Delete schema 31336

curl -X DELETE -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/schemas/31336'

DELETE /v1/schemas/{id}

Delete schema object.

Response

Status: 204

Document

Example document object

{
  "id": 314628,
  "url": "https://api.elis.rossum.ai/v1/documents/314628",
  "s3_name": "272c2f01ae84a4e19a421cb432e490bb",
  "annotations": [
    "https://api.elis.rossum.ai/v1/annotations/314528"
  ],
  "mime_type": "application/pdf",
  "arrived_at": "2019-10-13T23:04:00.933658Z",
  "original_file_name": "test_invoice_1.pdf",
  "content": "https://api.elis.rossum.ai/v1/documents/314628/content",
  "metadata": {}
}

A document object contains information about one input file (PDF or image). It cannot be created directly through the API; use the queue upload endpoint instead.

Attribute Type Default Description Read-only
id integer Id of the document true
url URL URL of the document true
s3_name string Internal true
annotations list[URL] List of annotations related to the document. Usually there is only one annotation.
mime_type string MIME type of the document (e.g. application/pdf) true
arrived_at datetime Timestamp of document upload or incoming email attachment extraction. true
original_file_name string File name of the attachment or upload. true
content URL Link to the document raw content (e.g. PDF file)
metadata object {} Client data, see metadata.

List all documents

List all documents

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/documents'
{
  "pagination": {
    "total": 2,
    "total_pages": 1,
    "next": null,
    "previous": null
  },
  "results": [
    {
      "id": 314628,
      "url": "https://api.elis.rossum.ai/v1/documents/314628",
      "s3_name": "272c2f01ae84a4e19a421cb432e490bb",
      "annotations": [
        "https://api.elis.rossum.ai/v1/annotations/314528"
      ],
      "mime_type": "application/pdf",
      "arrived_at": "2019-10-13T23:04:00.933658Z",
      "original_file_name": "test_invoice_1.pdf",
      "content": "https://api.elis.rossum.ai/v1/documents/314628/content",
      "metadata": {}
    },
    {
      "id": 315609,
      "url": "https://api.elis.rossum.ai/v1/documents/315609",
      "s3_name": "8e506763caa2bc03f09cba3bf4817f84",
      "annotations": [
        "https://api.elis.rossum.ai/v1/annotations/315509"
      ],
      "mime_type": "image/png",
      "arrived_at": "2019-10-13T16:16:30.726217Z",
      "original_file_name": "test_invoice_2.png",
      "content": "https://api.elis.rossum.ai/v1/documents/315609/content",
      "metadata": {}
    }
  ]
}

GET /v1/documents

Retrieve all document objects.

Supported filters: id, arrived_at, original_file_name

Supported ordering: id, arrived_at, original_file_name, s3_name, mime_type

Response

Status: 200

Returns paginated response with a list of document objects.

Retrieve a document

Get document object 314628

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/documents/314628'
{
  "id": 314628,
  "url": "https://api.elis.rossum.ai/v1/documents/314628",
  "s3_name": "272c2f01ae84a4e19a421cb432e490bb",
  "annotations": [
    "https://api.elis.rossum.ai/v1/annotations/314528"
  ],
  "mime_type": "application/pdf",
  "arrived_at": "2019-10-13T23:04:00.933658Z",
  "original_file_name": "test_invoice_1.pdf",
  "content": "https://api.elis.rossum.ai/v1/documents/314628/content",
  "metadata": {}
}

GET /v1/documents/{id}

Get a document object.

Response

Status: 200

Returns document object.

Permanent URL

Download document original from a permanent URL

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/original/272c2f01ae84a4e19a421cb432e490bb'

GET /v1/original/272c2f01ae84a4e19a421cb432e490bb

Get original document content (e.g. PDF file). In this example, the last path segment matches the s3_name of document 314628.

Response

Status: 200

Returns original document file.
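The permanent URL in the example ends with the same value as the s3_name attribute of the document. Assuming that this holds in general (it is inferred from the example, not stated explicitly), the URL could be derived from a document object as in this Python sketch:

```python
API_URL = "https://api.elis.rossum.ai/v1"


def permanent_url(document):
    """Build the permanent download URL from a document object.

    Assumption: the last path segment is the document's s3_name.
    """
    return API_URL + "/original/" + document["s3_name"]


document = {"id": 314628, "s3_name": "272c2f01ae84a4e19a421cb432e490bb"}
print(permanent_url(document))
```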

Delete a document

Delete document 314628

curl -X DELETE -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/documents/314628'

DELETE /v1/documents/{id}

Delete a document object from the database. This also deletes the related annotation and page objects.

This is an internal API and should never be called directly. Mark the annotation as deleted instead.

Response

Status: 204

Annotation

Example annotation object

{
  "document": "https://api.elis.rossum.ai/v1/documents/314628",
  "id": 314528,
  "queue": "https://api.elis.rossum.ai/v1/queues/8199",
  "schema": "https://api.elis.rossum.ai/v1/schemas/95",
  "pages": [
    "https://api.elis.rossum.ai/v1/pages/558598"
  ],
  "modifier": null,
  "modified_at": null,
  "confirmed_at": null,
  "exported_at": null,
  "assigned_at": null,
  "status": "to_review",
  "rir_poll_id": "54f6b9ecfa751789f71ddf12",
  "messages": null,
  "url": "https://api.elis.rossum.ai/v1/annotations/314528",
  "content": "https://api.elis.rossum.ai/v1/annotations/314528/content",
  "time_spent": 0,
  "metadata": {}
}

An annotation object contains all extracted and verified data related to a document. Every document belongs to a queue and is related to a schema object that defines the datapoint types and the overall shape of the extracted data.

An annotation cannot be created directly through the API; use the queue upload endpoint instead.

Attribute Type Default Description Read-only
id integer Id of the annotation true
url URL URL of the annotation true
status enum Status of the document, see Document Lifecycle for the list of values.
document URL Related document.
queue URL Queue that the annotation belongs to.
schema URL Schema that defines content shape. true
pages list[URL] List of rendered pages. true
modifier URL User that last modified the annotation.
assigned_at datetime Timestamp of last assignment to a user. true
modified_at datetime Timestamp of last modification. true
confirmed_at datetime Timestamp of confirmation by the user. true
exported_at datetime Timestamp of finished export. true
rir_poll_id string Internal.
messages list[object] [] List of messages from the connector.
content URL Link to annotation data (datapoint values), see Annotation data. true
time_spent float 0 Total time spent while validating the annotation.
metadata object {} Client data, see metadata.
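Related objects are referenced by URL rather than by plain id. When a numeric id is needed (for example to build another endpoint path), it can be taken from the last path segment. The helper below is a small Python sketch of that idea; its name is made up for the example.

```python
from urllib.parse import urlparse


def id_from_url(url):
    """Return the numeric id at the end of an object URL."""
    last_segment = urlparse(url).path.rstrip("/").rsplit("/", 1)[-1]
    return int(last_segment)


annotation = {
    "document": "https://api.elis.rossum.ai/v1/documents/314628",
    "queue": "https://api.elis.rossum.ai/v1/queues/8199",
    "schema": "https://api.elis.rossum.ai/v1/schemas/95",
}
print({key: id_from_url(url) for key, url in annotation.items()})
```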

Start annotation

Start annotation of object 319668

curl -X POST -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/annotations/319668/start'
{
  "annotation": "https://api.elis.rossum.ai/v1/annotations/319668",
  "session_timeout": "01:00:00"
}

POST /v1/annotations/{id}/start

Assign the annotation to the calling user.

Response

Status: 200

Returns object with annotation and session_timeout keys.
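The session_timeout value in the example looks like an HH:MM:SS duration. Under that assumption (the format is inferred from the example only), it could be converted to a Python timedelta like this:

```python
from datetime import timedelta


def parse_session_timeout(value):
    """Parse an "HH:MM:SS" string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


response = {
    "annotation": "https://api.elis.rossum.ai/v1/annotations/319668",
    "session_timeout": "01:00:00",
}
print(parse_session_timeout(response["session_timeout"]).total_seconds())
```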

Start embedded annotation

Start embedded annotation of object 319668

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"return_url": "https://service.com/return", "cancel_url": "https://service.com/cancel"}' \
  'https://api.elis.rossum.ai/v1/annotations/319668/start_embedded'
{
  "url": "https://embedded.elis.rossum.ai/document/319668#authToken=1c50ae8552441a2cda3c360c1e8cb6f2d91b14a9"
}

POST /v1/annotations/{id}/start_embedded

Start embedded annotation. It requires two parameters: return_url and cancel_url.

Key Description
return_url URL the browser is redirected to after successful user validation (max. length: 256 chars)
cancel_url URL the browser is redirected to when the user cancels the validation (max. length: 256 chars)
max_token_lifetime_s Duration (in seconds) for which the token will be valid (optional, default: queue's session_timeout, max: 162 hours)

Response

Status: 200

Returns object with url that specifies URL to be used in the browser iframe/popup window. URL includes a token that is valid for this document only for a limited period of time.
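The documented limits on the request keys can be checked on the client before calling the endpoint. The following Python sketch builds the request body and enforces the 256-character and 162-hour limits. The function name and the lower bound of one second are assumptions for this example.

```python
MAX_URL_LENGTH = 256
MAX_TOKEN_LIFETIME_S = 162 * 60 * 60  # 162 hours, in seconds


def embedded_payload(return_url, cancel_url, max_token_lifetime_s=None):
    """Validate the documented limits and return the request body as a dict."""
    for key, url in (("return_url", return_url), ("cancel_url", cancel_url)):
        if len(url) > MAX_URL_LENGTH:
            raise ValueError(key + " is longer than 256 characters")
    payload = {"return_url": return_url, "cancel_url": cancel_url}
    if max_token_lifetime_s is not None:
        # Assumed lower bound: the lifetime must be positive
        if not 0 < max_token_lifetime_s <= MAX_TOKEN_LIFETIME_S:
            raise ValueError("max_token_lifetime_s must be between 1 second and 162 hours")
        payload["max_token_lifetime_s"] = max_token_lifetime_s
    return payload


print(embedded_payload("https://service.com/return", "https://service.com/cancel", 3600))
```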

Confirm annotation

Confirm annotation of object 319668

curl -X POST -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/annotations/319668/confirm'

POST /v1/annotations/{id}/confirm

Confirm annotation, switch status to exported (or exporting).

Response

Status: 204

Cancel annotation

Cancel annotation of object 319668

curl -X POST -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/annotations/319668/cancel'

POST /v1/annotations/{id}/cancel

Cancel annotation, switch its status back to to_review or postponed.

Response

Status: 204

Switch to postponed

Switch annotation status of object 319668 to postponed

curl -X POST -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/annotations/319668/postpone'

POST /v1/annotations/{id}/postpone

Switch annotation status to postponed.

Response

Status: 204

Switch to deleted

Switch annotation status of object 319668 to deleted

curl -X POST -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/annotations/319668/delete'

POST /v1/annotations/{id}/delete

Switch annotation status to deleted. Annotation with status deleted is still available in Rossum UI.

Response

Status: 204
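The confirm, cancel, postpone and delete endpoints share the same URL pattern and differ only in the resulting status. The Python sketch below summarizes them in a lookup table and builds the URL for a given action. The table and function names are made up for this example.

```python
API_URL = "https://api.elis.rossum.ai/v1"

# action -> statuses the annotation may end up in, as described above
ACTIONS = {
    "confirm": ("exported", "exporting"),
    "cancel": ("to_review", "postponed"),
    "postpone": ("postponed",),
    "delete": ("deleted",),
}


def action_url(annotation_id, action):
    """Return the POST URL for a status-changing annotation action."""
    if action not in ACTIONS:
        raise ValueError("unknown action: " + action)
    return "{}/annotations/{}/{}".format(API_URL, annotation_id, action)


print(action_url(319668, "postpone"))
```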

Rotate the annotation

Rotate the annotation 319668

curl -X POST -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  -H 'Content-Type:application/json' -d '{"rotation_deg": 270}' \
  'https://api.elis.rossum.ai/v1/annotations/319668/rotate'

POST /v1/annotations/{id}/rotate

Rotate a document. It requires one parameter: rotation_deg.

The status of the annotation is switched to importing and the extraction phase starts over. After the new extraction, the value of rotation_deg is copied to the rotation_deg field of the annotation's pages.

Key Description
rotation_deg States degrees by which the document shall be rotated. Possible values: 0, 90, 180, 270.

Response

Status: 204
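
Because rotation_deg accepts only four values, it may be worth validating it before sending the request. A minimal Python sketch, in which rotate_payload and ALLOWED_ROTATIONS are hypothetical names chosen for illustration:

```python
import json

# Values listed in the rotation_deg description above.
ALLOWED_ROTATIONS = (0, 90, 180, 270)


def rotate_payload(rotation_deg):
    """Return the JSON request body for the rotate endpoint (hypothetical helper)."""
    # Accept plain integers only, so that booleans and floats are rejected.
    if type(rotation_deg) is not int or rotation_deg not in ALLOWED_ROTATIONS:
        raise ValueError(f"rotation_deg must be one of {ALLOWED_ROTATIONS}")
    return json.dumps({"rotation_deg": rotation_deg})


print(rotate_payload(270))
```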

Search for text

Search for text in annotation 319668

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/annotations/319668/search?phrase=some'
{
  "results": [
    {
      "rectangle": [
        67.15157010915198,
        545.9286363906203,
        87.99106633081445,
        563.4617583852776
      ],
      "page": 1
    },
    {
      "rectangle": [
        45.27717884130982,
        1060.3084761056693,
        66.11667506297229,
        1077.8415981003266
      ],
      "page": 1
    }
  ],
  "status": "ok"
}

GET /v1/annotations/{id}/search

Search for a phrase in the document.

Argument Description
phrase A phrase to search for
tolerance Allowed edit distance from the search phrase. Only used for OCR invoices (images, such as PNG, or PDFs with scanned images).

Response

Status: 200

Returns results with a list of objects:

Key Type Description
rectangle list[float] Bounding box of an occurrence.
page integer Page of occurrence.
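
As an illustration of working with the search endpoint, the Python sketch below builds the request URL and groups the returned bounding boxes by page. The function names search_url and rectangles_by_page are assumptions made for this example; only the phrase and tolerance arguments and the results, rectangle and page keys come from the description above.

```python
from collections import defaultdict
from urllib.parse import urlencode

BASE_URL = "https://api.elis.rossum.ai/v1"


def search_url(annotation_id, phrase, tolerance=None):
    """Build the search URL; tolerance is added only when given."""
    query = {"phrase": phrase}
    if tolerance is not None:
        query["tolerance"] = tolerance
    return f"{BASE_URL}/annotations/{annotation_id}/search?{urlencode(query)}"


def rectangles_by_page(response):
    """Group the bounding boxes of all occurrences by page number."""
    grouped = defaultdict(list)
    for hit in response.get("results", []):
        grouped[hit["page"]].append(hit["rectangle"])
    return dict(grouped)
```

For the sample response above, rectangles_by_page returns a dictionary with a single key, 1, holding both rectangles.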

Convert grid to table data

Convert grid to tabular data in annotation 319623

curl -X POST -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/annotations/319623/content/37507202/transform_grid_to_datapoints'

POST /v1/annotations/{id}/content/{id of the child node}/transform_grid_to_datapoints

Transform grid structure to tabular data of related multivalue object.

Response

Status: 200

List all annotations

List all annotations

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/annotations'
{
  "pagination": {
    "total": 22,
    "total_pages": 1,
    "next": null,
    "previous": null
  },
  "results": [
    {
      "document": "https://api.elis.rossum.ai/v1/documents/315877",
      "id": 315777,
      "queue": "https://api.elis.rossum.ai/v1/queues/8236",
      "schema": "https://api.elis.rossum.ai/v1/schemas/31336",
      "pages": [
        "https://api.elis.rossum.ai/v1/pages/561206"
      ],
      "modifier": null,
      "modified_at": null,
      "confirmed_at": null,
      "exported_at": null,
      "assigned_at": null,
      "status": "to_review",
      "rir_poll_id": "54f6b9ecfa751789f71ddf12",
      "messages": null,
      "url": "https://api.elis.rossum.ai/v1/annotations/315777",
      "content": "https://api.elis.rossum.ai/v1/annotations/315777/content",
      "time_spent": 0,
      "metadata": {}
    },
    {
      ...
    }
  ]
}

GET /v1/annotations

Retrieve all annotation objects.

Supported ordering: document, document__arrived_at, document__original_file_name, modifier, modifier__username, queue, status, assigned_at, confirmed_at, modified_at, exported_at

Filters

Filters may be specified to limit annotations to be listed.

Attribute Description
status Annotation status, multiple values may be separated using a comma
id List of ids separated by a comma
modifier User id
document Document id
queue Queue id
arrived_at_before ISO 8601 timestamp (e.g. arrived_at_before=2019-11-15)
arrived_at_after ISO 8601 timestamp (e.g. arrived_at_after=2019-11-14)
assigned_at_before ISO 8601 timestamp (e.g. assigned_at_before=2019-11-15)
assigned_at_after ISO 8601 timestamp (e.g. assigned_at_after=2019-11-14)
confirmed_at_before ISO 8601 timestamp (e.g. confirmed_at_before=2019-11-15)
confirmed_at_after ISO 8601 timestamp (e.g. confirmed_at_after=2019-11-14)
modified_at_before ISO 8601 timestamp (e.g. modified_at_before=2019-11-15)
modified_at_after ISO 8601 timestamp (e.g. modified_at_after=2019-11-14)
exported_at_before ISO 8601 timestamp (e.g. exported_at_before=2019-11-14 22:00:00)
exported_at_after ISO 8601 timestamp (e.g. exported_at_after=2019-11-14 12:00:00)
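
Filters are passed as ordinary query parameters. The following Python sketch assembles a list URL from keyword arguments, joining multi-valued filters such as status with commas; annotations_list_url is a hypothetical helper, and the filter names are taken from the table above.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.elis.rossum.ai/v1"


def annotations_list_url(**filters):
    """Build a GET /v1/annotations URL from filter keyword arguments."""
    query = {}
    for name, value in filters.items():
        # Lists and tuples become comma-separated values (e.g. status, id).
        if isinstance(value, (list, tuple)):
            value = ",".join(str(item) for item in value)
        query[name] = value
    if not query:
        return f"{BASE_URL}/annotations"
    # Keep commas readable instead of percent-encoding them.
    return f"{BASE_URL}/annotations?{urlencode(query, safe=',')}"


print(annotations_list_url(status=["to_review", "postponed"], queue=8236))
```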

Query fields

Obtain only a subset of annotation attributes

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/annotations?fields=id,url'
{
  "pagination": {
    "total": 22,
    "total_pages": 1,
    "next": null,
    "previous": null
  },
  "results": [
    {
      "id": 320332,
      "url": "https://api.elis.rossum.ai/v1/annotations/320332"
    },
    {
      "id": 319668,
      "url": "https://api.elis.rossum.ai/v1/annotations/319668"
    },
    ...
  ]
}

To obtain only a subset of annotation object attributes, use the query parameter fields.

Argument Description
fields Comma-separated list of attributes to be included in the response.
fields! Comma-separated list of attributes to be excluded from the response.
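
A short Python sketch of composing the fields and fields! parameters follows. The helper fields_query is hypothetical; it only formats the query string and says nothing about how the server treats both parameters sent together.

```python
from urllib.parse import urlencode


def fields_query(include=(), exclude=()):
    """Return a query string selecting or excluding annotation attributes."""
    query = {}
    if include:
        query["fields"] = ",".join(include)
    if exclude:
        query["fields!"] = ",".join(exclude)
    # Leave "," and "!" unescaped so the result matches the documented form.
    return urlencode(query, safe=",!")


print(fields_query(include=["id", "url"]))
print(fields_query(exclude=["metadata"]))
```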

Sideloading

Sideload documents and modifiers

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/annotations?sideload=modifiers,documents'
{
  "pagination": {
    "total": 22,
    "total_pages": 1,
    "next": null,
    "previous": null
  },
  "results": [
    {
      "document": "https://api.elis.rossum.ai/v1/documents/320432",
      "id": 320332,
      ...,
      "modifier": "https://api.elis.rossum.ai/v1/users/10775",
      "status": "to_review",
      "rir_poll_id": "a898b6bdc8964721b38e0160",
      "messages": null,
      "url": "https://api.elis.rossum.ai/v1/annotations/320332",
      "content": "https://api.elis.rossum.ai/v1/annotations/320332/content",
      "time_spent": 0,
      "metadata": {}
    },
    ...
  ],
  "documents": [
    {
      "id": 320432,
      "url": "https://api.elis.rossum.ai/v1/documents/320432",
      ...
    },
    ...
  ],
  "modifiers": [
    {
      "id": 10775,
      "url": "https://api.elis.rossum.ai/v1/users/10775",
      ...
    },
    ...
  ]
}

To decrease the number of requests needed to obtain useful information about annotations, modifiers and documents can be sideloaded using the query parameter sideload. This parameter accepts a comma-separated list of keywords: modifiers, documents. The response is then enriched with the requested keys, which contain lists of the sideloaded objects.
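
On the client side, sideloaded objects can be matched to annotations by their url. The Python sketch below assumes a response shaped like the example above; attach_sideloads is an illustrative name, not an API function.

```python
def attach_sideloads(response):
    """Replace document/modifier URLs in each annotation with sideloaded objects."""
    documents = {doc["url"]: doc for doc in response.get("documents", [])}
    modifiers = {user["url"]: user for user in response.get("modifiers", [])}
    resolved = []
    for annotation in response["results"]:
        item = dict(annotation)  # shallow copy, the input stays untouched
        doc_url = annotation.get("document")
        mod_url = annotation.get("modifier")
        # Fall back to the original URL (or null) when nothing was sideloaded.
        item["document"] = documents.get(doc_url, doc_url)
        item["modifier"] = modifiers.get(mod_url, mod_url)
        resolved.append(item)
    return resolved
```

Indexing the sideloaded lists by url once keeps the lookup cheap even for long result pages.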

Response

Status: 200

Returns paginated response with a list of annotation objects.
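
A paginated listing can be consumed by following pagination.next until it is null. The Python sketch below assumes that next holds the URL of the following page; iter_results and the fetch_json callable (any function that takes a URL and returns the decoded JSON body) are hypothetical.

```python
def iter_results(first_url, fetch_json):
    """Yield every object from a paginated listing, page by page."""
    url = first_url
    while url:
        page = fetch_json(url)
        yield from page["results"]
        # Assumed: "next" is the URL of the following page, or null on the last one.
        url = page["pagination"]["next"]
```

Passing fetch_json in as a parameter keeps the loop independent of the HTTP client and makes it easy to exercise with canned responses.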

Retrieve an annotation

Get annotation object 315777

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/annotations/315777'
{
  "document": "https://api.elis.rossum.ai/v1/documents/315877",
  "id": 315777,
  "queue": "https://api.elis.rossum.ai/v1/queues/8236",
  "schema": "https://api.elis.rossum.ai/v1/schemas/31336",
  "pages": [
    "https://api.elis.rossum.ai/v1/pages/561206"
  ],
  "modifier": null,
  "modified_at": null,
  "confirmed_at": null,
  "exported_at": null,
  "assigned_at": null,
  "status": "to_review",
  "rir_poll_id": "54f6b9ecfa751789f71ddf12",
  "messages": null,
  "url": "https://api.elis.rossum.ai/v1/annotations/315777",
  "content": "https://api.elis.rossum.ai/v1/annotations/315777/content",
  "time_spent": 0,
  "metadata": {}
}

GET /v1/annotations/{id}

Get an annotation object.

Response

Status: 200

Returns annotation object.

Update an annotation

Update annotation object 315777

curl -X PUT -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"document": "https://api.elis.rossum.ai/v1/documents/315877", "queue": "https://api.elis.rossum.ai/v1/queues/8236", "status": "postponed"}' \
  'https://api.elis.rossum.ai/v1/annotations/315777'
{
  "document": "https://api.elis.rossum.ai/v1/documents/315877",
  "id": 315777,
  "queue": "https://api.elis.rossum.ai/v1/queues/8236",
  ...
  "status": "postponed",
  "rir_poll_id": "a898b6bdc8964721b38e0160",
  "messages": null,
  "url": "https://api.elis.rossum.ai/v1/annotations/315777",
  "content": "https://api.elis.rossum.ai/v1/annotations/315777/content",
  "time_spent": 0,
  "metadata": {}
}

PUT /v1/annotations/{id}

Update annotation object.

Response

Status: 200

Returns updated annotation object.

Update part of an annotation

Update status of annotation object 315777

curl -X PATCH -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"status": "deleted"}' \
  'https://api.elis.rossum.ai/v1/annotations/315777'
{
  "document": "https://api.elis.rossum.ai/v1/documents/315877",
  "id": 315777,
  ...
  "status": "deleted",
  "rir_poll_id": "a898b6bdc8964721b38e0160",
  "messages": null,
  "url": "https://api.elis.rossum.ai/v1/annotations/315777",
  "content": "https://api.elis.rossum.ai/v1/annotations/315777/content",
  "time_spent": 0,
  "metadata": {}
}

PATCH /v1/annotations/{id}

Update part of annotation object.

Response

Status: 200

Returns updated annotation object.

Copy annotation

Copy annotation 315777 to a queue 8236

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' -H 'Content-Type: application/json' \
  -d '{"target_queue": "https://api.elis.rossum.ai/v1/queues/8236", "target_status": "to_review"}' \
  'https://api.elis.rossum.ai/v1/annotations/315777/copy'
{
  "annotation": "https://api.elis.rossum.ai/v1/annotations/320332"
}

POST /v1/annotations/{id}/copy

Make a copy of annotation in another queue. All data and metadata are copied.

Key Description
target_queue URL of queue, where the copy should be placed.
target_status Status of copied annotation (if not set, it stays the same)

Response

Status: 200

Returns URL of the new annotation object.

Delete annotation

Delete annotation 315777

curl -X DELETE -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/annotations/315777'

DELETE /v1/annotations/{id}

Delete an annotation object from the database. It also deletes the related page objects.

Never call this internal API; mark the annotation as deleted instead.

Response

Status: 204

Annotation Data

Example annotation data

{
  "content": [
    {
      "id": 27801931,
      "url": "https://api.elis.rossum.ai/v1/annotations/319668/content/27801931",
      "children": [
        {
          "id": 27801932,
          "url": "https://api.elis.rossum.ai/v1/annotations/319668/content/27801932",
          "content": {
            "value": "2183760194",
            "page": 1,
            "position": [
              761,
              48,
              925,
              84
            ],
            "rir_text": "2183760194",
            "rir_position": [
              761,
              48,
              925,
              84
            ],
            "connector_text": null,
            "rir_confidence": 0.99234
          },
          "category": "datapoint",
          "schema_id": "invoice_id",
          "validation_sources": [
            "score"
          ],
          "time_spent": 0,
          "hidden": false
        },
        {
          "id": 27801933,
          "url": "https://api.elis.rossum.ai/v1/annotations/319668/content/27801933",
          "content": {
            "value": "6/8/2018",
            "page": 1,
            "position": [
              283,
              300,
              375,
              324
            ],
            "rir_text": "6/8/2018",
            "rir_position": [
              283,
              300,
              375,
              324
            ],
            "connector_text": null,
            "rir_confidence": 0.98279
          },
          "category": "datapoint",
          "schema_id": "date_issue",
          "validation_sources": [
            "score"
          ],
          "time_spent": 0,
          "hidden": false
        },
        {
          "id": 27801934,
          "url": "https://api.elis.rossum.ai/v1/annotations/319668/content/27801934",
          "content": null,
          "category": "datapoint",
          "schema_id": "email_button",
          "validation_sources": [
            "NA"
          ],
          "time_spent": 0,
          "hidden": false
        },
        ...
      ]
    }
  ]
}

Annotation data is used by the Rossum UI to display annotation data properly. Be aware that values (e.g. numbers, dates) are not normalized and that the data structure may change to accommodate UI requirements.

Top level content contains a list of section objects. results is currently a copy of content and is deprecated.

Section objects:

Attribute Type Description Read-only
id int A unique ID of a given section. true
url URL URL of the section. true
category string section
children list Array specifying objects that belong to the section.

Datapoint, multivalue and tuple objects:

Attribute Type Description Read-only
id int A unique ID of a given object. true
url URL URL of a given object. true
schema_id string Reference mapping the object to the schema tree.
category string Type of the object (datapoint, multivalue or tuple). true
children list Array specifying child objects. Only available for multivalue and tuple categories. true
content object (optional) A dictionary of the attributes of a given datapoint (only available for datapoint), see below for details. true
validation_sources list[object] Source of validation of the extracted data, see below.
time_spent float Total time spent while validating a given node. true
hidden bool If set to true, the datapoint is not visible in the user interface, but remains stored in the database.
grid object Specify grid structure, see below for details. Only allowed for multivalue object.

Content object

Can be null for datapoints of type button

Attribute Type Description Read-only
value string The extracted data of a given node.
page int Number of page where the data is situated (see position).
position list List of the coordinates of the label box of the given node.
rir_text string The extracted text, used as a reference for data extraction models. true
rir_page int The extracted page, used as a reference for data extraction models. true
rir_position list The extracted position, used as a reference for data extraction models. true
rir_confidence float Confidence (estimated probability) that this field was extracted correctly. true
connector_text string Text set by the connector. true
connector_position list Position set by the connector. true
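
The annotation data is a tree: sections contain datapoint, multivalue and tuple objects, and only datapoints carry a content object. A minimal Python sketch of walking that tree follows; iter_datapoints and values_by_schema_id are illustrative names, and values are collected into lists because a schema_id may repeat inside a multivalue.

```python
def iter_datapoints(nodes):
    """Yield every datapoint object found under the given list of nodes."""
    for node in nodes:
        if node.get("category") == "datapoint":
            yield node
        else:
            # Sections, multivalues and tuples keep their members in "children".
            yield from iter_datapoints(node.get("children") or [])


def values_by_schema_id(annotation_data):
    """Map each schema_id to the list of values extracted for it."""
    values = {}
    for datapoint in iter_datapoints(annotation_data["content"]):
        content = datapoint.get("content")
        # content can be null, e.g. for datapoints of type button.
        value = content["value"] if content else None
        values.setdefault(datapoint["schema_id"], []).append(value)
    return values
```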

Validation sources

The validation_sources property is a list of sources that verified the extracted data. When the list is non-empty, the datapoint is considered validated (and no eye-icon is displayed next to it in the Rossum UI).

Currently, there are five sources of validation:

A sixth possible value, NA, indicates that validation sources are "Not Applicable"; it currently occurs only for button datapoints.

The list is subject to ongoing expansion.
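
The rule that a non-empty validation_sources list means "validated" can be expressed as a short Python sketch. The names is_validated and unvalidated_schema_ids are hypothetical; the check deliberately does not enumerate source names, since the list may grow.

```python
def is_validated(datapoint):
    """A datapoint counts as validated when validation_sources is non-empty."""
    return bool(datapoint.get("validation_sources"))


def unvalidated_schema_ids(datapoints):
    """Return the schema_id of each datapoint that still needs validation."""
    return [dp["schema_id"] for dp in datapoints if not is_validated(dp)]
```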

Example multivalue datapoint object with a grid

{
  "id": 122852,
  "schema_id": "line_items",
  "category": "multivalue",
  "grid": {
    "parts": [
      {
        "page": 1,
        "columns": [
          {
            "left_position": 348,
            "schema_id": "item_description",
            "header_texts": ["Description"]
          },
          {
            "left_position": 429,
            "schema_id": "item_quantity",
            "header_texts": ["Qty"]
          }
        ],
        "rows": [
          {
            "top_position": 618,
            "type": "header"
          },
          {
            "top_position": 649,
            "type": "data"
          }
        ],
        "width": 876,
        "height": 444
      }
    ]
  },
  ...
}

Grid object is used to store table vertical and horizontal separators and related attributes. Every grid consists of one or more parts.

Every part object consists of several attributes:

Attribute Type Description
page int Number of the page on which the grid part is located.
columns list[object] Description of grid columns.
rows list[object] Description of grid rows.
width float Total width of the grid.
height float Total height of the grid.

Every column contains attributes:

Attribute Type Description
left_position float Position of the column left edge.
schema_id string Reference to datapoint schema id. Used in grid-to-table conversion.
header_texts list[string] Extracted texts from column headers.

Every row contains attributes:

Attribute Type Description
top_position float Position of the row top edge.
type string Row type. Allowed values are specified in the schema, see grid. If null, the row is ignored during grid-to-table conversion.

Currently, only one part per page is allowed (for a particular grid).
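
To make the grid structure concrete, the Python sketch below summarizes each part and checks the one-part-per-page constraint. Both function names are assumptions made for this example; rows whose type is null are left out because they are ignored during grid-to-table conversion.

```python
def grid_summary(grid):
    """Summarize every part: page, column schema ids and rows that get converted."""
    summary = []
    for part in grid["parts"]:
        summary.append({
            "page": part["page"],
            "column_schema_ids": [col["schema_id"] for col in part["columns"]],
            # Rows with a null type are skipped by grid-to-table conversion.
            "converted_row_tops": [
                row["top_position"] for row in part["rows"] if row["type"] is not None
            ],
        })
    return summary


def has_one_part_per_page(grid):
    """Check that no page holds more than one part of this grid."""
    pages = [part["page"] for part in grid["parts"]]
    return len(pages) == len(set(pages))
```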

Get the annotation data

Get annotation data of annotation 315777

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/annotations/315777/content'

GET /v1/annotations/{id}/content

Get annotation data.

Response

Status: 200

Returns annotation data.

Send updated annotation data

Send feedback on annotation 315777

Start the annotation

curl -X POST -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/annotations/315777/start'
{
  "annotation": "https://api.elis.rossum.ai/v1/annotations/315777",
  "session_timeout": "01:00:00"
}

Get the annotation data

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/annotations/315777/content'
{
  "id": 37507206,
  "url": "https://api.elis.rossum.ai/v1/annotations/315777/content/37507206",
  "content": {
    "value": "001",
    "page": 1,
    "position": [
      302,
      91,
      554,
      56
    ],
    "rir_text": "000957537",
    "rir_position": [
      302,
      91,
      554,
      56
    ],
    "connector_text": null,
    "rir_confidence": null
  },
  "category": "datapoint",
  "schema_id": "invoice_id",
  "validation_sources": [
    "human"
  ],
  "time_spent": 2.7,
  "hidden": false
}

Patch the annotation

curl -X PATCH -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  -H 'Content-Type:application/json' -d '{"content": {"value": "#INV00011", "position": [302, 91, 554, 56]}}' \
  'https://api.elis.rossum.ai/v1/annotations/315777/content/37507206'
{
  "id": 37507206,
  "url": "https://api.elis.rossum.ai/v1/annotations/315777/content/37507206",
  "content": {
    "value": "#INV00011",
    "page": 1,
    "position": [
      302,
      91,
      554,
      56
    ],
    "rir_text": "",
    "rir_position": null,
    "rir_confidence": null,
    "connector_text": null
  },
  "category": "datapoint",
  "schema_id": "invoice_id",
  "validation_sources": [],
  "time_spent": 0,
  "hidden": false
}

Confirm the annotation

curl -X POST -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/annotations/315777/confirm'

PATCH /v1/annotations/{id}/content/{id of the child node}

Update a particular annotation content node.

It is enough to pass just the updated attributes in the PATCH payload. start must be called on the annotation first and confirm after data is updated.

Response

Status: 200

Returns updated annotation data for the given node.
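
The start, patch, confirm sequence described above can be laid out as an ordered request plan. This Python sketch only prepares the requests and does not send them; feedback_requests is a hypothetical helper.

```python
import json

BASE_URL = "https://api.elis.rossum.ai/v1"


def feedback_requests(annotation_id, node_id, content_patch):
    """Return (method, url, body) tuples in the order they must be sent."""
    annotation_url = f"{BASE_URL}/annotations/{annotation_id}"
    return [
        # 1. start must be called on the annotation first
        ("POST", f"{annotation_url}/start", None),
        # 2. only the changed attributes go into the PATCH body
        ("PATCH", f"{annotation_url}/content/{node_id}",
         json.dumps({"content": content_patch})),
        # 3. confirm after the data is updated
        ("POST", f"{annotation_url}/confirm", None),
    ]


for method, url, body in feedback_requests(315777, 37507206, {"value": "#INV00011"}):
    print(method, url, body)
```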

Page

Example page object

{
  "id": 558598,
  "annotation": "https://api.elis.rossum.ai/v1/annotations/314528",
  "number": 1,
  "rotation_deg": 0,
  "mime_type": "image/png",
  "s3_name": "4b66305775c029cb0cfa80fd0ebb2da6",
  "url": "https://api.elis.rossum.ai/v1/pages/558598",
  "content": "https://api.elis.rossum.ai/v1/pages/558598/content",
  "metadata": {}
}

A page object contains information about one page of the annotation (we render pages separately for every annotation, but this will change in the future).

Page objects are created automatically during document import and cannot be created through the API; use the queue upload endpoint instead. Pages cannot be deleted directly -- they are deleted when the parent annotation is deleted.

Attribute Type Default Description Read-only
id integer Id of the page true
url URL URL of the page. true
annotation URL Annotation that page belongs to.
number integer Page index, first page has index 1.
rotation_deg integer Page rotation.
mime_type string MIME type of the page (image/png). true
s3_name string Internal true
content URL Link to the page raw content (pdf file).
metadata object {} Client data, see metadata.

List all pages

List all pages

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/pages'
{
  "pagination": {
    "total": 1,
    "total_pages": 1,
    "next": null,
    "previous": null
  },
  "results": [
    {
      "id": 558598,
      "annotation": "https://api.elis.rossum.ai/v1/annotations/314528",
      "number": 1,
      "rotation_deg": 0,
      "mime_type": "image/png",
      "s3_name": "7eb0dcc0faa8868b55fb425d21cc60dd",
      "url": "https://api.elis.rossum.ai/v1/pages/558598",
      "content": "https://api.elis.rossum.ai/v1/pages/558598/content",
      "metadata": {}
    }
  ]
}

GET /v1/pages

Retrieve all page objects.

Supported filters: id, annotation, number

Supported ordering: id, number, s3_name

Response

Status: 200

Returns paginated response with a list of page objects.

Retrieve a page

Get page object 558598

curl -H 'Authorization: token db313f24f5738c8e04635e036ec8a45cdd6d6b03' \
  'https://api.elis.rossum.ai/v1/pages/558598'
{
  "id": 558598,
  "annotation": "https://api.elis.rossum.ai/v1/annotations/314528",
  "number": 1,
  "rotation_deg": 0,
  "mime_type": "image/png",
  "s3_name": "7eb0dcc0faa8868b55fb425d21cc60dd",
  "url": "https://api.elis.rossum.ai/v1/pages/558598",
  "content": "https://api.elis.rossum.ai/v1/pages/558598/content",
  "metadata": {}
}

GET /v1/pages/{id}

Get a page object.

Response

Status: 200

Returns page object.

FAQ

POST fails with HTTP status 500

Please check that Content-Type header in the HTTP request is set correctly (e.g. application/json).

We will improve content type checking in the future so that such requests return 400.
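
When a client other than curl is used, the header is easy to forget. The Python sketch below builds a POST request with the standard library and sets Content-Type explicitly; json_post_request is an illustrative helper, and constructing the request does not send anything.

```python
import json
import urllib.request


def json_post_request(url, token, payload):
    """Build (but do not send) a POST request with a JSON body."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",  # the header this FAQ entry is about
        },
        method="POST",
    )
```

The resulting object can be passed to urllib.request.urlopen to perform the call.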

SSL connection errors

Rossum API only supports TLS 1.2 to ensure that up-to-date algorithms and ciphers are used.

Older SSL libraries may not work properly with TLS 1.2. If you encounter an SSL/TLS compatibility issue, please make sure the library supports TLS 1.2 and that the support is switched on.
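
One way to check a Python client is to build an SSL context that refuses anything older than TLS 1.2. This is a sketch for Python 3.7 or newer; tls12_context is a hypothetical helper name.

```python
import ssl


def tls12_context():
    """Return a default SSL context that rejects protocols older than TLS 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


# ssl.HAS_TLSv1_2 reports whether the linked SSL library supports TLS 1.2.
print("TLS 1.2 available:", ssl.HAS_TLSv1_2)
```

The context can be handed to urllib.request.urlopen through its context argument.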