PDF Parsing API

Summary

The PDF Parsing API extracts data from bank statements in PDF format. Such files lack a global structure, which makes extraction especially hard, but our proprietary algorithms are able to recognise transactions, KYC and other related data. Moreover, the algorithms aim to verify the credibility and integrity of the information contained in the statements.

As with our Banking API, data from PDF statements can be complemented by our labels and scoring solutions.

How to integrate

Integrating the PDF Parsing API takes two steps. First, upload a bank statement in PDF format, encoded as Base64 (plain text), to the Kontomatik endpoint. Then call the data endpoint to get the results back in XML format.
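The upload step can be sketched in Python as follows. This is only an illustration under stated assumptions: the endpoint URL, the JSON body, the "statement" field name and the "X-Api-Key" header are hypothetical placeholders, not the actual Kontomatik API, so check the API reference for the real request shape. Only the Base64 encoding itself follows directly from the description above.

```python
import base64
import json
import urllib.request

# Hypothetical placeholder; the real endpoint URL comes from the API reference.
STATEMENT_ENDPOINT = "https://api.example.com/v1/statement"


def encode_statement(pdf_bytes: bytes) -> str:
    """Return the raw PDF bytes as Base64 plain text (ASCII)."""
    return base64.b64encode(pdf_bytes).decode("ascii")


def build_upload_request(pdf_bytes: bytes, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the encoded statement.

    The JSON body, the "statement" field and the "X-Api-Key" header are
    assumptions made for this sketch.
    """
    body = json.dumps({"statement": encode_statement(pdf_bytes)}).encode("utf-8")
    return urllib.request.Request(
        STATEMENT_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json", "X-Api-Key": api_key},
        method="POST",
    )
```

To use it, read the statement in binary mode (open(path, "rb")) so the bytes are not altered, build the request, and send it with urllib.request.urlopen(request).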

If you choose our SaaS solution, you can alternatively use our front-end widget to make these steps even more seamless.

Supported banks and features

The PDF Parsing API is currently available only in Poland and we support all major banks there.

However, we can only extract as much information as a given statement contains, and this differs from bank to bank. We’ve prepared a table showing what you can get from each bank.

Some of the features include:

Supported types of documents:

The following are not supported:

Anti-tampering verification

Verification aims to hinder malicious attempts at tampering with the PDF statements. Our algorithms verify:

Checking these properties is not guaranteed to detect every fraud attempt, but based on our analysis they are the ones most commonly affected by tampering.

Sometimes a PDF document is edited accidentally. In that case, it’s best to ask the end-user to download the document again and upload it to the parser without opening it first.

PDF Parsing Widget

Our PDF Parsing Widget is a front-end widget that makes the process of parsing the bank statement even more seamless. It is available as part of our SaaS solution.

The end-user chooses the bank and uploads a bank statement as a PDF file. You can then get the extracted information using our data endpoint.
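Once the data endpoint returns its XML, reading transactions out of it might look like the sketch below. The element names (transaction, date, amount, title) and the sample document are invented for illustration only; the real schema is defined in the API documentation and will differ.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Invented sample; the real response schema will differ.
SAMPLE_XML = """<reply>
  <transactions>
    <transaction><date>2024-01-05</date><amount>-120.50</amount><title>Rent</title></transaction>
    <transaction><date>2024-01-10</date><amount>3000.00</amount><title>Salary</title></transaction>
  </transactions>
</reply>"""


def parse_transactions(xml_text: str) -> list:
    """Collect every <transaction> element into a plain dictionary."""
    root = ET.fromstring(xml_text)
    transactions = []
    for node in root.iter("transaction"):
        transactions.append({
            "date": node.findtext("date", default=""),
            # Decimal avoids floating-point rounding on monetary amounts.
            "amount": Decimal(node.findtext("amount", default="0")),
            "title": node.findtext("title", default=""),
        })
    return transactions


if __name__ == "__main__":
    for tx in parse_transactions(SAMPLE_XML):
        print(tx["date"], tx["amount"], tx["title"])
```

Amounts are parsed as Decimal rather than float so that sums over many transactions stay exact.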

On-premise deployment

Our PDF Parsing solution is mainly used as SaaS, but we also offer on-premise deployments. These are designed for clients who don’t want their data to pass through any servers other than their own.

This option does, however, have some disadvantages:

To find out more about this solution, contact our Sales team.

FAQ

PDF parsing is a general term for a class of methods that extract human-readable plain text from PDF documents.

Our proprietary PDF Parsing solution is designed specifically for handling bank documents. It automatically recognises transactions, KYC and other related data in a bank statement and extracts them for further processing.

Moreover, our algorithm always tries to verify that the statement hasn’t been edited, that the data is consistent, that the digital signature is correct and that other expected features are in place, so that you can protect yourself from accepting statements that have been tampered with.

The choice between PDF Parsing and the Banking API depends on what is important to you and your end-users. In both cases, you will get the data in the same format. The main difference is that PDF parsing doesn’t require the user to log in to their online bank via our widget. As a result, the end-user has more flexibility in how they obtain the bank statement.

On the other hand, the Banking API combined with our widget makes it easier for the end-user to provide the data.

Finally, the PDF Parsing solution can work entirely from your servers (on-premise), in contrast to the Banking API, which is served only via the cloud (SaaS)*.

*The on-premise PDF Parsing solution requires a separate contract and fees from the SaaS version. We recommend it only to large companies with highly developed infrastructure and IT security teams.

To parse a PDF bank statement, upload it, encoded as Base64, to the statement endpoint. You then get the results back in XML format.

If you choose our SaaS solution, you can also use our front-end widget to upload a PDF without encoding it and get the XML later using the data endpoint.
