PDF Statements Parsing
Summary
PDF Statements Parsing allows you to upload end-users’ bank statements and receive the extracted data in a processing-friendly format.
Each document is checked against our anti-tampering verification rules.
Supported documents
PDF Statements Parsing is supported only in Poland, for individual clients and sole proprietors (“Jednoosobowa działalność gospodarcza”).
PDF Statements Parsing supports two types of documents:
- monthly statements (“Wyciąg miesięczny”) - documents generated by the bank each month that the user can download without an option to filter the transactions
- account history (“Historia rachunku”) - documents generated on-demand by the user with possible filters applied
For the list of supported banks please refer to our Coverage.
The following documents will be rejected:
- statements from company/corporate accounts (other than sole proprietors, i.e. “Jednoosobowa działalność gospodarcza”),
- statements in English or other languages (the layout needs to be in Polish),
- encrypted statements,
- scans, photos or otherwise prepared documents not downloaded directly from the user’s online banking.
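A cheap local pre-check can catch some of these cases before a file is sent. The sketch below is only an illustration and is not the service's own validation. `precheck_pdf` is a hypothetical helper that looks for the standard `%PDF-` file header and for an `/Encrypt` entry, which encrypted PDFs declare. The byte search is a heuristic, and it cannot detect scans, the document language or the account type.

```python
from typing import List


def precheck_pdf(data: bytes) -> List[str]:
    """Return reasons a file would likely be rejected (empty list if none found).

    Hypothetical client-side heuristic, not the service's verification.
    """
    problems: List[str] = []
    # PDF files begin with a "%PDF-" header followed by the version number.
    if not data.startswith(b"%PDF-"):
        problems.append("missing %PDF- header")
    # Encrypted PDFs reference an /Encrypt dictionary; a plain byte search
    # is only an approximation and may misjudge unusual files.
    if b"/Encrypt" in data:
        problems.append("declares an /Encrypt dictionary")
    return problems
```

An empty result only means that none of these two issues were found; the file can still be rejected by the service for other reasons.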
Assuming you will use the same end-user identifier (ownerExternalId) for uploaded documents as for our other Importing Services, all data will be aggregated. For example, you can combine data from AIS, PDFs and Owner Upload.
How to use
Below you’ll find 3 ways in which you can use this service. While the PDF Widget and Insight are the easiest to start with, we strongly recommend API integration as the one offering the most flexibility and often resulting in a better end-user experience.
API integration
To integrate with our API you will need to:
- Get API access
- Get your Client ID from Insight, generate API key(s) and whitelist your server(s)
- Create a frontend solution to receive documents from your users and save them on your server
- Integrate with the Parse Statement endpoint
- (optionally) Integrate with Analytical Services endpoints
- Save the data from our API
Sample process flow
- A user visits your website, fills in a form to start your process
- You give the user an option to upload their bank statements
- Assign a unique identifier to the user
- Send each file separately to the Parse Statement endpoint and pass the same previously generated identifier each time
- For each file you will receive the extracted data separately
- (optionally) Make a request to the Aggregated Data endpoint to fetch all extracted data joined together
- (optionally) Make requests to our Analytical Services endpoints to fetch extra insights into the owner’s finances
- Inform the user about the status of the verification or your decision regarding the user’s application
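The core of the flow above is that one identifier is generated per user and reused for every file. A minimal sketch of that step follows, assuming a placeholder URL and hypothetical names (`PARSE_STATEMENT_URL`, `UploadRequest`, `plan_uploads`); the real endpoint address, authentication and field names come from the API reference and are not shown here. The code only prepares the requests and does not send anything.

```python
import uuid
from dataclasses import dataclass
from pathlib import Path
from typing import List

# Placeholder only; take the real Parse Statement URL from the API reference.
PARSE_STATEMENT_URL = "https://api.example.invalid/parse-statement"


@dataclass
class UploadRequest:
    """One planned call to the Parse Statement endpoint (hypothetical shape)."""
    url: str
    owner_external_id: str
    file_path: Path


def new_owner_external_id() -> str:
    """Generate a unique identifier for one end-user."""
    return uuid.uuid4().hex


def plan_uploads(files: List[Path], owner_external_id: str) -> List[UploadRequest]:
    """Create one request per file, all sharing the same identifier."""
    return [UploadRequest(PARSE_STATEMENT_URL, owner_external_id, f) for f in files]
```

For example, `plan_uploads([Path("january.pdf"), Path("february.pdf")], new_owner_external_id())` yields two requests with the same `owner_external_id`, which is what allows the extracted data to be fetched together later.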
PDF Widget
If you don’t want to create your own frontend for receiving PDFs, or you don’t want to integrate with our API, you can process the documents through our frontend Widget dedicated to PDF Statements Parsing.
In order to do so:
- Get API access
- Get your Client ID from Insight
- Embed our PDF Widget on your website
- Configure the Widget parameters
- Optionally you can also:
    - Assign a unique ownerExternalId to each end-user if you want to download the aggregated data later via our API
    - Handle the onFinished callback, for example, if you want to redirect the end-user after they have uploaded the documents
In the Widget embedded on your website, the end-user will have to choose their bank and then upload their statements. For each file they will see if it’s Accepted or Rejected.
Once all documents are uploaded, they have to click “Finish” (which is when the onFinished callback is triggered).
The uploaded data will be available for you to review in Insight.
Insight
Your employees can also manually upload documents received from end-users, e.g., by email. This is done via our PDF Widget, but it’s embedded in the Insight platform.
If you want to use this feature:
- Get API access
- Log in to Insight
- Choose the “Upload PDF Statement” option in the menu on the left
- Assign an ID to the user (it can be an ID of your credit application)
- Upload documents you received from the user
- Click “Finished” when you’re done uploading all the documents
- You will be redirected to the page with the user data visualization
Please also note that:
- You can upload more data for the same end-user later if, when assigning an ID to the user, you provide the same ID that you used previously (e.g., if you received more documents or want to upload statements from another bank).
- When you’re redirected to the user data visualization page, it might say that the import is still in progress. This means we’re still processing the parsed transactions; refresh the page after a few seconds and the alert should be gone.
Testing
To test this service, you can either:
- use real bank statements that you have acquired (works in both the test and production environments);
- use our mock bank statements generated from KontoBank (works only in the test environment).
Both types of documents can be used with all the supported upload methods, i.e. API, PDF Widget or Insight.
Remember that in the test environment you will have to use the test API URLs, your test Client ID, API keys and whitelisted servers.
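One way to keep test and production credentials apart is to load them per environment. The sketch below assumes hypothetical environment variable names (`TEST_API_URL`, `TEST_CLIENT_ID`, `TEST_API_KEY` and their `PRODUCTION_` counterparts) and a hypothetical `load_settings` helper; none of these names are defined by the service.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ApiSettings:
    base_url: str
    client_id: str
    api_key: str


def load_settings(environment: str, env: Mapping[str, str] = os.environ) -> ApiSettings:
    """Read settings for "test" or "production" from prefixed variables.

    The variable names are assumptions made for this sketch.
    """
    if environment not in ("test", "production"):
        raise ValueError(f"unknown environment: {environment!r}")
    prefix = environment.upper()
    # A missing variable raises KeyError, so a test setup cannot silently
    # fall back to production values (or the other way round).
    return ApiSettings(
        base_url=env[f"{prefix}_API_URL"],
        client_id=env[f"{prefix}_CLIENT_ID"],
        api_key=env[f"{prefix}_API_KEY"],
    )
```

Keeping the two sets of values under separate prefixes makes it harder to mix a test Client ID with a production URL by accident.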
To generate a fake statement from KontoBank:
- Go to https://bank.kontomatik.com/pdf/
- Choose one of the available mock accounts
- Choose document type:
    - monthly statement - covers exactly one month for a single account
    - account history - can include multiple accounts within a custom date range
- Select dates for which the statement will be generated
- (optional) Add errors to test negative scenarios
- Click “Generate PDF” and then save the file
Then upload the document using one of the three methods described above. Further data analysis is available for these documents as well.
Anti-tampering verification
Verification aims to hinder malicious attempts at tampering with PDF documents. Our algorithms verify:
- Bank’s digital signature
- Consistency of the account balance with transactions
- PDF metadata characteristics
- Fonts, colors and sizes
- Bank logotype
- Consistency of the header period with transaction dates
- Keywords
- Document structure
It’s not guaranteed that checking these properties is enough to spot fraud, but based on our analysis these are the most common indicators.
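To make two of the checks listed above concrete (balance consistency and header period versus transaction dates), here is a simplified illustration. It is not the service's algorithm; the `Transaction` type and both function names are hypothetical, and real statements involve more cases than this sketch covers.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import List


@dataclass
class Transaction:
    booked_on: date
    amount: Decimal  # negative for debits, positive for credits


def balance_is_consistent(opening: Decimal, closing: Decimal,
                          transactions: List[Transaction]) -> bool:
    """Opening balance plus all transaction amounts should equal the closing balance."""
    total = sum((t.amount for t in transactions), Decimal("0"))
    return opening + total == closing


def dates_within_period(period_start: date, period_end: date,
                        transactions: List[Transaction]) -> bool:
    """Every transaction date should fall inside the period stated in the header."""
    return all(period_start <= t.booked_on <= period_end for t in transactions)
```

`Decimal` is used instead of `float` so that amounts such as 120.50 are compared exactly, without binary rounding errors.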
Occasionally, someone might accidentally edit the PDF document. In such situations, it’s best to ask the end-user to download the document again and not to open it before uploading it to the parser.
On-premise deployment
Our PDF parsing solution is mainly used in the SaaS model, but we also offer on-premise deployments, designed for clients who don’t want their data to pass through servers other than their own.
This option does, however, have some disadvantages:
- you need to maintain infrastructure and security all on your own,
- only the API integration option is available (no Widget or Insight),
- it’s not updated as frequently as the SaaS solution and you’re in charge of the installation,
- if any bugs arise, debugging becomes much harder and, as a result, it might take longer for our developers to fix the problems.
To find out more about this solution, contact our Sales team.