PDF Statements: XML to JSON migration
Announcement
In December 2025 we released a new API for the PDF Statements Parsing service.
The new API returns data in the JSON format and provides two parsing modes:
- Standard - similar flow to the current XML one
- Trusted - lets you bypass the anti-fraud validations
The previous XML-based API will be removed on April 1st, 2027. Until then, we will send periodic reminders to switch to the new version.
The API for downloading uploaded documents has also changed and requires migration.
The documentation for the old XML API has been moved to the deprecated sections, and the new API is now the standard way to use this service.
Notices:
- The aggregated data endpoint is still available only in XML format.
- The changes don’t apply to the PDF Widget/parsing documents via Insight.
Tampering verification and parsing modes
When parsing a document we check for a number of indicators that may suggest that the file has been modified after it was downloaded from the bank website.
Previously, parsing stopped as soon as any modification indicator was triggered. Now you can use the trusted mode to get the parsing result even when indicators are found.
All found indicators, if any, will be returned as a failedVerifications list, regardless of whether you use the standard or the trusted mode.
Why would I want to use the trusted mode?
In many credit application processes the tampering verification is essential. However, there are uses that don’t require a risk analysis, e.g. importing a list of accounts for future payments, importing data to a PFM solution, detecting simple spending habits.
Additionally, when using the trusted mode, you can pick and choose which indicators matter to you. If any are found, you can analyze the returned failedVerifications list and disregard the indicators that carry little risk in your use case.
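The "pick and choose" idea can be sketched as a simple allow-list check. This is only an illustration: it assumes each entry of failedVerifications can be reduced to an indicator name (the element shape is not shown on this page), ConsistencyCheckWarning is the only indicator name taken from this guide, and SomeOtherIndicator is a made-up placeholder.

```python
from typing import Iterable

# Indicators this (hypothetical) integration is willing to tolerate.
TOLERATED = {"ConsistencyCheckWarning"}


def unaccepted_indicators(failed_verifications: Iterable[str],
                          tolerated: set[str]) -> list[str]:
    """Return the indicator names that are NOT on the tolerated list."""
    return [name for name in failed_verifications if name not in tolerated]


def accept_trusted_result(failed_verifications: Iterable[str],
                          tolerated: set[str]) -> bool:
    """Accept the parsed data only if every indicator found is tolerated."""
    return not unaccepted_indicators(failed_verifications, tolerated)


if __name__ == "__main__":
    # "SomeOtherIndicator" is a placeholder, not a real indicator name.
    found = ["ConsistencyCheckWarning", "SomeOtherIndicator"]
    print(unaccepted_indicators(found, TOLERATED))
```

Keeping the tolerated set in configuration rather than in code makes it easier to tighten or relax the policy per product.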
Occasionally banks may change some layout elements or metadata. When that happens, all documents would normally be rejected until we update our parser. Now you have the option to force-parse such documents if you detect an abnormal number of failed verifications across various users. Of course, some layout changes may break the parsing flow beyond the tampering verification, in which case an error is still returned.
Lastly, perhaps you just trust some sources enough to accept any inconsistencies as simple user or bank errors. For example:
- you generated a file yourself for testing purposes, opened it, clicked something which modified metadata by accident, but you still want to extract the data
- you have a different source of confirmation that the data on the document is correct, so you just want to parse it into a machine-readable format (JSON)
Request and response differences
Statements Parsing
| | Old API (XML) | New API (JSON) |
|---|---|---|
| Endpoint | /v1/statement.xml | /v1/pdf/statement or /v1/pdf/statement-trusted |
| Required headers | Content-Type: multipart/form-data | Content-Type: multipart/form-data |
| Required parameters | File in binary or Base64 (pdf / pdfAsBase64) | File in binary (pdf) |
| Optional parameters | ownerExternalId, password, acceptInconsistent | ownerExternalId, password |
| Response and exception format | XML | JSON |
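As a rough sketch of how the new endpoints and parameters from the table fit together, the helper below assembles the pieces of an upload request without sending it. Several things here are assumptions rather than documented facts: the host is borrowed from the sample fileUrl values on this page, the HTTP method is presumed to be POST, the upload file name is arbitrary, and authentication is not covered on this page and is left out entirely.

```python
from typing import Optional

# Assumption: host taken from the sample fileUrl values; use your own environment's host.
BASE_URL = "https://test.api.kontomatik.com"


def build_statement_request(pdf_bytes: bytes,
                            trusted: bool = False,
                            owner_external_id: Optional[str] = None,
                            password: Optional[str] = None) -> dict:
    """Assemble URL, file part and form fields for a statement parsing call."""
    path = "/v1/pdf/statement-trusted" if trusted else "/v1/pdf/statement"
    data = {}
    # Optional parameters are only sent when provided.
    if owner_external_id is not None:
        data["ownerExternalId"] = owner_external_id
    if password is not None:
        data["password"] = password
    return {
        "url": BASE_URL + path,
        # The new API takes the binary file in the "pdf" field (no Base64 variant).
        "files": {"pdf": ("statement.pdf", pdf_bytes, "application/pdf")},
        "data": data,
    }


if __name__ == "__main__":
    request = build_statement_request(b"%PDF-1.4", trusted=True,
                                      owner_external_id="1765983472")
    print(request["url"], request["data"])
```

The returned dict matches the keyword arguments of `requests.post(url=..., files=..., data=...)` from the third-party Requests library, which sets the multipart/form-data header itself when `files` is given. Note that acceptInconsistent is deliberately absent, since the new API no longer lists it.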
<?xml version="1.0" encoding="utf-8"?>
<reply xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="https://developer.kontomatik.com/schema/v1/statement.xsd"
status="200 OK"
user="demo-test"
ownerExternalId="1765983311" responseTimestamp="2025-12-17T14:55:41Z">
<result>
<statement id="549008776" isConsistent="true">
<owners>
<owner>
<name>Marian Etatowy</name>
<kind>OWNER</kind>
</owner>
</owners>
<period>
<startOn>2025-05-01</startOn>
<endOn>2025-05-31</endOn>
</period>
<type>MONTHLY_STATEMENT</type>
<target name="KontoBank" officialName="KontoBank" institution="KontoBank">
<accounts>
<account>
<name>Konto Standard</name>
<iban>PL26167030183854195847457003</iban>
<currencyBalance>49995.00</currencyBalance>
<currencyName>PLN</currencyName>
<activeSinceAtLeast>2025-05-06</activeSinceAtLeast>
<moneyTransactions>
<moneyTransaction>
<transactionOn>2025-05-06</transactionOn>
<bookedOn>2025-05-06</bookedOn>
<currencyAmount>-50.00</currencyAmount>
<currencyBalance>49995.00</currencyBalance>
<title>Przelew własny</title>
<party>Marian Etatowy</party>
<partyIban>PL31109024028565354825788485</partyIban>
<kind>PRZELEW WYCHODZĄCY</kind>
<status>DONE</status>
<labels>
<label>internal</label>
</labels>
</moneyTransaction>
</moneyTransactions>
<owners>
<owner>
<name>Marian Etatowy</name>
<kind>OWNER</kind>
</owner>
</owners>
</account>
</accounts>
</target>
</statement>
</result>
</reply>
{
"requestId": "019b2cd1-7679-77ca-a34b-8551fe5489d1",
"content": {
"statement": {
"id": 549008818,
"ownerExternalId": "1765983472",
"target": {
"name": "KontoBank",
"institution": "KontoBank",
"officialName": "KontoBank"
},
"type": "MONTHLY_STATEMENT",
"period": {
"start": "2025-05-01",
"end": "2025-05-31"
}
},
"owners": [
{
"name": "Marian Etatowy",
"kind": "OWNER"
}
],
"accounts": [
{
"name": "Konto Standard",
"iban": "PL26167030183854195847457003",
"currencyBalance": 49995,
"currencyName": "PLN",
"owners": [
{
"name": "Marian Etatowy",
"kind": "OWNER"
}
],
"activeSinceAtLeast": "2025-05-06",
"moneyTransactions": [
{
"transactionOn": "2025-05-06",
"bookedOn": "2025-05-06",
"currencyAmount": -50,
"currencyBalance": 49995,
"title": "Przelew własny",
"party": "Marian Etatowy",
"partyIban": "PL31109024028565354825788485",
"kind": "PRZELEW WYCHODZĄCY",
"analytics": {
"labels": [
{
"name": "internal",
"vendors": []
}
]
}
}
]
}
]
}
}
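To illustrate reading the JSON reply above, here is a minimal sketch that flattens the transactions of every account. It relies only on fields visible in the sample (content.accounts, moneyTransactions, analytics.labels[].name) and uses a trimmed copy of that sample; which fields are optional is an assumption, so labels are read defensively.

```python
import json
from decimal import Decimal

# Trimmed copy of the sample reply shown above.
SAMPLE_REPLY = """
{
  "requestId": "019b2cd1-7679-77ca-a34b-8551fe5489d1",
  "content": {
    "accounts": [
      {
        "iban": "PL26167030183854195847457003",
        "currencyName": "PLN",
        "moneyTransactions": [
          {
            "bookedOn": "2025-05-06",
            "currencyAmount": -50,
            "title": "Przelew własny",
            "analytics": {"labels": [{"name": "internal", "vendors": []}]}
          }
        ]
      }
    ]
  }
}
"""


def list_transactions(response_text: str) -> list[dict]:
    """Flatten all transactions from all accounts into simple rows."""
    # parse_float=Decimal keeps fractional amounts exact instead of using floats.
    reply = json.loads(response_text, parse_float=Decimal)
    rows = []
    for account in reply["content"]["accounts"]:
        for tx in account["moneyTransactions"]:
            labels = tx.get("analytics", {}).get("labels", [])
            rows.append({
                "iban": account["iban"],
                "currency": account["currencyName"],
                "bookedOn": tx["bookedOn"],
                # Amounts arrive as JSON numbers, not as strings like in the XML reply.
                "amount": Decimal(tx["currencyAmount"]),
                "title": tx["title"],
                "labels": [label["name"] for label in labels],
            })
    return rows


if __name__ == "__main__":
    for row in list_transactions(SAMPLE_REPLY):
        print(row)
```

When porting an XML integration, the two samples suggest a few renames to watch for: period.startOn/endOn become period.start/end, and labels move under analytics.labels as objects with a name.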
Download documents
| | Old API (XML) | New API |
|---|---|---|
| Endpoint | /v1/documents.xml | /v1/pdf/statements |
| Required parameters | ownerExternalId | ownerExternalId |
| Response and exception format | XML | JSON |
| Fetching the file | GET request to fileUrl | GET request to fileUrl |
| Fetching the file exception format | XML | JSON |
<?xml version="1.0" encoding="utf-8"?>
<reply xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="https://developer.kontomatik.com/schema/v1/get_documents.xsd"
status="200 OK"
user="demo-test"
responseTimestamp="2026-01-08T07:58:15Z">
<documents>
<document>
<state>SUCCESSFUL</state>
<parsingDate>2026-01-08</parsingDate>
<fileUrl>https://test.api.kontomatik.com/v1/document-file?documentId=549262652</fileUrl>
</document>
<document>
<state>ERROR</state>
<parsingDate>2026-01-08</parsingDate>
<fileUrl>https://test.api.kontomatik.com/v1/document-file?documentId=549262691</fileUrl>
</document>
</documents>
</reply>
{
"requestId": "019b9c9c-64ee-791a-a89f-90ba1311db0c",
"content": {
"ownerExternalId": "1767858629",
"documents": [
{
"statementId": "549262652",
"state": "SUCCESSFUL",
"parsingDate": "2026-01-08",
"fileUrl": "https://test.api.kontomatik.com/v1/pdf/statements/549262652"
},
{
"statementId": "549262691",
"state": "ERROR",
"parsingDate": "2026-01-08",
"fileUrl": "https://test.api.kontomatik.com/v1/pdf/statements/549262691"
}
]
}
}
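A small sketch of consuming the JSON document list above: it collects the fileUrl of every successfully parsed document, keyed by statementId. The field names come from the sample reply; treating SUCCESSFUL as the only state worth downloading is an assumption made for the example.

```python
import json

# Copy of the sample reply shown above.
SAMPLE_DOCUMENTS = """
{
  "requestId": "019b9c9c-64ee-791a-a89f-90ba1311db0c",
  "content": {
    "ownerExternalId": "1767858629",
    "documents": [
      {
        "statementId": "549262652",
        "state": "SUCCESSFUL",
        "parsingDate": "2026-01-08",
        "fileUrl": "https://test.api.kontomatik.com/v1/pdf/statements/549262652"
      },
      {
        "statementId": "549262691",
        "state": "ERROR",
        "parsingDate": "2026-01-08",
        "fileUrl": "https://test.api.kontomatik.com/v1/pdf/statements/549262691"
      }
    ]
  }
}
"""


def successful_file_urls(response_text: str) -> dict[str, str]:
    """Map statementId -> fileUrl for documents in the SUCCESSFUL state."""
    reply = json.loads(response_text)
    return {
        doc["statementId"]: doc["fileUrl"]
        for doc in reply["content"]["documents"]
        if doc["state"] == "SUCCESSFUL"
    }


if __name__ == "__main__":
    print(successful_file_urls(SAMPLE_DOCUMENTS))
```

Using the fileUrl exactly as returned, rather than building it from the ID, avoids depending on the URL shape, which differs between the XML and JSON samples.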
Handling inconsistent layouts
The new API still recognizes some documents as inconsistent by default. Those are the documents that lack some information (e.g. balances, applied filters) that would allow us to check for consistency between transactions and sum totals.
The handling of such documents changes with the new API.
| | Old API | New API |
|---|---|---|
| Detect inconsistent layout | InvalidStatementConsistency exception returned + specific message, or isConsistent = false (when acceptInconsistent = true was used) | Standard mode: FailedVerifications exception + ConsistencyCheckWarning in the failedVerifications list within the exception structure. Trusted mode: ConsistencyCheckWarning in the failedVerifications list within the content structure. |
| Accept inconsistent layout | Use acceptInconsistent = true in the API call | Either: 1. Use standard mode. 2. Receive FailedVerifications + ConsistencyCheckWarning. 3. Check if you accept other failed verifications (if there are any). 4. Use trusted mode for the same PDF. Or: 1. Use trusted mode. 2. Receive FailedVerifications + ConsistencyCheckWarning. 3. Check if you accept other failed verifications (if there are any). 4. Review the parsed data. 5. Save the output in your systems.* In both scenarios you will be billed only once. |

*The output of the trusted parsing will be saved on Kontomatik servers regardless of whether you accept or save it on your side.
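The first scenario (standard mode first, trusted mode only as a fallback) can be sketched as a small decision function. Everything around the decision is hypothetical: ParseOutcome is an invented simplification of a reply, and parse_standard / parse_trusted stand in for whatever client code calls the two endpoints; the JSON exception structure itself is not shown on this page.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ParseOutcome:
    """Hypothetical, simplified view of one API reply."""
    data: Optional[dict]  # parsed content, or None when an exception was returned
    failed_verifications: list[str] = field(default_factory=list)


def parse_accepting_inconsistent(pdf: bytes,
                                 parse_standard: Callable[[bytes], ParseOutcome],
                                 parse_trusted: Callable[[bytes], ParseOutcome],
                                 tolerated: set[str]) -> ParseOutcome:
    """Try standard mode; retry in trusted mode only if all indicators are tolerated."""
    first = parse_standard(pdf)
    if first.data is not None:
        return first  # parsed cleanly, nothing to decide
    if not first.failed_verifications:
        return first  # plain parsing error, no indicators to accept
    if any(name not in tolerated for name in first.failed_verifications):
        return first  # at least one indicator we do not accept
    return parse_trusted(pdf)  # same PDF, now in trusted mode


if __name__ == "__main__":
    # Stub callables standing in for real API clients.
    standard = lambda pdf: ParseOutcome(None, ["ConsistencyCheckWarning"])
    trusted = lambda pdf: ParseOutcome({"accounts": []}, ["ConsistencyCheckWarning"])
    print(parse_accepting_inconsistent(b"%PDF", standard, trusted,
                                       {"ConsistencyCheckWarning"}))
```

Passing the two callables in keeps the decision logic testable without network access; the trusted call is made at most once per document.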
Billing
The billing rules are the same as they’ve always been and they are strictly related to whether we returned the parsed data or not.
| PDF Characteristic | Old API (XML) | New API (Standard Mode) | New API (Trusted Mode) |
|---|---|---|---|
| Document was parsed without issues | Data returned; billed | Data returned; billed | Data returned; billed |
| Document could be parsed but modification indicators were found | Exception returned; failed verifications returned; not billed | Exception returned; failed verifications returned; not billed | Data returned; failed verifications returned; billed |
| Parsing error occurred | Exception returned; not billed | Exception returned; not billed | Exception returned; not billed |
| Parsing error occurred and modification indicators were found | Exception returned; failed verifications returned; not billed | Exception returned; failed verifications returned; not billed | Exception returned; failed verifications returned; not billed |
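The table reduces to one rule: a call is billed exactly when parsed data is returned. The sketch below encodes only the four rows of the table; the Mode enum and function names are invented for the illustration, and cases outside the table (such as the old acceptInconsistent flag) are not modelled.

```python
from enum import Enum


class Mode(Enum):
    XML = "old XML API"
    STANDARD = "new API, standard mode"
    TRUSTED = "new API, trusted mode"


def data_returned(mode: Mode, parsed_ok: bool, indicators_found: bool) -> bool:
    """True when the reply contains parsed data rather than an exception."""
    if not parsed_ok:
        return False  # parsing errors yield an exception in every mode
    if indicators_found:
        return mode is Mode.TRUSTED  # only trusted mode returns data despite indicators
    return True


def is_billed(mode: Mode, parsed_ok: bool, indicators_found: bool) -> bool:
    """Billing follows data: billed if and only if parsed data was returned."""
    return data_returned(mode, parsed_ok, indicators_found)


if __name__ == "__main__":
    for mode in Mode:
        print(mode.value, is_billed(mode, parsed_ok=True, indicators_found=True))
```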