PDF Statements: XML to JSON migration
Summary
In December 2025 we released a new version of our PDF parsing API. It responds to customer requests for updates to the PDF Statements Parsing service, particularly regarding the response format and anti-fraud verifications.
The new API returns data in the JSON format and provides two parsing modes:
- Standard - similar to the current XML version
- Trusted - returns the parsing result even if potential signs of modification are detected, such as incorrect fonts, invalid logos, or metadata inconsistencies
The trusted mode still requires specific statement layouts; however, validation checks related to possible document modifications do not stop the parsing process. Any inconsistencies detected are returned alongside the parsed data as indicators for further analysis and decision-making on the client’s side. If the trusted mode doesn’t return data, either the statement layout is not supported or an error prevented further processing.
The document download API (documents.xml) also requires migration.
The documentation regarding the old XML API was moved to the deprecated sections, whereas the new API has taken over as the standard method of using this service.
On April 1, 2027, the current XML API will be permanently removed. We therefore kindly ask you to plan the necessary updates to your system integrations.
Notices:
- The aggregated data endpoint is still available only in the XML format.
- The changes don’t apply to the PDF Widget/parsing documents via Insight.
Tampering verification and parsing modes
When parsing a document, we check for a number of indicators that may suggest that the file has been modified after it was downloaded from the bank website. Previously, parsing stopped as soon as any modification indicator was triggered. Now you can use the trusted mode to get the parsing result anyway.
Any indicators found are returned in the failedVerifications list, regardless of whether you use the standard or the trusted mode.
Why would I want to use the trusted mode?
In many credit application processes the tampering verification is essential. However, there are use cases that don’t require a risk analysis, e.g. importing a list of accounts for future payments, importing data to a PFM solution, or detecting simple spending habits.
Additionally, when using the trusted mode, you can pick and choose which indicators matter to you. If any are found, you can analyze the returned failedVerifications list and exclude those indicators that don’t seem to generate as much risk.
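A minimal sketch of such a filter follows. It assumes that each entry of failedVerifications is either a plain string or an object with a "name" field; this guide does not show the entry schema, so check the full technical documentation. The tolerated list and the name "SomeOtherVerification" are illustrative placeholders.

```python
from typing import Any

# Hypothetical policy: verification names this integration tolerates.
# "ConsistencyCheckWarning" is mentioned in this guide; take the real
# names from the full technical documentation.
TOLERATED = frozenset({"ConsistencyCheckWarning"})


def verification_names(content: dict[str, Any]) -> list[str]:
    """Return the names found in content["failedVerifications"].

    Assumption: each entry is a plain string or an object with a "name"
    field. Adjust this to the actual response schema.
    """
    names = []
    for entry in content.get("failedVerifications", []):
        names.append(entry["name"] if isinstance(entry, dict) else str(entry))
    return names


def blocking_verifications(content: dict[str, Any],
                           tolerated: frozenset = TOLERATED) -> list[str]:
    """Return the failed verifications that are NOT on the tolerated list."""
    return [n for n in verification_names(content) if n not in tolerated]


# Made-up trusted-mode content for illustration.
trusted_content = {
    "failedVerifications": [
        {"name": "ConsistencyCheckWarning"},
        {"name": "SomeOtherVerification"},
    ]
}
blocking = blocking_verifications(trusted_content)
print("accept" if not blocking else f"manual review: {blocking}")
```

An empty result means every reported indicator is one you decided to tolerate; anything else can be routed to manual review.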
Occasionally, banks may change some layout elements or metadata. When that happens, all affected documents would normally be rejected until we update our parser. Now you have the option to force-parse such documents if you detect an abnormal number of the same failed verifications coming from various users. Of course, some layout changes may break the parsing flow beyond the tampering verification and would still return an error.
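One way to spot such a situation on your side is to count how many distinct users hit the same failed verification in a given time window. The sketch below is only an illustration: the event format, the threshold, and the verification names are assumptions, not part of the API.

```python
from collections import defaultdict


def widespread_verifications(events, min_users=5):
    """Return verification names that failed for at least `min_users` distinct users.

    `events` is an iterable of (owner_external_id, verification_name) pairs
    collected by your own system, for example over the last 24 hours.
    """
    users_by_name = defaultdict(set)
    for owner_id, name in events:
        users_by_name[name].add(owner_id)
    return sorted(name for name, users in users_by_name.items()
                  if len(users) >= min_users)


# Made-up events: "VerificationA" fails for six different users,
# "VerificationB" only for one user (twice).
events = [(f"user-{i}", "VerificationA") for i in range(6)]
events += [("user-0", "VerificationB"), ("user-0", "VerificationB")]
print(widespread_verifications(events))
```

A verification that suddenly fails for many unrelated users is more likely to be caused by a bank-side layout or metadata change than by individual tampering, which is the case where switching to the trusted mode may be justified.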
Lastly, perhaps you just trust some sources enough to accept any inconsistencies as simple user or bank errors. For example:
- you generated a file yourself for testing purposes, opened it, clicked something which modified metadata by accident, but you still want to extract the data
- you have a different source of confirmation that the data on the document is correct, so you just want to parse it into a machine-readable format (JSON)
Standard mode cannot be migrated 1:1 from the XML API
Please note that the standard mode in the JSON API is not a direct replacement for the /statement.xml endpoint in terms of document handling. The most notable difference concerns how inconsistent statement layouts are processed in the new API. For details, please refer to the Handling inconsistent layouts section.
Request and response differences
Below you’ll find request and response examples comparing both API versions. These examples are provided for comparison purposes only and should not be treated as a complete API specification. Before implementing any changes, please refer to the full technical documentation, as field availability and behavior may differ from what is shown in the examples.
Statements Parsing
| | Old API (XML) | New API (JSON) |
|---|---|---|
| Endpoint | /v1/statement.xml | /v1/pdf/statement or /v1/pdf/statement-trusted |
| Required headers | Content-Type: multipart/form-data | Content-Type: multipart/form-data |
| Required parameters | File in binary or Base64 (pdf / pdfAsBase64) | File in binary (pdf) |
| Optional parameters | ownerExternalId, password, acceptInconsistent | ownerExternalId, password |
| Response and exception format | XML | JSON |

Example response, old API (XML):
<?xml version="1.0" encoding="utf-8"?>
<reply xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="https://developer.kontomatik.com/schema/v1/statement.xsd"
status="200 OK"
user="demo-test"
ownerExternalId="1765983311" responseTimestamp="2025-12-17T14:55:41Z">
<result>
<statement id="549008776" isConsistent="true">
<owners>
<owner>
<name>Marian Etatowy</name>
<kind>OWNER</kind>
</owner>
</owners>
<period>
<startOn>2025-05-01</startOn>
<endOn>2025-05-31</endOn>
</period>
<type>MONTHLY_STATEMENT</type>
<target name="KontoBank" officialName="KontoBank" institution="KontoBank">
<accounts>
<account>
<name>Konto Standard</name>
<iban>PL26167030183854195847457003</iban>
<currencyBalance>49995.00</currencyBalance>
<currencyName>PLN</currencyName>
<activeSinceAtLeast>2025-05-06</activeSinceAtLeast>
<moneyTransactions>
<moneyTransaction>
<transactionOn>2025-05-06</transactionOn>
<bookedOn>2025-05-06</bookedOn>
<currencyAmount>-50.00</currencyAmount>
<currencyBalance>49995.00</currencyBalance>
<title>Przelew własny</title>
<party>Marian Etatowy</party>
<partyIban>PL31109024028565354825788485</partyIban>
<kind>PRZELEW WYCHODZĄCY</kind>
<status>DONE</status>
<labels>
<label>internal</label>
</labels>
</moneyTransaction>
</moneyTransactions>
<owners>
<owner>
<name>Marian Etatowy</name>
<kind>OWNER</kind>
</owner>
</owners>
</account>
</accounts>
</target>
</statement>
</result>
</reply>
Example response, new API (JSON):
{
"requestId": "019b2cd1-7679-77ca-a34b-8551fe5489d1",
"content": {
"statement": {
"id": 549008818,
"ownerExternalId": "1765983472",
"target": {
"name": "KontoBank",
"institution": "KontoBank",
"officialName": "KontoBank"
},
"type": "MONTHLY_STATEMENT",
"period": {
"start": "2025-05-01",
"end": "2025-05-31"
}
},
"owners": [
{
"name": "Marian Etatowy",
"kind": "OWNER"
}
],
"accounts": [
{
"name": "Konto Standard",
"iban": "PL26167030183854195847457003",
"currencyBalance": 49995,
"currencyName": "PLN",
"owners": [
{
"name": "Marian Etatowy",
"kind": "OWNER"
}
],
"activeSinceAtLeast": "2025-05-06",
"moneyTransactions": [
{
"transactionOn": "2025-05-06",
"bookedOn": "2025-05-06",
"currencyAmount": -50,
"currencyBalance": 49995,
"title": "Przelew własny",
"party": "Marian Etatowy",
"partyIban": "PL31109024028565354825788485",
"kind": "PRZELEW WYCHODZĄCY",
"analytics": {
"labels": [
{
"name": "internal",
"vendors": []
}
]
}
}
]
}
]
}
}
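Judging only by the two examples above, several fields change place or name between the formats: period.startOn/endOn become period.start/end, the statement-level owners move to content.owners, labels are nested under analytics.labels[].name, and whole amounts are serialized as JSON integers (49995 instead of 49995.00). The following sketch shows one way to read the JSON response into flat transaction records while keeping amounts exact. The function name and the record layout are illustrative assumptions, not part of the API.

```python
import json
from decimal import Decimal

# Shortened copy of the JSON example above.
RESPONSE = """
{
  "requestId": "019b2cd1-7679-77ca-a34b-8551fe5489d1",
  "content": {
    "statement": {"id": 549008818, "period": {"start": "2025-05-01", "end": "2025-05-31"}},
    "owners": [{"name": "Marian Etatowy", "kind": "OWNER"}],
    "accounts": [{
      "iban": "PL26167030183854195847457003",
      "currencyBalance": 49995,
      "currencyName": "PLN",
      "moneyTransactions": [{
        "bookedOn": "2025-05-06",
        "currencyAmount": -50,
        "title": "Przelew własny",
        "analytics": {"labels": [{"name": "internal", "vendors": []}]}
      }]
    }]
  }
}
"""


def flatten_transactions(raw: str) -> list[dict]:
    """Return one flat record per transaction found in a statement response."""
    # parse_float=Decimal keeps fractional amounts exact instead of using float.
    content = json.loads(raw, parse_float=Decimal)["content"]
    rows = []
    for account in content.get("accounts", []):
        for tx in account.get("moneyTransactions", []):
            labels = tx.get("analytics", {}).get("labels", [])
            rows.append({
                "iban": account.get("iban"),
                "bookedOn": tx["bookedOn"],
                # Whole amounts arrive as JSON integers (e.g. -50), so normalize.
                "amount": Decimal(tx["currencyAmount"]),
                "title": tx["title"],
                # XML: <labels><label>...</label></labels>; JSON: analytics.labels[].name
                "labels": [label["name"] for label in labels],
            })
    return rows


rows = flatten_transactions(RESPONSE)
print(rows[0]["amount"], rows[0]["labels"])
```

Using Decimal rather than float avoids rounding surprises when amounts are summed or compared with values previously stored from the XML API.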
Download documents
| | Old API (XML) | New API |
|---|---|---|
| Endpoint | /v1/documents.xml | /v1/pdf/statements |
| Required parameters | ownerExternalId | ownerExternalId |
| Response and exception format | XML | JSON |
| Fetching the file | GET request to fileUrl | GET request to fileUrl |
| Fetching the file exception format | XML | JSON |

Example response, old API (XML):
<?xml version="1.0" encoding="utf-8"?>
<reply xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="https://developer.kontomatik.com/schema/v1/get_documents.xsd"
status="200 OK"
user="demo-test"
responseTimestamp="2026-01-08T07:58:15Z">
<documents>
<document>
<state>SUCCESSFUL</state>
<parsingDate>2026-01-08</parsingDate>
<fileUrl>https://test.api.kontomatik.com/v1/document-file?documentId=549262652</fileUrl>
</document>
<document>
<state>ERROR</state>
<parsingDate>2026-01-08</parsingDate>
<fileUrl>https://test.api.kontomatik.com/v1/document-file?documentId=549262691</fileUrl>
</document>
</documents>
</reply>
Example response, new API (JSON):
{
"requestId": "019b9c9c-64ee-791a-a89f-90ba1311db0c",
"content": {
"ownerExternalId": "1767858629",
"documents": [
{
"statementId": "549262652",
"state": "SUCCESSFUL",
"parsingDate": "2026-01-08",
"fileUrl": "https://test.api.kontomatik.com/v1/pdf/statements/549262652"
},
{
"statementId": "549262691",
"state": "ERROR",
"parsingDate": "2026-01-08",
"fileUrl": "https://test.api.kontomatik.com/v1/pdf/statements/549262691"
}
]
}
}
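In the examples above, the new response adds a statementId field (a string) and the fileUrl path changes from /v1/document-file?documentId=... to /v1/pdf/statements/{id}. A cautious integration uses fileUrl exactly as returned instead of building it. The sketch below picks the download URLs of successfully parsed documents; the function name is illustrative, and the states are only those visible in the example.

```python
import json

# Copy of the JSON example above.
RESPONSE = """
{
  "requestId": "019b9c9c-64ee-791a-a89f-90ba1311db0c",
  "content": {
    "ownerExternalId": "1767858629",
    "documents": [
      {"statementId": "549262652", "state": "SUCCESSFUL", "parsingDate": "2026-01-08",
       "fileUrl": "https://test.api.kontomatik.com/v1/pdf/statements/549262652"},
      {"statementId": "549262691", "state": "ERROR", "parsingDate": "2026-01-08",
       "fileUrl": "https://test.api.kontomatik.com/v1/pdf/statements/549262691"}
    ]
  }
}
"""


def successful_file_urls(raw: str) -> dict[str, str]:
    """Map statementId to fileUrl for documents whose state is SUCCESSFUL."""
    documents = json.loads(raw)["content"].get("documents", [])
    return {
        doc["statementId"]: doc["fileUrl"]
        for doc in documents
        if doc.get("state") == "SUCCESSFUL"
    }


urls = successful_file_urls(RESPONSE)
print(urls)
```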
Handling inconsistent layouts
Inconsistent layouts are bank statement layouts where some information necessary to perform consistency checks is missing, such as balances after transactions or sum totals of the transactions, or where the statement was filtered by the user. In the XML version you could skip this particular check by using the acceptInconsistent parameter. This parameter is no longer supported and has been replaced by a different approach in the new API.
For consistency with other verification mechanisms, layout inconsistency is now treated as a regular verification check. It is returned in the failedVerifications list as a warning (ConsistencyCheckWarning) and indicates that some consistency checks could not be performed. In the new API, to accept such documents and receive parsed data, you simply need to use the trusted mode.
Below you’ll find the key differences in handling such documents compared to the XML API.
| | Old API | New API |
|---|---|---|
| Detect inconsistent layout | InvalidStatementConsistency exception returned + specific message<br>or isConsistent = false (when acceptInconsistent = true was used) | Standard mode: FailedVerifications exception + ConsistencyCheckWarning in the failedVerifications list within the exception structure<br>Trusted mode: ConsistencyCheckWarning in the failedVerifications list within the content structure |
| Accept inconsistent layout | Use acceptInconsistent = true in the API call | 1. Use standard mode<br>2. Receive FailedVerifications + ConsistencyCheckWarning<br>3. Check if you accept other failed verifications (if there are any)<br>4. Use trusted mode for the same PDF<br><br>or<br><br>1. Use trusted mode<br>2. Receive FailedVerifications + ConsistencyCheckWarning<br>3. Check if you accept other failed verifications (if there are any)<br>4. Review the parsed data<br>5. Save the output in your systems*<br><br>In both scenarios you will be billed only once. |
*The output of the trusted parsing is saved on Kontomatik servers regardless of whether you accept or save it on your side.
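The first scenario in the table (standard mode first, then trusted mode for the same PDF) can be sketched as follows. The endpoint paths come from this guide; everything else is an assumption made for illustration: the `post` callable stands for your own HTTP client, and the "exception" key, its "name" field, and the shape of the failedVerifications entries are guesses at the response structure that must be checked against the full technical documentation.

```python
from typing import Any, Callable, Optional

STANDARD_PATH = "/v1/pdf/statement"
TRUSTED_PATH = "/v1/pdf/statement-trusted"

# Hypothetical transport: uploads the PDF to the given path and returns
# the decoded JSON body. Plug in your own HTTP client here.
Post = Callable[[str, bytes], dict[str, Any]]


def names(entries: list) -> list[str]:
    """Assumed entry shape: a plain string or an object with a "name" field."""
    return [e["name"] if isinstance(e, dict) else str(e) for e in entries]


def parse_with_fallback(post: Post, pdf: bytes,
                        tolerated: frozenset) -> Optional[dict[str, Any]]:
    """Try standard mode; retry in trusted mode only if every failed
    verification is on the tolerated list. Returns parsed content or None."""
    reply = post(STANDARD_PATH, pdf)
    if "content" in reply:
        return reply["content"]
    exception = reply.get("exception", {})  # assumed key name
    if exception.get("name") != "FailedVerifications":
        return None  # a parsing error, not a verification failure
    failed = names(exception.get("failedVerifications", []))
    if any(name not in tolerated for name in failed):
        return None  # at least one indicator we do not accept
    return post(TRUSTED_PATH, pdf).get("content")


def fake_post(path: str, pdf: bytes) -> dict[str, Any]:
    """Stand-in transport returning made-up replies for an inconsistent layout."""
    warning = [{"name": "ConsistencyCheckWarning"}]
    if path == STANDARD_PATH:
        return {"exception": {"name": "FailedVerifications",
                              "failedVerifications": warning}}
    return {"content": {"accounts": [], "failedVerifications": warning}}


content = parse_with_fallback(fake_post, b"%PDF-",
                              frozenset({"ConsistencyCheckWarning"}))
print(content is not None)
```

Starting with the standard mode keeps rejected documents out of your systems by default; the trusted call is made only after your own policy has accepted every reported indicator.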
Billing
The billing rules have not changed: you are billed only when we return the parsed data.
| PDF Characteristic | Old API (XML) | New API (Standard Mode) | New API (Trusted Mode) |
|---|---|---|---|
| Document was parsed without issues | Data returned<br>Billed | Data returned<br>Billed | Data returned<br>Billed |
| Document could be parsed but modification indicators were found | Exception returned<br>Failed verifications returned<br>Not billed | Exception returned<br>Failed verifications returned<br>Not billed | Data returned<br>Failed verifications returned<br>Billed |
| Parsing error occurred | Exception returned<br>Not billed | Exception returned<br>Not billed | Exception returned<br>Not billed |
| Parsing error occurred and modification indicators were found | Exception returned<br>Failed verifications returned<br>Not billed | Exception returned<br>Failed verifications returned<br>Not billed | Exception returned<br>Failed verifications returned<br>Not billed |
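For cost estimates or reconciliation, the table can be expressed as a small decision function. This is only a restatement of the table above; the mode labels "xml", "standard", and "trusted" are illustrative names, not API values.

```python
def is_billed(mode: str, parsing_error: bool, modification_indicators: bool) -> bool:
    """Mirror the billing table: a request is billed only when data is returned.

    mode is one of "xml", "standard", "trusted" (illustrative labels).
    """
    if mode not in ("xml", "standard", "trusted"):
        raise ValueError(f"unknown mode: {mode}")
    if parsing_error:
        return False  # an exception is returned in every mode
    if modification_indicators:
        return mode == "trusted"  # only the trusted mode still returns data
    return True  # parsed without issues: data returned in every mode


for mode in ("xml", "standard", "trusted"):
    print(mode, is_billed(mode, parsing_error=False, modification_indicators=True))
```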