Data Analysis
Summary
Kontomatik’s Data Analysis products are a suite of tools built on advanced proprietary algorithms and substantial data sources. They enable automated and accurate decision-making based on end-user transactions. This article explains how to use these tools and services effectively.
Labeling
Our proprietary algorithm can dissect each transaction and assign appropriate labels from over 60 available (e.g. monthly, salary, compensation, loan, welfare, cash withdrawal, nightlife, rent). The assignment is based on relationships that the labeling model has uncovered in historical data. That data was handcrafted by human engineers following strict rules that define each label.
Having labeled transactions gives you a clear and immediate picture of where the end-user spends their money and where they get it. To make this even more seamless, we’ve built the Insight portal that lets you see the results on the web page with a graphical interface.
Labeling is currently available in Poland, Spain, Czech Republic, and Portugal.
How to Integrate?
If you have purchased the transaction labeling feature, labels will automatically appear next to transactions imported using our Banking API. For more technical information, check out the section on Labeling in our documentation.
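Once labels arrive with the imported transactions, a simple aggregation shows where the end-user's money comes from and where it goes. The snippet below is a minimal sketch only: the field names `amount` and `labels`, the sample records, and the helper `totals_by_label` are assumptions made for illustration, not the documented Banking API schema.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical sample data. The keys "amount" and "labels" are assumptions,
# not the documented response format.
transactions = [
    {"amount": Decimal("5200.00"), "labels": ["salary", "monthly"]},
    {"amount": Decimal("-1800.00"), "labels": ["rent", "monthly"]},
    {"amount": Decimal("-300.00"), "labels": ["cash withdrawal"]},
    {"amount": Decimal("-120.50"), "labels": ["nightlife"]},
]


def totals_by_label(txs):
    """Sum signed amounts per label.

    A transaction with several labels contributes to each of them.
    """
    totals = defaultdict(Decimal)  # Decimal() starts at 0
    for tx in txs:
        for label in tx["labels"]:
            totals[label] += tx["amount"]
    return dict(totals)


totals = totals_by_label(transactions)
for label, total in sorted(totals.items()):
    print(f"{label}: {total}")
```

Because a transaction can carry more than one label, the per-label totals overlap and should not be added together to get an overall balance.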
Scoring
Having labels is a great advantage, but the end-user’s repayment score is the ultimate goal. Our machine learning algorithm has been designed to achieve exactly this. It can calculate the probability of repayment based on provided labels and suggest the decision to be made. Specifically, the algorithm returns the probability, percentile and tier (e.g. A, B, C, or D).
If you already have scoring implemented on your side, you can still use our solution: we consider our model a great addition to other scoring methods, one that supports decision-making.
Scoring is available in Poland and Spain, only for clients who also use our labeling solutions.
How to Integrate?
To get the score, call the owner-scores endpoint using the ID you have assigned to the end-user. This will return the probability, percentile, and tier (e.g. A, B, C, or D) of the end-user’s repayment score, based on the provided labels. For more technical information, check out the section on Scoring in our documentation.
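After the scoring response has been fetched and parsed, the three values can feed your own decision rules. The following is a rough sketch under stated assumptions: the payload keys `probability`, `percentile`, and `tier` mirror the values named above but are not confirmed field names, and `TIER_POLICY` is a made-up lender policy, not a recommendation from Kontomatik.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Score:
    probability: float  # probability of repayment (assumed to be on a 0-1 scale)
    percentile: float
    tier: str           # e.g. "A", "B", "C", or "D"


# Hypothetical policy: every lender would define its own mapping.
TIER_POLICY = {
    "A": "approve",
    "B": "approve",
    "C": "manual review",
    "D": "decline",
}


def parse_score(payload: dict) -> Score:
    """Build a Score from an already-decoded response body (assumed keys)."""
    return Score(
        probability=float(payload["probability"]),
        percentile=float(payload["percentile"]),
        tier=str(payload["tier"]),
    )


def suggest_decision(score: Score) -> str:
    """Map the tier to an action; unknown tiers fall back to manual review."""
    return TIER_POLICY.get(score.tier, "manual review")


# Hypothetical example payload, for illustration only.
example = {"probability": 0.91, "percentile": 88, "tier": "A"}
decision = suggest_decision(parse_score(example))
print(decision)
```

Falling back to manual review for an unrecognized tier keeps the sketch safe if the set of tiers ever differs from the four listed here.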
Data Summary
The Data Summary is a calculation tool that provides users with a comprehensive overview of their data. It offers a wide range of metrics categorized by different criteria, all within a specified date range. You can choose between two types of summary reports:
- Accounts Summary: This report provides metrics categorized per account, giving a clear breakdown of financial data associated with each account.
- Owner Summary: This summary takes information from all accounts and divides it into currencies. Additionally, each currency’s data can be further broken down by month within the specified date range.
Moreover, if you have the labeling service, you can access calculations for all transactions containing available labels or a specific set of labels. This feature increases the level of analysis of the data you are interested in.
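To make the Owner Summary idea concrete, the sketch below combines transactions from all accounts, splits them by currency, breaks each currency down by month within a date range, and optionally keeps only transactions carrying selected labels. It illustrates the concept and is not the service's actual calculation: the data layout, the field names, and the function `owner_summary` are all assumptions.

```python
from collections import defaultdict
from datetime import date
from decimal import Decimal

# Hypothetical data layout: field names are assumptions for illustration.
accounts = [
    {"currency": "PLN", "transactions": [
        {"date": date(2024, 1, 10), "amount": Decimal("5000"), "labels": ["salary"]},
        {"date": date(2024, 1, 15), "amount": Decimal("-1500"), "labels": ["rent"]},
        {"date": date(2024, 2, 10), "amount": Decimal("5000"), "labels": ["salary"]},
    ]},
    {"currency": "PLN", "transactions": [
        {"date": date(2024, 2, 20), "amount": Decimal("-200"), "labels": ["nightlife"]},
    ]},
    {"currency": "EUR", "transactions": [
        {"date": date(2024, 1, 5), "amount": Decimal("100"), "labels": []},
    ]},
]


def owner_summary(accounts, start, end, labels=None):
    """Net amount per currency and month for transactions in [start, end].

    If `labels` is given, only transactions with at least one of those
    labels are counted.
    """
    wanted = set(labels) if labels is not None else None
    summary = defaultdict(lambda: defaultdict(Decimal))
    for account in accounts:
        for tx in account["transactions"]:
            if not (start <= tx["date"] <= end):
                continue
            if wanted is not None and not (wanted & set(tx["labels"])):
                continue
            month = tx["date"].strftime("%Y-%m")
            summary[account["currency"]][month] += tx["amount"]
    return {currency: dict(months) for currency, months in summary.items()}


full = owner_summary(accounts, date(2024, 1, 1), date(2024, 2, 29))
salary_only = owner_summary(accounts, date(2024, 1, 1), date(2024, 2, 29), labels=["salary"])
```

An Accounts Summary would follow the same pattern, keyed by account instead of by currency.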
How to Integrate?
To access the Data Summary, call the endpoint for the report you need: Accounts Summary or Owner Summary. For more technical information, check out the section on Data Summary in our documentation.
Income Confirmation
Income Confirmation is a tool that provides a categorized summary of an end-user’s income sources over the past three months, utilizing bank transaction data obtained through Kontomatik.
When using the Income Confirmation product for a specific end-user, imported bank accounts are split by currencies. For each currency, a summary is generated, dividing data into three months: M-0 (current month), M-1 (previous month), and M-2 (two months ago). The summary for each month provides total incoming transaction amounts across various income categories such as salary, pension, and other sources of income.
Additionally, a summary for the last 90 days is provided, with totals divided by 3 to create a monthly average in the API response. The income categories are determined based on Transaction Labeling performed by Machine Learning and other algorithms, ensuring accurate and efficient calculations.
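The month buckets and the 90-day average described above can be sketched as follows for a single currency. This is an illustration under assumptions, not the service's implementation: the field names `date`, `amount`, and `category`, the use of calendar months for M-0 to M-2, and the exact boundaries of the 90-day window are all guesses made for the example.

```python
from datetime import date, timedelta
from decimal import Decimal


def month_offset(today: date, tx_date: date) -> int:
    """Number of calendar months between tx_date and today (0 = current month)."""
    return (today.year - tx_date.year) * 12 + (today.month - tx_date.month)


def income_by_month(txs, today):
    """Total incoming amounts per category for M-0, M-1, and M-2."""
    buckets = {"M-0": {}, "M-1": {}, "M-2": {}}
    for tx in txs:
        offset = month_offset(today, tx["date"])
        if tx["amount"] <= 0 or not 0 <= offset <= 2:
            continue  # skip outgoing transactions and older or future months
        bucket = buckets[f"M-{offset}"]
        bucket[tx["category"]] = bucket.get(tx["category"], Decimal("0")) + tx["amount"]
    return buckets


def monthly_average_90d(txs, today):
    """Incoming totals over the last 90 days, divided by 3, per category."""
    cutoff = today - timedelta(days=90)  # assumed window: cutoff < date <= today
    totals = {}
    for tx in txs:
        if tx["amount"] > 0 and cutoff < tx["date"] <= today:
            totals[tx["category"]] = totals.get(tx["category"], Decimal("0")) + tx["amount"]
    return {category: total / 3 for category, total in totals.items()}


# Hypothetical transactions in one currency.
today = date(2024, 3, 20)
txs = [
    {"date": date(2024, 3, 10), "amount": Decimal("3000"), "category": "salary"},
    {"date": date(2024, 2, 10), "amount": Decimal("3000"), "category": "salary"},
    {"date": date(2024, 1, 10), "amount": Decimal("3000"), "category": "salary"},
    {"date": date(2024, 2, 5), "amount": Decimal("600"), "category": "pension"},
    {"date": date(2024, 3, 1), "amount": Decimal("-500"), "category": "other"},
]

by_month = income_by_month(txs, today)
average = monthly_average_90d(txs, today)
```

With this sample, the pension total of 600 over 90 days becomes a monthly average of 200, even though it was received in only one of the three months.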
As with Data Summary, for billing purposes, all requests to the Income Confirmation endpoint for a single end-user and the same scope of imported data are treated as one event.
How to Integrate?
For Income Confirmation, simply call the owner-income-confirmation endpoint with the end-user’s identifier (ownerExternalId) to get a categorized summary of their income sources over the past three months. For more technical information, check out the section on Income Confirmation in our documentation.
FAQ
Scoring is a method that calculates the probability of someone’s repayment (the score) given the labels and other transaction details. Specifically, the algorithm returns the score, percentile, and tier (e.g. A, B, C, or D). The algorithm is trained on historical data, so it can recognize intricate relationships that are not easily spotted.
If you already have scoring implemented on your side, you can still use our solution: we consider our model a great addition to other scoring methods, one that supports decision-making.
Scoring is available only for clients who also use our labeling solutions.
If you have labeling enabled, you will get labels along with the transactions automatically. You can also find them in the Insight portal in the section with imports.
To get the score, you’ll need to perform a separate request after all desired user data has been imported. You can find out more in our documentation.
The scoring algorithm has been designed to help lenders make decisions about their clients given the labels and other transactional data. Although the transactions contain all the information needed for prediction, the relationship between them and the client’s default may not be easy for a human to uncover. The scoring algorithm was trained on massive amounts of data, so it is able to spot these subtle dependencies. Moreover, it can explain its result by indicating the labels that were important in its decision process.