Model Report
Last updated
The Model Report contains in-depth information about the trained Model. This can help you assess the strength of the model, and decide whether it is suitable for the required application.
To access the Report for any Model, go to the Model Builder, and click on the respective button under 'Report'.
The automatic model curation shows the statistical quality of the model, as determined by our automatic analysis. It serves as an aid for assessing whether the model is useful from a data science perspective. There are three different states for each criterion: green, yellow and red. If a criterion is red, the model should be used with caution. For further information about the metrics and their thresholds, see the tooltips. In case of doubt, please reach out to Customer Support.
The automatic model curation only checks statistical criteria. Even a completely "green" report does not guarantee a Model's commercial success.
The Model Builder's output during training is presented here in the form of logs, in a largely unfiltered manner. Users with data science experience can retrace every part of the process here. There are four different logs; depending on the model configuration, not all of them are generated.
Data Prep: This log documents the data preparation process, most notably the creation of the target variable and the processing of a customer's transaction history into distinct features. Here you can follow all the steps that took the data from the raw shape imported into the app to the finished flat file used for modeling.
Data Quality: This log shows all the created features. Descriptive statistics are calculated for each feature. Use the search box at the top of each table to find information on a particular feature.
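The kind of per-feature descriptive statistics shown in this log can be sketched with pandas (the feature names below are made up for illustration and are not the product's actual columns):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
features = pd.DataFrame({
    "recency_days": rng.integers(1, 365, 500),    # hypothetical feature
    "total_revenue": rng.gamma(2.0, 50.0, 500),   # hypothetical feature
})

# count, mean, std, min, quartiles and max for every feature in one call
stats = features.describe()
print(stats.loc["count", "recency_days"])  # 500.0
```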
Value Modelling: If value modelling was conducted, you can find detailed information here about the calculated value model, such as the results of feature selection and reduction, the importance of individual features, and the quality of the model.
Conversion Modelling: Here you can find detailed information about the calculated conversion model, such as the results of feature selection and reduction, the importance of individual features, and the quality of the model.
The ROC curve shows the overall predictive power of the Model by assessing how reliably customers can be ranked by their conversion probability. The diagonal of the dark blue triangle shows the share of all buyers one could expect to cover (y-axis) with a random sample of a given size (x-axis), while the outline of the light blue area shows the coverage achieved when taking the top x% of contacts according to their model score. In the example below, for comparable target periods in the past, the top 20% of customers sorted by model score already accounted for 80% of the buyers.
The overall quality of the model is indicated by a percentage between 0 and 100%, represented by the colored areas. Any value above 50% means that the model allows for a better-than-random sorting. A good lift is usually generated from approx. 65-70% upwards. Values between 72% and 92% are considered very good. Anything above 92% can still be desirable, but may indicate target leakage during training.
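The two ingredients of this chart can be computed from ranked predictions in a few lines. The sketch below (plain NumPy, not the product's actual code; the normalization and variable names are assumptions) builds the cumulative coverage curve and a ranking-quality score where 0.5 corresponds to random sorting and 1.0 to a perfect ranking:

```python
import numpy as np

def cumulative_gains(y_true, scores):
    """Sort customers by descending model score and return, for each share
    of contacts taken (x), the share of all buyers covered (y)."""
    order = np.argsort(-scores)
    hits = np.cumsum(y_true[order])
    x = np.arange(1, len(y_true) + 1) / len(y_true)
    return x, hits / hits[-1]

def auc(y_true, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity;
    ties are broken arbitrarily, which is fine for this illustration."""
    ranks = np.argsort(np.argsort(scores)) + 1  # 1-based ranks, ascending
    n_pos = y_true.sum()
    n_neg = len(y_true) - n_pos
    return (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

buyers = np.array([1, 0, 1, 0, 0, 1, 0, 0])
score = np.array([0.9, 0.2, 0.8, 0.3, 0.1, 0.7, 0.4, 0.2])
print(round(auc(buyers, score), 2))  # 1.0: every buyer outranks every non-buyer
```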
This plot shows how well the model is able to predict conversion probabilities. To do this, all customers are sorted according to their response predictions and divided into 10 equal groups. On the far left are the top 10% and on the far right are the bottom 10%. The quality of the model can be seen from two aspects:
The bars of the actual mean conversion rates (dark orange) should fall off steeply from left to right. A steep falloff means the model is good at sorting your customers by their likelihood of converting.
The bars showing the mean predicted conversion probabilities (light orange) should be about the same height as the bars showing the actual means. This indicates prediction quality in absolute terms, which is typically much harder to achieve and is not necessary for all use cases.
This plot shows how well the model is able to predict a value target such as revenue. To do this, all customers are sorted according to their respective predictions and divided into 10 equal groups. On the far left are the top 10% and on the far right are the bottom 10%. The quality of the model can be seen from two aspects:
The bars showing the mean predicted values (light orange) should be about the same height as the bars showing the actual means. This indicates prediction quality in absolute terms, which is typically much harder to achieve and is not necessary for all use cases.
The bars of the actual mean values (dark orange) should fall off steeply from left to right. A steep falloff means the model is good at sorting your customers by the value variable of interest.
This chart shows the Variables (features) that went into the model, ordered by their information value. The higher a variable appears in the chart, the greater the impact of its values on the individual predictions.
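One common way to compute such an ordering is the information value (IV) of each binned feature against the binary target; the sketch below is a generic illustration of that formula on simulated data and is not necessarily the exact measure used by the product:

```python
import numpy as np

def information_value(feature_bins, target, eps=1e-9):
    """IV = sum over bins of (share_of_events - share_of_nonevents)
    * ln(share_of_events / share_of_nonevents); higher means more
    discriminative with respect to the binary target."""
    iv = 0.0
    for b in np.unique(feature_bins):
        mask = feature_bins == b
        events = target[mask].sum()
        nonevents = mask.sum() - events
        pe = events / max(target.sum(), 1) + eps
        pn = nonevents / max((1 - target).sum(), 1) + eps
        iv += (pe - pn) * np.log(pe / pn)
    return iv

rng = np.random.default_rng(1)
target = rng.integers(0, 2, 1000)
informative = target + rng.integers(0, 2, 1000)  # correlated with the target
noise = rng.integers(0, 3, 1000)                 # unrelated to the target
ivs = {"informative": information_value(informative, target),
       "noise": information_value(noise, target)}
ranked = sorted(ivs, key=ivs.get, reverse=True)
print(ranked[0])  # the informative feature should come out on top
```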