Feature Engineering

In Machine Learning, a feature is an individual measurable property that serves as an input to the Model. A feature can be a field of the Customer table, such as "customer_since", or a processed or combined form of several fields of the data tables. The process of taking available data fields and transforming them into useful Features for the Model is known as Feature Selection or Feature Engineering.

CrossEngage automatically retrieves data from the Customer, Transaction and Activity tables, and processes them into useful Features, which are then fed into the Model.

The importance of a specific feature depends on the data, so it can differ from one dataset to another even when both share the same layout. In every case, the most important features are selected automatically.

Basic Features

Customers

All uploaded fields (columns in tables) in the Customers Table are prepared. However, only the additional Features that are explicitly mentioned in the Model Builder go into the model, in order to ensure reliability of the Model.

Transactions

The transaction data is aggregated to calculate a summary of transactions for each customer. This can then be combined with Customer data to train the Model.

  • Recency: The time period (in weeks) between t0 (the time of modelling) and the most recent transaction

  • Duration: The time period (in weeks) between t0 (the time of modelling) and the oldest transaction

  • Frequency: Number of different invoice_ids

  • Item_count: Number of invoice items (number of records)

  • Quantity: Sum of the Quantity column

  • Revenue: Total revenue (sum of price * quantity)

  • Value: Sum of sales according to the target definition specified in the Model Builder

  • Any: Boolean indicator of whether any transactions exist
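The aggregation described above can be sketched in a few lines of Python. This is only an illustration, not CrossEngage's implementation: the function name `transaction_features`, the row layout (dicts with `invoice_id`, `date`, `price`, `quantity`) and the use of fractional weeks are assumptions. Value is left out because it depends on the target definition set in the Model Builder.

```python
from datetime import date


def transaction_features(rows, t0):
    """Aggregate one customer's transaction rows into basic features.

    rows: list of dicts with the (assumed) keys invoice_id, date, price, quantity.
    t0:   the time of modelling; only rows dated on or before t0 are used.
    """
    history = [r for r in rows if r["date"] <= t0]
    if not history:
        return {"any": False}
    dates = [r["date"] for r in history]
    return {
        # Weeks between t0 and the most recent / oldest transaction
        # (fractional weeks are an assumption of this sketch).
        "recency": (t0 - max(dates)).days / 7,
        "duration": (t0 - min(dates)).days / 7,
        # Number of different invoice_ids.
        "frequency": len({r["invoice_id"] for r in history}),
        # Number of invoice items (records).
        "item_count": len(history),
        "quantity": sum(r["quantity"] for r in history),
        "revenue": sum(r["price"] * r["quantity"] for r in history),
        "any": True,
    }


rows = [
    {"invoice_id": "A1", "date": date(2024, 1, 1), "price": 10.0, "quantity": 2},
    {"invoice_id": "A1", "date": date(2024, 1, 1), "price": 5.0, "quantity": 1},
    {"invoice_id": "B7", "date": date(2024, 2, 12), "price": 20.0, "quantity": 1},
]
features = transaction_features(rows, t0=date(2024, 2, 26))
```

In this example the customer has two invoices with three items in total, so Frequency is 2 while Item_count is 3.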

For calculating Value for a customer:

  • Positive order_types are counted positively → price = abs(price)

  • Negative order_types are counted negatively → price = -abs(price)

  • Neutral order_types are included without modification → price = price

  • Additionally, only transactions that match the other target definitions that have been set are included, e.g. filters for certain product groups or order channels
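As a rough sketch of these sign rules, the following Python snippet computes a Value for one customer. The function names, the row layout, the order type "exchange" and the multiplication of the signed price by quantity are all assumptions made for illustration; the actual target definition is configured in the Model Builder.

```python
def signed_price(price, order_type, positive_types, negative_types):
    """Apply the sign rule for the given order_type to a price."""
    if order_type in positive_types:
        return abs(price)
    if order_type in negative_types:
        return -abs(price)
    # Neutral order types are used without modification.
    return price


def customer_value(rows, positive_types, negative_types, keep=lambda r: True):
    """Sum signed sales over the rows that pass the extra filter `keep`.

    Multiplying by quantity is an assumption of this sketch.
    """
    return sum(
        signed_price(r["price"], r["order_type"], positive_types, negative_types)
        * r["quantity"]
        for r in rows
        if keep(r)
    )


rows = [
    {"order_type": "sale", "price": 50.0, "quantity": 1, "productgroup_id": "Shoes"},
    {"order_type": "return", "price": 20.0, "quantity": 1, "productgroup_id": "Shoes"},
    {"order_type": "exchange", "price": -5.0, "quantity": 1, "productgroup_id": "Bags"},
]

# All rows: 50 - 20 - 5
total = customer_value(rows, {"sale"}, {"return"})

# Same rule, restricted to one product group: 50 - 20
shoes_only = customer_value(
    rows, {"sale"}, {"return"},
    keep=lambda r: r["productgroup_id"] == "Shoes",
)
```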

Activities

Similar to transaction data, activity data is aggregated to calculate a summary of activities for each customer.

  • Recency: The time period (in weeks) between the most recent uploaded data and the most recent activity

  • Count: Number of activity records

  • Stock: Time-weighted sum of activity records. It can be calculated with the formula:

$$Stock_{c,interval} = \sum_{n=0}^{N} 0.8^{t_n}$$

where N is the number of interactions of Customer c in the interval, and t_n is the number of days between the start date of the interval and activity n

  • Any: Boolean indicator whether activities exist
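A minimal Python sketch of the Stock feature, assuming the activities are given as a list of dates that all fall on or after the interval's start date. The function name `stock` and its parameters are hypothetical; the sketch adds one term per activity record, with the decay base 0.8 taken from the Stock formula above.

```python
from datetime import date


def stock(activity_dates, interval_start, decay=0.8):
    """Time-weighted sum of activities: one term decay ** t_n per activity,
    where t_n is the number of days between interval_start and the activity.

    Assumes every date is on or after interval_start.
    """
    return sum(decay ** (d - interval_start).days for d in activity_dates)


activities = [date(2024, 3, 1), date(2024, 3, 2), date(2024, 3, 4)]
value = stock(activities, interval_start=date(2024, 3, 1))
# t_n = 0, 1, 3  ->  0.8**0 + 0.8**1 + 0.8**3 = 1 + 0.8 + 0.512
```

Count for the same customer would simply be `len(activities)`, so Stock can be read as a Count in which each record is discounted by its distance in days from the interval's start date.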

Advanced Features

The Features of transactions and activities can also be restricted in two dimensions: intervals (e.g. "within the last 30 days") and categories (e.g. "from product group A"). It is also possible to combine these restrictions, e.g. "within the last 30 days from product group A".

This means that additional features can be created by calculating the basic features under these restrictions. CrossEngage's automatic data preparation already provides such extended basic features for a certain set of categorical properties (provided that the information is available in the data).
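To illustrate how the two restrictions combine, here is a small Python sketch that filters rows before any basic feature is calculated. The helper `restrict`, its parameters and the row layout are hypothetical, and the exact interval boundaries used by the platform are not documented here, so the comparison operators are an assumption.

```python
from datetime import date, timedelta


def restrict(rows, t0, last_days=None, category=None):
    """Keep rows up to t0, optionally limited to the last `last_days` days
    and/or to rows where a category field equals a given element.

    category: optional (field, element) tuple, e.g. ("productgroup_id", "A").
    """
    selected = [r for r in rows if r["date"] <= t0]
    if last_days is not None:
        start = t0 - timedelta(days=last_days)
        selected = [r for r in selected if r["date"] > start]
    if category is not None:
        field, element = category
        selected = [r for r in selected if r.get(field) == element]
    return selected


t0 = date(2024, 3, 31)
rows = [
    {"date": date(2024, 3, 20), "productgroup_id": "A"},
    {"date": date(2024, 3, 25), "productgroup_id": "B"},
    {"date": date(2024, 1, 10), "productgroup_id": "A"},
]

last_30_days = restrict(rows, t0, last_days=30)
group_a = restrict(rows, t0, category=("productgroup_id", "A"))
both = restrict(rows, t0, last_days=30, category=("productgroup_id", "A"))
```

Any basic feature (for example a count or a revenue sum) calculated on `both` then corresponds to "within the last 30 days from product group A".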

By category

The following additional Features can be calculated from the Transactions data.

  • The top 8 Categories of the field: productgroup_id

  • The top 5 Categories of the field: order_type

  • The top 8 Categories of the field: order_channel

  • The top 3 Categories of the field: mapped_order_type

The following additional Features can be calculated from the Activity data.

  • The top 5 Categories of the field: wtr_type

  • The top 5 Categories of the field: activity_type
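The idea of "top categories" can be sketched as follows. How the platform ranks categories is not stated on this page; ranking by number of records is an assumption of this sketch, as are the function names and the row layout.

```python
from collections import Counter


def top_categories(rows, field, k):
    """Return the k most frequent values of `field` (ranked by record count,
    which is an assumption of this sketch)."""
    counts = Counter(r[field] for r in rows if r.get(field) is not None)
    return [value for value, _ in counts.most_common(k)]


def revenue_by_top_category(rows, field, k):
    """Calculate the Revenue feature once per top category of `field`."""
    return {
        category: sum(
            r["price"] * r["quantity"] for r in rows if r.get(field) == category
        )
        for category in top_categories(rows, field, k)
    }


raw = [
    ("Shoes", 10.0, 1), ("Bags", 5.0, 2), ("Shoes", 20.0, 1),
    ("Hats", 7.0, 1), ("Shoes", 30.0, 1), ("Bags", 5.0, 1),
]
rows = [{"productgroup_id": g, "price": p, "quantity": q} for g, p, q in raw]

top = top_categories(rows, "productgroup_id", k=2)
revenue = revenue_by_top_category(rows, "productgroup_id", k=2)
```

With k=2, only "Shoes" and "Bags" produce category features in this example; the less frequent "Hats" does not.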

Understanding Features of a Model

Choose the model you wish to see the features for. Click on Model Report, and scroll down to the section: "Weight of variables for Conversion / Value model".

The "weight" of a feature is its relative importance in making the prediction.

Here you can see a list of features for this model. In this example, the top 6 features combined provide roughly two thirds of the predictive power of the model.

Naming Convention

Feature names follow this scheme:

[SPECIAL] (FEATURE)_IN_(INTERVAL) [AND_(CATEGORY)_IS_(ELEMENT)]

Here, square brackets denote optional parts of the name, while round brackets contain placeholders.

  • Special: If this contains the word FILTER, this means that the entire history of the field up to t0 is used.

  • Feature: This is the name of the field the feature is created from, e.g. transaction_recency, revenue.

  • Interval: The time interval filter, with respect to t0. You can find more about t0 below.

  • Category: Category filter, e.g. order_type or productgroup_id.

  • Element: Value of the category, e.g. sale, return or Shoes.
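When reading a Model Report, it can help to split a feature name back into these parts. The Python sketch below does this with a regular expression. It rests on several assumptions: the optional parts are separated from the core name by a space or an underscore, FILTER is the only special prefix recognised, and the example names and interval values ("30d", "all") are hypothetical rather than taken from a real report.

```python
import re

# [SPECIAL] (FEATURE)_IN_(INTERVAL) [AND_(CATEGORY)_IS_(ELEMENT)]
FEATURE_NAME = re.compile(
    r"^(?:(?P<special>FILTER)[ _])?"
    r"(?P<feature>.+?)_IN_(?P<interval>.+?)"
    r"(?:[ _]AND_(?P<category>.+?)_IS_(?P<element>.+))?$"
)


def parse_feature_name(name):
    """Split a feature name into its parts; missing optional parts are None.

    Returns None if the name does not follow the scheme.
    """
    match = FEATURE_NAME.match(name)
    return match.groupdict() if match else None


# Hypothetical names, for illustration only.
with_category = parse_feature_name("revenue_IN_30d AND_order_type_IS_sale")
with_special = parse_feature_name("FILTER transaction_recency_IN_all")
```

The first call yields the feature `revenue`, the interval `30d`, the category `order_type` and the element `sale`; the second yields the special part `FILTER` with no category restriction.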

Time with respect to t0

The data preparation is based on points in time (t0) set in the past. Customer behavior before these points is used to draw conclusions about behavior (according to the target variable) after the respective point in time.

Generally, you can understand t0 as the time of the last transaction in the dataset. It is the point in time where the available data ends, and after which the behavior of the User is predicted.
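The role of t0 can be pictured as a cut through each customer's history: records before the cut feed the features, records after it define the target. The following Python sketch shows that split; the function name, the row layout and the choice to count records dated exactly on t0 as "before" are assumptions.

```python
from datetime import date


def split_at_t0(rows, t0):
    """Split rows into the history used for features (up to and including t0)
    and the later records used to derive the target variable."""
    before = [r for r in rows if r["date"] <= t0]
    after = [r for r in rows if r["date"] > t0]
    return before, after


rows = [
    {"date": date(2023, 11, 5), "revenue": 40.0},
    {"date": date(2023, 12, 20), "revenue": 15.0},
    {"date": date(2024, 1, 8), "revenue": 60.0},
]
history, future = split_at_t0(rows, t0=date(2023, 12, 31))

# Features describe behaviour before t0, the target describes behaviour after it.
revenue_before_t0 = sum(r["revenue"] for r in history)
bought_after_t0 = len(future) > 0
```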


Last updated 1 year ago
