# Data Tables in the Predictions Platform

CrossEngage's Customer Prediction Platform uses 3 types of Data:

* Customer Data
* Transaction Data
* Activity Data (Optional)

### Customer Data

The Customer Table contains details of your Customers. This table has two mandatory fields: `customer_id` (string) and `customer_since` (date). You can also add additional data present in your Customer records, which can help improve accuracy of your Models.

{% hint style="warning" %}
The `customer_id` field is the primary key of this table. Each data record must have a Unique value for this field.
{% endhint %}

<table><thead><tr><th width="187">Field Name</th><th width="118">Data Type</th><th width="124">Example</th><th>Description</th></tr></thead><tbody><tr><td>*customer_id</td><td>String</td><td>e131498</td><td>This is the key field, used for Identification of Customers. All values in this field must be Unique.</td></tr><tr><td>*customer_since</td><td>Date</td><td>2010-12-21</td><td>Date of first contact with the Customer. This field is vital for future predictions, and every customer should have a value for this field.</td></tr><tr><td>customer_gender</td><td>String</td><td>female<br>Ms.<br>LLC</td><td>Gender or Title of the Customer; Used as Categories for Pattern Recognition.</td></tr><tr><td>customer_age</td><td>Date</td><td>1997-06-22</td><td>Date of birth of Customer / Date of foundation of Customer company. Missing values are acceptable.</td></tr><tr><td>zipcode</td><td>String</td><td>50676</td><td>ZipCode of the Customer.</td></tr><tr><td>customer_country</td><td>String</td><td><p>France</p><p>DE</p></td><td>Country where the Customer is based; Used as Categories for Pattern Recognition.</td></tr><tr><td>email_ending</td><td>String</td><td>gmail.com<br>live.com</td><td>Customer's email address provider. If email is mapped to this field, the part before @ is automatically removed; Used as Categories for Pattern Recognition</td></tr><tr><td>phone</td><td>String</td><td>0176<br>0153</td><td>The starting digits of the Customer's phone number; Used as Categories for Pattern Recognition</td></tr></tbody></table>

### Transaction Data

The Transaction Table contains details of Transactions made by Customers, such as Purchases and Returns. The Transaction table has 7 Mandatory fields, as this data is required for understanding the long-term Value of every Customer for better predictions.

{% hint style="warning" %}
The primary key for this table is the combination of 4 fields: `customer_id`, `invoice_id`, `transaction_timestamp` and `item_number`. In other words, for every transaction made by a customer at a given time, each record should contain a Unique item.&#x20;
{% endhint %}

<table><thead><tr><th width="183">Field Name</th><th width="92">Data Type</th><th width="119">Example</th><th>Description</th></tr></thead><tbody><tr><td>*customer_id</td><td>string</td><td>e131498</td><td>The Customer ID; Unique identifier used for merging all tables.</td></tr><tr><td>*transaction_ timestamp</td><td>date</td><td>2013-01-17</td><td>Date and (optionally) Time of the Transaction</td></tr><tr><td>*invoice_id</td><td>string</td><td>i_1021</td><td>Unique ID of every Order; Used to track different items in the same Order, or to match Orders with Returns</td></tr><tr><td>*item_number</td><td>string</td><td>3</td><td>Item number of each item in an Order. Must not be repeated within a single Order/Invoice</td></tr><tr><td>*price</td><td>decimal</td><td>29.99<br>179</td><td>Unit Price per item. Can be Net / Gross / Discounted Price etc.</td></tr><tr><td>*order_type</td><td>string</td><td>sale<br>return<br>neutral</td><td>The Order Type ( Sale / Return / Neutral ) of each Order. By default, Sale is Positive, Return is Negative while Neutral is ignored.</td></tr><tr><td>*quantity</td><td>decimal</td><td>3.5<br>-1</td><td>Can be number of items ordered, or weight/volume. This is multiplied by price to get total value of Ordered Item.</td></tr><tr><td>gross_margin</td><td>decimal</td><td>17.04<br>29</td><td>Gross profit margin per unit. Can be used to select a Margin-based view instead of a price-based view.</td></tr><tr><td>order_channel</td><td>string</td><td><p>internet</p><p>telephone</p><p>mail</p><p>e-mail</p><p>fax</p><p>in_person</p></td><td>Information about how the order was placed. The 6 values on the left are accepted by default. For passing more values, please <a href="../../../data-and-engagement-platform/help-and-support/reach-out-to-customer-support">Reach out to Customer Support</a>.</td></tr><tr><td>productgroup_id</td><td>string</td><td>clothing<br>furniture</td><td>Category to which product belongs to. Ideally, this should not have more than 20 Unique entries.</td></tr><tr><td>productgroup_id2</td><td>string</td><td>shirts<br>shoes</td><td>A finer product group can be passed in this field to create finer NBO models in expert mode.</td></tr><tr><td>productgroup_id3</td><td>string</td><td>t-shirts<br>sneakers</td><td>In this field an even finer product group can be passed to create finer NBO models in expert mode.</td></tr><tr><td>product_id</td><td>string</td><td>p063</td><td>This field is only used if referenced in expert mode. May then be used to compute product specific models.</td></tr><tr><td>return_reason</td><td>string</td><td>NULL<br>wrong size</td><td>The reason for return or cancellation.</td></tr><tr><td>voucher</td><td>string</td><td>8WBA2TV</td><td>The presence or type of a voucher can be stored here; Used as Categories for Pattern Recognition.</td></tr><tr><td>size</td><td>string</td><td>XL<br>12L</td><td>This field can be used for Prediction Models; It also serves as a feature to analyze frequency distribution of different sizes.</td></tr><tr><td>product_supplier</td><td>string</td><td>IKEA<br>S123</td><td>This field can be used for Prediction Models; It also serves as a feature to analyze frequency distribution of different Suppliers.</td></tr></tbody></table>

### Activity Data

To improve model and forecasting quality, you can upload additional tables that contain specific activities and interactions with your customers. These tables must each contain at least a unique customer ID, a timestamp, and an activity type. Additionally, an activity\_id can be passed that links the activity to a specific action. However, this data is not required for an initial well working model and is therefore optional.

You can upload one or more of these tables:

* Online activities (customer in login area on your website / click in email)
* Inbound activities (customer calls call center, letters and emails from customer)
* Outbound activities (print mailings, catalogs, calls from a call center)
* Payment activities (customer pays an invoice, receives a dunning level, is referred to collections)

{% hint style="info" %}
In order to make use of these activities in the modelling process, it is important that they are stored in the so-called event format (just like transactions). This means that each row in an activity table describes exactly one 'event' at a time with a customer.
{% endhint %}

<table><thead><tr><th width="162">Field Name</th><th width="125">Data Type</th><th>Example</th><th>Description</th></tr></thead><tbody><tr><td>*customer_id</td><td>string</td><td>e131498</td><td>The Customer ID; Unique identifier used for merging all tables.</td></tr><tr><td>*activity_timestamp</td><td>date</td><td>2015-01-13</td><td>Date and (optionally) Time of the Transaction</td></tr><tr><td>*activity_type</td><td>string</td><td>CartAdd<br>CartCancel</td><td>Used to generate patterns for the most common types.</td></tr><tr><td>activity_subtype</td><td>string</td><td>Trousers Black<br>4RTR3022DS</td><td>Can be used to generate additional patterns.</td></tr></tbody></table>
