# Data Tables in the Predictions Platform

CrossEngage's Customer Prediction Platform uses 3 types of Data:

* Customer Data
* Transaction Data
* Activity Data (Optional)

### Customer Data

The Customer Table contains details of your Customers. This table has two mandatory fields: `customer_id` (string) and `customer_since` (date). You can also add additional data present in your Customer records, which can help improve accuracy of your Models.

{% hint style="warning" %}
The `customer_id` field is the primary key of this table. Each data record must have a Unique value for this field.
{% endhint %}

<table><thead><tr><th width="187">Field Name</th><th width="118">Data Type</th><th width="124">Example</th><th>Description</th></tr></thead><tbody><tr><td>*customer_id</td><td>String</td><td>e131498</td><td>This is the key field, used for Identification of Customers. All values in this field must be Unique.</td></tr><tr><td>*customer_since</td><td>Date</td><td>2010-12-21</td><td>Date of first contact with the Customer. This field is vital for future predictions, and every customer should have a value for this field.</td></tr><tr><td>customer_gender</td><td>String</td><td>female<br>Ms.<br>LLC</td><td>Gender or Title of the Customer; Used as Categories for Pattern Recognition.</td></tr><tr><td>customer_age</td><td>Date</td><td>1997-06-22</td><td>Date of birth of Customer / Date of foundation of Customer company. Missing values are acceptable.</td></tr><tr><td>zipcode</td><td>String</td><td>50676</td><td>ZipCode of the Customer.</td></tr><tr><td>customer_country</td><td>String</td><td><p>France</p><p>DE</p></td><td>Country where the Customer is based; Used as Categories for Pattern Recognition.</td></tr><tr><td>email_ending</td><td>String</td><td>gmail.com<br>live.com</td><td>Customer's email address provider. If email is mapped to this field, the part before @ is automatically removed; Used as Categories for Pattern Recognition</td></tr><tr><td>phone</td><td>String</td><td>0176<br>0153</td><td>The starting digits of the Customer's phone number; Used as Categories for Pattern Recognition</td></tr></tbody></table>

### Transaction Data

The Transaction Table contains details of Transactions made by Customers, such as Purchases and Returns. The Transaction table has 7 Mandatory fields, as this data is required for understanding the long-term Value of every Customer for better predictions.

{% hint style="warning" %}
The primary key for this table is the combination of 4 fields: `customer_id`, `invoice_id`, `transaction_timestamp` and `item_number`. In other words, for every transaction made by a customer at a given time, each record should contain a Unique item.&#x20;
{% endhint %}

<table><thead><tr><th width="183">Field Name</th><th width="92">Data Type</th><th width="119">Example</th><th>Description</th></tr></thead><tbody><tr><td>*customer_id</td><td>string</td><td>e131498</td><td>The Customer ID; Unique identifier used for merging all tables.</td></tr><tr><td>*transaction_ timestamp</td><td>date</td><td>2013-01-17</td><td>Date and (optionally) Time of the Transaction</td></tr><tr><td>*invoice_id</td><td>string</td><td>i_1021</td><td>Unique ID of every Order; Used to track different items in the same Order, or to match Orders with Returns</td></tr><tr><td>*item_number</td><td>string</td><td>3</td><td>Item number of each item in an Order. Must not be repeated within a single Order/Invoice</td></tr><tr><td>*price</td><td>decimal</td><td>29.99<br>179</td><td>Unit Price per item. Can be Net / Gross / Discounted Price etc.</td></tr><tr><td>*order_type</td><td>string</td><td>sale<br>return<br>neutral</td><td>The Order Type ( Sale / Return / Neutral ) of each Order. By default, Sale is Positive, Return is Negative while Neutral is ignored.</td></tr><tr><td>*quantity</td><td>decimal</td><td>3.5<br>-1</td><td>Can be number of items ordered, or weight/volume. This is multiplied by price to get total value of Ordered Item.</td></tr><tr><td>gross_margin</td><td>decimal</td><td>17.04<br>29</td><td>Gross profit margin per unit. Can be used to select a Margin-based view instead of a price-based view.</td></tr><tr><td>order_channel</td><td>string</td><td><p>internet</p><p>telephone</p><p>mail</p><p>e-mail</p><p>fax</p><p>in_person</p></td><td>Information about how the order was placed. The 6 values on the left are accepted by default. For passing more values, please <a href="/pages/vsqvoTMRDpXo7FaQqwgM">Reach out to Customer Support</a>.</td></tr><tr><td>productgroup_id</td><td>string</td><td>clothing<br>furniture</td><td>Category to which product belongs to. Ideally, this should not have more than 20 Unique entries.</td></tr><tr><td>productgroup_id2</td><td>string</td><td>shirts<br>shoes</td><td>A finer product group can be passed in this field to create finer NBO models in expert mode.</td></tr><tr><td>productgroup_id3</td><td>string</td><td>t-shirts<br>sneakers</td><td>In this field an even finer product group can be passed to create finer NBO models in expert mode.</td></tr><tr><td>product_id</td><td>string</td><td>p063</td><td>This field is only used if referenced in expert mode. May then be used to compute product specific models.</td></tr><tr><td>return_reason</td><td>string</td><td>NULL<br>wrong size</td><td>The reason for return or cancellation.</td></tr><tr><td>voucher</td><td>string</td><td>8WBA2TV</td><td>The presence or type of a voucher can be stored here; Used as Categories for Pattern Recognition.</td></tr><tr><td>size</td><td>string</td><td>XL<br>12L</td><td>This field can be used for Prediction Models; It also serves as a feature to analyze frequency distribution of different sizes.</td></tr><tr><td>product_supplier</td><td>string</td><td>IKEA<br>S123</td><td>This field can be used for Prediction Models; It also serves as a feature to analyze frequency distribution of different Suppliers.</td></tr></tbody></table>

### Activity Data

To improve model and forecasting quality, you can upload additional tables that contain specific activities and interactions with your customers. These tables must each contain at least a unique customer ID, a timestamp, and an activity type. Additionally, an activity\_id can be passed that links the activity to a specific action. However, this data is not required for an initial well working model and is therefore optional.

You can upload one or more of these tables:

* Online activities (customer in login area on your website / click in email)
* Inbound activities (customer calls call center, letters and emails from customer)
* Outbound activities (print mailings, catalogs, calls from a call center)
* Payment activities (customer pays an invoice, receives a dunning level, is referred to collections)

{% hint style="info" %}
In order to make use of these activities in the modelling process, it is important that they are stored in the so-called event format (just like transactions). This means that each row in an activity table describes exactly one 'event' at a time with a customer.
{% endhint %}

<table><thead><tr><th width="162">Field Name</th><th width="125">Data Type</th><th>Example</th><th>Description</th></tr></thead><tbody><tr><td>*customer_id</td><td>string</td><td>e131498</td><td>The Customer ID; Unique identifier used for merging all tables.</td></tr><tr><td>*activity_timestamp</td><td>date</td><td>2015-01-13</td><td>Date and (optionally) Time of the Transaction</td></tr><tr><td>*activity_type</td><td>string</td><td>CartAdd<br>CartCancel</td><td>Used to generate patterns for the most common types.</td></tr><tr><td>activity_subtype</td><td>string</td><td>Trousers Black<br>4RTR3022DS</td><td>Can be used to generate additional patterns.</td></tr></tbody></table>


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://documentation.crossengage.io/predictions-platform/overview/data-tab/data-tables-in-the-predictions-platform.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
