How to create a new Dataset

A Dataset is a combination of several data sources to which you can apply data preparation rules and scoring before syncing the result to other tools.

For example, you could create datasets:

Datasets can be built from your raw data sources, or from another dataset.

You can create a dataset in no code or in SQL.

Let’s focus here on “No code” dataset creation.

1. Create the dataset from the first source

Step 1 - Create a dataset

In the “Dataset” section of the left-hand menu, click “Build Dataset” at the top right.

You will be asked to choose between “No code Builder” and “SQL Builder”, and to name your dataset.

Step 2 - Choose the first source

The first step is to choose the data source of your dataset.

You can choose a table from one of the Sources defined in the “Connectors” menu. It is also possible to choose an existing dataset as a source.

<aside> 💡 For files on your FTP server, you can select several files at once by applying a regular expression to the file name.
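As a rough illustration of selecting several files with one regular expression, the sketch below filters a list of file names in Python. The file names and the pattern are hypothetical, and the exact regex flavor and anchoring rules used by Octolis are not described here, so treat this as an assumption to check against your own files.

```python
import re

# Hypothetical file names as they might appear in an FTP folder.
filenames = [
    "orders_2023-01-01.csv",
    "orders_2023-01-02.csv",
    "customers_2023-01-01.csv",
    "orders_backup.txt",
]

# Matches daily order exports such as orders_2023-01-01.csv.
pattern = re.compile(r"^orders_\d{4}-\d{2}-\d{2}\.csv$")

# Keep only the names that match the whole pattern.
selected = [name for name in filenames if pattern.match(name)]
print(selected)
```

Anchoring the pattern with `^` and `$` avoids picking up unrelated files that merely contain the same prefix.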


Step 3 - Describe the first source

You are invited to:

<aside> 💡 For real-time use cases, datasets can also be fed by API or webhooks.


Step 4 - Define the fields to import

The objective here is to choose the fields from your data source that will be imported into your dataset. You can choose all fields or select only some of them.

Octolis automatically detects most data “Types”, but the detection can be wrong. Please take the time to check that each column has the right data “Type”, because a wrong “Type” may cause issues when a data preparation recipe is applied to that column.
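To see why a wrongly detected type matters, consider a postal code column that looks numeric. The sketch below uses a deliberately naive, hypothetical `infer_type` helper (it does not represent how Octolis detects types) to show how treating such a column as an integer drops the leading zero.

```python
def infer_type(values):
    """Naive inference: call a column 'integer' if every value parses as int."""
    try:
        for value in values:
            int(value)
        return "integer"
    except ValueError:
        return "string"

postal_codes = ["01230", "75001", "69002"]

# Every value parses as an integer, so the naive check picks the wrong type.
detected = infer_type(postal_codes)

# Converting to int loses the leading zero of the first postal code.
as_numbers = [int(value) for value in postal_codes]
print(detected, as_numbers)
```

Here the column should stay a string, which is the kind of correction the review step is meant for.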

In the “Advanced settings”, you can choose whether Octolis imports the whole data source each time or only the records updated since the last import. For performance reasons, the default and recommended option is to import only the updated records. This requires a reliable “Updated at” column in your source.
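The idea behind importing only updated records can be sketched as a filter on the “Updated at” column. The function name, field name, and sample records below are hypothetical, and the comparison rule (strictly later than the previous import) is an assumption rather than documented Octolis behavior.

```python
from datetime import datetime

def select_updated(records, last_import_at, updated_field="updated_at"):
    """Return only the records modified strictly after the previous import."""
    return [r for r in records if r[updated_field] > last_import_at]

# Hypothetical source records with an "updated_at" timestamp.
records = [
    {"id": 1, "updated_at": datetime(2023, 1, 1, 9, 0)},
    {"id": 2, "updated_at": datetime(2023, 1, 3, 14, 30)},
    {"id": 3, "updated_at": datetime(2023, 1, 5, 8, 15)},
]

last_import_at = datetime(2023, 1, 2)
to_import = select_updated(records, last_import_at)
print([r["id"] for r in to_import])
```

If the timestamp is missing or not refreshed when a record changes, that record is silently skipped, which is why the column needs to be reliable.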

Step 5 - Map the fields with your dataset + Define the dedupe key

By default, your dataset columns have the same names as the columns of the data source. You can rename them in this step.
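Conceptually, the mapping is a lookup from source column names to dataset column names, with unmapped columns keeping their original name. The column names and the `apply_mapping` helper below are hypothetical and only illustrate that default.

```python
# Hypothetical mapping from source column names to dataset column names.
field_mapping = {
    "EMAIL_ADDR": "email",
    "FNAME": "firstname",
    "LNAME": "lastname",
}

def apply_mapping(row, mapping):
    """Rename the keys of a row; columns without a mapping keep their source name."""
    return {mapping.get(column, column): value for column, value in row.items()}

source_row = {
    "EMAIL_ADDR": "ada@example.com",
    "FNAME": "Ada",
    "LNAME": "Lovelace",
    "city": "Paris",
}

dataset_row = apply_mapping(source_row, field_mapping)
print(dataset_row)
```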

In the “Advanced settings” menu, you can define the “Dedupe” key. It can be the main ID of your data source records, an “Email” column, or a combination such as “Firstname” x “Lastname” x “Postal code”.

When two imported records have the same “Dedupe key”, they are merged into a single record on import.
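A minimal sketch of deduplication on a composite key is shown below. The helper names and sample records are hypothetical, and both the key normalization (trimmed, case-insensitive) and the merge policy (the first non-empty value wins for each field) are assumptions chosen for illustration, not a description of how Octolis resolves conflicts.

```python
def dedupe_key(record, fields):
    """Build a normalized composite key from the given fields."""
    return tuple(str(record.get(f, "")).strip().lower() for f in fields)

def merge_records(records, key_fields):
    """Merge records sharing the same dedupe key into one record per key."""
    merged = {}
    for record in records:
        key = dedupe_key(record, key_fields)
        target = merged.setdefault(key, {})
        for field, value in record.items():
            # Assumed policy: keep the first non-empty value seen for each field.
            if value not in (None, "") and field not in target:
                target[field] = value
    return list(merged.values())

records = [
    {"firstname": "Ada", "lastname": "Lovelace", "postal_code": "75001", "email": ""},
    {"firstname": "ADA", "lastname": "lovelace", "postal_code": "75001", "email": "ada@example.com"},
    {"firstname": "Alan", "lastname": "Turing", "postal_code": "69002", "email": "alan@example.com"},
]

result = merge_records(records, ["firstname", "lastname", "postal_code"])
print(result)
```

The first two records share the same normalized key, so they collapse into one record that takes its email from the second.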

👉 More info on dedupe

Next step

👉 Join or merge your Audience with another Source
