This document describes the ideal formats for data that you ship to Webtrends Optimize from third-party platforms such as Snowflake, Databricks and BigQuery.

Allowed formats

We accept files in all major formats: CSV, JSON, Parquet, Avro, XML.

To keep the overhead as small as possible, both on your side and on the WTO platform, we recommend CSV over JSON or XML, as it is far more concise.
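As a rough illustration of why CSV tends to be smaller, the sketch below serialises the same rows both ways and compares the byte counts. The sample rows are hypothetical and only mirror the column names used later in this document; real savings depend on your data.

```python
import csv
import io
import json

# Hypothetical sample rows, mirroring column names used later in this document.
rows = [
    {"user_id": "100054391.1700666161", "is_vip": 1, "top_category": "jeans"},
    {"user_id": "200597573.1698063281", "is_vip": 1, "top_category": "dresses"},
    {"user_id": "700101943.1702233191", "is_vip": 0, "top_category": "jeans"},
]


def as_csv(records):
    """Serialise records to CSV text: column names appear once, in the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def as_json(records):
    """Serialise records to a JSON array: every record repeats its keys."""
    return json.dumps(records)


csv_bytes = len(as_csv(rows).encode("utf-8"))
json_bytes = len(as_json(rows).encode("utf-8"))
print(f"CSV: {csv_bytes} bytes, JSON: {json_bytes} bytes")
```

The difference comes from JSON repeating every column name on every record, whereas CSV states them once in the header row.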

Regardless, we are able to work with any of those five file formats. If you need a format not listed here, please reach out to support and we may be able to accommodate it.

Format Option 1: One large file

If you wish to compile all segment and preference assignments and ship them to Webtrends Optimize together, you can do so in one large, wide file.

We recommend doing this only if your total data size is less than 100 MB. Beyond that, systems may start to struggle to open individual large files.
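A minimal sketch of that size check follows. It assumes "100 MB" means 100 × 1000 × 1000 bytes, which this document does not specify, and the function name is hypothetical.

```python
# Assumption: 100 MB is read as decimal megabytes (100 * 1000 * 1000 bytes).
# The document does not say whether decimal or binary megabytes are meant.
SINGLE_FILE_LIMIT_BYTES = 100 * 1000 * 1000


def suggested_option(total_bytes, limit=SINGLE_FILE_LIMIT_BYTES):
    """Return 1 (one wide file) below the limit, otherwise 2 (separate files)."""
    return 1 if total_bytes < limit else 2


print(suggested_option(42_000_000))   # a 42 MB export
print(suggested_option(250_000_000))  # a 250 MB export
```

In practice, total_bytes could come from os.path.getsize on the exported file.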

In this format, we would expect:

The first column to be the user_id, typically of your 3rd party system. This is what we will attempt to pair with data found on the front-end of your website.
Any number of subsequent columns, one per segment or attribute, with values that vary per user.
Raw segment assignments, i.e. "the user is in this group", should have boolean values of 1 (true) or 0 (false).

Example

user_id,is_vip,multi_time_purchaser,top_category,...
100054391.1700666161,1,1,jeans,...
200597573.1698063281,1,0,dresses,...
700101943.1702233191,0,0,jeans,...
1003024100.1700667329,1,0,tshirts,...
1006114757.1698395007,1,0,formalwear,...
1009145025.1692992974,0,0,accessories,...
200086821.1617038270,1,1,dresses,...
900025983.1708772046,1,0,dresses,...
...

Because each row describes a single user, this format is very easy to compress and translate.
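The sketch below shows one way to produce a file in this shape from Python. The source records, the helper names and the choice of columns are all hypothetical; the only rules taken from this document are that user_id comes first and that segment flags are written as 1 or 0.

```python
import csv
import io

# Hypothetical source records, e.g. the result of a warehouse query.
users = [
    {"user_id": "100054391.1700666161", "is_vip": True,
     "multi_time_purchaser": True, "top_category": "jeans"},
    {"user_id": "200597573.1698063281", "is_vip": True,
     "multi_time_purchaser": False, "top_category": "dresses"},
]


def to_cell(value):
    """Render booleans as 1/0 and leave every other value as text."""
    if isinstance(value, bool):
        return "1" if value else "0"
    return str(value)


def write_wide_csv(records, columns, out):
    """Write one row per user, with user_id forced into the first column."""
    ordered = ["user_id"] + [c for c in columns if c != "user_id"]
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(ordered)
    for record in records:
        writer.writerow([to_cell(record[c]) for c in ordered])


buffer = io.StringIO()
write_wide_csv(users, ["is_vip", "multi_time_purchaser", "top_category"], buffer)
print(buffer.getvalue())
```

Writing to an in-memory buffer keeps the example self-contained; passing an open file handle instead would write the same content to disk.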

Format Option 2: Separate files

If you have incremental updates, some segments that don't update often, or large volumes, it might make more sense to ship segment groups as individual files.

2.1 - Segment allocation only

When shipping "user is in segment X" style records, the only things we need are:

user_id, as the only column
file name, which we will interpret as the name of the audience.

Example:

wto_vip_customers.csv

user_id
100054391.1700666161
200597573.1698063281
700101943.1702233191
1003024100.1700667329
1006114757.1698395007
1009145025.1692992974
200086821.1617038270
900025983.1708772046

As you can see, this is the most efficient format, as it doesn't carry unnecessary values like "1" for every single record.
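A minimal sketch of writing such a file is shown below. The helper name and the temporary directory are assumptions for illustration; the file name and the single user_id column follow the example above.

```python
import csv
import tempfile
from pathlib import Path


def write_segment_file(directory, audience_name, user_ids):
    """Write <audience_name>.csv containing only a user_id column."""
    path = Path(directory) / f"{audience_name}.csv"
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, lineterminator="\n")
        writer.writerow(["user_id"])
        for user_id in user_ids:
            writer.writerow([user_id])
    return path


# Write into a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    path = write_segment_file(
        tmp,
        "wto_vip_customers",
        ["100054391.1700666161", "200597573.1698063281"],
    )
    written_name = path.name
    written_text = path.read_text(encoding="utf-8")

print(written_name)
print(written_text)
```

Because the audience name lives in the file name, adding a new segment means adding a new file rather than changing the layout of an existing one.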

2.2 - Attributes

If the values of the attributes are important, such as:

Date of last purchase
Favourite Category
Lifetime spend
etc.

You can instead use the same wide format described in Option 1, split across as many files as you need.

These can be mixed. For example, you can send us some files in the 2.1 format, containing user IDs only, and other files where attributes are given in the wide CSV format.
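To illustrate how the two shapes differ, the sketch below tells them apart from the header row alone. This is a hypothetical check you might run before shipping a mixed batch, not a description of how Webtrends Optimize processes files; the function name and labels are assumptions.

```python
import csv
import io


def classify_file(csv_text):
    """Return 'segment' for a user_id-only file, 'attributes' for a wide file."""
    header = next(csv.reader(io.StringIO(csv_text)))
    if header[0] != "user_id":
        raise ValueError("expected user_id as the first column")
    return "segment" if len(header) == 1 else "attributes"


segment_file = "user_id\n100054391.1700666161\n"
wide_file = "user_id,is_vip,top_category\n100054391.1700666161,1,jeans\n"

print(classify_file(segment_file))
print(classify_file(wide_file))
```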

FAQs

How do I get my credentials?

Please raise a support ticket, and we will send them to you.

What if I need to send data in another format?

Please ask us first, but we can typically accommodate anything.

What if I need to send data with different column names?

The column names above are just examples. Please check with us, but we can typically accommodate anything.
