Convert & Transform: A Detailed Breakdown

 

The most substantial part of creating a Dataflow is unquestionably the Transform & Map step, and while this of course is covered in the general guide, Introduction to the Data Manager: How to set up a Dataflow, a more detailed look at the options that this feature offers in terms of converting and transforming columns may be necessary. Thus, this list.

 

1. Convert

 

The most straightforward way to adjust your data is to simply convert it from one form or format to another. In most cases, it is possible to select a fallback value that will be employed as a default if a line is empty or invalid.

 

When selecting Convert, you will have the following options:

  • Boolean

A format used for purely binary datapoints of the type true/false. In some cases, the data will only contain a 'true' where relevant, and leave other fields blank - making it particularly important to use the Fallback Value field in this case, by defining it as 'false'. Also note that ONLY values of 'true' and 'false' can be read, so if any other format is used - such as 'yes/no' or '0/1', a Replace-transformation will be required in order to make the column readable.

  • Currency

Used to convert financial data from one currency to another, based on current exchange-rates. Available currencies include Danish, Swedish and Norwegian Kroner, US Dollars, British Pounds, Euros, Swiss Francs and Polish Zloty. You must also make sure to select which kind of decimal point is used by the column - with en being a period and da being a comma. The output number is also automatically converted to the Decimal format.

  • DateTime

Used for timestamps of various kinds, this frequently-needed conversion allows you to choose from a broad selection of common date and date-time formats, or enter a custom one that fits the data. If you mouse over the 'Date time formats' info-text just above the Convert-button, a guide to creating custom DateTime formats will appear, along with a link to a more detailed list.

  • Decimal

Used for decimal-numbers, such as cash values. You must select whether the numbers in question use an english-style sepparator (period, or point), or a danish-style sepparator (comma, also used in most other non-anglophone countries.) Be careful in selecting the correct one, since major errors can otherwise occur!

  • Email

Used for converting E-Mail lists into a format readable by the system. Typos in the e-mails can cause issues, with common ones including anything that ends with .con or .vom instead of .com - hence why the 'Clear out invalid emails' option is toggled on by default. If there is a large number of these, keep in mind that the Transform option makes it possible to do a Replace while converting the data into the Email format.

  • Enum

This advanced conversion allows you to set one or more specific values, be they numbers or text-strings. Only these known values will be accepted, and anything outside of them will generate an error. Useful for cases where you want a very rigid output and no margin of error, such as output from radio-buttons or checkboxes.

  • Guid

A specific format used for personal ID's, often customer IDs. The most common example of this is likely the Cookie ID that may be used in cases when you have nothing more solid, such as e-mails or similar, to fall back on.

  • Integer

This simply converts a datastring into the Integer-format, used for clean numbers without fractions. If a number with a fraction is converted using this, whatever lies after the decimal-point is simply ignored. Often used for non-monetary counts such as inventory, sales-numbers, visits, and much, much more. If it's a number without a decimal-point in it, Integer is a solid option - unless another, more specific format supercedes it. Note that there is an upper limit in terms of numbers - above roughly 2 billion, the system cannot parse it as an Integer, and the Conversion will fail. In such a case, consider using Decimal, which has a significantly higher limit.

  • PhoneNumber

Used to convert phone-numbers into a format useable by Raptor's systems. The conversion does not allow for dashes or brackets, so if some of the values in the column are likely to contain these, using the Replace Transformation to remove them before converting may be advisible. Only numbers, spaces and + is allowed. A nation-code is also required, and a 'default' can be entered that will be added to any phone-number that lacks on - useful, if the stored phone-numbers are purely from local customers and therefore lack the nation-code. The default nation-code should be entered without 0's or +. The Clear if invalid toggle will remove any invalid phone-numbers - this will include any with dashes or brackets, as mentioned above. The converted number is stored in the international standard E164 format - this includes a +, the nation-code, and the rest of the number, all without any spaces, dashes or similar. However, when used in Activations, the system will automatically convert them into the format required by the third-party recipient.

  • Regex

This conversion allows you to apply a Regular Expression - a 'Regex' term - to the column. Only values that match this term will be retained, all other data will be discarded. For example, if you have a column featuring the full name of customers, you could use ^\w+ to isolate the first name only, or \w+$ to isolate the last name only. The output is formatted as a regular String.

  • String

A String can be virtually anything, and is the 'default' format that the system imports all columns in. In the case of text-strings, such as product descriptions or lists of names, the column will generally remain in this format. In the case of a pure Convert, there is little reason to use this option, but it is there for a reason - having utility when using Transform instead since it allows you to run Transform-functions on a String without changing its format.

 

2. Transform

 

Transform is a more complex and powerful way to convert your data into the desired form. Not only does it allow for multiple columns to be combined into a single one, it can also apply a number of unique Transformations to a selected column prior to Converting them as usual.

 

When transforming a column, you have the following options:

  • Add Default Value

Similar to the Fallback Value setting available in many Conversions, this sets a value that will be applied to any empty line in the column.

  • AES Decryption

Used for AES Encrypted columns - enter the correct Decryption Key to return the data to a readable format, which can then be Converted as normal.

  • Format

Mostly used for text-strings. You can type in whatever you like, using {0} to represent the actual column content. For example, if the column is a list of brands, and you wish to output a text-string that states "Produced by [Brand]", you would simply type 'Produced by {0}' in the field.

  • Regex Replace

Use a Regular Expression to identify values. Those that match the expression will then be replaced by the set Replace Value. Essentially a more complex version of the regular Replace Transformation.

  • Regex Select

Effectively identical to the Regex Conversion-option, allowing you to select values based on a Regular Expression before applying a different Conversion to a given column. There is also a toggle for selecting whether a complete lack of matches for the Expression returns an empty field, or an error. For example, if one has a list of payments featuring decimals and currency-type, but you want to convert it into an Integer-format showing just the whole numbers, you could use the expression \d+ to isolate just the first whole number on each line - at which point it can be easily Converted into an Integer

  • Regex Select Many

A more advanced version of Regex Select. Rather than simply taking the first valid result of the entered Expression, this allows you to combine multiple results into a usable String, with the 'Combine Matches With' field denoting the symbol that is used to sepparate them. The 'Number of Matches' field is used to select the number of results that are recorded, with zero denoting no limit, and negative numbers causing the count to start from the bottom rather than the top.

  • Replace

This transformation allows you to effectively run a search-and-replace on the column. Whatever is selected as the Current Value will be transformed into whatever is selected as the New Value. This can apply to text-strings or numbers as needed. One possible use for this is to reduce the number of invalid addresses in an e-mail list by replacing common typos such as .con and .vom with .com.

  • Select List Item

This transformation is designed to pull a single entry out of a list. You may select the Separator, which will usually but not always be a comma - while the Index field simply refers to which entry the Transformation pulls out. An Index of 3 means that the third entry on the list is kept, for example. This may be useful if a data-dump, for some technical reason, contains important datapoints such as URLs, image-links or similar intermingled with irrelevant data. Alternately, if multiple crucial datapoints are stuck in the same column, you can use this transformation multiple times on it, and thus split them into discrete columns that can then be mapped separately.

  • Substring

Allows you to select a specific segment of a column to be used, discarding the rest. Startindex denotes where to begin, while Length determines how far beyond that point to continue. For example, selecting '5' and '10' as the values will cause the first 4 digits to be discarded, the 10 that follows to be retained, and anything after that to be similarly ignored. Can, for example, be used to pick a relevant value out of an otherwise regular URL.

  • Truthy

Transform any simple, binary listing into a clear-cut True/False statement - often useful for in-stock designators and similar, which can arrive in a number of different formats. The 'Truthy' variant will set any value other than 0 as True, and 0/blank spaces as False.

  • Falsy

Transform any simple, binary listing into a clear-cut True/False statement - often useful for in-stock designators and similar, which can arrive in a number of different formats. The 'Falsy' variant will set any value other than 0 as False, and 0/blank spaces as True.

  • URL Parameter

This transformation is used to pick specific parameters out of URLs, helping to identify click-sources, platforms, and so on. Simply enter the type of parameter you wish to scan the URLs for - such as UTM parameters or a Google Click ID - in the text-box, and the value of the selected parameter will be the output when running the dataflow. You can also toggle the 'Ignore Malformed URLs' radio-button if you wish - in such cases, flawed URLs will simply result in an empty line. The radio button 'Ignore empty values' can be switched on, if you wish to leave out URL parameters that have empty values. 

 

💡Example:

What are URL Parameters? URL Parameters get tacked onto an URL depending on circumstances, extending it. They generally take the form of {Parameter}={value}. For example, the value of the parameter utm_source might inform you that a customer arrived at the product by clicking on a link in an e-mail, or on Google. The utm_campaign parameter might provide the ID of the active campaign if the customer came via clicking on an ad. And if someone arrived via Google, there will be a gclid parameter - a Google Click ID.

 

3. Combine

 

The Combine-option provides an intermediary step between Transform and Convert, which joins two or more columns into one. Each column involved can be individually Transformed beforehand, and the final result can be Converted into the desired format at the end.

 

When combining columns, you have the following options:

  • Fallback

By selecting a specific value, with 'empty' being an obvious option, you can set the system to use the next column in the order instead. For example, if Column 1 is empty, the corresponding line from Column 2 will be used instead, if both are empty then Column 3 is used... A 'backup' fallback can also still be set, which will then be utilized if all of the included columns are empty.

  • Format

Similar to the Format Transformation, this allows you to similarly enter a desired string, inserting all the selected columns wherever desired by using {0}, {1}, {2}, etc. to denote column 1, 2, 3 and so forth. Keep in mind that it always counts from 0. Example: You are selling clothes, and column 1 contains the type of item, while column 2 is the color. You could thus enter 'A {0} {1}' to get output such as 'A red dress' or 'A blue shirt'.

  • Join

The values are of the columns are simply combined in a single field, using the selected symbol - such as a comma or dash - to denote the point where they are are joined. One popular use for this is to generate unique Keys for Point-Of-Sale data, in cases where Order Headers aren't available, and multiple order-lines use the same Order ID. For example, using Join to combine Order ID and Product ID will provide a unique Key for each line.

  • Key Generator

This uses the selected values to generate a unique ID Key, similar to the usecase described for Join above - albeit through a somewhat more complex method. It can act as an alterantive to Join in this regard, but is primarily meant to be used with OrderLine Schema, specifically. For more details on OrderLine Schema and the way the Key Generator is used in their application, see this guide.