Introduction
This guide aims to provide details of all the Aggregation Operators available in Data Manager, the situations under which they can be used, and what they are for. For a broader treatment of Aggregation mode as a whole, see this guide.
The Operators available to you will depend on both the input column and the output column - while a regular mapping requires the two to match, Aggregation sometimes converts the values into a new column-type. Thus, depending on your target column and the selected source, you will be shown only the Operators that make that match possible. If no match is possible, the Aggregation Operator drop-down will read 'Column type not supported!' Perhaps the Convert & Transform features can help with that...
Operators
Count
Available for all Input Types. Requires Integer Output Type. This Aggregation discards the actual values of the columns, and instead simply count the total number of lines. Possible uses includes turning a list of distinct items into a quantity.
Distinct Count
Available for all Input Types. Requires Integer Output Type. Similar to Count, but it only counts unique values, ignoring any identical lines in the aggregated table. Can be used to, for example, count the number of unique items in a list of purchases.
Average
Available for Decimal Input and Output Type only. Averages the values in the selected columns. In the case of the Integer Type, the result will be rounded to the closest whole number. Uses include finding the average unit-price from the items in an order.
Sum
Available for Integer or Decimal Input Types. Outputs same Type. Adds up the values in the selected columns. Likely one of the most commonly-used forms of Aggregation, especially for Decimal-type data. Commonly used for adding up the total value of an order.
Max
Available for Integer or Decimal Input Types. Outputs same Type. Takes the highest available value for the line. Applied to a subtotal, it can be used to check what the highest-priced item in the purchase is, for example.
Min
Available for Integer or Decimal Input Types. Outputs same Type. Takes the lowest available value for the line. Similar to Max, it is often used on subtotals, though in this case to check what the lowest-cost item in the purchase is.
Latest
Available for DateTime Input and Output Type only. Similar to the Max Operator, it takes the highest available DateTime value for the line. Can be used for order-line timestamps, effectively erasing unimportant distinctions that might otherwise arise from the order-lines being one or two seconds apart.
Earliest
Available for DateTime Input and Output Type only. Similar to the Min Operator, it takes the lowest available DateTime value for the line. Has a similar application as Latest - the difference is rather situational.
Distinct List
Available for String or ENUM Input Type. Only outputs String Type. This aggregation simple combines the presented values into a single field without altering them in any way. Handy for product-numbers, names, ID's, etc.
First Occurrence
Available for String or ENUM Input Type. Outputs same Type. This condenses the entire column into a single line, which will contain the first value of the column. Mostly used to convert a set of order-lines into a single-line Header, generally aggregating by timestamp.
Last Occurrence
Available for String or ENUM Input Type. Outputs same Type. This condenses the entire column into a single line, which will contain the last value of the column. It is used in largely the same fashion as First Occurrence.