Skip to content

Latest commit

 

History

History
84 lines (75 loc) · 7.45 KB

File metadata and controls

84 lines (75 loc) · 7.45 KB
title Aggregate Functions in the Mapping Data Flow
titleSuffix Azure Data Factory & Azure Synapse
description Learn about aggregate functions in mapping data flows.
author kromerm
ms.author makromer
ms.subservice data-flows
ms.custom synapse
ms.topic concept-article
ms.date 01/05/2024

Aggregate functions in mapping data flows

[!INCLUDEappliesto-adf-asa-md]

[!INCLUDEdata-flow-preamble]

This article provides details about aggregate functions supported by Azure Data Factory and Azure Synapse Analytics in mapping data flows.

Aggregate function list

The following functions are available only in aggregate, pivot, unpivot, and window transformations.

Aggregate function Task
approxDistinctCount Gets the approximate aggregate count of distinct values for a column. The optional second parameter is to control the estimation error.
avg Gets the average of values of a column.
avgIf Gets the average of values of a column, based on criteria.
collect Collects all values of the expression in the aggregated group into an array. During this process, you can collect and transform structures to alternate structures. The number of items is equal to the number of rows in that group and can contain null values. The number of collected items should be small.
collectUnique Collects all values of the expression in the aggregated group into a unique array. During this process, you can collect and transform structures to alternate structures. The number of items is smaller or equal to the number of rows in that group and can contain null values. The number of collected items should be small.
count Gets the aggregate count of values. If the optional columns are specified, it ignores NULL values in the count.
countAll Gets the aggregate count of values, including NULL values.
countDistinct Gets the aggregate count of distinct values of a set of columns.
countAllDistinct Gets the aggregate count of distinct values of a set of columns, including NULL values.
countIf Gets the aggregate count of values, based on criteria. If the optional column is specified, it ignores NULL values in the count.
covariancePopulation Gets the population covariance between two columns.
covariancePopulationIf Gets the population covariance of two columns, based on criteria.
covarianceSample Gets the sample covariance of two columns.
covarianceSampleIf Gets the sample covariance of two columns, based on criteria.
first Gets the first value of a column group. If the second parameter ignoreNulls is omitted, false is assumed.
isDistinct Finds if a column or set of columns is distinct. It doesn't count null as a distinct value.
kurtosis Gets the kurtosis of a column.
kurtosisIf Gets the kurtosis of a column, based on criteria.
last Gets the last value of a column group. If the second parameter ignoreNulls is omitted, false is assumed.
max Gets the maximum value of a column.
maxIf Gets the maximum value of a column, based on criteria.
mean Gets the mean of values of a column. Same as AVG.
meanIf Gets the mean of values of a column, based on criteria. Same as avgIf.
min Gets the minimum value of a column.
minIf Gets the minimum value of a column, based on criteria.
skewness Gets the skewness of a column.
skewnessIf Gets the skewness of a column, based on criteria.
stddev Gets the standard deviation of a column.
stddevIf Gets the standard deviation of a column, based on criteria.
stddevPopulation Gets the population standard deviation of a column.
stddevPopulationIf Gets the population standard deviation of a column, based on criteria.
stddevSample Gets the sample standard deviation of a column.
stddevSampleIf Gets the sample standard deviation of a column, based on criteria.
sum Gets the aggregate sum of a numeric column.
sumDistinct Gets the aggregate sum of distinct values of a numeric column.
sumDistinctIf Gets the aggregate sum of a numeric column, based on criteria. The condition can be based on any column.
sumIf Gets the aggregate sum of a numeric column, based on criteria. The condition can be based on any column.
topN Gets the top N values for this column.
variance Gets the variance of a column.
varianceIf Gets the variance of a column, based on criteria.
variancePopulation Gets the population variance of a column.
variancePopulationIf Gets the population variance of a column, based on criteria.
varianceSample Gets the unbiased variance of a column.
varianceSampleIf Gets the unbiased variance of a column, based on criteria.

Related content