title	Surrogate key transformation in mapping data flow
titleSuffix	Azure Data Factory & Azure Synapse
description	Learn how to use the mapping data flow Surrogate Key Transformation to generate sequential key values in Azure Data Factory and Synapse Analytics.
author	kromerm
ms.author	makromer
ms.reviewer	daperlov
ms.subservice	data-flows
ms.topic	concept-article
ms.custom	synapse
ms.date	05/15/2024

Surrogate key transformation in mapping data flow

[!INCLUDEappliesto-adf-asa-md]

[!INCLUDEdata-flow-preamble]

Use the surrogate key transformation to add an incrementing key value to each row of data. This is useful when designing dimension tables in a star schema analytical data model. In a star schema, each member in your dimension tables requires a unique key that is a non-business key.

Configuration

:::image type="content" source="media/data-flow/surrogate.png" alt-text="Surrogate Key Transform":::

Key column: The name of the generated surrogate key column.

Start value: The lowest key value that will be generated.

Increment keys from existing sources

To start your sequence from a value that exists in a source, we recommend to use a cache sink to save that value and use a derived column transformation to add the two values together. Use a cached lookup to get the output and append it to the generated key. For more information, learn about cache sinks and cached lookups.

:::image type="content" source="media/data-flow/cached-lookup-example.png" alt-text="Surrogate Key lookup":::

Increment from existing maximum value

To seed the key value with the previous max, there are two techniques that you can use based on where your source data is.

Database sources

Use a SQL query option to select MAX() from your source. For example, Select MAX(<surrogateKeyName>) as maxval from <sourceTable>.

:::image type="content" source="media/data-flow/surrogate-key-max-database.png" alt-text="Surrogate Key Query":::

File sources

If your previous max value is in a file, use the max() function in the aggregate transformation to get the previous max value:

:::image type="content" source="media/data-flow/surrogate-key-max-file.png" alt-text="Surrogate Key File":::

In both cases, you will need to write to a cache sink and lookup the value.

Data flow script

Syntax

<incomingStream> 
    keyGenerate(
        output(<surrogateColumnName> as long),
        startAt: <number>L
    ) ~> <surrogateKeyTransformationName>

Example

:::image type="content" source="media/data-flow/surrogate.png" alt-text="Surrogate Key Transform":::

The data flow script for the above surrogate key configuration is in the code snippet below.

AggregateDayStats
    keyGenerate(
        output(key as long),
        startAt: 1L
    ) ~> SurrogateKey1

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Surrogate key transformation in mapping data flow

Configuration

Increment keys from existing sources

Increment from existing maximum value

Database sources

File sources

Data flow script

Syntax

Example

Related content

FilesExpand file tree

data-flow-surrogate-key.md

Latest commit

History

data-flow-surrogate-key.md

File metadata and controls

Surrogate key transformation in mapping data flow

Configuration

Increment keys from existing sources

Increment from existing maximum value

Database sources

File sources

Data flow script

Syntax

Example

Related content