Commit 95c1735

committed
2: more dataflow articles
1 parent 0cd24cc commit 95c1735

2 files changed

Lines changed: 534 additions & 0 deletions

Lines changed: 243 additions & 0 deletions
@@ -0,0 +1,243 @@
1+
---
2+
title: Data flow graphs overview
3+
description: Learn about data flow graphs in Azure IoT Operations, including built-in transforms for mapping, filtering, branching, windowing, and enrichment.
4+
author: sethmanheim
5+
ms.author: sethm
6+
ms.service: azure-iot-operations
7+
ms.subservice: azure-data-flows
8+
ms.topic: concept-article
9+
ms.date: 03/13/2026
10+
ai-usage: ai-assisted
11+
12+
---
13+
14+
# Data flow graphs overview
15+
16+
[!INCLUDE [kubernetes-management-preview-note](../includes/kubernetes-management-preview-note.md)]
17+
18+
Data flow graphs give you a flexible way to process data as it moves through Azure IoT Operations. A standard [data flow](overview-dataflow.md) follows a fixed enrich, filter, map sequence. A data flow graph lets you compose transforms in any order, branch into parallel paths, and aggregate data over time windows.
19+
20+
A data flow graph is defined by the `DataflowGraph` Kubernetes custom resource. Inside it, you wire together sources, transforms, and destinations to build processing pipelines that match your scenario.
21+
22+
> [!IMPORTANT]
23+
> Data flow graphs currently support only MQTT, Kafka, and OpenTelemetry endpoints. Other endpoint types like Data Lake, Microsoft Fabric OneLake, Azure Data Explorer, and Local Storage aren't supported. For more information, see [Known issues](../troubleshoot/known-issues.md#data-flow-graphs-only-support-specific-endpoint-types).
24+
25+
## Data flows vs. data flow graphs
26+
27+
Azure IoT Operations provides two ways to process data in a pipeline:
28+
29+
| Capability | Data flows | Data flow graphs |
|-----------|-----------|-----------------|
| Pipeline shape | Fixed: enrich, filter, map | Flexible: any order, branching, merging |
| Transform types | Map, filter, enrich | Map, filter, branch, concat, window, enrich |
| Time-based aggregation | Not available | Window transforms with tumbling windows |
| Conditional routing | Not available | Branch and concat transforms |
| Endpoint support | All endpoint types | MQTT, Kafka, and OpenTelemetry only |
| Status | Generally available | Preview |
37+
38+
For new projects that use only the supported endpoint types, we recommend data flow graphs. Data flows remain fully supported and work with every endpoint type.
39+
40+
## Available transforms
41+
42+
Each transform is a pre-built processing step that you configure with rules and chain with other transforms inside a `DataflowGraph` resource.
43+
44+
| Transform | What it does | Learn more |
|-----------|-------------|------------|
| **Map** | Rename, restructure, compute, and copy fields | [Transform data with map](howto-dataflow-graphs-map.md) |
| **Filter** | Drop messages that match a condition | [Filter and route data](howto-dataflow-graphs-filter-route.md) |
| **Branch** | Route each message to a `true` or `false` path based on a condition | [Filter and route data](howto-dataflow-graphs-filter-route.md#branch-transform) |
| **Concat** | Merge two or more paths back into one | [Filter and route data](howto-dataflow-graphs-filter-route.md#merge-paths-with-concat) |
| **Window** | Collect messages over a time interval, then aggregate | [Aggregate data over time](howto-dataflow-graphs-window.md) |
51+
52+
All transforms share an [expression language](concept-dataflow-graphs-expressions.md) for operators, functions, and field references. You can also [enrich](howto-dataflow-graphs-enrich.md) messages with external data from a state store in map, filter, and branch transforms.
53+
54+
## How transforms compose
55+
56+
Transforms connect in sequence inside a `DataflowGraph` resource:
57+
58+
`Source ──→ Transform A ──→ Transform B ──→ … ──→ Destination`
59+
60+
Branch transforms split the flow into parallel paths, and concat transforms merge them back:
61+
62+
```
                        ┌── true  ──→ Transform X ─┐
Source ──→ Branch ──────┤                          ├──→ Concat ──→ Destination
                        └── false ──→ Transform Y ─┘
```
67+
68+
You can chain any number of transforms in any order. A pipeline with a single map transform is as valid as one that filters, branches, maps each path differently, merges, and then aggregates over a time window.
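The branch and concat pattern can be modeled in a few lines of Python. This is only a conceptual sketch of the routing semantics described above, not the Azure IoT Operations runtime: the function `run_branch_concat`, the `alert` field, and the threshold of 80 are all hypothetical, and a real concat transform might not preserve message order.

```python
from typing import Callable, Iterable

Message = dict


def run_branch_concat(
    messages: Iterable[Message],
    condition: Callable[[Message], bool],
    on_true: Callable[[Message], Message],
    on_false: Callable[[Message], Message],
) -> list[Message]:
    """Model Source -> Branch -> (Transform X | Transform Y) -> Concat."""
    merged = []
    for msg in messages:
        # Branch: each message takes exactly one of the two paths.
        path = on_true if condition(msg) else on_false
        # Concat: both paths feed the same output stream.
        merged.append(path(msg))
    return merged


# Hypothetical readings and rules, used only for illustration.
readings = [{"temperature": 20.0}, {"temperature": 95.0}]
result = run_branch_concat(
    readings,
    condition=lambda m: m["temperature"] > 80,
    on_true=lambda m: {**m, "alert": True},    # stands in for Transform X
    on_false=lambda m: {**m, "alert": False},  # stands in for Transform Y
)
print(result)
```

Every message leaves the branch on exactly one path, so the merged stream contains each input once, transformed by whichever path it took.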
69+
70+
## How configuration works
71+
72+
Each transform in a data flow graph references a pre-built artifact pulled from a container registry. You configure the transform by passing rules as JSON through the `configuration` section of the graph resource.
73+
74+
A default registry endpoint named `default` pointing to `mcr.microsoft.com` is created automatically when you deploy Azure IoT Operations. The built-in transforms use this endpoint to pull artifacts from Microsoft Container Registry. No extra registry setup is needed.
75+
76+
Here's a complete example that reads temperature data from an MQTT topic, converts Celsius to Fahrenheit with a map transform, and publishes the result:
77+
78+
# [Operations experience](#tab/portal)
79+
80+
![Screenshot of the operations experience showing a data flow graph example with source, transform, and destination.](media/concept-dataflow-graphs/dataflow-graph-example.png)
81+
82+
In the Operations experience:
83+
84+
1. Select **Data flow graph** > **Create data flow graph**.
1. Add a **source** with the default endpoint and topic `telemetry/temperature`.
1. Add a **map** transform. Configure a rule with input `temperature`, output `temperature_f`, and expression `cToF($1)`.
1. Add a **destination** with the default endpoint and topic `telemetry/converted`.
1. Connect: source → map → destination.
1. Select **Save**.
90+
91+
# [Bicep](#tab/bicep)
92+
93+
```bicep
resource dataflowGraph 'Microsoft.IoTOperations/instances/dataflowProfiles/dataflowGraphs@2025-10-01' = {
  name: 'temperature-conversion'
  parent: dataflowProfile
  properties: {
    profileRef: dataflowProfileName
    mode: 'Enabled'
    nodes: [
      {
        nodeType: 'Source'
        name: 'sensors'
        sourceSettings: {
          endpointRef: 'default'
          dataSources: [ 'telemetry/temperature' ]
        }
      }
      {
        nodeType: 'Graph'
        name: 'convert'
        graphSettings: {
          registryEndpointRef: 'default'
          artifact: 'azureiotoperations/graph-dataflow-map:1.0.0'
          configuration: [
            {
              key: 'rules'
              value: '{"map":[{"inputs":["temperature"],"output":"temperature_f","expression":"cToF($1)"}]}'
            }
          ]
        }
      }
      {
        nodeType: 'Destination'
        name: 'output'
        destinationSettings: {
          endpointRef: 'default'
          dataDestination: 'telemetry/converted'
        }
      }
    ]
    nodeConnections: [
      { from: { name: 'sensors' }, to: { name: 'convert' } }
      { from: { name: 'convert' }, to: { name: 'output' } }
    ]
  }
}
```
139+
140+
# [Kubernetes (preview)](#tab/kubernetes)
141+
142+
```yaml
apiVersion: connectivity.iotoperations.azure.com/v1
kind: DataflowGraph
metadata:
  name: temperature-conversion
  namespace: azure-iot-operations
spec:
  profileRef: default
  nodes:
    - nodeType: Source
      name: sensors
      sourceSettings:
        endpointRef: default
        dataSources:
          - telemetry/temperature

    - nodeType: Graph
      name: convert
      graphSettings:
        registryEndpointRef: default
        artifact: azureiotoperations/graph-dataflow-map:1.0.0
        configuration:
          - key: rules
            value: |
              {
                "map": [
                  {
                    "inputs": ["temperature"],
                    "output": "temperature_f",
                    "expression": "cToF($1)"
                  }
                ]
              }

    - nodeType: Destination
      name: output
      destinationSettings:
        endpointRef: default
        dataDestination: telemetry/converted

  nodeConnections:
    - from: { name: sensors }
      to: { name: convert }
    - from: { name: convert }
      to: { name: output }
```
188+
189+
---
190+
191+
The pipeline defines three elements: a source, a transform (indicated by `nodeType: Graph`), and a destination. The connections describe how data flows between them. The transform's `configuration` passes rules as a JSON string under the `rules` key.
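Because the `rules` value is a JSON document embedded as a string, it can be easier to generate than to write by hand. The following Python sketch serializes the map rule from the example and checks that every connection refers to a declared node. The helper names `build_rules_value` and `find_dangling_connections` are hypothetical and not part of any Azure tooling; the dictionaries simply mirror the fields shown in the example above.

```python
import json


def build_rules_value(map_rules: list[dict]) -> str:
    """Serialize map rules into the compact JSON string used as the `rules` value."""
    return json.dumps({"map": map_rules}, separators=(",", ":"))


def find_dangling_connections(nodes: list[dict], connections: list[dict]) -> list[str]:
    """Return node names used in connections that no node declares."""
    declared = {node["name"] for node in nodes}
    dangling = []
    for connection in connections:
        for end in ("from", "to"):
            name = connection[end]["name"]
            if name not in declared:
                dangling.append(name)
    return dangling


rules_value = build_rules_value(
    [{"inputs": ["temperature"], "output": "temperature_f", "expression": "cToF($1)"}]
)

# Node and connection names copied from the temperature-conversion example.
nodes = [{"name": "sensors"}, {"name": "convert"}, {"name": "output"}]
connections = [
    {"from": {"name": "sensors"}, "to": {"name": "convert"}},
    {"from": {"name": "convert"}, "to": {"name": "output"}},
]

print(rules_value)
print(find_dangling_connections(nodes, connections))
```

Generating the string with `json.dumps` avoids quoting mistakes when the rules are embedded in Bicep or YAML, and the connection check catches a misspelled node name before deployment.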
192+
193+
In the how-to articles that follow, examples focus on the transform rules themselves. For a step-by-step guide to creating a data flow graph, see [Create a data flow graph](howto-create-dataflow-graph.md).
194+
195+
## Built-in transforms vs. WASM transforms
196+
197+
Data flow graphs support two kinds of transforms:
198+
199+
- **Built-in transforms** are pre-built by Microsoft (map, filter, branch, concat, window). You configure them with rules. No coding required.
- **WASM transforms** are custom WebAssembly modules that developers build and deploy. Use them when you need logic that the built-in transforms don't cover.
201+
202+
Both kinds of transforms run inside the same `DataflowGraph` resource and can be mixed in a single pipeline. For information on building and deploying custom transforms, see [Use WASM transforms in data flow graphs](howto-dataflow-graph-wasm.md).
203+
204+
## Error handling
205+
206+
When a transform encounters an error while processing a message (for example, a missing field or an invalid expression), the message is dropped and an error is logged. The pipeline continues processing subsequent messages.
207+
208+
Common causes of processing errors:
209+
210+
- A field referenced in a rule's `inputs` doesn't exist in the message.
- A filter or branch expression returns a non-boolean value.
- An expression references an incompatible data type (for example, using a JSON object in arithmetic).
- A state store used for enrichment is unreachable.
214+
215+
To monitor for processing errors, check the pod logs for the data flow graph or use the metrics endpoints. For more information, see [Configure observability and monitoring](../configure-observability-monitoring/howto-configure-observability.md).
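The drop-and-continue behavior can be illustrated with a small Python model. This is a sketch of the semantics described in this section, not the actual transform implementation: `c_to_f` stands in for the built-in `cToF` expression using the standard conversion formula, `apply_map_rule` is a hypothetical helper, and keeping the original fields in the output is an assumption made for readability.

```python
import logging

logger = logging.getLogger("dataflow-sketch")


def c_to_f(celsius: float) -> float:
    """Standard Celsius-to-Fahrenheit conversion; a stand-in for `cToF`."""
    return celsius * 9 / 5 + 32


def apply_map_rule(messages: list[dict]) -> list[dict]:
    """Convert `temperature` to `temperature_f`, dropping messages that fail."""
    converted = []
    for msg in messages:
        try:
            value = msg["temperature"]  # KeyError when the field is missing
            fahrenheit = c_to_f(value)  # TypeError when the value isn't numeric
        except (KeyError, TypeError) as exc:
            # Log the failure and drop the message; later messages still run.
            logger.error("Dropping message %r: %r", msg, exc)
            continue
        converted.append({**msg, "temperature_f": fahrenheit})
    return converted


messages = [
    {"temperature": 100},            # valid
    {"humidity": 40},                # missing input field
    {"temperature": {"value": 21}},  # JSON object used in arithmetic
    {"temperature": 0},              # valid
]
output = apply_map_rule(messages)
print(output)
```

The two invalid messages are logged and skipped, while the valid messages before and after them are still converted.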
216+
217+
## Performance guidance
218+
219+
Each transform in the pipeline adds processing overhead. Keep these guidelines in mind:
220+
221+
- **Prefer fewer transforms with more rules.** If you have many transformation rules that operate on the same structure, put them in a single map transform rather than creating separate transforms for each rule.
- **Use multiple transforms when the logic is distinct.** Separate transforms make sense when different processing steps are fundamentally different (filtering vs. mapping vs. aggregating).
- **Keep related rules together.** A single map transform can handle field renaming, restructuring, computed fields, and metadata transformations all at once.
224+
225+
## Prerequisites
226+
227+
To use data flow graphs, you need:
228+
229+
- An Azure IoT Operations instance deployed on an Arc-enabled Kubernetes cluster. For more information, see [Deploy Azure IoT Operations](../deploy-iot-ops/howto-deploy-iot-operations.md).
- The default registry endpoint that points to `mcr.microsoft.com`, which is created automatically during deployment.
231+
232+
## Next steps
233+
234+
- [Data flows vs. data flow graphs](overview-dataflow-comparison.md)
- [Create a data flow graph](howto-create-dataflow-graph.md)
- [Transform data with map](howto-dataflow-graphs-map.md)
- [Filter and route data](howto-dataflow-graphs-filter-route.md)
- [Aggregate data over time](howto-dataflow-graphs-window.md)
- [Enrich with external data](howto-dataflow-graphs-enrich.md)
- [Expressions reference](concept-dataflow-graphs-expressions.md)
- [Route messages to different topics](howto-dataflow-graphs-topic-routing.md)
- [Use WASM transforms in data flow graphs](howto-dataflow-graph-wasm.md)
