Commit f5922e5

Merge pull request #313556 from sethmanheim/regrp2
2: more dataflow articles
2 parents ec47871 + d681e58 commit f5922e5

2 files changed

Lines changed: 526 additions & 0 deletions

Lines changed: 235 additions & 0 deletions
@@ -0,0 +1,235 @@
---
title: Data flow graphs overview
description: Learn about data flow graphs in Azure IoT Operations, including built-in transforms for mapping, filtering, branching, windowing, and enrichment.
author: sethmanheim
ms.author: sethm
ms.service: azure-iot-operations
ms.subservice: azure-data-flows
ms.topic: concept-article
ms.date: 03/13/2026
ai-usage: ai-assisted
---

# Data flow graphs overview

[!INCLUDE [kubernetes-management-preview-note](../includes/kubernetes-management-preview-note.md)]

Data flow graphs give you a flexible way to process data as it moves through Azure IoT Operations. A standard [data flow](overview-dataflow.md) follows a fixed enrich, filter, map sequence. A data flow graph lets you compose transforms in any order, branch into parallel paths, and aggregate data over time windows.

A data flow graph is defined by the `DataflowGraph` Kubernetes custom resource. Inside it, you wire together sources, transforms, and destinations to build processing pipelines that match your scenario.

> [!IMPORTANT]
> Data flow graphs currently support only MQTT, Kafka, and OpenTelemetry endpoints. Other endpoint types like Data Lake, Microsoft Fabric OneLake, Azure Data Explorer, and Local Storage aren't supported. For more information, see [Known issues](../troubleshoot/known-issues.md#data-flow-graphs-only-support-specific-endpoint-types).

## Data flows vs. data flow graphs

Azure IoT Operations provides two ways to process data in a pipeline:

| Capability | Data flows | Data flow graphs |
|-----------|-----------|-----------------|
| Pipeline shape | Fixed: enrich, filter, map | Flexible: any order, branching, merging |
| Transform types | Map, filter, enrich | Map, filter, branch, concat, window, enrich |
| Time-based aggregation | Not available | Window transforms with tumbling windows |
| Conditional routing | Not available | Branch and concat transforms |
| Endpoint support | All endpoint types | MQTT, Kafka, and OpenTelemetry only |
| Status | Generally available | Preview |

For new projects that use supported endpoint types, we recommend data flow graphs. Data flows remain fully supported and work with every endpoint type.

## Available transforms

Each transform is a pre-built processing step that you configure with rules and chain with other transforms inside a `DataflowGraph` resource.

<!-- | Transform | What it does | Learn more |
|-----------|-------------|------------|
| **Map** | Rename, restructure, compute, and copy fields | [Transform data with map](howto-dataflow-graphs-map.md) |
| **Filter** | Drop messages that match a condition | [Filter and route data](howto-dataflow-graphs-filter-route.md) |
| **Branch** | Route each message to a `true` or `false` path based on a condition | [Filter and route data](howto-dataflow-graphs-filter-route.md#branch-transform) |
| **Concat** | Merge two or more paths back into one | [Filter and route data](howto-dataflow-graphs-filter-route.md#merge-paths-with-concat) |
| **Window** | Collect messages over a time interval, then aggregate | [Aggregate data over time](howto-dataflow-graphs-window.md) | -->

<!-- All transforms share an [expression language](concept-dataflow-graphs-expressions.md) for operators, functions, and field references. You can also [enrich](howto-dataflow-graphs-enrich.md) messages with external data from a state store in map, filter, and branch transforms. -->

## How transforms compose

Transforms connect in sequence inside a `DataflowGraph` resource: **Source > Transform A > Transform B > … > Destination**.

Branch transforms split the flow into parallel paths, and concat transforms merge them back.

You can chain any number of transforms in any order. A pipeline with a single map transform is as valid as one that filters, branches, maps each path differently, merges, and then aggregates over a time window.
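
To make the composition concrete, the following Python sketch simulates one such pipeline in memory: filter, branch, map on one path, concat, and a tumbling-window average. It only illustrates the semantics described here and isn't how the runtime is implemented. The message fields (`ts`, `unit`, `temperature`), the filter and branch conditions, and the function names are all assumptions made for the example.

```python
from collections import defaultdict


def c_to_f(celsius):
    """Convert Celsius to Fahrenheit (stand-in for a map rule)."""
    return celsius * 9 / 5 + 32


def run_pipeline(messages, window_seconds=10):
    """Simulate filter -> branch -> map -> concat -> tumbling window."""
    # Filter: drop messages that have no temperature field (hypothetical condition).
    kept = [m for m in messages if "temperature" in m]

    # Branch: route each message to a true or false path (hypothetical condition).
    true_path = [m for m in kept if m.get("unit") == "C"]
    false_path = [m for m in kept if m.get("unit") != "C"]

    # Map: only the true path is converted to Fahrenheit.
    converted = [
        {**m, "temperature": c_to_f(m["temperature"]), "unit": "F"}
        for m in true_path
    ]

    # Concat: merge both paths back into one stream.
    merged = converted + false_path

    # Window: tumbling windows are fixed-size and don't overlap, so each
    # timestamp belongs to exactly one bucket.
    buckets = defaultdict(list)
    for m in merged:
        buckets[m["ts"] // window_seconds].append(m["temperature"])

    # Key each result by the start time of its window.
    return {
        index * window_seconds: sum(values) / len(values)
        for index, values in sorted(buckets.items())
    }


readings = [
    {"ts": 1, "temperature": 0, "unit": "C"},
    {"ts": 4, "temperature": 212, "unit": "F"},
    {"ts": 12, "temperature": 100, "unit": "C"},
    {"ts": 13, "unit": "C"},  # no temperature, so the filter drops it
]
averages = run_pipeline(readings)
```

In this sketch each stage is a plain list operation, so reordering the stages or removing one still yields a valid pipeline, which mirrors the "any order" property described above.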

## How configuration works

Each transform in a data flow graph references a pre-built artifact pulled from a container registry. You configure the transform by passing rules as JSON through the `configuration` section of the graph resource.

A default registry endpoint named `default` pointing to `mcr.microsoft.com` is created automatically when you deploy Azure IoT Operations. The built-in transforms use this endpoint to pull artifacts from Microsoft Container Registry. No extra registry setup is needed.

Here's a complete example that reads temperature data from an MQTT topic, converts Celsius to Fahrenheit with a map transform, and publishes the result:

# [Operations experience](#tab/portal)

<!-- ![Screenshot of the operations experience showing a data flow graph example with source, transform, and destination.](media/concept-dataflow-graphs/dataflow-graph-example.png) -->

In the Operations experience:

1. Select **Data flow graph** > **Create data flow graph**.
1. Add a **source** with the default endpoint and topic `telemetry/temperature`.
1. Add a **map** transform. Configure a rule with input `temperature`, output `temperature_f`, and expression `cToF($1)`.
1. Add a **destination** with the default endpoint and topic `telemetry/converted`.
1. Connect: source → map → destination.
1. Select **Save**.

# [Bicep](#tab/bicep)

```bicep
resource dataflowGraph 'Microsoft.IoTOperations/instances/dataflowProfiles/dataflowGraphs@2025-10-01' = {
  name: 'temperature-conversion'
  parent: dataflowProfile
  properties: {
    profileRef: dataflowProfileName
    mode: 'Enabled'
    nodes: [
      {
        nodeType: 'Source'
        name: 'sensors'
        sourceSettings: {
          endpointRef: 'default'
          dataSources: [ 'telemetry/temperature' ]
        }
      }
      {
        nodeType: 'Graph'
        name: 'convert'
        graphSettings: {
          registryEndpointRef: 'default'
          artifact: 'azureiotoperations/graph-dataflow-map:1.0.0'
          configuration: [
            {
              key: 'rules'
              value: '{"map":[{"inputs":["temperature"],"output":"temperature_f","expression":"cToF($1)"}]}'
            }
          ]
        }
      }
      {
        nodeType: 'Destination'
        name: 'output'
        destinationSettings: {
          endpointRef: 'default'
          dataDestination: 'telemetry/converted'
        }
      }
    ]
    nodeConnections: [
      { from: { name: 'sensors' }, to: { name: 'convert' } }
      { from: { name: 'convert' }, to: { name: 'output' } }
    ]
  }
}
```

# [Kubernetes (preview)](#tab/kubernetes)

```yaml
apiVersion: connectivity.iotoperations.azure.com/v1
kind: DataflowGraph
metadata:
  name: temperature-conversion
  namespace: azure-iot-operations
spec:
  profileRef: default
  nodes:
    - nodeType: Source
      name: sensors
      sourceSettings:
        endpointRef: default
        dataSources:
          - telemetry/temperature

    - nodeType: Graph
      name: convert
      graphSettings:
        registryEndpointRef: default
        artifact: azureiotoperations/graph-dataflow-map:1.0.0
        configuration:
          - key: rules
            value: |
              {
                "map": [
                  {
                    "inputs": ["temperature"],
                    "output": "temperature_f",
                    "expression": "cToF($1)"
                  }
                ]
              }

    - nodeType: Destination
      name: output
      destinationSettings:
        endpointRef: default
        dataDestination: telemetry/converted

  nodeConnections:
    - from: { name: sensors }
      to: { name: convert }
    - from: { name: convert }
      to: { name: output }
```

---

The pipeline defines three nodes: a source, a transform (indicated by `nodeType: Graph`), and a destination. The `nodeConnections` entries describe how data flows between them, referring to each node by its `name`. The transform's `configuration` passes the rules as a JSON string under the `rules` key.
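
Because the rules travel as a JSON string and the connections refer to nodes by name, two easy mistakes are malformed JSON quoting and a connection that names a node that doesn't exist. The Python sketch below shows one way to guard against both before you deploy. It mirrors only the fields shown in the example above; the helper names `build_rules` and `dangling_connections` are hypothetical and aren't part of any Azure IoT Operations tooling.

```python
import json


def build_rules(map_rules):
    """Serialize map rules to the compact JSON string used under the `rules` key."""
    return json.dumps({"map": map_rules}, separators=(",", ":"))


def dangling_connections(graph):
    """Return the connections that reference a node name not defined in `nodes`."""
    names = {node["name"] for node in graph["nodes"]}
    return [
        conn
        for conn in graph["nodeConnections"]
        if conn["from"]["name"] not in names or conn["to"]["name"] not in names
    ]


rules = build_rules(
    [{"inputs": ["temperature"], "output": "temperature_f", "expression": "cToF($1)"}]
)

# A trimmed-down copy of the example graph: only the fields needed for the checks.
graph = {
    "nodes": [
        {"nodeType": "Source", "name": "sensors"},
        {
            "nodeType": "Graph",
            "name": "convert",
            "graphSettings": {"configuration": [{"key": "rules", "value": rules}]},
        },
        {"nodeType": "Destination", "name": "output"},
    ],
    "nodeConnections": [
        {"from": {"name": "sensors"}, "to": {"name": "convert"}},
        {"from": {"name": "convert"}, "to": {"name": "output"}},
    ],
}

problems = dangling_connections(graph)
```

Generating the string with `json.dumps` rather than typing it by hand avoids quoting errors when you paste the value into a Bicep or YAML file.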

<!-- In the how-to articles that follow, examples focus on the transform rules themselves. For a step-by-step guide to creating a data flow graph, see [Create a data flow graph](howto-create-dataflow-graph.md). -->

## Built-in transforms vs. WASM transforms

Data flow graphs support two kinds of transforms:

- **Built-in transforms** are pre-built by Microsoft (map, filter, branch, concat, window). You configure them with rules. No coding required.
- **WASM transforms** are custom WebAssembly modules that developers build and deploy. Use them when you need logic that the built-in transforms don't cover.

Both kinds of transforms run inside the same `DataflowGraph` resource and can be mixed in a single pipeline. For information on building and deploying custom transforms, see [Use WASM transforms in data flow graphs](howto-dataflow-graph-wasm.md).

## Error handling

When a transform encounters an error while processing a message (for example, a missing field or an invalid expression), the message is dropped and an error is logged. The pipeline continues processing subsequent messages.

Common causes of processing errors:

- A field referenced in a rule's `inputs` doesn't exist in the message.
- A filter or branch expression returns a non-boolean value.
- An expression references an incompatible data type (for example, using a JSON object in arithmetic).
- A state store used for enrichment is unreachable.
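
The drop-and-continue behavior can be pictured with a short Python sketch. It's an illustration of the semantics only, under the assumption that a missing field and a type mismatch surface as exceptions; the runtime's actual mechanism and log format aren't described in this article. The names `process` and `convert` are hypothetical.

```python
import logging

logger = logging.getLogger("dataflow-sketch")


def convert(message):
    """Stand-in map rule: add temperature_f computed from temperature."""
    # Raises KeyError if the field is missing and TypeError if it isn't numeric.
    return {**message, "temperature_f": message["temperature"] * 9 / 5 + 32}


def process(messages, transform):
    """Apply a transform to each message, dropping the ones that fail."""
    results = []
    for message in messages:
        try:
            results.append(transform(message))
        except (KeyError, TypeError) as exc:
            # Log the failure and move on to the next message.
            logger.error("dropping message %r: %r", message, exc)
    return results


incoming = [
    {"temperature": 0},
    {"humidity": 5},                  # missing input field
    {"temperature": {"value": 1}},    # JSON object used in arithmetic
    {"temperature": 100},
]
outgoing = process(incoming, convert)
```

Only the first and last messages reach the output. The two failing messages are logged and skipped, and processing doesn't stop.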

To monitor for processing errors, check the pod logs for the data flow graph or use the metrics endpoints. For more information, see [Configure observability and monitoring](../configure-observability-monitoring/howto-configure-observability.md).

## Performance guidance

Each transform in the pipeline adds processing overhead. Keep these guidelines in mind:

- **Prefer fewer transforms with more rules.** If you have many transformation rules that operate on the same structure, put them in a single map transform rather than creating a separate transform for each rule.
- **Use multiple transforms when the logic is distinct.** Separate transforms make sense when the processing steps are fundamentally different (filtering vs. mapping vs. aggregating).
- **Keep related rules together.** A single map transform can handle field renaming, restructuring, computed fields, and metadata transformations all at once.
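
As a rough illustration of the first guideline, the Python sketch below folds the rules from several map configurations into a single `rules` string, so that one map transform can replace several. It assumes that the `map` array accepts more than one rule, which the list shape in the earlier example suggests but this article doesn't state outright. The second rule (`ambient` to `ambient_f`) and the helper name `merge_map_rules` are hypothetical.

```python
import json


def merge_map_rules(*rule_strings):
    """Combine several {"map": [...]} JSON strings into one, preserving order."""
    merged = []
    for rule_string in rule_strings:
        merged.extend(json.loads(rule_string)["map"])
    return json.dumps({"map": merged}, separators=(",", ":"))


# Rules that would otherwise live in two separate map transforms.
first = '{"map":[{"inputs":["temperature"],"output":"temperature_f","expression":"cToF($1)"}]}'
second = '{"map":[{"inputs":["ambient"],"output":"ambient_f","expression":"cToF($1)"}]}'

combined = merge_map_rules(first, second)
```

The combined string goes under the `rules` key of a single map node, and the node connections then need one hop instead of two.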

## Prerequisites

To use data flow graphs, you need:

- An Azure IoT Operations instance deployed on an Arc-enabled Kubernetes cluster. For more information, see [Deploy Azure IoT Operations](../deploy-iot-ops/howto-deploy-iot-operations.md).
- The default registry endpoint that points to `mcr.microsoft.com`, which is created automatically during deployment.

## Next steps

<!-- - [Data flows vs. data flow graphs](overview-dataflow-comparison.md)
- [Create a data flow graph](howto-create-dataflow-graph.md)
- [Transform data with map](howto-dataflow-graphs-map.md)
- [Filter and route data](howto-dataflow-graphs-filter-route.md)
- [Aggregate data over time](howto-dataflow-graphs-window.md)
- [Enrich with external data](howto-dataflow-graphs-enrich.md)
- [Expressions reference](concept-dataflow-graphs-expressions.md)
- [Route messages to different topics](howto-dataflow-graphs-topic-routing.md)
- [Use WASM transforms in data flow graphs](howto-dataflow-graph-wasm.md) -->
