|
| 1 | +--- |
| 2 | +title: Design a Graph Schema for Graph in Microsoft Fabric |
| 3 | +description: Learn best practices for designing a graph schema in Microsoft Fabric, including how to choose node types, edge types, key columns, and properties. |
| 4 | +ms.date: 04/10/2026 |
| 5 | +ms.topic: how-to |
| 6 | +ms.reviewer: wangwilliam |
| 7 | +ai-usage: ai-assisted |
| 8 | +--- |
| 9 | + |
| 10 | +# Design a graph schema in Microsoft Fabric |
| 11 | + |
| 12 | +[!INCLUDE [feature-preview](./includes/feature-preview-note.md)] |
| 13 | + |
| 14 | +A graph schema is the collection of node types, edge types, and their properties that define the structure of your graph. A well-designed graph schema makes your data easier to query, maintain, and extend. This article provides best practices for turning tabular data in a lakehouse into an effective [labeled property graph](graph-data-models.md) in Microsoft Fabric. |
| 15 | + |
| 16 | +Use these guidelines before you start modeling in the graph model editor. For step-by-step instructions on creating nodes and edges, see the [graph tutorial](tutorial-introduction.md). Examples in this article use the [Adventure Works sample dataset](sample-datasets.md). |
| 17 | + |
| 18 | +> [!IMPORTANT] |
| 19 | +> Graph currently doesn't support schema evolution. After you model your data, the structure of nodes, edges, and properties is fixed. Structural changes, such as adding properties, modifying labels, or changing relationship types, require you to create a new graph model and reload all data. This process takes time and consumes capacity, so plan your schema thoroughly before you start modeling. |
| 20 | + |
| 21 | +## Prerequisites |
| 22 | + |
| 23 | +- A [Fabric workspace](../fundamentals/create-workspaces.md) with a lakehouse that contains your source tables. |
| 24 | +- Familiarity with the [graph model editor](tutorial-introduction.md). |
| 25 | +- Optional: The [Adventure Works sample dataset](sample-datasets.md) to follow the examples in this article. |
| 26 | + |
| 27 | +## Understand node types and edge types |
| 28 | + |
| 29 | +Before you design a schema, understand these core concepts: |
| 30 | + |
| 31 | +A **node type** defines a kind of entity in your graph, such as a customer, product, or order. It consists of: |
| 32 | + |
| 33 | +- A **label**, which is the name that identifies this category of node. For example, `Customer`. You use the label in queries to refer to nodes of this type. |
| 34 | +- A **mapping table**, which is the lakehouse table that provides the source data for the node type. For example, the *adventureworks_customers* table. |
| 35 | +- A **key column** that uniquely identifies each node (labeled **ID** in the graph model editor). For example, `CustomerID_K`. |
| 36 | +- **Properties**, which are columns from the table that become attributes on each node. For example, `FirstName`, `LastName`, and `EmailAddress`. |
| 37 | + |
| 38 | +A **node** is an individual instance of a node type, which corresponds to one row in the mapping table. For example, each row in *adventureworks_customers* becomes a `Customer` node. |
| 39 | + |
| 40 | +An **edge type** defines a kind of relationship between two node types. It consists of: |
| 41 | + |
| 42 | +- A **label**, which is the name that identifies this category of relationship. For example, `purchases`. |
| 43 | +- A **mapping table** that contains the relationship data between the source and target nodes. For example, the *adventureworks_orders* table. |
| 44 | +- A **source node type** and a **target node type** that the edge connects. For example, `Customer` as the source and `Order` as the target. |
| 45 | + |
| 46 | +An **edge** is an individual instance of an edge type, which corresponds to one row in the mapping table that connects two specific nodes. |
| 47 | + |
| 48 | +> [!NOTE] |
| 49 | +> In the graph model editor, the **Add node** and **Add edge** buttons create node types and edge types, not individual nodes or edges. |
| 50 | + |
| 51 | +## Identify entities and relationships |
| 52 | + |
| 53 | +Start by identifying the *entities* (things) and *relationships* (connections) in your data. Entities become node types. Connections between entities become edge types. |
| 54 | + |
| 55 | +Ask these questions about your source tables: |
| 56 | + |
| 57 | +- **What are the primary entities?** Rows that represent distinct real-world things are candidates for node types. For example, customers, products, orders, and employees. |
| 58 | +- **How do these entities relate to each other?** Columns that reference rows in another table (foreign keys) suggest edge types. For example, `CustomerID_FK` in an `orders` table points to the `customers` table, which suggests modeling a `purchases` edge. |
| 59 | +- **Are there embedded entities?** A column inside a table might represent a distinct entity worth extracting into its own node type. For an example, see [Choose node types](#choose-node-types). For a step-by-step walkthrough, see [Add multiple node and edge types from one mapping table](tutorial-model-node-edge-from-same-table.md). |
| 60 | + |
| 61 | +## Choose node types |
| 62 | + |
| 63 | +Create a node type for each entity that you need to query or traverse independently. Use these guidelines: |
| 64 | + |
| 65 | +| Make the entity a **node type** when... | Keep it as a **property** when... | |
| 66 | +| --- | --- | |
| 67 | +| You need to traverse to or through it. | It's descriptive metadata you only read, not traverse. | |
| 68 | +| Multiple entities share a relationship with it. | It's unique to the entity it belongs to. | |
| 69 | +| You need to match or group by it directly in queries. | You only filter by it as a property of another entity. | |
| 70 | + |
| 71 | +**Example:** In the Adventure Works dataset, `Country` starts as a column on the `employees` table. If you need to query "which employees live in the same country?" or "which countries have the most employees?", extract `Country` into its own node type. If you only need to display an employee's country as a label, leave it as a property. |
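One way to gauge whether a column such as `Country` is shared widely enough to justify its own node type is to count how many rows carry each distinct value. The following Python sketch works on a small in-memory sample rather than a lakehouse table. The `EmployeeID_K` column name, the sample values, and the `shared_value_counts` helper are assumptions made for illustration and aren't part of the Adventure Works dataset or any Fabric API.

```python
from collections import Counter


def shared_value_counts(rows, column):
    """Count how many rows carry each distinct, non-empty value of `column`."""
    return Counter(
        row[column] for row in rows if row.get(column) not in (None, "")
    )


# Hypothetical sample rows standing in for the employees table.
employees = [
    {"EmployeeID_K": 1, "Country": "Canada"},
    {"EmployeeID_K": 2, "Country": "France"},
    {"EmployeeID_K": 3, "Country": "Canada"},
    {"EmployeeID_K": 4, "Country": None},
]

counts = shared_value_counts(employees, "Country")
# Values carried by more than one row are shared between entities.
shared = sorted(value for value, n in counts.items() if n > 1)
print(shared)
```

A value that many rows share is a stronger candidate for extraction than one that is mostly unique, but the deciding factor is still whether your queries need to traverse to or through it.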
| 72 | + |
| 73 | +## Choose key columns |
| 74 | + |
| 75 | +Every node type requires a key column (or compound key) that uniquely identifies each node. Choose keys carefully: |
| 76 | + |
| 77 | +- **Use existing unique identifiers** from your source tables. For example, `CustomerID_K` or `ProductID_K`. |
| 78 | +- **Avoid surrogate keys that lack business meaning** unless no natural key exists. For example, prefer `CustomerID` over an auto-incrementing row number. |
| 79 | +- **Use compound keys** when a single column doesn't guarantee uniqueness. For example, a `ProductVersion` node might need both `ProductID` and `VersionNumber` as its key. |
| 80 | +- **Match data types** between key columns and the foreign key columns used in edge mappings. Mismatched types cause edge creation failures. |
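The uniqueness and type-matching guidelines above can be checked on a sample of your data before you start modeling. The following Python sketch assumes the rows are already loaded as dictionaries. The helper names `find_duplicate_keys` and `mismatched_types` and the sample values are hypothetical, and the check compares Python types only, which approximates but doesn't replicate the lakehouse column types.

```python
from collections import Counter


def find_duplicate_keys(rows, key_columns):
    """Return key tuples that appear on more than one row."""
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return [key for key, n in counts.items() if n > 1]


def mismatched_types(node_rows, key_column, edge_rows, fk_column):
    """Return foreign key type names that never occur in the key column."""
    key_types = {type(row[key_column]).__name__ for row in node_rows}
    fk_types = {type(row[fk_column]).__name__ for row in edge_rows}
    return sorted(fk_types - key_types)


# Hypothetical sample rows for illustration only.
product_versions = [
    {"ProductID": 10, "VersionNumber": 1},
    {"ProductID": 10, "VersionNumber": 2},
    {"ProductID": 10, "VersionNumber": 2},  # duplicate compound key
]
customers = [{"CustomerID_K": 1}, {"CustomerID_K": 2}]
orders = [{"CustomerID_FK": "1"}, {"CustomerID_FK": "2"}]  # strings, not ints

print(find_duplicate_keys(product_versions, ["ProductID", "VersionNumber"]))
print(mismatched_types(customers, "CustomerID_K", orders, "CustomerID_FK"))
```

An empty result from both helpers suggests the sampled key is unique and that the foreign key values use the same type as the key. It doesn't guarantee the same for rows outside the sample.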
| 81 | + |
| 82 | +> [!TIP] |
| 83 | +> Define [node key constraints](gql-graph-types.md#set-up-node-key-constraints) to enable the query engine to perform direct lookups on key properties. This optimization speeds up queries that look up specific nodes by key. |
| 84 | + |
| 85 | +## Choose edge types |
| 86 | + |
| 87 | +Edge types define the relationships between node types. Each edge type connects a source node type to a target node type through a mapping table. |
| 88 | + |
| 89 | +Follow these guidelines: |
| 90 | + |
| 91 | +- **Use descriptive labels** that read as verbs or verb phrases. For example, `purchases`, `sells`, `livesIn`, and `belongsTo`. A well-named edge makes queries easier to read. |
| 92 | +- **Consider direction carefully.** Edges in graph are directed. Choose the direction that best represents the real-world relationship. For example, `Customer` --*purchases*--> `Order` reads more naturally than `Order` --*purchasedBy*--> `Customer`. |
| 93 | +- **Give distinct names to edge types that connect different node type pairs.** If both "employee sells order" and "customer purchases order" connect to `Order`, name them `sells` and `purchases` rather than giving both the same label. For more information, see [edge creation limitations](limitations.md#edge-creation). |
| 94 | +- **Add properties to edge types** when the relationship itself has attributes. For example, a `quantity` on a `contains` edge or an `orderDate` on a `purchases` edge. |
| 95 | + |
| 96 | +> [!IMPORTANT] |
| 97 | +> The mapping table for an edge must contain columns that match the key columns of both the source and target node types in values and data type. Tables that you use to create node types can also serve as edge mapping tables if they meet this requirement. |
| 98 | + |
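To illustrate the requirement that an edge mapping table match the key values of both node types, the following Python sketch lists rows whose source or target value has no corresponding node key. It assumes in-memory sample data. The `OrderID_K` column name, the sample values, and the `unmatched_edge_rows` helper are hypothetical. The sketch also doesn't show what graph does with such rows, only how you might find them ahead of time.

```python
def unmatched_edge_rows(edge_rows, source_fk, source_keys, target_fk, target_keys):
    """Return edge rows whose source or target value matches no node key."""
    source_keys, target_keys = set(source_keys), set(target_keys)
    return [
        row for row in edge_rows
        if row[source_fk] not in source_keys or row[target_fk] not in target_keys
    ]


# Hypothetical key values for the Customer and Order node types.
customer_keys = [1, 2]
order_keys = [100, 101, 102]

# The orders rows act as the edge mapping table for a purchases edge type.
orders = [
    {"OrderID_K": 100, "CustomerID_FK": 1},
    {"OrderID_K": 101, "CustomerID_FK": 2},
    {"OrderID_K": 102, "CustomerID_FK": 9},  # no matching customer key
]

unmatched = unmatched_edge_rows(
    orders, "CustomerID_FK", customer_keys, "OrderID_K", order_keys
)
print(unmatched)
```

Set membership in Python is type-sensitive, so a string `"1"` doesn't match an integer key `1`. That means the same check also surfaces rows where the values look alike but the types differ.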
| 99 | +## Remove unnecessary properties |
| 100 | + |
| 101 | +When you create a node type from a mapping table, every column in the table becomes a property by default. Remove properties that you don't need for queries or analysis. |
| 102 | + |
| 103 | +Excessive properties increase storage, slow queries, and make the graph harder to maintain. For each node type, keep only properties that are: |
| 104 | + |
| 105 | +- Required for the uniqueness of the node (key columns) |
| 106 | +- Used in `WHERE` filters or `RETURN` projections in your queries |
| 107 | +- Needed for downstream analysis or visualization |
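The keep criteria above amount to a set difference: any column that isn't a key, isn't used in a query, and isn't needed downstream is a candidate for removal. The following Python sketch assumes you've listed those three groups by hand. The `properties_to_remove` helper and the `ModifiedDate` column are hypothetical and serve only as an illustration.

```python
def properties_to_remove(columns, key_columns, queried, downstream):
    """Return columns that meet none of the keep criteria, in table order."""
    keep = set(key_columns) | set(queried) | set(downstream)
    return [col for col in columns if col not in keep]


# Column list for a Customer node type; ModifiedDate is a hypothetical column.
columns = ["CustomerID_K", "FirstName", "LastName", "EmailAddress", "ModifiedDate"]

removable = properties_to_remove(
    columns,
    key_columns=["CustomerID_K"],
    queried=["LastName"],         # used in WHERE filters or RETURN projections
    downstream=["EmailAddress"],  # needed for analysis or visualization
)
print(removable)
```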
| 108 | + |
| 109 | +For more information on how property count affects query performance, see [Return only the properties you need](gql-query-performance.md#return-only-the-properties-you-need). |
| 110 | + |
| 111 | +## Choose data types |
| 112 | + |
| 113 | +Select the most specific data type for each property. The right types improve both storage efficiency and query performance: |
| 114 | + |
| 115 | +- Use `INT` or `UINT64` for numeric identifiers and counts. Numeric comparisons are faster than string comparisons. |
| 116 | +- Use `ZONED DATETIME` for timestamps instead of string-formatted dates. |
| 117 | +- Use `BOOLEAN` for true/false flags instead of string values like `"yes"` or `"no"`. |
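If a source column stores everything as strings, a quick scan of sample values can indicate which of the types above it could be converted to. The following Python sketch is a heuristic and doesn't describe how Fabric infers or validates types. The `suggest_type` helper, the accepted boolean words, and the `STRING` fallback label are assumptions, and the sketch ignores numeric ranges and unsigned types.

```python
import re
from datetime import datetime

# Assumed spellings of boolean flags; adjust to match your data.
BOOLEAN_WORDS = {"true", "false", "yes", "no"}


def _is_zoned_datetime(text):
    """True if `text` parses as an ISO 8601 timestamp with a UTC offset."""
    try:
        return datetime.fromisoformat(text).tzinfo is not None
    except ValueError:
        return False


def suggest_type(values):
    """Suggest a property type name for a column of string values."""
    values = [v.strip() for v in values if v and v.strip()]
    if not values:
        return "STRING"
    if all(v.lower() in BOOLEAN_WORDS for v in values):
        return "BOOLEAN"
    if all(re.fullmatch(r"-?[0-9]+", v) for v in values):
        return "INT"
    if all(_is_zoned_datetime(v) for v in values):
        return "ZONED DATETIME"
    return "STRING"


print(suggest_type(["1", "42", "-7"]))
print(suggest_type(["yes", "no", "Yes"]))
print(suggest_type(["2024-01-15T10:30:00+00:00"]))
```

A suggestion applies only when every sampled value fits the type, so a single stray value such as `"N/A"` keeps the column as a string until you clean it.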
| 118 | + |
| 119 | +For the full list of supported types, see [Current limitations: Data types](limitations.md#data-types). |
| 120 | + |
| 121 | +## Common tabular-to-graph patterns |
| 122 | + |
| 123 | +The following table summarizes how some common tabular data structures translate to graph elements: |
| 124 | + |
| 125 | +| Tabular structure | Graph result | Example | |
| 126 | +| --- | --- | --- | |
| 127 | +| **One-to-many:** Parent table + child table with foreign key | Two node types connected by an edge type. | `Customer` --*purchases*--> `Order` | |
| 128 | +| **Many-to-many:** Junction table linking two tables | Edge type between two node types. | `Vendor` --*produces*--> `Product` | |
| 129 | +| **Embedded entity:** Column representing a shared entity | Extracted node type with edge. | `Employee` --*livesIn*--> `Country` | |
| 130 | +| **Hierarchy:** Chain of parent-child tables | Node types linked by edges at each level. | `Product` --*isOfType*--> `Subcategory` --*belongsTo*--> `Category` | |
| 131 | + |
| 132 | +For a step-by-step walkthrough of the embedded entity pattern, see [Add multiple node and edge types from one mapping table](tutorial-model-node-edge-from-same-table.md). |
| 133 | + |
| 134 | +## Related content |
| 135 | + |
| 136 | +- [Tutorial: Introduction to graph](tutorial-introduction.md) |
| 137 | +- [GQL graph types](gql-graph-types.md) |
| 138 | +- [Optimize GQL query performance](gql-query-performance.md) |
| 139 | +- [Labeled property graphs](graph-data-models.md) |
| 140 | +- [Current limitations](limitations.md) |