Azure Data Factory and Azure Synapse Analytics pipelines support the following data stores and formats via the Copy, Data Flow, Lookup, Get Metadata, and Delete activities. Select each data store to learn the supported capabilities and the corresponding configurations in detail.

> [!NOTE]
> Connectors marked *Preview* are available to try, but are not recommended for production workloads. Certain features might not be supported or might have constrained capabilities.

Azure Data Factory and Synapse pipelines can reach a broader set of data stores than the list above. If you need to move data to or from a data store that isn't in the service's built-in connector list, here are some extensible options:

- For databases and data warehouses, you can usually find a corresponding ODBC driver, with which you can use the [generic ODBC connector](connector-odbc.md).
- For SaaS applications:
  - If it provides RESTful APIs, you can use the [generic REST connector](connector-rest.md); a minimal linked service sketch follows this list.
  - If it has an OData feed, you can use the [generic OData connector](connector-odata.md).
  - If it provides SOAP APIs, you can use the [generic HTTP connector](connector-http.md).
  - If it has an ODBC driver, you can use the [generic ODBC connector](connector-odbc.md).
- For others, check whether you can load data to, or expose data as, any supported data store (for example, Azure Blob, Azure Files, FTP, or SFTP), and then let the service pick it up from there. You can invoke a custom data loading mechanism via [Azure Function](control-flow-azure-function-activity.md), [Custom activity](transform-data-using-dotnet-custom-activity.md), [Databricks](transform-data-databricks-notebook.md)/[HDInsight](transform-data-using-hadoop-hive.md), [Web activity](control-flow-web-activity.md), and so on.
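
For instance, here's a minimal sketch of a linked service payload for the generic REST connector. The linked service name and URL are hypothetical placeholders, and your endpoint's authentication settings will likely differ:

```json
{
    "name": "GenericRestLinkedService",
    "properties": {
        "type": "RestService",
        "typeProperties": {
            "url": "https://<your-api-endpoint>",
            "enableServerCertificateValidation": true,
            "authenticationType": "Anonymous"
        }
    }
}
```
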
In Azure Data Factory and Synapse pipelines, you can use the Copy activity to copy data among data stores located on-premises and in the cloud. After you copy the data, you can use other activities to further transform and analyze it. You can also use the Copy activity to publish transformation and analysis results for business intelligence (BI) and application consumption.

The Copy activity is executed on an [integration runtime](concepts-integration-runtime.md).

An integration runtime needs to be associated with each source and sink data store. For information about how the Copy activity determines which integration runtime to use, see [Determining which IR to use](concepts-integration-runtime.md#determining-which-ir-to-use).
> [!NOTE]
> You can't use more than one self-hosted integration runtime within the same Copy activity. The source and sink for the activity must be connected with the same self-hosted integration runtime.

To copy data from a source to a sink, the service that runs the Copy activity performs these steps:

1. Reads data from the source data store.
2. Performs serialization/deserialization, compression/decompression, column mapping, and so on, based on the configuration of the input dataset, output dataset, and Copy activity.
3. Writes data to the sink/destination data store.
## Supported regions
The service that enables the Copy activity is available globally in the regions and geographies listed in [Azure integration runtime locations](concepts-integration-runtime.md#integration-runtime-location). The globally available topology ensures efficient data movement that usually avoids cross-region hops. See [Products by region](https://azure.microsoft.com/regions/#services) to check the availability of Data Factory, Synapse Workspaces, and data movement in a specific region.
## Configuration
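
You define a Copy activity in a pipeline with an input dataset, an output dataset, and store-specific source and sink settings. The following is a minimal sketch rather than a full reference; the activity name, dataset names, and source/sink types are placeholders to adapt to your stores:

```json
{
    "name": "<CopyActivityName>",
    "type": "Copy",
    "inputs": [ { "referenceName": "<source dataset name>", "type": "DatasetReference" } ],
    "outputs": [ { "referenceName": "<sink dataset name>", "type": "DatasetReference" } ],
    "typeProperties": {
        "source": { "type": "<source type>" },
        "sink": { "type": "<sink type>" }
    }
}
```
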
## Resume from last failed run
Copy activity supports resume from the last failed run when you copy large files as-is in binary format between file-based stores and choose to preserve the folder/file hierarchy from source to sink, for example, when migrating data from Amazon S3 to Azure Data Lake Storage Gen2. Resume applies to the following file-based connectors: [Amazon S3](connector-amazon-simple-storage-service.md), [Amazon S3 Compatible Storage](connector-amazon-s3-compatible-storage.md), [Azure Blob](connector-azure-blob-storage.md), [Azure Data Lake Storage Gen1](connector-azure-data-lake-store.md), [Azure Data Lake Storage Gen2](connector-azure-data-lake-storage.md), [Azure Files](connector-azure-file-storage.md), [File System](connector-file-system.md), [FTP](connector-ftp.md), [Google Cloud Storage](connector-google-cloud-storage.md), [HDFS](connector-hdfs.md), [Oracle Cloud Storage](connector-oracle-cloud-storage.md), and [SFTP](connector-sftp.md).

You can use copy activity resume in the following two ways:

- **Activity-level retry:** You can set a retry count on the copy activity (see the sketch at the end of this section). During pipeline execution, if this copy activity run fails, the next automatic retry starts from the last trial's failure point.
- **Rerun from failed activity:** After the pipeline execution completes, you can also trigger a rerun from the failed activity in the ADF UI monitoring view or programmatically. If the failed activity is a copy activity, the pipeline not only reruns from this activity, but also resumes from the previous run's failure point.

A few points to note:

- Resume happens at the file level. If the copy activity fails while copying a file, that specific file is recopied in the next run.
- For resume to work properly, don't change the copy activity settings between reruns.
- When you copy data from Amazon S3, Azure Blob, Azure Data Lake Storage Gen2, and Google Cloud Storage, the copy activity can resume from an arbitrary number of copied files. For the rest of the file-based connectors as source, the copy activity currently supports resuming from a limited number of files, usually in the range of tens of thousands, varying with the length of the file paths; files beyond this number are recopied during reruns.

For scenarios other than binary file copy, copy activity rerun starts from the beginning.
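
As a sketch, activity-level retry is configured through the activity's `policy`; the retry count and interval below are arbitrary example values, and the binary source/sink types are placeholders for your stores:

```json
{
    "name": "CopyLargeBinaryFiles",
    "type": "Copy",
    "policy": {
        "retry": 3,
        "retryIntervalInSeconds": 60
    },
    "typeProperties": {
        "source": { "type": "BinarySource" },
        "sink": { "type": "BinarySink" }
    }
}
```
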
## Add metadata tags to file-based sink
When the sink is Azure Storage based (Azure Data Lake Storage or Azure Blob Storage), you can opt to add metadata to the files. The metadata appears as part of the file properties as key-value pairs.

For all types of file-based sinks, you can add metadata involving dynamic content by using pipeline parameters, system variables, functions, and variables.

In addition, for a binary file-based sink, you can add the Last Modified datetime of the source file by using the keyword `$$LASTMODIFIED`, as well as custom values, as metadata on the sink file.
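
As a sketch, the metadata entries sit on the copy activity sink. The key names below are arbitrary examples, and the pipeline parameter in the dynamic-content expression is an assumption about your pipeline:

```json
"sink": {
    "type": "BinarySink",
    "metadata": [
        { "name": "sourceLastModified", "value": "$$LASTMODIFIED" },
        { "name": "department", "value": "@pipeline().parameters.department" }
    ]
}
```
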
## Schema and data type mapping

See [Schema and data type mapping](copy-activity-schema-and-type-mapping.md) for information about how the Copy activity maps your source data to your sink.

## Auto create sink tables
When you copy data into SQL database or Azure Synapse Analytics, if the destination table doesn't exist, the copy activity can create it automatically based on the source data. This aims to help you quickly get started loading the data and evaluating SQL database or Azure Synapse Analytics. After the data is ingested, you can review and adjust the sink table schema to your needs.

This feature is supported when copying data from any source into the following sink data stores. You can find the option on the *ADF authoring UI* -> *Copy activity sink* -> *Table option* -> *Auto create table*, or via the `tableOption` property in the copy activity sink payload.
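
As a sketch of the payload route, with Azure SQL as one example of a supported sink:

```json
"sink": {
    "type": "AzureSqlSink",
    "tableOption": "autoCreate"
}
```
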
## Data consistency verification
When you move data from a source to a destination store, the copy activity provides an option for extra data consistency verification, to ensure the data isn't only copied successfully, but also verified to be consistent between the source and destination stores. When inconsistent files are found during the data movement, you can either abort the copy activity or continue copying the rest by enabling the fault tolerance setting to skip inconsistent files. You can get the skipped file names by enabling the session log setting in the copy activity. See [Data consistency verification in copy activity](copy-activity-data-consistency.md) for details.
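
As a sketch, verification and the skip-inconsistent-files behavior are enabled in the copy activity's `typeProperties`; the binary source/sink types are placeholders for your stores:

```json
"typeProperties": {
    "source": { "type": "BinarySource" },
    "sink": { "type": "BinarySink" },
    "validateDataConsistency": true,
    "skipErrorFile": {
        "dataInconsistency": true
    }
}
```
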
## Session log
You can log your copied file names, which can help you to further ensure the data isn't only copied successfully from the source to the destination store, but also consistent between the two, by reviewing the copy activity session logs. See [Session log in copy activity](copy-activity-log.md) for details.
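
As a sketch, session logging is enabled through `logSettings` in the copy activity's `typeProperties`; the linked service name and folder path for storing the log are hypothetical placeholders:

```json
"typeProperties": {
    "source": { "type": "BinarySource" },
    "sink": { "type": "BinarySink" },
    "logSettings": {
        "enableCopyActivityLog": true,
        "copyActivityLogSettings": {
            "logLevel": "Warning",
            "enableReliableLogging": false
        },
        "logLocationSettings": {
            "linkedServiceName": { "referenceName": "<log storage linked service>", "type": "LinkedServiceReference" },
            "path": "sessionlog/"
        }
    }
}
```
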
## Related content
See the quickstarts, tutorials, and samples for the Copy activity in the Azure Data Factory and Azure Synapse Analytics documentation.