What is Azure Data Factory?

What is Azure Data Factory?

Azure Data Factory is a cloud-based data integration service that allows you to create data-driven workflows in the cloud for orchestrating and automating data movement and data transformation.

Azure Data Factory (ADF) does not store any data itself. It allows you to create data-driven workflows to orchestrate the movement of data between supported data stores and then process the data using compute services in other regions or in an on-premise environment. It also allows you to monitor and manage workflows using both programmatic and UI mechanisms.

 

Ø Azure Data Factory use cases

ADF can be used for:

· Supporting data migrations

· Getting data from a client’s server or online data to an Azure Data Lake

· Carrying out various data integration processes

· Integrating data from different ERP systems and loading it into Azure Synapse for reporting

 

 

 

 

Ø How does Azure Data Factory work?

The Data Factory service allows you to create data pipelines that move and transform data and then run the pipelines on a specified schedule (hourly, daily, weekly, etc.). This means the data that is consumed and produced by workflows is time-sliced data, and we can specify the pipeline mode as scheduled (once a day) or one time.

Azure Data Factory pipelines (data-driven workflows) typically perform three steps.

1: Connect and Collect

Connect to all the required sources of data and processing such as SaaS services, file shares, FTP, and web services. Then,  move the data as needed to a centralized location for subsequent processing by using the Copy Activity in a data pipeline to move data from both on-premise and cloud source data stores to a centralization data store in the cloud for further analysis.

2: Transform and Enrich

Once data is present in a centralized data store in the cloud, it is transformed using compute services such as HDInsight Hadoop, Spark, Azure Data Lake Analytics, and Machine Learning.

 

3: Publish

Deliver transformed data from the cloud to on-premise sources like SQL Server or keep it in your cloud storage sources for consumption by BI and analytics tools and other applications.

 

Ø Azure Data Factory pricing

With Data Factory, you pay only for what you need. In fact, pricing for data pipeline is calculated based on:

· Pipeline orchestration and execution;

· Data flow execution and debugging;

· Number of Data Factory operations such as create pipelines and pipeline monitoring.

 

0 replies

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply

Your email address will not be published. Required fields are marked *