What is Azure Data Factory?
Azure Data Factory is a cloud-based data integration service that allows you to create data-driven workflows for orchestrating and automating data movement and data transformation.
Azure Data Factory (ADF) does not store any data itself. Instead, it lets you build workflows that orchestrate the movement of data between supported data stores and then process that data using compute services in other regions or in an on-premises environment. You can monitor and manage these workflows both programmatically and through the UI.
Ø Azure Data Factory use cases
ADF can be used for:
· Supporting data migrations
· Moving data from a client’s server or from online sources into an Azure Data Lake
· Carrying out various data integration processes
· Integrating data from different ERP systems and loading it into Azure Synapse for reporting
Ø How does Azure Data Factory work?
The Data Factory service allows you to create data pipelines that move and transform data and then run those pipelines on a specified schedule (hourly, daily, weekly, etc.). As a result, the data consumed and produced by workflows is time-sliced: each run handles the data for one time window. A pipeline can run either on a schedule (for example, once a day) or as a one-time run.
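To make the idea of time-sliced data more concrete, here is a minimal sketch in plain Python. It is not Azure Data Factory code: the `time_slices` helper and the `FREQUENCIES` table are hypothetical names introduced only for illustration. It shows how a schedule splits a date range into the windows that each pipeline run would be responsible for.

```python
from datetime import datetime, timedelta

# Hypothetical mapping from a schedule name to a window length.
# These names are illustrative and are not ADF settings.
FREQUENCIES = {
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
}


def time_slices(start, end, frequency):
    """Yield (slice_start, slice_end) windows covering [start, end)."""
    step = FREQUENCIES[frequency]
    current = start
    while current < end:
        # The last window is clipped so it never runs past `end`.
        yield current, min(current + step, end)
        current += step


# A daily schedule over three days produces three slices,
# one per scheduled run.
slices = list(time_slices(datetime(2024, 1, 1), datetime(2024, 1, 4), "daily"))
for slice_start, slice_end in slices:
    print(slice_start.isoformat(), "->", slice_end.isoformat())
```

A one-time run corresponds to a single window, while a scheduled pipeline processes one window per run.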
Azure Data Factory pipelines (data-driven workflows) typically perform three steps.
1: Connect and Collect
Connect to all the required sources of data and processing, such as SaaS services, file shares, FTP, and web services. Then use the Copy Activity in a data pipeline to move data from both on-premises and cloud source data stores to a centralized data store in the cloud for further analysis.
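As a rough illustration of what a Copy Activity looks like, the sketch below builds a pipeline definition as a Python dictionary, loosely following the shape of ADF's JSON pipeline format. The `copy_activity` helper, the pipeline name, and the dataset names are all hypothetical, and the source and sink type strings are assumptions that should be checked against the connector documentation before use.

```python
import json


def copy_activity(name, source_dataset, sink_dataset, source_type, sink_type):
    """Build a dict shaped like a Copy activity in a pipeline definition.

    Hypothetical helper for illustration; it only assembles a dictionary
    and does not call any Azure service.
    """
    return {
        "name": name,
        "type": "Copy",
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
        "typeProperties": {
            "source": {"type": source_type},
            "sink": {"type": sink_type},
        },
    }


# Dataset names and type strings below are placeholders (assumptions).
pipeline = {
    "name": "CollectSalesData",
    "properties": {
        "activities": [
            copy_activity(
                name="CopyOnPremSalesToLake",
                source_dataset="OnPremSalesTable",
                sink_dataset="LakeSalesFiles",
                source_type="SqlServerSource",
                sink_type="DelimitedTextSink",
            )
        ]
    },
}

print(json.dumps(pipeline, indent=2))
```

The point of the sketch is the structure: one activity that names an input dataset, an output dataset, and how to read from the source and write to the sink.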
2: Transform and Enrich
Once data is present in a centralized data store in the cloud, it is transformed using compute services such as HDInsight Hadoop, Spark, Azure Data Lake Analytics, and Machine Learning.
3: Publish
Deliver the transformed data from the cloud to on-premises data stores such as SQL Server, or keep it in your cloud storage for consumption by BI and analytics tools and other applications.
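The three steps above form a chain in which each step depends on the one before it. The following sketch shows one way to think about that orchestration in plain Python. It assumes Python 3.9 or later for `graphlib`; the activity names and the `depends_on` field are hypothetical and are not ADF's own schema.

```python
from graphlib import TopologicalSorter

# Hypothetical activities, deliberately listed out of order to show
# that the run order comes from the dependencies, not the list order.
activities = [
    {"name": "PublishSales", "depends_on": ["TransformSales"]},
    {"name": "CollectSales", "depends_on": []},
    {"name": "TransformSales", "depends_on": ["CollectSales"]},
]


def run_order(activities):
    """Return activity names in an order that respects dependencies."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {a["name"]: set(a["depends_on"]) for a in activities}
    return list(TopologicalSorter(graph).static_order())


order = run_order(activities)
print(order)  # ['CollectSales', 'TransformSales', 'PublishSales']
```

Because the chain is linear, there is only one valid order: collect first, then transform, then publish.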
Ø Azure Data Factory pricing
With Data Factory, you pay only for what you use. Pricing for data pipelines is calculated based on:
· Pipeline orchestration and execution;
· Data flow execution and debugging;
· Number of Data Factory operations, such as creating pipelines and monitoring pipelines.
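As a back-of-the-envelope illustration of how these three components add up, here is a minimal sketch. Every rate in it is a made-up placeholder and not an actual Azure price, and the function name, parameters, and billing units are assumptions chosen for the example; real rates vary by region and over time.

```python
def estimate_monthly_cost(
    activity_runs,
    data_flow_vcore_hours,
    operations,
    rate_per_1000_runs=1.00,          # placeholder, not a real price
    rate_per_vcore_hour=0.25,         # placeholder, not a real price
    rate_per_50000_operations=0.50,   # placeholder, not a real price
):
    """Sum the three pricing components listed above.

    Hypothetical helper: units and rates are illustrative only.
    """
    orchestration = activity_runs / 1000 * rate_per_1000_runs
    data_flows = data_flow_vcore_hours * rate_per_vcore_hour
    factory_operations = operations / 50000 * rate_per_50000_operations
    return orchestration + data_flows + factory_operations


# 2,000 activity runs, 10 vCore-hours of data flows, 100,000 operations.
total = estimate_monthly_cost(
    activity_runs=2000,
    data_flow_vcore_hours=10,
    operations=100000,
)
print(f"Estimated cost with placeholder rates: {total:.2f}")
```

Swapping in the current published rates for your region turns this into a rough estimate of a monthly bill.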