What is Azure Data Factory and why does it exist? To understand the justification behind Azure Data Factory, you must first understand that data is a constant in business: the longer a business is around, the more data it will collect, store and have available for its use. Therein lies the problem. Large amounts of data, both structured and unstructured, can be unwieldy and hard to use, which makes data a critical but challenging commodity for many organizations.
Azure Data Factory Can Help Azure Cloud Users
Azure Data Factory (ADF) is a service that lets companies transform raw big data from relational, non-relational and other storage systems and integrate it into data-driven workflows. Those workflows help companies map strategies, attain goals and drive business value from the data they possess. By putting such data into usable context, ADF can provide meaningful insights to analysts, data scientists and business decision makers.
Azure Data Factory is a managed, cloud-based data integration service that's built for complex hybrid extract-transform-load (ETL), extract-load-transform (ELT) and data integration projects. The service uses a drag and drop interface for ease of use.
"With visual tools, you can iteratively build, debug, deploy, operationalize and monitor your big data pipelines," Gaurav Malhotra, senior program manager for Microsoft's Azure Data Factory, wrote in an April 9 post on the Azure Blog. "Now, you can follow industry-leading best practices to do continuous integration and deployment for your ETL/ELT workflows to multiple environments (Dev, Test, PROD, etc.). Essentially, you can incorporate the practice of testing for your codebase changes and push the tested changes to a Test or Prod environment automatically."
Using Azure Data Factory, companies can create and schedule data-driven workflows, which are called pipelines, that can ingest data from disparate data stores. ADF can process and transform the data by using compute services such as Azure HDInsight Hadoop, Spark, Azure Data Lake Analytics and Azure Machine Learning.
Users can also publish output data to data stores such as Azure SQL Data Warehouse for use by business intelligence (BI) applications. The idea is that, by using Azure Data Factory, users can organize raw data into meaningful data stores and data lakes that support better business decisions.
An example of this might involve a gaming company that collects petabytes of game logs that are produced by games in the cloud. The company wants to analyze the logs to gain insights into customer preferences, demographics and usage behavior, as well as identify up-sell and cross-sell opportunities, develop compelling new features, drive business growth and provide improved customer experiences.
To analyze the logs, the company can use reference data such as customer information, game information and marketing campaign information that is in an on-premises data store. The company can combine it with additional log data that it has in a cloud data store. To extract insights, it can process the joined data by using a Spark cluster in the cloud (Azure HDInsight) and publish the transformed data into a cloud data warehouse such as Azure SQL Data Warehouse to build a report on top of it. The workflow can be automated so that it runs when files land in a blob store container and can be monitored and managed each day.
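To make the shape of that workflow concrete, here is a minimal sketch in plain Python. It does not use the Azure Data Factory SDK or its pipeline format; the `Activity` class, the activity names, the `gamelogs/` prefix and the `on_blob_created` function are all hypothetical and chosen only for illustration. It models the gaming example as activities with dependencies, works out a valid run order, and starts a run only when a new file lands under a watched container prefix.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Activity:
    """One step in a pipeline (hypothetical model, not an ADF type)."""
    name: str
    depends_on: list = field(default_factory=list)


# The gaming example expressed as four steps. All names are illustrative.
GAME_LOG_PIPELINE = [
    Activity("copy_onprem_reference_data"),
    Activity("copy_cloud_game_logs"),
    Activity(
        "join_and_process_on_spark",
        depends_on=["copy_onprem_reference_data", "copy_cloud_game_logs"],
    ),
    Activity("publish_to_data_warehouse", depends_on=["join_and_process_on_spark"]),
]


def execution_order(activities):
    """Return activity names in an order that respects every dependency."""
    graph = {a.name: set(a.depends_on) for a in activities}
    return list(TopologicalSorter(graph).static_order())


def on_blob_created(blob_path, watched_prefix="gamelogs/"):
    """Simulate an event trigger: plan a run only for files under the prefix.

    Returns the planned run order, or an empty list if the file is ignored.
    """
    if not blob_path.startswith(watched_prefix):
        return []
    return execution_order(GAME_LOG_PIPELINE)


if __name__ == "__main__":
    for step in on_blob_created("gamelogs/2024-01-01/session.json"):
        print("run:", step)
```

The two copy steps have no dependency on each other, so either may run first. The only guarantees are that the Spark step waits for both copies and that publishing happens last.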
Pricing for Azure Data Factory services is based on usage and has no setup, upfront or termination costs. It is based on the number of activities that are run, the volume of data that is moved, the number of compute hours required for SQL Server Integration Services (SSIS) and whether a pipeline is active or not.
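As a rough illustration of how those four factors add up, the sketch below computes a usage-based estimate in Python. The rates are placeholders invented for the example and are not Microsoft's prices, and the `Rates`, `Usage` and `estimate_cost` names are hypothetical. Treating "whether a pipeline is active" as a flat charge per active pipeline is also an assumption made to keep the arithmetic simple.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Unit prices. Every value used below is a made-up placeholder."""
    per_activity_run: float
    per_gb_moved: float
    per_ssis_compute_hour: float
    per_active_pipeline: float


@dataclass(frozen=True)
class Usage:
    """Measured usage for one billing period."""
    activity_runs: int
    gb_moved: float
    ssis_compute_hours: float
    active_pipelines: int


def estimate_cost(usage: Usage, rates: Rates) -> float:
    """Sum the four usage-based components named in the text."""
    return (
        usage.activity_runs * rates.per_activity_run
        + usage.gb_moved * rates.per_gb_moved
        + usage.ssis_compute_hours * rates.per_ssis_compute_hour
        + usage.active_pipelines * rates.per_active_pipeline
    )


if __name__ == "__main__":
    # Placeholder rates, NOT real Azure pricing.
    rates = Rates(
        per_activity_run=1.0,
        per_gb_moved=2.0,
        per_ssis_compute_hour=3.0,
        per_active_pipeline=4.0,
    )
    usage = Usage(activity_runs=10, gb_moved=5, ssis_compute_hours=2, active_pipelines=1)
    print(estimate_cost(usage, rates))  # 10*1 + 5*2 + 2*3 + 1*4 = 30.0
```

Because there are no setup, upfront or termination costs, a period with no usage and no active pipelines comes out to zero in this model.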
Although Azure Data Factory simplifies the processing of large data volumes, it does not make Azure management, Azure migration or Azure monitoring easier. To simplify Azure management, try 5nine Cloud Manager. 5nine Cloud Manager simplifies and unifies Azure management by providing a familiar, easy-to-use interface that evokes the private cloud experience and keeps your teams at their productive peak.
With 5nine Cloud Manager, you can:
- Easily connect to multiple existing Azure subscriptions
- Migrate VMs from a private cloud to Azure
- Back up workloads to the public cloud
- Create and manage Azure VMs from a single console
- Directly connect to Azure VMs through a console similar to the one used for a private cloud
- Deallocate Azure VMs, so you are not paying for unused resources
- Access dozens of other features that you would otherwise find only in the Microsoft Azure Portal, from the same application that manages your on-premises data center