Specialized Data Factory Processing
Takeaway
For highly specific use cases requiring advanced customization, enhanced security, or improved performance, the Product-Live Data Factory platform can offload certain task processing to an external infrastructure. Common scenarios include:
- Connecting to a private system that is not publicly accessible
- Processing highly sensitive data that must remain secure
- Performing resource-intensive tasks requiring specialized hardware
- Executing tasks not natively supported by the Data Factory platform
Overview
The Data Factory platform can delegate specific task processing to an external infrastructure. In such cases, the platform handles task orchestration within a job, while the execution and result submission of custom tasks are managed by the task creator.
Custom Task
A custom task refers to a task not natively supported by the Data Factory platform. The task creator is responsible for executing the task and returning the result to the platform.
Example
As an example, consider a job made up of two tasks: the first is a native task (e.g., exporting data from a specific table), and the second is a custom task (e.g., integrating the exported data into a specific system). The platform executes the first task itself, then waits for the external application to execute the second and return its result.
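The split of responsibilities in this two-task job can be sketched in TypeScript as follows. This is only an illustrative model: the `JobTask` type, its fields, and the task names are hypothetical and do not reflect the platform's actual schema.

```typescript
// Hypothetical model; field names are illustrative, not the platform's schema.
type TaskKind = "native" | "custom";

interface JobTask {
  name: string;
  kind: TaskKind;
}

// The two-task job from the example: a native export followed by a custom integration.
const job: JobTask[] = [
  { name: "export-table", kind: "native" },
  { name: "integrate-into-target-system", kind: "custom" },
];

// Returns who executes a given task. The platform orchestrates every task,
// but only executes the native ones itself.
function executor(task: JobTask): "platform" | "external-application" {
  return task.kind === "native" ? "platform" : "external-application";
}

const plan: string[] = job.map((t) => `${t.name} -> ${executor(t)}`);
console.log(plan.join("\n"));
```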
Requirements
To use the specialized processing feature, the following prerequisites must be met:
- Access to the Data Factory platform, a pipeline, and a valid token for platform interaction.
- The ability to develop and maintain the application that performs the custom task (an example is provided below), and to run that application in a production environment (e.g., on your own servers or with a cloud provider such as Azure or AWS).
Implementation
Interoperability between your applications and the Product-Live Data Factory platform is achieved through APIs. The required interactions include:
- Retrieving tasks for processing by your application
- Submitting the processed results back to the platform
You can use any programming language for these operations: either generate an SDK from the OpenAPI definition of our API, or use our Node.js/TypeScript SDK.
A sample implementation using the NestJS framework and TypeScript is available in the Product-Live/data-factory-task-example repository.
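The two interactions listed above (retrieving tasks and submitting results) might be structured as a single polling cycle like the one sketched below. Everything here is an assumption made for illustration: the `TaskApi` interface, its method names, the payload shapes, and the status values are hypothetical, and the real contract is defined by the platform's OpenAPI definition. An in-memory stand-in replaces the API so that the sketch runs without network access.

```typescript
// Hypothetical shapes: the real payloads are defined by the platform's OpenAPI definition.
interface PendingTask {
  id: string;
  input: Record<string, unknown>;
}

interface TaskResult {
  taskId: string;
  status: "COMPLETED" | "FAILED";
  output?: Record<string, unknown>;
  error?: string;
}

// Hypothetical client interface; back it with a generated SDK or plain HTTP calls.
interface TaskApi {
  fetchPendingTasks(): Promise<PendingTask[]>;
  submitResult(result: TaskResult): Promise<void>;
}

type TaskHandler = (input: Record<string, unknown>) => Promise<Record<string, unknown>>;

// One polling cycle: retrieve tasks, run the handler on each one,
// and report either the output or the failure back through the API.
async function processOnce(api: TaskApi, handler: TaskHandler): Promise<number> {
  const tasks = await api.fetchPendingTasks();
  for (const task of tasks) {
    try {
      const output = await handler(task.input);
      await api.submitResult({ taskId: task.id, status: "COMPLETED", output });
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      await api.submitResult({ taskId: task.id, status: "FAILED", error });
    }
  }
  return tasks.length;
}

// In-memory stand-in for the API, used only to exercise the sketch locally.
class InMemoryTaskApi implements TaskApi {
  readonly results: TaskResult[] = [];
  private queue: PendingTask[];

  constructor(queue: PendingTask[]) {
    this.queue = queue;
  }

  async fetchPendingTasks(): Promise<PendingTask[]> {
    const batch = this.queue;
    this.queue = [];
    return batch;
  }

  async submitResult(result: TaskResult): Promise<void> {
    this.results.push(result);
  }
}
```

In a real deployment, `processOnce` would typically be called on a timer or in a loop, with the token mentioned in the requirements supplied to whichever client implements `TaskApi`. Reporting failures as results, rather than letting exceptions escape, keeps one bad task from blocking the rest of the batch.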