Workflows
Workflows in ExtremeXP define the structure and execution flow of computational tasks. A workflow consists of task definitions, data flow connections, and execution order specifications. Workflows can be inherited and specialized to create variants with different implementations while maintaining the same structure.
The workflow DSL provides a structured way to:
- Define task execution order and dependencies
- Configure task implementations and external dependencies
- Set up data flow between tasks and external files
- Create workflow variants through inheritance
- Specify input and output data paths
Key Concepts
- Linear Execution: Tasks execute in the order defined by the flow specification
- Task Implementation: Each task references a specific implementation class
- Dependency Management: Tasks can specify external file dependencies using glob patterns
- Data Flow: Explicit connections between task inputs/outputs and workflow data
- Workflow Inheritance: The "from" keyword allows workflows to inherit structure and behavior from base workflows
- Task Override: Derived workflows can override specific task implementations while maintaining the same execution flow
- Implementation Variants: Different implementations of the same task can be swapped to create workflow variants for testing or different use cases
- Flexible Data Paths: Input and output data can be configured to use different file paths
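As an illustration of the dependency-management concept above, the following Python sketch shows how a glob pattern selects a subset of files. The file names and the pattern are hypothetical, and the snippet does not call any ExtremeXP API; it only demonstrates how glob matching behaves.

```python
from fnmatch import fnmatchcase

# Hypothetical project files; these names are illustrative assumptions.
project_files = [
    "demo_tasks/DemoWP5Task1/task.py",
    "demo_tasks/DemoWP5Task1/helpers.py",
    "demo_tasks/DemoWP5Task1/README.md",
    "demo_tasks/DemoWP5Task2/task.py",
]

def select_dependencies(pattern, files):
    """Return the files whose path matches the glob pattern, in input order."""
    return [f for f in files if fnmatchcase(f, pattern)]

# Only the Python files inside the Task1 folder match this pattern.
matched = select_dependencies("demo_tasks/DemoWP5Task1/*.py", project_files)
print(matched)
```

A pattern therefore lets a task depend on a whole group of files without listing each one by name.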
Basic Workflow Structure
workflow DemoWP5Workflow {
    START -> Task1 -> Task2 -> END;
    task Task1 {
        implementation "demo_tasks/DemoWP5Task1";
    }
    task Task2;
    define input data InputFile;
    define output data OutputFile;
    configure data InputFile {
        path "demo_datasets/titanic.json";
    }
    configure data OutputFile {
        path "output/test_local/titanic_once_more.json";
    }
    InputFile --> Task1.DemoWP5Task1InputFile;
    Task1.DemoWP5Task1OutputFile --> Task2.DemoWP5Task2InputFile;
    Task2.DemoWP5Task2OutputFile --> OutputFile;
}
Advanced Workflow Features
Workflow Inheritance and Specialization
workflow DemoWP5AssembledWorkflow1 from DemoWP5Workflow {
    task Task2 {
        implementation "demo_tasks/DemoWP5Task2V1";
    }
}
workflow DemoWP5AssembledWorkflow2 from DemoWP5Workflow {
    task Task2 {
        implementation "demo_tasks/DemoWP5Task2V2";
    }
}
Task-Only Workflows
workflow UserInteraction {
    START -> Task1 -> Task2 -> Task3 -> END;
    task Task1 {
        implementation "Task1";
    }
    task Task2 {
        implementation "Task2";
    }
    task Task3 {
        implementation "Task3";
    }
}
workflow AssembledUserInteraction from UserInteraction {
    // Inherits all tasks without modifications
}
Workflow Structure Breakdown
1. Workflow Declaration
workflow DemoWP5Workflow {
Declares a workflow named DemoWP5Workflow. All workflow components are enclosed within the curly braces.
2. Task Execution Flow
Linear Execution
START -> Task1 -> Task2 -> END;
The workflow starts at START, executes Task1, then Task2, and ends at END. This creates a linear execution pipeline.
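To show how such a flow line maps to an execution order, here is a minimal Python sketch that splits the specification into an ordered list of task names. The function parse_linear_flow is hypothetical and is not the ExtremeXP parser; the requirement that a flow begins with START and finishes with END is an assumption taken from the example above.

```python
def parse_linear_flow(spec):
    """Split 'START -> A -> B -> END;' into the task names between START and END."""
    # Drop the trailing semicolon, then split on the arrow and trim whitespace.
    nodes = [part.strip() for part in spec.strip().rstrip(";").split("->")]
    # Assumption of this sketch: every linear flow is delimited by START and END.
    if len(nodes) < 2 or nodes[0] != "START" or nodes[-1] != "END":
        raise ValueError("a linear flow must begin with START and finish with END")
    return nodes[1:-1]

order = parse_linear_flow("START -> Task1 -> Task2 -> END;")
print(order)
```

The resulting list is the order in which the tasks would run in this simplified model.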
Conditional Execution
Task1 ?-> "check_task1_result" ? Task2 : Task3 -> Task4;
- check_task1_result is a condition specified as a Python function in the module linked by the PYTHON_CONDITIONS option in config.py
- Task2 is the task to be executed if the condition returns true
- Task3 is the task to be executed if the condition returns false
- Task4 is executed after either Task2 or Task3
This creates a branching execution flow based on dynamic conditions evaluated at runtime.
Conditional Task Execution
Conditional execution allows workflows to adapt their behavior based on runtime conditions, enabling dynamic workflow paths and decision-making logic within the execution flow.
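The branching described above can be sketched in Python. The source states only that the condition is a Python function; the argument it receives is not documented here, so the results dictionary, the "accuracy" key, and the 0.8 threshold below are all assumptions made for illustration, and the helper functions are hypothetical.

```python
# Assumption: a condition receives some representation of the preceding task's
# results and returns a bool. The signature ExtremeXP actually expects may differ.
def check_task1_result(results):
    """Hypothetical condition: true when Task1 reported an accuracy of at least 0.8."""
    return results.get("accuracy", 0.0) >= 0.8

# Lookup table standing in for the module that holds the condition functions.
CONDITIONS = {"check_task1_result": check_task1_result}

def next_task(condition_name, results, if_true, if_false):
    """Pick a branch the way 'condition ? if_true : if_false' reads in the flow line."""
    return if_true if CONDITIONS[condition_name](results) else if_false

def run_path(results):
    """Trace the tasks visited by: Task1 ?-> "check_task1_result" ? Task2 : Task3 -> Task4."""
    branch = next_task("check_task1_result", results, "Task2", "Task3")
    return ["Task1", branch, "Task4"]

print(run_path({"accuracy": 0.9}))
print(run_path({"accuracy": 0.5}))
```

Whichever branch is taken, Task4 closes the path, which matches the flow line in the example.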
3. Task Definitions
Basic Task Definition
task Task1 {
    implementation "demo_tasks/DemoWP5Task1";
}
task Task2;
- Task1 is defined with a specific implementation class
- Task2 is declared without an explicit implementation (uses default or can be overridden in derived workflows)
4. Data Definitions
define input data InputFile;
define output data OutputFile;
- InputFile: data that enters the workflow
- OutputFile: data that exits the workflow
5. Data Configuration
configure data InputFile {
    path "demo_datasets/titanic.json";
}
configure data OutputFile {
    path "output/test_local/titanic_once_more.json";
}
- InputFile reads from the specified path
- OutputFile writes to the specified path
Note
For more examples of data configuration, see the GitHub examples.
6. Data Flow Connections
InputFile --> Task1.DemoWP5Task1InputFile;
Task1.DemoWP5Task1OutputFile --> Task2.DemoWP5Task2InputFile;
Task2.DemoWP5Task2OutputFile --> OutputFile;
- Workflow input connects to Task1's input port
- Task1's output connects to Task2's input port
- Task2's output connects to the workflow output
This creates a complete data processing pipeline where data flows from the input file through both tasks to the output file.
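The pipeline formed by these connections can be checked with a small Python sketch that treats each connection as a (source, target) pair and follows them from the workflow input to the workflow output. The helper functions owner and trace are hypothetical, and the sketch assumes exactly one connection leaves each step, as in the example.

```python
# The three connections from the example, written as (source, target) pairs.
connections = [
    ("InputFile", "Task1.DemoWP5Task1InputFile"),
    ("Task1.DemoWP5Task1OutputFile", "Task2.DemoWP5Task2InputFile"),
    ("Task2.DemoWP5Task2OutputFile", "OutputFile"),
]

def owner(endpoint):
    """Return the task that owns a port, or the data name for workflow-level data."""
    return endpoint.split(".", 1)[0]

def trace(start, end, edges):
    """Follow connections from start until end is reached; return the visited owners."""
    path = [start]
    current = start
    while current != end:
        targets = [owner(dst) for src, dst in edges if owner(src) == current]
        # Assumption of this sketch: the pipeline is linear, one edge per step.
        if len(targets) != 1:
            raise ValueError(f"expected exactly one connection leaving {current}")
        current = targets[0]
        if current in path:
            raise ValueError("cycle detected in data flow")
        path.append(current)
    return path

print(trace("InputFile", "OutputFile", connections))
```

If a connection were missing, the trace would raise an error instead of reaching OutputFile, which is a quick way to spot a broken pipeline in this simplified model.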
Workflow Inheritance and Specialization
7. Basic Inheritance
workflow DemoWP5AssembledWorkflow1 from DemoWP5Workflow {
    task Task2 {
        implementation "demo_tasks/DemoWP5Task2V1";
    }
}
This workflow inherits from DemoWP5Workflow and overrides:
- Task2 implementation is changed to DemoWP5Task2V1
- All other aspects (data flow, configurations) remain the same as the base workflow
8. Multiple Variants
workflow DemoWP5AssembledWorkflow2 from DemoWP5Workflow {
    task Task2 {
        implementation "demo_tasks/DemoWP5Task2V2";
    }
}
This workflow also inherits from DemoWP5Workflow and overrides:
- Task2 implementation is changed to DemoWP5Task2V2
- This demonstrates how multiple variants can be created from the same base workflow
9. Empty Inheritance
workflow AssembledUserInteraction from UserInteraction {
    // Inherits all tasks and configurations without modifications
}
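The override behavior in the three inheritance cases can be modeled with a short Python sketch: a derived workflow starts from a copy of the base task table and replaces only the entries it redefines. The dictionary layout and the derive function are assumptions made for illustration and do not reflect how ExtremeXP stores workflows internally.

```python
# Task table of the base workflow: task name -> implementation path.
base_tasks = {
    "Task1": "demo_tasks/DemoWP5Task1",
    "Task2": None,  # declared without an explicit implementation
}

def derive(base, overrides=None):
    """Return a new task table: inherited entries with any overrides applied on top."""
    return {**base, **(overrides or {})}

# Two variants that differ only in the implementation of Task2.
variant1 = derive(base_tasks, {"Task2": "demo_tasks/DemoWP5Task2V1"})
variant2 = derive(base_tasks, {"Task2": "demo_tasks/DemoWP5Task2V2"})
# Empty inheritance: everything is taken over unchanged.
unchanged = derive(base_tasks)

print(variant1)
print(variant2)
print(unchanged)
```

Because derive returns a new dictionary, the base table is never modified, which mirrors how several variants can coexist alongside the same base workflow.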
Workflow Variants
Workflow inheritance is particularly useful for creating different versions of the same workflow with alternative task implementations, allowing for A/B testing or different algorithmic approaches.
Task Naming Convention
Task input and output ports follow the naming convention TaskName.TaskNameInputFile and TaskName.TaskNameOutputFile, making data flow connections