Airflow Scheduler
Models can be scheduled on Airflow for automated DAG creation and execution; a minimal sketch of such a generated DAG follows the settings list below.
When Airflow is selected as the scheduler, the model has the following additional settings:
DAG Name: Unique name to assign to the Airflow DAG (defaults to model-[id])
Deployer: Deployer module used to create the model DAG, with the following options, which can be extended (see the custom deployer sketch after this list):
DAG_Tensor: The default; used for TensorFlow models, deployable to k8s or Docker
DAG_Spark: Used for PySpark models, deployable to Spark clusters, k8s, or Docker
DAG_Custom: Used for other models, deployable to k8s or Docker
Deployment Mode: Deployment approach for executing the actual model on a remote server:
Kubernetes: With "k8s:image" and "k8s:venv" alternatives, deploying on a k8s cluster using a prebuilt image or a virtual environment, with the related parameters
Docker: With "docker:image" and "docker:venv" alternatives, deploying on a remote Docker host using a prebuilt image or a virtual environment, with the related parameters
Spark: With "spark:submit" and "spark:ssh" alternatives, deploying on a Spark cluster by running spark-submit locally or after SSHing to a cluster node
Deployment Parameters: Spark-, k8s-, or Docker-specific parameters passed on to the deployer for execution (e.g. node pool)
Model Connections: List of connections defined in Airflow whose connection descriptors are passed on to the model executor, allowing connections to be managed centrally (see the connection sketch after this list)
Load Model Data: Whether the DAG should load the latest model configuration before each run
Notify on Update: Whether the DAG should automatically update the model version after each run, triggering journals/pulses for other processes (typically used to update backend API inference models with new trainings)
Process Parameters: Deployment-mode-specific parameters passed on to the model for execution
DAG Args: Airflow DAG arguments to pass through (e.g. schedule_interval); a combined settings example follows below
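For orientation, here is a minimal sketch of the kind of DAG the scheduler generates for a model. The dag_id, task name, and run_model callable are illustrative assumptions, not the actual generated code; the real DAG is produced by the selected deployer module.

```python
# Minimal sketch of a generated model DAG (illustrative only).
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_model(**context):
    # Placeholder for the deployment-mode-specific execution step
    # (e.g. launching the model on k8s, Docker, or Spark).
    pass


with DAG(
    dag_id="model-42",                  # "DAG Name" setting (defaults to model-[id])
    schedule_interval="@daily",         # passed through via "DAG Args"
    start_date=datetime(2024, 1, 1),
    catchup=False,
    default_args={"retries": 1, "retry_delay": timedelta(minutes=5)},
) as dag:
    PythonOperator(task_id="execute_model", python_callable=run_model)
```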
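Because the deployer list can be extended, a custom deployer can be added alongside DAG_Tensor, DAG_Spark, and DAG_Custom. The base interface is not described here, so the following is only a hypothetical sketch; the class name and its build_dag method are assumptions to be checked against the actual deployer base module.

```python
# Hypothetical sketch of a custom deployer module (names and interface assumed).
from airflow import DAG


class DAG_XGBoost:
    """Assumed custom deployer, analogous to DAG_Custom."""

    def __init__(self, model_settings: dict):
        self.settings = model_settings

    def build_dag(self) -> DAG:
        # Build the DAG from the model settings (DAG Name, DAG Args, ...),
        # then add tasks that execute the model via the chosen deployment
        # mode (k8s, Docker, or Spark).
        raise NotImplementedError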
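The Model Connections setting relies on connections that are defined once in Airflow and then forwarded as descriptors. How the platform serializes and forwards them is not specified here; the sketch below only illustrates reading connection descriptors through Airflow's standard API, with the connection id "feature_store_db" as an assumed example.

```python
# Illustrative sketch: collecting Airflow connection descriptors so they can be
# forwarded to the model executor. Connection ids and the descriptor format are
# assumptions.
from airflow.hooks.base import BaseHook


def collect_connection_descriptors(conn_ids):
    descriptors = {}
    for conn_id in conn_ids:
        conn = BaseHook.get_connection(conn_id)  # defined centrally in Airflow
        descriptors[conn_id] = {
            "host": conn.host,
            "port": conn.port,
            "login": conn.login,
            "password": conn.password,
            "schema": conn.schema,
            "extra": conn.extra_dejson,
        }
    return descriptors


# Example: descriptors for the connections listed under "Model Connections"
descriptors = collect_connection_descriptors(["feature_store_db"])
```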
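Finally, a combined example tying the settings together. All field names and values below are illustrative assumptions, not the platform's actual configuration schema.

```python
# Illustrative settings for one model scheduled on Airflow (assumed field names).
model_airflow_settings = {
    "dag_name": "model-42",                      # DAG Name
    "deployer": "DAG_Tensor",                    # Deployer (default)
    "deployment_mode": "k8s:image",              # Deployment Mode
    "deployment_parameters": {"node_pool": "gpu-pool"},              # Deployment Parameters
    "model_connections": ["feature_store_db"],   # Model Connections
    "load_model_data": True,                     # Load Model Data
    "notify_on_update": True,                    # Notify on Update
    "process_parameters": {"image": "registry.local/model:latest"},  # Process Parameters
    "dag_args": {"schedule_interval": "@daily"},                     # DAG Args
}
```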