What makes Airflow great?
- Write workflows as if you’re writing programs (see the DAG sketch after this list)
- Jobs can pass parameters to other jobs downstream via XComs (see the XCom sketch below)
- Logic lives within the workflow itself (instead of being hidden ‘inside’ a program)
- Handle errors and failures gracefully
- A large, active community and strong community support
- Ease of deployment of workflow changes (continuous integration)
- Job testing through Airflow itself
- Low requirements for hardware setup
- Built-in storage of authentication details, with encrypted passwords and extra connection parameters
- Easy environment awareness through Airflow Variables (see the Variables sketch below)
- Pools for throttling access to shared resources (see the pool sketch below)
- The most common task types already implemented and ready to use
- Accessibility of log files and other metadata through the web GUI
- Supporting documentation
- Growing user base and contributions
- Extensibility of the framework
- Ease of installation and automated redeployments
- Dynamic DAGs, generated in code (see the dynamic-DAG sketch below)
- Conditional execution (branching) within workflows (see the branching sketch below)
- Easy to reprocess historical jobs by date (backfilling)
- A usable web interface that gets you where you want in few clicks
- Detailed info about landing times, processing times and SLA misses
- Temporarily turning workflows on and off
- Easy to re-run processing for specific intervals
- Can (re)run only parts of the workflow and dependent tasks
- Schedules are defined in code, not in a separate tool or database
- Can run tasks based on whether the previous run succeeded or failed
- Trigger rules to control when tasks fire (also shown in the branching sketch below)
- AJAX/REST API for job manipulation
- Jobs/tasks run in a context; the scheduler passes in the necessary details
- Can verify what is running on Airflow and see the actual code
- Work gets distributed across your cluster at the task level, not at the DAG level
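
Several of these points stem from the first one: a workflow is an ordinary Python file, so the tasks, their dependencies and the schedule all live in code. A minimal DAG sketch, assuming Airflow 2.x (the DAG and task names are made up for illustration):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="example_etl",              # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",        # the schedule is code, not a separate tool
    catchup=False,
) as dag:
    extract = BashOperator(task_id="extract", bash_command="echo extracting")
    load = BashOperator(task_id="load", bash_command="echo loading")

    extract >> load                    # dependencies are expressed in code too
```

Because the schedule is code, reprocessing a historical date range is a single CLI call; depending on the Airflow version, something like `airflow backfill` (1.x) or `airflow dags backfill` (2.x).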
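
Passing parameters downstream is done through XComs: a task’s return value is stored by Airflow and can be pulled by later tasks in the same run. A sketch, again assuming Airflow 2.x, with hypothetical task names:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def produce(**context):
    # the return value is pushed to XCom automatically
    return {"rows": 42}

def consume(ti, **context):
    # pull what the upstream task returned
    payload = ti.xcom_pull(task_ids="produce")
    print(f"upstream produced {payload['rows']} rows")

with DAG(dag_id="example_xcom", start_date=datetime(2024, 1, 1),
         schedule_interval=None) as dag:
    producer = PythonOperator(task_id="produce", python_callable=produce)
    consumer = PythonOperator(task_id="consume", python_callable=consume)

    producer >> consumer
```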
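
Environment awareness usually means reading Airflow Variables, so the same DAG file behaves differently in dev, staging and production. A sketch; the variable keys and values here are hypothetical:

```python
from airflow.models import Variable

# Variables live in Airflow's metadata database and are editable in the UI;
# default_var keeps the DAG parseable when a key has not been set yet.
env = Variable.get("deploy_env", default_var="dev")
bucket = Variable.get("data_bucket", default_var="s3://example-dev-bucket")
```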
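
Resource pooling means assigning tasks to a named pool with a fixed number of slots; Airflow then caps how many of those tasks run concurrently across all DAGs, which protects shared resources such as a database. A pool sketch, assuming the pool was already created in the UI or CLI (the pool name is hypothetical):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(dag_id="example_pool", start_date=datetime(2024, 1, 1),
         schedule_interval="@daily", catchup=False) as dag:
    # every task assigned to "warehouse_pool" competes for its slots,
    # no matter which DAG it belongs to
    heavy_query = BashOperator(
        task_id="heavy_query",
        bash_command="echo querying",
        pool="warehouse_pool",
    )
```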
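
Dynamic DAGs follow from the same idea that workflows are programs: one Python file can generate many DAGs in a loop. A dynamic-DAG sketch with hypothetical table names:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

for table in ["customers", "orders", "invoices"]:
    with DAG(
        dag_id=f"load_{table}",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        BashOperator(task_id="load", bash_command=f"echo loading {table}")

    # expose each DAG at module level so the scheduler discovers it
    globals()[f"load_{table}"] = dag
```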
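
Conditional execution and trigger rules work together: a branch operator chooses which downstream path runs, and a trigger rule on the join task relaxes the default “all upstream tasks succeeded” behaviour so the skipped branch does not also skip the join. A branching sketch, assuming Airflow 2.x:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.empty import EmptyOperator  # DummyOperator before Airflow 2.3
from airflow.operators.python import BranchPythonOperator

def choose(**context):
    # return the task_id of the branch to follow; the other branch is skipped
    # (logical_date was called execution_date before Airflow 2.2)
    return "weekday_path" if context["logical_date"].weekday() < 5 else "weekend_path"

with DAG(dag_id="example_branch", start_date=datetime(2024, 1, 1),
         schedule_interval="@daily", catchup=False) as dag:
    branch = BranchPythonOperator(task_id="branch", python_callable=choose)
    weekday = EmptyOperator(task_id="weekday_path")
    weekend = EmptyOperator(task_id="weekend_path")
    join = EmptyOperator(
        task_id="join",
        trigger_rule="none_failed_min_one_success",  # run after either branch
    )

    branch >> [weekday, weekend] >> join
```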