What makes Airflow great?

  • Write workflows as if you’re writing programs
  • Jobs can pass parameters to other jobs downstream (via XCom; see the example after this list)
  • Logic lives within the workflow itself (instead of hidden ‘inside’ a program)
  • Handle errors and failures gracefully
  • A large and active community, and the support that comes with it
  • Ease of deployment of workflow changes (continuous integration)
  • Job testing through Airflow itself (see the CLI note after this list)
  • Low requirements for hardware setup
  • Built-in storage of authentication details, with encrypted passwords and extra connection details
  • Easy environmental awareness through Airflow Variables (sketched after this list)
  • Resource pooling for shared resources (sketched after this list)
  • The most common tasks are already implemented and ready to use
  • Accessibility of log files and other metadata through the web GUI
  • Supporting documentation
  • Growing user base and contributions
  • Extensibility of the framework
  • Ease of installation and automated redeployments
  • Dynamic DAGs (see the example after this list)
  • Conditional execution (branching) within workflows (see the branching sketch after this list)
  • Easy to reprocess historical jobs by date (see the CLI note after this list)
  • A usable web interface: few clicks to get where you want to go
  • Detailed info about landing times, processing times and SLA misses
  • Temporarily turning workflows on and off
  • Easy to re-run processing for specific intervals
  • Can (re)run only parts of the workflow and dependent tasks
  • Schedules are defined in code, not in a separate tool or database (see the example after this list)
  • Can run tasks based on whether the previous run succeeded or failed
  • Implement trigger rules for tasks (shown in the branching sketch after this list)
  • AJAX/REST API for job manipulation
  • Jobs/tasks run in a context; the scheduler passes in the necessary details
  • Can verify what is running on Airflow and see the actual code
  • Work gets distributed across your cluster at the task level, not at the DAG level
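
For instance, a downstream task can consume a value that an upstream task pushed via XCom. A minimal sketch, assuming Airflow 2.x (the DAG and task names are illustrative):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    # The return value is pushed to XCom automatically.
    return "/tmp/extracted.csv"

def transform(ti):
    # Pull the value that the upstream "extract" task pushed.
    path = ti.xcom_pull(task_ids="extract")
    print(f"transforming {path}")

with DAG("xcom_example", start_date=datetime(2024, 1, 1), schedule_interval=None) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_extract >> t_transform
```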
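
Environmental awareness comes from Airflow Variables: a DAG reads its settings from the metadata database instead of hardcoding them. A sketch; the keys `env` and `data_bucket` are made-up examples:

```python
from airflow.models import Variable

# Read deployment-specific settings; default_var keeps the DAG
# parseable even when the key has not been set in this environment.
env = Variable.get("env", default_var="dev")
bucket = Variable.get("data_bucket", default_var=f"s3://my-{env}-bucket")
```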
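
Resource pools cap how many tasks can hit a shared resource at once: define a pool with some number of slots (in the UI or via the CLI) and reference it from tasks. A sketch, assuming a pool named `shared_db` has been created:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def run_query():
    print("running an expensive query")

with DAG("pool_example", start_date=datetime(2024, 1, 1), schedule_interval=None) as dag:
    # Tasks assigned to the "shared_db" pool share its slots, so at most
    # that many of them run concurrently across all DAGs.
    PythonOperator(task_id="heavy_query", python_callable=run_query, pool="shared_db")
```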
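
Dynamic DAGs and schedules-in-code both follow from DAG files being plain Python: the schedule is just a parameter on the DAG object, and tasks can be generated in a loop. A sketch; the table list and `load_table` callable are hypothetical:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def load_table(table):
    print(f"loading {table}")

# The schedule lives in code, right next to the workflow it drives.
with DAG("dynamic_example", start_date=datetime(2024, 1, 1),
         schedule_interval="@daily", catchup=False) as dag:
    # One load task per table, generated dynamically.
    for table in ["users", "orders", "payments"]:
        PythonOperator(
            task_id=f"load_{table}",
            python_callable=load_table,
            op_kwargs={"table": table},
        )
```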
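
Branching and trigger rules work together: a BranchPythonOperator picks which downstream path runs, and a join task with a permissive trigger rule still runs even though the other branch was skipped. A sketch, assuming Airflow 2.3+ (earlier releases use DummyOperator instead of EmptyOperator):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.empty import EmptyOperator
from airflow.operators.python import BranchPythonOperator
from airflow.utils.trigger_rule import TriggerRule

def choose_path(logical_date):
    # Return the task_id to follow; the other branch is skipped.
    return "weekday_path" if logical_date.weekday() < 5 else "weekend_path"

with DAG("branch_example", start_date=datetime(2024, 1, 1),
         schedule_interval="@daily", catchup=False) as dag:
    branch = BranchPythonOperator(task_id="branch", python_callable=choose_path)
    weekday = EmptyOperator(task_id="weekday_path")
    weekend = EmptyOperator(task_id="weekend_path")
    # NONE_FAILED lets the join run after either branch, skipped or not.
    join = EmptyOperator(task_id="join", trigger_rule=TriggerRule.NONE_FAILED)
    branch >> [weekday, weekend] >> join
```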
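
Testing and reprocessing use the same machinery as the scheduler. In Airflow 2.x, `airflow tasks test <dag_id> <task_id> <date>` runs a single task in isolation without recording state, and `airflow dags backfill --start-date 2024-01-01 --end-date 2024-01-07 <dag_id>` re-runs a workflow over a historical interval; older 1.x releases expose the same functionality as `airflow test` and `airflow backfill`.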