Airflow Configurations

Can Airflow Do this?

If you're an Airflow developer, the configuration page in the documentation is a gold mine. Most of the time, when someone asks "Can Airflow do this?", the answer lies in the configuration page.

In fact, I keep this page permanently bookmarked in my browser. There are many configurations for tuning the different Airflow components.

Catchup

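Setting catchup_by_default to False makes every DAG skip the backlog of past intervals and schedule only from the most recent interval onward, irrespective of its start date. You can also set this per DAG with the catchup argument.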
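To make the effect of catchup concrete, here is a minimal sketch of a simplified scheduling model. It is not Airflow's scheduler code: the function runs_to_schedule and the dates are hypothetical, and the model assumes a fixed interval where a run becomes eligible once its interval has ended.

```python
from datetime import datetime, timedelta


def runs_to_schedule(start_date, now, interval, catchup):
    """Return interval start times a simplified scheduler would run.

    Hypothetical helper for illustration only, not an Airflow API.
    """
    runs = []
    current = start_date
    # A run covers [current, current + interval) and is eligible
    # only once that interval has fully elapsed.
    while current + interval <= now:
        runs.append(current)
        current += interval
    if catchup:
        # Backfill every interval since start_date.
        return runs
    # Without catchup, keep only the most recent completed interval.
    return runs[-1:]


start = datetime(2024, 1, 1)
now = datetime(2024, 1, 5, 12)
daily = timedelta(days=1)

backfilled = runs_to_schedule(start, now, daily, catchup=True)
latest_only = runs_to_schedule(start, now, daily, catchup=False)
print(len(backfilled), len(latest_only))
```

With a start date several days in the past, the catchup model yields one run per elapsed day, while the non-catchup model yields a single run for the latest completed day.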

Scheduler Performance

There is a whole page in the Airflow docs on which config parameters to tweak so the scheduler uses the available resources effectively.

Database

load_default_connections

This configuration is new in 2.3.0, but it is also the odd one out in this section. Setting it to true creates the default Airflow connections in the metadata DB.
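One common way to set an option like this is through an environment variable, which Airflow reads using the AIRFLOW__{SECTION}__{KEY} naming pattern. The sketch below only builds the variable name. The helper airflow_env_var is hypothetical, and placing the option in the database section is an assumption based on the heading above.

```python
def airflow_env_var(section, key):
    """Build the environment variable name for a config option.

    Hypothetical helper; it only follows the AIRFLOW__{SECTION}__{KEY}
    naming pattern (double underscores, upper case).
    """
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


name = airflow_env_var("database", "load_default_connections")
# Shell line you could place in the environment of the Airflow process.
export_line = f"export {name}=True"
print(export_line)
```

The same helper works for any other option on this page, for example airflow_env_var("database", "max_db_retries").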

max_db_retries

sql_alchemy_conn

sql_alchemy_connect_args

sql_alchemy_engine_args

You can pass a variety of parameters to the SQLAlchemy engine through this option. They are defined as a dictionary.
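As a rough illustration, the dictionary can be prepared in Python and serialized before it goes into the config. This sketch assumes the value is parsed as JSON. The keys shown are standard SQLAlchemy create_engine arguments, and the values are purely illustrative.

```python
import json

# Illustrative engine arguments; the values are assumptions, not recommendations.
engine_args = {
    "pool_pre_ping": True,   # test each connection before handing it out
    "pool_recycle": 1800,    # recycle connections older than 30 minutes
    "echo": False,           # do not log every SQL statement
}

# Serialize to the single-line string that would go into the config.
config_value = json.dumps(engine_args)

# Round-trip check: the string decodes back to the same dictionary.
decoded = json.loads(config_value)

print(f"sql_alchemy_engine_args = {config_value}")
```

Serializing with json.dumps avoids hand-written mistakes such as single quotes or Python-style True, which are not valid JSON.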

sql_alchemy_max_overflow

Manage the SQLAlchemy connection pool using:

  • sql_alchemy_pool_enabled

  • sql_alchemy_pool_pre_ping

  • sql_alchemy_pool_recycle

  • sql_alchemy_pool_size
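The pool options are easiest to reason about together with sql_alchemy_max_overflow. The sketch below parses a hypothetical airflow.cfg fragment with the standard library and works out the connection ceiling. The values are illustrative, and the arithmetic assumes SQLAlchemy's usual pool semantics, where max_overflow is the number of connections allowed beyond pool_size.

```python
import configparser

# Hypothetical airflow.cfg fragment; values are for illustration only.
CFG = """
[database]
sql_alchemy_pool_enabled = True
sql_alchemy_pool_size = 5
sql_alchemy_max_overflow = 10
sql_alchemy_pool_recycle = 1800
sql_alchemy_pool_pre_ping = True
"""

parser = configparser.ConfigParser()
parser.read_string(CFG)
db = parser["database"]

pool_enabled = db.getboolean("sql_alchemy_pool_enabled")
pool_size = db.getint("sql_alchemy_pool_size")
max_overflow = db.getint("sql_alchemy_max_overflow")

# With pooling on, a single engine can open at most
# pool_size + max_overflow connections at the same time.
max_connections = pool_size + max_overflow if pool_enabled else None
print(max_connections)
```

Each process that creates its own engine gets its own pool, so the database-side limit should account for this ceiling multiplied by the number of such processes.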

sql_alchemy_schema

sql_engine_collation_for_ids

sql_engine_encoding

