Airflow Configurations
Can Airflow Do this?
If you're an Airflow developer, the configuration page in the documentation is a gold mine. Most of the time, when someone asks "Can Airflow do this?", the answer lies in the configuration page.
In fact, I keep this page permanently bookmarked in my browser. There are many configuration options for tuning the different Airflow components.
Catchup
Turning catchup off ensures that DAGs schedule only future runs instead of backfilling every interval since their start date. You can also set this at the individual DAG level.
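As a rough illustration of setting this globally, here is a minimal sketch. It assumes the usual AIRFLOW__{SECTION}__{KEY} environment-variable convention and that the option is catchup_by_default under the [scheduler] section, so check the configuration reference for your Airflow version. The helper name airflow_env_var is hypothetical and not part of Airflow.

```python
import os


def airflow_env_var(section: str, key: str) -> str:
    """Build the environment variable name for an Airflow config option.

    Hypothetical helper: it only applies the AIRFLOW__{SECTION}__{KEY}
    naming convention and does not import or call Airflow itself.
    """
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


# Assumption: the catchup default lives in the [scheduler] section.
name = airflow_env_var("scheduler", "catchup_by_default")

# The variable has to be in the environment before Airflow starts.
os.environ[name] = "False"
print(name)
```

At the DAG level, the equivalent is passing catchup=False when constructing the DAG.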

Scheduler Performance

There is a whole page in the Airflow docs on which configuration parameters to tweak to use the available resources effectively.

Database

load_default_connections
This is a new configuration in 2.3.0, but also the odd one out. Setting it to True creates the default Airflow connections in the metadata DB.
max_db_retries
sql_alchemy_conn
sql_alchemy_connect_args
sql_alchemy_engine_args
You can pass a variety of parameters to the SQLAlchemy engine. These can be defined as a dictionary.
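A minimal sketch of what that dictionary could look like, assuming Airflow 2.3+ (where the option sits under [database]) and that the value is supplied as a JSON-encoded string, as the configuration reference describes. The keys echo and pool_timeout are real SQLAlchemy create_engine keyword arguments, but the values here are purely illustrative.

```python
import json
import os

# Illustrative engine arguments; the values are assumptions, not recommendations.
engine_args = {"echo": False, "pool_timeout": 30}

# Serialize the dictionary to JSON so it can be passed as a single string.
value = json.dumps(engine_args)

# Assumption: Airflow 2.3+, where the option is in the [database] section.
os.environ["AIRFLOW__DATABASE__SQL_ALCHEMY_ENGINE_ARGS"] = value
print(value)
```

sql_alchemy_connect_args works differently: the configuration reference describes it as an import path to a dictionary rather than an inline value.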
sql_alchemy_max_overflow
Manage the SQLAlchemy connection pool using:
  • sql_alchemy_pool_enabled
  • sql_alchemy_pool_pre_ping
  • sql_alchemy_pool_recycle
  • sql_alchemy_pool_size
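The pool options above, together with sql_alchemy_max_overflow, could be laid out in airflow.cfg roughly as follows. This is only a sketch: it uses Python's configparser to render the fragment, assumes the [database] section of Airflow 2.3+, and every value is a placeholder to tune for your own database.

```python
import configparser
import io

cfg = configparser.ConfigParser()

# Placeholder values for illustration only; tune them for your database.
cfg["database"] = {
    "sql_alchemy_pool_enabled": "True",
    "sql_alchemy_pool_size": "5",
    "sql_alchemy_max_overflow": "10",
    "sql_alchemy_pool_recycle": "1800",
    "sql_alchemy_pool_pre_ping": "True",
}

# Render the section as it would appear in an INI-style config file.
buf = io.StringIO()
cfg.write(buf)
fragment = buf.getvalue()
print(fragment)
```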
sql_alchemy_schema
sql_engine_collation_for_ids
sql_engine_encoding