Setting Up Apache Airflow with Docker Compose and Reverse Proxy
Apache Airflow® is an open-source platform for developing, scheduling, and monitoring batch-oriented workflows. Airflow’s extensible Python framework enables you to build workflows connecting with virtually any technology. A web interface helps manage the state of your workflows. Airflow is deployable in many ways, varying from a single process on your laptop to a distributed setup to support even the biggest workflows. In this guide, I will walk you through how I configured Airflow with Docker Compose, set up environment variables, secured it with Cloudflare SSL, and more.
Creating the Base Directory
First, create a base directory to organize all Airflow-related files:
mkdir airflow
cd airflow
Downloading Docker Compose File
To deploy Airflow on Docker Compose, you should fetch docker-compose.yaml:
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.10.5/docker-compose.yaml'
Renaming the Compose File
Compose V1 stopped receiving updates in July 2023, and the supplied docker-compose.yaml may not work correctly with it, so you should upgrade to a newer version of Docker Compose. I renamed docker-compose.yaml to compose.yaml for compatibility with the latest version:
mv docker-compose.yaml compose.yaml
Generating a Fernet Key
Airflow uses Fernet to encrypt passwords in the connection configuration and the variable configuration. It guarantees that a password encrypted using it cannot be manipulated or read without the key. Fernet is an implementation of symmetric (also known as “secret key”) authenticated cryptography. I generated one using Python:
from cryptography.fernet import Fernet
fernet_key = Fernet.generate_key()
print(fernet_key.decode()) # Keep this key secure!
Save this key, as it will be used in the configuration later.
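If you prefer not to open a Python shell, the same key can be generated as a one-liner (assuming python3 and the cryptography package are available on the host):
python3 -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"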
Updating compose.yaml
To add custom dependencies via a requirements.txt file, I modified compose.yaml:
nano compose.yaml
I then commented out the image: line and enabled the build: option:
# image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.10.5}
build: .
Updating Environment Variables
Under the x-airflow-common section, I added and modified the following:
AIRFLOW__WEBSERVER__BASE_URL: 'http://localhost:8080'
AIRFLOW__LOGGING__BASE_LOG_FOLDER: '/opt/airflow/logs'
AIRFLOW__LOGGING__REMOTE_LOGGING: 'false'
AIRFLOW__CORE__DAGS_FOLDER: '/opt/airflow/dags'
AIRFLOW__CORE__PLUGINS_FOLDER: '/opt/airflow/plugins'
AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
AIRFLOW__WEBSERVER__X_FRAME_ENABLED: 'false'
AIRFLOW__WEBSERVER__CONFIG_FILE: '/opt/airflow/config/webserver_config.py'
AIRFLOW__CORE__FERNET_KEY: '<paste-generated-fernet-key>'
AIRFLOW__WEBSERVER__ENABLE_PROXY_FIX: 'true'
The last setting tells Airflow to trust the X-Forwarded-* headers set by the reverse proxy configured later in this guide. If you serve Airflow under a public domain, AIRFLOW__WEBSERVER__BASE_URL should generally point at that public URL rather than localhost.
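After editing compose.yaml, it is worth validating the syntax before moving on; Compose can check the file without starting anything:
docker compose config --quiet
A silent exit means the file parses cleanly; errors are printed otherwise.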
Creating a Dockerfile
Since I needed to install additional dependencies, I created a Dockerfile in the same folder as the compose.yaml file, with content similar to:
FROM apache/airflow:2.10.5
COPY requirements.txt .
RUN pip install apache-airflow==2.10.5 -r requirements.txt
It is best practice to install apache-airflow pinned to the same version as the one that comes in the original image. This way you can be sure that pip will not try to downgrade or upgrade Apache Airflow while installing the other requirements, which might happen if you add a dependency that conflicts with the version of apache-airflow you are using.
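With the build: option enabled in compose.yaml, Compose builds this custom image automatically on first use; the build can also be triggered explicitly:
docker compose build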
Adding Dependencies
I created a requirements.txt file and added the necessary packages:
requests
pymongo
Creating Required Directories
I created directories for DAGs, logs, plugins, and config:
mkdir -p ./dags ./logs ./plugins ./config
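With the dags folder in place, I like to drop in a minimal test DAG to confirm that the custom image picks up the packages from requirements.txt. This is only a sketch: the DAG id, the URL, and the tags are illustrative placeholders, not part of the original setup. Save it as ./dags/requirements_smoke_test.py:
from datetime import datetime

import requests
from airflow.decorators import dag, task

@dag(schedule=None, start_date=datetime(2024, 1, 1), catchup=False, tags=["smoke-test"])
def requirements_smoke_test():
    @task
    def fetch_status() -> int:
        # Hypothetical endpoint, purely for illustration
        response = requests.get("https://httpbin.org/get", timeout=10)
        response.raise_for_status()
        return response.status_code

    fetch_status()

requirements_smoke_test()
Once the stack is running, this DAG should appear in the UI after the scheduler's next scan, and triggering it verifies that requests imports cleanly inside the containers.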
Setting Up the Environment File
I generated a .env file to set the Airflow user ID:
echo -e "AIRFLOW_UID=$(id -u)" > .env
Then, I manually added the Airflow admin credentials:
_AIRFLOW_WWW_USER_USERNAME=admin
_AIRFLOW_WWW_USER_PASSWORD=admin123
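The resulting .env ends up with three lines. The UID below is illustrative; it will be whatever id -u printed on your machine, and you should pick a stronger password than this for anything internet-facing:
AIRFLOW_UID=1000
_AIRFLOW_WWW_USER_USERNAME=admin
_AIRFLOW_WWW_USER_PASSWORD=admin123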
Initializing the Database
On all operating systems, you need to run database migrations and create the first user account. To do this, run:
docker compose up airflow-init
When initialization finishes, the airflow-init container should print the Airflow version and exit with code 0.
Starting Airflow
To start all Airflow services in detached mode, I ran:
docker compose up -d
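To confirm everything came up, I check the container status and the webserver's health endpoint (assuming the default port mapping of 8080 from the compose file):
docker compose ps
curl http://localhost:8080/health
docker compose ps should list all services as running or healthy, and the health endpoint returns JSON describing the metadata database and scheduler status. The UI itself is reachable at http://localhost:8080 with the credentials from .env.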
Setting Up Reverse Proxy with Cloudflare SSL
To secure my Airflow instance, I set up an Nginx reverse proxy and enabled Cloudflare Origin CA SSL. I followed this blog to set it up.
Nginx Configuration
I used the following Nginx configuration to route traffic securely:
server {
    listen 80;
    server_name airflow.karthiks.com;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl http2;
    listen [::]:443 ssl http2;
    server_name airflow.karthiks.com;

    ssl_certificate /etc/ssl/airflow/fullchain.pem;
    ssl_certificate_key /etc/ssl/airflow/privkey.pem;
    ssl_trusted_certificate /etc/ssl/cloudflare/origin_ca_rsa_root.crt;

    location / {
        proxy_pass http://localhost:8080;
        proxy_set_header Host $http_host;
        # Forwarded headers consumed by Airflow's ProxyFix middleware
        # (enabled via AIRFLOW__WEBSERVER__ENABLE_PROXY_FIX above)
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_redirect off;

        # WebSocket upgrade support for the Airflow UI
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
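Before applying the configuration, Nginx can validate it and then reload without downtime (assuming Nginx is managed by systemd, as on most Debian/Ubuntu installs):
sudo nginx -t
sudo systemctl reload nginx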
Conclusion
By following these steps, I successfully set up Apache Airflow with Docker Compose, customized the environment, and secured it using Nginx with Cloudflare SSL. Now, my Airflow instance runs efficiently and securely in a production-ready setup.
If you found this guide helpful, leave a clap and share it with others setting up Airflow!