Minimum hardware requirements for Apache Airflow cluster
Solution 1
I have had no issues using very small instances in pseudo-distributed mode (32 parallel workers; Postgres backend):
- RAM: 4096 MB
- CPU: 1000 MHz
- vCPUs: 2
- Disk: 40 GB
If you want distributed mode, that size should be more than enough per node, as long as you keep the nodes homogeneous. Airflow shouldn't really do heavy lifting anyway; push the workload out to other systems (Spark, EMR, BigQuery, etc.).
You will also have to run some kind of message queue, such as RabbitMQ; Redis works as a broker too. Neither has much impact on how you size the cluster.
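As a rough illustration of the broker choice, here is a minimal Python sketch that builds the environment-variable overrides Airflow reads in the form `AIRFLOW__{SECTION}__{KEY}`. The helper names, the hostname, and the default `guest` credentials are placeholders made up for this example; the `core` and `celery` section names follow the usual `airflow.cfg` layout, but check them against your Airflow version.

```python
def airflow_env_name(section, key):
    """Return the environment variable that overrides [section] key in airflow.cfg."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


def celery_env(broker, host):
    """Build env overrides for a Celery-based setup (hypothetical helper).

    broker: "redis" or "rabbitmq"
    host:   hostname of the broker (placeholder, not from the original answer)
    """
    if broker == "redis":
        # Standard Redis URL: default port 6379, database 0
        url = f"redis://{host}:6379/0"
    elif broker == "rabbitmq":
        # Standard AMQP URL: default port 5672, default vhost "/"
        url = f"amqp://guest:guest@{host}:5672//"
    else:
        raise ValueError(f"unsupported broker: {broker!r}")
    return {
        airflow_env_name("core", "executor"): "CeleryExecutor",
        airflow_env_name("celery", "broker_url"): url,
    }


if __name__ == "__main__":
    # Print the overrides in a form that can be pasted into an env file
    for name, value in celery_env("redis", "broker.internal").items():
        print(f"{name}={value}")
```

Swapping `"redis"` for `"rabbitmq"` changes only the broker URL, which fits the point above: the choice of queue matters little for sizing.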
Solution 2
We are running Airflow on AWS with the following configuration:
- t2.small: Airflow scheduler and webserver
- db.t2.small: Postgres for the metastore
The parallelism parameter in airflow.cfg is set to 10, and around 10 users access the Airflow UI.
All we do from Airflow is ssh to other instances and run the code there.
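A minimal sketch of that pattern in plain Python, assuming key-based SSH access is already set up. The host, user, and script path are hypothetical, and this is not the actual task code from this setup; in a real DAG you would more likely wrap the same idea in an SSH or Bash operator.

```python
import shlex
import subprocess


def build_ssh_command(host, remote_command, user=None):
    """Return the argv list for running remote_command on host over ssh."""
    target = f"{user}@{host}" if user else host
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # which is what you want for an unattended scheduler task.
    return ["ssh", "-o", "BatchMode=yes", target, remote_command]


def run_remote(host, remote_command, user=None, timeout=3600):
    """Run the command remotely and return its stdout; raises on non-zero exit."""
    cmd = build_ssh_command(host, remote_command, user)
    result = subprocess.run(
        cmd, capture_output=True, text=True, timeout=timeout, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical script path; quote it so the remote shell sees a single argument
    script = shlex.quote("/opt/jobs/nightly_load.py")
    print(build_ssh_command("worker-1.internal", f"python3 {script}", user="etl"))
```

The Airflow box only spawns an `ssh` process and waits for it, so the heavy work runs on the remote instance.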
Comments
-
Duleendra almost 2 years
What are the minimum hardware requirements for setting up an Apache Airflow cluster?
E.g., RAM, CPU, disk, etc. for the different types of nodes in the cluster.