How do you run a worker with AWS Elastic Beanstalk?
Solution 1
As @chris-wheadon suggested in his comment, you should try to run celery as a daemon in the background. AWS Elastic Beanstalk already uses supervisord to run some daemon processes, so you can leverage that to run celeryd and avoid creating a custom AMI. It works nicely for me.
What I do is programmatically add a celeryd config file to the instance after EB deploys the app to it. The tricky part is that the file needs to set the required environment variables for the daemon (such as AWS access keys if you use S3 or other services in your app).
Below is a copy of the script that I use. Add it to the .ebextensions folder that configures your EB environment.
The setup script creates a file in the /opt/elasticbeanstalk/hooks/appdeploy/post/ folder that lives on all EB instances. Any shell script in there is executed after deployment. The shell script that is placed there works as follows:
- In the celeryenv variable, the virtualenv environment is stored in a format that follows the supervisord notation: a comma-separated list of env variables.
- Then the script creates a variable celeryconf that contains the configuration file as a string, which includes the previously parsed env variables.
- This variable is then piped into a file called celery.conf, a supervisord configuration file for the celery daemon.
- Finally, the path to the newly created config file is added to the main supervisord.conf file, if it is not already there.
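As a rough illustration of the first step in isolation, the sketch below runs the same tr/sed chain against a throwaway file instead of /opt/python/current/env. The build_celeryenv function name and the sample variables are made up for this example; only the pipeline itself comes from the script.

```shell
#!/usr/bin/env bash
# Sketch only: build_celeryenv and the sample values are hypothetical.

# Turn a file of "export KEY=value" lines into supervisord's
# comma-separated KEY=value notation.
build_celeryenv() {
  local envfile="$1" result
  result=$(tr '\n' ',' < "$envfile" \
    | sed 's/export //g' \
    | sed 's/$PATH/%(ENV_PATH)s/g' \
    | sed 's/$PYTHONPATH//g' \
    | sed 's/$LD_LIBRARY_PATH//g')
  # tr leaves a trailing comma after the last line; drop it.
  printf '%s\n' "${result%?}"
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
export DJANGO_SETTINGS_MODULE="myappname.settings"
export PATH="/opt/python/run/venv/bin:$PATH"
EOF
build_celeryenv "$sample"
rm -f "$sample"
```

With this sample input the function prints DJANGO_SETTINGS_MODULE="myappname.settings",PATH="/opt/python/run/venv/bin:%(ENV_PATH)s", which is the form supervisord expects on an environment= line.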
Here is a copy of the script:
files:
  "/opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash

      # Get django environment variables
      celeryenv=`cat /opt/python/current/env | tr '\n' ',' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g'`
      celeryenv=${celeryenv%?}

      # Create celery configuration script
      celeryconf="[program:celeryd]
      ; Set full path to celery program if using virtualenv
      command=/opt/python/run/venv/bin/celery worker -A myappname --loglevel=INFO

      directory=/opt/python/current/app
      user=nobody
      numprocs=1
      stdout_logfile=/var/log/celery-worker.log
      stderr_logfile=/var/log/celery-worker.log
      autostart=true
      autorestart=true
      startsecs=10

      ; Need to wait for currently executing tasks to finish at shutdown.
      ; Increase this if you have very long running tasks.
      stopwaitsecs = 600

      ; When resorting to send SIGKILL to the program to terminate it
      ; send SIGKILL to its whole process group instead,
      ; taking care of its children as well.
      killasgroup=true

      ; if rabbitmq is supervised, set its priority higher
      ; so it starts first
      priority=998

      environment=$celeryenv"

      # Create the celery supervisord conf script
      echo "$celeryconf" | tee /opt/python/etc/celery.conf

      # Add configuration script to supervisord conf (if not there already)
      if ! grep -Fxq "[include]" /opt/python/etc/supervisord.conf
        then
        echo "[include]" | tee -a /opt/python/etc/supervisord.conf
        echo "files: celery.conf" | tee -a /opt/python/etc/supervisord.conf
      fi

      # Reread the supervisord config
      supervisorctl -c /opt/python/etc/supervisord.conf reread

      # Update supervisord in cache without restarting all services
      supervisorctl -c /opt/python/etc/supervisord.conf update

      # Start/Restart celeryd through supervisord
      supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd
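The [include] check is what keeps repeated deployments from appending the same section again. A minimal sketch of that guard, run against a temporary file rather than the real supervisord.conf, might look like this; ensure_include is a hypothetical helper name, and the grep flags are the ones used in the script above.

```shell
#!/usr/bin/env bash
# Sketch only: ensure_include is a hypothetical helper; the grep logic
# mirrors the script above but targets a temporary file.

ensure_include() {
  local conf="$1"
  # -F fixed string, -x whole-line match, -q quiet: succeeds only if
  # some line is exactly "[include]".
  if ! grep -Fxq "[include]" "$conf"; then
    echo "[include]" >> "$conf"
    echo "files: celery.conf" >> "$conf"
  fi
}

conf=$(mktemp)
echo "[supervisord]" > "$conf"
ensure_include "$conf"
ensure_include "$conf"   # second call is a no-op
cat "$conf"
rm -f "$conf"
```

One consequence of checking only for the section header: if supervisord.conf already has an [include] section for some other reason, celery.conf is not added to it, so it may be worth inspecting the file on the instance when the worker does not show up.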
Solution 2
I was trying to do something similar in PHP, but for whatever reason I couldn't keep the worker running. I switched to an AMI on an EC2 server and have had success ever since.
Solution 3
For those using Elastic Beanstalk with Rails and Sidekiq, here's a collection of ebextensions that ultimately did the trick for me:
https://gist.github.com/ctrlaltdylan/f75b2e38bbbf725acb6d48283fc2f174
Maxime P
Updated on June 06, 2022
Comments
-
Maxime P about 2 years
I am launching a Django application on AWS Elastic Beanstalk. I'd like to run a background task or worker in order to run celery.
I cannot find out whether this is possible. If it is, how could it be achieved?
Here is what I am doing right now, but this is producing an event type error every time.
container_commands:
  01_syncdb:
    command: "django-admin.py syncdb --noinput"
    leader_only: true
  50_sqs_email:
    command: "./manage.py celery worker --loglevel=info"
    leader_only: true
-
EsseTi over 11 years
What kind of error do you have?
-
Chris Wheadon over 11 years
I suspect you need to run celery in daemon mode: docs.celeryproject.org/en/latest/tutorials/… which would require a custom AMI for your beanstalk. This is not for the fainthearted as suggested here: docs.aws.amazon.com/elasticbeanstalk/latest/dg/…
-
Zaar Hai over 11 years
I think you can find an answer here: stackoverflow.com/questions/12813586/…
-
DataGreed about 4 years
If you want something lighter than celery, you can try the pypi.org/project/django-eb-sqs-worker package. It uses Amazon SQS for queueing tasks.
-
Admin almost 10 years
Thank you for posting this! Celery and EB have been a challenge, but your solution seems to work! I found an issue however: if there's a % sign in an environment variable, supervisord throws a formatting error. I believe % is escaped by adding an additional %, like %%. Is there any way to format the env vars to add that extra % to all %? github.com/Supervisor/supervisor/issues/291
-
yellowcap almost 10 years
In that case you could add an additional find/replace piece to the part where the environment variables are parsed. For instance, sed 's/%/%%/g' will replace any % with %%. The command chain at the beginning of the script does a bunch of string replacements to make the env vars list supervisord compatible. So try adding it after the first command: cat /opt/python/current/env | tr '\n' ',' | sed 's/%/%%/g' | ...
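A sketch of how that suggestion could slot into the parsing chain is shown below, again against a temporary file. The function name build_celeryenv_escaped and the sample SECRET_KEY value are assumptions for illustration. The escaping step is placed before the $PATH substitution so that the % in the %(ENV_PATH)s placeholder is not doubled as well.

```shell
#!/usr/bin/env bash
# Sketch only: build_celeryenv_escaped and the sample value are hypothetical.

build_celeryenv_escaped() {
  local envfile="$1" result
  # Escape literal % first, so the %(ENV_PATH)s placeholder inserted by
  # a later sed is left untouched.
  result=$(tr '\n' ',' < "$envfile" \
    | sed 's/%/%%/g' \
    | sed 's/export //g' \
    | sed 's/$PATH/%(ENV_PATH)s/g' \
    | sed 's/$PYTHONPATH//g' \
    | sed 's/$LD_LIBRARY_PATH//g')
  # Drop the trailing comma left by tr.
  printf '%s\n' "${result%?}"
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
export SECRET_KEY="abc%def"
export PATH="/opt/python/run/venv/bin:$PATH"
EOF
build_celeryenv_escaped "$sample"
rm -f "$sample"
```

For this input the result is SECRET_KEY="abc%%def",PATH="/opt/python/run/venv/bin:%(ENV_PATH)s".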
-
neurix almost 9 years
@yellowcap Thank you for the great and detailed answer!
-
AliBZ almost 8 years
This definitely works but there are some issues with it. If you do this, your web and worker instances are tied to each other. So if the load on your workers increases, you are scaling both your web and worker instances. The other issue is that if you have a celery beat task, you will end up with duplicate tasks if you scale up. You must only have 1 instance running your celery beat. I know the second issue is not related to what this question is about, but a project with celery workers can have celery beat as well.
-
yellowcap almost 8 years
Yes, of course, ideally you would have two separate instances running! The above setup is useful if you don't have the resources to buy several servers and you want to squeeze out as much as you can from each instance. I am running a low traffic Django app on a single small instance, and for that it works great. And even if you have several instances, you might not want to "reserve" one just for the worker. That depends entirely on the use case. Agreed on the celery beat side: that would duplicate tasks, so it would not be a good solution for celery beat if you have multiple instances.
-
Cagatay Barin almost 8 years
I've created a script named "99-celery.config" and copied your script but it didn't work. Can you help me? Should I configure anything about supervisor on my local computer? stackoverflow.com/questions/38566456/…
-
Evan Chu almost 8 years
Somehow on my EC2 instance, supervisorctl is not available as a command... but I got it working, thanks a bunch. OP should accept this answer.
-
Dr Manhattan almost 8 years
For the duplicate tasks, use a central cache server like Redis or memcached and create a lock so that other instances don't rerun the same task.
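One possible shape for such a lock, assuming a Redis server that every instance can reach, is the SET command with the NX and EX options via redis-cli. The acquire_lock helper, the REDIS_CLI override, and the key name below are hypothetical; this is a sketch of the idea, not something taken from the thread.

```shell
#!/usr/bin/env bash
# Sketch only: acquire_lock, REDIS_CLI, and the key name are hypothetical.

# Try to take a lock named $1 that expires after $2 seconds.
# Relies on Redis "SET key value NX EX seconds": NX sets the key only if
# it does not exist yet, EX gives it a time-to-live so a crashed
# instance cannot hold the lock forever.
acquire_lock() {
  local key="$1" ttl="$2" cli="${REDIS_CLI:-redis-cli}" reply
  reply=$($cli SET "$key" "$(uname -n)" NX EX "$ttl")
  # The reply is OK only when this call created the key.
  [ "$reply" = "OK" ]
}

# Usage (needs a reachable Redis server):
#   if acquire_lock "lock:nightly-report" 300; then
#     echo "this instance runs the task"
#   fi
```

The time-to-live should be longer than the task normally takes, otherwise a second instance could acquire the lock while the first is still working.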
-
smentek over 7 years
This is a great help, but as was mentioned, scalability requires execution on the main node only. So container_commands should be used instead, since it allows usage of the leader_only option. I used 2 commands: the first creates the bash file, the second executes it. This is my solution for a django app: stackoverflow.com/questions/41161691/…
-
Paul Wasson over 7 years
Your code worked fine until I decided to migrate some variables which were in my settings.py to my Elastic Beanstalk environment properties. Now I get the following error when the script is called: for \'environment\' is badly formatted'>: file: /usr/lib64/python2.7/xmlrpclib.py line: 800 celeryd: ERROR (no such process) Thanks for the help.
-
Yasser Sinjab over 4 years
I did the same too