HEALTHCHECK in ECS Container
Solution 1
The AWS documentation on this is misleading. In the ECS console's health check field, enter the command as plain comma-delimited tokens, without JSON brackets or quotes, e.g.:
echo,hello world
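To see why the JSON-style input from the question fails, here is a minimal Python sketch of the splitting behaviour. It assumes the console simply splits the field on commas and trims whitespace; `split_console_input` is a hypothetical helper for illustration, not an AWS API. The first token it produces for the bracketed input matches the executable name in the question's error message.

```python
def split_console_input(text):
    """Hypothetical model of the console field: split on commas, trim spaces."""
    return [part.strip() for part in text.split(",")]


# JSON-style input, as typed in the question: brackets and quotes are kept
# literally, so the first token is not a real executable name.
print(split_console_input('[ "CMD-SHELL", "echo"]'))

# Plain comma-delimited input, as suggested in this answer.
print(split_console_input("echo,hello world"))
```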
Solution 2
I also found the docs misleading. Here are a couple of health checks that worked for me:
CMD,curl,--fail,http://localhost:80/status.php
or
CMD-SHELL,SCRIPT_NAME=status.php,SCRIPT_FILENAME=/var/www/html/status.php,REQUEST_METHOD=GET,cgi-fcgi,-bind,-connect,localhost:9000
You can inspect what is happening on the container instance with the following commands (replace 284ce427a3fd with your own container ID):
docker inspect 284ce427a3fd --format='{{json .Config.Healthcheck}}' | jq
docker inspect 284ce427a3fd --format='{{json .State.Health}}' | jq
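If jq is not available on the instance, the same JSON can be summarised with a few lines of Python. This is a sketch only: `summarize_health` is a hypothetical helper, and the sample dictionary is abridged from the `.State.Health` output quoted in the question (the `Output` text is shortened here).

```python
def summarize_health(health):
    """Return one line for the overall state plus one line per probe result."""
    lines = [
        "status={} failing_streak={}".format(
            health["Status"], health["FailingStreak"]
        )
    ]
    # "Log" may be missing or null when no probe has run yet.
    for entry in health.get("Log") or []:
        lines.append(
            "exit={} {}".format(entry["ExitCode"], entry["Output"].strip())
        )
    return lines


# Abridged sample based on the question's output; the Output text is shortened.
sample = {
    "Status": "starting",
    "FailingStreak": 2,
    "Log": [
        {
            "ExitCode": -1,
            "Output": "OCI runtime exec failed: executable file not found in $PATH",
        },
    ],
}

for line in summarize_health(sample):
    print(line)
```

With real data, the dictionary would come from parsing the output of the second `docker inspect` command with `json.loads`.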
norbitheeviljester
Updated on September 18, 2022
Comments
-
norbitheeviljester over 1 year
I'm struggling to set up the correct HEALTHCHECK for a container inside a task definition in Amazon ECS.
I've tried multiple solutions and can't get the simplest "always true" healthcheck to actually work.
My Healthcheck looks like this:
[ "CMD-SHELL", "echo"]
which, to my understanding, should always produce a healthy container, but I always get an UNHEALTHY status. Running docker inspect on the unhealthy container gives the following:
[ec2-user@ip-10-0-0-77 ~]$ docker inspect 8f14979ae4eb [ { "Id": "8f14979ae4eb4e16ec26a4ac886d29b29f5666e5f00d41c56d25f5efe0c7d57e", "Created": "2018-05-15T08:55:50.399791936Z", "Path": "/bin/sh", "Args": [ "-c", "echo \"The application will start in ${JHIPSTER_SLEEP}s...\" && sleep ${JHIPSTER_SLEEP} && java ${JAVA_OPTS} -Djava.security.egd=file:/dev/./urandom -jar /app.war" ], "State": { "Status": "running", "Running": true, "Paused": false, "Restarting": false, "OOMKilled": false, "Dead": false, "Pid": 783, "ExitCode": 0, "Error": "", "StartedAt": "2018-05-15T08:55:51.049068973Z", "FinishedAt": "0001-01-01T00:00:00Z", "Health": { "Status": "starting", "FailingStreak": 2, "Log": [ { "Start": "2018-05-15T09:00:51.049533205Z", "End": "2018-05-15T09:00:51.197542821Z", "ExitCode": -1, "Output": "OCI runtime exec failed: exec failed: container_linux.go:348: starting container process caused \"exec: \\\"[ \\\\\\\"CMD-SHELL\\\\\\\"\\\": executable file not found in $PATH\": unknown" }, { "Start": "2018-05-15T09:05:51.202360089Z", "End": "2018-05-15T09:05:51.296293315Z", "ExitCode": -1, "Output": "OCI runtime exec failed: exec failed: container_linux.go:348: starting container process caused \"exec: \\\"[ \\\\\\\"CMD-SHELL\\\\\\\"\\\": executable file not found in $PATH\": unknown" } ] } }, "Image": "sha256:72cafeeceda0db9170eebb0992c98afaaaf7d2f744a328bd8ceb18804ea0c941", "ResolvConfPath": "/var/lib/docker/containers/8f14979ae4eb4e16ec26a4ac886d29b29f5666e5f00d41c56d25f5efe0c7d57e/resolv.conf", "HostnamePath": "/var/lib/docker/containers/8f14979ae4eb4e16ec26a4ac886d29b29f5666e5f00d41c56d25f5efe0c7d57e/hostname", "HostsPath": "/var/lib/docker/containers/8f14979ae4eb4e16ec26a4ac886d29b29f5666e5f00d41c56d25f5efe0c7d57e/hosts", "LogPath": "/var/lib/docker/containers/8f14979ae4eb4e16ec26a4ac886d29b29f5666e5f00d41c56d25f5efe0c7d57e/8f14979ae4eb4e16ec26a4ac886d29b29f5666e5f00d41c56d25f5efe0c7d57e-json.log", "Name": 
"/ecs-hit-backend-task-21-hit-backend-container-f0afe6f9a694b6fcfb01", "RestartCount": 0, "Driver": "devicemapper", "Platform": "linux", "MountLabel": "", "ProcessLabel": "", "AppArmorProfile": "", "ExecIDs": null, "HostConfig": { "Binds": null, "ContainerIDFile": "", "LogConfig": { "Type": "json-file", "Config": {} }, "NetworkMode": "default", "PortBindings": { "8080/tcp": [ { "HostIp": "", "HostPort": "443" } ] }, "RestartPolicy": { "Name": "", "MaximumRetryCount": 0 }, "AutoRemove": false, "VolumeDriver": "", "VolumesFrom": null, "CapAdd": null, "CapDrop": null, "Dns": null, "DnsOptions": null, "DnsSearch": null, "ExtraHosts": null, "GroupAdd": null, "IpcMode": "shareable", "Cgroup": "", "Links": null, "OomScoreAdj": 0, "PidMode": "", "Privileged": false, "PublishAllPorts": false, "ReadonlyRootfs": false, "SecurityOpt": null, "UTSMode": "", "UsernsMode": "", "ShmSize": 67108864, "Runtime": "runc", "ConsoleSize": [ 0, 0 ], "Isolation": "", "CpuShares": 2, "Memory": 1073741824, "NanoCpus": 0, "CgroupParent": "/ecs/4c81e8c4-de44-4c20-ab37-f8360b8ce639", "BlkioWeight": 0, "BlkioWeightDevice": null, "BlkioDeviceReadBps": null, "BlkioDeviceWriteBps": null, "BlkioDeviceReadIOps": null, "BlkioDeviceWriteIOps": null, "CpuPeriod": 0, "CpuQuota": 0, "CpuRealtimePeriod": 0, "CpuRealtimeRuntime": 0, "CpusetCpus": "", "CpusetMems": "", "Devices": null, "DeviceCgroupRules": null, "DiskQuota": 0, "KernelMemory": 0, "MemoryReservation": 0, "MemorySwap": 2147483648, "MemorySwappiness": 0, "OomKillDisable": false, "PidsLimit": 0, "Ulimits": [ { "Name": "nofile", "Hard": 4096, "Soft": 1024 } ], "CpuCount": 0, "CpuPercent": 0, "IOMaximumIOps": 0, "IOMaximumBandwidth": 0 }, "GraphDriver": { "Data": { "DeviceId": "4392", "DeviceName": "docker-202:1-263287-a309a6780a0a4e0f2da29705109433ba9be5b7a602e4198b42a83e84e8aa8cc8", "DeviceSize": "10737418240" }, "Name": "devicemapper" }, "Mounts": [], "Config": { "Hostname": "8f14979ae4eb", "Domainname": "", "User": "", "AttachStdin": false, 
"AttachStdout": false, "AttachStderr": false, "ExposedPorts": { "8080/tcp": {} }, "Tty": false, "OpenStdin": false, "StdinOnce": false, "Env": [ "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/lib/jvm/java-1.8-openjdk/jre/bin:/usr/lib/jvm/java-1.8-openjdk/bin", "LANG=C.UTF-8", "JAVA_HOME=/usr/lib/jvm/java-1.8-openjdk/jre", "JAVA_VERSION=8u151", "JAVA_ALPINE_VERSION=8.151.12-r0", "SPRING_OUTPUT_ANSI_ENABLED=ALWAYS", "JHIPSTER_SLEEP=0", "JAVA_OPTS=" ], "Cmd": [ "/bin/sh", "-c", "echo \"The application will start in ${JHIPSTER_SLEEP}s...\" && sleep ${JHIPSTER_SLEEP} && java ${JAVA_OPTS} -Djava.security.egd=file:/dev/./urandom -jar /app.war" ], "Healthcheck": { "Test": [ "CMD", "[ \"CMD-SHELL\"", "\"echo\""]" ], "Interval": 300000000000, "Timeout": 60000000000, "StartPeriod": 300000000000, "Retries": 10 }, "ArgsEscaped": true, "Image": "401402660647.dkr.ecr.eu-central-1.amazonaws.com/hit_backend:b6f196e.dirty", "Volumes": null, "WorkingDir": "", "Entrypoint": null, "OnBuild": null, "Labels": { "com.amazonaws.ecs.cluster": "hit-ecs-cluster", "com.amazonaws.ecs.container-name": "hit-backend-container", "com.amazonaws.ecs.task-arn": "arn:aws:ecs:eu-central-1:401402660647:task/4c81e8c4-de44-4c20-ab37-f8360b8ce639", "com.amazonaws.ecs.task-definition-family": "hit-backend-task", "com.amazonaws.ecs.task-definition-version": "21" } }, "NetworkSettings": { "Bridge": "", "SandboxID": "e3d450f7592c56382d6103b310f2c45c87877c078cd489a897be6ddc45ff77dc", "HairpinMode": false, "LinkLocalIPv6Address": "", "LinkLocalIPv6PrefixLen": 0, "Ports": { "8080/tcp": [ { "HostIp": "0.0.0.0", "HostPort": "443" } ] }, "SandboxKey": "/var/run/docker/netns/e3d450f7592c", "SecondaryIPAddresses": null, "SecondaryIPv6Addresses": null, "EndpointID": "200429fb30b4ea42dd9bd5250a70675435b4020859ef8c98ec60dac31398b83d", "Gateway": "172.17.0.1", "GlobalIPv6Address": "", "GlobalIPv6PrefixLen": 0, "IPAddress": "172.17.0.2", "IPPrefixLen": 16, "IPv6Gateway": "", "MacAddress": 
"02:42:ac:11:00:02", "Networks": { "bridge": { "IPAMConfig": null, "Links": null, "Aliases": null, "NetworkID": "a1f7250c4c4be44b8f10385be14d47049418967f7cccaa322852daf0954cae73", "EndpointID": "200429fb30b4ea42dd9bd5250a70675435b4020859ef8c98ec60dac31398b83d", "Gateway": "172.17.0.1", "IPAddress": "172.17.0.2", "IPPrefixLen": 16, "IPv6Gateway": "", "GlobalIPv6Address": "", "GlobalIPv6PrefixLen": 0, "MacAddress": "02:42:ac:11:00:02", "DriverOpts": null } } } } ]
The problem seems to be in the following section:
"Health": { "Status": "starting", "FailingStreak": 2, "Log": [ { "Start": "2018-05-15T09:00:51.049533205Z", "End": "2018-05-15T09:00:51.197542821Z", "ExitCode": -1, "Output": "OCI runtime exec failed: exec failed: container_linux.go:348: starting container process caused \"exec: \\\"[ \\\\\\\"CMD-SHELL\\\\\\\"\\\": executable file not found in $PATH\": unknown" }, { "Start": "2018-05-15T09:05:51.202360089Z", "End": "2018-05-15T09:05:51.296293315Z", "ExitCode": -1, "Output": "OCI runtime exec failed: exec failed: container_linux.go:348: starting container process caused \"exec: \\\"[ \\\\\\\"CMD-SHELL\\\\\\\"\\\": executable file not found in $PATH\": unknown" } ] }
I have tried various configurations of the healthcheck (CMD instead of CMD-SHELL, /bin/sh, just "echo", etc.), but nothing seems to work.
What is the minimal always-true healthcheck for Amazon ECS?
-
c4f4t0r about 5 years
use kubernetes :)
-
rodrigo-silveira about 5 years
Thank you. Spent a good 6 hours yelling at my computer trying to figure this out. How would you express
CMD python /app/healthcheck.py || exit 1
? Specifically, how do I express the || exit 1 part? Or is this not needed?
-
Martti Laine over 3 years
Might be obvious, but could save someone some debugging time: make sure your container has curl included. For example, Alpine images do not include it by default.
-
paultop6 over 2 years
If possible, I would change /app/healthcheck.py to return status 0 on success or status 1 on failure, using sys.exit(status). I would also like to thank norbitheeviljester; I had been running around in circles for hours with this one.
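A minimal sketch of what such a script could look like, assuming the application exposes an HTTP endpoint. The URL is a placeholder (port 8080 is the one exposed by the container in the question; the `/health` path is hypothetical), and `check_health` is an illustrative name. Docker treats exit code 0 as healthy and 1 as unhealthy, so if the script only ever exits with 0 or 1, the `|| exit 1` suffix adds nothing. If you do keep it, note that `||` is shell syntax and therefore needs `CMD-SHELL` rather than `CMD`.

```python
import http.client
import sys
import urllib.request

# Placeholder endpoint: adjust host, port, and path to your application.
HEALTH_URL = "http://localhost:8080/health"


def check_health(url=HEALTH_URL, timeout=5):
    """Return 0 if the endpoint answers with HTTP 2xx, otherwise 1."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 0 if 200 <= response.status < 300 else 1
    except (OSError, ValueError, http.client.HTTPException) as error:
        # URLError, HTTPError, and socket timeouts are OSError subclasses;
        # ValueError covers malformed URLs.
        print("healthcheck failed: {}".format(error), file=sys.stderr)
        return 1


# At the bottom of /app/healthcheck.py, exit with the computed status:
#     sys.exit(check_health())
```

Using only the standard library also avoids the dependency on curl mentioned in the earlier comment.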