HEALTHCHECK in ECS Container

Solution 1

The AWS documentation seems extremely misleading. In the ECS console, the health check command should be entered as plain comma-delimited tokens, without JSON brackets or quotes, e.g.:

echo,hello world
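
The `docker inspect` output in the question hints at why the bracketed JSON form fails: the console appears to split the field on commas and prepend `CMD`. The following Python sketch models that behavior. It is inferred only from that output; the function name `console_to_test` and the exact splitting rules are assumptions, not documented AWS behavior.

```python
def console_to_test(field: str) -> list[str]:
    """Model how the console field seems to become Docker's Healthcheck.Test.

    Assumption: split on commas, trim whitespace, and prepend "CMD" unless
    the first token is already CMD or CMD-SHELL.
    """
    tokens = [part.strip() for part in field.split(",")]
    if tokens[0] not in ("CMD", "CMD-SHELL"):
        tokens.insert(0, "CMD")
    return tokens


# JSON-style input: the bracket and quotes end up inside the arguments.
print(console_to_test('[ "CMD-SHELL", "echo"]'))
# Comma-delimited input: a plain command followed by its argument.
print(console_to_test("echo,hello world"))
```

Under this model the first call makes `[ "CMD-SHELL"` the executable name, which is consistent with the "executable file not found in $PATH" error in the health log.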

Solution 2

I also found the docs misleading. Here are a couple of health checks that worked for me:

CMD,curl,--fail,http://localhost:80/status.php

[Screenshot: curl localhost]

or

CMD-SHELL,SCRIPT_NAME=status.php,SCRIPT_FILENAME=/var/www/html/status.php,REQUEST_METHOD=GET,cgi-fcgi,-bind,-connect,localhost:9000

[Screenshot: cgi-fcgi php status]
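
For comparison, when a task definition is written as JSON instead of typed into the console field, the health check is a `healthCheck` object on the container definition with `command`, `interval`, `timeout`, `startPeriod`, and `retries`, the durations given in seconds. A rough Python sketch of mapping what `docker inspect` reports back to that shape follows. The helper name `to_ecs_health_check` is made up for illustration, and pairing the curl command with the timing values from the question is only an example.

```python
import json

NANOS_PER_SECOND = 1_000_000_000


def to_ecs_health_check(docker_healthcheck: dict) -> dict:
    """Convert docker inspect's .Config.Healthcheck into an ECS healthCheck object."""
    return {
        "command": docker_healthcheck["Test"],
        # Docker reports durations in nanoseconds; ECS expects whole seconds.
        "interval": docker_healthcheck["Interval"] // NANOS_PER_SECOND,
        "timeout": docker_healthcheck["Timeout"] // NANOS_PER_SECOND,
        "startPeriod": docker_healthcheck["StartPeriod"] // NANOS_PER_SECOND,
        "retries": docker_healthcheck["Retries"],
    }


# Illustrative input: the curl check above with the question's timing values.
inspected = {
    "Test": ["CMD", "curl", "--fail", "http://localhost:80/status.php"],
    "Interval": 300000000000,
    "Timeout": 60000000000,
    "StartPeriod": 300000000000,
    "Retries": 10,
}
print(json.dumps(to_ecs_health_check(inspected), indent=2))
```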

And you can dig into what's happening on the instance with:

docker inspect 284ce427a3fd --format='{{json .Config.Healthcheck}}' | jq
docker inspect 284ce427a3fd --format='{{json .State.Health}}' | jq
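
If `jq` is not available on the instance, roughly the same information can be pulled out with a few lines of Python. This is a sketch only: it assumes the `docker` CLI is on the PATH, and the helper names `read_health` and `summarize_health` are invented for this example.

```python
import json
import subprocess
from typing import Optional


def read_health(container_id: str) -> Optional[dict]:
    """Return .State.Health for a container by shelling out to docker inspect."""
    result = subprocess.run(
        ["docker", "inspect", container_id, "--format", "{{json .State.Health}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Containers without a health check produce JSON null, which loads as None.
    return json.loads(result.stdout)


def summarize_health(health: Optional[dict]) -> str:
    """Condense status, failing streak, and the most recent probe into one line."""
    health = health or {}
    log = health.get("Log") or []
    last = log[-1] if log else {}
    return (
        f"status={health.get('Status')} "
        f"failing_streak={health.get('FailingStreak')} "
        f"last_exit={last.get('ExitCode')} "
        f"last_output={last.get('Output', '').strip()!r}"
    )
```

On the container instance this could be used as `print(summarize_health(read_health("284ce427a3fd")))`. The last probe's output is usually the most useful part, since it carries the exec error.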

Author: norbitheeviljester

Updated on September 18, 2022

Comments

  • norbitheeviljester
    norbitheeviljester over 1 year

    I'm struggling to set up the correct HEALTHCHECK for a container inside a task definition in Amazon ECS.

    I've tried multiple solutions and can't get the simplest "always true" healthcheck to actually work.

    My Healthcheck looks like this:

    [ "CMD-SHELL", "echo"]
    

    which, to my understanding, should always produce a healthy container, but I always get an UNHEALTHY status. When I run docker inspect on the unhealthy container, I get the following:

    [ec2-user@ip-10-0-0-77 ~]$ docker inspect 8f14979ae4eb
    [
        {
            "Id": "8f14979ae4eb4e16ec26a4ac886d29b29f5666e5f00d41c56d25f5efe0c7d57e",
            "Created": "2018-05-15T08:55:50.399791936Z",
            "Path": "/bin/sh",
            "Args": [
                "-c",
                "echo \"The application will start in ${JHIPSTER_SLEEP}s...\" &&     sleep ${JHIPSTER_SLEEP} &&     java ${JAVA_OPTS} -Djava.security.egd=file:/dev/./urandom -jar /app.war"
            ],
            "State": {
                "Status": "running",
                "Running": true,
                "Paused": false,
                "Restarting": false,
                "OOMKilled": false,
                "Dead": false,
                "Pid": 783,
                "ExitCode": 0,
                "Error": "",
                "StartedAt": "2018-05-15T08:55:51.049068973Z",
                "FinishedAt": "0001-01-01T00:00:00Z",
                "Health": {
                    "Status": "starting",
                    "FailingStreak": 2,
                    "Log": [
                        {
                            "Start": "2018-05-15T09:00:51.049533205Z",
                            "End": "2018-05-15T09:00:51.197542821Z",
                            "ExitCode": -1,
                            "Output": "OCI runtime exec failed: exec failed: container_linux.go:348: starting container process caused \"exec: \\\"[ \\\\\\\"CMD-SHELL\\\\\\\"\\\": executable file not found in $PATH\": unknown"
                        },
                        {
                            "Start": "2018-05-15T09:05:51.202360089Z",
                            "End": "2018-05-15T09:05:51.296293315Z",
                            "ExitCode": -1,
                            "Output": "OCI runtime exec failed: exec failed: container_linux.go:348: starting container process caused \"exec: \\\"[ \\\\\\\"CMD-SHELL\\\\\\\"\\\": executable file not found in $PATH\": unknown"
                        }
                    ]
                }
            },
            "Image": "sha256:72cafeeceda0db9170eebb0992c98afaaaf7d2f744a328bd8ceb18804ea0c941",
            "ResolvConfPath": "/var/lib/docker/containers/8f14979ae4eb4e16ec26a4ac886d29b29f5666e5f00d41c56d25f5efe0c7d57e/resolv.conf",
            "HostnamePath": "/var/lib/docker/containers/8f14979ae4eb4e16ec26a4ac886d29b29f5666e5f00d41c56d25f5efe0c7d57e/hostname",
            "HostsPath": "/var/lib/docker/containers/8f14979ae4eb4e16ec26a4ac886d29b29f5666e5f00d41c56d25f5efe0c7d57e/hosts",
            "LogPath": "/var/lib/docker/containers/8f14979ae4eb4e16ec26a4ac886d29b29f5666e5f00d41c56d25f5efe0c7d57e/8f14979ae4eb4e16ec26a4ac886d29b29f5666e5f00d41c56d25f5efe0c7d57e-json.log",
            "Name": "/ecs-hit-backend-task-21-hit-backend-container-f0afe6f9a694b6fcfb01",
            "RestartCount": 0,
            "Driver": "devicemapper",
            "Platform": "linux",
            "MountLabel": "",
            "ProcessLabel": "",
            "AppArmorProfile": "",
            "ExecIDs": null,
            "HostConfig": {
                "Binds": null,
                "ContainerIDFile": "",
                "LogConfig": {
                    "Type": "json-file",
                    "Config": {}
                },
                "NetworkMode": "default",
                "PortBindings": {
                    "8080/tcp": [
                        {
                            "HostIp": "",
                            "HostPort": "443"
                        }
                    ]
                },
                "RestartPolicy": {
                    "Name": "",
                    "MaximumRetryCount": 0
                },
                "AutoRemove": false,
                "VolumeDriver": "",
                "VolumesFrom": null,
                "CapAdd": null,
                "CapDrop": null,
                "Dns": null,
                "DnsOptions": null,
                "DnsSearch": null,
                "ExtraHosts": null,
                "GroupAdd": null,
                "IpcMode": "shareable",
                "Cgroup": "",
                "Links": null,
                "OomScoreAdj": 0,
                "PidMode": "",
                "Privileged": false,
                "PublishAllPorts": false,
                "ReadonlyRootfs": false,
                "SecurityOpt": null,
                "UTSMode": "",
                "UsernsMode": "",
                "ShmSize": 67108864,
                "Runtime": "runc",
                "ConsoleSize": [
                    0,
                    0
                ],
                "Isolation": "",
                "CpuShares": 2,
                "Memory": 1073741824,
                "NanoCpus": 0,
                "CgroupParent": "/ecs/4c81e8c4-de44-4c20-ab37-f8360b8ce639",
                "BlkioWeight": 0,
                "BlkioWeightDevice": null,
                "BlkioDeviceReadBps": null,
                "BlkioDeviceWriteBps": null,
                "BlkioDeviceReadIOps": null,
                "BlkioDeviceWriteIOps": null,
                "CpuPeriod": 0,
                "CpuQuota": 0,
                "CpuRealtimePeriod": 0,
                "CpuRealtimeRuntime": 0,
                "CpusetCpus": "",
                "CpusetMems": "",
                "Devices": null,
                "DeviceCgroupRules": null,
                "DiskQuota": 0,
                "KernelMemory": 0,
                "MemoryReservation": 0,
                "MemorySwap": 2147483648,
                "MemorySwappiness": 0,
                "OomKillDisable": false,
                "PidsLimit": 0,
                "Ulimits": [
                    {
                        "Name": "nofile",
                        "Hard": 4096,
                        "Soft": 1024
                    }
                ],
                "CpuCount": 0,
                "CpuPercent": 0,
                "IOMaximumIOps": 0,
                "IOMaximumBandwidth": 0
            },
            "GraphDriver": {
                "Data": {
                    "DeviceId": "4392",
                    "DeviceName": "docker-202:1-263287-a309a6780a0a4e0f2da29705109433ba9be5b7a602e4198b42a83e84e8aa8cc8",
                    "DeviceSize": "10737418240"
                },
                "Name": "devicemapper"
            },
            "Mounts": [],
            "Config": {
                "Hostname": "8f14979ae4eb",
                "Domainname": "",
                "User": "",
                "AttachStdin": false,
                "AttachStdout": false,
                "AttachStderr": false,
                "ExposedPorts": {
                    "8080/tcp": {}
                },
                "Tty": false,
                "OpenStdin": false,
                "StdinOnce": false,
                "Env": [
                    "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/lib/jvm/java-1.8-openjdk/jre/bin:/usr/lib/jvm/java-1.8-openjdk/bin",
                    "LANG=C.UTF-8",
                    "JAVA_HOME=/usr/lib/jvm/java-1.8-openjdk/jre",
                    "JAVA_VERSION=8u151",
                    "JAVA_ALPINE_VERSION=8.151.12-r0",
                    "SPRING_OUTPUT_ANSI_ENABLED=ALWAYS",
                    "JHIPSTER_SLEEP=0",
                    "JAVA_OPTS="
                ],
                "Cmd": [
                    "/bin/sh",
                    "-c",
                    "echo \"The application will start in ${JHIPSTER_SLEEP}s...\" &&     sleep ${JHIPSTER_SLEEP} &&     java ${JAVA_OPTS} -Djava.security.egd=file:/dev/./urandom -jar /app.war"
                ],
                "Healthcheck": {
                    "Test": [
                        "CMD",
                        "[ \"CMD-SHELL\"",
                        "\"echo\"]"
                    ],
                    "Interval": 300000000000,
                    "Timeout": 60000000000,
                    "StartPeriod": 300000000000,
                    "Retries": 10
                },
                "ArgsEscaped": true,
                "Image": "401402660647.dkr.ecr.eu-central-1.amazonaws.com/hit_backend:b6f196e.dirty",
                "Volumes": null,
                "WorkingDir": "",
                "Entrypoint": null,
                "OnBuild": null,
                "Labels": {
                    "com.amazonaws.ecs.cluster": "hit-ecs-cluster",
                    "com.amazonaws.ecs.container-name": "hit-backend-container",
                    "com.amazonaws.ecs.task-arn": "arn:aws:ecs:eu-central-1:401402660647:task/4c81e8c4-de44-4c20-ab37-f8360b8ce639",
                    "com.amazonaws.ecs.task-definition-family": "hit-backend-task",
                    "com.amazonaws.ecs.task-definition-version": "21"
                }
            },
            "NetworkSettings": {
                "Bridge": "",
                "SandboxID": "e3d450f7592c56382d6103b310f2c45c87877c078cd489a897be6ddc45ff77dc",
                "HairpinMode": false,
                "LinkLocalIPv6Address": "",
                "LinkLocalIPv6PrefixLen": 0,
                "Ports": {
                    "8080/tcp": [
                        {
                            "HostIp": "0.0.0.0",
                            "HostPort": "443"
                        }
                    ]
                },
                "SandboxKey": "/var/run/docker/netns/e3d450f7592c",
                "SecondaryIPAddresses": null,
                "SecondaryIPv6Addresses": null,
                "EndpointID": "200429fb30b4ea42dd9bd5250a70675435b4020859ef8c98ec60dac31398b83d",
                "Gateway": "172.17.0.1",
                "GlobalIPv6Address": "",
                "GlobalIPv6PrefixLen": 0,
                "IPAddress": "172.17.0.2",
                "IPPrefixLen": 16,
                "IPv6Gateway": "",
                "MacAddress": "02:42:ac:11:00:02",
                "Networks": {
                    "bridge": {
                        "IPAMConfig": null,
                        "Links": null,
                        "Aliases": null,
                        "NetworkID": "a1f7250c4c4be44b8f10385be14d47049418967f7cccaa322852daf0954cae73",
                        "EndpointID": "200429fb30b4ea42dd9bd5250a70675435b4020859ef8c98ec60dac31398b83d",
                        "Gateway": "172.17.0.1",
                        "IPAddress": "172.17.0.2",
                        "IPPrefixLen": 16,
                        "IPv6Gateway": "",
                        "GlobalIPv6Address": "",
                        "GlobalIPv6PrefixLen": 0,
                        "MacAddress": "02:42:ac:11:00:02",
                        "DriverOpts": null
                    }
                }
            }
        }
    ]
    

    The problem seems to be in the following section:

    "Health": {
                    "Status": "starting",
                    "FailingStreak": 2,
                    "Log": [
                        {
                            "Start": "2018-05-15T09:00:51.049533205Z",
                            "End": "2018-05-15T09:00:51.197542821Z",
                            "ExitCode": -1,
                            "Output": "OCI runtime exec failed: exec failed: container_linux.go:348: starting container process caused \"exec: \\\"[ \\\\\\\"CMD-SHELL\\\\\\\"\\\": executable file not found in $PATH\": unknown"
                        },
                        {
                            "Start": "2018-05-15T09:05:51.202360089Z",
                            "End": "2018-05-15T09:05:51.296293315Z",
                            "ExitCode": -1,
                            "Output": "OCI runtime exec failed: exec failed: container_linux.go:348: starting container process caused \"exec: \\\"[ \\\\\\\"CMD-SHELL\\\\\\\"\\\": executable file not found in $PATH\": unknown"
                        }
                    ]
                }
    

    I have tried various configurations of the healthcheck (CMD instead of CMD-SHELL, /bin/sh, just "echo", etc.), but nothing seems to work.

    What is the minimal always true healthcheck for Amazon ECS?

  • c4f4t0r
    c4f4t0r about 5 years
    use kubernetes :)
  • rodrigo-silveira
    rodrigo-silveira about 5 years
    Thank you. Spent a good 6 hours yelling at my computer trying to figure this out. How would you express CMD python /app/healthcheck.py || exit 1? Specifically, how do I express the || exit 1 part? Or is this not needed?
  • Martti Laine
    Martti Laine over 3 years
    Might be obvious, but could save someone some debugging time: make sure your container has curl included. E.g. alpine images do not by default.
  • paultop6
    paultop6 over 2 years
    If possible, I would change the /app/healthcheck.py to return 0 status on success, or 1 status on failure, using sys.exit(status). Would also like to thank norbitheeviljester, been running around in circles for hours with this one.
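
Following the suggestion above, here is a minimal sketch of what `/app/healthcheck.py` might look like when it sets its own exit status. Docker treats exit status 0 as healthy and 1 as unhealthy, so a script that already exits this way should not need a trailing `|| exit 1`. The URL passed on the command line, and the idea of probing an HTTP endpoint at all, are assumptions made for illustration.

```python
import sys
import urllib.request


def check(url: str, timeout: float = 5.0) -> int:
    """Return 0 if the URL answers with a 2xx status, otherwise 1."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 0 if 200 <= response.status < 300 else 1
    except Exception:
        # Connection errors, timeouts, HTTP 4xx/5xx responses, and
        # malformed URLs all count as unhealthy.
        return 1


if __name__ == "__main__" and len(sys.argv) > 1:
    # The exit status is what Docker reads: 0 is healthy, 1 is unhealthy.
    sys.exit(check(sys.argv[1]))
```

Using the comma-delimited pattern from Solution 2, the console entry would then presumably look like `CMD,python,/app/healthcheck.py,http://localhost:8080/` (the port and path are hypothetical). Without a URL argument the guard is skipped and the script exits 0, so the health check command has to pass one.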