Create AMI image as part of a cloudformation stack

amazon-ec2 amazon-ami amazon-cloudformation

22,338

Solution 1

Yes, you can create an AMI from an EC2 instance within a CloudFormation template by implementing a Custom Resource that calls the CreateImage API on create (and calls the DeregisterImage and DeleteSnapshot APIs on delete).

Since AMIs can sometimes take a long time to create, a Lambda-backed Custom Resource will need to re-invoke itself if the wait has not completed before the Lambda function times out.

Here's a complete example:

Description: Create an AMI from an EC2 instance.
Parameters:
  ImageId:
    Description: Image ID for base EC2 instance.
    Type: AWS::EC2::Image::Id
    # amzn-ami-hvm-2016.09.1.20161221-x86_64-gp2
    Default: ami-9be6f38c
  InstanceType:
    Description: Instance type to launch EC2 instances.
    Type: String
    Default: m3.medium
    AllowedValues: [ m3.medium, m3.large, m3.xlarge, m3.2xlarge ]
Resources:
  # Completes when the instance is fully provisioned and ready for AMI creation.
  AMICreate:
    Type: AWS::CloudFormation::WaitCondition
    CreationPolicy:
      ResourceSignal:
        Timeout: PT10M
  Instance:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: !Ref ImageId
      InstanceType: !Ref InstanceType
      UserData:
        "Fn::Base64": !Sub |
          #!/bin/bash -x
          yum -y install mysql # provisioning example
          /opt/aws/bin/cfn-signal \
            -e $? \
            --stack ${AWS::StackName} \
            --region ${AWS::Region} \
            --resource AMICreate
          shutdown -h now
  AMI:
    Type: Custom::AMI
    DependsOn: AMICreate
    Properties:
      ServiceToken: !GetAtt AMIFunction.Arn
      InstanceId: !Ref Instance
  AMIFunction:
    Type: AWS::Lambda::Function
    Properties:
      Handler: index.handler
      Role: !GetAtt LambdaExecutionRole.Arn
      Code:
        ZipFile: !Sub |
          var response = require('cfn-response');
          var AWS = require('aws-sdk');
          exports.handler = function(event, context) {
            console.log("Request received:\n", JSON.stringify(event));
            var physicalId = event.PhysicalResourceId;
            function success(data) {
              return response.send(event, context, response.SUCCESS, data, physicalId);
            }
            function failed(e) {
              return response.send(event, context, response.FAILED, e, physicalId);
            }
            // Call ec2.waitFor, continuing if not finished before Lambda function timeout.
            function wait(waiter) {
              console.log("Waiting: ", JSON.stringify(waiter));
              event.waiter = waiter;
              event.PhysicalResourceId = physicalId;
              var request = ec2.waitFor(waiter.state, waiter.params);
              setTimeout(()=>{
                request.abort();
                console.log("Timeout reached, continuing function. Params:\n", JSON.stringify(event));
                var lambda = new AWS.Lambda();
                lambda.invoke({
                  FunctionName: context.invokedFunctionArn,
                  InvocationType: 'Event',
                  Payload: JSON.stringify(event)
                }).promise().then((data)=>context.done()).catch((err)=>context.fail(err));
              }, context.getRemainingTimeInMillis() - 5000);
              return request.promise().catch((err)=>
                (err.code == 'RequestAbortedError') ?
                  new Promise(()=>context.done()) :
                  Promise.reject(err)
              );
            }
            var ec2 = new AWS.EC2(),
                instanceId = event.ResourceProperties.InstanceId;
            if (event.waiter) {
              wait(event.waiter).then((data)=>success({})).catch((err)=>failed(err));
            } else if (event.RequestType == 'Create' || event.RequestType == 'Update') {
              if (!instanceId) { failed('InstanceID required'); }
              ec2.waitFor('instanceStopped', {InstanceIds: [instanceId]}).promise()
              .then((data)=>
                ec2.createImage({
                  InstanceId: instanceId,
                  Name: event.RequestId
                }).promise()
              ).then((data)=>
                wait({
                  state: 'imageAvailable',
                  params: {ImageIds: [physicalId = data.ImageId]}
                })
              ).then((data)=>success({})).catch((err)=>failed(err));
            } else if (event.RequestType == 'Delete') {
              if (physicalId.indexOf('ami-') !== 0) { return success({});}
              ec2.describeImages({ImageIds: [physicalId]}).promise()
              .then((data)=>
                (data.Images.length == 0) ? success({}) :
                ec2.deregisterImage({ImageId: physicalId}).promise()
              ).then((data)=>
                ec2.describeSnapshots({Filters: [{
                  Name: 'description',
                  Values: ["*" + physicalId + "*"]
                }]}).promise()
              ).then((data)=>
                (data.Snapshots.length === 0) ? success({}) :
                ec2.deleteSnapshot({SnapshotId: data.Snapshots[0].SnapshotId}).promise()
              ).then((data)=>success({})).catch((err)=>failed(err));
            }
          };
      Runtime: nodejs4.3
      Timeout: 300
  LambdaExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal: {Service: [lambda.amazonaws.com]}
          Action: ['sts:AssumeRole']
      Path: /
      ManagedPolicyArns:
      - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
      - arn:aws:iam::aws:policy/service-role/AWSLambdaRole
      Policies:
      - PolicyName: EC2Policy
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
            - Effect: Allow
              Action:
              - 'ec2:DescribeInstances'
              - 'ec2:DescribeImages'
              - 'ec2:CreateImage'
              - 'ec2:DeregisterImage'
              - 'ec2:DescribeSnapshots'
              - 'ec2:DeleteSnapshot'
              Resource: ['*']
Outputs:
  AMI:
    Value: !Ref AMI

Solution 2

For what it's worth, here's Python variant of wjordan's AMIFunction definition in the original answer. All other resources in the original yaml remain unchanged:

AMIFunction:
  Type: AWS::Lambda::Function
  Properties:
    Handler: index.handler
    Role: !GetAtt LambdaExecutionRole.Arn
    Code:
      ZipFile: !Sub |
        import logging
        import cfnresponse
        import json
        import boto3
        from threading import Timer
        from botocore.exceptions import WaiterError

        logger = logging.getLogger()
        logger.setLevel(logging.INFO)

        def handler(event, context):

          ec2 = boto3.resource('ec2')
          physicalId = event['PhysicalResourceId'] if 'PhysicalResourceId' in event else None

          def success(data={}):
            cfnresponse.send(event, context, cfnresponse.SUCCESS, data, physicalId)

          def failed(e):
            cfnresponse.send(event, context, cfnresponse.FAILED, str(e), physicalId)

          logger.info('Request received: %s\n' % json.dumps(event))

          try:
            instanceId = event['ResourceProperties']['InstanceId']
            if (not instanceId):
              raise 'InstanceID required'

            if not 'RequestType' in event:
              success({'Data': 'Unhandled request type'})
              return

            if event['RequestType'] == 'Delete':
              if (not physicalId.startswith('ami-')):
                raise 'Unknown PhysicalId: %s' % physicalId

              ec2client = boto3.client('ec2')
              images = ec2client.describe_images(ImageIds=[physicalId])
              for image in images['Images']:
                ec2.Image(image['ImageId']).deregister()
                snapshots = ([bdm['Ebs']['SnapshotId'] 
                              for bdm in image['BlockDeviceMappings'] 
                              if 'Ebs' in bdm and 'SnapshotId' in bdm['Ebs']])
                for snapshot in snapshots:
                  ec2.Snapshot(snapshot).delete()

              success({'Data': 'OK'})
            elif event['RequestType'] in set(['Create', 'Update']):
              if not physicalId:  # AMI creation has not been requested yet
                instance = ec2.Instance(instanceId)
                instance.wait_until_stopped()

                image = instance.create_image(Name="Automatic from CloudFormation stack ${AWS::StackName}")

                physicalId = image.image_id
              else:
                logger.info('Continuing in awaiting image available: %s\n' % physicalId)

              ec2client = boto3.client('ec2')
              waiter = ec2client.get_waiter('image_available')

              try:
                waiter.wait(ImageIds=[physicalId], WaiterConfig={'Delay': 30, 'MaxAttempts': 6})
              except WaiterError as e:
                # Request the same event but set PhysicalResourceId so that the AMI is not created again
                event['PhysicalResourceId'] = physicalId
                logger.info('Timeout reached, continuing function: %s\n' % json.dumps(event))
                lambda_client = boto3.client('lambda')
                lambda_client.invoke(FunctionName=context.invoked_function_arn, 
                                      InvocationType='Event',
                                      Payload=json.dumps(event))
                return

              success({'Data': 'OK'})
            else:
              success({'Data': 'OK'})
          except Exception as e:
            failed(e)
    Runtime: python2.7
    Timeout: 300

Solution 3

No.
I suppose Yes. Once the stack you can use the "Update Stack" operation. You need to provide the full JSON template of the initial stack + your changes in that same file (Changed AMI) I would run this in a test environment first (not production), as I'm not really sure what the operation does to the existing instances.

Why not create an AMI initially outside cloudformation and then use that AMI in your final cloudformation template ?

Another option is to write some automation to create two cloudformation stacks and you can delete the first one once the AMI that you've created is finalized.

Solution 4

While @wjdordan's solution is good for simple use cases, updating the User Data will not update the AMI.

(DISCLAIMER: I am the original author) cloudformation-ami aims at allowing you to declare AMIs in CloudFormation that can be reliably created, updated and deleted. Using cloudformation-ami You can declare custom AMIs like this:

MyAMI:
  Type: Custom::AMI
  Properties:
    ServiceToken: !ImportValue AMILambdaFunctionArn
    Image:
      Name: my-image
      Description: some description for the image
    TemplateInstance:
      ImageId: ami-467ca739
      IamInstanceProfile:
        Arn: arn:aws:iam::1234567890:instance-profile/MyProfile-ASDNSDLKJ
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash -x
          yum -y install mysql # provisioning example
          # Signal that the instance is ready
          INSTANCE_ID=`wget -q -O - http://169.254.169.254/latest/meta-data/instance-id`
          aws ec2 create-tags --resources $INSTANCE_ID --tags Key=UserDataFinished,Value=true --region ${AWS::Region}
      KeyName: my-key
      InstanceType: t2.nano
      SecurityGroupIds:
      - sg-d7bf78b0
      SubnetId: subnet-ba03aa91
      BlockDeviceMappings:
      - DeviceName: "/dev/xvda"
        Ebs:
          VolumeSize: '10'
          VolumeType: gp2

View more solutions

22,338

Author by

Admin

Updated on June 15, 2020

Comments

Admin almost 4 years

I want to create an EC2 cloudformation stack which basically can be described in the following steps:

1.- Launch instance

2.- Provision the instance

3.- Stop the instance and create an AMI image out of it

4.- Create an autoscaling group with the created AMI image as source to launch new instances.

Basically I can do 1 and 2 in one cloudformation template and 4 in a second template. What I don't seem able to do is to create an AMI image from an instance inside a cloudformation template, which basically generates the problem of having to manually remove the AMI if I want to remove the stack.

That being said, my questions are:

1.- Is there a way to create an AMI image from an instance INSIDE the cloudformation template?

2.- If the answer to 1 is no, is there a way to add an AMI image (or any other resource for that matter) to make it part of a completed stack?

EDIT:

Just to clarify, I've already solved the problem of creating the AMI and using it in a cloudformation template, I just can't create the AMI INSIDE the cloudformation template or add it somehow to the created stack.

As I commented on Rico's answer, what I do now is use an ansible playbook which basically has 3 steps:

1.- Create a base instance with a cloudformation template

2.- Create, using ansible, an AMI of the instance created on step 1

3.- Create the rest of the stack (ELB, autoscaling groups, etc) with a second cloudformation template that updates the one created on step 1, and that uses the AMI created on step 2 to launch instances.

This is how I manage it now, but I wanted to know if there's any way to create an AMI INSIDE a cloudformation template or if it's possible to add the created AMI to the stack (something like telling the stack, "Hey, this belongs to you as well, so handle it").
Admin about 10 years

Rico, if I'm not mistaken (and as I'm doing it right now I don't think I am), you can modify a stack after creation, by updating it. The idea of creating the AMI outside cloudformation is how I'm handling it right now. Basically I use an ansible playbook with 3 steps: 1.- Create an instance with cloudformation 2.- Create an AMI of that instance with ansible 3.- Create the rest of the stack (updating the one created) using the AMI created with ansible My question actually points to making the AMI part of the stack or as part of the cloudformation steps. I'll update my question to clarify.
Rico about 10 years

@dibits Got ya. So I've changed my answer for number 2. Now that I remember there's an "Update Stack" operation. You need to provide the full JSON template of the initial stack + your changes in that same file.
Admin about 10 years

If I understand what you mean, then I would just be updating the AMI id in the cloudformation template, but I wouldn't be merging the AMI image into the stack. Maybe to clarify even further, since I'm doing all in one stack, I want to be able to delete that stack and have the AMI create deregistered automagically with the stack (as it happens with instances, elbs and so forth).
Rico about 10 years

I believe so. I haven't tried it myself. Would be curious to see what the result is.
wjordan almost 6 years

For what it's worth, it's actually relatively straightforward to extend my answer to update the AMI after updating the UserData, just by creating new WaitCondition and Custom::AMI resources (or renaming the Logical ID of the existing definitions, which will create a new one and destroy the old). For my own non-simple use-case, I wrap my CloudFormation template with an ERB template and include part of the commit hash in the Logical ID, e.g., AMICreate<%=commit%> and AMI<%=commit%>.
spg almost 6 years

Yes that makes sense! Thanks for the tip
nanestev over 5 years

Great template! Thanks for sharing.
Jaydeep Ranipa over 5 years

I however want to terminate the instance just after creating the image. In the above function I added line boto3.resource('ec2').Instance(instanceId).terminate() just before first success() call. But it gives an error "Invalid Response object: Value of property Data must be an object". Any idea?
Kaymaz about 5 years

From 2 years ago but I don't understand when the lambda is executed?
mibollma over 4 years

What is this about "physicalId.indexOf('ami-')...". Care to elaborate?
wjordan over 4 years

The resource's PhysicalResourceId is set to the ImageId returned by ec2.createImage once that operation completes, and all valid image IDs begin with ami- prefix. physicalId.indexOf('ami-') will equal 0 when the physical ID is set to a valid Image ID, which means the Image needs to be deleted (via ec2.deregisterImage) when the resource is deleted. If the physical resource was never set to a valid Image ID before the resource is deleted, the deregisterImage operation can be skipped.