AWS Rolling Replace Kubernetes Nodes without App errors/downtimes

We do a few things to keep the apps on our clusters from crashing during rolling instance replacements, so we can keep the overall noise and deployment errors low. I just wanted to write them up since some of them are not very intuitive and the combination of all of them makes for a really safe cluster updates.

  • Setup PodDisruptionBudget for all deployments so they stay available no matter what combination of nodes we take down
  • When  AWS AutoScalingGroup wants to update (before it takes down instances) we sends a autoscaling hook to SQS and drain the instance, then signal to continue (separate lille app we built but very simple overall)
  • When a new instances is booted, we taint it so no pod lands on a broken node (infrastructure pods with tolerations like CNI are booted quicker too since the node only has a few pods in to begin with)
  • Aws rollingupdate waits for the “all good” signal that we send when the node is finally ready and only then we untaint (nice side effect is also that nodes that never get ready stop the RollingUpdate and causes a Rollback)
  • We determine ready with custom plugins in node-problem-detector which has a few issues but overall works alright.

Parallelizing SparkleFormation execution

We generate each templates CloudFormation .json file and check them in to spot subtle diffs when refactoring. We also validate all configs on every PR. Doing this serially takes a lot of time, but it can be parallelized easily, while also avoiding ruby boot overhead.

This took us from ~5 minutes runtime to 30s, enjoy!

desc "Validate templates via cloudformation api"
task :validate do
  each_template do |template|
    execute_sfn_command :validate, template
  end
end

desc "Generates cloudformation json files"
task :generate, [:pattern] do |_t, args|
  pattern = /#{args[:pattern]}/ if args[:pattern]
  previous = Dir["generated/*.json"]

  used = each_template do |template|
    next if pattern && template !~ pattern
    output = execute_sfn_command :print, template

    generated = "generated/#{File.basename(template.sub('.rb', '.json'))}"
    File.write generated, output
    generated
  end

  (previous - used).each { |f| File.unlink(f) } unless pattern
end

...

require 'parallel'
require 'timeout'

def each_template(&block)
  # preload slow requires
  require 'bogo-cli'
  require 'sfn'

  # run in parallel, but isolated to avoid caches from being reused
  options = {in_processes: 10, progress: "Progress", isolation: true}
  Parallel.map(Dir["templates/**/*.rb"], options, &block)
end

private

def execute_sfn_command(command, template, *args)
  Timeout.timeout(20) do
    capture_stdout do
      Sfn::Command.const_get(command.capitalize).new({
        defaults: true, # do not ask questions about parameters
        file: template,
        retry: {type: :flat, interval: 1} # we will run into rate limits, ignore them quickly
      }, args).execute!
    end
  end
rescue StandardError
  # give users context when something failed
  warn "bundle exec sfn #{command} #{args.join(" ")} --defaults -f #{template}"
  raise
end

def capture_stdout
  old = $stdout
  $stdout = StringIO.new
  yield
  $stdout.string
ensure
  $stdout = old
end

AWS CloudFormation SNSTopic EventSourceMapping

Cloudformation refuses to set up an EventSourceMapping, but it works in the UI,
because under the hood the UI creates a push based setup, which can be re-created via
Cloudformation.

{
  "Resources": {
    "SNSTopic": {
      "Type": "AWS::SNS::Topic",
      "Properties": {
        "Subscription": [
          {
            "Endpoint": {
              "Fn::GetAtt": [
                "Function",
                "Arn"
              ]
            },
            "Protocol": "lambda"
          }
        ]
      }
    },
    "Function": {
      "Type": "AWS::Lambda::Function",
      "Properties": {
        "Handler": "index.handler",
        "Role": {
          "Fn::GetAtt": [
            "LambdaExecutionRole",
            "Arn"
          ]
        },
        "Code": {
          "ZipFile": {
            "Fn::Join": [
              "",
              [
                "exports.handler = function(event, context) {",
                "context.done(null,event)",
                "};"
              ]
            ]
          }
        },
        "Runtime": "nodejs",
        "Timeout": "25"
      }
    },
    "LambdaInvokePermission": {
      "Type": "AWS::Lambda::Permission",
      "Properties": {
        "FunctionName": {
          "Fn::GetAtt": [
            "Function",
            "Arn"
          ]
        },
        "Action": "lambda:InvokeFunction",
        "Principal": "sns.amazonaws.com",
        "SourceArn": {
          "Ref": "SNSTopic"
        }
      }
    }
  },
  "LambdaExecutionRole": {
    "Type": "AWS::IAM::Role",
    "Properties": {
      "AssumeRolePolicyDocument": {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Sid": "",
            "Effect": "Allow",
            "Principal": {
              "Service": "lambda.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
          }
        ]
      },
      "Policies": [
        {
          "PolicyName": "root",
          "PolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
              {
                "Effect": "Allow",
                "Action": [
                  "logs:*"
                ],
                "Resource": "arn:aws:logs:*:*:*"
              },
              {
                "Effect": "Allow",
                "Action": [
                  "sns:Subscribe"
                ],
                "Resource": [
                  "*"
                ]
              }
            ]
          }
        }
      ]
    }
  }
}

Sending configuration into a AWS Lambda created via Cloudformation

Lambdas can only have static code (see code upload via cloudformation), so passing in DynamoDB table names/SNS topic ARNs etc is not possible. But there is a neat workaround:

Make the lambda read the stacks output.

# my-stack.json
"Outputs": {
  "World": {
    "Value": {
      "Ref": "MySnsTopic"
    }
  },
  ....
}

var AWS = require('aws-sdk');
var stack = context.invokedFunctionArn.match(/:function:(.*)-.*-.*/)[1];

exports.handler = function(event, context) {
  var cf = new AWS.CloudFormation();
  cf.describeStacks({"StackName": stack}, function(err, data){
    if (err) context.done(err, 'Error!');
    else {
      var config = {}
      data.Stacks[0].Outputs.map(function(out){ config[out.OutputKey] = out.OutputValue });
      context.succeed('hello ' + config.World)
    }
  })
};

# output
"hello arn:aws:sns:ap-northeast-1:8132302344234:my-stack-MySnSTopic-211Z1K3GAGK9"