AWS CodePipeline: Alert on Stage Failure

We’ve been using AWS CodePipeline for some time now, and for the most part it’s a great managed service: easy to get started with and pretty simple to use.

That being said, it lacks some features out of the box that most CI/CD systems have ready for you. The one I’ll be tackling today is alerting on a stage failure.

Out of the box, CodePipeline won’t alert you when there’s a failure at a stage. Unless you go look at it in the console, you won’t know that anything is broken. For example, when I started working on this blog entry, I checked one of the pipelines that delivers to our test environment and found it in a failed state.

In this case the failure is because our OpsWorks stacks are set to turn off test instances outside of business hours, but for almost any other failure I would want to alert the team responsible for the change that failed.

For a solution, we’ll use these resources:

  • AWS Lambda
  • Boto3
  • AWS SNS Topics
  • CloudFormation

First we’ll need a Lambda function that can list the pipelines in our account, scan their stages, detect failures, and produce alerts. Below is a basic example of what we’re using. I’m far from a Python expert, so I understand there are improvements that could be made, particularly around error handling.
import boto3
import logging
import os

def lambda_handler(event, context):
    # Get a CloudWatch logger
    logger = logging.getLogger('mvp-alert-on-cp-failure')
    logger.setLevel(logging.DEBUG)

    sns_topic_arn = os.environ['TOPIC_ARN']

    # Obtain boto3 clients
    logger.info('Getting boto3 clients')
    code_pipeline_client = boto3.client('codepipeline')
    sns_client = boto3.client('sns')

    # Note: list_pipelines is paginated; accounts with many pipelines
    # will need to follow nextToken to see them all.
    logger.debug('Getting pipelines')
    for pipeline in code_pipeline_client.list_pipelines()['pipelines']:
        logger.debug('Checking pipeline ' + pipeline['name'] + ' for failures')
        for stage in code_pipeline_client.get_pipeline_state(name=pipeline['name'])['stageStates']:
            logger.debug('Checking stage ' + stage['stageName'] + ' for failures')
            if 'latestExecution' in stage and stage['latestExecution']['status'] == 'Failed':
                logger.debug('Stage failed! Sending SNS notification to ' + sns_topic_arn)
                failed_actions = []
                for action in stage['actionStates']:
                    logger.debug('Checking action ' + action['actionName'] + ' for failures')
                    if 'latestExecution' in action and action['latestExecution']['status'] == 'Failed':
                        logger.debug('Action failed!')
                        failed_actions.append(action['actionName'])
                logger.debug('Publishing failure alert: ' + pipeline['name'] + '|' + stage['stageName'] + '|' + ', '.join(failed_actions))
                alert_subject = 'CodePipeline failure in ' + pipeline['name'] + ' at stage ' + stage['stageName']
                alert_message = alert_subject + '. Failed actions: ' + ', '.join(failed_actions)
                logger.debug('Sending SNS notification')
                sns_client.publish(TopicArn=sns_topic_arn, Subject=alert_subject, Message=alert_message)

    return "And we're done!"

If you’re looking closely, you’re probably wondering about the environment variable named “TOPIC_ARN”, which leads us to the next piece: a CloudFormation template to create this Lambda function.

The CloudFormation template needs to do a few things:

  1. Create the Lambda function. I’ve chosen to do this using the AWS Serverless Application Model (SAM).
  2. Create an IAM role for the Lambda function to execute under.
  3. Create IAM policies that give the IAM role read access to your pipelines and publish access to your SNS topic.
  4. Create an SNS topic subscribing the people you want to get the email.

The only really newfangled CloudFormation feature I’m using here is AWS SAM; the rest have existed for quite a while. In my opinion, one of the main ideas behind AWS SAM is packaging your entire serverless function in a single CloudFormation template, so the example below does all four of these steps.
#############################################
### Lambda function to alert on pipeline failures
#############################################

LambdaAlertCPTestFail:
  Type: AWS::Serverless::Function
  Properties:
    Handler: mvp-alert-on-cp-failure.lambda_handler
    Role: !GetAtt IAMRoleAlertOnCPTestFailure.Arn
    Runtime: python2.7
    Timeout: 300
    Events:
      CheckEvery30Minutes:
        Type: Schedule
        Properties:
          Schedule: cron(0/30 12-23 ? * MON-FRI *)
    Environment:
      Variables:
        STAGE_NAME: Test
        TOPIC_ARN: !Ref CodePipelineTestStageFailureTopic
CodePipelineTestStageFailureTopic:
  Type: "AWS::SNS::Topic"
  Properties:
    DisplayName: MvpPipelineFailure
    Subscription:
      -
        Endpoint: 'pipelineCurator@example.com'
        Protocol: 'email'
    TopicName: MvpPipelineFailure
IAMPolicyPublishToTestFailureTopic:
  Type: "AWS::IAM::Policy"
  DependsOn: IAMRoleAlertOnCPTestFailure
  Properties:
    PolicyName: !Sub "Role=AlertOnCPTestFailure,Env=${AccountParameter},Service=SNS,Rights=Publish"
    PolicyDocument:
      Version: "2012-10-17"
      Statement:
        -
          Effect: "Allow"
          Action:
            - "sns:Publish"
          Resource:
            - !Ref CodePipelineTestStageFailureTopic
    Roles:
      - !Ref IAMRoleAlertOnCPTestFailure
IAMPolicyGetPipelineStatus:
  Type: "AWS::IAM::Policy"
  DependsOn: IAMRoleAlertOnCPTestFailure
  Properties:
    PolicyName: !Sub "Role=AlertOnCPTestFailure,Env=${AccountParameter},Service=CodePipeline,Rights=R"
    PolicyDocument:
      Version: "2012-10-17"
      Statement:
        -
          Effect: "Allow"
          Action:
            - "codepipeline:GetPipeline"
            - "codepipeline:GetPipelineState"
            - "codepipeline:ListPipelines"
          Resource:
            - "*"
    Roles:
      - !Ref IAMRoleAlertOnCPTestFailure
IAMRoleAlertOnCPTestFailure:
  Type: "AWS::IAM::Role"
  # Note: for the function's own logging to reach CloudWatch Logs, you'll
  # likely also want to attach the AWSLambdaBasicExecutionRole managed policy.
  Properties:
    RoleName: !Sub "Role=AlertOnCPTestFailure,Env=${AccountParameter},Service=Lambda"
    AssumeRolePolicyDocument:
      Version: "2012-10-17"
      Statement:
        -
          Effect: "Allow"
          Principal:
            Service:
              - "lambda.amazonaws.com"
          Action:
            - "sts:AssumeRole"
    Path: "/"

#############################################
### End of pipeline failure alerting lambda function
#############################################

And that’s about it. A couple of notes on the CloudFormation template:

Alert Frequency

I’m using a cron expression as my schedule, currently set to fire every half hour during business hours (CloudWatch schedule expressions are in UTC, hence the 12-23 hour range), because we don’t have overnight staff who would be able to look at pipeline failures. You can easily increase the frequency with something like

cron(0/5 12-23 ? * MON-FRI *)

Lambda Environment Variables

One of the announcements from re:Invent I was most excited about was AWS Lambda environment variables. This is a pretty magical feature that lets you pass values into your Lambda functions. In this case, I’m using it to pass the ARN of an SNS topic created in the same CloudFormation template into the Lambda function.

Long story short, that means we can create resources in AWS and pass references to them into code without needing a way to search for them or hard-coding their values into source.

      Environment:
        Variables:
          STAGE_NAME: Test
          TOPIC_ARN: !Ref CodePipelineTestStageFailureTopic
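
On the Python side, these show up as ordinary environment variables. Here’s a minimal sketch of reading them, assuming the variable names from the template above (the function earlier only uses TOPIC_ARN, so STAGE_NAME gets a fallback here):

import os

# Values injected by the CloudFormation Environment block above
sns_topic_arn = os.environ['TOPIC_ARN']            # fail fast if the template didn't set it
stage_name = os.environ.get('STAGE_NAME', 'Test')  # optional, with a fallback default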

Flowerboxes

The CFT this example comes from contains multiple pipeline-management functions, so the flowerboxes (“###############”) at the beginning and end of the Lambda function definition are our way of keeping the resources for each Lambda function separated.

SNS Notifications

When you create an SNS topic with an email subscription, the subscriber has to confirm it: they’ll get an email and have to click the confirmation link before they receive any notifications.
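
If you want to check who has actually confirmed, here’s a quick boto3 sketch (the topic ARN is a placeholder):

import boto3

sns = boto3.client('sns')

# Unconfirmed email subscriptions report 'PendingConfirmation'
# in place of a real subscription ARN.
topic_arn = 'arn:aws:sns:us-east-1:123456789012:MvpPipelineFailure'  # placeholder
for sub in sns.list_subscriptions_by_topic(TopicArn=topic_arn)['Subscriptions']:
    confirmed = sub['SubscriptionArn'] != 'PendingConfirmation'
    print(sub['Endpoint'] + ': ' + ('confirmed' if confirmed else 'pending confirmation'))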

Snippets

These are snippets I pulled out of our pipeline-management CloudFormation stack. Obviously you’ll have to put them into a CloudFormation template that references the SAM transform and has a valid header like the one below:
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  ......

Happy alerting!

Building CodePipelines with CloudFormation: What’s my configuration?

My company started using AWS CodePipeline as a somewhat reluctant PoC. It’s not a full-featured CI/CD service, but it is incredibly cost effective and easy to get started with, and Amazon’s recent addition of Lambda invoke actions makes it much more flexible.

We’ve been using CodePipeline for several months now, and with it starting to look like a longer-term solution for us, some of the AWS Console limitations are becoming prohibitive. For example, you can’t move an action around within a stage in the console; your only option is to delete the action and recreate it where you wanted it to be.

Fortunately, most of these struggles are solved by creating your pipelines in CloudFormation!

The AWS CloudFormation template reference will get you started with the basics, but it leaves a lot of question marks around the different action provider types and how to format stages and actions.

For example, we have an action that deploys a CloudFormation stack. I had written the action definition below:

            -
              Name: Deploy-Stack
              InputArtifacts:
                -
                  Name: our-cft
              ActionTypeId:
                Category: Deploy
                Owner: AWS
                Version: 1
                Provider: Cloudformation
              Configuration:
                ActionMode: CREATE_UPDATE
                StackName: Our-Staging-Stack
                TemplatePath: our-cft::our-template-name.yml
                TemplateConfiguration: our-cft::stackConfig.json
                Capabilities: CAPABILITY_NAMED_IAM
                RoleArn: arn:aws:iam::/RoleName
              RunOrder: 2

And I was getting a very vague error about the Cloudformation provider not being available. I opened a case with AWS Support, only to find out that it was a casing issue: the Provider needed to be “CloudFormation”.

That was helpful, but it still left me with a lot of questions, like: “What casing do I need to use for Codebuild? Codedeploy? What are the Configuration items for invoking Lambda functions?”

After spending some time searching the internet and finding painfully few examples, I turned to my trusty friend, the AWS PowerShell Tools. My guess was that we could figure out what values to put into our CloudFormation template by digging out what an existing pipeline has for configuration.

It turns out the PowerShell tools have a number of cmdlets for interacting with CodePipeline. The one we’ll use today is Get-CPPipeline.

To find out more I ran

get-help get-cppipeline

which shows the syntax and parameters of the command.

So let’s give it a shot!

I can already guess the stages property is what I’ll be interested in next, so let’s pull one out to see what it looks like.

I know that this stage deploys a CloudFormation stack, so let’s see if I can figure out what the configuration looks like from there. After some digging, I found the ActionTypeId attribute.

And there it is! “CloudFormation”, with all the correct capitalization. My guess is confirmed, and this seems like a valid plan. I threw together this PowerShell function to print out information about an existing pipeline. Obviously you’ll need to use “Set-AWSCredentials” to pick a profile or IAM account that has access.

function print-pipeline($pipeline) {
    write-host "Pipeline name: $($pipeline.Name)"
    foreach ($stage in $pipeline.Stages) {
        write-host "Stage: $($stage.Name)"
        foreach ($action in $stage.Actions) {
            write-host "Action: $($action.Name)"
            write-host "Action Type ID: $($action.ActionTypeId.Category) - $($action.ActionTypeId.Provider)"
            write-host "Configurations: "
            write-host $action.Configuration
            write-host ""
        }
        write-host ""
    }
}

set-awscredentials -profilename
set-defaultawsregion us-east-1

$pipeline = get-cppipeline -name

print-pipeline $pipeline

A couple of notes: the Configuration attribute looks like it prints as a list, but I wasn’t able to iterate over it. My guess is that AWS tooling for interacting with CodePipeline is still early enough that some of the needed functionality isn’t supported yet.
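
If PowerShell isn’t your thing, the same inspection trick works from Python with boto3. Here’s a rough sketch of an equivalent to the function above (the pipeline name at the bottom is a placeholder):

import boto3

def print_pipeline(name):
    # Pull the full pipeline definition, including each action's
    # ActionTypeId and Configuration, straight from the API.
    pipeline = boto3.client('codepipeline').get_pipeline(name=name)['pipeline']
    print('Pipeline name: ' + pipeline['name'])
    for stage in pipeline['stages']:
        print('Stage: ' + stage['name'])
        for action in stage['actions']:
            type_id = action['actionTypeId']
            print('  Action: ' + action['name'])
            print('  Action Type ID: ' + type_id['category'] + ' - ' + type_id['provider'])
            print('  Configuration: ' + str(action.get('configuration', {})))

print_pipeline('your-pipeline-name')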

Lastly, here are some snippets of the action types we use, pulled out of a couple of our templates.

Using S3 as an artifact source:

        -
          Name: Source
          Actions:
            -
              Name: your-action-name
              ActionTypeId:
                Category: Source
                Owner: AWS
                Version: 1
                Provider: S3
              OutputArtifacts:
                -
                  Name: your-output-name
              Configuration:
                S3Bucket: your-bucket-name
                S3ObjectKey: our/s3/key.zip
              RunOrder: 1

Using AWS CodeBuild

        -
          Name: Build
          Actions:
            -
              Name: YourActionName
              InputArtifacts:
                -
                  Name: your-source-code
              OutputArtifacts:
                -
                  Name: your-compiled-code
              ActionTypeId:
                Category: Build
                Owner: AWS
                Version: 1
                Provider: CodeBuild
              Configuration:
                ProjectName: your-code-build-project-name

Invoking a Lambda function

        -
          Name: Test
          Actions:
            -
              Name: your-action-name
              ActionTypeId:
                Category: Invoke
                Owner: AWS
                Version: 1
                Provider: Lambda
              Configuration:
                FunctionName: your-function-name
                UserParameters: parameters-to-pass-in
              RunOrder: 1

Running OpsWorks Cookbooks

            -
              Name: RunCookbooks
              InputArtifacts:
                -
                  Name: your-input-artifact-name
              ActionTypeId:
                Category: Deploy
                Owner: AWS
                Version: 1
                Provider: OpsWorks
              Configuration:
                DeploymentType: deploy_app
                StackId: your-stack-id
                AppId: your-app-id
              RunOrder: 3

Deploying an app with CodeDeploy

            -
              Name: your-action-name
              InputArtifacts:
                -
                  Name: your-input-artifact
              ActionTypeId:
                Category: Deploy
                Owner: AWS
                Version: 1
                Provider: CodeDeploy
              Configuration:
                ApplicationName: your-app-name
                DeploymentGroupName: your-group-name
              RunOrder: 4

And lastly, creating a manual approval step

            -
              Name: Approve
              ActionTypeId:
                Category: Approval
                Owner: AWS
                Version: 1
                Provider: Manual
              RunOrder: 1

So far I’ve been very pleased with the switch. We’ve been able to rearrange actions and stages pretty easily, and creating additional stages takes far fewer clicks.

AWS CLI: Table Output

Earlier today I stumbled on an AWS CLI feature I hadn’t noticed before: the “output” flag.

The default value of this flag is json, which is probably what you want most of the time; it makes it pretty easy to manipulate and pull out the data you need. But if you’re just looking for a visualization of your command’s output, you have the option of specifying “text” or “table”:

aws ec2 --output table describe-instances

Text will give you a flat dump of the information, while table will format it prettily with colors and blocks. I used the table output to find an instance I wanted to look at.

Keep in mind that table output adds a lot of extra characters, so it isn’t a great way to view large numbers of instances at the same time, but if you’re looking for readable output, this is handy.

I wouldn’t recommend it, but you can set this to be the default in your AWS credentials file at

C:\users\\.aws\credentials

It would look something like this, for the default profile:
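
[default]
output = table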

AWS PowerShell Tools Snippets: CodeBuild CloudWatch Logs

We’ve been using AWS CodeBuild to run Java Maven builds almost since it came out. It’s great when it works, but when Maven has a problem it can be pretty difficult to sift through the logs in the CloudWatch console.

Below is an AWS PowerShell Tools snippet that will pull down a CloudWatch log stream and dump it both to your console and to a file. There are a few parameters you’ll need to set:

  1. LogStreamName – the specific log stream you want to download; usually this correlates to a single CodeBuild run.
  2. LogGroupName – this will be the /aws/codebuild/<your project name> log group.
  3. OutFile location – this is where you want the log file dumped.
(Get-CWLLogEvents -LogStreamName logstreamname -LogGroupName /aws/codebuild/yourprojectname).Events | foreach {$message = "$($_.TimeStamp) - $($_.Message)"; write-host $message; $message | out-file logfilelocation.log -append}

This command can be used to pull down other log streams as well; just swap out the same parameters.
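
If you’d rather do the same thing from Python, here’s a rough boto3 sketch. The names are placeholders, and note that get_log_events returns a page at a time, so a very long build may require following nextForwardToken:

import boto3

logs = boto3.client('logs')

# Placeholders: swap in your project's log group and the stream for your build
log_group = '/aws/codebuild/yourprojectname'
log_stream = 'logstreamname'

response = logs.get_log_events(logGroupName=log_group, logStreamName=log_stream)
with open('logfile.log', 'a') as out_file:
    for event in response['events']:
        message = str(event['timestamp']) + ' - ' + event['message']
        print(message)
        out_file.write(message + '\n')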

Happy troubleshooting!