AWS S3 Lifecycle Policies – Prep for Deep Archive

AWS recently released a new S3 storage class called Deep Archive. It’s an archival data service with pretty low cost for data you need to hold onto, but don’t access very often.

Deep Archive comes in at $0.00099 per GB per month, roughly a quarter of the cost of Glacier, but you sacrifice the option to get your data back in minutes: retrievals take hours (standard retrievals complete within about 12 hours).

I work for a health care company, so we hold onto patient data for years. There are plenty of reasons we might need to retrieve data from years ago, but few of them would require a turnaround of less than several weeks. That makes Deep Archive a great fit for our long-term data retention.

Setting it up is as simple as pointing an existing lifecycle transition at Deep Archive, or creating a new S3 lifecycle transition that targets it.
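For example, here's a minimal sketch of a new rule that targets Deep Archive using the AWS PowerShell tools. The bucket name, rule id, and 365 day threshold are placeholder values, not settings from our environment:

# Build a transition that moves objects to Deep Archive after 365 days (illustrative values)
$transition = New-Object Amazon.S3.Model.LifecycleTransition
$transition.Days = 365
$transition.StorageClass = "DEEP_ARCHIVE"

# Wrap the transition in a rule; an empty prefix filter means the rule covers the whole bucket
$rule = New-Object Amazon.S3.Model.LifecycleRule
$rule.Id = "deep-archive-after-1-year"
$rule.Status = "Enabled"
$rule.Filter = @{ LifecycleFilterPredicate = [Amazon.S3.Model.LifecyclePrefixPredicate]@{ Prefix = "" } }
$rule.Transitions = @($transition)

# Write the configuration to a (hypothetical) bucket
Write-S3LifecycleConfiguration -BucketName "example-archive-bucket" -Configuration @{ Rules = @($rule) }

Updating an existing Glacier rule is the same shape: change the StorageClass on its transition and write the configuration back.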

We put together a quick script to find the lifecycle transition rules in our S3 buckets that already move data to Glacier:

$buckets = get-s3bucket;

# Iterate through buckets in the current account
foreach ($bucket in $buckets) {
    write-host -foregroundcolor Green "Bucket: $($bucket.BucketName)";

    # Get the lifecycle configuration for each bucket
    $lifecycle = Get-S3LifecycleConfiguration -BucketName $bucket.BucketName;

    # Print a warning if there are no lifecycles for this bucket
    if(!$lifecycle) {
        write-host -foregroundcolor Yellow "$($bucket.BucketName) has no lifecycle policies";
    } else {
        # Iterate the transition rules in this lifecycle
        foreach ($rule in $lifecycle.Rules) {
            write-host -foregroundcolor Magenta "$($rule.Id) with prefix: $($rule.Filter.LifecycleFilterPredicate.Prefix)";

            # Print a warning if there are no transitions
            if(!($rule.Transitions)) {
                write-host -foregroundcolor Yellow "No lifecycle transitions";
            }

            # Iterate the transitions and print the rules
            foreach ($transition in $rule.Transitions) {
                if($transition.StorageClass -eq "GLACIER") {
                    $color = "Yellow";
                } else {
                    $color = "White";
                }
                write-host -foregroundcolor $color "After $($transition.Days) days, transition to $($transition.StorageClass)";
            }
        }
    }
}

To run this script you’ll need the AWS PowerShell tools installed, an IAM account set up, and a default region initialized.
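If you haven’t set those up yet, the one-time setup looks roughly like this (the access key, secret key, and region below are placeholders):

# Install the AWS PowerShell module (one time)
Install-Module -Name AWSPowerShell -Scope CurrentUser

# Store an IAM user's access keys as the default credential profile (placeholder values)
Set-AWSCredential -AccessKey "AKIA..." -SecretKey "..." -StoreAs default

# Set a default region so the cmdlets know where to look
Set-DefaultAWSRegion -Region us-east-1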

When you run the script it will print out your current S3 buckets, the lifecycle rules, and the transitions in each of them, highlighting the transitions to Glacier in yellow.

Graph Your RI Commitment Over Time (subtitle: HOW LONG AM I PAYING FOR THIS?!?!?)

In my last post I talked about distributing your committed RI spend over time. The goal is to avoid buying too many 1 year RIs (front loading your spend) and missing out on the savings of committing to 3 years, while also not buying too many 3 year RIs (back loading your spend) and risking a bill you have to foot if your organization goes through major changes.

Our solution for balancing this is a short PowerShell script that graphs our RI commitment over time.

# Get the active RI entries from AWS
$ri_entries = Get-EC2ReservedInstance -filter @(@{Name="state";Value="active"});

# Array to hold the relevant RI data
$ri_data = @();

# Calculate the monthly cost for each RI (recurring hourly charges only, not upfront payments)
foreach ($ri_entry in $ri_entries) {
    $ri = @{};
    $hourly = $ri_entry.RecurringCharges.Amount;
    $monthly = $hourly * 24 * 30 * $ri_entry.InstanceCount;
    $ri.monthly = $monthly;
    $ri.End = $ri_entry.End;
    $ri_data += $ri;
}

# Three years into the future (maximum duration of RIs as of 1.22.2019)
$three_years_out = (get-date).addyears(3);

# Our current date iterator
$current = (get-date);

# Array to hold the commit by month
$monthly_commit = @();

# CSV file name to save output
$csv_name = "ri_commitment-$((get-date).tostring('ddMMyyyy')).csv";

# Remove the CSV if it already exists
if(test-path $csv_name) {
    remove-item -force $csv_name;
}

# Insert CSV headers
"date,commitment" | out-file $csv_name -append -encoding ascii;

# Iterate from today to three years in the future
while($current -lt $three_years_out) {

    # Find the sum of the RIs that are active on this date:
    # all RI data -> RIs that expire after $current -> select the monthly cost -> sum it
    $commit = ($ri_data | ? {$_.End -gt $current} | % {$_.monthly} | measure -sum).sum;

    # Build a row of the CSV
    $output = "$($current),$($commit)";

    # Print the output to standard out for quick review
    write-host $output;

    # Write out to the CSV for deeper analysis
    $output | out-file $csv_name -append -encoding ascii;

    # Increment to the next month and repeat
    $current = $current.addmonths(1);
}

Ok, “short” isn’t quite the right word. It’s a little lengthy, but at the end it kicks out a CSV in your working directory with each month and your RI commitment for it.
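If you want a quick look at the numbers before graphing them, you can read the file back using the $csv_name variable from the script above:

# Load the CSV back in and print it as a table for a quick sanity check
Import-Csv $csv_name | Format-Table -AutoSize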

From there it’s easy to create a graph that shows your RI spend commit over time.

That gives you an idea of how much spend you’ve committed to, and for how long.

Our RI Purchase Guidelines

I’ve talked about it a couple of times, but AWS’s recommendation engine is free and borderline magic.

It’s part of AWS Cost Explorer: it ingests your AWS usage data and spits back reserved instance recommendations.
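The same recommendations are available outside the console, too. If you have the Cost Explorer cmdlets from the AWS PowerShell tools, a call along these lines should pull them (check Get-Help for the full parameter list; term, payment option, and lookback period can be passed to mirror the console options):

# Pull the EC2 RI purchase recommendations that back the Cost Explorer console view
Get-CEReservationPurchaseRecommendation -Service "Amazon Elastic Compute Cloud - Compute"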

At first glance it feels a little suspect that a vendor has a built-in engine to help you get insight into how to save money, but Amazon is playing the long game. If your use of AWS is more efficient (and you’re committing to spend with them), you’re more likely to stay a customer and spend more in the long haul.
The Recommendation engine has a few parameters you can tweak. They default to the settings that will save you the most money (and have you commit to the longest term spend with Amazon), but that may not be the right fit for you.
For example, our total workload fluctuates depending on new features that get released, performance improvements for our databases, etc., so we typically buy convertible RIs so we have the option of changing instance type, size, and OS if we need to.
As you click around in these options you’ll notice the total percent savings flies around like a kite. Depending on the options you select your savings can go up and down quite a bit.
Paying upfront and choosing standard vs. convertible can each make a 2-3% difference (based on what I’ve seen), but buying a 3 year RI instead of a 1 year doubles your savings. That can be a big difference if you’re willing to commit to the spend.
Now, three years is almost forever in Amazon terms. Keep in mind Amazon releases a new instance type or family about every year, so a 3 year standard RI feels a little risky to me. Here are the guidelines we’re trying out:
  • Buy mostly convertible (the exception is services that will not change)
  • Stay below ~70% RI coverage (we have a couple efficiency projects in the works that will reduce our EC2 running hours)
  • Distribute your spend commit
My next post will cover how we distribute our spend commit.

Getting Started with AWS Reserved Instances

If you’ve been using AWS for a while you’ve probably built up some excess spend. The “pay as you go” nature of AWS is a double-edged sword: it’s easy to get a PoC up and running, but you can wind up with waste if you aren’t disciplined about cleanup.
That’s the situation we found ourselves in recently. My company has been running production workloads in AWS for 3+ years, and we’ve had 100% of our customer-facing workloads in Amazon for over a year.
Over those 3+ years we’ve redesigned our app delivery environment, released several new products, rebuilt our BI environment, and reworked our CI/CD process. We subscribe to the “fail as fast as you can” methodology, so we’ve also started several PoCs that never went live.
All of that is to say we’ve done a lot in Amazon and we’ve tried a lot of new services. Despite our best efforts, we’ve had some wasted spend build up in our AWS bill. The whole engineering team was aware of it, but how do you start cleaning up waste, especially if your bill is large?

Sell the Marathon to Execs

Pitching a big, expensive cost saving project to execs is hard. Pitching a slow and steady approach is a lot easier. Rather than try to block off a week for “cost savings” exercises, we asked management for a 1 hour working meeting a week. No follow-ups outside of the meeting, no third party reporting tools, and only low/no risk changes.
The risk with a dramatic cost savings project is that executives will think of it as a purchase rather than a continual effort. If they spend a lot of money on cost savings, they’ll expect costs to automatically stay lower forever. If they get used to the idea of a small effort for a long time, it will be easier to keep up with it.

Start Small, Cautious, and Skeptical

Most of the struggle is finding the waste. Tools like Trusted Advisor are useful, but their recommendations are, I hate to say, somewhat simplistic. It’s never quite as straightforward to turn off services as we might like.
For example, when Trusted Advisor finds an under-utilized instance you have a slew of questions to answer before you can turn it off: “Is it low use, but important?” “Is it only used at specific times, like month end?” “Is it using a resource Trusted Advisor doesn’t check?”
Instead of acting on these recommendations straight away, pull a small coalition of knowledgeable people into your 1 hour cost saving meeting. We started with:
  • A DBA – someone who knows about big, expensive systems who will be very cautious about damaging them
  • An IT engineer – the team with permissions to create new servers who support customer environments (also very cautious)
  • A DevOps engineer – someone with the ability to write scripts to cross data sets like EBS usage and CPU usage (a sketch of that kind of script follows below)
With those three roles we had the people in the room who would get called when something breaks, meaning they would be very careful not to impact production.
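For example, here’s a minimal sketch of the kind of low-risk finding script that last role can bring to the table; it just lists EBS volumes that aren’t attached to anything:

# Find EBS volumes in the "available" state, i.e. not attached to any instance
$unattached = Get-EC2Volume -Filter @{ Name = "status"; Value = "available" }

# Print the volume details so the group can decide what is safe to snapshot and remove
$unattached | Select-Object VolumeId, Size, VolumeType, CreateTime | Format-Table -AutoSize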

Avoid Analysis as Long as Possible

Avoid getting bogged down with analysis until there are no more easy cost savings. With our cost savings team of a DBA, an IT Engineer, and a DevOps engineer we started with cost savings options that we all agreed on within 5 minutes. If we debated a plan for more than 5 minutes we tabled it for when we’d run out of easy options.
That approach let us quickly show value from the 1 hour/week cost savings meetings and convince execs we should keep having them.
When you start to run out of easy ideas start doing more analysis to think through how much money you’ll save with a change, and how much time it will take to do it. That will let you prioritize the harder stuff.

Document, Document, Document

Documenting your cost saving efforts well early on lends itself to building out a recurring/automated process later. If you save the scripts you use to find unused resources, you can re-run them later and eventually build them into Lambda functions or scheduled jobs.
It will also make it easier to demonstrate value to execs. If you have good documentation of estimated and actual cost savings, it will be much easier to show your executives what the effort is worth.
That’s our high level approach. See my blog post on getting started with AWS Cost Explorer to start diving into the details of your bill!