Docker Windows container for Pester Tests

I recently wrote an intro to unit testing your powershell modules with Pester, and I wanted to walk through our method of running those unit tests inside of a Docker for Windows container.

Before we get started, I’d like to acknowledge this post is obviously filled with trendy buzzwords (CICD, Docker, Config Management, *game of thrones, docker-compose, you get the picture). All of the components we’re going to talk through today add concrete value to our business, and we didn’t do any resume driven development.

Why?

Here’s a quick run through of our motivation for each of the pieces I’ll cover in this post.
  1. Docker image for running unit tests 
    1. Gives engineers a consistent way to run the unit tests. On your workstation you might need different versions of SDKs and tools, but a Docker container lets you pin versions of things like the AWS Powershell tools
    2. Makes all pathing consistent – you can set up your laptop any way you like, but the paths inside of the container are consistent
  2. Docker-compose
    1. Provides a way to customize unit test runs to a project
    2. Provides a consistent way for engineers to map drives into the container
  3. Code coverage metrics
    1. At my company we don’t put too much stock in code coverage metrics, but they offer some context for how thorough an engineer has been with unit tests
    2. We keep a loose goal of 60%
  4. Unit test passing count
    1. A failed unit test does not go to production; code with a failing unit test has a high chance of causing a production outage

How!

The first step is to set up Docker Desktop for Windows. The biggest struggle I’ve seen people have getting Docker running on Windows is getting virtualization enabled, so pay extra attention to that step.
Once you have Docker installed you’ll need to create an image you can use to run your unit tests, a script to execute them, and a docker-compose file. The whole structure will look like this:

  • /
    • docker-compose.yml
    • /pestertester
      • Dockerfile
      • Run-AllUnitTests.ps1

We call our image “pestertester” (I’m more proud of that name than I should be).

There are two files inside of the pestertester folder: a Dockerfile that defines the image, and a script called Run-AllUnitTests.ps1.
Here’s a simple example of the Dockerfile. For more detail on how to write a Dockerfile you should explore the Dockerfile reference.

FROM mcr.microsoft.com/windows/servercore
RUN "powershell Install-PackageProvider -Name NuGet -MinimumVersion 2.8.5.201 -Force"
RUN "powershell Install-Module -Scope CurrentUser -Name AWSPowerShell -Force;"
COPY ./Run-AllUnitTests.ps1 c:/scripts/Run-AllUnitTests.ps1

All we need for these unit tests is the AWS Powershell Tools, and we install NuGet so we can use powershell’s Install-Module.
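If you want to take the version pinning idea from the motivation list literally, Install-Module has a -RequiredVersion parameter you can use in that RUN line. Here’s a rough sketch; the version number is just a placeholder, pin whatever you’ve tested against:

# Pin the module version the tests are known to work with (3.3.563.1 is a placeholder)
RUN "powershell Install-Module -Scope CurrentUser -Name AWSPowerShell -RequiredVersion 3.3.563.1 -Force"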

We played around with several different docker images before we picked mcr.microsoft.com/windows/servercore.

  1. We moved away from any of the .NET containers because we didn’t need the dependencies they added, and they were very large
  2. We moved away from nano server images because some of our powershell modules call functions outside of .NET core
Next we have the script Run-AllUnitTests.ps1. The main requirement for this script to work is that each module’s .psm1 file sits next to a tests folder holding the matching tests file, like this:
  • /ConfigModule
    • ConfigModule.psm1
    • /tests
      • ConfigModule.tests.ps1
  • /ConfigModule2
    • ConfigModule2.psm1
    • /tests
      • ConfigModule2.tests.ps1
The script isn’t too complicated
# Find every tests directory (skipping DSC modules) and run the unit tests inside it
$results = @();
gci -recurse -include tests -directory | ? {$_.FullName -notlike "*dsc*"} | % {
    set-location $_.FullName;
    $tests = gci;
    foreach ($test in $tests) {
        # The module under test sits one directory above the tests folder
        $module = $test.Name.Replace("tests.ps1","psm1")
        $result = invoke-pester ".\$test" -CodeCoverage "..\$module" -passthru -quiet;
        $results += @{Module = $module;
            Total = $result.TotalCount;
            passed = $result.PassedCount;
            failed = $result.FailedCount;
            codecoverage = [math]::round(($result.CodeCoverage.NumberOfCommandsExecuted / $result.CodeCoverage.NumberOfCommandsAnalyzed) * 100,2)
        }
    }
}

# Print a summary for each module, color coded by failures and code coverage
foreach ($result in $results) {
    write-host -foregroundcolor Magenta "module: $($result['Module'])";
    write-host "Total tests: $($result['total'])";
    write-host -ForegroundColor Green "Passed tests: $($result['passed'])";
    if($result['failed'] -gt 0) {
        $color = "Red";
    } else {
        $color = "Green";
    }
    write-host -foregroundcolor $color "Failed tests: $($result['failed'])";
    if($result['codecoverage'] -gt 60) {
        $color = "Green";
    } elseif($result['codecoverage'] -gt 30) {
        $color = "Yellow";
    } else {
        $color = "Red";
    }
    write-host -ForegroundColor $color "CodeCoverage: $($result['codecoverage'])";
}

The script iterates through any subdirectories named “tests”, and executes the unit tests it finds there, running code coverage metrics for each module.

The last piece to tie all of this together is a docker-compose file. The docker compose file handles

  1. Mapping the windows drives into the container
  2. Executing the script that runs the unit tests
The docker-compose file is pretty straightforward too
version: '3.7'

services:
  pestertester:
    build: ./pestertester
    volumes:
      - c:\users\bolson\documents\github\dt-infra-citrix-management\ssm:c:\ssm
    stdin_open: true
    tty: true
    command: powershell "cd ssm;C:\scripts\Run-AllUnitTests.ps1"

Once you’ve got all of this setup, you can run your unit tests with

docker-compose run pestertester

Once the container starts up you’ll see your test results.

Experience

We’ve been running Linux containers in production for a couple of years now, but we’re just starting to pilot Windows containers. According to the documentation they’re not production ready yet:

Docker is a full development platform for creating containerized apps, and Docker Desktop for Windows is the best way to get started with Docker on Windows.

Running our unit tests inside of windows containers has been a good way to get some experience with them without risking production impact.

A couple final thoughts

Windows containers are large; even Server Core and Nano Server images are gigabytes in size.

The container we landed on is 11GB.

If you need to run Windows containers, and you can’t stick to .NET Core and get onto Nano Server, you’re going to be stuck with pretty large images.

Startup times for Windows containers will be a few minutes, especially the first time on a machine while resources are getting loaded.

Versatile Pattern

This pattern of unit testing inside of a container is pretty versatile. You can use it with any unit testing framework, and any operating system you can run inside a container.

*no actual game of thrones references will be in this blog post

Graph Your RI Commitment Over Time (subtitle: HOW LONG AM I PAYING FOR THIS?!?!?)

In my last post I talked about distributing your committed RI spend over time. The goal is to avoid buying too many 1 year RIs (front loading your spend) and missing out on the savings of committing to 3 years, while also not buying too many 3 year RIs (back loading your spend) and risking a bill you have to foot if your organization goes through major changes.

Our solution for balancing this is a powershell snippet that graphs our RI commitment over time.

# Get RI entries from AWS console
$ri_entries = Get-EC2ReservedInstance -filter @(@{Name="state";Value="active"});

# Array to hold the relevant RI data
$ri_data = @();

# Calculate monthly cost for RIs
foreach ($ri_entry in $ri_entries) {
    $ri = @{};
    $hourly = $ri_entry.RecurringCharges.Amount;
    $monthly = $hourly * 24 * 30 * $ri_entry.InstanceCount;
    $ri.monthly = $monthly;
    $ri.End = $ri_entry.End;
    $ri_data += $ri;
}

# Three years into the future (maximum duration of RIs as of 1.22.2019)
$three_years_out = (get-date).addyears(3);

# Our current date iterator
$current = (get-date);

# Array to hold the commit by month
$monthly_commit = @();

# CSV file name to save output
$csv_name = "ri_commitment-$((get-date).tostring('ddMMyyyy')).csv";

# Remove the CSV if it already exists
if(test-path $csv_name) {
    remove-item -force $csv_name;
}

# Insert CSV headers
"date,commitment" | out-file $csv_name -append -encoding ascii;

# Iterate from today to three years in the future
while($current -lt $three_years_out) {

    # Find the sum of the RIs that are active on this date
    # all RI data -> RIs that have expirations after current -> select the monthly measure -> get the sum -> select the sum
    $commit = ($ri_data | ? {$_.End -gt $current} | % {$_.monthly} | measure -sum).sum;

    # Build a row of the CSV
    $output = "$($current),$($commit)";

    # Print the output to standard out for quick review
    write-host $output;

    # Write out to the CSV for deeper analysis
    $output | out-file $csv_name -append -encoding ascii;

    # Increment to the next month and repeat
    $current = $current.addmonths(1);
}

OK, “snippet” isn’t quite the right word; it’s a little lengthy. But at the end it kicks out a CSV in your working directory with each month and your RI commitment for it.

From there it’s easy to create a graph that shows your RI spend commit over time.
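If you want a quick look without leaving PowerShell, here’s a minimal sketch that turns that CSV into a rough console chart. The file name and column names match the snippet above, but the scale (one # per $100 of monthly commitment) is an arbitrary choice for illustration:

# Rough console chart of the CSV produced above; one # per $100 of monthly commitment (arbitrary scale)
$csv_name = "ri_commitment-$((get-date).tostring('ddMMyyyy')).csv";
import-csv $csv_name | foreach {
    $commit = 0;
    if($_.commitment) { $commit = [double]$_.commitment };
    write-host ("{0,-22} {1}" -f $_.date, ("#" * [int][math]::Floor($commit / 100)));
}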

That gives you an idea of how much spend you’ve committed to, and for how long.

Our RI Purchase Guidelines

I’ve talked about it a couple of times, but AWS’s recommendation engine is free and borderline magic.

It’s part of AWS Cost Explorer; it ingests your AWS usage data and spits back Reserved Instance recommendations.

At first glance it feels a little suspect that a vendor has a built in engine to help you get insight into how to save money, but Amazon is playing the long game. If your use of AWS is more efficient (and you’re committing to spend with them) you’re more likely to stay a customer, and spend more in the long haul.
The Recommendation engine has a few parameters you can tweak. They default to the settings that will save you the most money (and have you commit to the longest term spend with Amazon), but that may not be the right fit for you.
For example, our total workload fluctuates depending on new features that get released, performance improvements for our databases, etc., so we typically buy convertible RIs, which give us the option of changing instance type, size, and OS if we need to.
As you click around in these options you’ll notice the total percent savings flies around like a kite. Depending on the options you select your savings can go up and down quite a bit.
Paying upfront and standard vs. convertible can give you a 2-3% difference (based on what I’ve seen), but buying a 3 year RI instead of a 1 year doubles your savings. That can be a big difference if you’re willing to commit to the spend.
Now, three years is almost forever in AWS terms. Keep in mind Amazon releases a new instance type or family about every year, so a 3 year standard RI feels a little risky to me. Here are the guidelines we’re trying out:
  • Buy mostly convertible (the exception is services that will not change)
  • Stay below ~70% RI coverage (we have a couple efficiency projects in the works that will reduce our EC2 running hours)
  • Distribute your spend commit
My next post will cover how we distribute our spend commit.

Getting Started with AWS Reserved Instances

If you’ve been using AWS for a while you’ve probably built up some excess spend. The “pay as you go” nature of AWS is a double edged sword: It’s easy to get a PoC up and running, but you can wind up with waste if you aren’t disciplined with your cleanup.
That’s the situation we found ourselves in recently. My company has been running production workloads in AWS for 3+ years, and we’ve had 100% of customer facing workload in Amazon for over a year.
Over those 3 plus years we’ve redesigned our app delivery environment, released several new products, rebuilt our BI environment, and reworked our CICD process. We subscribe to the “fail as fast as you can” methodology, so we’ve also started several PoCs that never went live.
All of that is to say we’ve done a lot in Amazon and we’ve tried a lot of new services. Despite our best efforts, we’ve had some wasted spend build up in our AWS bill. The whole engineering team was aware of it, but how do you start cleaning up waste, especially if your bill is large?

Sell the Marathon to Execs

Pitching a big, expensive cost saving project to execs is hard. Pitching a slow and steady approach is a lot easier. Rather than try to block a week for “cost savings” exercises, we asked management for a 1 hour working meeting a week. No follow ups outside of the meeting, no third party reporting tools, and only low/no risk changes.
The risk with a dramatic cost savings project is that executives will think of it as a purchase rather than a continual effort. If they spend a lot of money on cost savings, they’ll expect costs to automatically stay lower forever. If they get used to the idea of a small effort for a long time, it will be easier to keep up with it.

Start Small, Cautious, and Skeptical

Most of the struggle is finding the waste. Tools like Trusted Advisor are useful, but they have, I hate to say, somewhat simplistic recommendations. It’s never quite as straightforward to turn off services as we might like.
For example, when Trusted Advisor finds an under-utilized instance you have a slew of questions to answer before you can turn it off. “Is it low use, but important?” “Is it used at specific times like month end?” “Is it using a resource Trusted Advisor doesn’t check?”
Instead of taking these recommendations at face value, pull a small coalition of knowledgeable people into your 1 hour cost savings meeting. We started with:
  • A DBA – someone who knows about big, expensive systems who will be very cautious about damaging them
  • An IT engineer – someone from the team with permissions to create new servers and who supports customer environments (also very cautious)
  • A DevOps engineer – someone with the ability to write scripts to cross data sets like EBS usage and CPU usage
With those three roles we had the people in the room who would get called when something breaks, meaning they would be very careful not to impact production.

Avoid Analysis as Long as Possible

Avoid getting bogged down with analysis until there are no more easy cost savings. With our cost savings team of a DBA, an IT Engineer, and a DevOps engineer we started with cost savings options that we all agreed on within 5 minutes. If we debated a plan for more than 5 minutes we tabled it for when we’d run out of easy options.
That approach let us show value for the 1 hour/week cost savings meetings quickly, and convince execs we should keep having them.
When you start to run out of easy ideas start doing more analysis to think through how much money you’ll save with a change, and how much time it will take to do it. That will let you prioritize the harder stuff.

Document, Document, Document

Documenting your cost saving efforts well early on will lend itself to building out a recurring/automated process later on. If you save the scripts you use to find unused instances, you can re-run them later and eventually build them into Lambda functions or scheduled jobs.
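To make that concrete, here’s a minimal sketch of the kind of reusable cost-finding script we mean: it uses the AWS Powershell Tools to list EBS volumes that aren’t attached to anything, one of the more common sources of quiet waste. The CSV name is arbitrary, and the filter syntax mirrors the Get-EC2Instance filters used elsewhere on this blog.

# Find EBS volumes in the "available" state (i.e. not attached to any instance) and save them for review
$unattached = Get-EC2Volume -Filter @(@{Name="status"; Values="available"});
$unattached |
    select VolumeId, Size, VolumeType, CreateTime |
    export-csv "unattached-ebs-$((get-date).tostring('ddMMyyyy')).csv" -NoTypeInformation;
write-host "Found $(($unattached | measure).Count) unattached volumes";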
It will also make it easier to demonstrate value to execs. If you have good documentation of your estimated and actual cost savings, it will be much easier to show your executives what the effort is worth.
That’s our high level approach, see my blog post on getting started with AWS Cost Explorer to start diving into your bill details!

Diving Into (and reducing) Your AWS Costs!

AWS uses a “pay as you go” model for most of its services. You can start using them at any time, you often get a runway of free usage to get up to speed on the service, then they charge you for what you use. No contract negotiations, no figuring out bulk discounts, and you don’t have to provision for max capacity.

This model is a double edged sword. It’s great when you’re

  • First getting started
  • Working with a predictable workload
  • Working with a modern technology stack (i.e. most of your resources are stateless and can be ephemeral)
But it has some challenges when
  • Your workload is unpredictable
  • Your stack is not stateless (i.e. you have to provision for max capacity)
  • Your environment is complex with a lot of services being used by different teams

It’s easy to have your AWS costs run away from you, and you can suddenly find yourself paying much more than you need or want to. We recently found ourselves in that scenario. Obviously I can’t show you our actual account costs, but I’ll walk you through the process we used to start digging into (and reducing) our costs with one of my personal accounts.

Step 1: AWS Cost Explorer

Cost Explorer is your first stop for understanding your AWS bill. You’ll navigate to your AWS Billing Dashboard and then launch Cost Explorer. If you haven’t been in Cost Explorer it doesn’t hurt to look at some of the alerts on the home page, but the really interesting data is in Costs and Usage.
My preference is to switch to “stack view”.
I find this helps to view your costs in context. If you’re looking to cut costs, the obvious place to start is the service that takes up the largest section of the bar. For this account it’s ElastiCache.
ElastiCache is pretty straightforward to cut costs for (you either cut your node count or node size), so let’s pick a more interesting service like S3.
Once you’ve picked a service to try to cut costs for, add a service filter on the right hand side and group by usage type.
Right away we can see that most of our costs are TimedStorage-ByteHrs, which translates to S3 Standard storage, so we’ll focus our cost savings on that storage class.

Next we’ll go to Cloudwatch to see where our storage in this class is. Open up Cloudwatch, open up metrics, and select S3.


Inside of S3 click on Storage Metrics and search for “StandardStorage” and select all buckets.


Then change your time window to something pretty long (say, 6 months) and your view type to Number

This will give you a view of specific buckets and how much they’re storing. You can quickly skim through to find the buckets storing the most data.
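If you’d rather script this step than click through the console, something like the sketch below should work. It assumes the Get-CWMetricStatistic cmdlet from the AWS Powershell Tools (parameter names can vary a bit between module versions), and the bucket name is a placeholder.

# Pull the most recent daily BucketSizeBytes datapoint for one bucket's StandardStorage class
# "my-example-bucket" is a placeholder - swap in your own bucket name
$dimensions = @(
    (new-object Amazon.CloudWatch.Model.Dimension -Property @{Name = "BucketName"; Value = "my-example-bucket"}),
    (new-object Amazon.CloudWatch.Model.Dimension -Property @{Name = "StorageType"; Value = "StandardStorage"})
);
$stats = Get-CWMetricStatistic -Namespace "AWS/S3" -MetricName "BucketSizeBytes" -Dimension $dimensions `
    -UtcStartTime (get-date).ToUniversalTime().AddDays(-3) -UtcEndTime (get-date).ToUniversalTime() `
    -Period 86400 -Statistic Average;
$stats.Datapoints | sort Timestamp -Descending | select -First 1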

Once you have your largest storage points you can clean them up or apply s3 lifecycle policies to transition them to cheaper storage classes.
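If the lifecycle route makes sense, the same .NET classes used in the S3 multipart upload cleanup post elsewhere on this blog can also express a transition rule. This is only a sketch: I’m assuming the Amazon.S3.Model.LifecycleTransition class and the Transitions property on the rule object, and the bucket name, rule ID, and 30 day threshold are placeholders.

# Hypothetical rule: transition objects to Standard-IA 30 days after creation
$transition = new-object -typename Amazon.S3.Model.LifecycleTransition;
$transition.Days = 30;
$transition.StorageClass = [Amazon.S3.S3StorageClass]::StandardInfrequentAccess;

$rule = new-object -typename Amazon.S3.Model.LifecycleRule;
$rule.ID = "TransitionToIA";
$rule.Status = "Enabled";
$rule.Transitions = @($transition);
# Empty prefix filter = apply to the whole bucket (same pattern as the multipart cleanup script elsewhere on this blog)
$rule.Filter = new-object -typename Amazon.S3.Model.LifecycleFilter;
$rule.Filter.LifecycleFilterPredicate = new-object -typename Amazon.S3.Model.LifecyclePrefixPredicate;

# "my-example-bucket" is a placeholder
Write-S3LifecycleConfiguration -BucketName "my-example-bucket" -Configuration_Rule $rule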

After you’re done with your largest cost areas, you rinse and repeat on other services.
This is a good exercise to do regularly. Even if you have good discipline around cleaning up old AWS resources costs can still crop up.
Happy cost savings!

AWS Powershell Tools: Get Specific Tags

A quick AWS Powershell tools snippet post here. When you call Get-EC2Instance from the AWS Powershell tools it returns an instance object that has a Tags attribute, which is a Powershell list of EC2 Tag Objects.
I’m usually a fan of how the AWS Powershell Tools object models are set up, but this is one case where I feel there could be some improvement. Instead of using a list and forcing users to iterate the list to find the right tag, the EC2 object’s “Tags” property should be a hashtable with the tag Key as the hash key so you can index directly to the object. But this is what we have to work with for now.
So we came up with a simple function to generate a map of desired EC2 tags from an instance.
# Build a hashtable of just the tags we care about from an instance's Tags list
function Get-Ec2InstanceTag($instance, [array]$desiredTagKeys) {
    $instanceTags = $instance.Tags;
    $tagMap = @{};
    foreach ($desiredTagKey in $desiredTagKeys) {
        foreach ($instanceTag in $instanceTags) {
            if($desiredTagKey -eq $instanceTag.Key) {
                $tagMap[$desiredTagKey] = $instanceTag.Value;
            }
        }
    }
    return $tagMap;
}

Usage for this function looks like this:
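(The instance query and the tag keys below are hypothetical examples, but the shape of the call is the same.)

# Grab every instance, then pull out a couple of tags per instance (tag keys here are just examples)
$instances = (Get-EC2Instance).Instances;
foreach ($instance in $instances) {
    $tags = Get-Ec2InstanceTag -instance $instance -desiredTagKeys @("Name","Environment");
    write-host "$($instance.InstanceId): Name=$($tags['Name']) Environment=$($tags['Environment'])";
}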

VPC Flowlogs through Cloudwatch Logs Insights

You know all those times you think to yourself, “I wish there were a faster way to search all these logs I keep putting in Cloudwatch?”

Well apparently Alexa was reading your mind at one of those times, because at AWS re:Invent 2018 Amazon released CloudWatch Logs Insights. It’s advertised as a more interactive, helpful log analytics solution than the log search option we already have.
This last week I got some questions about blocked traffic in our AWS account, which seemed like the perfect opportunity to give it a shot (NOTE: You will need to be sending your VPC flowlogs to Cloudwatch for this to work for you).
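(If you aren’t doing that yet, the AWS Powershell Tools can set it up. The sketch below assumes the New-EC2FlowLog cmdlet, and the VPC ID, log group name, and IAM role ARN are all placeholders you’d swap for your own.)

# Send REJECT traffic for one VPC to a CloudWatch Logs group (every identifier below is a placeholder)
New-EC2FlowLog -ResourceId "vpc-11111111" -ResourceType VPC -TrafficType REJECT `
    -LogGroupName "vpc-flowlogs-reject" `
    -DeliverLogsPermissionArn "arn:aws:iam::123456789012:role/flowlogs-role"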
Here are some of the queries I tried out, most of them based loosely off of the examples page.

Count of Blocked Traffic by Source IP and Destination Port

fields @timestamp, dstPort, srcAddr
| stats count(*) as countSrcAddr by srcAddr, dstPort
| sort countSrcAddr desc

This query gives us the top blocked senders by destination service.

Using this I pretty quickly found an ELB with the wrong instance port.

This worked for me because we separate our accept and reject flowlogs. If you keep them together you can add a filter as the first line

filter action = 'REJECT'
| fields @timestamp, dstPort, srcAddr
| stats count(*) as countSrcAddr by srcAddr, dstPort
| sort countSrcAddr desc

Blocked Destination Addresses and Ports for a Specific Source

During our troubleshooting we noticed a specific address sending us a lot of traffic on port 8088, which is vendor specific or an alternate HTTP port. This was a little odd because we don’t use 8088 externally or internally.
We dug in using this query
filter srcAddr = 'x.x.x.x'
| fields @timestamp, dstPort, srcAddr, dstAddr

Where x.x.x.x is the source IP address.

I’m not going to post the screen shot because there are a lot of our destination addresses and it would take time to block them out, but you get the idea.

We did a lookup on the address and it was owned by Digital Ocean, which is a cloud hosting company. It’s likely someone was doing a scan from a server in their environment, hard to say if it was good or bad intentions.

To satisfy my curiosity I wanted to ask the question, “When did the scan begin and how aggressive is it?”
So I added a “stats” command to group the sum of the packets into 30 minute bins.

filter srcAddr = 'x.x.x.x' and dstPort = '8088'
| fields @timestamp, dstPort, srcAddr, dstAddr
| stats sum(packets) by bin(30m)

When you use the stats command with a time series you can get a “visualization” like the one below:

It looks like the scan lasted about 6 hours, from 4 am or so my time to around 9:45 my time.

Conclusion

CloudWatch Log Insights is a much faster way to analyze your logs than the current Cloudwatch search. The query language is pretty flexible, and reasonably intuitive (though I did spend several minutes scratching my head over the syntax before I found a helpful example).
While it’s an improvement over what was there, it’s not on par with a log analytics tool like Splunk, or a data visualization tool like Kibana. The visualizations page for Insights only works with time series data (as far as I can tell) and isn’t very intuitive for combining multiple query results. For that you still have to import the query into a cloudwatch dashboard.
Amazon’s edge over those more mature tools is that it’s already integrated into your AWS account, with almost no setup required (except getting your logs into CloudWatch), plus the pricing model. As usual with AWS it’s extremely friendly for getting up and running, but it’s easy to see how the cost could grow out of control if you’re not paying attention (picture someone creating a query for a month’s worth of data on a dashboard that refreshes every 30 seconds).
Happy querying!

AWS Powershell Tools Snippets: S3 Multipart Upload Cleanup

My company does quite a bit with AWS S3. We use it to store static files and images, we push backups to it, we use it to deliver application artifacts, and the list goes on.

When you push a significant amount of data to and from S3, you’re bound to experience some network interruptions that could stop an upload. Most of the time S3 clients will recover on their own, but there are some cases where it might struggle.

One such case is when you are pushing a large file and using S3 Multi Part Uploads. This can leave you with pieces of files sitting in S3 that are not useful for anything, but still taking up space and costing you money. We recently worked with AWS support to get a report of how many incomplete uploads we had sitting around, and it was in the double digit terabytes!

We started looking for a way to clean them up and found that AWS recently created a way to manage these with a bucket lifecycle policy. Some details are in a doc here and there’s an example of how to create this policy on the AWS CLI towards the bottom.

We decided to recreate this functionality in Powershell using the Write-S3LifecycleConfiguration cmdlet to make it a little easier to apply the policy to all of the buckets in our account at once.

It took a little reverse engineering. The Write-S3LifecycleConfiguration cmdlet doesn’t have many useful examples. In the end I wound up creating the policy I wanted in the AWS console, and then using Get-S3LifecycleConfiguration to see how AWS is representing the policies in their .NET class structure.

It seems to me that there are a lot of classes between you and creating this policy, but that could mean that AWS has future plans to make these policies even more dynamic and useful.

The code I came up with at the end is below. Hope it’s helpful!


$rule = new-object -typename Amazon.S3.Model.LifecycleRule;
$incompleteUploadCleanupDays = new-object -typename Amazon.S3.Model.LifecycleRuleAbortIncompleteMultipartUpload
$incompleteUploadCleanupDays.DaysAfterInitiation = 7
$rule.AbortIncompleteMultipartUpload = $incompleteUploadCleanupDays
$rule.ID = "WholeBucketPolicy"
$rule.status = "Enabled"

$prefixPredicate = new-object -type Amazon.S3.Model.LifecyclePrefixPredicate

$lifecycleFilter = new-object -type Amazon.S3.Model.LifecycleFilter

$lifecycleFilter.LifecycleFilterPredicate = $prefixPredicate

$rule.Filter = $lifecycleFilter

foreach ($bucket in get-s3bucket) {
    write-host "Bucket name: $($bucket.bucketname)"
    $existingRules = get-s3lifecycleconfiguration -bucketname $bucket.bucketname
    $newPolicyNeeded = $true;
    # Skip buckets that already have a rule with our ID
    foreach ($existingRule in $existingRules.rules) {
        if($existingRule.ID -eq $rule.ID) {
            write-host "Policy $($rule.ID) already exists, skipping bucket"
            $newPolicyNeeded = $false;
        }
    }
    if($newPolicyNeeded) {
        write-host "Rule not found, adding"
        # Re-apply the bucket's existing rules plus our new multipart cleanup rule
        $existingRules.rules += $rule
        Write-S3LifecycleConfiguration -bucketname $bucket.bucketname -configuration_rule $existingRules.rules
    }
}

WannaCry: Finding where SMB is allowed in AWS

WannaCry is the latest ransomware to sweep the internet and cause lots of excitement. As occasionally happens with well publicized security events like this, I got to hear a former firewall admin’s favorite words: “Can you please take away a bunch of network access?” What fun!

I love blocking traffic as much as the next guy, but it’s not a great idea to just change firewall rules willy-nilly. You should always spend a little time thinking about the impact and looking at what access it’s prudent to remove. In this post I’ll list a couple of the commands I used to poke around our AWS Security Groups and find where SMB was allowed.
Because time was a factor, I decided to do all of this in the AWS Powershell tools. As usual, you’ll need to set default credentials and an AWS region before running any of these commands.
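For reference, that setup is the same two cmdlets used at the top of the removal script later in this post; the profile name is a placeholder.

# Pick your stored credentials profile and region before running the queries below
Set-AWSCredentials -ProfileName myprofile
Set-DefaultAWSRegion us-east-1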
The first one is pretty straightforward. The sequence of commands will spit out any security groups in your region that allow SMB traffic inbound on the default ports (137-139 and 445).
Fun fact, apparently AWS built this section of the powershell tools before you could control egress permissions in a group. The inbound rules are named “IpPermissions” while the outbound rules have the more descriptive “IpPermissionsEgress”. Other places (like CloudFormation templates) name inbound rule sets “Ingress” to match the outbound version.
get-ec2securitygroup | foreach {foreach ($ingress in $_.ippermissions) {if(($ingress.fromport -le 445 -and $ingress.toport -ge 445) -or ($ingress.fromport -le 137 -and $ingress.toport -ge 139) -or $ingress.ipprotocol -eq -1){$_.GroupName}}} | select-object -unique

I apologize for the length of this single line, but it should show you anywhere your instances are accepting SMB, including rules that allow all traffic.

Next it can also be helpful to look for rules allowing SMB outbound. You might want to restrict this traffic too. This command is almost the same as the first

PS C:\Users\bolson\Documents\AWS> get-ec2securitygroup | foreach {foreach ($egress in $_.ippermissionsegress) {if(($egress.fromport -le 445 -and $egress.toport -ge 445) -or ($egress.fromport -le 137 -and $egress.toport -ge 137) -or $egress.ipprotocol -eq -1){$_.GroupName}}} | select-object  -unique

Now that you’ve got a listing of where SMB is allowed in your AWS account, you may want to remove specific security groups from instances. If you’re looking to do one or two instances, that can be done pretty easily through the console. If you’re looking to pull a few security groups off of every instance, you can use the example below, updating the security group IDs.

We used this example for removing a “temp setup” group that we use in our environment to allow extra access for configuring a new instance.

set-defaultawsregion us-east-1
set-awscredentials -profilename PHI
(Get-EC2Instance -filter @( @{name='instance.group-id';values="sg-11111","sg-22222"})).instances | foreach {
    write-host "Instance Name: $(($_.tags | where {$_.key -eq "Name"}).value) - $($_.InstanceId)";
    $finalGroups = @();
    $finalGroupNames = @();
    # Keep every security group except the ones we're removing
    foreach ($group in $_.SecurityGroups) {
        write-host $group.groupid
        if($group.groupid -ne "sg-11111" -and $group.groupid -ne "sg-22222") {
            write-host "$($group.groupid -ne 'sg-333333')"
            $finalGroups += $group.groupid;
            $finalGroupNames += $group.groupname
        }
    }
    # Apply the trimmed-down group list back to the instance
    Edit-EC2InstanceAttribute -InstanceId $_.InstanceId -group $finalGroups
    write-host "Finalgroups: $($finalGroupNames)"
}

Hopefully that helps you do some analysis in your environment!

Auditing AWS IAM Users

Like any other company with sensitive data we go through audits pretty regularly. The latest one included some questions about accounts that have access to sensitive data, and the number of auth factors required to log into them.

As usual I started digging around in the AWS Powershell Tools to find a way to make this job easier than just manually looking through accounts, and I quickly found Request-IAMCredentialReport and Get-IAMCredentialReport.

These two commands are a pretty interesting combination. The first tells AWS to generate a credential report. Request-IAMCredentialReport is the step required to generate the report on AWS’s end. There’s some pretty good documentation on how that section works. The most interesting point to me is that AWS will only generate a report every 4 hours. This is important to note if you’re making changes, and re-running reports to double check they fixed an issue.
The second command, Get-IAMCredentialReport actually downloads the report that’s generated. From what I’ve seen, if you haven’t run Request-IAMCredentialReport in the last 4 hours to have a fresh report, this command will fail.
I found the output of this command to be the most useful when I included the -AsTextArray option. That returns an array of lines of the report with the columns separated by commas. I won’t include a sample output from our accounts for obvious reasons, but check the documentation for an example of what that looks like.
Now that we’ve got all of the components to download this report, it’s pretty trivial powershell work to do some parsing and logic.
The script example below creates an array of maps for each line of the report, letting you iterate over it and check for different conditions.
The check I’m running for now is for IAM accounts that have passwords enabled but do not have an MFA device activated; you can see it would be pretty easy to add additional criteria to test against the report.
$awsProfiles = @("FirstProfileName","SecondProfileName");
set-DefaultAWSRegion us-east-1;
foreach ($awsProfile in $awsProfiles) {
    write-host "Running audit on $awsProfile";
    Set-AWSCredentials -ProfileName $awsProfile;

    # Attempt to get an IAM credential report, if one does not exist, sleep to let one generate
    $reportResult = Request-IAMCredentialReport -Force;

    # Sleep for 15 seconds to allow the report to generate
    start-sleep -s 15

    try {
        # Get IAM Credential report
        $credReports = Get-IAMCredentialReport -AsTextArray;
    } catch {
        write-host "No credential report exists for this account, please run script again in a few minutes to let one generate";
        exit;
    }

    # Empty list that will contain parsed, formatted credential reports
    $credReportList = @();

    # Get the headings from the report
    $headings = $credReports[0].split(",");

    # Start processing the report, starting after the headings
    for ($i = 1; $i -lt $credReports.length; $i++) {

        # Break up the line of the report by commas
        $splitLine = $credReports[$i].split(",");
        $lineMap = @{};

        # Go through the line of the report and set a map key of the header for that column
        for ($j = 0; $j -lt $headings.length; $j++) {
            $lineMap[$headings[$j]] = $splitLine[$j];
        }

        # Add the formatted line to the final list
        $credReportList += , $lineMap;
    }

    # Iterate over the report, using rules to evaluate the contents
    foreach($credReport in $credReportList) {
        # Check for users that have an active password, but not an active MFA device
        if($credReport['password_enabled'] -eq "TRUE" -and $credReport['mfa_active'] -eq "FALSE") {
            write-host "ALERT: User: $($credReport['user']) has a password enabled, but no MFA device"
        }
    }
    write-host "";
}

This script assumes you have created AWS Powershell tools profiles that match the array on the first line.
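If you haven’t stored profiles before, the AWS Powershell Tools can save them for you with -StoreAs; the access keys below are obviously placeholders.

# Store named credential profiles once; the script above switches between them with Set-AWSCredentials -ProfileName
Set-AWSCredentials -AccessKey AKIAEXAMPLEKEY1 -SecretKey "exampleSecretKey1" -StoreAs FirstProfileName
Set-AWSCredentials -AccessKey AKIAEXAMPLEKEY2 -SecretKey "exampleSecretKey2" -StoreAs SecondProfileName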

And here is some example output of users I had to go have a chat with today to activate their MFA devices.

NOTE: You may need to run this script a couple times, if you haven’t generated an IAM Credential Report in a while.