Chef Server and the risks of depending on the open source community

As a DevOps engineer I’ve played around with a number of different automation tools. Some professionally, some for a blog post, and some just because I was curious about them. When most companies are evaluating automation tools they look at some common attributes:

  1. How much will it cost?
  2. Does it support the platforms we’re on?
  3. Is it a good fit for my team’s skills?
  4. How long will it take to set up and see some value add?

And then they run down the list of common tools like Docker, Ansible, Chef, Puppet, PowerShell DSC, and so on until they find one that fits their needs well enough. But I’d argue there’s another aspect to most of these tools that’s harder to quantify and more impactful to the long-term success of a project: how excited the open source community is about the tool.

Most of these tools have at least some level of dependency on open source development. Whether it’s the Moby Linux VM behind running Linux containers on Windows machines or the Chef cookbooks in your Berksfile, you’re probably using something that’s open source. The effect that has on your project’s success can be subtle.

I’ve had two projects at work I’d like to walk you through: migrating a Django app into Docker, and adding AWS’s boto3 to a server using Chef.

When I started on the Dockerfile for the Django app I googled “Django app in docker” and immediately found several good examples with notes, explanations, and a simple demo app. I could pick my Django and Python versions using image tags, and I had my app starting in a container in about an hour (after that I spent a day hunting down undocumented environment variables, but that’s on us, not Docker).

Overall it was a smooth process, there was rich documentation, and every question about tooling had already been answered on forums.

Compare that to trying to add a Python package to a server using Chef. I started off fighting with Chef’s Test Kitchen. It took me an hour or so to figure out all of the variables I needed to create a server with the EC2 kitchen driver (env vars, values swapped out in my kitchen.yml file, drivers I needed to download, etc.).

Once I had an instance created in AWS I had to find an example of installing a Python package. After 20 or so minutes of googling I landed on poise-python, which mostly works, but hasn’t been updated in close to a year.

The package I needed was psycopg2, and I got it added to my cookbook pretty easily, but then I tried to rerun kitchen converge and got an error from pretty deep inside of poise-python. I’ve always believed part of the unwritten contract you agree to with open source is digging into bugs when you find them, so I dove in and eventually found the problem in a heredoc that creates a Python script to install pip modules.

Looking at the heredoc gave me a few ideas for tweaking my Chef recipe to use poise-python differently and avoid the error, and after experimenting for a while I found a combination of version pinning that worked and let me converge my node.

Not long after that I had a commit and PR ready to add the Python module to the new node, and I was done.

So what’s the difference here? The work I was doing was pretty similar in both situations: I needed to update packages for a Python application. But the Docker experience was smooth, there were lots of examples, and it only took a few hours total. The Chef experience was painful: I spent time digging through source code, debugging my own problems, and had trouble finding examples.

To be clear, I’m not promoting Docker over Chef. They’re different tools that solve similar problems. I’m trying to point out that community support has moved away from Chef and the open source Chef cookbooks over the last 3–5 years. That transition makes the tool much harder to use because there are fewer people contributing and helping solve problems.

And the scary thing is that the exact same thing could happen to Docker, leaving all of the companies containerizing their applications high and dry with less free community support than they have now. The open source world is enamored with Docker today, but it may not be tomorrow. So before you commit to an open source tool for the cost or other reasons above, I’d encourage you to ask these questions too:

  1. Am I willing to contribute back to this open source project to solve my own problems?
  2. Is the open source community moving toward or away from this project?
  3. How bad is it for my org if we start to find ourselves with less open source support for this?
  4. Am I willing to pay for enterprise support, or hire an open source community member if it comes to it?

Paranoid Defensive Programming

Defensive programming is a programming discipline designed to keep a system reliable in the face of known failure modes. It focuses on error handling and input validation to make sure that your application is resilient to failure.

Enterprise Craft lists several key concepts of Defensive Programming (sketched in PowerShell below):

  1. Test preconditions before operations
  2. Check for nulls
  3. Assert state whenever you are changing it
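
Here’s a minimal PowerShell sketch of those three ideas. Set-UserQuota and the Get-User lookup are hypothetical; they’re just there to show the shape of the checks:

function Set-UserQuota([string]$userName, [int]$quotaGB) {
    # 1. Test preconditions before operations
    if ($quotaGB -le 0) {
        throw "quotaGB must be a positive number";
    }

    # 2. Check for nulls
    if ([string]::IsNullOrWhiteSpace($userName)) {
        throw "userName cannot be null or empty";
    }
    $user = Get-User -Name $userName;   # hypothetical lookup
    if ($null -eq $user) {
        throw "User '$userName' was not found";
    }

    # 3. Assert state whenever you are changing it
    $user.QuotaGB = $quotaGB;
    if ($user.QuotaGB -ne $quotaGB) {
        throw "Quota for '$userName' did not update as expected";
    }
}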

These are great practices, but I’d argue that there are a few things missing:

  1. Defensive programming focuses on what happens inside of your application, and spends less time worrying about what could be happening outside your code
  2. Defensive programming tends to assume your program will be the only thing modifying state
  3. Defensive programming works best when the APIs or SDKs you’re working with provide you with good exceptions and errors

For those reasons I’ve started using Paranoid Defensive Programming in some scenarios. Paranoid Defensive Programming adds a couple of concepts to the list above:

  1. Assert state before, during, and after you make a change
  2. Verbosely log all of the results

To accomplish those goals we follow a few guidelines:

  1. Extract a state testing function because you’ll need to repeatedly test state
  2. Log state results that could be benign at “info”
  3. Log state results that mean there is an error at “error”
  4. Log state results that are ambiguous at “warn”

Let’s walk through an example. Say we have a function resetUserPassword() that resets a user’s password to a random value, and another function getLastUserPasswordReset() that returns the last time a user’s password was reset.

Our goal is to reset the user’s password if it has been the same for more than 30 days. If we know that resetUserPassword() will reliably throw exceptions or return errors we can just do defensive programming like we normally would.

But here’s the catch: what if we know that resetUserPassword() is unreliable? Maybe there’s a bug in the password generator and the new password doesn’t always meet complexity requirements, maybe it depends on some interaction from the user that is hard to predict, maybe the developer didn’t take the time to throw reasonable exceptions (or throw exceptions at all). For whatever reason, we don’t trust resetUserPassword() to do its job correctly.

Our Paranoid Defensive Programming code should look something like this:

original_reset_time = getLastUserPasswordReset("myUser")
log("Last reset time was: " + original_reset_time)
if (original_reset_time < now() - days(30)) {
  log("Password is old, resetting")
  resetUserPassword("myUser")
  new_reset_time = getLastUserPasswordReset("myUser")
  if (new_reset_time > original_reset_time) {
    log("password reset correctly")
  } else {
    log("password failed reset!")
  }
} else {
  log("password is new, moving on")
}

As you can see we’re bending over backwards to look for and record errors, but this approach gives us confidence that we’ll know the state of our system when we’re done.
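
In PowerShell, a minimal sketch of the same flow might look like the block below. Reset-UserPassword and Get-LastUserPasswordReset are hypothetical stand-ins for the unreliable functions above, and the log levels follow the guidelines from the list:

function Test-LastPasswordReset([string]$userName) {
    # Extracted state-testing function, because we need to check state repeatedly
    return Get-LastUserPasswordReset -UserName $userName;   # hypothetical call
}

$originalResetTime = Test-LastPasswordReset -userName "myUser";
write-host "Last reset time was: $originalResetTime";

if ($originalResetTime -lt (Get-Date).AddDays(-30)) {
    write-host "Password is old, resetting";
    Reset-UserPassword -UserName "myUser";   # hypothetical, unreliable call

    $newResetTime = Test-LastPasswordReset -userName "myUser";
    if ($null -eq $newResetTime) {
        Write-Warning "Could not read the reset time after the change";   # ambiguous result: warn
    } elseif ($newResetTime -gt $originalResetTime) {
        write-host "Password reset correctly";                            # benign result: info
    } else {
        Write-Error "Password failed to reset!";                          # error result: error
    }
} else {
    write-host "Password is new, moving on";
}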

When is Paranoid Defensive Programming a good idea?

  • When you are calling an unreliable API, and you do not trust it to return errors or exceptions
  • When you do not have the ability to change the system you are calling; it might be owned by another team or sit outside your org
  • When having consistent results is very important, but you aren’t working with something as reliable as a database (e.g. config management code)

When is Paranoid Defensive Programming not a good fit?

  • When you have a reliable API that returns consistent errors
  • When you have a chance to make your upstream dependencies more reliable
  • When you have other frameworks that will handle idempotency for you

Using Unit Tests for Communication

Expressing software requirements is hard. Software is abstract and intangible; you can’t touch it, see it, or feel it, so talking about it can become very difficult.

There are plenty of tools and techniques that people use to make communicating about software easier, like UML diagrams or software state diagrams. Today I’d like to talk about a software development technique my team has been using for communication: Test Driven Development.

Test Driven Development (TDD) is a very involved software development approach, and I won’t go into it in depth in this post, but here’s a quick summary:

  1. Write tests before code, expecting them to fail
  2. Build out code that adds the behavior the test looks for
  3. Rerun the test

On my team we’ve started trying to use unit tests as a communication tool. Here’s an example.

At my company we use the concept of a unique customer ID, or site ID, that consists of the two-letter code of the state the customer’s headquarters is in and a sequential 3 digit number. At least, most of the time. The site ID concept grew organically, and like any standard that starts organically it has exceptions. Here are a few that I’m aware of:

  1. One very large customer uses an abbreviation of their company name instead of a state code
  2. Most systems pad with zeros, some do not (e.g. TX001 in some systems is TX1 in others)
  3. Most systems use a 3 digit number, some use four (e.g. TX001 vs TX0001)

After we added “site id converters” to a few modules in our config management code we decided it was time to centralize that functionality and build out a single “site id converter” function. When I was writing the card it was clear there was enough variation that it was going to take a fair amount of verbiage to spell out what I wanted. Let’s give it a try for fun.

Please build a function that takes a site id (the siteid can be two digit state code and 1, 3, or 4 digits, or 3 digit customer code and 1, 3, or 4 digits). The function should also take a “target domain” which is where the site id will be used (for example “citrix” or “bi”). The function should convert the site id into the right format for the target domain. For example if TX0001 and Citrix is passed in the function should convert to “TX001”. If TX001 and “BI” are passed in the function should convert to “TX0001”.

It’s not terrible, but it gets more complex when you start unpacking it and notice the details I may have forgotten to add. What if the function gets passed an invalid domain? What should the signature look like? What module should the function go into?

And it gets clearer when we add simple unit tests with the details that feel a little awkward to put in the card.

Describe "Get SiteID by domain tests" {
InModuleScope CommonFunctions {
It "converts long id to short for citrix" {
Get-SiteIdByDomain -siteid "TX0001" -domain "citrix" | Should be "TX001"
}
}
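
A few more cases could pin down the padding rule and the BI conversion from the card, and take a stance on the invalid domain question above. These would sit alongside the first It block inside the same Describe, and the exact behaviors are whatever the team agrees on:

It "pads a short id out to three digits for citrix" {
    Get-SiteIdByDomain -siteid "TX1" -domain "citrix" | Should be "TX001"
}

It "converts short id to long for bi" {
    Get-SiteIdByDomain -siteid "TX001" -domain "bi" | Should be "TX0001"
}

It "throws when the target domain is unknown" {
    { Get-SiteIdByDomain -siteid "TX001" -domain "unknown" } | Should throw
}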

I’m not suggesting you stop writing cards or user stories; the right answer usually seems to be a little of both. Some verbiage to tell the background, give some motivation for the card, and open a dialog with the engineer doing the work. Some unit tests to give more explicit requirements on the interface to the function.

This approach also leaves the engineer free to pick their own implementation details. They could use a few nested “if” statements, they could use PowerShell script blocks, or anything else they can think of (that will pass a peer review). As long as it meets the requirements of the unit tests, the specifics are wide open.

A note about TDD

Please note, I’m not a TDD fanatic. TDD is a really useful tool that drives you to write testable code, keeps your units small, makes your functions reasonable, and lets you refactor with confidence. As long as your code still passes your unit test suite you can make any changes you want.

But it’s not a magic hammer. In my experience it’s not a good fit to start with TDD when:

  • You’re using an unfamiliar SDK or framework; TDD will slow down your exploration
  • You have a small script that is truly one-time use; TDD isn’t worth the overhead
  • You have a team that is unfamiliar with TDD; introducing it can be good, but using it as a hard and fast rule will demoralize and delay

Docker Windows container for Pester Tests

I recently wrote an intro to unit testing your PowerShell modules with Pester, and I wanted to walk through our method of running those unit tests inside a Docker for Windows container.

Before we get started, I’d like to acknowledge that this post is obviously filled with trendy buzzwords (CICD, Docker, config management, *Game of Thrones, docker-compose, you get the picture). All of the components we’re going to talk through today add concrete value to our business, and we didn’t do any resume-driven development.

Why?

Here’s a quick run-through of our motivation for each of the pieces I’ll cover in this post:

  1. Docker image for running unit tests
    1. Gives engineers a consistent way to run the unit tests. On your workstation you might need different versions of SDKs and tools, but a Docker container lets you pin versions of things like the AWS PowerShell tools
    2. Makes all pathing consistent – you can set up your laptop any way you like, but the paths inside of the container are consistent
  2. docker-compose
    1. Provides a way to customize unit test runs to a project
    2. Provides a consistent way for engineers to map drives into the container
  3. Code coverage metrics
    1. At my company we don’t put too much stock in code coverage metrics, but they offer some context for how thorough an engineer has been with unit tests
    2. We keep a loose goal of 60%
  4. Unit test passing count
    1. Code with a failing unit test does not go to production; a failing unit test has a high chance of pointing at a future production outage

How!

The first step is to set up Docker Desktop for Windows. The biggest struggle I’ve seen people have getting Docker running on Windows is getting virtualization enabled, so pay extra attention to that step.
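
If you want to sanity-check that ahead of time, something like the following works from an elevated PowerShell prompt (the exact property and feature names can vary a bit between Windows versions, so treat this as a starting point):

# Shows whether a hypervisor is present and whether the firmware-level
# virtualization requirements are met
Get-ComputerInfo -Property "HyperV*"

# Shows whether the Hyper-V feature is enabled (client editions of Windows)
Get-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V-All | Select-Object FeatureName, State
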
Once you have Docker installed you’ll need to create an image you can use to run your unit tests, a script to execute them, and a docker-compose file. The whole structure will look like this:

  • /
    • docker-compose.yml
    • /pestertester
      • Dockerfile
      • Run-AllUnitTests.ps1

We call our image “pestertester” (I’m more proud of that name than I should be).

There are two files inside of the pestertester folder: a Dockerfile that defines the image, and a script called Run-AllUnitTests.ps1.
Here’s a simple example of the Dockerfile. For more detail on how to write one, you should explore the Dockerfile reference.

FROM mcr.microsoft.com/windows/servercore
RUN powershell -Command "Install-PackageProvider -Name NuGet -MinimumVersion 2.8.5.201 -Force"
RUN powershell -Command "Install-Module -Scope CurrentUser -Name AWSPowerShell -Force;"
COPY ./Run-AllUnitTests.ps1 c:/scripts/Run-AllUnitTests.ps1

All we need for these unit tests is the AWS PowerShell tools, and we install NuGet so we can use PowerShell’s Install-Module.

We played around with several different Docker images before we picked mcr.microsoft.com/windows/servercore.

  1. We moved away from any of the .NET containers because we didn’t need the dependencies they added, and they were very large
  2. We moved away from Nano Server images because some of our PowerShell modules call functions outside of .NET Core

Next we have the script Run-AllUnitTests.ps1. The main requirement for this script to work is that your tests be stored with this file structure:
  • /ConfigModule
    • ConfigModule.psm1
    • /tests
      • ConfigModule.tests.ps1
  • /ConfigModule2
    • ConfigModule2.psm1
    • /tests
      • ConfigModule2.tests.ps1
The script isn’t too complicated:
$results = @();
# Find every "tests" directory (skipping anything with "dsc" in the path), run the
# Pester tests inside it, and collect pass/fail counts plus code coverage per module
gci -recurse -include tests -directory | ? {$_.FullName -notlike "*dsc*"} | % {
    set-location $_.FullName;
    $tests = gci;
    foreach ($test in $tests) {
        $module = $test.Name.Replace("tests.ps1","psm1")
        $result = invoke-pester ".\$test" -CodeCoverage "..\$module" -passthru -quiet;
        $results += @{
            Module = $module;
            Total = $result.TotalCount;
            passed = $result.PassedCount;
            failed = $result.FailedCount
            codecoverage = [math]::round(($result.CodeCoverage.NumberOfCommandsExecuted / $result.CodeCoverage.NumberOfCommandsAnalyzed) * 100, 2)
        }
    }
}

foreach ($result in $results) {
    write-host -foregroundcolor Magenta "module: $($result['Module'])";
    write-host "Total tests: $($result['total'])";
    write-host -ForegroundColor Green "Passed tests: $($result['passed'])";
    if ($result['failed'] -gt 0) {
        $color = "Red";
    } else {
        $color = "Green";
    }
    write-host -foregroundcolor $color "Failed tests: $($result['failed'])";
    if ($result['codecoverage'] -gt 60) {
        $color = "Green";
    } elseif ($result['codecoverage'] -gt 30) {
        $color = "Yellow";
    } else {
        $color = "Red";
    }
    write-host -ForegroundColor $color "CodeCoverage: $($result['codecoverage'])";
}

The script iterates through any subdirectories named “tests”, and executes the unit tests it finds there, running code coverage metrics for each module.

The last piece to tie all of this together is a docker-compose file. The docker-compose file handles:

  1. Mapping the windows drives into the container
  2. Executing the script that runs the unit tests

The docker-compose file is pretty straightforward too:
version: '3.7'

services:
  pestertester:
    build: ./pestertester
    volumes:
      - c:\users\bolson\documents\github\dt-infra-citrix-management\ssm:c:\ssm
    stdin_open: true
    tty: true
    command: powershell "cd ssm;C:\scripts\Run-AllUnitTests.ps1"

Once you’ve got all of this set up, you can run your unit tests with:

docker-compose run pestertester

Once the container starts up you’ll see your test results.

Experience

We’ve been running Linux containers in production for a couple of years now, but we’re just starting to pilot Windows containers. According to the documentation, they’re not production ready yet:

Docker is a full development platform for creating containerized apps, and Docker Desktop for Windows is the best way to get started with Docker on Windows.

Running our unit tests inside of Windows containers has been a good way to get some experience with them without risking production impact.

A couple of final thoughts

Windows containers are large; even Server Core and Nano Server images are gigabytes.

The container we landed on is 11 GB.

If you need to run Windows containers, and you can’t stick to .NET Core and get onto Nano Server, you’re going to be stuck with pretty large images.

Startup times for Windows containers can be a few minutes.

Especially the first time on a machine, while images and resources are getting loaded.

Versatile Pattern

This pattern of unit testing inside of a container is pretty versatile. You can use it with any unit testing framework, and any operating system you can run inside a container.

*No actual Game of Thrones references appear in this blog post

Unit Testing PowerShell Modules with Pester

Pester is a unit testing framework for PowerShell. There are some good tutorials for it on its GitHub page, and in a few other places, but I’d like to pull together some of the key motivating use cases I’ve found and a couple of the gotchas.

Let’s start with a very simple example. This is the content of a simple utility module named Util.psm1:
function Get-Sum([int]$number1, [int]$number2) {
    $result = $number1 + $number2;
    write-host "Result is: $($result)";
    return $result;
}

And this is the content of a simple unit test file named UtilTest.ps1:

Import-Module .\Util.psm1
Describe "Util Function Tests" {
    It "Get-Sum Adds two numbers" {
        Get-Sum 2 2 | Should be 4;
    }
}

We can run these tests using “Invoke-Pester .\UtilTest.ps1”.

And already there’s a gotcha here that wasn’t obvious to me from the examples online. Let’s say I change my function to say “Sum is:” instead of “Result is:” and save the file. When I re-run my Pester tests I still see “Result is:” printed out.

What’s also interesting is that the second run took 122 ms, while the first took 407 ms.

It turns out both of these changes are results of the same fact: once the module you are testing is loaded into memory it will stay there until you remove it. That means any changes you make trying to fix your unit tests won’t take effect until you’ve refreshed the module. The fix is simple:

Import-Module .\Util.psm1
Describe "Util Function Tests" {
    It "Get-Sum Adds two numbers" {
        Get-Sum 2 2 | Should be 4;
    }
}
Remove-Module Util;

Removing the module after running your tests makes PowerShell pull a fresh copy into memory on the next run, so you can see your changes.
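
Another option that accomplishes the same thing is forcing a fresh import at the top of the test file, so there’s no cleanup line to forget at the bottom:

# -Force reloads the module even if a copy is already in memory
Import-Module .\Util.psm1 -Force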

The next gotcha is using the Mock keyword. Let’s say I want to hide the write-host output from my function so it doesn’t clutter up my unit test results. The obvious way is to use the Mock keyword to create a new version of write-host that doesn’t actually write anything. My first attempt looked like this:

Import-Module .\Util.psm1
Describe "Util Function Tests" {
    It "Get-Sum Adds two numbers" {
        Mock write-host;
        Get-Sum 2 2 | Should be 4;
    }
}
Remove-Module Util;

But I still see the write-host output in my unit test results.

It turns out the reason is that the Mock keyword creates mock objects in the current scope, instead of in the scope of the module being tested. There are two ways of fixing this: one is InModuleScope, the other is the ModuleName parameter on Mock. Here’s an example of the first option:

Import-Module .\Util.psm1

InModuleScope Util {
    Describe "Util Function Tests" {
        It "Get-Sum Adds two numbers" {
            Mock write-host;
            Get-Sum 2 2 | Should be 4;
        }
    }
}
Remove-Module Util;

And just like that the output goes away!
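
For reference, here’s roughly what the second option looks like. The Describe block stays at the top level and the ModuleName parameter points the mock at the Util module’s scope instead:

Import-Module .\Util.psm1
Describe "Util Function Tests" {
    It "Get-Sum Adds two numbers" {
        # Create the mock inside the Util module's scope rather than the test's scope
        Mock write-host -ModuleName Util;
        Get-Sum 2 2 | Should be 4;
    }
}
Remove-Module Util;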