AWS PowerShell Tools: Get Specific Tags

A quick AWS PowerShell Tools snippet post here. When you call Get-EC2Instance from the AWS PowerShell Tools, it returns an instance object that has a Tags property, which is a list of EC2 Tag objects.
I’m usually a fan of how the AWS PowerShell Tools object models are set up, but this is one case where I feel there could be some improvement. Instead of using a list and forcing users to iterate it to find the right tag, the EC2 objects’ Tags property should be a hashtable keyed by the tag Key so you can index directly to the value. But this is what we have to work with for now.
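If all you want is every tag in a hashtable, a one-off conversion does the trick. Here’s a minimal sketch, assuming $instance came from something like (Get-EC2Instance -InstanceId $id).Instances[0]:

$tagTable = @{}
foreach ($tag in $instance.Tags) {
    # Each element is an EC2 Tag object with Key and Value properties
    $tagTable[$tag.Key] = $tag.Value
}
$tagTable['Name']    # now you can index straight to a tag value ('Name' is just an example key)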
So we came up with a simple function to generate a map of desired EC2 tags from an instance.
function Get-Ec2InstanceTag($instance, [array]$desiredTagKeys) {
    $instanceTags = $instance.Tags
    $tagMap = @{}
    # Walk the instance's tag list and keep only the keys we were asked for
    foreach ($desiredTagKey in $desiredTagKeys) {
        foreach ($instanceTag in $instanceTags) {
            if ($desiredTagKey -eq $instanceTag.Key) {
                $tagMap[$desiredTagKey] = $instanceTag.Value
            }
        }
    }
    return $tagMap
}

Usage for this function looks like this:
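(A sketch with made-up values; the instance ID and tag keys are just placeholders.)

$reservation = Get-EC2Instance -InstanceId 'i-0123456789abcdef0'
$instance = $reservation.Instances[0]
$tags = Get-Ec2InstanceTag -instance $instance -desiredTagKeys @('Name', 'Environment')
$tags['Name']           # e.g. "web-server-01"
$tags['Environment']    # e.g. "production"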

Is My Traffic Being Blocked? (or Using Wireshark to Find Unanswered SYN Packets)

One of the most frustrating things in network troubleshooting can be finding out whether traffic is being blocked. Blocked traffic can keep applications or services from running correctly, and a lot of applications will throw unhelpful or vague error messages when it happens.

Until relatively recently the average network was a pretty open and trusting place. Most computers were allowed to receive and send traffic on any port they pleased, and any protocol could use any port with no explanation required.
But with ideas like network segmentation becoming more prevalent, and tools like AWS security groups essentially wrapping each server in a tiny firewall, that is changing. It’s much more common now for firewall and security admins to whitelist traffic and only allow the ports and services explicitly asked for.
My goal with this post is to help you focus a packet capture to find this blocked traffic, which can be very helpful for your network or firewall admins by relieving headaches, reducing stress, and getting you up and running faster. I’m assuming you’re somewhat familiar with TCP and Wireshark.

This brings us to a really helpful aspect of the TCP 3-way handshake: it’s relatively easy to find TCP conversations that never leave the initial state, which usually means that traffic is being blocked.

Let’s look at a healthy example to get some context.

I marked off the TCP flags because that’s the piece we’re interested in. This is an example of my laptop reaching out to a web server on port 443. The three packets follow the standard TCP handshake:
  1. My laptop sends a SYN (“Hi, I’m a client and I’d like to connect.”)
  2. The server sends a SYN,ACK (“Hi, I’m a server and I’d like to connect too.”)
  3. My laptop sends an ACK (“Sounds good. Let’s be friends!”)
NOTE: The parentheticals are paraphrased for comedic effect.
How does this help us find blocked traffic? Well, we can use a Wireshark analysis filter to look for conversations the client is trying to start that never get answered.
When a typical application sends a SYN packet out but never receives a SYN,ACK back from the server, it will retry the SYN several times, hoping to get a connection. Because it’s trying the same thing over and over again, we can use a Wireshark analysis filter that finds retransmitted packets:
tcp.analysis.retransmission

I’ll start a Wireshark packet capture on my wireless adapter and use a PowerShell command to initiate traffic to a port on the internet I know won’t respond, say 4433 on google.com:

Test-NetConnection -Port 4433 -ComputerName google.com

And then apply a filter in Wireshark to look for retransmitted packets to this port, and they jump right out!
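One way to write that port-specific filter is to combine the retransmission analysis with a port match (4433 here is just the port from this example):

tcp.analysis.retransmission && tcp.port == 4433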

If this were a real application I was trying to fix, and not just a toy example, I’d be able to go to my network admin and tell them that I see traffic leaving my server outbound for the IP and port in the capture.

In this toy example I was able to be very specific, but in real life you might not know which port your application is trying to communicate on. In that scenario you might wind up capturing a lot of traffic pretty fast. For example, when I started this paragraph I turned on a packet capture with no filters, and in the time it took me to type this I caught around 500 packets just from background activity on my laptop!

That can quickly turn into a lot of traffic to sort through, so we can add a Wireshark filter to look only for SYN retransmits.

tcp.analysis.retransmission && tcp.flags.syn == 1 && tcp.flags.ack == 0

In English this is saying, “Show me the packets that are being retransmitted AND are the beginning of a TCP conversation.”

And you can see this filter let me find my 4433 traffic pretty easily (and something on port 8013 my laptop sincerely wants to talk to).

Happy packet capturing!

VPC Flow Logs through CloudWatch Logs Insights

You know all those times you think to yourself, “I wish there were a faster way to search all these logs I keep putting in CloudWatch?”

Well, apparently Alexa was reading your mind at one of those times, because at AWS re:Invent 2018 Amazon released CloudWatch Logs Insights. It’s advertised as a more interactive, helpful log analytics solution than the log search option we already have.
This last week I got some questions about blocked traffic in our AWS account, which seemed like the perfect opportunity to give it a shot (NOTE: You will need to be sending your VPC flow logs to CloudWatch for this to work for you).
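If you aren’t sending flow logs to CloudWatch yet, you can create them with the AWS PowerShell Tools. Here’s a rough sketch (the cmdlet wraps the CreateFlowLogs API; the VPC ID, log group name, and IAM role ARN below are placeholders to swap for your own):

# All identifiers below are placeholders - substitute your own VPC, log group, and role ARN
New-EC2FlowLog -ResourceId 'vpc-0123456789abcdef0' `
    -ResourceType VPC `
    -TrafficType REJECT `
    -LogGroupName 'vpc-flowlogs-reject' `
    -DeliverLogsPermissionArn 'arn:aws:iam::123456789012:role/flowlogs-to-cloudwatch'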
Here are some of the queries I tried out, most of them based loosely off of the examples page.

Count of Blocked Traffic by Source IP and Destination Port

fields @timestamp, dstPort, srcAddr
| stats count(*) as countSrcAddr by srcAddr, dstPort
| sort countSrcAddr desc

This query gives us the top blocked senders by destination port.

Using this I pretty quickly found an ELB with the wrong instance port.

This worked for me because we separate our accept and reject flow logs. If you keep them together you can add a filter as the first line:

filter action = 'REJECT'
| fields @timestamp, dstPort, srcAddr
| stats count(*) as countSrcAddr by srcAddr, dstPort
| sort countSrcAddr desc

Blocked Destination Addresses and Ports for a Specific Source

During our troubleshooting we noticed a specific address sending us a lot of traffic on port 8088, which is typically vendor specific or used as an alternate HTTP port. That was a little odd because we don’t use 8088 externally or internally.
We dug in using this query:
filter srcAddr = 'x.x.x.x'
| fields @timestamp, dstPort, srcAddr, dstAddr

Where x.x.x.x is the source IP address.

I’m not going to post the screenshot because it shows a lot of our destination addresses and it would take time to block them out, but you get the idea.
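If listing every packet is too noisy, the same idea works rolled up as an aggregate, grouping by destination address and port (same fields as above):

filter srcAddr = 'x.x.x.x'
| stats count(*) as hits by dstAddr, dstPort
| sort hits desc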

We did a lookup on the address and it was owned by Digital Ocean, which is a cloud hosting company. It’s likely someone was doing a scan from a server in their environment; hard to say whether the intentions were good or bad.

To satisfy my curiosity I wanted to ask the question, “When did the scan begin and how aggressive is it?”
So I added a stats command to group the sum of the packets into 30-minute bins.

filter srcAddr = 'x.x.x.x' and dstPort = '8088'
| fields @timestamp, dstPort, srcAddr, dstAddr
| stats sum(packets) by bin(30m)

When you use the stats command with a time series you can get a “visualization” like the one below:

It looks like the scan lasted about 6 hours, from 4 am or so my time to around 9:45 my time.

Conclusion

CloudWatch Logs Insights is a much faster way to analyze your logs than the current CloudWatch search. The query language is pretty flexible and reasonably intuitive (though I did spend several minutes scratching my head over the syntax before I found a helpful example).
While it’s an improvement over what was there, it’s not on par with a log analytics tool like Splunk or a data visualization tool like Kibana. The visualizations page for Insights only works with time series data (as far as I can tell) and isn’t very intuitive for combining multiple query results. For that you still have to import the query into a CloudWatch dashboard.
Amazon’s edge over those more mature tools is that it’s already integrated into your AWS account, with almost no setup required (except getting your logs into CloudWatch), plus the pricing model. As usual with AWS it’s extremely friendly for getting up and running, but it’s easy to see how the cost could grow out of control if you’re not paying attention (picture someone creating a query over a month’s worth of data on a dashboard that refreshes every 30 seconds).
Happy querying!