  • crystalaryeh

How To Analyze Network Flow Data For Security Events

This post demonstrates how to analyze network security data using a dataset from the 2019 Trend Micro CTF (Wildcard 400). We have shared this file publicly in Gigasheet so that you can follow along.

This Trend Micro Capture the Flag exercise involves using network flow data to uncover anomalous security events. The data presented here is synthetic and does not represent typical network protocols or behavior. As a result, an extensive understanding of network protocols is not required to answer these questions.


Upload your file in Gigasheet

Network flow datasets are often large enough to overwhelm many tools, but Gigasheet can handle up to 1 billion rows of data in a file.


The gowiththeflow_20190826.csv file does not come with header names. If your file has no header names, you can add them in Gigasheet to keep your data organized and make analysis easier. To add a header name, click the three lines on the far right of a column and select Rename.

[Figure: Adding Header Names]

I named the headers Time Frame, IP.SRC, IP.DST, PORTS, and BYTES SENT.

[Figure: Added Header Names]
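If you prefer to work outside Gigasheet, the same header-naming step can be sketched in Python using only the standard library. This is a minimal sketch, not the method used in this post: the `read_flows` helper and the sample rows are hypothetical, and it assumes the file is comma-separated with the five columns in the order named above.

```python
import csv
import io

# Column names chosen in this post; the raw file has no header row.
HEADERS = ["Time Frame", "IP.SRC", "IP.DST", "PORTS", "BYTES SENT"]


def read_flows(handle):
    """Yield one dict per flow record, keyed by HEADERS (hypothetical helper)."""
    for row in csv.DictReader(handle, fieldnames=HEADERS):
        # Convert the byte count so it can be summed and sorted numerically.
        row["BYTES SENT"] = int(row["BYTES SENT"])
        yield row


# Hypothetical sample rows for illustration, not taken from the real dataset.
sample = io.StringIO(
    "1566777600000,10.0.0.5,203.0.113.9,443,1200\n"
    "1566777601000,10.0.0.7,198.51.100.4,80,300\n"
)
flows = list(read_flows(sample))
print(flows[0]["IP.SRC"], flows[0]["BYTES SENT"])
```

To run this against the real file, you would pass an open file handle for gowiththeflow_20190826.csv (opened with `newline=""`) instead of the in-memory sample.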

Now that we have the headers named, we can answer some questions from Trend Micro's 2019 Capture the Flag exercise.

Capture the Flag Question 1: Find the Noisy IP Address

Q1: Our intellectual property is leaving the building in large chunks. A machine inside is being used to send out all of our widget designs. One host sends out much more data from the enterprise than the others. What is its IP?

To find the IP address, I used the grouping feature on the IP.SRC column. Click the three lines on the right of the column and select Group.

[Figure: Grouping IP.SRC]

Grouping surfaces one source IP as the bad host. Looking at the distribution of outbound traffic, it stands out as suspicious: it appears to be the only host that kept its transfers under 50 GB, seemingly to avoid attention.
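The grouping step amounts to summing bytes sent per source IP and looking at the largest total. Here is a rough sketch of that idea in Python; the `bytes_by_source` helper and the sample flows are hypothetical, and the sketch does not separate internal from external destinations, which a full answer to Q1 may require.

```python
from collections import Counter


def bytes_by_source(flows):
    """Sum bytes sent per source IP (hypothetical helper)."""
    totals = Counter()
    for src, sent in flows:
        totals[src] += sent
    return totals


# Hypothetical (IP.SRC, BYTES SENT) pairs, not taken from the real dataset.
flows = [
    ("10.0.0.5", 1200),
    ("10.0.0.7", 300),
    ("10.0.0.5", 5000),
    ("10.0.0.9", 40),
]
totals = bytes_by_source(flows)

# most_common(1) returns the source with the largest total.
top_ip, top_bytes = totals.most_common(1)[0]
print(top_ip, top_bytes)
```

Printing `totals.most_common(10)` instead gives a quick view of the distribution, similar to scanning the grouped rows in the sheet.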


Capture the Flag Question 2: After Work Activity

Q2: Another attacker has a job scheduled that exports the contents of our internal wiki. One host is sending out much more data during off-hours from the enterprise than the others. Office hours are between 16:00 and 23:00. What is its IP?

I first sorted the BYTES SENT column by selecting Sort Sheet - 9 to 1.

[Figure: Sort Sheet 9 to 1]

This showed me which BYTES SENT value was the highest. I also used the aggregations feature in the bottom right, under the BYTES SENT column, and selected Max to confirm the highest number of bytes sent.

[Figure: Aggregations feature]

It is clear that the host at the top of the sort is exfiltrating data during off-business hours, and the highest number of bytes sent is 843091.
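A complementary way to check this is to keep only off-hours flows and sum bytes per source IP. Note that this swaps in a per-host off-hours total rather than the sort-and-max approach used above. The sketch below rests on several assumptions that are not confirmed by the dataset: that the Time Frame column holds Unix epoch milliseconds in UTC, and that office hours cover 16:00 up to, but not including, 23:00. All helper names and sample rows are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone

# Office hours from the question; the end hour is treated as exclusive (assumption).
OFFICE_START_HOUR = 16
OFFICE_END_HOUR = 23


def is_off_hours(epoch_ms):
    """Return True if the timestamp falls outside office hours (UTC assumed)."""
    hour = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc).hour
    return not (OFFICE_START_HOUR <= hour < OFFICE_END_HOUR)


def off_hours_bytes_by_source(flows):
    """Sum bytes sent per source IP, counting off-hours flows only."""
    totals = Counter()
    for epoch_ms, src, sent in flows:
        if is_off_hours(epoch_ms):
            totals[src] += sent
    return totals


def ms(hour):
    """Build an epoch-millisecond timestamp on 2019-08-26 at the given UTC hour."""
    return int(datetime(2019, 8, 26, hour, tzinfo=timezone.utc).timestamp() * 1000)


# Hypothetical (Time Frame, IP.SRC, BYTES SENT) rows, not from the real dataset.
flows = [
    (ms(3), "10.0.0.5", 900),    # off-hours
    (ms(17), "10.0.0.7", 5000),  # office hours, ignored
    (ms(23), "10.0.0.5", 100),   # off-hours (23:00 is outside the window)
    (ms(2), "10.0.0.9", 50),     # off-hours
]
off_totals = off_hours_bytes_by_source(flows)
top_ip, top_bytes = off_totals.most_common(1)[0]
print(top_ip, top_bytes)
```

If the timestamps turn out to use a different unit or time zone, only `is_off_hours` needs to change.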


Capture the Flag Question 3: Malware Command and Control

Q3: We're always running a low-grade infection; some internal machines will always have some sort of malware. Some of these infected hosts phone home to C&C on a private channel. What unique port is used by external malware C&C to marshal its bots?

Using our filter feature, I filtered the PORTS column for port 113.

[Figure: Filter feature]

I found that only one external IP address uses port 113 to communicate with internal machines.


This is most likely the port used by external malware C&C (command and control) to marshal its bots.

[Figure: C&C traffic]
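One way to look for such a port without knowing it in advance is to count the distinct source IPs seen on each port and keep the ports with exactly one. The following is a minimal sketch of that idea, not the filter-based method used above. The `sources_per_port` helper and the sample flows are hypothetical, and the sketch assumes the input has already been limited to flows from external sources, since how to tell internal from external addresses is not covered here.

```python
from collections import defaultdict


def sources_per_port(flows):
    """Map each port to the set of distinct source IPs seen on it."""
    seen = defaultdict(set)
    for src, port in flows:
        seen[port].add(src)
    return seen


# Hypothetical (IP.SRC, PORTS) pairs from external sources, not real data.
flows = [
    ("203.0.113.9", 113),
    ("203.0.113.9", 113),
    ("198.51.100.4", 80),
    ("192.0.2.1", 80),
    ("198.51.100.4", 443),
    ("192.0.2.1", 443),
]

# Keep only ports that a single source IP talks on.
single_source_ports = {
    port: next(iter(ips))
    for port, ips in sources_per_port(flows).items()
    if len(ips) == 1
}
print(single_source_ports)
```

A port used by a single external address is only a lead; it still needs to be checked against the traffic itself, as done with the filter on port 113.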

Gigasheet makes this kind of analysis easy on any CSV, JSON, PCAP, or EVTX file.

Sign up at gigasheet.co so you can try this on your own too.