Analyzing Honeypot's Attacker Lifecycle
Looking across several months of data, from February 2015 to April 2015, I started analyzing the typical lifecycle of an attacker hitting my honeypots. A piece in the Verizon DBIR on how long an attacking IP address stays in use got me curious whether my own data showed the same trend.
In this post, I'll go over how long attackers used each IP address and how long the domains hosting their malware were seen, which shows how often they rotated their attack infrastructure.
IP Address Lifecycle Analysis
The first search I ran in Splunk was to see the average number of days I noted an attacker's IP address:
sourcetype=kippojson | rex field=timestamp "(?<date>\d{4}-\d{2}-\d{2})T"
| stats dc(date) as dc by src_ip | stats avg(dc)
The above search creates a new field called date, which is just the YYYY-MM-DD of the event, then counts the distinct dates on which each attacker IP was seen, and lastly returns the average of those counts. The average number of days was 2.63, with the highest number of days seen being 42 and the lowest being 1.
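For readers without Splunk, the same calculation can be sketched in Python. This is a minimal illustration, assuming each event is a dict with timestamp and src_ip keys as in the search above; the days_seen helper and the sample events are hypothetical and are not real honeypot data.

```python
import re
from collections import defaultdict

# Same pattern as the rex in the search: capture YYYY-MM-DD before the "T".
DATE_RE = re.compile(r"(?P<date>\d{4}-\d{2}-\d{2})T")

def days_seen(events, key="src_ip"):
    """Map each value of `key` to the number of distinct dates it appears on."""
    dates = defaultdict(set)
    for event in events:
        match = DATE_RE.search(event.get("timestamp", ""))
        if match and key in event:
            dates[event[key]].add(match.group("date"))
    return {value: len(seen) for value, seen in dates.items()}

# Hypothetical sample events (documentation IP ranges), not real data.
events = [
    {"timestamp": "2015-03-04T10:00:00Z", "src_ip": "192.0.2.10"},
    {"timestamp": "2015-03-04T11:30:00Z", "src_ip": "192.0.2.10"},
    {"timestamp": "2015-03-05T09:15:00Z", "src_ip": "192.0.2.10"},
    {"timestamp": "2015-03-05T09:20:00Z", "src_ip": "198.51.100.7"},
]

dc = days_seen(events)                 # equivalent of: stats dc(date) as dc by src_ip
average = sum(dc.values()) / len(dc)   # equivalent of: stats avg(dc)
print(dc, average, max(dc.values()), min(dc.values()))
```

Keeping a set of dates per IP is what makes this a distinct count: two events on the same day from the same IP only count once.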
Next, out of all the IP addresses I've seen, which ones were seen on the most days, up to that maximum of 42?
sourcetype=kippojson | rex field=timestamp "(?<date>\d{4}-\d{2}-\d{2})T"
| stats dc(date) as dc by src_ip | sort - dc | head 10
The search is pretty much the same; I'm just sorting by days seen and returning the top 10 src_ip values instead of getting the average.
As an aside, let's see what type of activity the Top 10 IPs were involved in. It seems that these IPs were mainly used to scan for credentials, while a few of them would, after a successful login, attempt to run a poorly written script that always failed to download their malware.
What about the IP addresses that were only seen for one day? What type of activity were they getting into? Only 2.44% of their sessions included commands being entered, so it's safe to say that about 97% of those sessions were used just to harvest credentials or test login capability.
Out of all the login attempts from IP addresses seen for 30+ days, only 43 were successful, while the other 1,745 failed.
Next, let's look at the percentage of attackers we only saw for one day:
sourcetype=kippojson | rex field=timestamp "(?<date>\d{4}-\d{2}-\d{2})T"
| stats dc(date) as dc by src_ip | where dc == 1
Out of all the attackers, 65%, or 3,855 IP addresses, were seen for only one day.
Next, how about the percentage of attackers seen for more than one day:
sourcetype=kippojson | rex field=timestamp "(?<date>\d{4}-\d{2}-\d{2})T"
| stats dc(date) as dc by src_ip | where dc > 1
That gave us 35%, or 1,946 IP addresses.
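The two searches above return lists of IP addresses; the percentages come from dividing each list's size by the total number of IPs. A minimal Python sketch of that step, assuming a hypothetical days-per-IP mapping has already been computed (the one_day_share function and the sample values are illustrative only):

```python
def one_day_share(days_per_ip):
    """Return (one_day_count, multi_day_count, one_day_pct, multi_day_pct)."""
    total = len(days_per_ip)
    if total == 0:
        raise ValueError("no IP addresses to summarize")
    one_day = sum(1 for days in days_per_ip.values() if days == 1)  # where dc == 1
    multi_day = total - one_day                                      # where dc > 1
    return one_day, multi_day, 100 * one_day / total, 100 * multi_day / total

# Hypothetical days-seen counts, not real honeypot data.
sample = {"192.0.2.10": 2, "198.51.100.7": 1, "203.0.113.5": 1, "203.0.113.9": 1}
one, multi, one_pct, multi_pct = one_day_share(sample)
print(f"{one} IPs ({one_pct:.0f}%) seen one day, {multi} IPs ({multi_pct:.0f}%) seen longer")
```

Computing both shares from one total guarantees the two percentages sum to 100.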
I was interested to see whether the attackers' infrastructure was used sporadically or in bursts, so I ran this search to generate some visuals:
sourcetype=kippojson (src_ip=115.230.126.151 OR src_ip=183.136.216.4 OR
src_ip=183.136.216.6 OR src_ip=115.231.218.130
OR src_ip=115.231.222.45 OR src_ip=183.136.216.3 OR src_ip=175.126.82.235 OR
src_ip=95.77.16.45 OR src_ip=115.239.228.11)
| timechart count by src_ip
The search looks at the Top 10 IP addresses and counts the number of events for each IP over time.
This illustrates that, for the most part, the attackers treat these IPs like a burner phone: they use them for scanning for a few weeks, then dump them. We can see the activity pick up right around March 4th to 5th, go strong for a few weeks, and die down on April 1st.
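The bucketing behind timechart count by src_ip can be approximated outside Splunk as well. The sketch below is a rough Python equivalent, assuming one-day buckets (timechart chooses its own span from the search time range unless span is set) and events shaped as dicts with timestamp and src_ip keys; daily_counts and the sample events are hypothetical.

```python
from collections import Counter, defaultdict

def daily_counts(events, ips):
    """Count events per day for each IP in `ips`, ordered by day."""
    counts = defaultdict(Counter)
    for event in events:
        ip = event.get("src_ip")
        if ip in ips:
            day = event["timestamp"][:10]  # YYYY-MM-DD prefix of the timestamp
            counts[day][ip] += 1
    return {day: dict(counts[day]) for day in sorted(counts)}

# Hypothetical sample events, not real honeypot data.
events = [
    {"timestamp": "2015-03-04T10:00:00Z", "src_ip": "192.0.2.10"},
    {"timestamp": "2015-03-04T10:05:00Z", "src_ip": "192.0.2.10"},
    {"timestamp": "2015-03-05T08:00:00Z", "src_ip": "198.51.100.7"},
    {"timestamp": "2015-03-05T08:30:00Z", "src_ip": "203.0.113.5"},  # not in the watch list
]

chart = daily_counts(events, ips={"192.0.2.10", "198.51.100.7"})
for day, per_ip in chart.items():
    print(day, per_ip)
```

A burst shows up as a run of consecutive days with high counts for an IP, followed by days where that IP is absent.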
Domain Lifecycle Analysis
Moving on to domains used by attackers to host malware, I wanted to see how long each domain was in use.
Starting out, we have about 198 unique domains seen throughout our honeypots. The most days a domain has been seen so far is 30, with the fewest being 1.
Let's do the same type of searches as we did with IP addresses to see what we get, starting with percentage of domains seen one day, and percentage seen more than one day:
sourcetype=kippojson [ search sourcetype=kippojson domain=* | stats count by session
| fields session ] | transaction session
| rex field=timestamp "(?<date>\d{4}-\d{2}-\d{2})T"
| stats dc(date) as dc by domain | where dc == 1
The above search uses a subsearch to return only the sessions in which a domain was seen, then groups the events together with the transaction command based on their session identifier. We then apply the same regex to the timestamp field and get the distinct count of days per domain, then return only the domains that were seen on a single day, which gives us 58%, or 114 domains. Knowing that, we can say the remaining 84 domains (or 42%) were seen more than one day.
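The subsearch-plus-transaction logic can be sketched in Python to make the steps concrete. This is only a rough approximation of what Splunk does when it merges a session's events: it assumes each event is a dict with session and timestamp keys and an optional domain key, and it credits every date in a session to every domain in that session. The domain_days function and the sample events are hypothetical.

```python
from collections import defaultdict

def domain_days(events):
    """Approximate distinct days seen per domain, grouping events by session."""
    # Step 1 (subsearch): sessions with at least one event carrying a domain.
    sessions_with_domain = {e["session"] for e in events if e.get("domain")}

    # Step 2 (transaction): group the events of those sessions together.
    by_session = defaultdict(list)
    for e in events:
        if e["session"] in sessions_with_domain:
            by_session[e["session"]].append(e)

    # Step 3 (stats dc(date) by domain): collect distinct dates per domain.
    days = defaultdict(set)
    for group in by_session.values():
        domains = {e["domain"] for e in group if e.get("domain")}
        dates = {e["timestamp"][:10] for e in group}
        for domain in domains:
            days[domain] |= dates
    return {domain: len(seen) for domain, seen in days.items()}

# Hypothetical sample events, not real honeypot data.
events = [
    {"session": "s1", "timestamp": "2015-03-04T10:00:00Z"},
    {"session": "s1", "timestamp": "2015-03-04T10:01:00Z", "domain": "malware.example"},
    {"session": "s2", "timestamp": "2015-03-06T12:00:00Z", "domain": "malware.example"},
    {"session": "s3", "timestamp": "2015-03-06T13:00:00Z", "domain": "other.example"},
    {"session": "s4", "timestamp": "2015-03-07T09:00:00Z"},  # no domain, dropped
]

per_domain = domain_days(events)
one_day_domains = sorted(d for d, n in per_domain.items() if n == 1)
print(per_domain, one_day_domains)
```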
How about the average number of days each domain was seen:
sourcetype=kippojson domain=* | rex field=timestamp "(?<date>\d{4}-\d{2}-\d{2})T"
| stats dc(date) as dc by domain | stats avg(dc)
Running the above search, we get an average of 2.54 days.
Below are the domains that were seen the most:
Lastly, how does the timeline for these domains look? Not as pretty and straightforward as the attackers' IP addresses, that's for sure.
TL;DR
- Average number of days attacking IP was used: 2.63
- Highest number of days attacking IP was used: 42
- Average number of days malware-hosting domain was used: 2.54
- Highest number of days malware-hosting domain was used: 30