Great Firewall of China Active Probing IDS Signatures

15 September 2015

A friend of mine, @Munin, recently linked to a page of research on the "Great Firewall of China" (GFW). It describes how the GFW actively probes servers on the internet to detect whether they are acting as proxies that let users behind the GFW circumvent the firewall and access the internet unfiltered.

Active probing is the most recent step in the ongoing arms race of Internet censorship. Users set up proxies to circumvent blocks; censors responded by identifying and blocking proxies by deep packet inspection (DPI); and circumventors made proxy protocols more difficult to detect in turn. Deprived of its capacity for easy, passive protocol identification, the censor now goes straight to the source and interrogates the server directly after it sees a potentially suspicious connection. The censor acts like a user by issuing its own connections to a suspected proxy server, as illustrated in the diagram to the right. If the server responds using a prohibited protocol, then the censor now takes some blocking action, such as adding its IP address to a blacklist.

The research has a ton of interesting information about the GFW and how it probes servers, but it also lists some ways to detect probes, which would make for some great IDS signatures.

I haven't written a ton of IDS signatures, so I thought this would be a good way to work on my skills and possibly help users identify these probes. So, let's get started...

First up in their article was:

Check for traffic from the IP address 202.108.181.70.
The IP address 202.108.181.70 is disproportionately involved in active probing (sending half of all probes in one study), for reasons we do not understand.

This one should be relatively easy...

alert ip 202.108.181.70 any -> $HOME_NET any (msg:"Active Probing from known Chinese IP";
reference:url,https://nymity.ch/active-probing/;
classtype:misc-attack; sid:1000000;)

The above rule alerts whenever the IP address 202.108.181.70 talks to any of our assets. Nothing too special here. Moving on...
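If you only have web server logs and no IDS, the same check can be sketched in a few lines of Python. This is a minimal sketch, assuming a log format where the client address is the first whitespace-separated field (as in Apache's common and combined formats); the function name is my own.

```python
import ipaddress

# The address named in the research; everything else here is illustrative.
PROBE_IP = ipaddress.ip_address("202.108.181.70")


def lines_from_probe_ip(log_lines):
    """Yield log lines whose first field is the known probing address."""
    for line in log_lines:
        fields = line.split(maxsplit=1)
        if not fields:
            continue  # blank line
        try:
            client = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # first field is not an IP address (e.g. a hostname)
        if client == PROBE_IP:
            yield line
```

You could feed it an open log file directly, since iterating over a file object yields lines.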

Look for certain requests in web server logs.
The pattern POST /vpnsvc/connect.cgi indicates a SoftEther probe. The pattern GET /twitter.com indicates an AppSpot probe.

alert http $EXTERNAL_NET any -> $HTTP_SERVERS any (msg:"Active Probing - SoftEther Probe";
flow:established,to_server; content:"POST"; http_method; content:"/vpnsvc/connect.cgi"; http_uri; nocase;
reference:url,https://nymity.ch/active-probing/; classtype:misc-attack;
sid:1000001;)

alert http $EXTERNAL_NET any -> $HTTP_SERVERS any (msg:"Active Probing - AppSpot Probe";
flow:established,to_server; content:"GET"; http_method; content:"/twitter.com"; http_uri; nocase;
reference:url,https://nymity.ch/active-probing/; classtype:misc-attack; sid:1000002;)

The two rules above look for two different probes: the first matches a POST request to /vpnsvc/connect.cgi (SoftEther), and the second matches a GET request to /twitter.com (AppSpot).
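The research frames this check as something to look for in web server logs, so here is a rough Python equivalent. It is a minimal sketch, assuming the request line is logged in double quotes as in Apache's default formats; the labels and the function name are my own.

```python
import re

# Request lines named in the research, as they would appear inside the
# quoted request field of an access log line.
PROBE_PATTERNS = {
    "SoftEther": re.compile(r'"POST /vpnsvc/connect\.cgi[ ?]'),
    "AppSpot": re.compile(r'"GET /twitter\.com[ /?]'),
}


def classify_request(log_line):
    """Return the probe label matching this log line, or None."""
    for label, pattern in PROBE_PATTERNS.items():
        if pattern.search(log_line):
            return label
    return None
```

Note that the method matters here: a POST to /twitter.com or a GET to /vpnsvc/connect.cgi is not what the research describes, so the sketch ignores both.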

Next up we're looking for a certain Host in the HTTP header:

Look for web requests with an unexpected Host header.
An unexpected Host header, especially one pointing to a subdomain of appspot.com, is possible evidence of an AppSpot probe. Your web server may not log the Host header by default. In Apache, you can enable mod_log_forensic to see request headers.

alert http $EXTERNAL_NET any -> $HTTP_SERVERS any (msg:"Active Probing - AppSpot Host Header";
flow:established,to_server; content:"appspot.com"; http_header; nocase;
reference:url,https://nymity.ch/active-probing/; classtype:misc-attack; sid:1000003;)

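In the above rule, we're looking for appspot.com anywhere in the HTTP headers, so it will also fire on other headers that mention appspot.com (a Referer, for example), not only on Host. As the quote above says, your web server may not log the Host header by default, so you may need to enable that in your particular web server if you want to confirm an alert against your logs.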
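For the broader "unexpected Host header" case, a rule can't know which host names your server legitimately answers to, but a small script can. Here is a minimal Python sketch, assuming you already have the Host values extracted (for example from mod_log_forensic output) and a set of names you expect; both function names and the return labels are my own.

```python
def normalise_host(host_header):
    """Lower-case a Host header value and drop an optional port."""
    host = host_header.strip().lower()
    if not host.startswith("["):      # leave bracketed IPv6 literals alone
        host = host.split(":", 1)[0]  # drop an optional :port
    return host.rstrip(".")


def classify_host(host_header, expected_hosts):
    """Return 'appspot', 'unexpected', or 'expected' for a Host value."""
    host = normalise_host(host_header)
    if host == "appspot.com" or host.endswith(".appspot.com"):
        return "appspot"
    if host not in expected_hosts:
        return "unexpected"
    return "expected"
```

Matching on the ".appspot.com" suffix rather than a bare substring avoids flagging unrelated names that merely contain the string.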

Next...

Check for binary garbage in application logs.
The obfs2 and obfs3 protocols look like random binary noise by design. They tend to stand out in application logs. For example, here is an obfs2 probe seen in an Apache log:

192.0.2.1 - - [13/Jul/2015:05:56:50 -0600] "\xba\xf4\xf1gy\x9e\xe7O9..." 400 0 "-" "-"
Try grepping your logs for escaped bytes. (Be aware that there may be many false positives; for example \x16\x03 usually simply indicates a TLS connection to a non-TLS port.)

grep -F '\x' application.log

alert http $EXTERNAL_NET any -> $HTTP_SERVERS any (msg:"Active Probing - obfs2/obfs3 Protocol";
flow:established,to_server; content:"\\x"; reference:url,https://nymity.ch/active-probing/;
classtype:misc-attack; sid:1000004;)

I'm not sure how valuable this rule would be, as it could easily generate a ton of false positives. I'm also not sure it's the best way to search for binary data in a request like this: the \x sequences in the Apache log are how the server escapes non-printable bytes when it writes the log, so on the wire the probe is raw binary rather than the literal characters \x, and this rule only matches the literal text. Since the data is random noise, I'm not even sure I can put a length on the binary data, like \\x{1,5}, since I don't know how long it will be.

Next, we are looking for some binary content that probes for Tor servers.
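On the log side, the grep suggestion from the research can be tightened a little. Below is a minimal Python sketch, assuming Apache-style escaping (\x followed by two hex digits) and treating a logged request that starts with \x16\x03 as a TLS handshake to be skipped; the threshold of four escaped bytes and the function name are arbitrary choices of mine, not something from the research.

```python
import re

# Literal backslash, 'x', two hex digits: how Apache logs a non-printable byte.
ESCAPED_BYTE = re.compile(r"\\x[0-9a-fA-F]{2}")

# Literal text "\x16\x03 at the start of the quoted request field, which
# usually just means a TLS connection to a non-TLS port.
TLS_PREFIX = '"\\x16\\x03'


def looks_like_binary_probe(log_line, min_escapes=4):
    """Heuristic: several escaped bytes and not an obvious TLS handshake."""
    if TLS_PREFIX in log_line:
        return False
    return len(ESCAPED_BYTE.findall(log_line)) >= min_escapes
```

This is still only a heuristic and will have false positives; it just removes the most common one the research calls out.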

The Great Firewall probes for Tor servers using a TLS connection containing a single Tor VERSIONS cell (see Section 4.1 of the linked specification). The VERSIONS cell declares support for versions 1 and 2 of the Tor protocol. In hexadecimal, the payload is this:
00 00 07 00 04 00 01 00 02

alert tcp $EXTERNAL_NET any -> $HOME_NET any (msg:"Active Probing - Tor VERSIONS Cell";
flow:established,to_server; content:"|00 00 07 00 04 00 01 00 02|";
reference:url,https://nymity.ch/active-probing/; classtype:misc-attack; sid:1000005;)

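The above rule simply looks for the hexadecimal payload in the packet contents. Since the probe sends the cell inside a TLS connection, the rule can only match where the sensor sees the decrypted stream.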
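To make the nine bytes less magic, here is a small Python sketch that builds them from their fields. The field layout (2-byte circuit ID of zero, 1-byte command of 7, 2-byte payload length, then 2 bytes per version) is my reading of the Tor specification rather than something spelled out in the research, and both function names are my own. The check assumes you already have decrypted bytes, since the cell travels inside TLS.

```python
import struct


def build_versions_cell(versions):
    """Build a VERSIONS cell: CircID 0, command 7, length, then versions."""
    payload = b"".join(struct.pack(">H", version) for version in versions)
    # ">HBH" = big-endian 2-byte CircID, 1-byte command, 2-byte length.
    return struct.pack(">HBH", 0, 7, len(payload)) + payload


# The probe declares support for versions 1 and 2.
PROBE_CELL = build_versions_cell([1, 2])


def contains_probe_cell(decrypted_payload):
    """Return True if the decrypted bytes contain the probe's VERSIONS cell."""
    return PROBE_CELL in decrypted_payload
```

PROBE_CELL comes out as the same byte sequence quoted above: 00 00 07 00 04 00 01 00 02.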

Last one...

SoftEther probes resemble the HTTPS-based client handshake of SoftEther VPN, a multi-protocol VPN client.

POST /vpnsvc/connect.cgi HTTP/1.1
Connection: Keep-Alive
Content-Length: 1972
Content-Type: image/jpeg

GIF89a...

The value of the Content-Length header may vary. In the official SoftEther protocol, the Content-Length reflects a random amount of padding following the fixed part of the body. The body of the SoftEther probe we saw also included random padding, but because we only recovered one example in full detail, we cannot say for sure whether the length varies.

Despite the Content-Type header, the POST body is a GIF image, not a JPEG, 1,411 bytes in size. In the SoftEther source code, the file is found in src/Cedar/Watermark.c.

alert http $EXTERNAL_NET any -> $HTTP_SERVERS any (msg:"Active Probing - SoftEther Probe";
flow:established,to_server; content:"POST"; http_method; content:"/vpnsvc/connect.cgi"; http_uri;
nocase; content:"image/jpeg"; http_header; nocase; content:"GIF89a"; http_client_body;
reference:url,https://nymity.ch/active-probing/; classtype:misc-attack; sid:1000006;)

The above rule looks for a POST request to /vpnsvc/connect.cgi that has an HTTP Content-Type header of image/jpeg but carries the GIF magic number, GIF89a, in the request body.
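The same mismatch check is easy to express outside an IDS, for example in a honeypot or a request-logging hook. This is a minimal Python sketch, assuming you already have the parsed method, path, headers (as a dict), and raw body; the function name and argument shapes are hypothetical, not part of any particular web framework.

```python
def looks_like_softether_probe(method, path, headers, body):
    """Return True for a POST that claims image/jpeg but carries a GIF."""
    # Header names are case-insensitive, so normalise the keys first.
    lowered = {name.lower(): value for name, value in headers.items()}
    # Ignore any parameters after the media type (e.g. "; charset=...").
    content_type = lowered.get("content-type", "").split(";", 1)[0].strip().lower()
    return (
        method.upper() == "POST"
        and path.lower() == "/vpnsvc/connect.cgi"
        and content_type == "image/jpeg"
        and body.startswith(b"GIF89a")  # GIF magic number seen in the probe
    )
```

It deliberately ignores Content-Length, since the research says that value may vary.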

So, that's it. Nothing too crazy, and this was good experience for me. If I made any mistakes in these rules, please feel free to shoot an email to brian@nullsecure.org and I'll update accordingly. Also, check out the article I referenced at the beginning; it has some great research and data in it.