First, we need to understand how the GFW blocks our traffic#
-
IP blackhole: Currently unsolvable, but it only affects certain services (Google, Twitter, YouTube, etc.).
-
DNS pollution: The resolver returns a fake IP for the domain name. Work around it by pinning the domain to a known-good IP in the hosts file, or by using encrypted or authenticated DNS (DoH, DNS signing, etc.).
-
HTTP hijacking: Because the traffic is not encrypted, the GFW, as a natural man-in-the-middle, can tamper with it directly (e.g., redirecting to a 404 page or hijacking to an anti-fraud page). Using HTTPS avoids this, but you may then run into SNI blocking.
-
SNI blocking: Before the encrypted connection between client and server is established, the client sends a Client Hello message, which is in plaintext and generally carries the server_name. The GFW can therefore see which website you are trying to access and block domains that are not on the whitelist (e.g., discord.com). Since server_name is an extension and not mandatory, you can avoid SNI blocking by not sending it, provided the server still accepts a Client Hello without it.
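To make the plaintext server_name concrete, here is a minimal Python sketch. The helper names build_client_hello and extract_sni are illustrative and not from the original post. The Client Hello is generated in memory with ssl.MemoryBIO, so nothing is sent over the network, and the parser assumes the whole Client Hello fits in a single TLS record.

```python
import ssl


def build_client_hello(server_name):
    """Return the first bytes a TLS client would send (the Client Hello), offline."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=server_name)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: the client is now waiting for a Server Hello
    return outgoing.read()


def extract_sni(record):
    """Read server_name from a raw Client Hello record, as any on-path observer could."""
    # Record header is 5 bytes; 0x16 = handshake record, 0x01 = Client Hello.
    if len(record) < 6 or record[0] != 0x16 or record[5] != 0x01:
        return None
    pos = 5 + 4                  # record header + handshake type and 3-byte length
    pos += 2 + 32                # client version + random
    pos += 1 + record[pos]       # session id
    pos += 2 + int.from_bytes(record[pos:pos + 2], "big")  # cipher suites
    pos += 1 + record[pos]       # compression methods
    ext_end = pos + 2 + int.from_bytes(record[pos:pos + 2], "big")
    pos += 2
    while pos + 4 <= ext_end:
        ext_type = int.from_bytes(record[pos:pos + 2], "big")
        ext_len = int.from_bytes(record[pos + 2:pos + 4], "big")
        pos += 4
        if ext_type == 0:        # extension 0 is server_name
            # Skip list length (2 bytes) and name type (1 byte), then read the name.
            name_len = int.from_bytes(record[pos + 3:pos + 5], "big")
            return record[pos + 5:pos + 5 + name_len].decode("ascii")
        pos += ext_len
    return None


if __name__ == "__main__":
    print(extract_sni(build_client_hello("discord.com")))
```

No key material is needed at any point in extract_sni, which is why a middlebox can read the hostname before any encryption starts.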
Now, let's analyze the GFW's blocking situation for different websites#
We use Wireshark to capture packets.
-
First, try to access www.baidu.com, a domain that is not blocked by the GFW.
-
Let's ping it first.
-
Get the IP: 2408:873d:22:18ac:0:ff:b021:1393
-
Bind the domain to this IP in the hosts file.
-
Capture packets with Wireshark: the Client Hello sent by the client clearly shows the Server Name field, the Server Hello is received normally, and the two sides then start communicating.
-
Check the browser: the website is accessible.
-
-
Next, try accessing discord.com.
-
Ping it first: both the domain and the IP it resolves to are unreachable.
-
At this point, use itdog.cn to run an IPv4 ping and ping each of the resolved IPs in turn.
-
It can be seen that the first IP is reachable.
-
Bind discord.com to that IP in the hosts file and capture packets again.
-
The capture shows that, even with the hosts binding, the GFW detects the Server Name field in the Client Hello sent by the client and then sends a RST packet to the client, which resets the connection. The browser reports ERR_CONNECTION_RESET, meaning the connection has been reset, and the user cannot access the webpage.
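The same experiment can be reproduced without a browser. The sketch below is an illustration only: the function name probe_sni and its result labels are made up for this example. It connects to a pinned IP (the effect of the hosts binding), sends the given hostname as SNI, and reports whether the handshake completed or the connection was reset.

```python
import socket
import ssl


def probe_sni(ip, hostname, port=443, timeout=5.0):
    """Connect to a fixed IP, send `hostname` as SNI, and classify what happens."""
    ctx = ssl.create_default_context()
    try:
        # Connecting by IP bypasses DNS, like an entry in the hosts file.
        with socket.create_connection((ip, port), timeout=timeout) as raw:
            # The Client Hello, including the SNI, is sent during wrap_socket.
            with ctx.wrap_socket(raw, server_hostname=hostname):
                return "ok"
    except ConnectionResetError:
        return "reset"        # a RST arrived, as in the capture described above
    except socket.timeout:
        return "timeout"      # packets silently dropped
    except ssl.SSLError:
        return "tls-error"    # handshake answered, but TLS itself failed
    except OSError:
        return "unreachable"  # refused, no route, and similar errors


if __name__ == "__main__":
    # Placeholder address from the documentation range; substitute a reachable IP.
    print(probe_sni("203.0.113.1", "discord.com"))
```

A "reset" result right after the Client Hello, on an IP that answers ping, is consistent with SNI blocking rather than an IP blackhole, although a reset alone does not prove which device sent it.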
Next, try sending a Client Hello without the Server Name#
Access succeeds, and Wireshark shows no Server Name field in the Client Hello.
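As a rough sketch of how a client can leave the extension out, Python's ssl module omits SNI when server_hostname is None. The function names below are hypothetical, and this is not necessarily how the original test was performed. Note that hostname checking has to be switched off for this to work, so the certificate should be validated some other way.

```python
import socket
import ssl


def no_sni_context():
    """Client context that allows a handshake without a hostname."""
    ctx = ssl.create_default_context()
    # Required when server_hostname is None; the certificate chain is still
    # verified, but it is no longer matched against a hostname.
    ctx.check_hostname = False
    return ctx


def connect_without_sni(ip, port=443, timeout=5.0):
    """Open a TLS connection to `ip` whose Client Hello carries no server_name."""
    raw = socket.create_connection((ip, port), timeout=timeout)
    try:
        return no_sni_context().wrap_socket(raw, server_hostname=None)
    except Exception:
        raw.close()
        raise


def first_flight(server_hostname):
    """Offline check: the bytes the client would send first, without any network."""
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = no_sni_context().wrap_bio(incoming, outgoing, server_hostname=server_hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: waiting for the Server Hello
    return outgoing.read()


if __name__ == "__main__":
    print(b"discord.com" in first_flight("discord.com"))
    print(b"discord.com" in first_flight(None))
```

Whether the connection then succeeds depends on the server: without SNI it has to pick a default certificate, and some servers reject such handshakes.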
The killer move, tcpioneer#
It modifies TCP packets in such a way that the GFW cannot detect them. Wireshark no longer shows the Client Hello message either, yet the connection is still established: the server replies with a Server Hello.