X-Forwarded-For is an HTTP extension header whose main purpose is to let Web servers obtain the visitor's real IP address (which may or may not be genuine, as we'll see later).

Why can a Web server only get the real IP address through the X-Forwarded-For header? In PHP, developers who don't understand how this works use the $_SERVER['REMOTE_ADDR'] variable to get the client's IP address. This server variable holds the IP address of the peer that completed the TCP handshake with the Web server (it cannot be forged). But many users reach the server through proxies, so with this variable PHP gets the IP address of the proxy server, not of the user.

It may be confusing to many people, so let’s look at the possible path a request could take:

Client => (Forward proxy => Transparent proxy => Reverse proxy =>) Web server

The forward proxy, transparent proxy, and reverse proxy are all optional; any of them may be absent from a given request path.

  • Forward proxy: many enterprises set up a proxy on their egress gateway (mainly to speed up access and save bandwidth).
  • Transparent proxy: this may be a proxy the user sets up themselves (for example, to get around a firewall or to bypass the company's forward proxy).
  • Reverse proxy: deployed in front of the Web server for load balancing and security.

Now consider several scenarios:

  • If the client connects directly to the Web server (assuming the Web server has a public address), $_SERVER['REMOTE_ADDR'] holds the client's real IP address.
  • If a reverse proxy (such as Nginx) is deployed in front of the Web server, $_SERVER['REMOTE_ADDR'] holds the IP address of the reverse proxy (Nginx).
  • If the client reaches the Web server through a forward proxy (again assuming the Web server has a public address), $_SERVER['REMOTE_ADDR'] holds the IP address of the forward proxy.

In every case, $_SERVER['REMOTE_ADDR'] is the IP address of the peer on the TCP connection to the Web server. It cannot be forged, and the Web server does not modify it.

X-Forwarded-For

PHP exposes the X-Forwarded-For header as $_SERVER['HTTP_X_FORWARDED_FOR']. The header was first introduced by Squid (probably one of the earliest proxy servers).

The format of this protocol header is:

X-Forwarded-For: client, proxy1, proxy2

Here client is the user's real IP address. Each proxy the request passes through appends the address of the peer it received the request from. Note that the last proxy does not add its own IP address to X-Forwarded-For; its address is what the Web server sees in $_SERVER['REMOTE_ADDR'].

For example, if a request from user A passes through two proxy servers (B and C) before reaching the Web server, the X-Forwarded-For header that the Web server receives is "A, B", and REMOTE_ADDR is the address of C.
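The way the header grows hop by hop can be sketched in PHP. This is only a minimal simulation, not code from any real proxy: the helper name forwardThroughProxy and the addresses (taken from the documentation ranges) are assumptions made for illustration.

```php
<?php
// Hypothetical helper: models what a well-behaved proxy does to a request.
// $request holds 'remote_addr' (the TCP peer as seen by the receiver)
// and 'xff' (the X-Forwarded-For value, or '' when the header is absent).
function forwardThroughProxy(array $request, string $proxyIp): array
{
    // Append the address of the peer the proxy received the request from.
    $xff = $request['xff'] === ''
        ? $request['remote_addr']
        : $request['xff'] . ', ' . $request['remote_addr'];

    // The next hop sees the proxy itself as the TCP peer.
    return ['remote_addr' => $proxyIp, 'xff' => $xff];
}

// User A -> proxy B -> proxy C -> Web server.
$fromA  = ['remote_addr' => '203.0.113.7', 'xff' => ''];   // A connects directly to B
$afterB = forwardThroughProxy($fromA, '198.51.100.10');     // what C receives from B
$afterC = forwardThroughProxy($afterB, '192.0.2.20');       // what the Web server receives from C

echo 'X-Forwarded-For: ', $afterC['xff'], PHP_EOL;      // the addresses of A and B
echo 'REMOTE_ADDR: ', $afterC['remote_addr'], PHP_EOL;  // the address of C
```

The address of the last proxy (C) never appears in the header; it is only visible as the TCP peer.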

So how does PHP get the real client IP?

// Prefer X-Forwarded-For; fall back to the TCP peer address.
$ip = isset($_SERVER['HTTP_X_FORWARDED_FOR']) ? trim($_SERVER['HTTP_X_FORWARDED_FOR']) : '';
if (!$ip) {
    $ip = isset($_SERVER['REMOTE_ADDR']) ? trim($_SERVER['REMOTE_ADDR']) : '';
}
// The first entry of the comma-separated list is the client.
$a = explode('|', str_replace(',', '|', $ip));
$ip = trim($a[0]);

To be clear, this assumes that the proxy servers on the path are well behaved and do not forge HTTP_X_FORWARDED_FOR.
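The same logic can be wrapped in a function that takes the server array as a parameter, which makes it easier to try out with sample values. This is a minimal sketch: the name firstForwardedIp is an assumption, and the added filter_var check only rejects values that are not syntactically valid IP addresses; it does not prove that the value is genuine.

```php
<?php
// Hypothetical helper: returns the first X-Forwarded-For entry when it looks
// like an IP address, otherwise falls back to REMOTE_ADDR.
function firstForwardedIp(array $server): string
{
    $xff = isset($server['HTTP_X_FORWARDED_FOR']) ? trim($server['HTTP_X_FORWARDED_FOR']) : '';
    if ($xff !== '') {
        $parts = explode(',', $xff);
        $candidate = trim($parts[0]);
        // Syntax check only: a forged but well-formed address still passes.
        if (filter_var($candidate, FILTER_VALIDATE_IP) !== false) {
            return $candidate;
        }
    }
    return isset($server['REMOTE_ADDR']) ? trim($server['REMOTE_ADDR']) : '';
}

// Sample values standing in for $_SERVER.
echo firstForwardedIp([
    'HTTP_X_FORWARDED_FOR' => '203.0.113.7, 198.51.100.10',
    'REMOTE_ADDR'          => '192.0.2.20',
]), PHP_EOL;
```

In a real request the function would be called as firstForwardedIp($_SERVER).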

Configuring a Reverse Proxy

With all this talk of proxies, you may wonder what they are for. Different types of proxy serve different purposes: a forward proxy speeds up access and gives LAN users a real (public) IP address; a transparent proxy mostly serves other purposes (for example, keeping others from learning your IP address); and a reverse proxy mainly provides security and load balancing inside the enterprise. This section describes how to configure a reverse proxy.

Nowadays, any website of a certain size (more than one Web server) deploys a reverse proxy such as HAProxy, Nginx, or Apache in front of its Web servers for security and load balancing.

Here we deploy the reverse proxy using Nginx:

proxy_set_header Host $http_host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;

Here’s a quick explanation:

  • X-Forwarded-For: Nginx passes on the header it received, with $remote_addr (the address of the peer connected to Nginx) appended. (If the header is not set here, the Web server never sees it.)
  • X-Real-IP: this is an internal header (a convention between the reverse proxy and the Web server) that carries the IP address of the peer connected to the reverse proxy (this address cannot be forged). To keep the PHP code unambiguous, though, I think it should not be set this way but as: proxy_set_header REMOTE_ADDR $remote_addr;
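On the PHP side, X-Real-IP arrives as $_SERVER['HTTP_X_REAL_IP'], and it is only meaningful when the request really came from the reverse proxy. The following is a minimal sketch of that check, assuming the application knows the addresses of its own reverse proxies; the function name clientIpBehindProxy and the sample addresses are hypothetical.

```php
<?php
// Hypothetical helper: trust X-Real-IP only when the TCP peer is one of
// our own reverse proxies; otherwise use the TCP peer address itself.
function clientIpBehindProxy(array $server, array $trustedProxies): string
{
    $peer   = $server['REMOTE_ADDR'] ?? '';
    $realIp = trim($server['HTTP_X_REAL_IP'] ?? '');

    if (in_array($peer, $trustedProxies, true)
        && filter_var($realIp, FILTER_VALIDATE_IP) !== false) {
        return $realIp;
    }
    // Direct connection (or malformed header): ignore whatever the client sent.
    return $peer;
}

$trusted = ['192.0.2.20']; // assumed address of the Nginx reverse proxy

// Request relayed by the reverse proxy.
echo clientIpBehindProxy(
    ['REMOTE_ADDR' => '192.0.2.20', 'HTTP_X_REAL_IP' => '203.0.113.7'],
    $trusted
), PHP_EOL;

// Request that bypassed the proxy and supplied its own X-Real-IP.
echo clientIpBehindProxy(
    ['REMOTE_ADDR' => '203.0.113.99', 'HTTP_X_REAL_IP' => '198.51.100.1'],
    $trusted
), PHP_EOL;
```

The second call returns the peer address, because a header sent by an untrusted peer carries no weight.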

How the Apache Web server access log records the X-Forwarded-For header

If the reverse proxy forwards X-Forwarded-For to Apache, the access log can record it by adding %{X-Forwarded-For}i to the log format:

LogFormat "%{X-Forwarded-For}i %a %h %A %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

X-Forwarded-For security

Once X-Forwarded-For is set up, the application can obtain the user's real IP address. For a Web server there are two dimensions to consider: the first is REMOTE_ADDR, which cannot be forged; the second is X-Forwarded-For, which can be.

So who might forge it? Let's look at each party in turn:

Forward proxy: typically used by companies for acceleration. Unless there is a specific reason, it should not add the X-Forwarded-For header, because the hop upstream of it has an internal IP address, which should not be exposed. It may, of course, pass through the value of a header it receives (a value the user can forge).

Transparent proxy: this may be set up by the user (to get around a firewall, for example), and a single request may pass through several transparent proxies. To behave as correctly as possible, a transparent proxy will also pass the header value through (again, a value the user can forge). Of course, some dishonest businesses or individuals will deliberately change the header value for their own purposes (for example, to IP addresses from around the world).

Reverse proxy: the reverse proxy in front of the Web server does not forge the header (it belongs to the same company) and usually passes its value on as is.
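A short PHP sketch shows how little effort a forgery takes. It assumes a reverse proxy that appends the peer address to whatever X-Forwarded-For value it receives; the helper name appendPeer and the addresses are made up for illustration.

```php
<?php
// Hypothetical helper: the header value a reverse proxy sends on to the
// Web server when it appends the address of the connecting peer.
function appendPeer(string $incomingXff, string $peerIp): string
{
    return $incomingXff === '' ? $peerIp : $incomingXff . ', ' . $peerIp;
}

// The client's real address is 203.0.113.7, but it sends a forged header.
$forgedByClient  = '198.51.100.99';
$seenByWebServer = appendPeer($forgedByClient, '203.0.113.7');

$entries = array_map('trim', explode(',', $seenByWebServer));
$first = $entries[0];                   // supplied by the client
$last  = $entries[count($entries) - 1]; // written by the reverse proxy

echo "first entry: $first", PHP_EOL;
echo "last entry:  $last", PHP_EOL;
```

Code that takes the first entry reports the forged address. Only the entry written by the site's own reverse proxy reflects an address that was actually observed on a TCP connection.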

So what should the application do, given that this value cannot be fully trusted? It depends on the nature of the application:

If the service is not sensitive and does not need to know the user's real IP address, it is recommended that the application or the Web server apply restrictions based on REMOTE_ADDR, such as rate limiting, optionally exempting a whitelist of known proxy IP addresses (although which proxies deserve whitelisting is hard to judge).
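A limit keyed on REMOTE_ADDR with a proxy whitelist might look like the sketch below. It is deliberately simplified and not a production limiter: the class name RemoteAddrLimiter is hypothetical, counters live in memory for a single window, and resetting the window or sharing counters between processes is left out.

```php
<?php
// Hypothetical in-memory limiter keyed on the TCP peer address.
final class RemoteAddrLimiter
{
    /** @var array<string, int> requests counted per address in the current window */
    private $hits = [];
    /** @var int maximum requests allowed per address */
    private $limit;
    /** @var string[] proxy addresses exempt from the limit */
    private $whitelist;

    public function __construct(int $limit, array $whitelist = [])
    {
        $this->limit = $limit;
        $this->whitelist = $whitelist;
    }

    public function allow(string $remoteAddr): bool
    {
        // Whitelisted proxies relay many users, so they are not limited here.
        if (in_array($remoteAddr, $this->whitelist, true)) {
            return true;
        }
        $count = ($this->hits[$remoteAddr] ?? 0) + 1;
        $this->hits[$remoteAddr] = $count;
        return $count <= $this->limit;
    }
}

// Allow two requests per address; 192.0.2.20 is an assumed trusted proxy.
$limiter = new RemoteAddrLimiter(2, ['192.0.2.20']);
foreach (['203.0.113.7', '203.0.113.7', '203.0.113.7', '192.0.2.20'] as $addr) {
    echo $addr, ' => ', $limiter->allow($addr) ? 'allowed' : 'blocked', PHP_EOL;
}
```

Because the key is REMOTE_ADDR, a client cannot escape the limit by sending a different X-Forwarded-For value on each request.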

If the service is important, such as a lottery in which each IP address may enter only once, you may want to obtain the user's real IP address from X-Forwarded-For (relying on REMOTE_ADDR would wrongly block everyone who shares the same proxy). But because X-Forwarded-For may be forged, there is no really good solution; the problem can only be handled at the application layer.