I have multiple web servers behind one IP address and use port forwarding to give each of them external access. The main web server is externally accessible on port 80, but the others require a port number to be typed into the URL. This is annoying to say the least, and potentially problematic if a visitor's network only allows outbound connections to port 80.
The solution is to use Squid in reverse proxy (also called web/HTTP accelerator) mode. This turns Squid on its head: it serves requests from the outside and fetches content from internal servers. In effect, it listens on external port 80 and connects to the correct internal web server depending on the Host header in the HTTP request.
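To make the routing idea concrete, here is a minimal Python sketch of what "pick the internal server from the Host header" amounts to. It is only an illustration, not how Squid is implemented; the hostnames, IP addresses and the `pick_backend` name are all made up for the example.

```python
# Hypothetical mapping of Host header values to internal web servers.
BACKENDS = {
    "www.example.com": ("192.168.0.10", 80),
    "wiki.example.com": ("192.168.0.11", 80),
}


def pick_backend(headers):
    """Return the (ip, port) of the internal server for a request's Host header."""
    host = headers.get("Host", "")
    # Drop an optional ":port" suffix and normalise case before the lookup.
    hostname = host.split(":", 1)[0].strip().lower()
    # None means no internal server is configured for this Host.
    return BACKENDS.get(hostname)


print(pick_backend({"Host": "wiki.example.com:80"}))
print(pick_backend({"Host": "unknown.example.org"}))
```

Every external request arrives on the same port, so the Host header is the only thing that distinguishes one site from another.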
Depending on your requirements there are a few issues to take into account. In my particular case, I did not want Squid to perform any caching – it is purely for routing external requests. The other major requirement is that logging of external IP addresses should still work in each web server’s access log. This is easily achieved with a bit of additional configuration.
Your complete squid.conf should be made up of the following:
http_port 8080 vhost
I port forward external port 80 to internal port 8080, as there is already a web server running on port 80 on the same machine (it is generally not a good idea to have Squid and Apache on the same machine, but I’m not caching to disk and my site isn’t that popular). An additional ‘defaultsite=<host>’ option can be added if you want to handle requests that do not specify a particular Host.
cache_peer <Web server IP> parent 80 0 no-query originserver no-digest login=PASS name=<Any name for later reference>
‘no-query’ turns off ICP requests, ‘originserver’ tells Squid this is a web server, ‘no-digest’ turns off Squid’s annoying requests to ‘squid-internal-periodic’ on each web server (which would never be fulfilled), ‘login=PASS’ allows basic authentication to be passed to the web servers (it tells Squid to trust them, and not strip the Authorization or WWW-Authenticate headers from requests), and ‘name’ is just a name we use later (I use ‘<hostname>_<server type>’).
Then for each Host/domain you wish to forward:
acl <acl name> dstdomain <Host/domain>
http_access allow <acl name>
cache_peer_access <cache_peer name> allow <acl name>
[Remaining acl's, if any.]
cache_peer_access <cache_peer name> deny all
A nice addition for testing is:
acl LocalWWW dstdomain <internal test hostname>
acl localnet src 192.168.0.0/16 # RFC1918 possible internal network
http_access allow LocalWWW localnet
[Place amongst cache_peer_access lines for default web server, before 'deny all' line.]
cache_peer_access <cache_peer name of default server> allow LocalWWW
This allows you to test Squid on your internal network by requesting a page from an internal hostname (e.g. if you run an internal DNS server). This is obviously inaccessible from the outside world.
To tighten things up, also add:
acl localhost src 127.0.0.1/32
acl manager proto cache_object
acl Safe_ports port 80 # http
acl CONNECT method CONNECT
http_access allow manager localhost
http_access deny manager
http_access deny !Safe_ports
http_access deny CONNECT
http_access deny all
Make sure the last line is the last of all the ‘http_access’ lines!
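The reason the order matters is that ‘http_access’ lines are checked from top to bottom and the first line whose ACLs all match decides the outcome. The following Python sketch models just that behaviour. The `evaluate` function, the ACL helpers and the hostname are hypothetical names for illustration, and the fallback when nothing matches is simplified compared with Squid's real rule.

```python
def evaluate(rules, request):
    """Return the action of the first rule whose ACLs all match the request."""
    for action, acls in rules:
        # ACLs listed on one line are ANDed together.
        if all(acl(request) for acl in acls):
            return action
    # Simplification: this sketch denies when no rule matches.
    return "deny"


def is_site(request):
    """Stand-in for a dstdomain ACL on a hypothetical hostname."""
    return request["host"] == "www.example.com"


def everyone(request):
    """Stand-in for the built-in 'all' ACL."""
    return True


correct = [("allow", [is_site]), ("deny", [everyone])]
misplaced = [("deny", [everyone]), ("allow", [is_site])]

request = {"host": "www.example.com"}
print(evaluate(correct, request))    # allow
print(evaluate(misplaced, request))  # deny
```

With ‘deny all’ anywhere but last, every line after it is unreachable, which is why a misplaced ‘deny all’ silently blocks sites you thought you had allowed.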
This is where the tricks begin:
X-Forwarded-For is the de facto HTTP request Header for notifying upstream servers that the request has been forwarded by a proxy server. This is the key to enabling logging of external IP addresses by the web server: instead of logging the IP address of the client connection (which will always be that of Squid’s machine), it will use the value in this header. This can be easily done in Apache by:
LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined_forward_for
There is one problem: if the request has already passed through another proxy that added X-Forwarded-For to the request headers before it arrived at this Squid server, this Squid instance will append the address it received the request from to the existing header (as per the spec), which results in a list of IP addresses. If the above ‘LogFormat’ is used as-is, such log entries will be invalid (causing log analysers to reject those lines). This is not what we want. To work around this, add the following to squid.conf:
header_replace X-Forwarded-For
This instructs Squid to replace an incoming request’s X-Forwarded-For header with nothing (i.e. remove it). Luckily, this is done before Squid adds its own entry to X-Forwarded-For, which means there will always be only one IP address in the header when the request is forwarded on.
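As a rough illustration of the difference, this Python sketch models a proxy that appends the connecting address to X-Forwarded-For, with and without discarding the incoming header first. It is a simplified model rather than Squid's actual code; the `forward` function and the documentation-range IP addresses are assumptions made for the example.

```python
def forward(headers, client_ip, strip_incoming=False):
    """Return the headers a proxy would send on for a request from client_ip."""
    out = dict(headers)
    if strip_incoming:
        # Discard whatever an earlier proxy put in the header.
        out.pop("X-Forwarded-For", None)
    existing = out.get("X-Forwarded-For")
    # Append to an existing list, or start a new one with the connecting address.
    out["X-Forwarded-For"] = f"{existing}, {client_ip}" if existing else client_ip
    return out


# A request that already passed through another proxy before reaching us.
incoming = {"Host": "www.example.com", "X-Forwarded-For": "198.51.100.4"}

print(forward(incoming, "203.0.113.7")["X-Forwarded-For"])
print(forward(incoming, "203.0.113.7", strip_incoming=True)["X-Forwarded-For"])
```

Note the trade-off: with the incoming header stripped, the address that ends up in the log is that of whatever connected to Squid, which for such requests is the upstream proxy rather than the original client.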
(If you wish to skip the remainder of the Squid configuration tips, jump ahead to the Apache section further down, which covers the rest of the logging setup.)
Next, you probably don’t want Via, X-Cache and X-Squid headers to be sent back to clients, as you would like Squid’s presence to be (mostly) invisible to the outside world. You can add the following to instruct Squid not to add such headers when replying to external requests (they are still added for internal network requests, which can help with debugging):
via off
reply_header_access X-Cache-Lookup deny !localnet
reply_header_access X-Squid-Error deny !localnet
reply_header_access X-Cache deny !localnet
To avoid caching anything (just act as a request router):
cache deny all
To conserve memory:
memory_pools off
If you are already running squid on the same machine in normal mode (i.e. serving internal clients), then specify a different SNMP port (if you’re using SNMP on the original instance):
acl snmppublic snmp_community public
snmp_access allow snmppublic localnet
snmp_access deny all
snmp_port 3402 # +1 from default
If you use ICP and have an existing instance, set a different port:
icp_port 3131 # +1 from default
To speed up restarts while testing:
shutdown_lifetime 0 seconds
Don’t forget your log:
access_log /var/log/squid3/access.log squid
And finally:
cache_mgr <spam-safe email address, or something made-up if you don't wish to be contacted>
# 'all' applies the password to every cache manager action
cachemgr_passwd <password> all
That’s it for Squid. Now to Apache:
In addition to the custom ‘LogFormat’ specified above, you can add this to a new file in the ‘conf.d’ directory:
SetEnvIf Remote_Addr "^127\.0\.0\.1$" local_network
SetEnvIf Remote_Addr "^192\.168\." local_network
SetEnvIf X-Forwarded-For "^$" normal_request
SetEnvIf X-Forwarded-For ".+" via_accel
Then in each virtual host (‘VirtualHost’) add:
RewriteEngine On
RewriteCond %{ENV:local_network} =1
RewriteCond %{ENV:via_accel} !=1
RewriteRule .* - [env=local_request:1]
These directives will set the ‘local_request’ variable whenever a direct request is made to the webserver on the internal network. This is good for restricting directory/page access to internal clients, but not requests forwarded through Squid (which would appear as a local network client if only using IP address restriction). An example is:
<Location /blah>
    Order Allow,Deny
    Allow from env=local_request
</Location>
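If the interplay of the four environment variables is hard to follow, this Python sketch reproduces the same decisions for a given client address and X-Forwarded-For value. It is only a model of the ‘SetEnvIf’ and ‘RewriteCond’ logic above, assuming a missing header is treated as an empty string; the `classify` function and the sample addresses are hypothetical.

```python
import re


def classify(remote_addr, x_forwarded_for=""):
    """Mirror the SetEnvIf/RewriteCond logic for one request."""
    # SetEnvIf Remote_Addr ... local_network (either pattern sets it)
    local_network = bool(
        re.search(r"^127\.0\.0\.1$", remote_addr)
        or re.search(r"^192\.168\.", remote_addr)
    )
    # SetEnvIf X-Forwarded-For "^$" normal_request
    normal_request = bool(re.search(r"^$", x_forwarded_for))
    # SetEnvIf X-Forwarded-For ".+" via_accel
    via_accel = bool(re.search(r".+", x_forwarded_for))
    # RewriteCond local_network =1 AND via_accel !=1 -> local_request
    local_request = local_network and not via_accel
    return {
        "local_network": local_network,
        "normal_request": normal_request,
        "via_accel": via_accel,
        "local_request": local_request,
    }


# Direct request from a machine on the internal network.
print(classify("192.168.0.5"))
# External request forwarded by Squid running at 192.168.0.2.
print(classify("192.168.0.2", "203.0.113.7"))
```

The second case is the one that matters: the connection comes from an internal address, but because the header is present the request is not treated as local.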
Now the grand finale: logging. Just use the following:
CustomLog /var/log/apache2/access.log combined_forward_for env=via_accel
CustomLog /var/log/apache2/access.log combined env=normal_request
The same file can be used over multiple ‘CustomLog’ lines, and the appropriate line is chosen depending on whether the request is direct or forwarded through Squid. Either way, the origin IP address is logged.
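In other words, the first field of each log line comes from one of two places. A tiny Python sketch of that choice, assuming the standard ‘combined’ format logs the connecting address and that a missing header counts as empty (the `logged_client` name and addresses are made up for the example):

```python
def logged_client(remote_addr, x_forwarded_for=""):
    """Return the value written as the first field of the access log line."""
    if x_forwarded_for:
        # via_accel is set: combined_forward_for logs the header value.
        return x_forwarded_for
    # normal_request is set: combined logs the connecting address.
    return remote_addr


print(logged_client("192.168.0.2", "203.0.113.7"))  # forwarded through Squid
print(logged_client("192.168.0.5"))                 # direct internal request
```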
If you use IIS, then this ISAPI DLL is for you. It will detect the X-Forwarded-For header and change the request address server variable to the X-Forwarded-For value (I do not know how it handles the IP address list problem). Remember to have it execute first in the filter list.
A good place to visit when getting started with Squid configuration is their wiki. SNMP information can be found here.
This is a good page on understanding the refresh_pattern directive.