Table of Contents
- General
- Features
- Configuration
- VCL
- How do I…
- How can I force a refresh on an object cached by Varnish?
- How can I debug the requests of a single client?
- How can I rewrite URLs before they are sent to the backend?
- I have a site with many hostnames, how do I keep them from multiplying the …
- How can I log the client IP address on the backend?
- How do I add an HTTP header?
- How do I alter the request going to the backend?
- How do I force the backend to send Vary headers?
- How can I customize the error messages that Varnish returns?
- How do I instruct varnish to ignore the query parameters and only cache …
- Troubleshooting
Frequently asked questions about Varnish
General
What does Varnish mean?
Varnish means:
tr.v. var·nished, var·nish·ing, var·nish·es 1. To cover with varnish. 2. To give a smooth and glossy finish to. 3. To give a deceptively attractive appearance to; gloss over.
We use Varnish in the 3rd meaning of the word. Who are we trying to kid? We are using it to make a bad backend (origin server) look good.
Why bother with Varnish - why not use Squid?
Varnish was written from the ground up to be a high performance caching reverse proxy. Squid is a forward proxy that can be configured as a reverse proxy. Besides - Squid is rather old and designed like computer programs were supposed to be designed in 1980. Please see ArchitectNotes for details.
Does that mean I can't use Varnish as a forward proxy?
You can, but you probably don't want to. Doing it requires significant amounts of DNS magic and a huge Varnish VCL file.
Are there binary packages available for my flavour of Linux?
Probably. Varnish has been packaged for Debian, Ubuntu, RHEL, CentOS, (Open)SuSE and Gentoo.
Features
Can I see what Varnish holds in the cache?
No, unfortunately not. There are several reasons for this:
First, it would be a very dangerous command since it has the potential to generate a LOT of output if used carelessly.
Second, traversing the hash table in order to dump the contents would interfere somewhat with the delivery of contents.
Third, doing the obvious "show me what matches the regexp foo.[a-z][a-n][b-w]" would be horribly expensive CPU-wise.
Is there any way to do HTTPS with Varnish?
HTTPS traffic is encrypted, with individual encryption keys for each user. Thus, no HTTPS object will look the same to Varnish twice, and it will be impossible to cache. A possible workaround is to let pound or stunnel handle the HTTPS part and forward the decrypted traffic to Varnish.
Does Varnish support compression?
This is a simple question with a complicated answer; see FAQ/Compression.
Where can I find the log files?
Varnish does not log to a file, but to a shared memory log. Use the varnishlog utility to print the shared memory log, or varnishncsa to present it in the Apache / NCSA "combined" log format.
What is the purpose of the X-Varnish HTTP header?
The X-Varnish HTTP header allows you to find the correct log-entries for the transaction. For a cache hit, X-Varnish will contain both the ID of the current request and the ID of the request that populated the cache. It makes debugging Varnish a lot easier.
Why does the Via: header say "1.1" for Varnish 2.0 ?
The Via header is standardized in RFC 2616, section 14.45. The number it reports is the HTTP protocol version the client used, not the version of Varnish.
Configuration
How large a cache do I need?
You should have a cache at least the size of your working data set. However, there is no penalty for having a larger cache. If you have disk to spare, set the cache as large as you can.
How do I regulate how much memory Varnish will use for caching?
You don't. ;-) Your OS will regulate this automatically. Please see the ArchitectNotes for details.
Can Varnish run on its own server, or must it be on the web server?
varnishd doesn't care. It receives requests and answers them, using HTTP to contact the backend server when necessary. You can put it on the same machine as the web server or on a different machine, with any number of network interfaces you like.
Can I run multiple instances of Varnish on the same server?
Yes. Just remember to set the -n argument of varnishd to something different for each instance.
VCL
How do I load a VCL file while Varnish is running?
- Place the VCL file on the server.
- Telnet into the management port.
- Run "vcl.load <configname> <filename>" in the management interface. <configname> is whatever you would like to call your new configuration.
- Run "vcl.use <configname>" to start using your new configuration.
Does Varnish require the system to have a C compiler?
Yes. The VCL compiler generates C source as output and uses the system's C compiler to compile that into a shared library. If there is no C compiler, Varnish will not work.
... Isn't that a security problem?
The days when you could prevent people from running non-approved programs by removing the C compiler from your system ended roughly with the VAX 11/780 computer.
What is a VCL file?
VCL is an acronym for Varnish Configuration Language. In a VCL file, you configure how Varnish should behave. Sample VCL files will be included in this Wiki at a later stage.
Where is the documentation on VCL?
Please see "man 7 vcl".
Should I use pipe or pass in my VCL code? What is the difference?
When varnish does a pass it acts like a normal HTTP proxy. It reads the request and pushes it onto the backend. The next HTTP request can then be handled like any other.
pipe is only used when Varnish for some reason can't handle the pass. pipe reads the request, pushes it onto the backend, and from then on _only_ pushes bytes back and forth, with no other actions taken.
Since most HTTP clients send several requests over one connection, this might give you an undesirable result, as every subsequent request will reuse the existing pipe.
Varnish versions prior to 2.0 do not support handling a request body in pass mode, so in those releases pipe is required for correct handling.
In 2.0 and later, pass will handle the request body correctly.
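As a rough illustration of the distinction, the sketch below follows the pattern used in Varnish 2.0-era VCL: requests with methods Varnish does not recognise are piped, while recognised methods that should not be cached are passed. The exact method list is an assumption and should be adapted to your site.

```vcl
sub vcl_recv {
    if (req.request != "GET" &&
        req.request != "HEAD" &&
        req.request != "PUT" &&
        req.request != "POST" &&
        req.request != "TRACE" &&
        req.request != "OPTIONS" &&
        req.request != "DELETE") {
        # Unknown method (for example CONNECT): Varnish cannot treat
        # this as a pass, so it only shuffles bytes back and forth.
        pipe;
    }
    if (req.request != "GET" && req.request != "HEAD") {
        # Understood but not cacheable: act like a normal HTTP proxy.
        pass;
    }
    # No explicit action here, so the built-in vcl_recv logic runs next.
}
```

Because the subroutine ends without an action for GET and HEAD, those requests fall through to the default handling.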
Can Varnish do load balancing?
Yes, Varnish allows backends to be grouped in a director, which directs requests to its members in a pre-defined fashion. Here is an example of a round robin director:
director www-director round-robin {
    { .backend = www; }
    { .backend = { .host = "www2.example.com"; .port = "http"; } }
}
This director can now be used in the same way as a backend in vcl_recv :
sub vcl_recv {
    if (req.http.host ~ "^(www\.)?example\.com$") {
        set req.backend = www-director;
    }
}
How do I…
How can I force a refresh on an object cached by Varnish?
Refreshing is often called purging a document. You can purge in at least two different ways in Varnish:
1. From the command line you can write:
url.purge ^/$
to purge your / document. As you can see, url.purge takes a regular expression as its argument, hence the ^ and $ at the front and end. If the ^ is omitted, all the documents in the cache ending in a / would be deleted.
So to delete all the documents in the cache, write:
url.purge .*
at the command line.
2. HTTP PURGE
VCL code to allow HTTP PURGE is to be found here. Note that this method does not support wildcard purging.
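The VCL referred to above is not reproduced on this page. As a rough sketch only, HTTP PURGE handling in Varnish 2.0-style VCL might look like the following. The ACL name purge, the allowed address and the status messages are assumptions for illustration; restrict the ACL to the hosts that should be allowed to purge.

```vcl
# Hypothetical ACL: only these clients may send PURGE requests.
acl purge {
    "localhost";
}

sub vcl_recv {
    if (req.request == "PURGE") {
        if (!client.ip ~ purge) {
            error 405 "Not allowed.";
        }
        # Look the object up so vcl_hit or vcl_miss can react.
        lookup;
    }
}

sub vcl_hit {
    if (req.request == "PURGE") {
        # Expire the cached object immediately.
        set obj.ttl = 0s;
        error 200 "Purged.";
    }
}

sub vcl_miss {
    if (req.request == "PURGE") {
        error 404 "Not in cache.";
    }
}
```

Each PURGE request affects only the single URL it names, which is why this method cannot do wildcard purging.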
How can I debug the requests of a single client?
The "varnishlog" utility may produce a horrendous amount of output, so being able to debug only our own traffic can be useful.
The ReqStart token will include the client IP address. To see log entries matching this, type:
$ varnishlog -c -o ReqStart 192.0.2.123
To see the backend requests generated by a client IP address, we can match on the TxHeader token, since the IP address of the client is included in the X-Forwarded-For header in the request sent to the backend.
At the shell command line, type:
$ varnishlog -b -o TxHeader 192.0.2.123
How can I rewrite URLs before they are sent to the backend?
You can use the "regsub()" function to do this. Here's an example for Zope, to rewrite URLs for the VirtualHostMonster:
if (req.http.host ~ "^(www\.)?example\.com") {
    set req.url = regsub(req.url, "^", "/VirtualHostBase/http/example.com:80/Sites/example.com/VirtualHostRoot");
}
I have a site with many hostnames, how do I keep them from multiplying the cache?
You can do this by normalizing the "Host" header for all your hostnames. Here's a VCL example:
if (req.http.host ~ "^(www\.)?example\.com") {
    set req.http.host = "example.com";
}
How can I log the client IP address on the backend?
All I see is the IP address of the varnish server. How can I log the client IP address?
We will need to add the IP address to a header used for the backend request, and configure the backend to log the content of this header instead of the address of the connecting client (which is the varnish server).
Varnish configuration:
sub vcl_recv {
    # Add a unique header containing the client address
    remove req.http.X-Forwarded-For;
    set req.http.X-Forwarded-For = client.ip;
    # [...]
}
For the Apache configuration, we copy the "combined" log format to a new one we call "varnishcombined", for instance, and change the client IP field to use the content of the header we set in the Varnish configuration:
LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" varnishcombined
Then, in your virtual host, specify this format instead of "combined" (or "common", or whatever else you use):
<VirtualHost *:80>
    ServerName www.example.com
    # [...]
    CustomLog /var/log/apache2/www.example.com/access.log varnishcombined
    # [...]
</VirtualHost>
The mod_extract_forwarded Apache module might also be useful.
How do I add an HTTP header?
Unless the header says something about the client or the request, an HTTP header is best added in vcl_fetch, because it is then only processed when the object is fetched from the backend, not on every request:
sub vcl_fetch {
    # Add a unique header containing the cache server's IP address:
    remove obj.http.X-Varnish-IP;
    set obj.http.X-Varnish-IP = server.ip;
    # Another header:
    set obj.http.Foo = "bar";
}
How do I alter the request going to the backend?
You can use the bereq object to alter requests going to the backend, but from my experience you can only 'set' values on it, not read them. So, if you need to change the requested URL, this doesn't work:
sub vcl_miss {
    set bereq.url = regsub(bereq.url, "stream/", "/");
    fetch;
}
This fails because you cannot read from bereq.url in the value part of the assignment. You will get:
mgt_run_cc(): failed to load compiled VCL program: ./vcl.1P9zoqAU.o: undefined symbol: VRT_r_bereq_url
VCL compilation failed
Instead, you have to read from req.url:
sub vcl_miss {
    set bereq.url = regsub(req.url, "stream/", "/");
    fetch;
}
How do I force the backend to send Vary headers?
We have anecdotal evidence of non-RFC2616 compliant backends which support content negotiation, but which do not emit a Vary header unless the request contains Accept headers.
It may be appropriate to send no-op Accept headers to trick the backend into sending us the Vary header.
The following should be sufficient for most cases:
Accept: */*
Accept-Language: *
Accept-Charset: *
Accept-Encoding: identity
Note that Accept-Encoding cannot be set to *, as the backend might then send back a compressed response which the client would be unable to process.
This can of course be implemented in VCL.
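A minimal sketch of such a VCL fragment, in the same syntax as the other examples on this page, might look like this. It assumes you only want to add each header when the client did not send one; whether that is the right policy for your backend is a judgment call.

```vcl
sub vcl_recv {
    # Supply no-op Accept headers only where the client sent none,
    # so that the backend is prompted to emit a Vary header.
    if (!req.http.Accept) {
        set req.http.Accept = "*/*";
    }
    if (!req.http.Accept-Language) {
        set req.http.Accept-Language = "*";
    }
    if (!req.http.Accept-Charset) {
        set req.http.Accept-Charset = "*";
    }
    if (!req.http.Accept-Encoding) {
        # Not "*": the client might be unable to process a compressed response.
        set req.http.Accept-Encoding = "identity";
    }
}
```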
How can I customize the error messages that Varnish returns?
A custom error page can be generated by adding a vcl_error subroutine to your configuration file. The default error page looks like this:
sub vcl_error {
    set obj.http.Content-Type = "text/html; charset=utf-8";
    synthetic {"
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>"} obj.status " " obj.response {"</title>
</head>
<body>
<h1>Error "} obj.status " " obj.response {"</h1>
<p>"} obj.response {"</p>
<h3>Guru Meditation:</h3>
<p>XID: "} req.xid {"</p>
<address><a href="http://www.varnish-cache.org/">Varnish</a></address>
</body>
</html>
"};
    deliver;
}
How do I instruct varnish to ignore the query parameters and only cache one instance of an object?
This can be achieved by removing the query parameters using a regexp:
sub vcl_recv {
    set req.url = regsub(req.url, "\?.*", "");
}
Troubleshooting
Why does it look like Varnish sends all requests to the backend? I thought it was a cache?
Most likely, the backend does not set an expiry time on the requested object, so Varnish uses the default TTL (normally 120s). Another possibility is that your site uses cookies; by default, Varnish will not serve requests that come with a cookie from its cache.
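If either of these turns out to be the cause, a sketch along the following lines may help. Both the list of file extensions and the 300s minimum TTL are assumptions chosen for illustration, and stripping cookies is only safe for content that really is identical for every user.

```vcl
sub vcl_recv {
    # Assumption: these file types never vary per user on this site,
    # so the cookie can be dropped and the request served from cache.
    if (req.url ~ "\.(png|gif|jpg|css|js)$") {
        remove req.http.Cookie;
    }
}

sub vcl_fetch {
    # Force a minimum TTL on fetched objects (hypothetical value).
    if (obj.ttl < 300s) {
        set obj.ttl = 300s;
    }
}
```

Note that the pattern in vcl_recv is anchored at the end of the URL, so it will not match requests that carry a query string.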
