How to cache things longer on Varnish than on the client
RFC2616 spends quite a lot of time explaining what the expiration rules are for normal client-side caches.
The explanation is not the best I have seen, and Varnish is not a client side cache anyway, so this is my attempt to set the record straight, or at least firmly crooked on the subject in a Varnish context.
At the sound of the tone…
In an ideal world, all computers would have clocks that show the correct time.
If they did, the Expires: header could be used to say when a given web-object should be thrown away, and my explanation would be done now.
Getting computer clocks in sync is a lot harder than it sounds and despite the valiant efforts of Prof. Dave Mills and his NTP gang, this is far from the situation on the internet.
The main obstacle is that people just do not care enough to do it, and the secondary obstacles are the complicated rules for timekeeping, which involve not only time zones and daylight saving time but also leap seconds.
Fortunately, once upon a time it was predicted that some basic web-clients would not have a clock at all, and therefore the RFC2616 standard offers a way to control lifetime in relative terms ("throw away after 600 seconds") instead of absolute terms ("throw away at 10:35:00 20-01-2008 UTC").
RFC2616 specifies an algorithm in section 13.2.4 which combines the absolute information and the relative information: a max-age directive, when present, overrides the Expires header, and Expires is only used when no max-age is given.
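As a rough illustration of that precedence, here is a minimal Python sketch of the freshness lifetime calculation from section 13.2.4. The function names, the lower-case header dict and the `shared` flag are assumptions made for this example, not anything Varnish exposes, and heuristic expiration is left out entirely.

```python
from email.utils import parsedate_to_datetime


def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        name, _, arg = part.partition("=")
        name = name.strip().lower()
        if name:
            directives[name] = arg.strip().strip('"')
    return directives


def freshness_lifetime(headers, shared=False):
    """Return the freshness lifetime in seconds, or None if none is stated.

    headers: dict keyed by lower-case header name (assumption of this sketch).
    shared:  True for a shared cache, where s-maxage overrides max-age.
    """
    cc = parse_cache_control(headers.get("cache-control", ""))
    if shared and cc.get("s-maxage", "").isdigit():
        return int(cc["s-maxage"])
    # max-age wins over Expires whenever it is present.
    if cc.get("max-age", "").isdigit():
        return int(cc["max-age"])
    # Otherwise fall back to the absolute deadline, relative to Date.
    if "expires" in headers and "date" in headers:
        expires = parsedate_to_datetime(headers["expires"])
        date = parsedate_to_datetime(headers["date"])
        return int((expires - date).total_seconds())
    return None
```

Note that the Expires case is measured against the server's own Date header rather than the local clock, which is what keeps a badly set client clock out of the calculation.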
The Varnish complication
Varnish does not fit the model in RFC2616 for the simple reason that Varnish is not a client side cache, but a part of the web-server.
Where a client cache must be defensive about everything, so as not to get in the way or change the semantics of the client/server relationship, Varnish is the server in that relationship and may be responsible for implementing content policies etc.
At the most basic level, how long Varnish and the client can cache a given object may differ.
A website may very likely want Varnish to cache an object forever, trusting the backend server to explicitly purge it, should it be updated.
But that does not mean that we want the clients to cache the object forever.
Because the backend's purge requests cannot reach the clients, it is necessary to have the clients check back after a reasonable amount of time, to see if the object has changed.
How it works
Varnish acts like an RFC2616 client side cache by default, with the footnote that if no cacheability information is available, we use a default Time To Live (TTL) from the parameter "default_ttl".
This means that Varnish will respect the s-maxage or max-age Cache-Control fields and will respect Expires headers.
Varnish leaves the Expires: and Cache-Control: headers intact and sets the Age: header to the number of seconds the object has been cached, so any RFC2616 client will do the right thing by default.
How it should work
It is very likely that you want to have Varnish cache objects longer than the clients do, and this is where RFC2616 comes up short: it offers no way to communicate the two different lifetimes from the backend.
The solution is to have the backend emit the objects with the desired headers for client use, and then set the obj.ttl in the VCL code to the longer duration.
But this is not quite enough to get the desired effect.
The Expires header from the backend must be removed, since it pertains only to the direct client connection case; it could in theory be replaced, by Varnish, with a new header.
According to RFC2616, just issuing a max-age to the client should be just as precise as generating an Expires header, and it has the advantage of not expecting the client's clock to be correct, so unless somebody informs me otherwise, my recommendation is to not bother with Expires.
Besides, we do not have a convenient way to generate this timestamp in Varnish presently.
The Age: header generated by Varnish must also be neutered, otherwise it would grow well beyond the max-age sent to the client.
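To make the Age: problem concrete, here is a small Python sketch of the freshness test an RFC2616 client applies. It is deliberately simplified: the current age is taken as the received Age: value plus the time the client has held the object, and the clock-correction terms of the RFC are ignored. The function name and its parameters are made up for this illustration.

```python
def client_is_fresh(max_age, age_header, seconds_held_by_client=0):
    """True if a client following RFC2616 would still use its cached copy.

    max_age:                 max-age value sent to the client, in seconds
    age_header:              Age: value received with the response, in seconds
    seconds_held_by_client:  time since the client received the response
    """
    current_age = age_header + seconds_held_by_client
    # RFC2616: a response is fresh while freshness_lifetime > current_age.
    return max_age > current_age


# Object has sat in Varnish for two days, client was told max-age=900:
stale_on_arrival = client_is_fresh(900, 2 * 24 * 3600)

# Same object, but with Age: reset to 0 on delivery:
fresh_on_arrival = client_is_fresh(900, 0)
```

With the untouched Age: header the object is already stale when it reaches the client, so the client would revalidate on every use; with Age: reset to 0 it gets the full 900 seconds.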
A solution in VCL could look like this:
    sub vcl_fetch {
        if (obj.cacheable) {
            /* Remove Expires from backend, it's not long enough */
            unset obj.http.expires;

            /* Set the client's TTL on this object */
            set obj.http.cache-control = "max-age=900";

            /* Set how long Varnish will keep it */
            set obj.ttl = 1w;

            /* Marker for vcl_deliver to reset Age: */
            set obj.http.magicmarker = "1";
        }
    }

    sub vcl_deliver {
        if (resp.http.magicmarker) {
            /* Remove the magic marker */
            unset resp.http.magicmarker;

            /* By definition we have a fresh object */
            set resp.http.age = "0";
        }
    }
