Grace in the scope of Varnish means delivering otherwise expired objects when circumstances call for it. This can happen because:

  • The selected backend director is down.
  • A different thread has already made a request to the backend that's not yet finished.

Both cases are handled the same way in VCL. However, you may want different grace periods for the two scenarios.

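sub vcl_recv {
  if (req.backend.healthy) {
    # Backend is up: tolerate only slightly stale objects.
    set req.grace = 30s;
  } else {
    # Backend is sick: accept much older objects.
    set req.grace = 1h;
  }
}

sub vcl_fetch {
  # Keep objects in cache for up to 1 hour past their TTL.
  set beresp.grace = 1h;
}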

This code snippet tells Varnish to keep objects in cache for 1 hour past their expiry time. The code in vcl_recv tells Varnish that if the backend is healthy, it should only accept objects that are up to 30 seconds past their expiry time, but if the backend is sick, it should accept objects that are up to an hour past their expiry time.

The effect is that as long as a backend is healthy, clients will be delivered content that is no more than 30 seconds past its TTL. After that, all clients will have to wait for the backend request to finish. There are two reasons you want a grace period instead of making every client wait:

  1. If an object gets 1200 requests/s and it takes 2 seconds to refresh it, you have 2400 clients waiting for the same object before the refresh has finished. That's not good.
  2. With those same 1200 requests/s and a 2-second refresh, your clients will collectively have waited 20+ minutes for this object.

Setting req.grace higher when the backend is sick is done to ensure delivery of old content when there is no way to get new content.

The value of beresp.grace is the maximum grace period for an object, which affects when it is removed from the cache. In other words: beresp.grace should be set to the maximum value you will ever want to set req.grace to.

Note: For grace to kick in when a backend is sick, BackendPolling must be enabled. Otherwise, the backend will always be considered healthy.
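
As a rough illustration of what enabling polling can look like, here is a minimal sketch of a backend definition with a health probe. The backend name, host, port, probe URL and all timing values are assumptions chosen for the example, not recommendations; adjust them to your own setup.

```vcl
# Hypothetical backend; host and port are placeholders.
backend default {
  .host = "127.0.0.1";
  .port = "8080";
  .probe = {
    .url = "/";        # assumed health-check URL
    .interval = 5s;    # poll the backend every 5 seconds
    .timeout = 1s;     # a poll taking longer than this counts as failed
    .window = 5;       # look at the last 5 polls...
    .threshold = 3;    # ...and require at least 3 good ones to be healthy
  }
}
```

With a probe like this in place, req.backend.healthy in vcl_recv reflects the poll results, so the longer req.grace applies once the backend has been marked sick.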