req.hash_always_miss doesn't work sometimes

Henry Qian Henry.Qian at datasphere.com
Fri Sep 7 02:59:05 CEST 2012


One additional information is that if I use browser to trigger the refresh http://riverdale.11alive.com/?refreshcache, and later use browser to visit http://riverdale.11alive.com, the newly refreshed content is always served.

The randomness of serving refreshed content occurs when I use a java program to programmatically trigger the refresh, and later use browser to visit the site.  In vcl_hash you can find only req.url, req.http.host, server.ip, X-UA-Family are used for hash. I can guarantee these 4 attributes are exactly same when I trigger the refresh with browser and java program. So the hashed object should be same. Could Varnish 3.x change something and use some additional attributes not defined in vcl_hash to identify hash objects?

I tested on Varnish 3.02 + CentOS 6.2 with same VCL, same problem still exists.

While under old Varnish version 2.0.6, hash_always_miss was not available, so I used the following VCL to refresh the cache with the same java program, it always succeed with no issues.
sub vcl_recv {
set req.grace = 5m; if ((req.http.user-agent == "refresh")||(req.url ~ “\?refreshcache”)) {
set req.grace = 0s;
}
}
sub vcl_hit {
if (((req.http.user-agent == "refresh")||(req.url ~ “\?refreshcache”)) && req.restarts == 0) {
set obj.ttl = 0s; restart;
}
}
From: varnish-misc-bounces at varnish-cache.org [mailto:varnish-misc-bounces at varnish-cache.org] On Behalf Of Henry Qian
Sent: Thursday, September 06, 2012 5:13 PM
To: varnish-misc at varnish-cache.org
Subject: req.hash_always_miss doesn't work sometimes

Environment is Varnish 3.03 + CentOS 6.2.

I use req.hash_always_miss=true in vcl_recv to refresh cached pages. However I found the req.hash_always_miss=true doesn’t always works. Sometimes Varnish serves newly refreshed cache content, sometimes Varnish servers old cache content, even through the varnish log reveals the new cache content has been looked up and correctly delivered.

Here is the VCL.

sub vcl_recv {
    set req.backend=hyperlocal;

    if (req.backend.healthy) {
      set req.grace = 30s;
    } else {
      set req.grace = 24h;
    }

    if ((req.http.user-agent == "refresh") || (req.url ~ "\?refreshcache")) {
          set req.hash_always_miss = true;
    }

    if (req.restarts == 0) {
      if (req.http.x-forwarded-for) {
          set req.http.X-Forwarded-For =
            req.http.X-Forwarded-For + ", " + client.ip;
      } else {
          set req.http.X-Forwarded-For = client.ip;
      }
    }
    if (req.request != "GET" &&
      req.request != "HEAD" &&
      req.request != "PUT" &&
      req.request != "POST" &&
      req.request != "TRACE" &&
      req.request != "OPTIONS" &&
      req.request != "DELETE") {
        /* Non-RFC2616 or CONNECT which is weird. */
        return (pipe);
    }
    if (req.request != "GET" && req.request != "HEAD") {
        /* We only deal with GET and HEAD by default */
        return (pass);
}

    return (lookup);
}

# sub vcl_pipe {
#     # Note that only the first request to the backend will have
#     # X-Forwarded-For set.  If you use X-Forwarded-For and want to
#     # have it set for all requests, make sure to have:
#     # set bereq.http.connection = "close";
#     # here.  It is not set by default as it might break some broken web
#     # applications, like IIS with NTLM authentication.
#     return (pipe);
# }
#
# sub vcl_pass {
#     return (pass);
# }

sub vcl_hash {
    hash_data(req.url);
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }

    # Add something to differential mobile, tablet, desktop visit.
    if (req.http.X-UA-Family) {
        hash_data(req.http.X-UA-Family);
    }

    return (hash);
}

#sub vcl_hit {
#    return (deliver);
#}

# sub vcl_miss {
#     return (fetch);
# }

sub vcl_fetch {
    # Grace to allow varnish to serve content if backend is lagged
set beresp.grace = 24h;

if (req.url == "/") {
      set beresp.ttl = 720m;        // 12 hours
    } else {
      set beresp.ttl = 14400m;               // 10 days
    }

    if (req.http.Cookie ~ "(VARNISH|DRUPAL_UID=[^-])") {
      set beresp.http.cache-control = "no-cache";
      set beresp.http.X-Cacheable = "NO";
      set beresp.http.ETag = "ADMIN";
    } else {
      set beresp.http.X-Cacheable = "YES";
    }

    if (beresp.ttl <= 0s ||
        beresp.http.Set-Cookie ||
        beresp.http.Vary == "*") {
            /*
            * Mark as "Hit-For-Pass" for the next 2 minutes
            */
            set beresp.ttl = 120 s;
            return (hit_for_pass);
    }

    return (deliver);
}

sub vcl_deliver {
   set resp.http.age = "0";

   #add cache hit data
   if (obj.hits > 0) {
     #if hit add hit count
     set resp.http.X-Cache = "HIT";
     set resp.http.X-Cache-Hits = obj.hits;
   } else {
     set resp.http.X-Cache = "MISS";
   }

    return (deliver);
}


The below Varnishlog reveals the content of http://riverdale.11alive.com/ has been refreshed, triggered by a special request http://riverdale.11alive.com/?refreshcache. The log reveals the refresh is successful.

   60 SessionOpen  c 127.0.0.1 56116 :6081
   60 ReqStart     c 127.0.0.1 56116 1570899703
   60 RxRequest    c GET
   60 RxURL        c /?refreshcache
   60 RxProtocol   c HTTP/1.0
   60 RxHeader     c Host: riverdale.11alive.com
   60 RxHeader     c X-Real-IP: 10.112.0.221
   60 RxHeader     c X-Forwarded-For: 10.113.0.134, 10.112.0.221
   60 RxHeader     c X-Client-Verify: SUCCESS
   60 RxHeader     c X-UA-Family: desktop
   60 RxHeader     c Connection: close
   60 RxHeader     c User-Agent: Java/1.6.0_20
   60 RxHeader     c Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
   60 VCL_call     c recv lookup
   60 VCL_call     c hash
   60 Hash         c /
   60 Hash         c riverdale.11alive.com
   60 Hash         c desktop
   60 VCL_return   c hash
   60 VCL_call     c miss fetch
   60 Backend      c 61 hyperlocal hyperlocal
   60 TTL          c 1570899703 RFC 120 -1 -1 1346968996 0 1346968992 0 0
   60 VCL_call     c fetch
   60 TTL          c 1570899703 VCL 123 86400 -1 1346968993 -3
   60 TTL          c 1570899703 VCL 43203 86400 -1 1346968993 -3
   60 VCL_return   c deliver
   60 ObjProtocol  c HTTP/1.1
   60 ObjResponse  c OK
   60 ObjHeader    c Date: Thu, 06 Sep 2012 22:03:12 GMT
   60 ObjHeader    c Server: Apache/2.2.15 (CentOS)
   60 ObjHeader    c Vary: User-Agent,Accept-Encoding
   60 ObjHeader    c X-Powered-By: PHP/5.3.3
   60 ObjHeader    c X-Drupal-Cache: MISS
   60 ObjHeader    c Last-Modified: Thu, 06 Sep 2012 22:03:12 +0000
   60 ObjHeader    c Cache-Control: no-cache, must-revalidate, post-check=0, pre-check=0
   60 ObjHeader    c ETag: "1346968992"
   60 ObjHeader    c Content-Encoding: gzip
   60 ObjHeader    c Content-Type: text/html; charset=utf-8
   60 ObjHeader    c X-Cacheable: YES
   60 Gzip         c u F - 27026 155024 80 80 216139
   60 VCL_call     c deliver deliver
   60 TxProtocol   c HTTP/1.1
   60 TxStatus     c 200
   60 TxResponse   c OK
   60 TxHeader     c Server: Apache/2.2.15 (CentOS)
   60 TxHeader     c Vary: User-Agent,Accept-Encoding
   60 TxHeader     c X-Powered-By: PHP/5.3.3
   60 TxHeader     c X-Drupal-Cache: MISS
   60 TxHeader     c Last-Modified: Thu, 06 Sep 2012 22:03:12 +0000
   60 TxHeader     c Cache-Control: no-cache, must-revalidate, post-check=0, pre-check=0
   60 TxHeader     c ETag: "1346968992"
   60 TxHeader     c Content-Type: text/html; charset=utf-8
   60 TxHeader     c X-Cacheable: YES
   60 TxHeader     c Date: Thu, 06 Sep 2012 22:03:16 GMT
   60 TxHeader     c X-Varnish: 1570899703
   60 TxHeader     c Via: 1.1 varnish
   60 TxHeader     c Connection: close
   60 TxHeader     c age: 0
   60 TxHeader     c X-Cache: MISS
   60 Gzip         c U D - 27026 155024 80 80 216139
   60 Length       c 155024
   60 ReqEnd       c 1570899703 1346968992.932512045 1346968996.214715719 0.000031471 3.280443430 0.001760244

However later when I visited the http://riverdale.11alive.com/, I got old content with the following header information. From the header information below, you can find the content is last modified on Thu, 06 Sep 2012 21:31:16 +0000, instead of the Last-Modified: Thu, 06 Sep 2012 22:03:12 +0000 as indicated in the varnish log.

Response Headersview source
Age 0
Cache-Control no-cache, must-revalidate, post-check=0, pre-check=0
Connection close
Content-Encoding gzip
Content-Type text/html; charset=utf-8
Date Thu, 06 Sep 2012 22:10:57 GMT
Etag "1346967076"
Last-Modified Thu, 06 Sep 2012 21:31:16 +0000
Server nginx/1.0.14
Transfer-Encoding chunked
Vary Accept-Encoding, Accept-Encoding
Via 1.1 varnish
X-Cache HIT
X-Cache-Hits 4
X-Cacheable YES
X-Drupal-Cache MISS
X-Powered-By PHP/5.2.10
X-Varnish 1570906726 1570869110

Request Headersview source
Accept text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding gzip, deflate
Accept-Language en-us,en;q=0.5
Cache-Control no-cache
Connection keep-alive
Cookie s_cc=true; s_sq=%5B%5BB%5D%5D; hl-email-avoid=true; __utma=196336364.508874118.1346959523.1346959523.1346969008.2; __utmc=196336364; __utmz=196336364.1346959523.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); SiteLifeHost=l3vm105l3pluckcom; anonId=0227849c-521b-4054-a1cc-864ad8a54073; __utmb=196336364.34.0.1346969440935; BC_BANDWIDTH=1346969016992X1808
Host riverdale.11alive.com
Pragma no-cache
User-Agent Mozilla/5.0 (Windows NT 5.1; rv:15.0) Gecko/20100101 Firefox/15.0

Did I miss anything or Varnish req.hash_always_miss doesn't work as expected sometimes and thus is a bug?

Here I also post a good one, which is actually the request just before the bad one. You can tell it by checking the ObjHeader ETag. The good one is “1346968991”, while the bad one is “1346968992”.  Varnish as expected miss fetched the content from backend.

   79 SessionOpen  c 127.0.0.1 55822 :6081

   79 ReqStart     c 127.0.0.1 55822 1570899684

   79 RxRequest    c GET

   79 RxURL        c /?refreshcache

   79 RxProtocol   c HTTP/1.0

   79 RxHeader     c Host: northslocounty.kcoy.com

   79 RxHeader     c X-Real-IP: 10.112.0.221

   79 RxHeader     c X-Forwarded-For: 10.113.0.134, 10.112.0.221

   79 RxHeader     c X-Client-Verify: SUCCESS

   79 RxHeader     c X-UA-Family: desktop

   79 RxHeader     c Connection: close

   79 RxHeader     c User-Agent: Java/1.6.0_20

   79 RxHeader     c Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2

   79 VCL_call     c recv lookup

   79 VCL_call     c hash

   79 Hash         c /

   79 Hash         c northslocounty.kcoy.com

   79 Hash         c desktop

   79 VCL_return   c hash

   79 VCL_call     c miss fetch

   79 Backend      c 80 hyperlocal hyperlocal

   79 TTL          c 1570899684 RFC 120 -1 -1 1346968993 0 1346968991 0 0

   79 VCL_call     c fetch

   79 TTL          c 1570899684 VCL 122 86400 -1 1346968991 -2

   79 TTL          c 1570899684 VCL 43202 86400 -1 1346968991 -2

   79 VCL_return   c deliver

   79 ObjProtocol  c HTTP/1.1

   79 ObjResponse  c OK

   79 ObjHeader    c Date: Thu, 06 Sep 2012 22:03:11 GMT

   79 ObjHeader    c Server: Apache/2.2.3 (CentOS)

   79 ObjHeader    c X-Powered-By: PHP/5.2.10

   79 ObjHeader    c X-Drupal-Cache: MISS

   79 ObjHeader    c Last-Modified: Thu, 06 Sep 2012 22:03:11 +0000

   79 ObjHeader    c Cache-Control: no-cache, must-revalidate, post-check=0, pre-check=0

   79 ObjHeader    c ETag: "1346968991"

   79 ObjHeader    c Vary: Accept-Encoding

   79 ObjHeader    c Content-Encoding: gzip

   79 ObjHeader    c Content-Type: text/html; charset=utf-8

   79 ObjHeader    c X-Cacheable: YES

   79 Gzip         c u F - 19388 110022 80 80 155035

   79 VCL_call     c deliver deliver

   79 TxProtocol   c HTTP/1.1

   79 TxStatus     c 200

   79 TxResponse   c OK

   79 TxHeader     c Server: Apache/2.2.3 (CentOS)

   79 TxHeader     c X-Powered-By: PHP/5.2.10

   79 TxHeader     c X-Drupal-Cache: MISS

   79 TxHeader     c Last-Modified: Thu, 06 Sep 2012 22:03:11 +0000

   79 TxHeader     c Cache-Control: no-cache, must-revalidate, post-check=0, pre-check=0

   79 TxHeader     c ETag: "1346968991"

   79 TxHeader     c Vary: Accept-Encoding

   79 TxHeader     c Content-Type: text/html; charset=utf-8

   79 TxHeader     c X-Cacheable: YES

   79 TxHeader     c Date: Thu, 06 Sep 2012 22:03:12 GMT

   79 TxHeader     c X-Varnish: 1570899684

   79 TxHeader     c Via: 1.1 varnish

   79 TxHeader     c Connection: close

   79 TxHeader     c age: 0

   79 TxHeader     c X-Cache: MISS

   79 Gzip         c U D - 19388 110022 80 80 155035

   79 Length       c 110022

   79 ReqEnd       c 1570899684 1346968991.255424976 1346968992.866729498 0.000033379 1.610149622 0.001154900



And then later when I visit the site, http://northslocounty.kcoy.com, the latest cache content was served. Please see header information below.

Response Headersview source

Age     0

Cache-Control no-cache, must-revalidate, post-check=0, pre-check=0

Connection      close

Content-Encoding      gzip

Content-Type  text/html; charset=utf-8

Date    Thu, 06 Sep 2012 22:05:42 GMT

Etag     "1346968991"

Last-Modified Thu, 06 Sep 2012 22:03:11 +0000

Server  nginx/1.0.14

Transfer-Encoding      chunked

Vary    Accept-Encoding, Accept-Encoding

Via      1.1 varnish

X-Cache          HIT

X-Cache-Hits  2

X-Cacheable   YES

X-Drupal-Cache          MISS

X-Powered-By            PHP/5.2.10

X-Varnish       1570901954 1570899684



Request Headersview source

Accept text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8

Accept-Encoding        gzip, deflate

Accept-Language        en-us,en;q=0.5

Connection      keep-alive

Cookie hl-email-avoid=true

Host    northslocounty.kcoy.com

User-Agent     Mozilla/5.0 (Windows NT 5.1; rv:15.0) Gecko/20100101 Firefox/15.0
Thanks,
-Henry


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://www.varnish-cache.org/lists/pipermail/varnish-misc/attachments/20120907/fbfbbce2/attachment-0001.html>


More information about the varnish-misc mailing list