Varnish 503ing on ~1/100 POSTs

Ronan Mullally ronan at iol.ie
Tue Mar 8 21:38:08 CET 2011


I'm seeing intermittant 503s on POSTs to a fairly busy VBulletin website.
The current load is light (up to a couple of thousand active sessions,
peak is around five thousand).  Varnish has a fairly simple config with
a director consisting of two Apache backends:

 backend backend1 {
         .host = "1.2.3.4";
         .port = "80";
         .connect_timeout = 5s;
         .first_byte_timeout = 90s;
         .between_bytes_timeout = 90s;
         .probe = {
                 .timeout = 5s;
                 .interval = 5s;
                 .window = 5;
                 .threshold = 3;
                 .request =
                         "HEAD /favicon.ico HTTP/1.0"
                         "X-Forwarded-For: 1.2.3.4"
                         "Connection: close";
         }
 }

 backend backend2 {
         .host = "5.6.7.8";
         .port = "80";
         .connect_timeout = 5s;
         .first_byte_timeout = 90s;
         .between_bytes_timeout = 90s;
         .probe = {
                 .timeout = 5s;
                 .interval = 5s;
                 .window = 5;
                 .threshold = 3;
                 .request =
                         "HEAD /favicon.ico HTTP/1.0"
                         "X-Forwarded-For: 5.6.7.8"
                         "Connection: close";
         }
 }

The numbers are modest, but significant - about 1 POST in a hundred fails.
I've upped the backend timeouts to 90 seconds (first_byte / between_bytes)
and I'm pretty confident they're responding in well under that time.

varnishlog does not show any backend health changes.  A typical event
looks like:

 Varnish:
 a.b.c.d - - [08/Mar/2011:14:48:03 +0000] "POST http://www.sitename.net/newreply.php?do=postreply&t=285227 HTTP/1.1" 503 2623

 Backend:
 a.b.c.d - - [08/Mar/2011:14:48:03 +0000] "POST /newreply.php?do=postreply&t=285227 HTTP/1.1" 200 2686

The POST appears to work fine on the backend but the user gets a 503 from
Varnish.  It's not unusual to see users getting the error several times in
a row (presumably re-submitting the post):

 a.b.c.d - - [08/Mar/2011:18:21:23 +0000] "POST http://www.sitename.net/editpost.php?do=updatepost&p=9405408 HTTP/1.1" 503 2623
 a.b.c.d - - [08/Mar/2011:18:21:36 +0000] "POST http://www.sitename.net/editpost.php?do=updatepost&p=9405408 HTTP/1.1" 503 2623
 a.b.c.d - - [08/Mar/2011:18:21:50 +0000] "POST http://www.sitename.net/editpost.php?do=updatepost&p=9405408 HTTP/1.1" 503 2623

A typical request is below.  The first attempt fails with:

  33 FetchError   c http first read error: -1 0 (Success)

there is presumably a restart and the second attempt (sometimes to
backend1, sometimes backend2) fails with:

  33 FetchError   c backend write error: 11 (Resource temporarily unavailable)

This pattern has been the same on the few transactions I've examined in
detail.  The full log output of a typical request is below.

I'm stumped.  Has anybody got any ideas what might be causing this?


-Ronan


   33 RxRequest    c POST
   33 RxURL        c /ajax.php
   33 RxProtocol   c HTTP/1.1
   33 RxHeader     c Accept: */*
   33 RxHeader     c Accept-Language: nl-be
   33 RxHeader     c Referer: http://www.redcafe.net/
   33 RxHeader     c x-requested-with: XMLHttpRequest
   33 RxHeader     c Content-Type: application/x-www-form-urlencoded; charset=UTF-8
   33 RxHeader     c Accept-Encoding: gzip, deflate
   33 RxHeader     c User-Agent: Mozilla/4.0 (compatible; ...)
   33 RxHeader     c Host: www.sitename.net
   33 RxHeader     c Content-Length: 82
   33 RxHeader     c Connection: Keep-Alive
   33 RxHeader     c Cache-Control: no-cache
   33 RxHeader     c Cookie: ...
   33 VCL_call     c recv
   33 VCL_return   c pass
   33 VCL_call     c hash
   33 VCL_return   c hash
   33 VCL_call     c pass
   33 VCL_return   c pass
   33 Backend      c 44 backend backend1
   44 TxRequest    b POST
   44 TxURL        b /ajax.php
   44 TxProtocol   b HTTP/1.1
   44 TxHeader     b Accept: */*
   44 TxHeader     b Accept-Language: nl-be
   44 TxHeader     b Referer: http://www.sitename.net/
   44 TxHeader     b x-requested-with: XMLHttpRequest
   44 TxHeader     b Content-Type: application/x-www-form-urlencoded; charset=UTF-8
   44 TxHeader     b User-Agent: Mozilla/4.0 (compatible; ...)
   44 TxHeader     b Host: www.sitename.net
   44 TxHeader     b Content-Length: 82
   44 TxHeader     b Cache-Control: no-cache
   44 TxHeader     b Cookie: ...
   44 TxHeader     b Accept-Encoding: gzip
   44 TxHeader     b X-Forwarded-For: a.b.c.d
   44 TxHeader     b X-Varnish: 657185708
 * 33 FetchError   c http first read error: -1 0 (Success)
   44 BackendClose b backend1
   33 Backend      c 47 backend backend2
   47 TxRequest    b POST
   47 TxURL        b /ajax.php
   47 TxProtocol   b HTTP/1.1
   47 TxHeader     b Accept: */*
   47 TxHeader     b Accept-Language: nl-be
   47 TxHeader     b Referer: http://www.sitename.net/
   47 TxHeader     b x-requested-with: XMLHttpRequest
   47 TxHeader     b Content-Type: application/x-www-form-urlencoded; charset=UTF-8
   47 TxHeader     b User-Agent: Mozilla/4.0 (compatible; ...)
   47 TxHeader     b Host: www.sitename.net
   47 TxHeader     b Content-Length: 82
   47 TxHeader     b Cache-Control: no-cache
   47 TxHeader     b Cookie: ...
   47 TxHeader     b Accept-Encoding: gzip
   47 TxHeader     b X-Forwarded-For: a.b.c.d
   47 TxHeader     b X-Varnish: 657185708
 * 33 FetchError   c backend write error: 11 (Resource temporarily unavailable)
   47 BackendClose b backend2
   33 VCL_call     c error
   33 VCL_return   c deliver
   33 VCL_call     c deliver
   33 VCL_return   c deliver
   33 TxProtocol   c HTTP/1.1
   33 TxStatus     c 503
   33 TxResponse   c Service Unavailable
   33 TxHeader     c Server: Varnish
   33 TxHeader     c Retry-After: 0
   33 TxHeader     c Content-Type: text/html; charset=utf-8
   33 TxHeader     c Content-Length: 2623
   33 TxHeader     c Date: Tue, 08 Mar 2011 17:08:33 GMT
   33 TxHeader     c X-Varnish: 657185708
   33 TxHeader     c Age: 3
   33 TxHeader     c Via: 1.1 varnish
   33 TxHeader     c Connection: close
   33 Length       c 2623
   33 ReqEnd       c 657185708 1299604110.559967279 1299604113.447372913 0.000037670 2.887368441 0.000037193
   33 SessionClose c error
   33 StatSess     c a.b.c.d 50044 3 1 1 0 1 0 235 2623





More information about the varnish-misc mailing list