Ticket #310 (closed defect: fixed)

Opened 6 years ago

Last modified 3 years ago

WS_Reserve panic + error

Reported by: sky Owned by: kristian
Priority: normal Milestone: Varnish 2.1 release
Component: varnishd Version: 2.1.4
Severity: normal Keywords:
Cc:

Description (last modified by kristian) (diff)

Sep 3 00:20:20 varnish2 varnishd[25355]: Child (25376) died signal=6 Sep 3 00:20:20 varnish2 varnishd[25355]: Child (25376) Panic message: Assert error in WS_Reserve(), cache_ws.c line 156: Condition(ws->r == NULL) not true. thread = (cache-worker)sp = 0x14a64fc8 { fd = 1507, id = 1507, xid = 684175780, client = 72.81.232.3:1299, step = 0x8, handling = 0x0, err_code = 301, err_reason = (null), ws = 0x14a65038 { id = "sess", {s,f,r,e} = {0x14a65760+947,(nil),+32768}, }, worker = 0x2aaee1db0c10 { }, vcl = { srcname = { "/home/artur/wikia-beta.vcl", "Default", }, }, obj = 0x2aabad2f3000 { refcnt = 3, xid = 676411472, ws = 0x2aabad2f3028 { id = "obj", {s,f,r,e} = {0x2aabad2f3348+530,(nil),+31928}, }, http = { ws = 0x2aabad2f3028 { id = "obj", {s,f,r,e} = {0x2aabad2f3348+530,(nil),+31928}, }, hd = { "Date: Tue, 02 Sep 2008 20:44:22 GMT", "Server: Apache", "X-Powered-By: PHP/5.2.6", "Content-language: en", Sep 3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(7, sock) Sep 3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(8, sock) Sep 3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(9, cli_in) Sep 3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(12, cli_out) Sep 3 00:20:20 varnish2 varnishd[25355]: child (28791) Started Sep 3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(9, (null)) Sep 3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(12, (null)) Sep 3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(7, (null)) Sep 3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(8, (null)) Sep 3 00:20:20 varnish2 varnishd[25355]: Pushing vcls failed: dlopen(./vcl.1P9zoqAU.so): ./vcl.1P9zoqAU.so: cannot open shared object file: No such file or directory Sep 3 00:20:20 varnish2 varnishd[25355]: Child (28791) said Closed fds: 5 6 10 11 13 14 Sep 3 00:20:20 varnish2 varnishd[25355]: Child (28791) said Child starts Sep 3 00:20:20 varnish2 varnishd[25355]: Child (28791) said managed to mmap 15032385536 bytes of 15032385536 Sep 3 00:20:20 varnish2 varnishd[25355]: Child (28791) said Ready

No core dump seem to have survived.

Any suggestion for what to look for?

Change History

comment:1 Changed 6 years ago by sky

Sep  3 00:20:20 varnish2 varnishd[25355]: Child (25376) died signal=6
Sep  3 00:20:20 varnish2 varnishd[25355]: Child (25376) Panic message: Assert error in WS_Reserve(), cache_ws.c line 156:   Condition(ws->r == NULL) not true.  
thread = (cache-worker)sp = 0x14a64fc8 {   fd = 1507,
id = 1507, 
xid = 684175780,   
client = 72.81.232.3:1299,
step = 0x8,   
handling = 0x0,   
err_code = 301, 
err_reason = (null),   
ws = 0x14a65038 {
     id = "sess",
     {s,f,r,e} = {0x14a65760,,+947,(nil),+32768},
   },    
 worker = 0x2aaee1db0c10 {     },    
 vcl = {       srcname = {         "/home/artur/wikia-beta.vcl",         "Default",       },     },   
 
obj = 0x2aabad2f3000 {
    refcnt = 3, xid = 676411472,
    ws = 0x2aabad2f3028 {
      id = "obj",
      {s,f,r,e} = {0x2aabad2f3348,,+530,(nil),+31928},
    },
    http = {
      ws = 0x2aabad2f3028 {
        id = "obj",
        {s,f,r,e} = {0x2aabad2f3348,,+530,(nil),+31928},
      },
      hd = {
        "Date: Tue, 02 Sep 2008 20:44:22 GMT",
        "Server: Apache",
        "X-Powered-By: PHP/5.2.6",
        "Content-language: en",
        "X-Time-CPU-Time: 0.951 0.601",
        "Vary: Accept-Encoding,Cookie",
        "X-Vary-Options: Cookie;string-contains=wikicitiesUserID;string-contains=wikicities_session,Accept-Encoding;list-contains=gzip",   
        "Cache-Control: s-maxage=18000, must-revalidate, max-age=0",
        "Last-modified: Wed, 27 Aug 2008 09:04:40 GMT",
        "Content-Encoding: gzip",
        "Content-Type: text/html; charset=utf-8",
        "Content-Length: 22225",
        "X-Cacheable: YES",
        "X-Cache: HIT",
      },
    },
    len = 22225,
    store = {
      22225 {
        1f 8b 08 00 00 00 00 00 00 03 ed bd fb 77 db 36 |.............w.6|
        16 27 fe 73 72 4e ff 07 44 dd 36 c9 34 b2 5e 7e |.'.srN..D.6.4.^~|
        bf ba 8e ed 24 9e c9 c3 63 bb cd cc b6 3d 3a 94 |....$...c....=:.|
        44 4b ac 29 51 43 51 76 dc d9 fe ef fb b9 78 10 |DK.)QCQv......x.|
        [22161 more]
      },
    },
  },
},
Sep  3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(7, sock)
Sep  3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(8, sock)
Sep  3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(9, cli_in)
Sep  3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(12, cli_out)
Sep  3 00:20:20 varnish2 varnishd[25355]: child (28791) Started
Sep  3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(9, (null))
Sep  3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(12, (null))
Sep  3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(7, (null))
Sep  3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(8, (null))
Sep  3 00:20:20 varnish2 varnishd[25355]: Pushing vcls failed: dlopen(./vcl.1P9zoqAU.so): ./vcl.1P9zoqAU.so: cannot open shared object file: No such file or directory 
Sep  3 00:20:20 varnish2 varnishd[25355]: Child (28791) said Closed fds: 5 6 10 11 13 14
Sep  3 00:20:20 varnish2 varnishd[25355]: Child (28791) said Child starts
Sep  3 00:20:20 varnish2 varnishd[25355]: Child (28791) said managed to mmap 15032385536 bytes of 15032385536
Sep  3 00:20:20 varnish2 varnishd[25355]: Child (28791) said Ready


comment:2 Changed 6 years ago by sky

backend default {
        .host = "127.0.0.1";
        .port = "80";
}

backend wikia {
        .host = "xxxx"; 
        .port = "80";
}

backend armchair {
        .host = "xxxx";
        .port = "80";
}

backend gamewikis {
        .host = "xxxx";
        .port = "80";
}

backend images { 
        .host = "xxxx";
        .port = "80";
}

# Below is a commented-out copy of the default VCL logic.  If you
# redefine any of these subroutines, the built-in logic will be
# appended to your code.

## Called when a client request is received
#
sub vcl_recv {


        if (req.http.Accept-Encoding) {
                if (req.http.Accept-Encoding ~ "gzip") {
                    set req.http.Accept-Encoding = "gzip";
                } elsif (req.http.Accept-Encoding ~ "deflate") {
                    set req.http.Accept-Encoding = "deflate";
                } else {
                    # unkown algorithm
                    remove req.http.Accept-Encoding;
                }
        }

        # clean out requests sent via curls -X mode and LWP
        if (req.url ~ "http://") {
                set req.url = regsub(req.url, "http://[^/]*","");
        }


        if (req.request == "PURGE") {
          if (req.http.purgeurl) {
            purge_hash(req.http.purgeurl); 
            error 200 "purged";
          } else {
            error 503 "empty purgeurl";
          }
        }

        if (req.url == "/lvscheck.html") {
                error 200 "varnish is okay";
        }

        if(req.http.host ~ "^(nwn|oblivion|meta|war|gw)$") {
                set req.backend = wikia;
        } elsif(req.http.host ~ "(gw.wikia.com|gamewikis.org)$") {
                set req.backend = gamewikis;
        } elsif(req.http.host ~ "^siwiki.sportsillustrated.cnn.com$") {
                set req.backend = armchair;
        } elsif(req.http.host ~ "^thirdturn.armchairgm.com$") {
                set req.backend = wikia;
        } elsif(req.http.host ~ "armchairgm.com$") {
                set req.backend = armchair;
        } elsif(req.http.host ~ "images.wikia.com") {
                set req.backend = images;
        } else {
                set req.backend = wikia;
        }

        if (req.url ~ "/__utm.gif") {
                set req.url = "/__utm.gif";
        }

        if (req.url ~ "/rx") {
              error 200 "not serving this";
        }


        if (req.http.host == "216.151.156.11") {
                set req.http.origurl = req.url;
        }


        if (req.request != "GET" && req.request != "HEAD" && req.request != "PURGE") {
                pipe;
        }
        if (req.http.Expect) {
                pipe;
        }

        if (req.http.User-Agent ~ "Opera") {
                pipe;
        }

        if (req.http.Cookie ~ "UserID") {
                set req.http.tmp_userid  = regsuball(req.http.Cookie, "(.*?)(^|;|\s)(.*UserID=[^;]*).*", "\3; ");
        } else { 
                set req.http.tmp_userid = " ";
        }
        if (req.http.Cookie ~ "UserName") {
                set req.http.tmp_username  = regsuball(req.http.Cookie, "(.*?)(^|;|\s)(.*UserName=[^;]*).*", "\3; ");
        } else { 
                set req.http.tmp_username = " ";
        }
        if (req.http.Cookie ~ "session") {
                set req.http.tmp_session  = regsub(req.http.Cookie, "(.*)(^|;|\s)(.*session=[^;]*).*", "\3; ");
        } else { 
                set req.http.tmp_session = " ";
        }
        if (req.http.Cookie ~ "Token") {
                set req.http.tmp_token  = regsub(req.http.Cookie, "(.*)(^|;|\s)(.*Token=[^;]*).*", "\3; ");
        } else { 
                set req.http.tmp_token = " ";
        }
        if (req.http.Cookie ~ "LoggedOut") {
                set req.http.tmp_loggedout  = regsub(req.http.Cookie, "(.*)(^|;|\s)(.*LoggedOut=[^;]*).*", "\3; ");
        } else { 
                set req.http.tmp_loggedout = " ";
        }
        set req.http.Cookie = "";
        set req.http.Cookie = req.http.tmp_userid " ; " req.http.tmp_session " ; " req.http.tmp_token " ; " req.http.tmp_username " ; " req.http.tmp_loggedout;
        remove req.http.tmp_userid;
        remove req.http.tmp_session;
        remove req.http.tmp_token;
        remove req.http.tmp_loggedout;
        remove req.http.tmp_username;

        if (req.http.Authenticate) {
                pass;
        }
        lookup;
}


sub vcl_hash {
        set req.hash += req.url;
        set req.hash += req.http.host;
        hash;
}

#
## Called when entering pipe mode
#
#sub vcl_pipe {
#       pipe;
#}
#
## Called when entering pass mode
#
#sub vcl_pass {
#        set resp.http.foo = "x";
#       pass;
#}
#
## Called when entering an object into the cache
#


#
## Called when the requested object was found in the cache
#
sub vcl_hit {
       if (req.request == "PURGE") {
                error 200 "purged";
        }

        if (!obj.cacheable) {
                pass;
        }
        if (obj.http.X-Cache == "MISS") {
                set obj.http.X-Cache = "HIT";
        }

        deliver;
}
#
## Called when the requested object was not found in the cache
#
sub vcl_miss {
       if (req.request == "PURGE") {
#               nuke;
                error 200 "purged";
        }

        fetch;
}
#
## Called when the requested object has been retrieved from the
## backend, or the request to the backend has failed
#
sub vcl_fetch {
        set obj.http.X-Cache = "MISS";
        # this is the old wow ip, so issue redirect
        if (req.http.host == "216.151.156.11") {
                set obj.http.origurl = req.http.origurl;
        }

        if (!obj.cacheable) {
                set obj.http.X-Cacheable = "NO:Not-Cacheable";
                pass;
        }
        if (obj.http.Cache-Control ~ "private") {
                if(req.http.Cookie ~"(UserID|_session)") {
                        set obj.http.X-Cacheable = "NO:Got Session";
                } else {
                        set obj.http.X-Cacheable = "NO:Cache-Control=private";
                }
                pass;
        }
        if (obj.http.Set-Cookie ~ "(UserID|_session)") {
                set obj.http.X-Cacheable = "NO:Set-Cookie";
                pass;
        }

        set obj.http.X-Cacheable = "YES";
        set obj.grace = 10s;
        deliver;
}

sub vcl_prefetch {
        pass;
}

#
#
## Called before a cached object is delivered to the client
#
sub vcl_deliver {

        set resp.http.X-Served-By = "varnish2";

        # this is the old wow ip, so issue redirect
        if (resp.http.origurl) {
                set resp.http.Location = regsub(resp.http.origurl, "^" , "http://www.wowwiki.com");
                set resp.status = 301;
                set resp.response = "Moved Permanently";
                deliver;
        }



    set resp.http.Cache-Control = "private, s-maxage=0, max-age=0, must-revalidate";
    set resp.http.Expires = "Thu, 01 Jan 1970 00:00:00 GMT";
    if(!resp.http.Vary) {
        set resp.http.Vary = "Accept-Encoding,Cookie";
        }
    deliver;
}

comment:3 Changed 6 years ago by sky

started as such

bin/varnishd/varnishd -w 500,2000 -a :80 -T localhost:6082 -f /home/artur/wikia-beta.vcl -u varnish -g varnish -s file,/var/lib/varnish/varnish_storage.bin,14G -p obj_workspace=32768  -p sess_workspace=32768 -p thread_pools=2 -p listen_depth=8192   -p ping_interval=0 -p log_hashstring=on -F

comment:4 Changed 6 years ago by phk

  • Status changed from new to closed
  • Resolution set to worksforme

It looks like you ran out of workspace for http modifications, as far as I can tell you need to increase sess_workspace param.

comment:5 Changed 6 years ago by sky

  • Status changed from closed to reopened
  • Resolution worksforme deleted

The path it took was via recv/hash/hit/deliver -- the sess_workspace is already 32kb, there is nothing very different from all the other requests. I am rather surprised how it could eat up 32768 bytes of work space for a hit

The second bug would be failing to reload the vcl after the crash.

comment:6 Changed 6 years ago by sky

(gdb) bt
#0  0x0000003482e30055 in raise () from /lib64/libc.so.6
#1  0x0000003482e31af0 in abort () from /lib64/libc.so.6
#2  0x0000000000419bc2 in pan_ic (func=<value optimized out>, file=<value optimized out>, line=<value optimized out>,     cond=<value optimized out>, err=64, xxx=<value optimized out>) at cache_panic.c:325
#3  0x0000000000426054 in WS_Reserve (ws=0x2aab8d8b9028, bytes=0) at cache_ws.c:156
#4  0x0000000000420ce6 in vrt_assemble_string (hp=dwarf2_read_address: Corrupted DWARF expression.) at cache_vrt.c:131
#5  0x0000000000422af5 in VRT_SetHdr (sp=0xb122b68, where=<value optimized out>, hdr=0x2aae30291417 "\bX-Cache:",     p=0x2aae30291426 "HIT") at cache_vrt.c:179
#6  0x00002aae302903b1 in ?? ()
#7  0x00002aaeee5c4c10 in ?? ()
#8  0x000000000b122b68 in ?? ()
#9  0x000000000b122b68 in ?? ()
#10 0x000000000041f466 in VCL_hit_method (sp=0xb122b68) at ../../include/vcl_returns.h:42
#11 0x000000000040dfd8 in cnt_hit (sp=0xb122b68) at cache_center.c:518
#12 0x000000000040f8d4 in CNT_Session (sp=0x2aaeee5c2b90) at steps.h:40
#13 0x000000000041b761 in wrk_do_cnt_sess (w=0x2aaeee5c4c10, priv=<value optimized out>) at cache_pool.c:363
#14 0x000000000041a8a7 in wrk_thread (priv=0x3744e20) at cache_pool.c:276
#15 0x0000003483e062f7 in start_thread () from /lib64/libpthread.so.0#16 0x0000003482ece85d in clone () from /lib64/libc.so.6

comment:7 Changed 6 years ago by phk

  • Status changed from reopened to new
  • Owner changed from des to phk
  • Component changed from build to varnishd
  • Milestone set to After 2.0

I guess this is a "Ah-ha" bug.

We cannot safely assign a string variable that is built on the object workspace in vcl_hit{} because the object is not locked.

If two worker threads tries to do it at the same time, they will collide like we see here.

I don't think this will be fixed for 2.0, it will have to be an errata.

comment:8 Changed 6 years ago by phk

  • Priority changed from normal to lowest

comment:9 Changed 5 years ago by kb

Just as a note to others, this issue also occurs if you set obj headers in vcl_fetch(). That's a good place to mark the cached object with new headers with useful information (say, the original URL if you're rewriting), so IMHO it would be nice to have this functionality at some point.

comment:10 Changed 4 years ago by phk

  • Status changed from new to closed
  • Resolution set to invalid

The new allocation policy for objects, forced on us by the persistent stevedore, makes it impossible to change objects headers once they are created.

In response to this, we have made a number of changes to the VCL, including the move in vcl_fetch from obj.* to beresp.*.

In this light, this ticket is now invalid, in the sense that it has been overtaken by events.

comment:11 Changed 3 years ago by kristian

  • Status changed from closed to reopened
  • Priority changed from lowest to normal
  • Version changed from trunk to 2.1.4
  • Resolution invalid deleted
  • Description modified (diff)

Re-opening this: It's not applicable to trunk, but it is applicable to 2.1.4 and current 2.1 (ie: soon-to-be 2.1.5). Check committed with r5646 .

Haven't had a chance to track it for 2.1.

No practical impact to users who don't use vcl_hit for set obj.http (which should be all), but race and assert-issues for those who now do.

comment:12 Changed 3 years ago by kristian

  • Status changed from reopened to new
  • Owner changed from phk to kristian

comment:13 Changed 3 years ago by kristian

  • Status changed from new to assigned

comment:14 Changed 3 years ago by harm

I can confirm that this issue still exists in v2.1.5

Feb 10 23:04:29 true1 varnishd[6213]: Child (20808) Panic message: Assert error in WS_Reserve(), cache_ws.c line 174:  Condition(ws->r == NULL) not true.
thread = (cache-worker)#012ident = Linux,2.6.32-26-server,x86_64,-smalloc,-hcritbit,epoll
Backtrace:  0x426778: /usr/sbin/varnishd() 
[0x426778]#012  0x432084: /usr/sbin/varnishd(WS_Reserve+0x54) 
[0x432084]#012  0x42be95: /usr/sbin/varnishd() [0x42be95]#012  0x4309b8: /usr/sbin/varnishd(VRT_SetHdr+0xf8) 
[0x4309b8]#012  0x7f3bf06f7fb3: ./vcl.1P9zoqAU.so(+0x1fb3) 
[0x7f3bf06f7fb3]#012  0x42b5e3: /usr/sbin/varnishd(VCL_hit_method+0x43) 
[0x42b5e3]#012  0x413b01: /usr/sbin/varnishd() 
[0x413b01]#012  0x415a3d: /usr/sbin/varnishd(CNT_Session+0x33d) [0x415a3d]#012  0x427f28: /usr/sbin/varnishd() 
[0x427f28]#012  0x4282fb: /usr/sbin/varnishd() 
[0x4282fb]
sp = 0x7f3b2703e008 {
#012  fd = 14, id = 14, xid = 2058269645,#012  client = 87.233.6.43 36464,#012  step = STP_HIT,#012  handling = deliver,#012  restarts = 0, esis = 0#012  ws = 0x7f3b2703e080 { #012    id = "sess",#012 

I have the following vcl_hit

sub vcl_hit {
        if (!obj.cacheable) {
                return(pass);
        }

        set obj.http.X-Cache = "Cache-Hit";

        return (deliver);
}

comment:15 Changed 3 years ago by kristian

  • Status changed from assigned to closed
  • Resolution set to fixed

Aaaand closing again. The bug is long since fixed, and we are now working mainly on 3.0, where it's been fixed.

Note: See TracTickets for help on using tickets.