[Varnish] #1083: Persistent Varnish crashes since using bans and lurker
Varnish
varnish-bugs at varnish-cache.org
Mon Oct 29 12:18:02 CET 2012
#1083: Persistent Varnish crashes since using bans and lurker
-------------------------+---------------------
Reporter: rmohrbacher | Owner: martin
Type: defect | Status: new
Priority: high | Milestone:
Component: varnishd | Version: 3.0.2
Severity: major | Resolution:
Keywords: |
-------------------------+---------------------
Description changed by tfheen:
Old description:
> We run a farm of three Varnish instances with persistent storage (-s
> persistent,/cms/varnish_cache/persistent/varnish_storage.bin,204800M).
>
> These instances have been running for three months without any crashes
> (they are not yet in production, but have been through several stress tests).
>
> A few days ago we started using bans and the ban lurker, with
> lurker-friendly bans issued via: ban("obj.http.x-url ~ " + req.url);
> We issue about 250 bans/hour.
>
> Since then, the Varnish instances crash after a few hours.
> Curiously, all three instances crash at the same moment, even though they
> run on three different servers.
>
> The following excerpt from syslog suggests a problem with an
> invalid ban:
>
> Jan 9 19:40:32 ece-fe1 /var/lib/varnish/persistent[19622]: Child (19623)
> said CHK(0x7f91ffd261a0 BAN 2 0x7f34724f4000 <invalid>) = 1
> Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (19623)
> died signal=6
> Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (19623)
> Panic message: Missing errorhandling code in smp_append_sign(),
> storage_persistent_subr.c line 128:#012 Condition((smp_chk_sign(ctx)) ==
> 0) not true.thread = (cache-worker)#012ident =
> Linux,2.6.32-131.2.1.el6.x86_64,x86_64,-spersistent,-smalloc,-hcritbit,epoll#012Backtrace:#012
> 0x42c7a6: /usr/sbin/varnishd() [0x42c7a6]#012 0x44a346:
> /usr/sbin/varnishd(smp_append_sign+0x126) [0x44a346]#012 0x447b6d:
> /usr/sbin/varnishd(SMP_NewBan+0x3d) [0x447b6d]#012 0x4125c7:
> /usr/sbin/varnishd(BAN_Insert+0x1a7) [0x4125c7]#012 0x433bd5:
> /usr/sbin/varnishd(VRT_ban_string+0xc5) [0x433bd5]#012 0x7f91f39fa4be:
> ./vcl.PNU3fGhs.so(+0x24be) [0x7f91f39fa4be]#012 0x433863:
> /usr/sbin/varnishd(VCL_recv_method+0x43) [0x433863]#012 0x417c22:
> /usr/sbin/varnishd(CNT_Session+0xb62) [0x417c22]#012 0x42efb8:
> /usr/sbin/varnishd() [0x42efb8]#012 0x42e19b: /usr/sbin/varnishd()
> [0x42e19b]#012sp = 0x7f91ed4ab008 {#012 fd = 15, id = 15, xid =
> 683670119,#012 client = 172.27.70.103 36115,#012 step = STP_RECV,#012
> handling = deliver,#012 restarts = 0, esi_level = 0#012 flags = #012
> bodystatus = 4#012 ws = 0x7f91ed4ab080 { #012 id = "sess",#012
> {s,f,r,e} = {0x7f91ed4abc90,+56,(nil),+65536},#012 },#012 http[req] =
> {#012 ws = 0x7f91ed4ab080[sess]#012 "PURGE",#012
> "105867846",#012 "HTTP/1.0",#012 },#012 worker = 0x7f91ef1faa80
> {#012 ws = 0x7f91ef1facc0 { #012 id = "wrk",#012 {s,f,r,e} =
> {0x7f91ef1e8a30,+32,(nil),+65536},#012 },#012 },#012 vcl = {#012
> srcname = {#012 "input",#012 "Default",#012 },#012
> },#012},#012
> Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: child (6907)
> Started
> Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Pushing vcls
> failed:#012CLI communication error (hdr)
> Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (6907)
> died signal=6
> Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (6907)
> Panic message: Assert error in smp_open(), storage_persistent.c line
> 320:#012 Condition((smp_valid_silo(sc)) == 0) not true.#012thread =
> (cache-main)#012ident =
> Linux,2.6.32-131.2.1.el6.x86_64,x86_64,-spersistent,-smalloc,-hcritbit,no_waiter#012Backtrace:#012
> 0x42c7a6: /usr/sbin/varnishd() [0x42c7a6]#012 0x44756a:
> /usr/sbin/varnishd() [0x44756a]#012 0x444d57:
> /usr/sbin/varnishd(STV_open+0x27) [0x444d57]#012 0x42b525:
> /usr/sbin/varnishd(child_main+0xc5) [0x42b525]#012 0x43d5ec:
> /usr/sbin/varnishd() [0x43d5ec]#012 0x43de7c: /usr/sbin/varnishd()
> [0x43de7c]#012 0x7f92015684c7: /usr/lib64/varnish/libvarnish.so(+0x94c7)
> [0x7f92015684c7]#012 0x7f9201568b58:
> /usr/lib64/varnish/libvarnish.so(vev_schedule+0x88) [0x7f9201568b58]#012
> 0x43d7c2: /usr/sbin/varnishd(MGT_Run+0x132) [0x43d7c2]#012 0x44cacb:
> /usr/sbin/varnishd(main+0xd1b) [0x44cacb]#012
> Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (-1)
> said Child starts
> Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (-1)
> said CHK(0x7f91ffd26120 BAN 1 0x7f34723f4000 BAN 1) = 4
> Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (-1)
> said CHK(0x7f91ffd261a0 BAN 2 0x7f34724f4000 <invalid>) = 1
New description:
We run a farm of three Varnish instances with persistent storage (-s
persistent,/cms/varnish_cache/persistent/varnish_storage.bin,204800M).
These instances have been running for three months without any crashes
(they are not yet in production, but have been through several stress tests).
A few days ago we started using bans and the ban lurker, with
lurker-friendly bans issued via: ban("obj.http.x-url ~ " + req.url);
We issue about 250 bans/hour.
Since then, the Varnish instances crash after a few hours.
Curiously, all three instances crash at the same moment, even though they
run on three different servers.
The following excerpt from syslog suggests a problem with an
invalid ban:
{{{
Jan 9 19:40:32 ece-fe1 /var/lib/varnish/persistent[19622]: Child (19623)
said CHK(0x7f91ffd261a0 BAN 2 0x7f34724f4000 <invalid>) = 1
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (19623)
died signal=6
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (19623)
Panic message: Missing errorhandling code in smp_append_sign(),
storage_persistent_subr.c line 128:#012 Condition((smp_chk_sign(ctx)) ==
0) not true.thread = (cache-worker)#012ident =
Linux,2.6.32-131.2.1.el6.x86_64,x86_64,-spersistent,-smalloc,-hcritbit,epoll#012Backtrace:#012
0x42c7a6: /usr/sbin/varnishd() [0x42c7a6]#012 0x44a346:
/usr/sbin/varnishd(smp_append_sign+0x126) [0x44a346]#012 0x447b6d:
/usr/sbin/varnishd(SMP_NewBan+0x3d) [0x447b6d]#012 0x4125c7:
/usr/sbin/varnishd(BAN_Insert+0x1a7) [0x4125c7]#012 0x433bd5:
/usr/sbin/varnishd(VRT_ban_string+0xc5) [0x433bd5]#012 0x7f91f39fa4be:
./vcl.PNU3fGhs.so(+0x24be) [0x7f91f39fa4be]#012 0x433863:
/usr/sbin/varnishd(VCL_recv_method+0x43) [0x433863]#012 0x417c22:
/usr/sbin/varnishd(CNT_Session+0xb62) [0x417c22]#012 0x42efb8:
/usr/sbin/varnishd() [0x42efb8]#012 0x42e19b: /usr/sbin/varnishd()
[0x42e19b]#012sp = 0x7f91ed4ab008 {#012 fd = 15, id = 15, xid =
683670119,#012 client = 172.27.70.103 36115,#012 step = STP_RECV,#012
handling = deliver,#012 restarts = 0, esi_level = 0#012 flags = #012
bodystatus = 4#012 ws = 0x7f91ed4ab080 { #012 id = "sess",#012
{s,f,r,e} = {0x7f91ed4abc90,+56,(nil),+65536},#012 },#012 http[req] =
{#012 ws = 0x7f91ed4ab080[sess]#012 "PURGE",#012
"105867846",#012 "HTTP/1.0",#012 },#012 worker = 0x7f91ef1faa80
{#012 ws = 0x7f91ef1facc0 { #012 id = "wrk",#012 {s,f,r,e} =
{0x7f91ef1e8a30,+32,(nil),+65536},#012 },#012 },#012 vcl = {#012
srcname = {#012 "input",#012 "Default",#012 },#012
},#012},#012
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: child (6907)
Started
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Pushing vcls
failed:#012CLI communication error (hdr)
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (6907)
died signal=6
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (6907)
Panic message: Assert error in smp_open(), storage_persistent.c line
320:#012 Condition((smp_valid_silo(sc)) == 0) not true.#012thread =
(cache-main)#012ident =
Linux,2.6.32-131.2.1.el6.x86_64,x86_64,-spersistent,-smalloc,-hcritbit,no_waiter#012Backtrace:#012
0x42c7a6: /usr/sbin/varnishd() [0x42c7a6]#012 0x44756a:
/usr/sbin/varnishd() [0x44756a]#012 0x444d57:
/usr/sbin/varnishd(STV_open+0x27) [0x444d57]#012 0x42b525:
/usr/sbin/varnishd(child_main+0xc5) [0x42b525]#012 0x43d5ec:
/usr/sbin/varnishd() [0x43d5ec]#012 0x43de7c: /usr/sbin/varnishd()
[0x43de7c]#012 0x7f92015684c7: /usr/lib64/varnish/libvarnish.so(+0x94c7)
[0x7f92015684c7]#012 0x7f9201568b58:
/usr/lib64/varnish/libvarnish.so(vev_schedule+0x88) [0x7f9201568b58]#012
0x43d7c2: /usr/sbin/varnishd(MGT_Run+0x132) [0x43d7c2]#012 0x44cacb:
/usr/sbin/varnishd(main+0xd1b) [0x44cacb]#012
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (-1)
said Child starts
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (-1)
said CHK(0x7f91ffd26120 BAN 1 0x7f34723f4000 BAN 1) = 4
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (-1)
said CHK(0x7f91ffd261a0 BAN 2 0x7f34724f4000 <invalid>) = 1
}}}
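For readers reproducing this, here is a minimal sketch of the kind of VCL that a lurker-friendly ban of this form usually implies in Varnish 3.0. Only the ban() call and the PURGE request method (visible in the panic dump) come from the report. The vcl_fetch and vcl_deliver parts and the response codes are assumptions, not the reporter's actual configuration, and access control for PURGE is omitted.
{{{
# Assumed: copy the request URL onto the cached object, so that bans can
# match on obj.* only and the ban lurker can evaluate them without a request.
sub vcl_fetch {
    set beresp.http.x-url = req.url;
}

sub vcl_recv {
    if (req.request == "PURGE") {
        # From the report: req.url is used as a regex against the stored URL.
        ban("obj.http.x-url ~ " + req.url);
        # Assumed: answer the PURGE directly instead of passing it on.
        error 200 "Ban added";
    }
}

# Assumed: hide the helper header from clients.
sub vcl_deliver {
    unset resp.http.x-url;
}
}}}
With a setup like this, every PURGE request reaches BAN_Insert from vcl_recv, which matches the call path in the first backtrace (VCL_recv_method, VRT_ban_string, BAN_Insert, SMP_NewBan).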
--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1083#comment:2>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator