[Varnish] #1331: Varnish coredump every day
Varnish
varnish-bugs at varnish-cache.org
Mon Aug 5 12:02:31 CEST 2013
#1331: Varnish coredump every day
-------------------------+--------------------
Reporter: jinjian.1@… | Owner:
Type: defect | Status: new
Priority: high | Milestone:
Component: varnishd | Version: 3.0.3
Severity: critical | Resolution:
Keywords: coredump |
-------------------------+--------------------
Description changed by tfheen:
Old description:
> We have encountered a Varnish core dump every day this week. Our version
> is 3.0.3.
>
> From /var/log/messages:
>
> Aug 2 07:50:26 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
> Aug 2 07:50:36 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
> Aug 2 07:50:47 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
> Aug 2 07:50:53 ip-10-36-1-238 stud[10104]: {client} Connection closed (in data)
> Aug 2 07:50:53 ip-10-36-1-238 stud[10104]: ipaddress :10.36.1.238 accept!
> Aug 2 07:50:57 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
> Aug 2 07:51:02 ip-10-36-1-238 stud[10104]: {backend} Connection reset by peer
> Aug 2 07:51:02 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
> Aug 2 07:51:02 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
> Aug 2 07:51:02 ip-10-36-1-238 varnishd[28776]: Child (28777) died signal=3 (core dumped)
> Aug 2 07:51:02 ip-10-36-1-238 varnishd[28776]: child (20041) Started
> Aug 2 07:51:04 ip-10-36-1-238 varnishd[28776]: Child (20041) said Child starts
>
> From the core dump:
>
> (gdb) bt
> #0 0x00007fdce4b41054 in __lll_lock_wait () from /lib64/libpthread.so.0
> #1 0x00007fdce4b3c388 in _L_lock_854 () from /lib64/libpthread.so.0
> #2 0x00007fdce4b3c257 in pthread_mutex_lock () from /lib64/libpthread.so.0
> #3 0x0000000000434350 in vsl_get ()
> #4 0x0000000000434508 in VSLR ()
> #5 0x00000000004346d2 in VSL ()
> #6 0x00007fdce66d2d95 in cls_vlu2 (priv=0x7fdce3d42780, av=0x7fd96e85b500) at cli_serve.c:292
> #7 0x00007fdce66d347b in cls_vlu (priv=0x7fdce3d42780, p=0x2 <Address 0x2 out of bounds>) at cli_serve.c:339
> #8 0x00007fdce66d6e09 in LineUpProcess (l=0x7fdce3d1d730) at vlu.c:154
> #9 0x00007fdce66d3e7d in VCLS_Poll (cs=0x7fdce3d03290, timeout=<value optimized out>) at cli_serve.c:528
> #10 0x000000000041aa41 in CLI_Run ()
> #11 0x000000000042ea01 in child_main ()
> #12 0x000000000044155c in start_child ()
> #13 0x0000000000441ee8 in MGT_Run ()
> #14 0x000000000045037f in main ()
>
> Our system is down for almost 1 minute during the recovery process.
>
> The issue looks very similar to
> https://www.varnish-cache.org/trac/ticket/516 and
> https://www.varnish-cache.org/trac/ticket/1054, but I could not find any
> solution there. Could anybody shed some light on it?
New description:
We have encountered a Varnish core dump every day this week. Our version is
3.0.3.
From /var/log/messages:
{{{
Aug 2 07:50:26 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
Aug 2 07:50:36 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
Aug 2 07:50:47 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
Aug 2 07:50:53 ip-10-36-1-238 stud[10104]: {client} Connection closed (in data)
Aug 2 07:50:53 ip-10-36-1-238 stud[10104]: ipaddress :10.36.1.238 accept!
Aug 2 07:50:57 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
Aug 2 07:51:02 ip-10-36-1-238 stud[10104]: {backend} Connection reset by peer
Aug 2 07:51:02 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
Aug 2 07:51:02 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
Aug 2 07:51:02 ip-10-36-1-238 varnishd[28776]: Child (28777) died signal=3 (core dumped)
Aug 2 07:51:02 ip-10-36-1-238 varnishd[28776]: child (20041) Started
Aug 2 07:51:04 ip-10-36-1-238 varnishd[28776]: Child (20041) said Child starts
}}}
From the core dump:
{{{
(gdb) bt
#0 0x00007fdce4b41054 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007fdce4b3c388 in _L_lock_854 () from /lib64/libpthread.so.0
#2 0x00007fdce4b3c257 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x0000000000434350 in vsl_get ()
#4 0x0000000000434508 in VSLR ()
#5 0x00000000004346d2 in VSL ()
#6 0x00007fdce66d2d95 in cls_vlu2 (priv=0x7fdce3d42780, av=0x7fd96e85b500) at cli_serve.c:292
#7 0x00007fdce66d347b in cls_vlu (priv=0x7fdce3d42780, p=0x2 <Address 0x2 out of bounds>) at cli_serve.c:339
#8 0x00007fdce66d6e09 in LineUpProcess (l=0x7fdce3d1d730) at vlu.c:154
#9 0x00007fdce66d3e7d in VCLS_Poll (cs=0x7fdce3d03290, timeout=<value optimized out>) at cli_serve.c:528
#10 0x000000000041aa41 in CLI_Run ()
#11 0x000000000042ea01 in child_main ()
#12 0x000000000044155c in start_child ()
#13 0x0000000000441ee8 in MGT_Run ()
#14 0x000000000045037f in main ()
}}}
Our system is down for almost 1 minute during the recovery process.
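The length of that window can be estimated from the syslog timestamps quoted above. The following is a minimal sketch, not part of the original report: the function names `parse_stamp` and `outage_seconds`, the year 2013 (taken from the date of this message, since syslog omits the year), and the choice of "first `not responding to CLI` line" and "`said Child starts` line" as the start and end markers are all assumptions made for illustration. It also assumes each log entry sits on a single line and that month names are parsed in an English locale.

```python
import re
from datetime import datetime

# Three of the syslog lines quoted in this ticket, with wrapped lines rejoined.
LOG = """\
Aug 2 07:50:26 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
Aug 2 07:51:02 ip-10-36-1-238 varnishd[28776]: Child (28777) died signal=3 (core dumped)
Aug 2 07:51:04 ip-10-36-1-238 varnishd[28776]: Child (20041) said Child starts
"""

# Classic syslog prefix: month name, day of month, HH:MM:SS.
STAMP = re.compile(r"^(\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s")


def parse_stamp(line, year=2013):
    """Return the timestamp of a syslog line, or None if it has none.

    The year is an assumption: syslog does not record it.
    """
    match = STAMP.match(line)
    if match is None:
        return None
    return datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")


def outage_seconds(log_text):
    """Seconds from the first CLI-timeout message to the next child start.

    Returns None if either marker is missing from the log text.
    """
    first_hang = None
    for line in log_text.splitlines():
        stamp = parse_stamp(line)
        if stamp is None:
            continue
        if first_hang is None and "not responding to CLI" in line:
            first_hang = stamp
        elif first_hang is not None and "said Child starts" in line:
            return (stamp - first_hang).total_seconds()
    return None


if __name__ == "__main__":
    print(outage_seconds(LOG))
```

For the excerpt in this ticket the two markers are 07:50:26 and 07:51:04, so the sketch reports 38 seconds. That figure covers only the span the manager process logged, from the first missed CLI ping to the new child reporting that it has started.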
The issue looks very similar to
https://www.varnish-cache.org/trac/ticket/516 and
https://www.varnish-cache.org/trac/ticket/1054, but I could not find any
solution there. Could anybody shed some light on it?
--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1331#comment:1>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator