varnishncsa logs split per domain

Admin Beckspaced admin at beckspaced.com
Fri Dec 2 19:30:01 CET 2016


On 15.11.2016 17:57, Admin Beckspaced wrote:
>
> On 15.11.2016 at 16:51, Dridi Boukelmoune wrote:
>>> a bit more hints and info would be nice ;)
>> man vmod_std
>> man varnishncsa
>>
>> That's how much "nice" I'm willing to do :p
>>
>> Dridi
>>
>>
> ok. first I want to say thanks for being nice and pointing me to the 
> man pages.
>
> after a bit of reading I finally found the parts I was looking for:
>
>     import std;
>
>     sub vcl_recv {
>
>     std.log("myhost:" + regsub(req.http.Host, "^www\.", "") );
>
>     }
>
> in varnishncsa:
>
>      # varnishncsa -F '%h %l %u %t "%r" %s %b "%{Referer}i" "%{User-agent}i" %{VCL_Log:myhost}x'
>
>
> not yet tested but I think this is what Dridi was pointing to?
>
Hello again,

sorry, it has been a while, but I wanted to close the thread I started 
and describe the solution I finally went with.

The original question was: how can I split the varnishncsa logs per 
domain?

My first thought was to use the -q query option, e.g. varnishncsa -q 
"ReqHeader ~ '^Host: .*example.com'"

But this approach would end up in a lot of varnishncsa instances, as 
Andrei pointed out, and it also leaves no way to normalize the request 
Host header.
Then Dridi pointed me to man vmod_std and to using std.log in VCL, which 
was the final bit needed ;)

So here's my current solution:

I run a single instance of varnishncsa with the following params:

VARNISHLOG_PARAMS="-f /etc/varnish/varnishncsa-log-format-string -a -w /var/log/varnish/varnish.log"

the varnishncsa-log-format-string is as follows:

%{VCL_Log:myhost}x %h %l %u %t "%r" %s %b "%{Referer}i" "%{User-agent}i"

The %{VCL_Log:myhost}x at the beginning is the VCL_Log:key field, i.e. 
the value set by std.log("key:value") in VCL. More on that later.
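
For anyone who wants to post-process the file, here is a minimal Python sketch of how a line in this format could be taken apart. It only assumes the field order of the format string above; the name parse_line and the sample line in the comment are made up for illustration, and it does not try to handle quotes inside the request or User-agent fields.

```python
import re

# One named group per field of the format string:
# %{VCL_Log:myhost}x %h %l %u %t "%r" %s %b "%{Referer}i" "%{User-agent}i"
LINE_RE = re.compile(
    r'^(?P<myhost>\S*) (?P<client>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" (?P<status>\S+) (?P<bytes>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match.

    Hypothetical helper, e.g. for a made-up line such as:
    somedomain.com 192.0.2.1 - - [02/Dec/2016:19:30:01 +0100] "GET / HTTP/1.1" 200 512 "-" "curl/7.50.1"
    """
    match = LINE_RE.match(line.rstrip("\n"))
    return match.groupdict() if match else None
```

The myhost group is allowed to be empty, so lines for requests where std.log was never called still parse.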

My varnish sits in front of Apache with 30-something different domains. 
Currently I don't use varnish for all domains, but I have set up the 
varnish VCL in such a way that I can filter on the domain and decide 
whether to cache with varnish or skip caching and pass to the Apache 
backend.

My varnish VCL is based on the varnish boilerplate which I found on the net:

http://verticalprogramming.com/2013/09/15/varnish-virtual-host-boilerplate

So if I decide to cache a particular domain with varnish, I can normalize 
the request Host header in VCL:

import std;

sub vcl_recv {

     if (req.http.host ~ "somedomain\.com") {

         # tilde ~ is a regex match
         if (req.http.host ~ "^(www\.)?somedomain\.com$") {

             // normalize req.http.host by stripping a leading "www."
             set req.http.host = regsub(req.http.host, "^www\.", "");

             // log the normalized host for %{VCL_Log:myhost}x in varnishncsa
             std.log("myhost:" + req.http.host);
...

so this std.log("myhost:somedomain.com") record gets picked up by 
varnishncsa through the custom format string shown above.

which then produces a nice & steady varnish.log file with the normalized 
host at the beginning of each line for domains I want to cache, and an 
empty space there for domains I don't.
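
To make the normalization step easier to follow outside of VCL, here is a small Python sketch of the same logic, assuming the example domain somedomain.com from above. The function name log_tag is hypothetical; the two regular expressions mirror the ones in the VCL snippet, and this is only an illustration, not something varnish runs.

```python
import re

# Same pattern as the inner VCL check: optional "www." followed by the bare domain.
HOST_RE = re.compile(r"^(www\.)?somedomain\.com$")


def log_tag(host):
    """Return the 'myhost:<normalized host>' string for a cached domain, else None."""
    if not HOST_RE.search(host):
        # Not a domain we cache: nothing is logged, so the field stays empty.
        return None
    # Equivalent of regsub(req.http.host, "^www\.", "") in VCL.
    normalized = re.sub(r"^www\.", "", host)
    return "myhost:" + normalized
```

With this, www.somedomain.com and somedomain.com both end up under the same name, which is the whole point of normalizing before logging.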

then there is the split-logfile script from Apache:

https://httpd.apache.org/docs/2.4/programs/split-logfile.html

which was made exactly for a setup with the host name at the very 
beginning of each log line. The only thing to take care of is that the 
log files get created in the directory where the script is run. So I 
created a small bash wrapper script, split-logfile.sh, which first 
changes to the right working directory:

#!/bin/bash
# split-logfile writes its output files into the current directory
cd /var/log/varnish || exit 1
/usr/bin/split-logfile < varnish.log
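
For readers who have not used split-logfile, the idea of splitting on the first field can be sketched in Python roughly as follows. This is my own approximation, not the Apache script itself: the names group_by_host and write_groups are hypothetical, lines are kept unchanged, and the real script may treat the first field differently when writing.

```python
import os
import re


def group_by_host(lines):
    """Group log lines by their first whitespace-separated field (lowercased).

    Lines with an empty first field, or one that looks like a path,
    are collected under "access".
    """
    groups = {}
    for line in lines:
        first = re.split(r"\s", line, maxsplit=1)[0].lower()
        if not first or "/" in first or "\\" in first:
            first = "access"
        groups.setdefault(first, []).append(line)
    return groups


def write_groups(groups, out_dir):
    """Append each group to <out_dir>/<name>.log, one line per record."""
    for name, lines in groups.items():
        with open(os.path.join(out_dir, name + ".log"), "a") as fh:
            for line in lines:
                fh.write(line.rstrip("\n") + "\n")
```

Appending rather than overwriting matters here, because the split runs again on every logrotate run.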

and to the daily logrotate entry for /var/log/varnish/varnish.log I 
added the following:

/var/log/varnish/varnish.log {
     ...
     prerotate
      /var/log/varnish/split-logfile.sh
     endscript
     ...
}

so before varnish.log gets rotated, split-logfile.sh gets called and 
creates the different log files per normalized host, plus an access.log 
for all the requests without a hostname at the beginning.
After a few logrotate runs, /var/log/varnish/ could look like this:

mydomain.com.log
myotherdomain.com.log
access.log
varnish.log
varnish.log-20161130
varnish.log-20161201
varnish.log-20161202

which finally gives me exactly what I wanted! A single instance of 
varnishncsa producing one varnish.log for all domains, and a per-domain 
split via split-logfile.sh on each logrotate run, resulting in log files 
per domain ready to use with webalizer, which also gets called in the 
prerotate step of the per-domain logfile:

/var/log/varnish/mydomain.com.log {
     ...
     prerotate
      /usr/bin/webalizer -qc /etc/webalizer/mydomain.conf
     endscript
     ...
}

perhaps this might help someone here looking for something similar?

thanks & greetings
becki




