VFP and VDP configurations

Sun Dec 10 17:36:06 UTC 2017

Was thinking about how configurable processors could work, so just throwing
out some ideas. Going to focus on VFPs. Basically, the user adds VFP
processors via VCL as usual with an optional position. These VFPs are put
on a candidate list. Then:

   - Each VFP defines a string which states its input and output formats.
   - When Varnish constructs the VFP chain, it starts at beresp and uses
     that as the first output, and then it chains together VFPs matching
     inputs with outputs. It uses the candidate list and priorities to
     guide this construction, but it will move things around to get a best
     fit.
   - Varnish has access to builtin VFPs. These VFPs are always available
     and are used to fill in any gaps when it cannot find a way to match an
     output and input when constructing the chain.

So from VCL, here is how we add VFPs:

    VOID add_vfp(VFP init, ENUM position = DEFAULT);

VFP is "struct vfp" and any VMOD can return that, thus registering itself
as a VFP. This contains all the callbacks and its input and output
requirements.

    position is: DEFAULT, FRONT, MIDDLE, LAST, FETCH, STEVEDORE

DEFAULT lets the VMOD recommend a position, otherwise it falls back to
LAST. FETCH and STEVEDORE are special positions which tell Varnish to put
the VFP first or last, regardless of actual FRONT and LAST.
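
As a rough sketch of how a position could be resolved (Python just for
illustration; Position, RANK and resolve_position are made-up names, not an
existing Varnish API, and the FETCH < FRONT < MIDDLE < LAST < STEVEDORE
ordering is my reading of the description above):

```python
from enum import Enum
from typing import Optional


class Position(Enum):
    DEFAULT = "DEFAULT"
    FRONT = "FRONT"
    MIDDLE = "MIDDLE"
    LAST = "LAST"
    FETCH = "FETCH"
    STEVEDORE = "STEVEDORE"


# Assumed ordering: FETCH sits before FRONT, STEVEDORE sits after LAST.
RANK = {
    Position.FETCH: 0,
    Position.FRONT: 1,
    Position.MIDDLE: 2,
    Position.LAST: 3,
    Position.STEVEDORE: 4,
}


def resolve_position(requested: Position,
                     vmod_hint: Optional[Position] = None) -> Position:
    """Turn the position given to add_vfp() into a concrete one."""
    if requested is not Position.DEFAULT:
        return requested          # the user asked for something explicit
    if vmod_hint is not None and vmod_hint is not Position.DEFAULT:
        return vmod_hint          # DEFAULT: take the VMOD's recommendation
    return Position.LAST          # no recommendation: fall back to LAST
```

So add_vfp(gzip) with no position would resolve to STEVEDORE if gzip
recommends that, while add_vfp(gzip, FRONT) would stay at FRONT.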

So this would be our current list of VFPs with the format
(input)name(output):

    (text,plain,none)esi(esitext)
    (text,plain,none)esi_gzip(gzip)
    (text,plain,none)gzip(gzip,gz)
    (gzip,gz)gunzip(text,plain,none)

gzip and gunzip have a preferred position of STEVEDORE. This means they will
behave the same as beresp.do_gzip and beresp.do_gunzip when added by the
user. Also, gzip and gunzip are builtin, so they never need to be
explicitly added if they are needed by other VFPs. (From here on out
I will simplify text, plain, and none to text).
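
To make the (input)name(output) notation concrete, here is a small sketch
that parses it and checks whether one VFP can follow another. VfpSpec,
parse_spec and can_follow are hypothetical names, and treating the
comma-separated formats as aliases where any overlap counts as a match is
an assumption on my part:

```python
import re
from typing import FrozenSet, NamedTuple


class VfpSpec(NamedTuple):
    name: str
    inputs: FrozenSet[str]
    outputs: FrozenSet[str]


_SPEC_RE = re.compile(r"^\(([^()]*)\)([A-Za-z0-9_]+)\(([^()]*)\)$")


def _formats(text: str) -> FrozenSet[str]:
    """Split 'gzip,gz' into a set of format names."""
    return frozenset(part.strip() for part in text.split(",") if part.strip())


def parse_spec(text: str) -> VfpSpec:
    """Parse a string such as '(gzip,gz)gunzip(text,plain,none)'."""
    match = _SPEC_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not in (input)name(output) form: {text!r}")
    return VfpSpec(match.group(2), _formats(match.group(1)),
                   _formats(match.group(3)))


def can_follow(prev: VfpSpec, nxt: VfpSpec) -> bool:
    """Assumed rule: nxt fits after prev if any output format is accepted."""
    return bool(prev.outputs & nxt.inputs)
```

With this, gunzip followed by esi matches on text, while esi followed by
gzip does not, since esitext is not one of gzip's inputs.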

Also, when a VFP is successfully added from the candidate list to the
actual chain, it is initialized. During that initialization, it can see
beresp and all the VFPs in front of it and the other candidates. It can
then add new VFPs to the candidate list, remove itself, remove other VFPs,
or delete itself or other VFPs. Orphaned VFPs get put back on the candidate
list.

So for example, anytime the builtin gunzip VFP is added, it will add gzip
as a STEVEDORE VFP candidate (unless a gunzip VFP is already there). This
means content will always maintain its encoding going to storage, but the
user can override.
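
A minimal sketch of that init hook, with made-up Vfp and Build types
standing in for whatever Varnish would really use. I am reading "unless a
gunzip VFP is already there" as "the user already queued gzip or gunzip at
STEVEDORE", which is an assumption:

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Vfp:
    name: str
    position: str = "LAST"
    on_init: Optional[Callable] = None   # called as on_init(build)


@dataclass
class Build:
    chain: List[Vfp] = field(default_factory=list)
    candidates: List[Vfp] = field(default_factory=list)

    def place(self, vfp: Vfp) -> None:
        """Move a VFP onto the chain, then run its init hook."""
        if vfp in self.candidates:
            self.candidates.remove(vfp)
        self.chain.append(vfp)
        if vfp.on_init is not None:
            vfp.on_init(self)


def gunzip_init(build: Build) -> None:
    """Builtin gunzip: queue gzip at STEVEDORE to keep the stored encoding."""
    overridden = any(v.position == "STEVEDORE" and v.name in ("gzip", "gunzip")
                     for v in build.candidates)
    if not overridden:
        build.candidates.append(Vfp("gzip", position="STEVEDORE"))
```

Placing Vfp("gunzip", on_init=gunzip_init) on a build whose candidates are
just myvfp leaves myvfp and a STEVEDORE gzip on the candidate list.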

Example:

    import myvfp;

    sub vcl_backend_response
    {
        add_vfp(myvfp.init());
        add_vfp(esi);
    }

So we start at beresp.http.Content-Encoding to figure out the output of
beresp. We can also optionally look at Content-Type. So in this example, we
have a gzip response:

    VFP chain: beresp(gzip)
    VFP candidates: (text)myvfp(text), (text)esi(esitext)
    VFP builtin: (gzip)gunzip(text), (text)gzip(gzip)

The algorithm for building the chain attempts to move candidates, in order,
from the candidate list to the actual chain by matching output to input.
There is some flexibility in that it can reorder the candidates if that
allows a match. FETCH and STEVEDORE need to always be first and last, if
possible. Finally, if it cannot match any more candidates, it then starts
considering the builtins and the process repeats until it's not possible to
add any more VFPs. This means it's possible some VFPs cannot be added if
their input cannot be generated from the beresp.
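
Here is a simplified sketch of that loop. It ignores positions and init
hooks, only ever inserts one builtin at a time, and all the names (Vfp,
vfp, build_chain) are made up for illustration, so treat it as one possible
reading rather than the algorithm:

```python
from typing import FrozenSet, List, NamedTuple, Tuple


class Vfp(NamedTuple):
    name: str
    inputs: FrozenSet[str]
    outputs: FrozenSet[str]


def vfp(name: str, inputs: str, outputs: str) -> Vfp:
    return Vfp(name, frozenset(inputs.split(",")), frozenset(outputs.split(",")))


def build_chain(beresp_format: str, candidates: List[Vfp],
                builtins: List[Vfp]) -> Tuple[List[Vfp], List[Vfp]]:
    """Return (chain, candidates that could not be placed)."""
    chain: List[Vfp] = []
    pending = list(candidates)
    current = frozenset([beresp_format])
    while pending:
        # 1. First candidate, in list order, that accepts the current output.
        #    Skipping over earlier non-matching ones is the "reordering".
        nxt = next((c for c in pending if c.inputs & current), None)
        if nxt is not None:
            pending.remove(nxt)
        else:
            # 2. Otherwise a builtin that accepts the current output and
            #    whose own output unblocks at least one candidate.
            nxt = next((b for b in builtins
                        if b.inputs & current
                        and any(c.inputs & b.outputs for c in pending)), None)
            if nxt is None:
                break  # nothing fits; the rest stays unplaced
        chain.append(nxt)
        current = nxt.outputs
    return chain, pending
```

Every builtin that gets inserted is followed by a candidate on the next
pass, so the loop always terminates.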

So the above example:

    VFP chain: beresp(gzip)
    VFP candidates: (text)myvfp(text), (text)esi(esitext)
    VFP builtin: (gzip)gunzip(text), (text)gzip(gzip)

Neither myvfp nor esi can be placed since they do not match gzip. Varnish
then goes thru the builtins and it finds gunzip will allow a match to
happen and adds it:

    VFP chain: beresp(gzip) > (gzip)gunzip(text)
    VFP candidates: (text)myvfp(text), (text)esi(esitext)
    VFP builtin: (gzip)gunzip(text), (text)gzip(gzip)

When gunzip gets initialized, it will add gzip to a stevedore position:

    VFP chain: beresp(gzip) > (gzip)gunzip(text)
    VFP candidates: (text)myvfp(text), (text)esi(esitext),
        STEVEDORE:(text)gzip(gzip)
    VFP builtin: (gzip)gunzip(text), (text)gzip(gzip)

Next, myvfp and esi are added since their inputs now match the output of
the chain, giving us the final configuration:

    VFP chain: beresp(gzip) > (gzip)gunzip(text) > (text)myvfp(text) >
        (text)esi(esitext)
    VFP candidates: STEVEDORE:(text)gzip(gzip)
    VFP builtin: (gzip)gunzip(text), (text)gzip(gzip)

gzip cannot be used since esi outputs a special text format, esitext, which
prevents any further processing. ESI could have had a little bit of
intelligence as it knows it has a gzip counterpart. It could have seen that
a gzip output VFP is in the candidate list, deleted itself, and added
esi_gzip back to the candidates. This would have given us:

    VFP chain: beresp(gzip) > (gzip)gunzip(text) > (text)myvfp(text) >
        (text)esi_gzip(gzip)
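
That bit of ESI intelligence could look roughly like this. The dict layout
and the esi_init hook are purely illustrative, not how the existing ESI
code is structured:

```python
ESI = {"name": "esi", "inputs": {"text"}, "outputs": {"esitext"}}
ESI_GZIP = {"name": "esi_gzip", "inputs": {"text"}, "outputs": {"gzip"}}


def esi_init(chain, candidates):
    """Hypothetical init hook, run right after esi was appended to chain."""
    # If some candidate wants to produce gzip, plain esi would block it,
    # because nothing can be chained after esitext.
    if any("gzip" in c["outputs"] for c in candidates):
        chain.pop()                     # esi deletes itself from the chain
        candidates.insert(0, ESI_GZIP)  # and queues its gzip counterpart
```

The builder would then pick esi_gzip up from the candidate list on its next
pass, since its input still matches the text coming out of myvfp.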

Brotli example

Let's say we have vmod brotli and it has these VFPs:

    (text)brotli(brotli,br)
    (brotli,br)unbrotli(text)

Also, during init, these two VFPs are added to the builtins. So now Varnish
has these builtins:

    VFP builtin: (gzip)gunzip(text), (text)gzip(gzip),
        (text)brotli(brotli,br), (brotli,br)unbrotli(text)

Varnish can use these anywhere to make the VFP chain work. So in the
previous example (minus esi), we could still get our VFPs working when the
beresp is brotli. unbrotli will queue brotli at the STEVEDORE position, so
content will go into cache as brotli and our VFPs still get text:

    VFP chain: beresp(br) > (br)unbrotli(text) > (text)myvfp(text) >
        (text)brotli(br)

Transient buffer example

We could build a theoretical transient VFP vmod which buffers the VFP input
and passes it on in transient storage as one large contiguous buffer. It
would look like:

    (text)buffer(buffertext)

And this would be added as a builtin. We could then have a regex
substitution vmod like this:

    (buffertext)regex(text)

And our VCL would look like:

    sub vcl_backend_response
    {
        add_vfp(regex.vfp());
        regex.add("<title>.*</title>", "<title>new title</title>");
        regex.add("host", "newhost");
    }

This will give us:

    VFP chain: beresp(gzip)
    VFP candidates: (buffertext)regex(text)
    VFP builtin: (gzip)gunzip(text), (text)gzip(gzip), (text)brotli(br),
        (br)unbrotli(text), (text)buffer(buffertext)

Since regex cannot be placed on gzip, we find the gunzip > buffer
combination gives us what we need. gunzip adds gzip and we end up with this:

    VFP chain: beresp(gzip) > (gzip)gunzip(text) > (text)buffer(buffertext)
        > (buffertext)regex(text) > (text)gzip(gzip)
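
Finding a combination like gunzip > buffer means searching over the
builtins rather than trying them one at a time. One way to do that,
sketched here as a breadth-first search with one format per side for
simplicity (find_bridge and the BUILTINS table are illustrative names, not
anything that exists in Varnish):

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

# name -> (input format, output format)
BUILTINS: Dict[str, Tuple[str, str]] = {
    "gunzip": ("gzip", "text"),
    "gzip": ("text", "gzip"),
    "brotli": ("text", "br"),
    "unbrotli": ("br", "text"),
    "buffer": ("text", "buffertext"),
}


def find_bridge(start: str, target: str,
                builtins: Dict[str, Tuple[str, str]]) -> Optional[List[str]]:
    """Shortest list of builtins turning format start into format target."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        fmt, path = queue.popleft()
        if fmt == target:
            return path
        for name, (src, dst) in builtins.items():
            if src == fmt and dst not in seen:
                seen.add(dst)            # never revisit a format: no loops
                queue.append((dst, path + [name]))
    return None                          # the target format is unreachable
```

For the case above, find_bridge("gzip", "buffertext", BUILTINS) gives
["gunzip", "buffer"], and a format nothing accepts, such as esitext, gives
None.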

Anyway, I could go on with all kinds of other cool examples, but hopefully
I got my idea across. Thank you for reading thru this long email!