wiki:PostTwoShoppingList

Version 79 (modified by phk, 4 years ago)

--

  1. Overview of Major Work Packages
    1. Persistent Storage
    2. Hash director
    3. DNS controlled director
    4. ESI 304 handling
    5. More ESI features
    6. Gzip support
    7. SSL support
    8. Varnishlog filtering language
    9. Dynamic stats counters
    10. Forced Purges
    11. Backend revalidation
    12. Streaming pass/fetch
    13. Range header support
    14. Etags header support
    15. File upload buffering
    16. DAV support
    17. WCCP support
    18. HTCP support
    19. Better broken backend handling
    20. VarnishNCSA improvements
    21. VCL cookie handling
    22. Large dataset performance improvements
    23. High traffic performance improvements
    24. Purge "nuke" option
    25. Forced grace
    26. Outgoing IP#
    27. Synthetic content
    28. Sticky director
    29. Synth content from file
  2. Minor changes
    1. Varnishstat
    2. VCL
      1. New variables
      2. Features
      3. A header-washing function
      4. VCL generation
      5. Generating VCL
    3. Varnishstat
    4. TCP_DEFER_ACCEPT
  3. Detailed descriptions
    1. 3. DNS controlled director
      1. Architect's comments
  4. Random notebook
    1. Priority director
  5. Protocols
    1. HTTP
  6. Performance
    1. Small object handling
    2. Disable SHM log
    3. Improve large object handling
    4. Performance tuning for large data sets
  7. Bugs and misfeatures
    1. Object WS overflow
    2. Session WS info
    3. Thread Pool Sizing
  8. Feature requests
    1. X-sendfile header
    2. Restarts
    3. TAR-file storage
    4. Log free workspace
    5. Handle "out of workspace" more gracefully
    6. Pipeline on backend connections (optional)
    7. Put out a warning if we have many sessions parked on a busy object
    8. Put a limit on the number of sessions parked on a busy object
    9. Expiry randomization
    10. XID's for backend, using XID instead of fd in shmlog
    11. Logging from VCL
  9. Testing
  10. Backend connection management

Overview of Major Work Packages

Please keep these descriptions short; long explanations go in the detail section below.

1. Persistent Storage

Already agreed to.

2. Hash director

Select a backend based on hash string, so that the load gets distributed evenly and consistently over backends.

This would make two-level varnish setups easier to implement.
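
One way to get a distribution that is both even and consistent is rendezvous (highest-random-weight) hashing. The sketch below only illustrates the idea and is not a proposed Varnish implementation; the function name pick_backend, the use of SHA-256 and the backend names are assumptions made for this example.

```python
import hashlib


def pick_backend(hash_string, backends):
    """Return the backend with the highest hash score for this hash string."""
    def score(backend):
        # Mix the backend name with the hash string so each backend
        # gets its own pseudo-random score for the same object.
        digest = hashlib.sha256((backend + "|" + hash_string).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    return max(backends, key=score)


# Hypothetical second-level caches.
backends = ["cache1", "cache2", "cache3"]
choice = pick_backend("/index.html", backends)
```

With this scheme, removing a backend only moves the objects that were assigned to that backend; all other objects keep their assignment.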

3. DNS controlled director

A director which chooses a backend by looking up the target server (i.e. the Host: header) under a subdomain.

4. ESI 304 handling

Make If-Modified-Since operate on all ESI included objects instead of only on the root object. (param option)

5. More ESI features

Cookies, conditionals, onerror, timeout etc. (please tell us which!)

6. Gzip support

Ability to compress objects. (vcl option). Ability to ESI process compressed objects.

See discussion in ticket #42. See also #352 for ESI issues.

7. SSL support

Ability to do HTTPS

8. Varnishlog filtering language

Aim for the kind of filtering flexibility tcpdump gives:

varnishlog txstatus != 200 and txstatus != 302 and not rxurl ~ ".(png|jpg|css|gif)"

This should also apply to varnishncsa, varnishtop and varnishhist.
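
To make the intended semantics concrete, here is the example filter above written out as a Python predicate over log records. The record layout (a dict with txstatus and rxurl keys) is an assumption made for this sketch, and the dot in the regexp is escaped here on the assumption that a literal "." before the file extension was meant.

```python
import re

# Assumed intent of the regexp in the example: URLs with a static-asset extension.
STATIC_ASSET = re.compile(r"\.(png|jpg|css|gif)")


def matches(record):
    """Python equivalent of the example filter for one log record (a dict)."""
    return (record["txstatus"] != 200
            and record["txstatus"] != 302
            and not STATIC_ASSET.search(record["rxurl"]))


# Made-up records, only to exercise the predicate.
records = [
    {"txstatus": 200, "rxurl": "/index.html"},
    {"txstatus": 404, "rxurl": "/missing.html"},
    {"txstatus": 404, "rxurl": "/logo.png"},
    {"txstatus": 302, "rxurl": "/old"},
]
selected = [r["rxurl"] for r in records if matches(r)]
```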

9. Dynamic stats counters

Right now we only have compile-time statistics counters, which means that we have no per-backend counters.

In theory, the relevant statistics can be pulled out of the shmlog in real-time, but having real statistics counters would probably be a good idea too.

10. Forced Purges

Purges are only processed when the relevant object is hit.

In heavy purge environments, that can lead to multiple, even many purge records for the same regexp.

Parking a thread on the purge list to actively seek out and evict purged objects would shorten the list considerably. (param option)

Update: We have added "purge_dups", which runs through the purge list and marks any identical purges as "gone". This may solve the issue.
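
The duplicate-marking idea can be sketched as follows. This is an illustration only, not the actual purge_dups code; the list layout (newest purge first, each entry a dict with "regexp" and "gone" keys) and the choice to keep the newest of several identical purges are assumptions of this example.

```python
def mark_duplicate_purges(purge_list):
    """Mark older purges with an identical regexp as gone.

    purge_list is assumed to be ordered newest first.
    """
    seen = set()
    for purge in purge_list:
        if purge["gone"]:
            continue
        if purge["regexp"] in seen:
            # A newer purge with the same regexp already covers this one.
            purge["gone"] = True
        else:
            seen.add(purge["regexp"])
    return purge_list


purges = [
    {"regexp": "^/news", "gone": False},
    {"regexp": "^/img", "gone": False},
    {"regexp": "^/news", "gone": False},
]
mark_duplicate_purges(purges)
```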

11. Backend revalidation

Support using conditional GETs to revalidate objects with the backend. (vcl option)

12. Streaming pass/fetch

Start delivering the object to the client as soon as it arrives from the backend. (vcl option)

This is much closer now that vcl_fetch is moved between the header and body.

Things we can do:

Pipelined pass (with a configurable limit on buffer size). This would also solve enormous-object issues, provided the size is declared in the Content-Length: header (see also #503).

We could also allow pipe at vcl_fetch time now (video streaming)

Force all transient objects to -smalloc (pipe, pass)

Select stevedore per object (TTL stratification for -spersistence)

13. Range header support

14. Etags header support

15. File upload buffering

Receive the entire body from the client before bothering the backend. (vcl option)

16. DAV support

Not analyzed, so I have no idea what sort of scope this is. (possibly param option)

 http://ftp.ics.uci.edu/pub/ietf/webdav/

17. WCCP support

Not analyzed.

 http://ftp.ipsyn.net/pub/mirrors/cisco/public/cons/isp/documents/WCCP_Presentation-1up.pdf

 http://www.wrec.org/Drafts/draft-wilson-wrec-wccp-v2-00.txt

18. HTCP support

Not analyzed.

Maybe only HTCP::CLR necessary.

See RFC 2756 (relevant for MediaWiki ?)

19. Better broken backend handling

Presently Varnish more or less assumes that the backends work. More paranoia could include:

  • Timing out unresponsive backend connections faster.
  • Pooling slow backend connections in event-driven threads.
  • Verifying the data sent from the backend, with better error handling. Currently the Varnish child dies if the backend response starts with "200 OK" instead of e.g. "HTTP/1.1 200 OK".
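
As an illustration of the last point, a defensive check of the backend status line could look roughly like the sketch below. The function name parse_status_line and the exact regular expression are assumptions for this example, and the line is assumed to have had its trailing CRLF stripped already.

```python
import re

# "HTTP/<digit>.<digit> <3-digit status> <optional reason phrase>"
STATUS_LINE = re.compile(r"^HTTP/(\d)\.(\d) (\d{3})(?: (.*))?$")


def parse_status_line(line):
    """Return (status, reason) for a well-formed status line, else None."""
    match = STATUS_LINE.match(line)
    if match is None:
        # Malformed response: the caller can fail the fetch instead of crashing.
        return None
    return int(match.group(3)), match.group(4) or ""


good = parse_status_line("HTTP/1.1 200 OK")
bad = parse_status_line("200 OK")
```

Returning None for a malformed line lets the caller treat it like any other backend failure.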

20. VarnishNCSA improvements

Specification of custom formats (like Apache's % notation ?)

Multiple output files and a way to steer vhosts to them. (See also point 8 above).

21. VCL cookie handling

Cookies are special enough that a specific syntax extension is warranted.

Access to request cookies could for instance be req.cookie.USER

Access to object/response cookies something similar. (NB: Multiple Set-Cookie headers).

Cookie2 support ?
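
To show what something like req.cookie.USER would have to do under the hood, here is a small sketch using Python's standard http.cookies module. It only models the lookup of a single request cookie by name; the function name request_cookie is an assumption for this example and says nothing about the eventual VCL syntax.

```python
from http.cookies import SimpleCookie


def request_cookie(cookie_header, name):
    """Return the value of one cookie from a Cookie: header, or None."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    morsel = jar.get(name)
    return morsel.value if morsel is not None else None


user = request_cookie("USER=alice; lang=en", "USER")
```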

22. Large dataset performance improvements

Making Varnish run faster with large datasets (that do not fit RAM).

This will to a large extent be about optimizing storage access patterns, both for speed and compactness.

23. High traffic performance improvements

Making Varnish handle higher traffic levels.

This will be mostly about tuning the network/thread-pool/CPU side of varnish for higher req/s.

24. Purge "nuke" option

Ability to purge all "Vary" variants of an object in one go once you have a cache hit on one of them.

25. Forced grace

Make it possible to force grace from VCL, if a backend is down or giving the wrong reply (HTTP 503 etc.). See ticket #369.

26. Outgoing IP#

Add a backend-property for the outgoing IP# to use when connecting to that backend.

Does this make sense ? Routing is based on the routes, and the outgoing address should be the one that matches the route interface ?

27. Synthetic content

Make it possible to generate a synthetic response anywhere in VCL:

     sub vcl_pipe {
          if (req.url ~ "Open_pod_bay_door") {
               synth {
                     set resp.status = 400;
                     set resp.body = "I'm sorry Dave, I cannot do that";
                     if (client.ip ~ crew) {
                         set resp.http.set-cookie = "SOURCE=HAL9000";
                     }
               }
          }
     }

After synth, we always go to vcl_deliver{}.

This eliminates the "error" primitive and reserves vcl_error{} for internally generated errors (like 503).

28. Sticky director

We'd like to be able to have a sticky director. When varnish first starts, it'd look for any healthy backends, but would then remain with that backend until it became unhealthy. It'd then look for another healthy backend.

(see ticket #537)
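
The selection rule described above can be sketched as a small state machine. This is an illustration only; the class name StickyDirector and the is_healthy callback are assumptions for this example, standing in for whatever health probing the real director would use.

```python
class StickyDirector:
    """Stay on one backend until it becomes unhealthy, then pick another."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.current = None

    def pick(self, is_healthy):
        # Keep the current backend for as long as it stays healthy.
        if self.current is not None and is_healthy(self.current):
            return self.current
        # Otherwise take the first healthy backend, or None if all are down.
        self.current = next((b for b in self.backends if is_healthy(b)), None)
        return self.current


health = {"a": True, "b": True}
director = StickyDirector(["a", "b"])
first = director.pick(health.get)
```

Note that the director does not move back to the original backend when it recovers; it only moves when the backend it is currently using fails.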

29. Synth content from file

See Ticket #587


Minor changes

Varnishstat

Show "avoided backend traffic" counter. (see ticket #302)

Show average service time. Squid has this in its SNMP agent.

VCL

New variables

Access to hostname in vcl_error message creation

client.bandwidth

Make req.* available in vcl_deliver (#246)

Features

String generality:

Access to all variables in string format

Make string concat work wherever a string is called for (#216)

CDB file access (#530) - this idea was inspired by Apache's mod_rewrite access to BDB files. Similar to the GeoIP example, a CDB file could be used for rewriting url requests, or specifying backends:

If the host header, concatenated with "nocache", is present in the cdb file, then pass.

    if (req.http.host "nocache" == cdb.file) {
        pass;
    }

or

If the cdb file contains a key for the host header and its value is equal to nocache, then pass.

    if (cdb.file(req.http.host) == "nocache") {
        pass;
    }

A header-washing function

"delete all headers but these" (See ticket #204)

VCL generation

Generating VCL

Just like we have C{...}C for embedding inline C-code in VCL, we could add !{...}! for inlining shell-scripts.

This would allow ACLs or lists of backends to be pulled out of databases or other files at compile time.

The commands would run with manager process credentials.

Varnishstat

Make it possible to reset varnishstat counters.

TCP_DEFER_ACCEPT

We use Accept Filters on FreeBSD. Linux has something similar, but more primitive, called TCP_DEFER_ACCEPT. We should take a look at enabling that for better performance.


Detailed descriptions

3. DNS controlled director

Running with dynamically assigned backend servers (via DNS) is currently not supported. We need a way to dynamically select a backend server, based on the hostname of the request.

For example, consider a request for "foo.com" reaching Varnish. Varnish will then check an internal DNS record for "foo.com" (perhaps with a pseudo-TLD appended, as in "foo.com.int.tld", to avoid confusion with external DNS), determining through e.g. a CNAME that it should go to backend server "backend02.int.tld".

This will make it easy to support failovers for backend servers via DNS, and to manage large numbers of served domains going to several backends without having to configure a large number of static rules in VCL.

Architect's comments

Truly dynamic backends would be a lot of work, but maybe it is enough with a "dns-lookup director" ?

You would still have to define all the physical backends in VCL, but the director would use DNS to choose, something like:

director foo Adns {
    .dnssuff = ".int.tld";
    {
        { .host = "192.168.0.1"; }
        { .host = "192.168.0.2"; }
        ...
    }
}

One trouble spot here is that looking domains up through getaddrinfo(3) does not return the DNS TTL information, so in theory we are required to do the lookup for every single transaction, leaving caching of the results to the DNS implementation.

(with respect to having to list all the backends, see later in this file about generated VCL)
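
The lookup logic of such a director can be sketched as follows. This is an illustration under stated assumptions, not a design: the function names, the injectable resolve parameter and the fake resolver are all made up for this example, and the Host: header handling ignores IPv6 literals. The default resolver uses socket.getaddrinfo, which, as noted above, gives no access to the DNS TTL.

```python
import socket


def resolve_addresses(name):
    """Resolve a name to a set of IP address strings (no TTL is available)."""
    infos = socket.getaddrinfo(name, 80, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}


def pick_backend_by_dns(host_header, suffix, backends, resolve=resolve_addresses):
    """Choose the configured backend whose address the suffixed name resolves to."""
    # Drop an optional ":port" and append the internal suffix, e.g. ".int.tld".
    lookup_name = host_header.split(":")[0].lower() + suffix
    addresses = resolve(lookup_name)
    for backend in backends:
        if backend in addresses:
            return backend
    return None


def fake_resolve(name):
    # Stand-in for the internal DNS zone, so the example needs no network.
    table = {"foo.com.int.tld": {"192.168.0.2"}}
    return table.get(name, set())


chosen = pick_backend_by_dns("foo.com", ".int.tld",
                             ["192.168.0.1", "192.168.0.2"],
                             resolve=fake_resolve)
```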


Random notebook

Priority director


Protocols

HTTP

We may be too aggressive when closing TCP connections; more shutdown(2) calls may be a good idea.


Performance

Small object handling

Varnish was not prepared for websites that have only small objects, and as a result the VM overhead for small objects is excessive. Allocating the object structure and sufficient space for small objects may make sense to avoid this.

Disable SHM log

Consider making it possible to disable the shm log (run-time) for performance reasons.

Improve large object handling

It is silly to receive a gigabyte-sized object into VM before starting transmission to the client, in particular for pass.

Trouble is, if we don't know the length up front and the client does not handle HTTP/1.1, we cannot use chunked encoding. In that case we fall back to "close when done" transmission.

Moving vcl_fetch up to before body reception could offer greater flexibility here.
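
The choice of body framing discussed above boils down to a small decision function. The sketch below is only an illustration; the function name and the return values are assumptions made for this example.

```python
def choose_body_framing(content_length, client_http_version):
    """Decide how to delimit a streamed body for the client.

    content_length is None when the length is not known up front;
    client_http_version is a (major, minor) tuple.
    """
    if content_length is not None:
        return "content-length"
    if client_http_version >= (1, 1):
        # HTTP/1.1 clients understand chunked transfer encoding.
        return "chunked"
    # Unknown length and an HTTP/1.0 client: close the connection when done.
    return "close"


framing = choose_body_framing(None, (1, 0))
```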

Performance tuning for large data sets

Running with large data sets (1 million objects and beyond), Varnish frequently gets high load/context switch peaks. This needs to be addressed.

The current hashing algorithm does not scale gracefully. Investigate better structures like patricia trees. This is somewhat related to persistent storage through the multi-attempt hash requirement.


Bugs and misfeatures

Object WS overflow

Objects overflowing their workspace cause trouble; bugs happen when an odd object exceeds the limit (#228). It would be useful to easily log objects (HTTP headers and data) that were too big, as they might be backend bogosities to be dealt with.

Session WS info

As http_workspace was split up in object and session workspace, there is a need for logging information about sessions and how much space they actually need/how much is free. A varnishstat counter for how many sessions overflowed (like n_objoverflow, but for sessions) is also needed.

Thread Pool Sizing

The current thread pool sizing is still too aggressive: it creates too many threads too fast. A more adaptive decision algorithm is necessary.


Feature requests

X-sendfile header

Tells Varnish to send a named file as response. The file would live on the varnish host.

 http://blog.lighttpd.net/articles/2006/07/02/x-sendfile

Restarts

Purge & compress the session workspace when we restart. The hash is recomputed on restart and puts a lot of stress on the session workspace. It is not obvious that spending time compressing the workspace is always the best choice performance-wise: even at 100 threads, a 64k session workspace is just 6.4MB of RAM.

TAR-file storage

This idea has kicked around since the project's beginning: have a storage method that mmaps a tar file for emergency/static content. The real question is how and what to put into the hash. In reality, it is not a storage module we're talking about, but a pseudo-backend. Varnish is not really geared for that right now.

Log free workspace

Handle "out of workspace" more gracefully

Pipeline on backend connections (optional)

Put out a warning if we have many sessions parked on a busy object

Put a limit on the number of sessions parked on a busy object

Expiry randomization

When a Varnish host starts, it will pick up a lot of content fast. Since most sites have only a few standard expiry times, for instance one week, a lot of objects will expire at the same time one week later. The attachment lemming.png (link below) shows that this effect is very much relevant, and that even after the third cycle the inherent randomization is not enough to smooth things out.

It may make sense to reduce the TTL by up to some randomized percentage, to help spread the expiries out more.
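
A minimal sketch of such a TTL reduction follows. The function name jittered_ttl, the 10% default and the uniform distribution are assumptions made for this example; the text above does not specify any of them.

```python
import random


def jittered_ttl(ttl, max_reduction=0.10, rng=random):
    """Reduce ttl by a random fraction between 0 and max_reduction."""
    return ttl * (1.0 - rng.uniform(0.0, max_reduction))


one_week = 7 * 24 * 3600
# A seeded generator keeps the example reproducible.
ttl = jittered_ttl(one_week, rng=random.Random(42))
```

Only ever reducing the TTL, never extending it, means that no object is kept longer than the backend asked for.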

XID's for backend, using XID instead of fd in shmlog

See #224.

Logging from VCL

It should be possible to log to the shmlog and syslog from VCL.


Testing

We need to think about how to test varnishlog and friends in a sensible way.

---

CLI command to show actual argc/argv so -s and -h arguments can be examined post-start.

---

Backend connection management

From #553:

Perlbal has a great feature where it checks the backend with an 'OPTIONS *' request at the start of each backend keep-alive session. It'd be nice to have this in Varnish, too.

To make this reasonable, Perlbal also keeps reusing the same backend connection until it is disconnected, until a configurable number of requests have been done, or until there are more than N idle backend connections (in which case it will disconnect some).

It makes the "bad backend server" detection almost perfect and it makes "perfect load balancing" trivial -- just only allow CPUs * 1.5 simultaneous connections (or whatever) on each backend server and let Varnish try opening as many connections as it'd like.
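
The connection policy described above can be sketched as two small functions. This is an illustration only and not Perlbal's actual code; the function names and parameters are assumptions made for this example, and the 1.5 factor is taken from the text.

```python
def connection_limit(cpus, factor=1.5):
    """Maximum simultaneous connections to allow on one backend server."""
    return int(cpus * factor)


def should_close(requests_done, max_requests, idle_connections, max_idle):
    """Decide whether a keep-alive backend connection should be closed."""
    if requests_done >= max_requests:
        # The configurable per-connection request budget is used up.
        return True
    # Too many idle connections: shed some of them.
    return idle_connections > max_idle


limit = connection_limit(4)
```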

Attachments