<div dir="ltr"><div>Hi Poul-Henning,</div><div><br></div><div>Thank you for your answer!</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Jul 28, 2020 at 5:01 PM Poul-Henning Kamp &lt;<a href="mailto:phk@phk.freebsd.dk">phk@phk.freebsd.dk</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">--------<br>

Martin Grigorov writes:<br>

<br>

> Any feedback and ideas how to tweak it (VCL or even patches) are very<br>

> welcome!<br>

<br>

First you need to tweak your benchmark setup.<br>

<br>

   aarch64<br>

<br>

          Thread Stats   Avg      Stdev     Max   +/- Stdev<br>

            Latency   655.40us  798.70us  28.43ms   90.52%<br>

<br>

Strictly speaking, you cannot rule out that the ARM machine<br>

sends responses before it receives the request, because your<br>

standard deviation is larger than your average.<br></blockquote><div><br></div><div>Could you explain in which case(s) the server would send responses before receiving a request?</div><div>Do you think that some requests might have negative latency values?</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

In other words:  Those numbers tell us nothing.<br>

<br>

If you want to do this comparison, and I would love for you to do so,<br>

you really need to take the time it takes, and get your "noise" down.<br>

<br>

Here is how you should do it:<br>

<br>

        for machine in ARM, INTEL<br>

                Reboot machine<br>

                For i in (at least) 1-5:<br>

                        Run test for 5 minutes<br>

<br>

If the results from the first run on each machine are very different<br>

from the other four runs, you can disregard them as a startup/bootup<br>

artifact.<br>

<br>

Report the numbers for all the runs for both machines.<br>

<br>

Make a plot of all those numbers, where you plot the reported<br>

average +/- stddev as a line, and the max value as a dot/cross/box.<br>

<br>

If you want to get fancy, you can do a Student's T test to tell<br>

you if there is any real difference.  There's a program called<br>

"ministat" which will do this for you.<br></blockquote><div><br></div><div>ministat looks cool! Thanks!</div><div>I think I can save the raw latencies for all requests to a file and feed it to ministat!</div><div><br></div><div>Gil Tene also didn't like how wrk measures latency and forked it to <a href="https://github.com/giltene/wrk2">https://github.com/giltene/wrk2</a>. wrk2 measures latency at a constant request rate (throughput), while wrk pushes for the highest possible throughput and just reports the latency percentiles.</div><div>wrk2 also prints a detailed latency distribution, as shown at <a href="https://github.com/giltene/wrk2#basic-usage">https://github.com/giltene/wrk2#basic-usage</a> (not as a plot, but still useful).</div><div><br></div><div>The only problem is that wrk2 is not well maintained and doesn't work on modern aarch64 due to its old version of Lua. I'll try to upgrade it.</div><div> </div><div>Regards,</div><div>Martin</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

Also:  I can highly recommend this book:<br>

<br>

   <a href="http://www.larrygonick.com/titles/science/the-cartoon-guide-to-statistics/" rel="noreferrer" target="_blank">http://www.larrygonick.com/titles/science/the-cartoon-guide-to-statistics/</a><br>

<br>

-- <br>

Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20<br>

phk@FreeBSD.ORG         | TCP/IP since RFC 956<br>

FreeBSD committer       | BSD since 4.3-tahoe    <br>

Never attribute to malice what can adequately be explained by incompetence.<br>

</blockquote></div></div>
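<div>The two statistical points in the quoted reply (a standard deviation larger than the average, and a Student's t test between the two machines) can be illustrated with a minimal Python sketch. It is not how ministat or wrk compute their output; the function names <code>spread_exceeds_mean</code> and <code>student_t</code> are hypothetical, and the sketch assumes each machine contributes one list of per-run average latencies.</div>
<pre>
```python
import math
import statistics


def spread_exceeds_mean(mean, stdev):
    """True when stdev is larger than the mean, so mean - stdev is negative.

    For a latency this means the one-sigma band around the average
    reaches below zero, which is the situation described in the reply.
    """
    return stdev > mean


def student_t(a, b):
    """Pooled-variance (Student's) t statistic and degrees of freedom.

    a and b are two lists of measurements, e.g. the per-run average
    latencies from each machine (an assumption of this sketch).
    Each list needs at least two values.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t = (statistics.mean(a) - statistics.mean(b)) / std_err
    return t, df
```
</pre>
<div>With the aarch64 figures quoted above, <code>spread_exceeds_mean(655.40, 798.70)</code> is true, since 655.40us - 798.70us = -143.30us. For the t test, made-up inputs such as <code>student_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])</code> give a t of about -1.0 with 8 degrees of freedom; the two-sided 95% critical value for 8 degrees of freedom is roughly 2.306, so a difference that small would not count as real.</div>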