All graphs show three curves: in red, the load as computed by the kernel, which probes once every 5.01 sec; in blue, the load as computed by the same algorithm when probing once every 4.61 sec; and in green, the "real" load in between probes, based on checking for runnable tasks every millisecond.
They were created using loadavg.c and this shell script loadavg-rrd.
First, some artificial load, with the first 30 ticks in each second having load 1 (./loadavg-rrd -l30):
Next, with only the first 4 ticks in each second having load 1 (./loadavg-rrd -l4):
An hour of real load from a server which is mostly idle, but gets regular probes from two load balancers via SSL:
Here the kernel load is even worse than with artificial data of the same average load 0.04. Although loadavg.c was running during this hour and may occasionally have been seen by the kernel load probe, the same pattern shows up throughout the day (the detailed hour above is the last one in this graph):
It's perfectly ok for the load to jump by around .08 when it sees a runnable process. That's due to the algorithm that computes the new value for the "1 minute average" as 8%*current + 92%*old (92% ~ 1884/2048; 1884 is EXP_1 from include/linux/sched.h).
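The arithmetic behind that jump can be sketched in user space. This is only an illustration: FSHIFT, FIXED_1 and EXP_1 are the values from include/linux/sched.h, the update is written the way the kernel's CALC_LOAD macro is assumed to work (decay the old value, add the weighted current value, shift back), and the function name `calc_load_step` is made up for this example.

```c
#define FSHIFT   11                 /* bits of fixed-point precision */
#define FIXED_1  (1UL << FSHIFT)    /* 1.0 as fixed-point (2048) */
#define EXP_1    1884UL             /* ~92% of 2048: decay for the "1 minute" average */

/*
 * Hypothetical helper: one probe of the "1 minute" average.
 *   load   - previous average, fixed-point
 *   active - runnable tasks seen by this probe, fixed-point
 *            (one task == FIXED_1)
 */
unsigned long calc_load_step(unsigned long load, unsigned long active)
{
    load *= EXP_1;                       /* ~92% * old     */
    load += active * (FIXED_1 - EXP_1);  /* ~8%  * current */
    return load >> FSHIFT;               /* back to fixed-point scale */
}
```

Starting from load 0, a single probe that sees one runnable task gives `calc_load_step(0, FIXED_1) == 164`, and 164/2048 is about 0.08. A following probe that sees nothing decays this to 150, about 0.07.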
It's also ok to sit at load 0.00 when it's really 0.04. If the chance of seeing no runnable task is 96%, the odds for 12 probes of 0 in a row are still 61%.
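The 61% figure can be checked with a few lines of C. This sketch assumes that the probes are independent of each other, which is the idealised case; the function name `prob_all_idle` is hypothetical.

```c
/*
 * Hypothetical helper: probability that `probes` independent probes
 * all see no runnable task, when each single probe sees none with
 * probability p_idle.
 */
double prob_all_idle(double p_idle, int probes)
{
    double p = 1.0;
    for (int i = 0; i < probes; i++)
        p *= p_idle;    /* multiply the per-probe chances */
    return p;
}
```

`prob_all_idle(0.96, 12)` comes out at roughly 0.613. Twelve probes at 5-second intervals are one minute's worth.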
You can see what's really going on in this detailed view of minutes 8-10 from the above hour, scaled by 100 so that 1 sec here is one tick. The full image is actually 12,000 pixels wide.
There is a load spanning a few ticks, and the time at which the kernel checks (i.e. the steps in the kernel loadavg) moves slowly through this load from left to right. That's exactly the Moiré effect. (Note that there is some delay in the kernel's update, so there are increases even a bit to the right of the load.)
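The drift can be reproduced with a toy model. The sketch below assumes HZ = 100, a load that is strictly periodic (busy for the first `busy` ticks of every second) and a probe that fires exactly every `interval` ticks with no update delay. None of this is kernel code, and the function names are made up for the example.

```c
/*
 * Hypothetical model: count how many of `probes` probes, taken every
 * `interval` ticks starting at tick 0, land in the busy part of a
 * load that is busy for the first `busy` ticks of every `period` ticks.
 */
int count_hits(int interval, int period, int busy, int probes)
{
    int hits = 0;
    for (int k = 0; k < probes; k++) {
        int phase = (int)(((long)k * interval) % period); /* position within the second */
        if (phase < busy)
            hits++;
    }
    return hits;
}

/* Same model, but return the longest run of consecutive probes that hit. */
int longest_hit_run(int interval, int period, int busy, int probes)
{
    int longest = 0, run = 0;
    for (int k = 0; k < probes; k++) {
        int phase = (int)(((long)k * interval) % period);
        if (phase < busy) {
            run++;
            if (run > longest)
                longest = run;
        } else {
            run = 0;    /* a miss ends the current run */
        }
    }
    return longest;
}
```

With 4 busy ticks per second, both intervals hit the load 4 times in 100 probes. An interval of 501 ticks advances the phase by one tick per probe, so the 4 hits come directly after each other, followed by 96 misses. An interval of 461 ticks advances the phase by 61 ticks per probe, so no two consecutive probes hit.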
Finally, here is the patch. Adjusting the exponents is not really important: even without that change, the "1-min" average still covers more or less 1 minute. It's not exact anyway.
--- a/include/linux/sched.h 2010-12-09 22:29:45.000000000 +0100
+++ b/include/linux/sched.h 2011-11-16 11:49:55.000000000 +0100
@@ -123,10 +123,10 @@
 #define FSHIFT 11 /* nr of bits of precision */
 #define FIXED_1 (1<<FSHIFT) /* 1.0 as fixed-point */
-#define LOAD_FREQ (5*HZ+1) /* 5 sec intervals */
-#define EXP_1 1884 /* 1/exp(5sec/1min) as fixed-point */
-#define EXP_5 2014 /* 1/exp(5sec/5min) */
-#define EXP_15 2037 /* 1/exp(5sec/15min) */
+#define LOAD_FREQ (4*HZ+61) /* 4.61 sec intervals */
+#define EXP_1 1896 /* 1/exp(4.61sec/1min) as fixed-point */
+#define EXP_5 2017 /* 1/exp(4.61sec/5min) */
+#define EXP_15 2038 /* 1/exp(4.61sec/15min) */
 #define CALC_LOAD(load,exp,n) \
 load *= exp; \
Mingye Wang suggested the following improvement:

4*HZ+61 comes with an assumption that HZ is 100, making it unsuitable for kernels built using other configurations. I recommend replacing the expression with just (60*HZ/13).

* For HZ = 100, it still gives 461 ticks (4.61s).
* For the common HZ = 250, it gives 1153 ticks (4.612s).
* For HZ = 300, it gives 1384 ticks (4.6133s).
* For HZ = 1000, it gives 4615 ticks (4.615s).

The exponents shouldn't need to be changed.

PS: Overflow with 60*HZ should not be a concern since Linux does not work with 16-bit platforms. If it magically ends up there, use 4*HZ+3*HZ/5+HZ/65. The two are equivalent for all HZ divisible by 5, and compilers evaluate this constant expression anyways.

Thanks a lot!
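The tick counts quoted above are easy to verify with integer arithmetic. In this small sketch HZ is passed as a parameter instead of being a kernel configuration constant, and the two function names are hypothetical.

```c
/* Hypothetical helper: the suggested expression, 60*HZ/13. */
int load_freq_simple(int hz)
{
    return 60 * hz / 13;
}

/* Hypothetical helper: the overflow-safe variant, 4*HZ+3*HZ/5+HZ/65. */
int load_freq_split(int hz)
{
    return 4 * hz + 3 * hz / 5 + hz / 65;
}
```

For HZ = 100, 250, 300 and 1000, both functions return 461, 1153, 1384 and 4615 ticks respectively, matching the list.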