[kwlug-disc] Load average record?

Khalid Baheyeldin kb at 2bits.com
Tue Jan 16 14:33:57 EST 2024


Has anyone seen a load average in the thousands?
I had never seen that, until today ...
The load average on a server was over 4000!

Turns out it was a backup disk that failed, and tar was backing up a
huge directory to it.
The backup disk also held the Varnish cache, and it was being hammered
by requests.
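
For context: on Linux, the load average counts tasks in uninterruptible sleep (state D, typically blocked on I/O) as well as runnable ones, so processes stuck on a dead disk push it up without using any CPU. The following is a minimal sketch, not something run on the server in question, that tallies process states from /proc. The function names are made up for illustration, and it counts processes rather than individual threads, so it only approximates what the kernel counts.

```python
import os
from collections import Counter


def parse_state(stat_line: str) -> str:
    """Return the one-letter state from a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the LAST ')' rather than on spaces.
    """
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def count_states(proc_root: str = "/proc") -> Counter:
    """Count processes by state; empty Counter if proc_root is unreadable."""
    counts = Counter()
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return counts
    for entry in entries:
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                counts[parse_state(f.read())] += 1
        except (OSError, IndexError):
            # The process exited mid-read, or the line was malformed.
            continue
    return counts


if __name__ == "__main__":
    states = count_states()
    print("runnable (R):", states["R"])
    print("uninterruptible (D):", states["D"])
```

A large D count alongside a mostly idle CPU would be consistent with the picture described here: many tasks waiting on a disk that never answers.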

This is right after telling Varnish to ignore the faulty disk.
Note that the 15-minute load average was still about 2500 several minutes
after the bad disk was bypassed.
Wait for I/O was 15%.

    ---- --total-cpu-usage-- ---load-avg--- ---procs--- ---system-- -dsk/total-
Time    |usr sys idl wai stl| 1m   5m  15m |run blk new| int   csw | read  writ|
10:09:21| 10   3  73  15   0|20.4  935 2510|7.0 8.0 0.6|  20k   33k| 628k   11M|
10:09:26| 15   3  69  13   0|20.3  920 2497|6.0 8.0   0|  24k   42k| 258k   12M|
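
The lag in the 15-minute figure above follows from how the load averages are computed: each is an exponentially damped average, with time constants of 1, 5 and 15 minutes. The sketch below uses the continuous-time form of that damping (the kernel actually updates in fixed-point steps every few seconds, so this is an approximation). The starting value of 4000 and the steady-state value of 15 are assumptions taken loosely from the numbers in this message, not measurements.

```python
import math


def decayed_load(start: float, active: float, seconds: float,
                 window_minutes: float) -> float:
    """Load average after `seconds`, if the number of active tasks
    drops to `active` and stays there (continuous-time approximation)."""
    tau = window_minutes * 60.0
    return active + (start - active) * math.exp(-seconds / tau)


def seconds_to_reach(start: float, active: float, target: float,
                     window_minutes: float) -> float:
    """Inverse of decayed_load: time needed to fall from start to target."""
    tau = window_minutes * 60.0
    return tau * math.log((start - active) / (target - active))


if __name__ == "__main__":
    START = 4000.0   # assumed peak, from "over 4000"
    ACTIVE = 15.0    # assumed steady-state load after the disk was bypassed
    print("  t(s)      1m      5m     15m")
    for t in (0, 60, 300, 600, 900):
        row = [decayed_load(START, ACTIVE, t, w) for w in (1, 5, 15)]
        print(f"{t:6d} " + " ".join(f"{v:7.1f}" for v in row))
```

Under these assumptions the 1-minute average collapses almost immediately while the 15-minute average stays in the thousands for several minutes, which matches the general shape of the dstat lines above.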

And this is after things stabilized:
The load average is 15 (a bit higher than usual during peak days/hours).
Wait for I/O is 4%.

14:08:29| 16   3  78   4   0|15.1 15.5 15.6| 10 3.0 2.5|  24k   46k|  93k 5068k|
14:08:44| 17   3  77   4   0|15.2 15.5 15.6| 14 3.0 2.1|  22k   44k|  86k 4923k|

The failed disk was set up with mdadm as part of a RAID1 array, though,
and I cannot remove it from the array for some reason.
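
One common cause, which may or may not apply here, is that mdadm will not remove a member that is still marked active: it has to be marked failed first. The sketch below only builds and prints the two commands by default. The names /dev/md0 and /dev/sdb1 are placeholders, not the actual devices on this server, and the helper functions are hypothetical.

```python
import shlex
import subprocess


def removal_commands(array: str, member: str) -> list:
    """Build the fail-then-remove command pair for one array member."""
    return [
        ["mdadm", "--manage", array, "--fail", member],
        ["mdadm", "--manage", array, "--remove", member],
    ]


def run(commands: list, dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # needs root


if __name__ == "__main__":
    # Placeholder device names; substitute the real array and member.
    run(removal_commands("/dev/md0", "/dev/sdb1"))
```

If the device node has disappeared entirely, mdadm also accepts the keyword `detached` in place of a device name with `--remove`.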
-- 
Khalid M. Baheyeldin


