How to work around the nagios-plugins check_disk perfdata bug in CentOS
The check_disk monitoring plugin, installed from the nagios-plugins-disk package in EPEL, contains a bug: it emits invalid UOM (unit of measurement) values in its performance data.
Instead of the previously used byte units (B, MB, GB), the check_disk plugin reports values in MiB and GiB:
[root@centos7 ~]# /usr/lib/nagios/plugins/check_disk -w 15% -c 5% -W 10% -K 5% -p /
DISK OK - free space: / 8640 MiB (76.21% inode=95%);| /=2696MiB;10171;11368;0;11967
Compare this with check_disk on Ubuntu 18.04, installed from the monitoring-plugins package, and the difference in UOM becomes visible:
root@ubuntu1804:~# /usr/lib/nagios/plugins/check_disk -w 15% -c 5% -W 10% -K 5% -p /
DISK OK - free space: / 8128 MB (28% inode=95%);| /=20511MB;25592;28603;0;30109
But what's the problem?
Obviously, for alerting (whether you are using Nagios, Icinga, Naemon, checkMK or another monitoring core) this doesn't matter much. The thresholds are still compared correctly against the reported usage value. The problem shows up in the background, when the performance data (the information after the | (pipe) character) is parsed.
[2021-02-17 06:27:56 +0100] warning/InfluxdbWriter: Ignoring invalid perfdata for checkable 'centos7!Diskspace /' and command 'nrpe' with value: /=2696MiB;10171;11368;0;11967
Because the monitoring core (the log above is from Icinga 2) treats the performance data as invalid, the values are not written into the graph database (InfluxDB in this example). As a result, no graphs are created for this service check.
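To make the affected part of the output concrete, here is a minimal Python sketch that splits a plugin output line at the pipe and breaks a single performance data item into its fields. The function names and the regular expression are illustrative assumptions, not the parser of Icinga 2 or any other monitoring core, and the sketch only handles one unquoted item.

```python
import re

# One perfdata item: label=value[UOM];[warn];[crit];[min];[max]
# (illustrative pattern, handles a single unquoted label only)
PERF_ITEM = re.compile(
    r"^(?P<label>[^=]+)=(?P<value>-?[0-9.]+)(?P<uom>[^;0-9]*)"
    r"(?:;(?P<warn>[^;]*))?(?:;(?P<crit>[^;]*))?"
    r"(?:;(?P<min>[^;]*))?(?:;(?P<max>[^;]*))?$"
)


def split_output(line):
    """Split plugin output at the first pipe into (text, perfdata)."""
    text, _, perfdata = line.partition("|")
    return text.strip(), perfdata.strip()


def parse_item(item):
    """Return the fields of one perfdata item as a dict of strings."""
    match = PERF_ITEM.match(item)
    if match is None:
        raise ValueError("not a perfdata item: %r" % item)
    return match.groupdict()


line = ("DISK OK - free space: / 8640 MiB (76.21% inode=95%);"
        "| /=2696MiB;10171;11368;0;11967")
text, perfdata = split_output(line)
fields = parse_item(perfdata)
print(fields["label"], fields["value"], fields["uom"])
```

For the CentOS output shown earlier, the uom field comes out as MiB, which is the part the log message complains about.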
The Nagios Plugin Development Guidelines (which in general apply to monitoring plugins) define the performance data's UOM as:
UOM (unit of measurement) is a string of zero or more characters, NOT including numbers, semicolons, or quotes. Some examples:
1. no unit specified – assume a number (int or float) of things (eg, users, processes, load averages)
2. s – seconds (also us, ms)
3. % – percentage
4. B – bytes (also KB, MB, TB)
5. c – a continuous counter (such as bytes transmitted on an interface)
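The definition above can be read in two ways: as a purely syntactic rule (no digits, semicolons or quotes) or as a fixed list of accepted units. The following Python sketch checks both. The set of units is an assumption assembled from the examples in the guideline plus GB; which units a particular monitoring core actually accepts is not covered here.

```python
# Units taken from the guideline examples above. "GB" is an added
# assumption, since it is commonly seen next to KB, MB and TB.
LISTED_UOMS = {"", "s", "us", "ms", "%", "B", "KB", "MB", "GB", "TB", "c"}


def check_uom(uom):
    """Return (syntax_ok, is_listed) for a UOM string.

    syntax_ok: the string contains no digits, semicolons or quotes.
    is_listed: the string is one of the units in LISTED_UOMS.
    """
    syntax_ok = not any(ch.isdigit() or ch in ";'\"" for ch in uom)
    return syntax_ok, uom in LISTED_UOMS


for unit in ("MB", "MiB", "%"):
    print(unit, check_uom(unit))
```

Under these assumptions, MiB passes the syntactic rule but is not one of the listed units, so a consumer that only accepts a fixed list of units would reject it.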
The invalid UOM was introduced as a regression in nagios-plugins 2.3.0. According to the nagios-plugins release notes, it was fixed in 2.3.2, where the old default using binary units was applied again.
However, when manually compiling the latest nagios-plugins release (2.3.3), the check_disk plugin still shows the invalid UOM:
# ./check_disk -V
check_disk v2.3.3 (nagios-plugins 2.3.3)
# ./check_disk -w 15% -c 5% -W 10% -K 5% -p /
DISK OK - free space: / 8447 MiB (89.03% inode=95%);| /=1040MiB;8512;9514;0;10015
How to work around this?
The fastest workaround is the -m parameter, which forces check_disk to report its output in MB (megabytes):
# ./check_disk -w 15% -c 5% -W 10% -K 5% -p / -m
DISK OK - free space: / 8447 MB (89.03% inode=95%);| /=1040MB;8512;9514;0;10015
If you are using NRPE as the remote plugin execution daemon, simply append the -m parameter to the command definition:
command[check_disk]=/usr/lib64/nagios/plugins/check_disk -w $ARG1$ -c $ARG2$ -W $ARG1$ -K $ARG2$ -p $ARG3$ -m
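Where the command definition cannot be changed, the unit label could in principle be renamed after the fact. The Python sketch below is a hypothetical alternative, not part of nagios-plugins or NRPE: it only relabels the unit in the performance data and converts no values. This relies on the outputs above, where the numbers with and without -m are identical. The function name and the unit mapping are assumptions.

```python
import re

# Assumption: only the label is wrong, so units are renamed, not converted.
UNIT_MAP = {"KiB": "KB", "MiB": "MB", "GiB": "GB", "TiB": "TB"}

# A binary unit directly after the numeric value of a perfdata item.
VALUE_UNIT = re.compile(r"(=-?[0-9.]+)(KiB|MiB|GiB|TiB)(?=;|\s|$)")


def relabel_perfdata(line):
    """Rename binary units in the perfdata part; leave the text part as is."""
    text, sep, perfdata = line.partition("|")
    fixed = VALUE_UNIT.sub(
        lambda m: m.group(1) + UNIT_MAP[m.group(2)], perfdata
    )
    return text + sep + fixed


line = ("DISK OK - free space: / 8447 MiB (89.03% inode=95%);"
        "| /=1040MiB;8512;9514;0;10015")
print(relabel_perfdata(line))
```

The human-readable text before the pipe is deliberately left untouched, since only the performance data is parsed by the graphing backend.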