[Check_mk (english)] Check crash : netapp_api_vs_traffic

Jam Mulch spammagnet10 at gmail.com
Tue Jul 5 19:54:38 CEST 2016


Here's what I get when I run that command:

Traceback (most recent call last):
   File "/omd/sites/foo/share/check_mk/modules/check_mk.py", line 5288, in <module>
     exit_status = do_check(hostname, ipaddress, check_types)
   File "/omd/sites/foo/share/check_mk/modules/check_mk_base.py", line 1206, in do_check
     do_all_checks_on_host(hostname, ipaddress, only_check_types)
   File "/omd/sites/foo/share/check_mk/modules/check_mk_base.py", line 1468, in do_all_checks_on_host
     result = sanitize_check_result(check_function(item, params, info), check_uses_snmp(checkname))
   File "/omd/sites/foo/share/check_mk/modules/check_mk_base.py", line 1775, in sanitize_check_result
     return sanitize_yield_check_result(result, is_snmp)
   File "/omd/sites/foo/share/check_mk/modules/check_mk_base.py", line 1781, in sanitize_yield_check_result
     subresults = list(result)
   File "/omd/sites/foo/share/check_mk/checks/netapp_api_vs_traffic", line 87, in check_netapp_api_vs_traffic
     rate = get_rate(what, now, int(data[what]) * scale)
KeyError: 'nfsv4_read_ops'
OMD[oitprod2r]:~$


I believe this means that the (unspecified) element being checked is
missing the nfsv4_read_ops data. The KeyError comes from the data[what]
lookup, not from get_rate itself, so I would suggest wrapping the
get_rate call in exception handling that sets the rate to 0 on a
KeyError.

From:
         for what, perfname, perftext, scale, format_func in values:
             rate = get_rate(what, now, int(data[what]) * scale)
             yield 0, "%s %s: %s" % (protoname, perftext, format_func(rate)), [(perfname, rate)]

To:

         for what, perfname, perftext, scale, format_func in values:
             try:
                 rate = get_rate(what, now, int(data[what]) * scale)
             except KeyError:
                 rate = 0
             yield 0, "%s %s: %s" % (protoname, perftext, format_func(rate)), [(perfname, rate)]
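
An alternative to reporting a rate of 0 would be to skip counters the agent did not send, so that no misleading zero ends up in the performance data. The following is only a rough, self-contained sketch of that idea: get_rate here is a simplified stand-in I wrote for illustration (the real Check_MK function behaves differently on the first call), and check_counters and the sample data are hypothetical names, not part of the shipped check.

```python
_counters = {}  # hypothetical stand-in for Check_MK's counter store


def get_rate(name, now, value):
    """Simplified stand-in: per-second change since the previous call."""
    last = _counters.get(name)
    _counters[name] = (now, value)
    if last is None or now <= last[0]:
        return 0.0
    return (value - last[1]) / (now - last[0])


def check_counters(protoname, values, data, now):
    """Yield one result per counter, skipping counters missing from data."""
    for what, perfname, perftext, scale, format_func in values:
        raw = data.get(what)
        if raw is None:
            continue  # counter not delivered by the agent: no result, no crash
        rate = get_rate(what, now, int(raw) * scale)
        yield 0, "%s %s: %s" % (protoname, perftext, format_func(rate)), [(perfname, rate)]


nfsv4_values = [
    ("nfsv4_read_ops", "nfsv4_read_ios", "read OPs", 1, int),
    ("nfsv4_write_ops", "nfsv4_write_ios", "write OPs", 1, int),
]

# Sample agent data without nfsv4_read_ops, as in the traceback above
data = {"nfsv4_write_ops": "1200"}
results = list(check_counters("NFSv4", nfsv4_values, data, 1000.0))
```

With this sample data only the write OPs result is produced. Whether a missing counter should be skipped or shown as 0 is a matter of taste; skipping keeps the graphs honest, while 0 keeps the service output the same shape on every vserver.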


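To see which counters are missing for a given vserver before patching anything, something like the sketch below might help. It assumes the parsed agent data is a dict keyed by "protocol.item" whose values are dicts of counters, which is what the parsed.get(...) and data[what] lookups in the check suggest. EXPECTED covers only the NFS entries of the check's protocol_map, and missing_counters and the sample data are names I made up for illustration.

```python
# Counter names the check expects per protocol (NFS entries only)
EXPECTED = {
    "nfsv3": ["nfsv3_read_ops", "nfsv3_write_ops"],
    "nfsv4": ["nfsv4_read_ops", "nfsv4_write_ops"],
    "nfsv4_1": ["nfsv4_1_ops"],
}


def missing_counters(parsed, item):
    """Return {protocol: [missing counter names]} for one vserver item."""
    missing = {}
    for protocol, names in EXPECTED.items():
        data = parsed.get("%s.%s" % (protocol, item))
        if not data:
            continue  # protocol not reported at all; the check skips it too
        absent = [name for name in names if name not in data]
        if absent:
            missing[protocol] = absent
    return missing


# Hypothetical parsed data for a vserver called "vs1"
parsed = {
    "nfsv3.vs1": {"nfsv3_read_ops": "10", "nfsv3_write_ops": "20"},
    "nfsv4.vs1": {"nfsv4_write_ops": "5"},
}
report = missing_counters(parsed, "vs1")
```

For this sample, report names nfsv4_read_ops as the only missing counter, which is the same key the traceback complains about.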

On 07/05/2016 01:29 PM, Marcel Schulte wrote:
>
> Hi Kyâne,
>
> You can also check the output of "cmk --debug -npvvvv --checks 
> netapp_api_vs_traffic AFFECTEDHOSTNAME"; maybe that sheds some light 
> on the issue...
>
> Regards,
> Marcel
>
>
> Jam Mulch <spammagnet10 at gmail.com> wrote on Tue, 5 July 2016 19:10:
>
>     As I vaguely remember it, the netapp_api_vs_traffic check looks for
>     several pieces of data, and if any one is missing, the check
>     crashes (sent_errors, recv_errors, cifs_latency, avg_write_latency,
>     etc.).
>
>     I added exceptions back around 1.2.8p1 to return a 0 result for any
>     results that depended on missing data, but in later versions I just
>     Acknowledged any crashed Traffic checks (5 or so on one of my
>     clusters, and 2 on another).
>
>     To verify that this is your problem, look at the raw data from
>     agent_netapp and see if the volumes that are crashing are missing
>     one or more pieces of data that the ones which work are not
>     missing.
>
>     Here is what 1.2.8p5 expects for the various types:
>
>     def check_netapp_api_vs_traffic(item, _no_params, parsed):
>          protocol_map = {
>              "lif:vserver": ("Ethernet",
>                   # ( what                 perfname               perftext              scale        format_func)
>                  [  ("recv_data",          "if_in_octets",        "received data",      1,           get_bytes_human_readable),
>                     ("sent_data",          "if_out_octets",       "sent data",          1,           get_bytes_human_readable),
>                     ("recv_errors",        "if_in_errors",        "received errors",    1,           int),
>                     ("sent_errors",        "if_out_errors",       "sent errors",        1,           int),
>                     ("recv_packet",        "if_in_pkts",          "received packets",   1,           int),
>                     ("sent_packet",        "if_out_pkts",         "sent packets",       1,           int)]),
>
>              "fcp_lif:vserver": ("FCP",
>                  [  ("avg_read_latency",   "fcp_read_latency",    "avg. Read latency",  0.001,       lambda x: "%.2f ms" % (x * 1000)),
>                     ("avg_write_latency",  "fcp_write_latency",   "avg. Write latency", 0.001,       lambda x: "%.2f ms" % (x * 1000)),
>                     ("read_data",          "fcp_read_data",       "read data",          1,           get_bytes_human_readable),
>                     ("write_data",         "fcp_write_data",      "write data",         1,           get_bytes_human_readable)]),
>
>              "cifs:vserver": ("CIFS",
>                  [  ("cifs_read_latency",  "cifs_read_latency",   "read latency",       0.000000001, lambda x: "%.2f ms" % (x * 1000)),
>                     ("cifs_write_latency", "cifs_write_latency",  "write latency",      0.000000001, lambda x: "%.2f ms" % (x * 1000)),
>                     ("cifs_read_ops",      "cifs_read_ios",       "read OPs",           1,           int),
>                     ("cifs_write_ops",     "cifs_write_ios",      "write OPs",          1,           int)]),
>
>              "iscsi_lif:vserver": ("iSCSI",
>                  [  ("avg_read_latency",   "iscsi_read_latency",  "avg. Read latency",  0.001,       lambda x: "%.2f ms" % (x * 1000)),
>                     ("avg_write_latency",  "iscsi_write_latency", "avg. Write latency", 0.001,       lambda x: "%.2f ms" % (x * 1000)),
>                     ("read_data",          "iscsi_read_data",     "read data",          1,           get_bytes_human_readable),
>                     ("write_data",         "iscsi_write_data",    "write data",         1,           get_bytes_human_readable)]),
>
>              "nfsv3": ("NFS",
>                  [  ("nfsv3_read_ops",     "nfs_read_ios",        "read OPs",           1,           int),
>                     ("nfsv3_write_ops",    "nfs_write_ios",       "write OPs",          1,           int)]),
>
>              "nfsv4": ("NFSv4",
>                  [  ("nfsv4_read_ops",     "nfsv4_read_ios",      "read OPs",           1,           int),
>                     ("nfsv4_write_ops",    "nfsv4_write_ios",     "write OPs",          1,           int)]),
>
>              "nfsv4_1": ("NFSv4.1",
>                  [  ("nfsv4_1_ops",        "nfsv4_1_ios",         "OPs",                1,           int) ])
>          }
>          vserver = item.split(" ", 3)
>
>          now = time.time()
>          for protocol, (protoname, values) in protocol_map.items():
>              data = parsed.get("%s.%s" % (protocol, item))
>              if not data:
>                  continue
>              for what, perfname, perftext, scale, format_func in values:
>                  rate = get_rate(what, now, int(data[what]) * scale)
>                  yield 0, "%s %s: %s" % (protoname, perftext, format_func(rate)), [(perfname, rate)]
>
>
>     On 07/05/2016 12:34 PM, Kyâne PICHOU wrote:
>     > Hello,
>     >
>     > I use Check_MK raw 1.2.8p5 and I have an issue with the
>     > netapp_api_vs_traffic check.
>     >
>     > For two vservers (using nfsv4) I get an Unknown result with the
>     > message "UNKNOWN - check failed - please submit a crash report!",
>     > along with "No crash dump is available for this service."
>     > I checked, and the agent seems to return valid data for those
>     > vservers, but the check crashes.
>     >
>     > I don't know why, and I don't have a crash report. What can I do?
>     > Is there a way to fix the "no crash dump" problem?
>     >
>     > Regards
>     >
>
>     _______________________________________________
>     checkmk-en mailing list
>     checkmk-en at lists.mathias-kettner.de
>     http://lists.mathias-kettner.de/mailman/listinfo/checkmk-en
>
