[omd-users] Check_mk own check PEND - Cannot compute check result: Value overflow

Milan Jeskynka Kazatel KazatelM at seznam.cz
Fri Mar 13 16:56:16 CET 2020


Hi,

it seems to be a Python limit: 2^63 - 1.

I'm pretty sure the counter never reached that value, even though
get_rate() subtracts two metric values. The server-side check is below.
Could someone look at the code and help me with a more sophisticated
method than just googling it?
I'm not a programmer, so for me this is trial and error. Yes, the code is
my own, but it is based on the published Check_mk examples.
Maybe someone has more experience with troubleshooting methods in Check_mk
and can share their approach.
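
A minimal sketch of why the message may not be about an integer limit at
all. Python integers grow past 2^63 - 1 without error, so the sketch assumes
instead that get_rate() reports "Value overflow" whenever the new sample is
smaller than the stored one (a counter that went backwards or was reset).
simple_rate(), CounterWrapped and the _state dictionary are hypothetical
stand-ins written for this illustration, not the Check_mk implementation.

```python
class CounterWrapped(Exception):
    """Hypothetical stand-in for the error Check_mk reports as PEND."""


# Hypothetical in-memory replacement for the stored counter state
_state = {}


def simple_rate(key, this_time, this_val):
    """Simplified stand-in for get_rate(): per-second rate between samples."""
    last = _state.get(key)
    # Remember the new sample before any of the checks below can fail
    _state[key] = (this_time, this_val)
    if last is None:
        raise CounterWrapped("Counter initialization")
    last_time, last_val = last
    if this_time <= last_time:
        raise CounterWrapped("No time difference")
    if this_val < last_val:
        # Assumption: a decreasing value is what gets labelled "overflow"
        raise CounterWrapped("Value overflow")
    return (this_val - last_val) / float(this_time - last_time)


# Python integers are arbitrary precision: no error at 2^63 - 1
beyond_limit = (2 ** 63 - 1) + 1
```

If that assumption holds, any metric that can go down between two checks
(a gauge such as a memory size, or a counter reset by a restart) would be
enough to trigger the message, however small the numbers are.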

many thanks,
-- 
Smil Milan Jeskyňka Kazatel

---------- Original e-mail ----------
From: Stan Brown <stanbrow at gmail.com>
To: Milan Jeskynka Kazatel <KazatelM at seznam.cz>
Date: 13. 3. 2020 15:09:06
Subject: Re: Check_mk own check PEND - Cannot compute check result: Value
overflow
"
I am not the expert on this, but I believe this is all done with shell
scripts. You should be able to Google the maximum value that can be stored
in a shell integer. I am not certain what to do other than delete the check
that is returning a value that is too big.



On Fri, Mar 13, 2020 at 8:20 AM Milan Jeskynka Kazatel
<KazatelM at seznam.cz> wrote:

"
Hello,



yes, very likely.

How can I debug which variable is too big, or how can I protect the check
against unexpected results? I had hoped that get_rate() would handle it.
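
One way to find the offending variable, as a sketch: save the unbound
section of the agent output twice, a check interval apart, and list every
name whose value went down. This assumes the error is raised for a value
smaller than the previous one. parse_section() and decreased() are
hypothetical helper names, and the numbers in the second snapshot are made
up for the example.

```python
def parse_section(text):
    """Turn 'name value' lines of an agent section into a dictionary."""
    values = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip headers, blank lines and malformed lines
        try:
            values[parts[0]] = float(parts[1])
        except ValueError:
            continue  # skip values that are not numbers
    return values


def decreased(previous, current):
    """Return the sorted names whose value is lower than in the older snapshot."""
    return sorted(
        name
        for name, value in current.items()
        if name in previous and value < previous[name]
    )


# First snapshot: values taken from the agent output in this thread
first = """total_num_queries 37517007
mem_cache_rrset 66072
msg_cache_count 1"""

# Second snapshot: hypothetical values, invented for this example
second = """total_num_queries 37520000
mem_cache_rrset 65000
msg_cache_count 1"""

suspects = decreased(parse_section(first), parse_section(second))
```

Every name that ends up in suspects is a candidate for the metric that
makes the whole check go to PEND.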

Best regards, 
-- 
Smil Milan Jeskyňka Kazatel

---------- Original e-mail ----------
From: Stan Brown <stanbrow at gmail.com>
To: Milan Jeskynka Kazatel <KazatelM at seznam.cz>
Date: 13. 3. 2020 13:03:21
Subject: Re: [omd-users] Check_mk own check PEND - Cannot compute check
result: Value overflow
"
Looks to me like some of your results are bigger than expected. Total
queries, for instance.



On Fri, Mar 13, 2020 at 4:52 AM Milan Jeskynka Kazatel
<KazatelM at seznam.cz> wrote:

"
Hello,



I'm facing unexpected behavior in my own check_mk check for DNS Unbound
(cumulative statistic: yes), where the server-side check seems to be
somehow broken.

It can be inventoried and it shows metrics, but sometimes the service shows
a stale status.

On the command line, this message is visible:
unbound              PEND - Cannot compute check result: Value overflow
I'm not able to figure out what is wrong.

Could someone please give me a hint on how to debug this? The agent output
consists of normal integer counters which increase continuously.




Server side:



#!/usr/bin/python
# -*- encoding: utf-8; py-indent-offset: 4 -*-

# get_rate() and check_info are provided by Check_MK when it loads this file.
import time

# Default (warn, crit) levels for total_num_queries, in queries per second
unbound_queries_default_levels = (1250, 2000)


def inventory_unbound(info):
    # Create one service as soon as the agent section contains any data
    if len(info):
        return [("unbound", "unbound_queries_default_levels")]


def check_unbound(item, params, info):
    warn, crit = params
    perfdata = []
    status = 0
    message = ""
    now = time.time()
    for line in info:
        name = line[0]
        value = int(line[1])
        menofunkce = "f_%s" % (name)
        # Per-second rate since the previous check, for every agent line
        rate = get_rate(menofunkce, now, value)
        perfdata.append((name, rate))
        # Compare with ==; 'name in "..."' would be a substring test
        if name == "total_num_queries" and rate >= crit:
            status = 2
            message = "Crit - unexpected traffic total_num_queries %.2f /sec (warn/crit above %s/%s )" % (rate, warn, crit)
        elif name == "total_num_queries" and rate >= warn:
            status = 1
            message = "Warn - increased traffic total_num_queries %.2f /sec (warn/crit above %s/%s )" % (rate, warn, crit)
        elif name == "total_num_queries" and rate < warn:
            message = "total_num_queries %.2f /sec (warn/crit above %s/%s )" % (rate, warn, crit)
    return (status, message, perfdata)


# declare the check to Check_MK
check_info['unbound'] = {
    "check_function"      : check_unbound,
    "inventory_function"  : inventory_unbound,
    "service_description" : '%s',
    "has_perfdata"        : True,
}
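
The check above calls get_rate() for every line the agent sends, including
values such as "status" or the mem_* sizes that can go down between two
checks. Assuming get_rate() rejects a value smaller than the previous one,
a possible protection is to compute rates only for real counters and to
report everything else unchanged. The sketch below only does the sorting
step. classify() and is_counter() are hypothetical helpers, and the "num_"
marker is a guess based on the names in the agent output (it misses, for
example, unwanted_queries).

```python
def is_counter(name):
    """Guess whether a metric only ever increases (assumption: 'num_' names)."""
    return "num_" in name


def classify(info):
    """Split parsed agent lines into counters, gauges and unusable lines."""
    counters = {}
    gauges = {}
    skipped = []
    for line in info:
        # Expect exactly [name, value] with a plain non-negative integer
        if len(line) != 2 or not line[1].isdigit():
            skipped.append(line)
            continue
        name, value = line[0], int(line[1])
        if is_counter(name):
            counters[name] = value  # candidates for a rate calculation
        else:
            gauges[name] = value    # report as-is, no rate calculation
    return counters, gauges, skipped


# Lines in the shape Check_MK passes to a check function; the "0_5" value
# is hypothetical and shows a non-integer being skipped instead of raising
sample_info = [
    ["status", "0"],
    ["total_num_queries", "37517007"],
    ["mem_cache_rrset", "66072"],
    ["thread0_requestlist_avg", "0_5"],
]
sample_counters, sample_gauges, sample_skipped = classify(sample_info)
```

With a split like this, only the names in sample_counters would be passed
to get_rate(), and a shrinking cache size could no longer stop the check.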





Agent side:


#!/bin/sh
if command -v unbound-control > /dev/null 2>&1
then
  echo '<<<unbound>>>'
  unbound-control status > /dev/null 2>&1
  status=$?
  echo "status $status"
  if [ "$status" -eq 0 ]
  then
    unbound-control stats | sed 's/=/ /' | tr '.' '_' | grep -v "histogram\|time\|total_requestlist"
  fi
fi
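
A side note on the pipeline above: tr '.' '_' rewrites every dot on the
line, so a value with a decimal point would be changed as well as the name.
The sketch below shows the same conversion with the dots replaced in the
name only. It is written in Python purely for illustration, convert_line()
is a hypothetical helper, and the 0.5 value is made up.

```python
def convert_line(line):
    """Convert 'a.b.c=value' into 'a_b_c value', leaving the value untouched."""
    key, separator, value = line.strip().partition("=")
    if not separator:
        return None  # not a key=value line
    return "%s %s" % (key.replace(".", "_"), value)


# Same result as the shell pipeline for integer values
integer_example = convert_line("total.num.queries=37517007")

# Hypothetical decimal value: the dot in the number is preserved
decimal_example = convert_line("thread0.requestlist.avg=0.5")
```

The server-side check would then need float() instead of int() for such
values, or it could skip them.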






Check_mk command line output:


OMD[dev]:~$ check_mk --debug -vv --checks=unbound DNSRVU
[cpu_tracking] Start with phase 'busy'
Check_MK version 1.5.0p23
+ FETCHING DATA
[cpu_tracking] Push phase 'agent' (Stack: ['busy'])
 [agent] No persisted sections loaded
 [agent] Not using cache (Don't try it)
 [agent] Execute data source
 [agent] Connecting via TCP to 172.50.1.3:6556 (5.0s timeout)
 [agent] Reading data from agent
 [agent] Write data to cache file /omd/sites/devel/tmp/check_mk/cache/DNSRVU
[cpu_tracking] Pop phase 'agent' (Stack: ['busy', 'agent'])
[cpu_tracking] Push phase 'agent' (Stack: ['busy'])
 [piggyback] No persisted sections loaded
 [piggyback] Execute data source
[cpu_tracking] Pop phase 'agent' (Stack: ['busy', 'agent'])
unbound              PEND - Cannot compute check result: Value overflow
[cpu_tracking] End
OK - [agent] Version: 1.4.0p31, OS: linux, execution time 0.7 sec | execution_time=0.745 user_time=0.020 system_time=0.010 children_user_time=0.000 children_system_time=0.000 cmk_time_agent=0.715






Agent output:


<<<unbound>>>

status 0
thread0_num_queries 15588707
thread0_num_queries_ip_ratelimited 0
thread0_num_cachehits 15588703
thread0_num_cachemiss 4
thread0_num_prefetch 0
thread0_num_zero_ttl 0
thread0_num_recursivereplies 4
thread0_requestlist_avg 0
thread0_requestlist_max 0
thread0_requestlist_overwritten 0
thread0_requestlist_exceeded 0
thread0_requestlist_current_all 0
thread0_requestlist_current_user 0
thread0_tcpusage 0
thread1_num_queries 4625290
thread1_num_queries_ip_ratelimited 0
thread1_num_cachehits 4625270
thread1_num_cachemiss 20
thread1_num_prefetch 0
thread1_num_zero_ttl 0
thread1_num_recursivereplies 20
thread1_requestlist_avg 0
thread1_requestlist_max 0
thread1_requestlist_overwritten 0
thread1_requestlist_exceeded 0
thread1_requestlist_current_all 0
thread1_requestlist_current_user 0
thread1_tcpusage 0
thread2_num_queries 1719352
thread2_num_queries_ip_ratelimited 0
thread2_num_cachehits 1719344
thread2_num_cachemiss 8
thread2_num_prefetch 0
thread2_num_zero_ttl 0
thread2_num_recursivereplies 8
thread2_requestlist_avg 0
thread2_requestlist_max 0
thread2_requestlist_overwritten 0
thread2_requestlist_exceeded 0
thread2_requestlist_current_all 0
thread2_requestlist_current_user 0
thread2_tcpusage 0
thread3_num_queries 15583658
thread3_num_queries_ip_ratelimited 0
thread3_num_cachehits 15583658
thread3_num_cachemiss 0
thread3_num_prefetch 0
thread3_num_zero_ttl 0
thread3_num_recursivereplies 0
thread3_requestlist_avg 0
thread3_requestlist_max 0
thread3_requestlist_overwritten 0
thread3_requestlist_exceeded 0
thread3_requestlist_current_all 0
thread3_requestlist_current_user 0
thread3_tcpusage 0
total_num_queries 37517007
total_num_queries_ip_ratelimited 0
total_num_cachehits 37516975
total_num_cachemiss 32
total_num_prefetch 0
total_num_zero_ttl 0
total_num_recursivereplies 32
total_tcpusage 0
mem_cache_rrset 66072
mem_cache_message 66289
mem_mod_iterator 16588
mem_mod_validator 66352
mem_mod_respip 0
mem_streamwait 0
num_query_type_A 37516990
num_query_type_PTR 1
num_query_type_AAAA 16
num_query_class_IN 37517007
num_query_opcode_QUERY 37517007
num_query_tcp 0
num_query_tcpout 0
num_query_tls 0
num_query_tls_resume 0
num_query_ipv6 36356
num_query_flags_QR 0
num_query_flags_AA 0
num_query_flags_TC 0
num_query_flags_RD 37517007
num_query_flags_RA 0
num_query_flags_Z 0
num_query_flags_AD 108911
num_query_flags_CD 0
num_query_edns_present 108911
num_query_edns_DO 0
num_answer_rcode_NOERROR 37516974
num_answer_rcode_FORMERR 0
num_answer_rcode_SERVFAIL 32
num_answer_rcode_NXDOMAIN 1
num_answer_rcode_NOTIMPL 0
num_answer_rcode_REFUSED 0
num_query_ratelimited 0
num_answer_secure 0
num_answer_bogus 0
num_rrset_bogus 0
num_query_aggressive_NOERROR 0
num_query_aggressive_NXDOMAIN 0
unwanted_queries 0
unwanted_replies 0
msg_cache_count 1
rrset_cache_count 0
infra_cache_count 26
key_cache_count 0
num_query_authzone_up 0
num_query_authzone_down 0











regards, 
-- 
Smil Milan Jeskyňka Kazatel


"




-- 





UNIX is basically a simple operating system, but you have to be a genius to 
understand the simplicity.

Dennis Ritchie(https://www.brainyquote.com/authors/dennis-ritchie-quotes)




"

"




"