[Check_mk (english)] Bug: windows agent - local check data truncation

bob glugingboulder at gmail.com
Wed Feb 4 04:05:57 CET 2015


Hmm, interesting.
Yes, we ended up adding a SetProcessAffinityMask call in main, forcing the
program to run on a single core.

Perhaps a solution would be to make multi-core or single-core execution
selectable via the config file in future versions?
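
A rough sketch of what such a toggle could look like. This is not the agent's
code: the setting value "single", and the helper names wants_single_core,
single_core_mask and apply_core_mode, are made up for illustration. Only
GetProcessAffinityMask and SetProcessAffinityMask are real Win32 calls. The
mask logic is plain C, so it can be checked without a Windows machine.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical config value: "single" pins the process to one core,
 * anything else leaves the default affinity untouched. */
static int wants_single_core(const char *value)
{
    assert(value != NULL);
    return strcmp(value, "single") == 0;
}

/* Keep only the lowest set bit of the allowed-core mask, i.e. the first
 * core the process is currently permitted to run on. Returns 0 for 0. */
static unsigned long long single_core_mask(unsigned long long allowed)
{
    return allowed & (~allowed + 1ULL);
}

#ifdef _WIN32
#include <windows.h>

/* Apply the (hypothetical) setting early in main. Returns 1 on success. */
static int apply_core_mode(const char *value)
{
    DWORD_PTR proc_mask = 0, sys_mask = 0;
    DWORD_PTR one;

    if (!wants_single_core(value))
        return 1; /* multi-core: nothing to change */
    if (!GetProcessAffinityMask(GetCurrentProcess(), &proc_mask, &sys_mask))
        return 0;
    one = (DWORD_PTR)single_core_mask(proc_mask);
    if (one == 0)
        return 0;
    return SetProcessAffinityMask(GetCurrentProcess(), one) != 0;
}
#endif
```

Picking the lowest bit of the current process mask, rather than hard-coding
core 0, keeps the new mask a subset of what the process is already allowed to
use.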


On Fri, Jan 30, 2015 at 9:32 PM, Vogt, Jonathan <
j.vogt at neue-pressegesellschaft.de> wrote:

>  Hello,
>
>
>
> Yes, we also have this issue with the Veeam backup script when it outputs
> about 100 VMs. We currently work around it by manually pinning to one core.
>
>
>
> Regards
>
> Jonathan
>
>
>
> *From:* checkmk-en-bounces at lists.mathias-kettner.de [mailto:
> checkmk-en-bounces at lists.mathias-kettner.de] *On behalf of *bob
> *Sent:* Thursday, 29 January 2015 22:20
> *To:* Andreas Döhler
> *Cc:* checkmk-en at lists.mathias-kettner.de
> *Subject:* Re: [Check_mk (english)] Bug: windows agent - local check data
> truncation
>
>
>
> Hi There,
>
>
>
> We tested with the async options and the issue is still there. Has anyone else
> experienced truncation with large output from local checks? We are nowhere
> near the MAX buffer size.
>
> We only get this issue on multicore servers.
>
>
>
> This is the section of code where the issue occurs.
>
> Another workaround is setting the size of "buf" to 512, which seems to
> keep the stream steady.
>
>
>
>
>
> while (!buffer_full) {
>     PeekNamedPipe(read_stdout, buf, sizeof(buf), &bread, &avail, NULL);
>     if (avail == 0)
>         break;
>
>     while (out_offset + bread > current_heap_size) {
>         // Increase heap buffer
>         if (current_heap_size * 2 <= HEAP_BUFFER_MAX) {
>             cont->buffer_work = (char *) HeapReAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY,
>                                                      cont->buffer_work, current_heap_size * 2);
>             current_heap_size = HeapSize(GetProcessHeap(), 0, cont->buffer_work);
>         }
>         else {
>             buffer_full = true;
>             break;
>         }
>     }
>     if (buffer_full)
>         break;
>
>     if (bread > 0) {
>         memset(buf, 0, sizeof(buf));
>         ReadFile(read_stdout, buf, sizeof(buf) - 1, &bread, NULL);
>         out_offset += snprintf(cont->buffer_work + out_offset, current_heap_size - out_offset, buf);
>     }
> }
>
>
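
One side note on the loop above: it passes the data it just read to snprintf as
the format string, so any '%' in the check output would be interpreted rather
than copied. That may well be unrelated to the truncation, but a length-based
append avoids it. A minimal sketch in portable C, assuming a growable buffer
with an upper limit. OutBuf and outbuf_append are made-up names, and realloc
stands in for the HeapReAlloc call the agent uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable output buffer; start with all fields zero. */
typedef struct {
    char  *data; /* NUL-terminated after every successful append */
    size_t len;  /* bytes stored, not counting the terminator */
    size_t cap;  /* bytes allocated */
} OutBuf;

/* Append exactly n bytes of chunk. Returns 1 on success, 0 if the result
 * would exceed max_cap or the allocation fails (buffer left unchanged). */
static int outbuf_append(OutBuf *ob, const char *chunk, size_t n, size_t max_cap)
{
    size_t needed;

    assert(ob != NULL && chunk != NULL);
    if (n >= max_cap || ob->len >= max_cap - n)
        return 0; /* no room for data plus terminator */
    needed = ob->len + n + 1;

    if (needed > ob->cap) {
        size_t new_cap = ob->cap ? ob->cap : 64;
        char *p;
        /* Double until large enough, clamping at max_cap. */
        while (new_cap < needed) {
            if (new_cap > max_cap / 2) {
                new_cap = max_cap;
                break;
            }
            new_cap *= 2;
        }
        p = (char *)realloc(ob->data, new_cap);
        if (p == NULL)
            return 0;
        ob->data = p;
        ob->cap = new_cap;
    }

    /* Copy by length: '%' and other bytes are stored verbatim. */
    memcpy(ob->data + ob->len, chunk, n);
    ob->len += n;
    ob->data[ob->len] = '\0';
    return 1;
}
```

In the agent loop this would be called with buf and the byte count that
ReadFile reports, so the offset advances by what was actually read.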
>
> Luka
>
>
>
> On Tue, Jan 20, 2015 at 8:36 PM, Andreas Döhler <andreas.doehler at gmail.com>
> wrote:
>
> There is one thing you should test. With the latest stable release, 1.2.4p5,
> and with your version, you can define the execution type of your local checks.
> Besides the execution type, caching can also be activated.
>
>
>
> For scripts with large output I use a configuration like this:
>
> [global]
>     async_script_execution = parallel
>
> [local]
>     execution my_local_check.cmd = async
>     timeout my_local_check.cmd = 60
>     cache_age my_local_check.cmd = 120
>
>
>
> Whether you can use parallel script execution depends on your check scripts.
>
>
>
> Best regards
>
> Andreas
>
>
>
>
>
>
>
> bob <glugingboulder at gmail.com> wrote on Mon Jan 19 2015 at 20:20:31:
>
> Hi There,
>
>
>
> We have been using the Windows check_mk agent (1.2.4p2) to run local checks
> for SQL Server.
>
> The local checks use all default settings, and the issue exists in all
> versions and architectures for Windows.
>
>
>
> With small sets of data in the local check output from a C# app there is
> no issue, but when large sets of data are output, the output is sometimes
> truncated.
>
> In our tests, anything over 30000 characters reproduces this behaviour, and
> the truncation occurs at random.
>
>
>
> Looking at the source, we have found two workarounds.
>
> 1) Adding a Sleep(10); before the PeekNamedPipe call. This appears to give the
> threads better synchronisation and greatly reduces the occurrences of
> truncation (the peek reports 0 bytes available when there actually is data,
> and the loop exits the stream early).
>
> 2) Forcing the whole program to run on a single CPU stops the issue.
>
>
>
> From these fixes it looks like a synchronisation issue between processes, but
> we cannot figure out where exactly it occurs.
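
For what it is worth, a peek result of 0 bytes only says that nothing is
buffered at that instant; the script may still be running and about to write
more. One possible direction, sketched under assumptions and not taken from the
agent source: treat "empty pipe" as end of output only if the child had already
exited before the peek. The names DrainAction, drain_next_action and
drain_child_output are hypothetical, as is the child process handle being
available at this point. The decision itself is plain C.

```c
#include <assert.h>

/* What the reader should do after one look at the pipe. */
typedef enum { DRAIN_READ, DRAIN_WAIT, DRAIN_DONE } DrainAction;

/* avail:         bytes currently buffered in the pipe
 * writer_exited: nonzero if the child had exited BEFORE avail was sampled */
static DrainAction drain_next_action(unsigned long avail, int writer_exited)
{
    if (avail > 0)
        return DRAIN_READ;
    return writer_exited ? DRAIN_DONE : DRAIN_WAIT;
}

#ifdef _WIN32
#include <windows.h>

/* Hypothetical helper: drains read_stdout and returns the bytes read. */
static unsigned long drain_child_output(HANDLE read_stdout, HANDLE child,
                                        DWORD max_wait_ms)
{
    char buf[4096];
    unsigned long total = 0;
    DWORD waited = 0;

    assert(read_stdout != NULL && child != NULL);
    for (;;) {
        /* Sample the exit state first, then the pipe, so that data written
         * just before exit is still seen by the peek. */
        int exited = (WaitForSingleObject(child, 0) == WAIT_OBJECT_0);
        DWORD avail = 0, got = 0;
        DrainAction act;

        if (!PeekNamedPipe(read_stdout, NULL, 0, NULL, &avail, NULL))
            break; /* pipe broken or closed */

        act = drain_next_action(avail, exited);
        if (act == DRAIN_DONE)
            break;
        if (act == DRAIN_WAIT) {
            if (waited >= max_wait_ms)
                break; /* give up on a hung script */
            Sleep(10);
            waited += 10;
            continue;
        }
        if (!ReadFile(read_stdout, buf, sizeof(buf), &got, NULL) || got == 0)
            break;
        total += got; /* a real caller would append buf[0..got) here */
    }
    return total;
}
#endif
```

This would also explain why Sleep(10) and single-core pinning help: both give
the script more time to fill the pipe before the agent looks at it.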
>
>
>
> Thanks
>
>
>
> Luka
>
>
>