[Check_mk (english)] Checking if backups are working

Laura DiMauro ldimauro at unm.edu
Tue Jul 26 16:24:43 CEST 2016

Thank you all for the suggestions!

From: tosolini, walter <walter.tosolini at cogitoweb.it>
Sent: Tuesday, July 26, 2016 7:29:12 AM
To: Ralph Bolton
Cc: Allan Thorne; Mathieu Levi; Laura DiMauro; checkmk-en
Subject: Re: [Check_mk (english)] Checking if backups are working

For Bacula, I found a plugin on GitHub that works fine, but you have to use PostgreSQL as the database.


Kind regards,
Walter Tosolini
System Engineer

Via Tavagnacco 63, 33100 Udine
Tel. +39 0432 486316
Fax +39 0432 1632281

Company details: www.cogitoweb.it - e-mail: info at cogitoweb.it


2016-07-26 14:30 GMT+02:00 Ralph Bolton <ralph.bolton at calltracks.com>:
We use Bacula for backups, which does things differently, but seems pretty similar too.

I had a similar problem - how to monitor something per-client that is run on a central server? The approach I took was:

- Get Bacula to run a script on each client after the backup is successful. It just drops a one-line text file that has the date and the backup job number in it (it happens to be nicely human readable too).
- Write a local check to read that file and parse the date out of it. If the date is more than 30 hours old, go to Warning; if it is more than 50 hours old, go to Critical. That means if Bacula fails to visit at least daily, we'll get an alert from the system that's been affected. If everything is okay, the check just echoes out the contents of the file.
- Have Bacula notify me of all failures (it also produces a summary like 'the backups on these 49 servers all succeeded:'). We use Slack, so it posts those notifications to a Slack channel, but email would be fine too.
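
A minimal sketch of the local check from the second step might look like the following. The stamp file path, the line format ("YYYY-MM-DD HH:MM:SS job=<id>"), and the service name backup_age are assumptions made for illustration, not details from the setup described above; date parsing relies on GNU date.

```shell
#!/bin/bash
# Sketch of a Check_MK local check for backup freshness.
# Assumptions (not from the original post): the stamp file path, the
# line format "YYYY-MM-DD HH:MM:SS job=<id>", and GNU date for parsing.

STAMP_FILE="${STAMP_FILE:-/var/lib/backup/last_success}"  # hypothetical path
WARN_SECONDS=$((30 * 3600))
CRIT_SECONDS=$((50 * 3600))

# Map the age of the last backup (in seconds) to 0=OK, 1=WARN, 2=CRIT.
state_for_age() {
    if [ "$1" -gt "$CRIT_SECONDS" ]; then
        echo 2
    elif [ "$1" -gt "$WARN_SECONDS" ]; then
        echo 1
    else
        echo 0
    fi
}

# Print one local-check line: <state> <service> <perfdata> <text>.
backup_age_check() {
    local line stamp now
    if [ ! -r "$STAMP_FILE" ]; then
        echo "3 backup_age - cannot read $STAMP_FILE"
        return 0
    fi
    line=$(head -n 1 "$STAMP_FILE")
    # Everything before " job=" is treated as the timestamp.
    if ! stamp=$(date -d "${line%% job=*}" +%s 2>/dev/null); then
        echo "3 backup_age - cannot parse date in: $line"
        return 0
    fi
    now=$(date +%s)
    echo "$(state_for_age $((now - stamp))) backup_age - last backup: $line"
}

backup_age_check
```

Dropped into the agent's local directory on each client, a script like this reports UNKNOWN (state 3) when the stamp file is missing, which also catches hosts that were never added to the backup configuration.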

The last step is the alert that tells you to go do some work (and that we actually ran some backups last night). The first two steps are really just a secondary check to make sure Bacula is running properly, that we didn't misconfigure a firewall which prevents backups to a host, and that all production hosts are in Bacula's config.

In your case, if you have a summary file on your Tivoli server with one line per client, then you can write a local check (on your Tivoli server) that parses that file, looks for failures, and goes critical if there are any. You could get fancy and have it count successes and failures and report them as 'performance stats', so you get graphs in CheckMK too, if you wanted.
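
A rough sketch of such a summary check follows. The report path, the "<hostname> <status>" line format, the word "Completed" as the success marker, and the service name tsm_summary are all assumptions for illustration; a real Tivoli report will need its own parsing.

```shell
#!/bin/bash
# Sketch of a local check that summarises a per-client backup report.
# Assumptions (not from the original post): the report path and its
# "<hostname> <status>" line format, with "Completed" meaning success.

REPORT="${REPORT:-/var/tmp/tsm_daily_report.txt}"  # hypothetical path

# Read report lines on stdin and print one local-check line with perfdata.
summarize_backups() {
    awk '
        NF < 2             { next }                 # skip blank or short lines
        $2 == "Completed"  { ok++; next }
                           { failed++; hosts = hosts " " $1 }
        END {
            if (ok + failed == 0)  state = 3        # empty report: UNKNOWN
            else if (failed > 0)   state = 2        # any failure: CRITICAL
            else                   state = 0
            printf "%d tsm_summary ok=%d|failed=%d %d ok, %d failed%s\n",
                   state, ok, failed, ok, failed, (failed > 0 ? ":" hosts : "")
        }'
}

if [ -r "$REPORT" ]; then
    summarize_backups < "$REPORT"
else
    echo "3 tsm_summary - cannot read $REPORT"
fi
```

The ok= and failed= values in the third field are the performance data that would end up as graphs; the failed host names are appended to the status text so the alert says where to look.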

In fact, having just thought through your situation, I might well do something similar for myself - that way we'd get a CheckMK alert (as well as a Slack notification) when a backup fails in production. The good thing is that once a failed backup is rerun successfully, the alert would clear (which Slack/email doesn't show very well). Could be cool :-)


On Mon, Jul 25, 2016 at 11:10 PM, Allan Thorne <allan.thorne at monash.edu> wrote:
We used the piggyback method:

1) On our central Tivoli Server we produce a daily report of backups - 1 line per server
2) From this report, a local check (which goes into /usr/lib/check_mk_agent/local) is generated. It creates piggyback data
    and can reside on the Tivoli server or, in our case, be copied to the monitoring master server.


    echo "<<<<server-a.com>>>>"
    echo "<<<local>>>"
    echo "0 tsm_backup - Backup Completed"
    echo "<<<<>>>>"
    echo "<<<<server-a.com>>>>"
    echo "<<<local>>>"
    echo "2 tsm_backup - Backup Terminated"
    echo "<<<<>>>>"

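     Please note the groups of three and four angle brackets: <<<local>>> opens a section, while <<<<hostname>>>> opens piggyback data for that host and <<<<>>>> closes it.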

     Once this piggyback data is sent, a service 'tsm_backup' will appear on each host specified.
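
     As a sketch of how that output could be produced straight from the daily report, rather than by generating a file of echo statements, consider the script below. The report path and the "<hostname> <status>" line format are assumptions for illustration, as is treating any status other than "Completed" as critical.

```shell
#!/bin/bash
# Sketch of a local check that turns a backup report into piggyback data.
# Assumptions (not from the original post): the report path and its
# "<hostname> <status>" line format, with "Completed" meaning success.

REPORT="${REPORT:-/var/tmp/tsm_daily_report.txt}"  # hypothetical path

# Read "<hostname> <status>" lines on stdin, emit one piggyback block each.
emit_piggyback() {
    local host status state
    while read -r host status; do
        [ -n "$host" ] || continue          # skip blank lines
        if [ "$status" = "Completed" ]; then
            state=0
        else
            state=2
        fi
        echo "<<<<$host>>>>"                # piggyback header: target host
        echo "<<<local>>>"                  # local-check section for that host
        echo "$state tsm_backup - Backup $status"
        echo "<<<<>>>>"                     # end of piggyback data
    done
}

if [ -r "$REPORT" ]; then
    emit_piggyback < "$REPORT"
fi
```

     The host names in the report have to match the host names configured in the monitoring site, otherwise the piggyback data is not assigned to any host.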

       Allan Thorne
       Monash University

On 26 July 2016 at 06:16, Mathieu Levi <mlevi at collective.com> wrote:
There are a few different ways.

One way is to copy off the /opt/omd/sites/SITENAME/var/check_mk/wato/snapshots/ dir, and if you need to restore, simply restore the latest snapshot.  The downside is that this auto-snapshot doesn't do more than back up the basic configuration and event console data.  If you want the performance data, etc., then you'll need to create the snapshot by hand.  I would love for someone to tell me how to control what's in the snapshot if I want more than the basic config backed up.

Another way is via the check_mk command (i.e. cmk --backup FILE.tar.gz and --restore).  I have a script that runs nightly to back it up to a network share.
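
A nightly script along those lines could be sketched as follows. The mount point of the network share and the dated file name are assumptions for illustration; only the cmk --backup FILE.tar.gz invocation comes from the description above, and it needs to run as the site user.

```shell
#!/bin/bash
# Sketch of a nightly Check_MK backup via "cmk --backup".
# Assumptions (not from the original post): the network share mount
# point and the dated file name; run it as the site user from cron.

BACKUP_DIR="${BACKUP_DIR:-/mnt/backup_share/checkmk}"  # hypothetical mount

# Build the archive path for a given date (YYYY-MM-DD).
backup_target() {
    echo "$BACKUP_DIR/checkmk-$1.tar.gz"
}

# Write today's archive to the share; fail if cmk is unavailable.
run_backup() {
    local target
    target=$(backup_target "$(date +%F)")
    if ! command -v cmk >/dev/null 2>&1; then
        echo "cmk not found in PATH" >&2
        return 1
    fi
    cmk --backup "$target"
}

run_backup || echo "backup failed" >&2
```

Putting the date in the file name keeps one archive per night, so old archives need to be pruned separately.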

Another way is to use the omd command (omd backup) but I haven't tested that.

"Date: Mon, 25 Jul 2016 14:40:43 +0000
From: Laura DiMauro
To: checkmk-en at lists.mathias-kettner.de
Subject: Re: [Check_mk (english)] Checking if backups are working


I am trying to find a way to monitor our backups on network drives using Check_MK.

We are using Tivoli for backups.

I am assuming I will need to create some kind of script but I am not sure where to start.

Any help or advice will be appreciated.

Best Regards,


checkmk-en mailing list
checkmk-en at lists.mathias-kettner.de

Allan Thorne

Infrastructure Operations and Support
Monash University
738 Blackburn Road, Clayton
Monash University, VIC 3800
Telephone: +61 3 9905 4791
Mobile: +61 (0)408 991 028
Email: allan.thorne at monash.edu

CRICOS Provider 00008C/ 01857J



Ralph Bolton
Systems Administrator

Calltracks Ltd

Email:   ralph.bolton at calltracks.com
Web:     www.calltracks.com
Tel:     +44 20 3199 9000
Fax:     +44 20 3199 9009


