[Check_mk (english)] Checking if backups are working

Ralph Bolton ralph.bolton at calltracks.com
Tue Jul 26 14:30:35 CEST 2016

We use Bacula for backups. It works differently from Tivoli, but the
monitoring problem is much the same.

I had a similar problem - how to monitor something per-client that is run
on a central server? The approach I took was:

- Get Bacula to run a script on each client after the backup is successful.
It just drops a one-line text file that has the date and the backup job
number in it (it happens to be nicely human readable too).
- Write a local check to read that file and parse the date out of it. If
the date is more than 30 hours old, go to Warning; if it is more than 50
hours old, go to Critical. That means if Bacula fails to visit at least
daily, we'll get an alert from the system that's been affected. If
everything is okay, the check just echoes out the contents of the file.
- Have Bacula notify me of all failures (and actually, it's producing a
summary like 'the backups on these 49 servers all succeeded:' too). We use
Slack, so it's posting those notifications to a Slack channel, but email
would be fine too.
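
The first two steps might look roughly like the sketch below. It is an
illustration under assumptions, not the exact script: the stamp file path,
the service name Bacula_backup and the file layout (epoch seconds first,
then the human readable date and the job number) are made up for the
example. In practice write_stamp would live in the script Bacula runs after
the job, and check_backup_age in a local check on the client.

```sh
#!/bin/sh
# Sketch only: path, service name and file layout are assumptions.
# "date +%s" is not strictly POSIX but works with GNU and BSD date.
STAMP_FILE="${STAMP_FILE:-/var/tmp/bacula_last_backup}"
WARN_HOURS=30
CRIT_HOURS=50

# Step 1: called after a successful backup.
# $1 = job number, $2 = stamp file to write
# Writes one line: "<epoch seconds> <date> <time> job <number>"
write_stamp() {
    printf '%s %s job %s\n' "$(date +%s)" "$(date '+%Y-%m-%d %H:%M:%S')" "$1" > "$2"
}

# Step 2: prints one local check line (status, service name, perfdata, text).
# $1 = stamp file, $2 = current time in epoch seconds
check_backup_age() {
    if [ ! -r "$1" ]; then
        echo "3 Bacula_backup - stamp file $1 not readable"
        return
    fi
    # First field is the timestamp, the rest is the human readable part
    read -r stamp rest < "$1"
    case "$stamp" in
        ''|*[!0-9]*)
            echo "3 Bacula_backup - cannot parse date in $1"
            return ;;
    esac
    age_hours=$(( ($2 - stamp) / 3600 ))
    if [ "$age_hours" -gt "$CRIT_HOURS" ]; then
        status=2
    elif [ "$age_hours" -gt "$WARN_HOURS" ]; then
        status=1
    else
        status=0
    fi
    echo "$status Bacula_backup age_hours=$age_hours;$WARN_HOURS;$CRIT_HOURS last backup ${age_hours}h ago: $rest"
}

check_backup_age "$STAMP_FILE" "$(date +%s)"
```

Passing the current time in as an argument keeps the threshold logic easy
to try out by hand with a fake "now".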

The last step is the alert that tells you to go do some work (and that we
actually ran some backups last night). The first two steps are really just
a secondary check to make sure Bacula is running properly, that we didn't
misconfigure a firewall which prevents backups to a host, and that all
production hosts are in Bacula's config.

In your case, if you have a summary file on your Tivoli server with one
line per client, then you can write a local check (on your Tivoli server)
that parses that file and looks for failures (and goes critical if there
are any). You could get all fancy and have it count successes and failures
and report them as performance data so you get graphs in CheckMK too, if
you wanted.
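
A rough sketch of such a check follows. It assumes the summary file has one
line per client in the form "hostname status" and that "Completed" marks a
good backup. The file path, the service name tsm_backup_summary and that
layout are assumptions for the example, so adjust them to whatever the
Tivoli report really looks like.

```sh
#!/bin/sh
# Sketch only: report path, line format and service name are assumptions.
REPORT="${REPORT:-/var/tmp/tsm_daily_report.txt}"

# Prints one local check line: status, service name, perfdata, text.
# $1 = report file with lines like "server-a.com Completed"
summarise_report() {
    if [ ! -r "$1" ]; then
        echo "3 tsm_backup_summary - report $1 not readable"
        return
    fi
    awk '
        NF < 2            { next }              # skip blank or short lines
        $2 == "Completed" { ok++; next }        # count successes
                          { failed++; names = names " " $1 }
        END {
            # Critical if any client failed, OK otherwise
            status = (failed > 0) ? 2 : 0
            printf "%d tsm_backup_summary ok=%d|failed=%d %d succeeded, %d failed", status, ok, failed, ok, failed
            if (failed > 0) printf " (%s)", substr(names, 2)
            printf "\n"
        }
    ' "$1"
}

summarise_report "$REPORT"
```

The two counters go out as performance data (ok=...|failed=...), which is
what produces the graphs.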

In fact, having just thought through your situation, I might well do
something similar for myself. That way we'd get a CheckMK alert (as well
as a Slack notification) when a backup fails in production. The good thing
is that when a failed backup is later re-run successfully, the alert would
clear (which Slack/email doesn't show very well). Could be cool :-)


On Mon, Jul 25, 2016 at 11:10 PM, Allan Thorne <allan.thorne at monash.edu>
wrote:

> We used the piggyback method:
> 1) On our central Tivoli Server we produce a daily report of backups - 1
> line per server
> 2) From this report a local check (which goes into
> /usr/lib/check_mk_agent/local) is generated, which creates piggyback data.
>     It can reside on the Tivoli server or, in our case, is copied to the
> Monitoring master server.
>    e.g.
>     #!/bin/sh
>     echo "<<<<server-a.com>>>>"
>     echo "<<<local>>>"
>     echo "0 tsm_backup - Backup Completed"
>     echo "<<<<>>>>"
>     echo "<<<<server-b.com>>>>"
>     echo "<<<local>>>"
>     echo "2 tsm_backup - Backup Terminated"
>     echo "<<<<>>>>"
>      Please note the use of three and four angle brackets: four around the
> piggyback host name, three around the section name.
>      Once this piggyback data is sent, a service 'tsm_backup' will appear
> on each host specified.
> Regards
>        Allan Thorne
>        eSolutions
>        Monash University
> On 26 July 2016 at 06:16, Mathieu Levi <mlevi at collective.com> wrote:
>> There are a few different ways.
>> One way is to copy off the
>> /opt/omd/sites/SITENAME/var/check_mk/wato/snapshots/ dir, and if you need
>> to restore, simply restore the latest snapshot.  The downside is that this
>> auto-snapshot doesn't do more than back up the basic configuration and
>> event console data.  If you want the performance data, etc., then you'll
>> need to create the snapshot by hand.  Would love for someone to tell me
>> how to control what's in the snapshot if I want more than basic config
>> backed up.
>> Another way is via the check_mk command (i.e. cmk --backup FILE.tar.gz
>> and --restore).  I have a script that runs nightly to back it up to a
>> network share.
>> Another way is to use the omd command (omd backup) but I haven't tested
>> that.
>> "Date: Mon, 25 Jul 2016 14:40:43 +0000
>> From: Laura DiMauro
>> To: "checkmk-en at lists.mathias-kettner.de"
>>         <checkmk-en at lists.mathias-kettner.de>
>> Subject: Re: [Check_mk (english)] Checking if backups are working
>> Hello,
>> I am trying to find a way to monitor our backups on network drives using
>> Check_MK.
>> We are using Tivoli for backups.
>> I am assuming I will need to create some kind of script but I am not sure
>> where to start.
>> Any help or advice will be appreciated.
>> Best Regards,
>> Laura"
>> _______________________________________________
>> checkmk-en mailing list
>> checkmk-en at lists.mathias-kettner.de
>> http://lists.mathias-kettner.de/mailman/listinfo/checkmk-en


*Ralph Bolton*
Systems Administrator

*Calltracks Ltd*

Email:   ralph.bolton at calltracks.com
Web:    www.calltracks.com
Tel:      +44 20 3199 9000
Fax:     +44 20 3199 9009