{"id":399,"date":"2015-09-24T15:24:17","date_gmt":"2015-09-24T12:24:17","guid":{"rendered":"http:\/\/www.bandidor.info\/wp\/?p=399"},"modified":"2015-10-16T18:46:21","modified_gmt":"2015-10-16T15:46:21","slug":"monitor-bacula-backup-jobs-with-zabbix","status":"publish","type":"post","link":"https:\/\/www.bandidor.info\/wp\/?p=399","title":{"rendered":"Monitor bacula backup jobs with zabbix"},"content":{"rendered":"<h1 style=\"color: #000000;\"><strong>Subject<\/strong><\/h1>\n<p style=\"color: #000000;\"><em>bacula<\/em> comes with an email notification mechanism built in: it sends out an email notification after each job.\u00a0This works, but it does not help when a job is stuck, and who reads emails after all?<\/p>\n<p style=\"color: #000000;\">So I decided to have my <em>zabbix<\/em> monitoring solution handle this.<\/p>\n<h1 style=\"color: #000000;\"><strong>Symptoms<\/strong><\/h1>\n<p style=\"color: #000000;\">Here is what I want my solution to accomplish: discover active backup jobs, then create items, triggers and also some nice graphs. 
This discovery part is not so important, as my installation is pretty stable and new hosts or new backup jobs don&#8217;t come every day; still, I don&#8217;t like manual typing.<\/p>\n<h1 style=\"color: #000000;\"><strong>Platform\/Tools<\/strong><\/h1>\n<p style=\"color: #000000;\">My server is Ubuntu 14.04.3 LTS, bacula 5.2.6, zabbix 2.2.6.<\/p>\n<p style=\"color: #000000;\">I&#8217;ll be using my lovely Perl language; v5.18.2 is currently installed.<\/p>\n<h1 style=\"color: #000000;\"><strong>Solution<\/strong><\/h1>\n<h2 style=\"color: #000000;\">Discovery script<\/h2>\n<h3>Create new Template<\/h3>\n<p>Template App Bacula<\/p>\n<h3>Create new Discovery rule<\/h3>\n<p>Discovery rule name:\u00a0<em>Bacula Jobs Discovery<\/em>,\u00a0discovery key:\u00a0<em>bacula.jobs.discovery<\/em>.<\/p>\n<p><a href=\"http:\/\/www.bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot111.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-558 size-full\" src=\"http:\/\/www.bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot111.png\" alt=\"\" width=\"634\" height=\"479\" srcset=\"https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot111.png 634w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot111-300x227.png 300w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot111-199x150.png 199w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot111-150x113.png 150w\" sizes=\"auto, (max-width: 634px) 100vw, 634px\" \/><\/a><\/p>\n<p>Please also pay attention to the macro <code>{#JOB_NAME}<\/code>; it will be used to create the item prototypes.<\/p>\n<h3>Discovery script on the target server<\/h3>\n<p>To make our discovery work, we need to add a <code>UserParameter <\/code>entry to the <em>zabbix<\/em> agent configuration on the target machine\u00a0and also create a discovery script, which\u00a0will be used by this <em>zabbix<\/em> item.<\/p>\n<p>Create file 
<code>\/etc\/zabbix\/zabbix_agentd.d\/userparameter_bacula.conf<\/code>:<\/p>\n<div style=\"color: #000000;\">\n<pre class=\"brush: bash; light: true; title: ; notranslate\" title=\"\">\r\n#\r\nUserParameter=bacula.jobs.discovery, sudo \/usr\/lib\/zabbix\/externalscripts\/zabbix_bacula.pl -D\r\n#\r\n<\/pre>\n<\/div>\n<p style=\"color: #000000;\">Pay attention to the <code>bacula.jobs.discovery<\/code> key: it has to match the key defined in the discovery rule. Also, make sure the <code>Include<\/code> directive for this directory is present in your main <i>zabbix agent<\/i> configuration file.<\/p>\n<p style=\"color: #000000;\">Now to the discovery script <code>zabbix_bacula.pl<\/code>. When used with the command line option -D, it will deliver a JSON object with a\u00a0list of enabled backup jobs. We will probably extend it in the future with other options to do something else.<\/p>\n<pre class=\"brush: perl; title: ; notranslate\" title=\"\">\r\n#!\/usr\/bin\/perl\r\nuse strict;\r\nuse warnings;\r\nuse Getopt::Std;\r\nuse JSON;\r\nuse Data::Dumper;\r\n\r\n# JobType value printed by &quot;show jobs&quot; for backup jobs: 66 is the ASCII code of 'B'\r\nmy $JOB_TYPE_BACKUP = 66;\r\n\r\n# declare the perl command line flags\/options we want to allow\r\nmy %options=();\r\ngetopts(&quot;Ds:&quot;, \\%options);\r\n\r\nif ($options{D}) {\r\n        my $arrays_found = undef;\r\n        open(my $fh, '-|', 'echo &quot;show jobs&quot; | bconsole') or die $!;\r\n        while (my $line = &lt;$fh&gt;) {\r\n                if ($line =~ \/^Job:(.*)\/) {\r\n                        my @tmp = split(\/\\s+\/,$1);\r\n                        my %job;\r\n                        foreach my $t (@tmp) {\r\n                                if (my ($k,$v) = split(\/=\/,$t)) {\r\n                                        $job{$k} = $v;\r\n                                }\r\n                        }\r\n                        if ($job{JobType} eq $JOB_TYPE_BACKUP) {\r\n                            if ($arrays_found) {\r\n                                push(@{$arrays_found-&gt;{'data'}},{'{#JOB_NAME}' =&gt; 
($job{name})});\r\n                            } else {\r\n                                $arrays_found-&gt;{'data'}-&gt;&#x5B;0] = {'{#JOB_NAME}' =&gt; ($job{name})};\r\n                            }\r\n                        }\r\n\r\n                }\r\n        }\r\n        print encode_json($arrays_found) if ($arrays_found);\r\n}\r\n\r\n<\/pre>\n<p style=\"color: #000000;\">Here comes our nicely formatted JSON output:<\/p>\n<pre class=\"brush: xml; title: ; notranslate\" title=\"\">\r\n{\r\n    &quot;data&quot;:\r\n        &#x5B;\r\n            {\r\n                &quot;{#JOB_NAME}&quot;:&quot;BackupClient1&quot;\r\n            },\r\n            {\r\n                &quot;{#JOB_NAME}&quot;:&quot;BackupCatalog&quot;\r\n            }\r\n        ]\r\n}\r\n<\/pre>\n<p style=\"color: #000000;\">Finally create (or modify) <code>\/etc\/sudoers.d\/zabbix<\/code>\u00a0to enable <code>sudo <\/code>for <code>zabbix <\/code>user:<\/p>\n<pre class=\"brush: bash; light: true; title: ; notranslate\" title=\"\">\r\n\r\nzabbix ALL=NOPASSWD: \/usr\/lib\/zabbix\/externalscripts\/zabbix_bacula.pl -D\r\n\r\n<\/pre>\n<p style=\"color: #000000;\">That&#8217;s not all though, additionally we need to add user <em>zabbix<\/em> to the group <em>bacula<\/em> to be able to run the\u00a0<em>bconsole<\/em> command. Check <code>\/etc\/group<\/code>:<\/p>\n<pre class=\"brush: bash; title: ; notranslate\" title=\"\">\r\n\r\nbacula:x:116:zabbix\r\n\r\n<\/pre>\n<p style=\"color: #000000;\">And the very final caveat was the timeout problem with <em>bconsole<\/em>. On my development system <em>bconsole<\/em> took about 15 seconds to run and I was seeing mysterious\u00a0<code>ZBX_NOTSUPPORTED<\/code>, which I thought was due to some incomplete <em>sudoers<\/em> configuration. Also error messages like:<\/p>\n<p style=\"color: #000000;\"><code>zbx_waitpid() killed by signal 15<\/code><\/p>\n<p style=\"color: #000000;\">in the<em> zabbix_agent<\/em> log file. 
It turned out it was really about the\u00a0timeout for external scripts in the <em>zabbix_agent<\/em> configuration file <code>\/etc\/zabbix\/zabbix_agentd.conf<\/code>:<\/p>\n<pre class=\"brush: bash; title: ; notranslate\" title=\"\">\r\n#Option: Timeout\r\n# Spend no more than Timeout seconds on processing\r\n#\r\n# Mandatory: no\r\n# Range: 1-30\r\n# Default:\r\n# Timeout=3\r\nTimeout=30\r\n<\/pre>\n<p style=\"color: #000000;\">Setting the timeout to 30 seconds solved the problem. The timeout value has to be adjusted on the server side as well, otherwise mysterious <code>Interrupted system call<\/code> error messages will appear in the <em>zabbix<\/em> server log file.<\/p>\n<h3 style=\"color: #000000;\">Item prototypes<\/h3>\n<p style=\"color: #000000;\">To start with, we will have one reporting item per job name. Item prototypes will be created using the\u00a0<code>{#JOB_NAME}<\/code>\u00a0macro from the discovery rule. So we will expect items in the form of <code>bacula.job.exit_code[JOB_NAME]=JOB_EXIT_CODE<\/code>, where the job exit code is one of OK, Error, etc.<\/p>\n<p style=\"color: #000000;\"><a href=\"http:\/\/www.bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot112.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-560\" src=\"http:\/\/www.bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot112.png\" alt=\"ScreenShot112\" width=\"458\" height=\"480\" srcset=\"https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot112.png 458w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot112-286x300.png 286w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot112-143x150.png 143w\" sizes=\"auto, (max-width: 458px) 100vw, 458px\" \/><\/a><\/p>\n<p style=\"color: #000000;\">The picture above is self-explanatory; just don&#8217;t forget to use\u00a0<em>Zabbix trapper<\/em> as the item type, since the values will be pushed to <em>zabbix<\/em> rather than polled.<\/p>\n<h3 style=\"color: #000000;\">Trigger prototypes<\/h3>\n<p>We will keep it simple, 
just one trigger to raise an alarm if the backup job status is not OK. The screenshot below explains it:<\/p>\n<p><a href=\"http:\/\/www.bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot113.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-565\" src=\"http:\/\/www.bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot113.png\" alt=\"ScreenShot113\" width=\"640\" height=\"436\" srcset=\"https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot113.png 640w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot113-300x204.png 300w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot113-220x150.png 220w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot113-150x102.png 150w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/a><\/p>\n<p>Please be careful to create this prototype in the Discovery rule and not directly in the Template.<\/p>\n<h2 style=\"color: #000000;\">Making <em>bacula<\/em> report job status to <em>zabbix<\/em><\/h2>\n<h3>Wrapper script for Message resource<\/h3>\n<p>The idea is to modify a Message resource in the <em>bacula<\/em> director configuration file to send messages to <em>zabbix<\/em>. Unfortunately, there is no option to run an external script or to specify an external script as a destination. 
So I decided to create a wrapper script, which will be used instead of the mail command in the Message resource and will do both: send an email and also send information to <em>zabbix<\/em>.<\/p>\n<p>Let&#8217;s create <code>\/etc\/bacula\/scripts\/bacula_message.pl<\/code>:<\/p>\n<pre class=\"brush: perl; title: ; notranslate\" title=\"\">\r\n#!\/usr\/bin\/perl\r\nuse strict;\r\nuse warnings;\r\nuse Getopt::Std;\r\nuse Data::Dumper;\r\nuse Sys::Syslog qw(:standard :macros);\r\n\r\nmy %options=();\r\ngetopts('c:d:e:i:j:l:n:r:s:t:MDO',\\%options);\r\n...\r\n<\/pre>\n<p>It takes all substitution variables specified in the <em>bacula<\/em> documentation for the mail command, plus the -s option for the mail subject and one of the -M, -O or -D options, which correspond to the MailCommand, the OperatorCommand and the Daemon message resource in the <em>bacula<\/em> director configuration file.<\/p>\n<p>Here are the substitution variables as specified in the <em>bacula<\/em> documentation:<\/p>\n<ul>\n<li>%c = Client&#8217;s name<\/li>\n<li>%d = Director&#8217;s name<\/li>\n<li>%e = Job Exit code (OK, Error, &#8230;)<\/li>\n<li>%i = Job Id<\/li>\n<li>%j = Unique Job name<\/li>\n<li>%l = Job level<\/li>\n<li>%n = Job name<\/li>\n<li>%r = Recipients<\/li>\n<li>%t = Job type (e.g. Backup, &#8230;)<\/li>\n<\/ul>\n<h3>Modifying bacula config<\/h3>\n<p>Now we will use this wrapper script in the Message resource configuration in <code>\/etc\/bacula\/bacula-dir.conf<\/code>. 
Here are the excerpts, with the original mail command configurations commented out:<\/p>\n<pre class=\"brush: bash; title: ; notranslate\" title=\"\">\r\n...\r\n# mailcommand = &quot;mail -s \\&quot;Bacula: %t job %n %e of %c %l\\&quot; %r&quot;\r\nmailcommand = &quot;\/etc\/bacula\/scripts\/bacula_message.pl -M -c '%c' -d '%d' -e '%e' -i '%i' -j '%j' -l '%l' -n '%n' -r '%r' -s 'Bacula: %t job %n %e of %c %l' -t '%t'&quot;\r\n...\r\n# operatorcommand = &quot;mail -s \\&quot;Bacula: Intervention needed for %j\\&quot; %r&quot;\r\noperatorcommand = &quot;\/etc\/bacula\/scripts\/bacula_message.pl -O -c '%c' -d '%d' -e '%e' -i '%i' -j '%j' -l '%l' -n '%n' -r '%r' -s 'Bacula: Intervention needed for %j' -t '%t'&quot;\r\n...\r\n# mailcommand = &quot;mail -s \\&quot;Bacula daemon message:\\&quot; %r&quot;\r\nmailcommand = &quot;\/etc\/bacula\/scripts\/bacula_message.pl -D -c '%c' -d '%d' -e '%e' -i '%i' -j '%j' -l '%l' -n '%n' -r '%r' -s 'Bacula daemon message' -t '%t'&quot;\r\n \r\n...\r\n<\/pre>\n<p>To be on the safe side, we pass all substitution variables that <em>bacula<\/em> provides to the mail command\u00a0and decide how to use them\u00a0later inside\u00a0our wrapper script.<\/p>\n<h3>Send mail<\/h3>\n<p>This is the simplest part of our script: we assume that if sending mail\u00a0worked directly from <em>bacula<\/em>, it will work from our script as well:<\/p>\n<pre class=\"brush: perl; title: ; notranslate\" title=\"\">\r\n...\r\n# Send out an email only if one of -M, -D or -O is given and %r (recipients) is provided\r\nif ($options{'r'} &amp;&amp; ($options{'M'} || $options{'O'} || $options{'D'})) {\r\n\tsystem(&quot;mail -s \\&quot;$options{'s'}\\&quot; $options{'r'}&quot;);\r\n}\r\n...\r\n<\/pre>\n<h3>Send job status<\/h3>\n<p>The next step is to send the job status to <em>zabbix<\/em>. 
We will use the job name as the key parameter and the job exit code as its value:<\/p>\n<pre class=\"brush: perl; title: ; notranslate\" title=\"\">\r\nmy $ZABBIX_SERVER = '127.0.0.1';\r\nmy $ZABBIX_HOST = 'Zabbix server';\r\n\r\nmy $JOB_EXIT_CODE_KEY = 'bacula.job.exit_code&#x5B;%s]';\r\n\r\nmy $zabbix_sender = `which zabbix_sender`;\r\nchomp($zabbix_sender);\r\n# key and value are quoted so that brackets or spaces in them survive the shell\r\nmy $zabbix_sender_cmd_line = &quot;$zabbix_sender -z $ZABBIX_SERVER -s \\&quot;$ZABBIX_HOST\\&quot; -k \\&quot;%s\\&quot; -o \\&quot;%s\\&quot;&quot;;\r\n\r\nsystem(sprintf($zabbix_sender_cmd_line,sprintf($JOB_EXIT_CODE_KEY,$options{'n'}),$options{'e'}) . &quot; &gt;\/dev\/null&quot;);\r\n<\/pre>\n<p>Don&#8217;t forget to change the <code>$ZABBIX_SERVER<\/code>\u00a0and <code>$ZABBIX_HOST<\/code>\u00a0variables to the real values!<\/p>\n<h3>Send extended job information<\/h3>\n<p>This is left for the next post.<\/p>\n<h1 style=\"color: #000000;\"><strong>Discussion<\/strong><\/h1>\n<p style=\"color: #000000;\">What has been done so far? We created a\u00a0<em>zabbix<\/em> Template with a Discovery rule to identify <em>bacula<\/em> backup jobs configured on the machine where <em>bacula-director<\/em> is running. When these jobs are discovered,\u00a0an item and a trigger are created for each of them to monitor the exit status of the job and to raise an alarm if this exit status indicates an error.<\/p>\n<p style=\"color: #000000;\">These are the artifacts created:<\/p>\n<ul>\n<li style=\"color: #000000;\"><em>zabbix<\/em> Template<\/li>\n<li style=\"color: #000000;\"><em>zabbix<\/em> configuration add-on script, which will add custom keys to <em>zabbix<\/em> agent configuration. 
This needs to be done on our target machine, where the <em>bacula<\/em> director is running<\/li>\n<li style=\"color: #000000;\">Discovery support script to be installed on the target machine<\/li>\n<li style=\"color: #000000;\">Wrapper script to be used by the <em>bacula<\/em> director in place of the traditional mail command, which will send information to the <em>zabbix<\/em> server after each backup job has completed. This script will also be installed on the target machine<\/li>\n<li style=\"color: #000000;\">And we also need to modify the <em>bacula<\/em> director configuration file\u00a0to use our wrapper script instead of the mail command<\/li>\n<\/ul>\n<p>In my next article, I&#8217;ll add more monitoring items to monitor other backup job parameters and also the status of storage Pools.<\/p>\n<h1 style=\"color: #000000;\"><strong>Caveats<\/strong><\/h1>\n<p>Just one thing bothered me a few times: at which level to create item and trigger prototypes. Although I used a\u00a0<em>zabbix<\/em>\u00a0Template to create the discovery rule, a few times I mistakenly created item and trigger prototypes at the Template level and not in the Discovery rule.<\/p>\n<p>Also the issue with\u00a0<em>zabbix<\/em>\u00a0agent and server timeouts was a bit tricky and took some time to figure out. It may\u00a0require some experimenting to find the proper values for a particular environment. On my test machine, the <code>bconsole <\/code>command, which is run by the discovery script,\u00a0took about 20 sec to finish, while it took about 3 sec in my production environment.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Subject There is an email notification mechanism built in into bacula out of the box. It really sends out email notification after each job.\u00a0This works, but in the case a job is stuck and who reads emails after all? So I decided to make my zabbix monitoring solution to handle this. 
Symptoms Here is what&#8230;<\/p>\n","protected":false},"author":1,"featured_media":567,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"ngg_post_thumbnail":0,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[22,26],"tags":[27,17],"class_list":["post-399","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-bacula","category-zabbix","tag-bacula","tag-zabbix"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/2014-11-07-18.41.38.jpg","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p2EszU-6r","_links":{"self":[{"href":"https:\/\/www.bandidor.info\/wp\/index.php?rest_route=\/wp\/v2\/posts\/399","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.bandidor.info\/wp\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.bandidor.info\/wp\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.bandidor.info\/wp\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.bandidor.info\/wp\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=399"}],"version-history":[{"count":45,"href":"https:\/\/www.bandidor.info\/wp\/index.php?rest_route=\/wp\/v2\/posts\/399\/revisions"}],"predecessor-version":[{"id":596,"href":"https:\/\/www.bandidor.info\/wp\/index.php?rest_route=\/wp\/v2\/posts\/399\/revisions\/59
6"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.bandidor.info\/wp\/index.php?rest_route=\/wp\/v2\/media\/567"}],"wp:attachment":[{"href":"https:\/\/www.bandidor.info\/wp\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=399"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.bandidor.info\/wp\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=399"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.bandidor.info\/wp\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=399"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}