{"id":571,"date":"2016-04-26T18:57:36","date_gmt":"2016-04-26T15:57:36","guid":{"rendered":"http:\/\/www.bandidor.info\/wp\/?p=571"},"modified":"2016-04-26T18:57:36","modified_gmt":"2016-04-26T15:57:36","slug":"monitor-bacula-backup-jobs-with-zabbix-part-ii","status":"publish","type":"post","link":"https:\/\/www.bandidor.info\/wp\/?p=571","title":{"rendered":"Monitor bacula backup jobs with zabbix &#8212; part II"},"content":{"rendered":"<h1><strong><span style=\"font-family: verdana, geneva;\">Subject<\/span><\/strong><\/h1>\n<div>In my <a href=\"http:\/\/www.bandidor.info\/wp\/monitor-bacula-backup-jobs-with-zabbix\/\">previous article<\/a> I described only a very simple part of my monitoring solution for <em>bacula<\/em>: the focus was on discovering backup jobs, and only the job exit status was monitored. In this article, I&#8217;ll add more monitoring parameters and create some graphs.<\/div>\n<h1><strong><span style=\"font-family: verdana, geneva;\">Symptoms<\/span><\/strong><\/h1>\n<div>The challenge here is that no further job parameters are provided by the <em>bacula<\/em> director when it calls the mail command, and thus my wrapper script. To make more job parameters available for monitoring, I&#8217;ll extend my wrapper script to also look into the underlying bacula database and fetch job statistics from there.<\/div>\n<h1><span style=\"font-family: verdana, geneva;\"><b>Platform<\/b><\/span><strong><span style=\"font-family: verdana, geneva;\">\/Tools<\/span><\/strong><\/h1>\n<div>Nothing new here: my server is still Ubuntu 14.04.3 LTS, bacula 5.2.6, zabbix 2.2.6. The MySQL version is\u00a05.5.44.<\/div>\n<h2><strong><span style=\"font-family: verdana, geneva;\">Solution<\/span><\/strong><\/h2>\n<h2>Monitoring individual jobs<\/h2>\n<h3>Zabbix items<\/h3>\n<div>My plan is to monitor, for each job, the number of files processed, the number of bytes read, and the number of bytes written. 
So for each job our <em>zabbix<\/em> discovery rule will have 3 additional item prototypes. Below I provide only one screenshot; the other two are very similar:<\/div>\n<div><a href=\"http:\/\/www.bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot114.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-578\" src=\"http:\/\/www.bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot114.png\" alt=\"ScreenShot114\" width=\"451\" height=\"480\" srcset=\"https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot114.png 451w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot114-282x300.png 282w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot114-141x150.png 141w\" sizes=\"auto, (max-width: 451px) 100vw, 451px\" \/><\/a><\/div>\n<h3>Add new items to the wrapper script<\/h3>\n<div>The idea is simple: after we send the item reporting the job exit status to <em>zabbix<\/em>, we use the job name passed to our script with the -j option to query the bacula database and fetch some job statistics.\u00a0This looks a bit clumsy at first glance: our script needs to know where the database is located, as well as the user name and the password, which makes it less portable, but that&#8217;s the price.<\/div>\n<div>Here we go:<\/div>\n<pre class=\"brush: perl; title: ; notranslate\" title=\"\">\r\n...\r\nmy $MYSQL_SERVER = '127.0.0.1';\r\nmy $MYSQL_PORT = '3306';\r\nmy $MYSQL_USER = 'bacula';\r\nmy $MYSQL_PASSWORD = 'yRacJaj4Eujw';\r\nmy $DATABASE = 'bacula';\r\n...\r\nmy $JOB_FILES_KEY = 'bacula.job.sumFiles&#x5B;%s]';\r\nmy $JOB_BYTES_KEY = 'bacula.job.sumBytes&#x5B;%s]';\r\nmy $JOB_READ_BYTES_KEY = 'bacula.job.sumReadBytes&#x5B;%s]';\r\n...\r\nmy $dbh = DBI-&gt;connect(&quot;DBI:mysql:database=$DATABASE;host=$MYSQL_SERVER;port=$MYSQL_PORT&quot;,$MYSQL_USER, $MYSQL_PASSWORD,{'RaiseError' =&gt; 1});\r\n\r\n### Send statistics for individual job\r\nmy $sth = $dbh-&gt;prepare(&quot;SELECT 
JobFiles,JobBytes,ReadBytes, P.Name AS Pool, if(isnull(sum(`M`.`VolBytes`)),0,sum(`M`.`VolBytes`)) AS `PoolBytes`,P.NumVols FROM Job J, (`Pool` `P` left join `Media` `M` on((`P`.`PoolId` = `M`.`PoolId`))) WHERE J.PoolId = P.PoolId AND J.Job = ?;&quot;);\r\n$sth-&gt;execute($options{'j'}); #Job name in the form BackupCatalog.2015-02-01_23.10.00_03\r\nmy ($jobFiles,$jobBytes,$readBytes,$poolName,$poolBytes,$poolVols) = $sth-&gt;fetchrow_array(); #exactly one row expected\r\n$sth-&gt;finish();\r\n\r\n#Send for the job\r\nsystem(sprintf($zabbix_sender_cmd_line,sprintf($JOB_FILES_KEY,$options{'n'}),$jobFiles) . &quot; &gt;\/dev\/null&quot;);\r\nsystem(sprintf($zabbix_sender_cmd_line,sprintf($JOB_BYTES_KEY,$options{'n'}),$jobBytes) . &quot; &gt;\/dev\/null&quot;);\r\nsystem(sprintf($zabbix_sender_cmd_line,sprintf($JOB_READ_BYTES_KEY,$options{'n'}),$readBytes) . &quot; &gt;\/dev\/null&quot;);\r\n...\r\n$dbh-&gt;disconnect();\r\n...\r\n<\/pre>\n<div>Nothing special here: first we set the variables needed to access the MySQL database, then we specify the item keys that match the monitoring items we configured in <em>zabbix<\/em>. Please note that the SELECT statement is a bit more complex than strictly needed here, because it also fetches pool statistics; we will need these for further extensions.<\/div>\n<div>As the last step, we send the 3 monitoring items to <em>zabbix<\/em>.<\/div>\n<h3>Adding a trigger<\/h3>\n<div>We start with a simple trigger that checks the job return status. 
If it&#8217;s not OK, something is wrong:<\/div>\n<div><a href=\"http:\/\/www.bandidor.info\/wp\/wp-content\/uploads\/2015\/10\/ScreenShot117.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-590\" src=\"http:\/\/www.bandidor.info\/wp\/wp-content\/uploads\/2015\/10\/ScreenShot117.png\" alt=\"ScreenShot117\" width=\"558\" height=\"480\" srcset=\"https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/10\/ScreenShot117.png 558w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/10\/ScreenShot117-300x258.png 300w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/10\/ScreenShot117-174x150.png 174w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/10\/ScreenShot117-150x129.png 150w\" sizes=\"auto, (max-width: 558px) 100vw, 558px\" \/><\/a><\/div>\n<div>We add another trigger that checks the number of bytes written; if it is zero, most probably something is wrong. Still, there is a small possibility that, for example, an incremental job has not found any changed files, so this trigger has only Warning severity:<\/div>\n<div><a href=\"http:\/\/www.bandidor.info\/wp\/wp-content\/uploads\/2015\/10\/ScreenShot118.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-589\" src=\"http:\/\/www.bandidor.info\/wp\/wp-content\/uploads\/2015\/10\/ScreenShot118.png\" alt=\"ScreenShot118\" width=\"514\" height=\"479\" srcset=\"https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/10\/ScreenShot118.png 514w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/10\/ScreenShot118-300x280.png 300w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/10\/ScreenShot118-161x150.png 161w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/10\/ScreenShot118-150x140.png 150w\" sizes=\"auto, (max-width: 514px) 100vw, 514px\" \/><\/a><\/div>\n<div>\n<h3>Adding a graph<\/h3>\n<\/div>\n<div>The last step in adding prototypes for a job is to add a graph with the 
information about bytes read, bytes written and files processed. Graph configuration is absolutely straightforward, so I would rather provide a nice picture taken after some backup cycles. Below is a 3-month graph for one of the backup jobs.<\/div>\n<div><\/div>\n<div><a href=\"https:\/\/www.bandidor.info\/wp\/wp-content\/uploads\/2016\/04\/ScreenShot111.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-646\" src=\"https:\/\/www.bandidor.info\/wp\/wp-content\/uploads\/2016\/04\/ScreenShot111.png\" alt=\"ScreenShot111\" width=\"640\" height=\"280\" srcset=\"https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2016\/04\/ScreenShot111.png 640w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2016\/04\/ScreenShot111-300x131.png 300w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2016\/04\/ScreenShot111-250x109.png 250w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2016\/04\/ScreenShot111-150x66.png 150w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/a><\/div>\n<div><\/div>\n<h2>What about 24-hour statistics?<\/h2>\n<p>There is a very nice feature in the <em>webacula<\/em> tool, which I installed a long time ago: it shows some aggregated statistics for the last 24 hours. Although we could do some aggregation inside <em>zabbix<\/em>, I decided to create additional monitoring items and make my wrapper script also send aggregated statistics.<\/p>\n<h3>More zabbix items<\/h3>\n<p>Here I made my first mistake: I added the aggregated item prototypes to the <em>zabbix<\/em> discovery rule. This didn&#8217;t work because aggregated items are not bound to individual jobs, while zabbix discovery tries to create new items for every job discovered. 
So these items must be created as regular template items:<\/p>\n<p><a href=\"http:\/\/www.bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot115.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-583\" src=\"http:\/\/www.bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot115.png\" alt=\"ScreenShot115\" width=\"460\" height=\"479\" srcset=\"https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot115.png 460w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot115-288x300.png 288w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot115-144x150.png 144w\" sizes=\"auto, (max-width: 460px) 100vw, 460px\" \/><\/a><\/p>\n<h3>Extending the script<\/h3>\n<p>Same approach as before: we make sure that we use the proper keys for our new items and a\u00a0different\u00a0SQL command:<\/p>\n<pre class=\"brush: perl; title: ; notranslate\" title=\"\">\r\n\r\n...\r\nmy $SUM_FILES_KEY = 'bacula.24hr.sumFiles';\r\nmy $SUM_BYTES_KEY = 'bacula.24hr.sumBytes';\r\nmy $SUM_READ_BYTES_KEY = 'bacula.24hr.sumReadBytes';\r\n...\r\n### Send statistics for the entire installation for the last 24 hr\r\n# coalesce() turns NULL into 0 when no job started in the last 24 hr\r\n$sth = $dbh-&gt;prepare(&quot;SELECT coalesce(sum(`JobFiles`),0),coalesce(sum(`JobBytes`),0),coalesce(sum(`ReadBytes`),0) from `Job` where (`StartTime` &gt; (now() - interval 1 day));&quot;);\r\n$sth-&gt;execute(); #no placeholders, so no bind values\r\nmy ($jobFiles24,$jobBytes24,$readBytes24) = $sth-&gt;fetchrow_array(); #exactly one row expected\r\n$sth-&gt;finish();\r\n\r\n#These keys have no &#x5B;%s] parameter, so they are passed as they are\r\nsystem(sprintf($zabbix_sender_cmd_line,$SUM_FILES_KEY,$jobFiles24) . &quot; &gt;\/dev\/null&quot;);\r\nsystem(sprintf($zabbix_sender_cmd_line,$SUM_BYTES_KEY,$jobBytes24) . &quot; &gt;\/dev\/null&quot;);\r\nsystem(sprintf($zabbix_sender_cmd_line,$SUM_READ_BYTES_KEY,$readBytes24) . 
&quot; &gt;\/dev\/null&quot;);\r\n### Done with the 24-hr\r\n...\r\n\r\n<\/pre>\n<h3>Adding a trigger for 24 hr<\/h3>\n<div>Although my jobs run on different schedules, I expect to back up something every night. So I decided to add a trigger that raises an alarm if no backups have run in the last 24 hours. In fact, I&#8217;m checking the number of bytes written in the last 24 hours, and my trigger fires if this is zero. Remember, this is a template trigger and not a discovery rule trigger prototype:<\/div>\n<div><a href=\"http:\/\/www.bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot116.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-586\" src=\"http:\/\/www.bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot116.png\" alt=\"ScreenShot116\" width=\"640\" height=\"434\" srcset=\"https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot116.png 640w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot116-300x203.png 300w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot116-221x150.png 221w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2015\/09\/ScreenShot116-150x102.png 150w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/a><\/div>\n<div><\/div>\n<div>\n<h3>24 hr graph<\/h3>\n<\/div>\n<div>Having collected information for the last 24 hours, it is very simple to create a graph showing the number of bytes read, bytes written and files processed. Don&#8217;t forget, though, that this is not a graph prototype but a template graph. 
This is because we have only one graph per host rather than one per discovered job.<\/div>\n<div><\/div>\n<div><a href=\"https:\/\/www.bandidor.info\/wp\/wp-content\/uploads\/2016\/04\/ScreenShot112.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-648\" src=\"https:\/\/www.bandidor.info\/wp\/wp-content\/uploads\/2016\/04\/ScreenShot112.png\" alt=\"ScreenShot112\" width=\"640\" height=\"272\" srcset=\"https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2016\/04\/ScreenShot112.png 640w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2016\/04\/ScreenShot112-300x128.png 300w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2016\/04\/ScreenShot112-250x106.png 250w, https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2016\/04\/ScreenShot112-150x64.png 150w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/a><\/div>\n<div><\/div>\n<div><strong><span style=\"font-family: verdana, geneva;\">Discussion<\/span><\/strong><\/div>\n<p>Well, the graphs could have been sexier, but I&#8217;ll leave that to future articles.<\/p>\n<div><strong><span style=\"font-family: verdana, geneva;\">Caveats<\/span><\/strong><\/div>\n<div>None detected so far, but I must admit I&#8217;m publishing this article quite some time after I implemented this solution, and I don&#8217;t look at these graphs frequently.<\/div>\n<div><\/div>\n<div><\/div>\n","protected":false},"excerpt":{"rendered":"<p>Subject In my previous article I described only a very simple part of my monitoring solution for bacula: the focus was on discovering backup jobs, and only the job exit status was monitored. In this article, I&#8217;ll add more monitoring parameters and create some graphs. 
Symptoms The challenge here is that no further job parameters are&#8230;<\/p>\n","protected":false},"author":1,"featured_media":650,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"ngg_post_thumbnail":0,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[22,26],"tags":[31,27,32,17],"class_list":["post-571","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-bacula","category-zabbix","tag-backup","tag-bacula","tag-monitoring","tag-zabbix"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"https:\/\/bandidor.info\/wp\/wp-content\/uploads\/2016\/04\/2014-11-02-20.02.56.jpg","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p2EszU-9d","_links":{"self":[{"href":"https:\/\/www.bandidor.info\/wp\/index.php?rest_route=\/wp\/v2\/posts\/571","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.bandidor.info\/wp\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.bandidor.info\/wp\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.bandidor.info\/wp\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.bandidor.info\/wp\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=571"}],"version-history":[{"count":17,"href":"https:\/\/www.bandidor.info\/wp\/index.php?rest_route=\/wp\/v2\/posts\/571\/revisions"}],"predecessor-version":[{"id":652,"href":"https:\/\
/www.bandidor.info\/wp\/index.php?rest_route=\/wp\/v2\/posts\/571\/revisions\/652"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.bandidor.info\/wp\/index.php?rest_route=\/wp\/v2\/media\/650"}],"wp:attachment":[{"href":"https:\/\/www.bandidor.info\/wp\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=571"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.bandidor.info\/wp\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=571"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.bandidor.info\/wp\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=571"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}