Adventures with Nagios and GlusterFS (monitoring and self-healing)

January 3rd, 2014

We recently added GlusterFS as the storage backend for glance and cinder on TryStack.org.

I’ve been working on setting up nagios monitoring for the cluster recently, and I finally got around to spending some time on making sure that Gluster is behaving. I also knew that I had some sync issues with Gluster, but I hadn’t spent much time on them because all my content seemed to be OK. Those sync issues got fixed along the way, so my nagios checks all came back happy.

First a bit of architecture: we have 3 hosts, each with a bunch of 550G drives in them. For now the gluster setup is version 3.4, with 3 peers and 2 bricks per peer. The one volume is configured with 3 replicas (I think I’m saying that correctly).
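
For reference, a layout like that would have come from a volume create command along these lines (the hostnames and brick paths match the ones you’ll see later in this post, but the command itself is a reconstruction, not my shell history):

gluster volume create trystack replica 3 \
    host13:/export/sdb1 host14:/export/sdb1 host15:/export/sdb1 \
    host13:/export/sdc1 host14:/export/sdc1 host15:/export/sdc1

With replica 3 and 6 bricks, gluster builds two replica sets of three, which is why the same files show up on all three hosts below.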

An initial google showed there were a couple of options for nagios scripts to monitor the cluster. I started with http://www.gluster.org/pipermail/gluster-users/2010-April/027316.html, which pointed to a git.gluster.org host that supposedly housed something called glfs-health.sh. Git.gluster.org just gives me an apache test page. Figuring it was git, I tried the same user name at github: https://github.com/avati/glfs-health. Bingo.

Unfortunately it didn’t work so well; no matter what I tried I could only get something to the effect of:

[root@host13 ~]# sh glfs-health.sh
Host unreachable
cat: /tmp/.glusterfs.pid.11992: No such file or directory
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec … or kill -l [sigspec]

Turns out that this was for older versions of GlusterFS and won’t work with gluster 3.4.

Next I tried this thread: http://gluster.org/pipermail/gluster-users/2012-June/010798.html

This one actually worked out of the box, and it reminded me that I had sync issues when it reported some files out of sync. I initially called it check_gluster.sh:

[root@host13 ~]# sh check_gluster.sh
check_gluster.sh CRITICAL peers: host14/ host15/ volumes: trystack/21 unsynchronized entries

I scanned through my google results to make sure there wasn’t anything else to peek at before moving forward with this, and came across an updated version of the script posted on the nagios exchange:

http://exchange.nagios.org/directory/Plugins/System-Metrics/File-System/GlusterFS-checks/details

After installing a couple dependencies (I installed the nagios-plugins rpm for the utils.sh plugin and the bc rpm) I got these results:

[root@host13 export]# /usr/lib64/nagios/plugins/check_glusterfs -v trystack -n 3
/usr/lib64/nagios/plugins/check_gluster: line 101: -2: substring expression < 0
WARNING: 15 unsynched entries; found 1 bricks, expected 3

unsynched… heh, I want to spell that unsynced, so I’ll correct that and fix line 101.
The syntax used on line 101 requires bash 4.2, and I’m running bash 4.1 on RHEL 6.5, so I’ll update it to a 4.1-compatible syntax.

[root@host1 files]# diff -u check_glusterfs_orig check_glusterfs
--- check_glusterfs_orig	2014-01-03 13:24:12.020577771 -0800
+++ check_glusterfs	2014-01-03 11:22:04.943621593 -0800
@@ -81,7 +81,7 @@
 	fi
 done
 if [ "$heal" -gt 0 ]; then
-	errors=("${errors[@]}" "$heal unsynched entries")
+	errors=("${errors[@]}" "$heal unsynced entries")
 fi

 # get volume status
@@ -98,7 +98,8 @@
 		key=${field[@]:0:3}
 		if [ "${key}" = "Disk Space Free" ]; then
 			freeunit=${field[@]:4}
-			free=${freeunit:0:-2}
+			free=${freeunit%'GB'}
 			unit=${freeunit#$free}
 			if [ "$unit" != "GB" ]; then
 				Exit UNKNOWN "unknown disk space size $freeunit"

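For anyone curious why line 101 blows up on 4.1: negative length arguments to ${var:offset:length} only arrived in bash 4.2, so 4.1 throws the “substring expression < 0” error shown above. A quick illustration (the value here is made up):

freeunit="540GB"
echo ${freeunit:0:-2}   # bash 4.2+: prints 540; bash 4.1: substring expression < 0
echo ${freeunit%GB}     # works on both: strips the trailing GB

With that patched, rerunning the plugin:
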
[root@host13 export]# /usr/lib64/nagios/plugins/check_glusterfs -v trystack -n 3
WARNING: 32 unsynced entries

That gets me a bit closer. Now to add this to nagios and figure out the unsynced entries. The nagios exchange page offers some sudo configs to give nagios the privileges to run the gluster commands. So next I made the sudo updates, added a service check into the nagios_service.cfg file for each host, and added an nrpe entry into each host’s nrpe.cfg. I actually did this in puppet, not directly in the files, but here’s the result in the nagios files:

nagios_service.cfg:

define service {
        check_command                  check_nrpe!check_glusterfs
        service_description            Gluster Server Health Check
        host_name                      10.100.0.13
        use                            generic-service
}

nrpe.cfg:

command[check_glusterfs]=/usr/lib64/nagios/plugins/check_glusterfs -v trystack -n 3
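
If you’re editing nrpe.cfg by hand rather than letting something like puppet manage it, remember that nrpe only reads its config at startup, so restart the daemon to pick up the new command (RHEL 6):

service nrpe restart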

When nagios ran the check I got the error “No Bricks Found”. Running the nrpe command from my nagios host confirms this:

[root@host1 trystack]# /usr/lib64/nagios/plugins/check_nrpe -H 10.100.0.13 -c check_glusterfs
CRITICAL: no bricks found

I wasted a good bit of time trying to figure this out. End result: the note on the nagios exchange page for this plugin didn’t address nrpe, it only referenced the nagios user. I had put my sudo configs in place for the user nagios, but when nrpe runs the check it runs as the user nrpe. So I updated my sudoers.d file:

[root@host13 export]# cat /etc/sudoers.d/nrpe
Defaults:nrpe !requiretty
nrpe ALL=(root) NOPASSWD:/usr/sbin/gluster volume status [[\:graph\:]]* detail,/usr/sbin/gluster volume heal [[\:graph\:]]* info
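
A quick way to verify the sudo rule without bouncing through nagios at all is to run the plugin as the nrpe user right on the gluster node (just a sanity check; the -s /bin/bash is there because the nrpe user typically ships with a nologin shell):

su -s /bin/bash -c '/usr/lib64/nagios/plugins/check_glusterfs -v trystack -n 3' nrpe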

So now let’s rerun the nrpe command from the nagios host to make sure it’s happy too:

[root@host1 trystack]# /usr/lib64/nagios/plugins/check_nrpe -H 10.100.0.13 -c check_glusterfs
WARNING: 32 unsynced entries

That looks better. On to figure out the sync issues.

I can’t say that I understand exactly what’s going on under the covers with gluster. I can tell you there are two places on each brick you can work with to sort out your sync issues: the content you see on the brick, and the .gluster directory. If you’re careful about it, you can fix things by just deleting content directly off the bricks and waiting for gluster to self-heal. Here’s what I did.

The script I just installed ran the volume heal info command to report sync issues, so I ran that by hand to see what it was spitting out:

[root@host13 export]# gluster volume heal trystack info
Gathering Heal info on volume trystack has been successful

Brick host13:/export/sdb1
Number of entries: 1
/

Brick host14:/export/sdb1
Number of entries: 1
/

Brick host15:/export/sdb1
Number of entries: 1
/

Brick host13:/export/sdc1
Number of entries: 4
/glance/images/5518ec29-7555-4632-88c7-76b81432c1c2
/glance/images/83d90cd4-180a-4c6d-893d-2cd0d3dd4d3b
/
/glance/images/d7f5ba96-c741-4dd0-9cf9-94fc607034f7

Brick host14:/export/sdc1
Number of entries: 4
/glance/images/83d90cd4-180a-4c6d-893d-2cd0d3dd4d3b
/glance/images/5518ec29-7555-4632-88c7-76b81432c1c2
/
/glance/images/d7f5ba96-c741-4dd0-9cf9-94fc607034f7

Brick host15:/export/sdc1
Number of entries: 4
/
/glance/images/d7f5ba96-c741-4dd0-9cf9-94fc607034f7
/glance/images/5518ec29-7555-4632-88c7-76b81432c1c2
/glance/images/83d90cd4-180a-4c6d-893d-2cd0d3dd4d3b

I googled for a bit and found a couple of things that referred to the brick content and each brick’s .gluster directory, as I just mentioned. It turns out the content is stored with a bunch of hard links that connect the files you see in the bricks to the files under each brick’s .gluster directory. The logs suggest deleting all but the version of the file you want in order to fix the sync, but they say nothing about this .gluster directory. It also turns out that if you delete both the content and the .gluster directory directly from the brick, gluster will rebuild them as part of its self-heal process. I treated host13 as the copy to rebuild from and hosts 14 and 15 as the ones to rebuild. So here we go:

**** BIG DISCLAIMER ****
I have no idea if this is recommended practice.
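
(Before deleting anything, one read-only way to see the hard-link relationship I mentioned is to check a file’s link count on the brick, using one of the image paths from the heal output above; for a healthy regular file you’d expect a count of 2, the visible path plus its hidden gluster-side link.)

stat -c %h /export/sdc1/glance/images/5518ec29-7555-4632-88c7-76b81432c1c2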

[root@host14 export]# cd sdb1
[root@host14 sdb1]# rm -rf *
[root@host14 sdb1]# rm -rf .gluster
[root@host14 sdb1]# cd ../sdc1
[root@host14 sdc1]# rm -rf *
[root@host14 sdc1]# rm -rf .gluster
[root@host15 export]# cd sdb1
[root@host15 sdb1]# rm -rf *
[root@host15 sdb1]# rm -rf .gluster
[root@host15 sdb1]# cd ../sdc1
[root@host15 sdc1]# rm -rf *
[root@host15 sdc1]# rm -rf .gluster
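
If you’re impatient, you can also give the self-heal a nudge rather than just waiting (this is the standard gluster heal command; I can’t say whether it’s strictly necessary here):

gluster volume heal trystack full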

After a little while (it takes time to self-heal) my heal info command looked like this:

[root@host13 sdb1]# gluster volume heal trystack info

Gathering Heal info on volume trystack has been successful

Brick host13:/export/sdb1
Number of entries: 1
/

Brick host14:/export/sdb1
Number of entries: 0

Brick host15:/export/sdb1
Number of entries: 0

Brick host13:/export/sdc1
Number of entries: 1
/

Brick host14:/export/sdc1
Number of entries: 0

Brick host15:/export/sdc1
Number of entries: 0

That looks a lot better, but there are still those weird root entries that say they’re out of sync. A quick scan over each brick’s content across the 3 hosts shows that the content matches. So I went ahead and destroyed my host13 bricks’ content too:

[root@host13 export]# cd sdb1
[root@host13 sdb1]# rm -rf *
[root@host13 sdb1]# rm -rf .gluster
[root@host13 sdb1]# cd ../sdc1
[root@host13 sdc1]# rm -rf *
[root@host13 sdc1]# rm -rf .gluster

A little more time passes and eventually my nagios check starts reporting happiness:

[root@host1 trystack]# /usr/lib64/nagios/plugins/check_nrpe -H 10.100.0.13 -c check_glusterfs
OK: 6 bricks; free space 540GB

So in summary… GlusterFS is pretty cool stuff so far. I’m not sure what I did was sanctioned, but it seemed to work. NRPE checks run as the nrpe user. Hope this helps save someone some time in the future.

I have more work to do: there are thresholds you can add to the nagios command to alert you when you’re running out of space, and the “-n 3” is a brick count that I’m not sure how to fit in yet. I have 6 bricks, used a 3, and didn’t get any complaints.
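
My guess, based on the earlier “found 1 bricks, expected 3” warning, is that -n is the number of bricks the plugin expects to find in the volume status output, so the stricter setting for this volume would probably be:

/usr/lib64/nagios/plugins/check_glusterfs -v trystack -n 6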

Tomorrow is just another day :)
