tag:blogger.com,1999:blog-81383728916790345952024-02-07T10:09:38.780+01:00MonAMI at largePaul Millarhttp://www.blogger.com/profile/00579155910701290130noreply@blogger.comBlogger13125tag:blogger.com,1999:blog-8138372891679034595.post-56763206430512644532009-11-16T20:14:00.012+01:002009-11-16T21:14:22.658+01:00watching the ink dry<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://farm4.static.flickr.com/3376/3634460864_6fa394c5a6_d.jpg"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 284px; height: 190px;" src="http://farm4.static.flickr.com/3376/3634460864_6fa394c5a6_d.jpg" alt="" border="0" /></a>Yeah, it's been far too long since the last bit of news so here's an entry just to announce that MonAMI now has a new plugin: inklevel.<br /><br />This plugin is a simple wrapper around Markus Heinz's <a href="http://libinklevel.sourceforge.net/">libinklevel</a> library. This is a nice library that allows easy discovery of how much ink is left in those expensive ink cartridges.<br /><br />The library allows one to check the ink levels of Canon, Epson and HP printers. It can check printers directly attached (via the parallel port or USB port) or, for Canon printers, over the network via BJNP (a proprietary protocol that has been reverse engineered).<br /><br />libinklevel supports <a href="http://libinklevel.sourceforge.net/#supported">many different printers</a>, but not all of them. There's a small collection of printers that the library doesn't work with. There are also some printers that are listed neither as working nor as not working. 
If your printer isn't listed, please <a href="mailto:markus.heinz@uni-dortmund.de?subject=libinklevel">let Markus know</a> whether libinklevel works or not.<br /><br />Credit for the photo goes to Matthew (<a href="http://www.flickr.com/photos/purplemattfish/">purplemattfish</a>) for his picture <a href="http://www.flickr.com/photos/purplemattfish/3634460864/">CISS - Day 304 of Project 365</a>.Paul Millarhttp://www.blogger.com/profile/00579155910701290130noreply@blogger.com0tag:blogger.com,1999:blog-8138372891679034595.post-76699024942719595162008-08-11T23:35:00.005+02:002008-08-12T00:14:21.405+02:00Drawing graphs with graphite<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6Ayzi4-5sT0h0UiVpXhGJIeh_q7DwS-y4NKkKpNUu9IvpET4TJUcQfZLQ0CvzJtESVgzoB-hnoqLjlXkrVBH9XPTUWGqpSWE0x8QoM2zrQhOCDrULJFJxplA6dKt2ykUGgTuMTowpCeui/s1600-h/graphite.png"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6Ayzi4-5sT0h0UiVpXhGJIeh_q7DwS-y4NKkKpNUu9IvpET4TJUcQfZLQ0CvzJtESVgzoB-hnoqLjlXkrVBH9XPTUWGqpSWE0x8QoM2zrQhOCDrULJFJxplA6dKt2ykUGgTuMTowpCeui/s200/graphite.png" alt="" id="BLOGGER_PHOTO_ID_5233383392230209298" border="0" /></a>Work continues a-pace ... well kinda. I've added a new reporting plugin for a new monitoring system: Graphite [see <a href="https://launchpad.net/graphite">launch-pad</a> and <a href="http://graphite.wikidot.com/">wiki</a> sites]. 
If you've not heard, it is a funky new monitoring system that does away with the traditional RRDTool and does everything in python.<br /><br />There are two main components to Graphite: carbon and graphite.<br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjwIBVge0UGF9ZXi2ooA5cRlLwnKDbXfROTnlq8vYewg9Fn5m9DZHwUSvPBAG3SQO-CWO3sYXc4gIzm6zUwjBWR2Glv1imE2SRA57rbGKlWadZPwfQsocrXH-GCcISoLNth5QSFz3cZhyH5/s1600-h/graphite.png"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjwIBVge0UGF9ZXi2ooA5cRlLwnKDbXfROTnlq8vYewg9Fn5m9DZHwUSvPBAG3SQO-CWO3sYXc4gIzm6zUwjBWR2Glv1imE2SRA57rbGKlWadZPwfQsocrXH-GCcISoLNth5QSFz3cZhyH5/s200/graphite.png" alt="" id="BLOGGER_PHOTO_ID_5233383612274352066" border="0" /></a><br />Carbon is a recording daemon (in fact, a set of three daemons) that stores information efficiently on disk (using a custom format) and maintains a fast in-memory cache. Sending new metric values to the carbon agent is very simple.<br /><br />Graphite is a python web front-end that uses the <a href="http://www.djangoproject.com/">Django</a> framework and the <a href="http://www.extjs.com/">ExtJS</a> AJAX toolkit. Graph rendering is achieved using cairo (via python's cairo bindings). It's possible to run Graphite (Django) in stand-alone mode, but I guess most people will use mod_python and apache. Although there's a simple drag-and-drop compositor, the real power comes when using the CLI interface. There, each logged-in user can create their own custom graphs (multiple can be opened concurrently). These can be arranged on the screen and the resulting view saved for later recall.<br /><br />It's a bit of a faff to set up (although better with v0.9.3) and there are a few rough edges (again, better with v0.9.3). That said, it's already usable and the AJAX interface is pretty nice. 
It's early days, so I'm not sure where it will fit within the monitoring eco-system compared to established projects (e.g., ganglia, munin, cacti). I guess time will tell.<br /><br />Because of the way Graphite (and Carbon in particular) is designed, adding the MonAMI plugin to send it data is very easy. The code is now in CVS, ready for the next release. I've included a few screen shots that show the graph compositor.Paul Millarhttp://www.blogger.com/profile/00579155910701290130noreply@blogger.com0tag:blogger.com,1999:blog-8138372891679034595.post-58168707364818473592008-04-30T11:46:00.003+02:002009-11-16T21:13:42.781+01:00Trouble at MillWith some unfortunate timing, it looks like the "Axis of Openness" webpages (SourceForge, Slashdot, Freshmeat, ...) have gone for a burton. There seem to be some networking problems with these sites, with web traffic timing out. Assuming traceroute output is valid, the problem appears soon after traffic leaves the Santa Clara location of the Savvis network [dead router(s)?]<br /><br />This is a pain because we've just done the v0.10 release of MonAMI and both the website and the file download locations are hosted by SourceForge. Whilst SourceForge is down, no one can download MonAMI!<br /><br />If you're keen to try MonAMI, in the meantime, you can download the RPMs from the (rough and ready) dev. 
site:<br /><a href="http://monami.scotgrid.ac.uk/">http://monami.scotgrid.ac.uk/</a><br /><br />The above site is generously hosted by <a href="http://www.scotgrid.ac.uk/">the ScotGrid project</a> [their <a href="http://scotgrid.blogspot.com/">blog</a>].<br /><br />Thanks guys!Paul Millarhttp://www.blogger.com/profile/00579155910701290130noreply@blogger.com1tag:blogger.com,1999:blog-8138372891679034595.post-18069900968957115642008-04-28T10:56:00.003+02:002008-04-28T11:20:24.859+02:00Version 0.10 has left the buildingAfter many months of work, v0.10 has been tagged and source-/binary-RPMs and tar-balls are available.<br /><br />This is a major release with many enhancements to MonAMI. Perhaps the two improvements that top the list are:<br /><ul><li>adaptive monitoring,</li><li>writing monitoring data into a database.</li></ul>Some other note-worthy changes include:<br /><ul><li>New plugins:</li><ul><li>varnish (for monitoring a <a href="http://varnish.projects.linpro.no/">Varnish</a> server),</li><li>grmonitor (for reporting to <a href="http://users.actrix.co.nz/michael/grpage.html">gr_Monitor</a>, as <a href="http://monami-at-large.blogspot.com/2007/11/new-output-plugin-grmonitor.html">previously mentioned</a>),</li></ul><li>Updates to existing plugins:</li><ul><li>maui</li><ul><li>support for QoS (a maui term) monitoring added,<br /></li><li>added a timeout option (maui can take ages to reply sometimes).</li></ul><li>Torque</li><ul><li>better error handling (the library has a somewhat amusing way of reporting problems),</li><li>enforce thread-safety (some of the torque library API isn't thread-safe),</li></ul><li>Ganglia</li><ul><li>fixed gmond.conf parser,<br /></li><li>transmission now less bursty (reduces likelihood of overloading gmond)<br /></li><li>unicast support: sending data to just the one gmond; support for multiple gmonds (for failover in unicast deployments) is pencilled in for the next release.</li></ul><li>null</li><ul><li>adjustable time delay (useful when playing with 
adaptive monitoring)</li></ul><li>MySQL</li><ul><li>added per-Table monitoring statistics (also can now act as a reporting plugin).</li></ul></ul><li>Other changes:</li><ul><li>Added the "<a href="http://monami.sourceforge.net/tutorial/">MonAMI by Example</a>" tutorial (has been available from the web for a while)</li><li>MonAMI-core will use the recent history of a monitoring target's response time when estimating how long future requests will take. This uses quite a nice algorithm, which responds quickly to a service suddenly taking a longer time to respond, but isn't fooled if a service responds very quickly.</li><li>Added per-Thread CPU profiling. This is so that, if someone says "MonAMI is consuming vast amounts of CPU", we can figure out why.</li><li>Spring-clean of user-guide and tutorial: <span style="font-style: italic;">lots</span> of effort has gone into this, mostly in ensuring consistency in the typesetting. The document should look a lot nicer now and hopefully be easier to read.<br /></li></ul></ul>You can download MonAMI from the SourceForge page:<br /> <a href="http://sourceforge.net/project/showfiles.php?group_id=151885">http://sourceforge.net/project/showfiles.php?group_id=151885</a><br />or configure your YUM to download it automatically. 
Details are available here:<br /> <a href="http://monami.scotgrid.ac.uk/">http://monami.scotgrid.ac.uk/</a><br /><br />Enjoy!Paul Millarhttp://www.blogger.com/profile/00579155910701290130noreply@blogger.com0tag:blogger.com,1999:blog-8138372891679034595.post-32354355043658651192007-11-12T20:45:00.001+01:002008-08-12T00:15:47.682+02:00New output plugin: grmonitor<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjAoNc1gxc4BvHZvEOIaTYoWc5q9sp1ZSLGe3EJ6aMVN5ID_j5V7UX2O5ezvSUHO4_-mk5bJKYT01Kr_F98dJoI4bMjyatKfFsuQy5j26Ng8qC2SabDkjmqi0EjmPlkrD5OJ3UEIQ-4hho0/s1600-h/grmon-torque-cropped.png"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjAoNc1gxc4BvHZvEOIaTYoWc5q9sp1ZSLGe3EJ6aMVN5ID_j5V7UX2O5ezvSUHO4_-mk5bJKYT01Kr_F98dJoI4bMjyatKfFsuQy5j26Ng8qC2SabDkjmqi0EjmPlkrD5OJ3UEIQ-4hho0/s200/grmon-torque-cropped.png" alt="" id="BLOGGER_PHOTO_ID_5132043786747725442" border="0" /></a>Ladies and Gentlemen, MonAMI now has a new output plugin: grmonitor. This allows the latest version of <a href="http://users.actrix.co.nz/michael/grpage.html">gr_monitor</a> (available from the project's home page) to connect to MonAMI and fetch the data it then plots.<br /><br />gr_monitor plots data in 3D using an <a href="http://en.wikipedia.org/wiki/Opengl">OpenGL</a> library (e.g. the open-source <a href="http://www.mesa3d.org/">Mesa</a>). This allows you to pan around and see the live data from different points of view. On the right is a screen snapshot showing several Torque metrics.<br /><br />gr_monitor expects data in a series of regular n-by-m grids. This is quite different to how MonAMI sees data (a tree structure) so the configuration has to map between the two. 
This makes it slightly verbose, but I'm hoping to add a few tricks to improve this.Paul Millarhttp://www.blogger.com/profile/00579155910701290130noreply@blogger.com0tag:blogger.com,1999:blog-8138372891679034595.post-65643679527618231942007-11-12T19:49:00.000+01:002007-11-12T20:27:55.597+01:00Hands-on workshop at Imperial College, LondonThe recent <a href="http://hepwww.rl.ac.uk/sysman/">HEP-SysMan</a> workshop was <a href="http://hepwww.rl.ac.uk/sysman/oct2007/main.html">dedicated to monitoring</a>: what software is available and how to configure it. I was honoured and delighted to be asked to give a presentation on MonAMI.<br /><br />Well, given the meeting was a "workshop", I wanted to get people working! What better way than a hands-on tutorial: a step-by-step guide that walks you through increasingly complex examples.<br /><br />Pete and I had previously started something similar as a <a href="http://www.gridpp.ac.uk/">GridPP</a> wiki page; I wanted to convert this to <a href="http://en.wikipedia.org/wiki/DocBook">DocBook</a> so people had a good looking tutorial to work from. Since I wasn't too sure how long people would take, some extra material was added (e.g. using the MySQL plugin to save monitoring data). It took a surprisingly long time to get the tutorial into good shape, which is one of the reasons things have been so quiet recently.<br /><br />This also <span style="font-style: italic;">finally</span> forced me to figure out how to produce diagrams of datatrees. Thanks to <a href="http://www.graphviz.org/">GraphViz</a> and some XSLT, the tutorial sports some nice diagrams. (Just need to add some to the user-guide now!)<br /><br />The logistics were fun. Everyone needed their own environment to play with. Some people were able to use spare machines at their home institute, but the rest used some 20 virtual machines that Ewan MacMahon managed to throw together. Each VM had its own install of Torque, maui and MySQL. 
Big thanks to Ewan!<br /><br />Many people helped in getting this tutorial together. Mike Kenyon, Andrew Elwell, Caitriana Nicholson, Graeme Stewart and Tom Doherty (sorry if I've forgotten anyone!) all helped in proofreading and a big thanks also to Mona Aggarwal for organising the printed versions.<br /><br />The meeting went well and people were happy with what they were doing.<br />Paul Millarhttp://www.blogger.com/profile/00579155910701290130noreply@blogger.com0tag:blogger.com,1999:blog-8138372891679034595.post-36904570717178866642007-09-18T13:45:00.001+02:002008-08-12T00:16:32.262+02:00Storing monitoring data in a database? no problem!<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.mysql.com/common/logos/mysql_100x52-64.gif"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 200px;" src="http://www.mysql.com/common/logos/mysql_100x52-64.gif" alt="" border="0" /></a><a href="http://ganglia.sourceforge.net/">Ganglia</a> is a monitoring system that uses <a href="http://oss.oetiker.ch/rrdtool/">RRDTool</a> for its storage and graphs. This provides an excellent solution for monitoring, but suffers from data becoming less detailed ("averaged out") when you look further back in time. This is deliberate, but does make later analysis of the data difficult.<br /><br />If you wanted to keep detailed records of monitoring data with MonAMI that don't degrade over time, now you can: I've committed changes to the mysql plugin in CVS. In addition to monitoring a MySQL database, the plugin can now store information. 
You tell it which table to use and how to map the information into that table, and it does the rest; it'll even create the table if it doesn't exist.Paul Millarhttp://www.blogger.com/profile/00579155910701290130noreply@blogger.com2tag:blogger.com,1999:blog-8138372891679034595.post-49197872953110166792007-09-05T01:11:00.000+02:002007-11-13T10:30:57.365+01:00Greetings from CHEP 2007!<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjeW9Zf1Mg3tIm3EyWAKY4Hr9vpbautBrnK8idXFdraIbe1U0CO73f6Mlf141UqVJy2u36UXiCfO9Ii17-fSL4T8USgml0PFWkNzMSlCkztLFHG6So7d19N3xbAd-F8qXKMoFPS22RgQZB/s1600-h/crw_4150.jpg"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjeW9Zf1Mg3tIm3EyWAKY4Hr9vpbautBrnK8idXFdraIbe1U0CO73f6Mlf141UqVJy2u36UXiCfO9Ii17-fSL4T8USgml0PFWkNzMSlCkztLFHG6So7d19N3xbAd-F8qXKMoFPS22RgQZB/s200/crw_4150.jpg" alt="" id="BLOGGER_PHOTO_ID_5106848339588272994" border="0" /></a><br /><br />Greetings from Victoria!<br /><br />For those that thought things have been a bit quiet recently; well, yes, they have been. 
Recently, all my time has been spent preparing for the <a href="http://www.chep2007.com/">CHEP 2007</a> and <a href="http://www.allhands.org.uk/">All Hands 2007</a> conferences.<br /><br />CHEP has now started, with<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgeY0b5Zhq5WW0sKkXZTwJYEuOOpE3tY0CHjGqMJkHz2VgV1ks56R7-aDTiWxpc0HOoaFAx5xrw97LuXuQCF7-JtCpZR2UCIY_LZ9WGJLBuE6xRz7ciHSfilFbIy1kuXwZNPkeNSMgBXDl0/s1600-h/crw_4135.jpg"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgeY0b5Zhq5WW0sKkXZTwJYEuOOpE3tY0CHjGqMJkHz2VgV1ks56R7-aDTiWxpc0HOoaFAx5xrw97LuXuQCF7-JtCpZR2UCIY_LZ9WGJLBuE6xRz7ciHSfilFbIy1kuXwZNPkeNSMgBXDl0/s200/crw_4135.jpg" alt="" id="BLOGGER_PHOTO_ID_5106848193559384914" border="0" /></a> various GridPP people here. Graeme, Greig and I are giving a <a href="http://ppewww.physics.gla.ac.uk/%7Epaul/MonAMI/monami-chep2007.pdf">poster-presentation of MonAMI</a> at CHEP. The poster is deliberately "visual": I'm aiming to use it to talk people through the concepts, rather than providing a poster that has lots of text.<br /><br />For those interested, the poster was put together using <a href="http://www.inkscape.org/">Inkscape</a>: an <a href="http://en.wikipedia.org/wiki/Scalable_Vector_Graphics">SVG</a> editor. The whole poster is made up of SVG graphics, with the only exceptions being the GridPP backgrounds and University logos (which are, unfortunately, large bitmaps). Inkscape is a very powerful editor. 
If you are doing anything involving SVG, I would<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiae-a4Ygp_-wl46bE9YNdyOgnoz5K8tmGfjQKg0SkzOjH1BpJbMjqZWyl9J4lu0BhixBW29tPr_6Uo5irizCriOIFkOBMWC-SSINoDxEyq7rh1DYdwiAjMj7WiZFq4F1rpWUIzbZUVb9Z0/s1600-h/monami-chep2007.jpg"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiae-a4Ygp_-wl46bE9YNdyOgnoz5K8tmGfjQKg0SkzOjH1BpJbMjqZWyl9J4lu0BhixBW29tPr_6Uo5irizCriOIFkOBMWC-SSINoDxEyq7rh1DYdwiAjMj7WiZFq4F1rpWUIzbZUVb9Z0/s200/monami-chep2007.jpg" alt="" id="BLOGGER_PHOTO_ID_5106849361790489458" border="0" /></a> recommend Inkscape. Be sure to take the tutorials: they're both easy to follow and will <span style="font-style: italic;">greatly</span> increase your productivity.<br /><br />CHEP itself is an excellent conference. There are lots of people in the HEP computing field, often facing similar computational challenges. 
I'm looking forward to meeting more people during the poster sessions.Paul Millarhttp://www.blogger.com/profile/00579155910701290130noreply@blogger.com2tag:blogger.com,1999:blog-8138372891679034595.post-28438659049791010662007-09-05T00:18:00.000+02:002007-09-05T01:01:26.016+02:00Monitoring grid jobs by VO from the RBs' point-of-view.<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhiEV95dbF58ONXKMqt0QRblCD5hX_HllF-1VbkrnElBblsrzS4YUsQ8eWiogJN8p-GEp-wViUHPMvMA9-SlTUJM4qACy6bmP7L7l7AJntq6p7IZEbw9Bi_bFLpT1GC926FGrCXYVd2C9xd/s1600-h/Running_8_days.png"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhiEV95dbF58ONXKMqt0QRblCD5hX_HllF-1VbkrnElBblsrzS4YUsQ8eWiogJN8p-GEp-wViUHPMvMA9-SlTUJM4qACy6bmP7L7l7AJntq6p7IZEbw9Bi_bFLpT1GC926FGrCXYVd2C9xd/s320/Running_8_days.png" alt="" id="BLOGGER_PHOTO_ID_5106487729839133506" border="0" /></a><br />Gidon Moont (of 3D Real-Time Monitor fame) has come up with another monitoring tool. Using the data collected from all the WLCG Resource Brokers, graphs are generated that plot the number of jobs each VO has running and queued at your site. More information is available at the <a href="http://gridportal.hep.ph.ic.ac.uk/rtm/">Real Time Monitoring page</a> (the "GridLoad Graphs" section).<br /><br />What's particularly nice is he's included support for Google Gadgets. Gadgets, if you've not come across them, are small bits of web content wrapped up so they're easy to handle. 
You can add Gadgets to <a href="http://www.google.com/ig">iGoogle</a>, to <a href="http://desktop.google.com/plugins/">your desktop</a> or even <a href="http://www.google.com/ig/directory?synd=open">within your webpages</a>.<br /><br /><a href="http://monami.sourceforge.net/">MonAMI</a> includes <a href="http://monami-at-large.blogspot.com/2007/08/external-repository.html">a framework</a> that (amongst other things) extends <a href="http://www.ganglia.info/">Ganglia</a>'s default web front-end to include support for Gadgets (e.g. <a href="http://monami-at-large.blogspot.com/2007/08/external-repository.html">Glasgow's Torque</a> monitoring).<br /><br />So, with Google Gadgets, you can see your local batch system monitoring side-by-side with a per-VO view of what the Resource Brokers think your site is up to.Paul Millarhttp://www.blogger.com/profile/00579155910701290130noreply@blogger.com0tag:blogger.com,1999:blog-8138372891679034595.post-54583430809500450402007-08-24T13:29:00.001+02:002007-08-24T16:37:41.457+02:00The case of the missing metrics...<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgYRV_HdTG1RX6rdPe2HnxL5WBppKG8AJHB5y8vybFkM5cB_2ZGT8_cHP7KuLhBQI_GsNscnurUTJLJQNeyAeOeq3XWUgV-l-0-vqmhhaugs21xpuZmodgUG2qoM7FUqs_b_U5vSN5Ltw28/s1600-h/durham-maui-graph.php"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgYRV_HdTG1RX6rdPe2HnxL5WBppKG8AJHB5y8vybFkM5cB_2ZGT8_cHP7KuLhBQI_GsNscnurUTJLJQNeyAeOeq3XWUgV-l-0-vqmhhaugs21xpuZmodgUG2qoM7FUqs_b_U5vSN5Ltw28/s320/durham-maui-graph.php" alt="" id="BLOGGER_PHOTO_ID_5102228402181755698" border="0" /></a>Yesterday, I helped Phil install MonAMI on the Durham CE and update his Ganglia web front-end so it now has the nice graphs.<br /><br />However, we hit a snag: a few of the metrics "disappear" every so 
often. This is most likely happening because gmond is losing the UDP (multicast) metric update messages. After "a while" (the DMAX value), gmond assumes that the metric is no longer being monitored and purges it. The purged metrics no longer have their data written to the RRD file by gmetad, leaving a gap in the graph.<br /><br />When we encountered this with Glasgow it was caused by incoming UDP packets overflowing gmond's network buffer. The ganglia MonAMI plugin has a work-around: every 200 packets it will pause "a while" (100 ms by default). Looks like this isn't enough for Durham.<br /><br />The long term solution is for someone to fix gmond: it should either be multithreaded (to stop gmetad downloads from blocking metric updates) or accept data using a reliable transport (e.g. TCP).Paul Millarhttp://www.blogger.com/profile/00579155910701290130noreply@blogger.com0tag:blogger.com,1999:blog-8138372891679034595.post-50305457424583359452007-08-15T23:54:00.000+02:002007-08-16T00:21:05.098+02:00The "external" repositoryHow do you best configure Ganglia to work with MonAMI? What's a good Nagios configuration? MonAMI is designed to fit in with existing monitoring tools; but, sometimes those external tools need to be tweaked to get the best out of the available data.<br /><br />External is a collection of scripts, configuration hints, and similar "useful stuff". It's material not for MonAMI, but rather for the programs MonAMI communicates with (hence "external").<br /><br />The current focus has been on getting decent graphs within Ganglia. External has a framework for building RRDTool graphs, pie charts, and frames of related information. It also includes a fair number of examples showing how to use the framework. 
The torque and maui frames are excellent examples: see the <a href="http://svr031.gla.scotgrid.ac.uk/ganglia/?r=day&sg=no&c=Grid+Servers&h=svr016.gla.scotgrid.ac.uk">output from UKI-SCOTGRID-Glasgow</a>.<br /><br />For now, external is available as a CVS module (<a href="http://monami.cvs.sourceforge.net/monami/external/">browse</a>, <a href="http://sourceforge.net/cvs/?group_id=151885">instructions</a>).<br /><br />Enjoy!Paul Millarhttp://www.blogger.com/profile/00579155910701290130noreply@blogger.com3tag:blogger.com,1999:blog-8138372891679034595.post-80039574727367262512007-08-15T19:13:00.001+02:002007-08-15T19:25:02.761+02:00New release: v0.9Yes, finally release v0.9.<br /><br />This new version is the first to feature Torque and Maui monitoring plugins and includes a better Ganglia plugin.<br /><br />At the moment, the Torque plugin is limited to monitoring the number of jobs in each queue (and queue-group) and the efficiency (CPU time / wallclock time) in 5 bands (0%-20%, 20%-40%, etc).<br />I'm hoping to add support for asynchronous monitoring by watching the accounting log files. MonAMI already has a generic file-watcher component, so this should be fairly straightforward.<br /><br />The Maui plugin is quite primitive, compared to what it <span style="font-style: italic;">could</span> monitor. At the moment it's limited to providing just the fair-share information (still very useful!), but I'd guess there's more information that could be gathered.<br /><br />The ganglia plugin is now looking pretty nice. It has a dmax value (so ganglia will purge old metrics automatically) which is based on how long (in practice) it took to gather the data. So, if the computer slows down substantially, it'll carry on working.<br /><br />The plugin also has a number of work-arounds for problems with Ganglia. For example, when MonAMI is monitoring torque and maui, it can provide hundreds of metrics. 
If gmond (the ganglia daemon) doesn't consume them fast enough, some will be lost, so MonAMI pauses whilst sending the metrics, giving gmond time to catch up so that metrics aren't lost.Paul Millarhttp://www.blogger.com/profile/00579155910701290130noreply@blogger.com0tag:blogger.com,1999:blog-8138372891679034595.post-25938216065493160762007-08-15T18:50:00.000+02:002007-08-15T18:55:38.070+02:00It's aliveYes, here's a new blog about MonAMI; some news from the sharp end of monitoring. Promising news, rants, planned features and random ideas, all about the world of computer monitoring.Paul Millarhttp://www.blogger.com/profile/00579155910701290130noreply@blogger.com0