16 Feb 2008

Balcony server

At my last place, I had a dedicated room full of servers. It was a lovely mix of cra^Wold hardware running various flavors of Linux, BSD and Solaris. At my new place, we didn't have that much space, so I was forced to do a cleanup. I bought a powerful server with sufficient RAM, CPU and disk. Now I have one server and a whole bunch of virtual machines running on it. (Throw in a couple of Linksys devices running OpenWrt and DD-WRT and I was happy.) There was one "problem": the server had to be placed out on the balcony. It has been running out there for over a year now. How did that go?

When I started, I had two challenges: First, I had to build some kind of box to protect the machine from wind, rain and snow. Next, since we use the balcony a lot during summertime, the machine had to be fairly quiet.

Also, since the server is running at all times, I had to get some decent disks. I bought four "Western Digital Caviar RE2 500GB SATA2 16MB 7200RPM (WDC WD5000YS-01M)" drives, which have a pretty high MTBF. They've been running in RAID 5 and have not failed me yet.

Since it is a sunny balcony and it can get pretty hot during the summer, the box had to have some kind of ventilation. But the ventilation could not allow snow to drift into the box during winter. After some carpentry and a paint job, the box fit nicely into the corner of the balcony.





Neither drifting snow, wind nor rain has been a problem. A bigger problem has actually been pollen during spring and summer. The box and chassis get full of it and have to be cleaned at least once during the summer.

I often get questions about humidity - isn't that a problem? The answer is no. I've had no problem with it at all. But keep in mind that the server is running at all times. If I turned it off, waited until it cooled, and then turned it back on again, condensation could form, which can be catastrophic.

We all know that operating temperature is really important for hard drives, so I do get a little worried when it's really hot during the summer. I monitor the hard drives using Munin, and so far I've stayed within the temperature limits for the disks (5°C - 60°C).
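
If you want a quick check outside of Munin, the limits above can be sketched as a small shell helper. This is only an illustration: the name check_temp is made up for this example, and the reading you pass in has to come from whatever tool you trust.

```shell
#!/bin/sh
# Sketch: classify a drive temperature (whole degrees Celsius) against the
# 5-60 limits quoted above. check_temp is a hypothetical helper name;
# the optional 2nd and 3rd arguments override the default limits.
check_temp() {
    temp=$1
    min=${2:-5}
    max=${3:-60}

    # Accept only an integer with an optional leading minus sign.
    digits=${temp#-}
    case $digits in
        ''|*[!0-9]*) echo "unknown"; return 2 ;;
    esac

    if [ "$temp" -lt "$min" ]; then
        echo "too cold"; return 1
    elif [ "$temp" -gt "$max" ]; then
        echo "too hot"; return 1
    fi
    echo "ok"
}
```

Assuming hddtemp is installed, something like `check_temp "$(hddtemp -n /dev/sda)"` would feed it a live reading; the device name is just an example.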

During wintertime, the server is running happier (nice and cold) than ever:

15 Feb 2008

How to monitor Bind with Munin

Unix sysadmin and never heard of Munin? Good news for you: You have a great tool waiting. Munin monitors your servers, stores the results and generates pretty graphs for you to interpret. Munin itself is written in Perl, but uses plugins, written in the language of your choice, to fetch relevant data. The default install comes with a number of plugins that work out of the box - most of them written in Perl or shell. But some plugins, or services, require manual intervention to work. Bind is such a service, so let's see how we can monitor Bind with Munin.

I install Munin everywhere I can. It's a really helpful tool. Since I started using Munin (and Nagios), I've wondered how I ever managed without them. Munin gives you historical graphs and enables you to predict resource consumption trends: "Has memory usage increased during the last year? Is the amount of mail/spam increasing? What about CPU load? Network throughput?" etc.

Some time ago, I was at a customer site and installed Munin on a bunch of servers. The next day, the sysadmin called and thanked me. He finally knew why he had to reboot two of his Oracle servers every week. There was some kind of memory leak eating away all memory before the server crashed. He contacted Oracle to come up with a fix.

Another example: You arrive at work, and a server has crashed/rebooted/panicked during the night. Now, why did it do that? If you know why, perhaps you can prevent it from happening again. Munin can be of great help here: Check the graphs right before the crash - seeing anything unusual? Increase in network traffic? What about CPU load? Memory? Number of processes? It can give you a really good indication of what went wrong.

Munin does have some limitations. It does not scale well (to hundreds of servers) and I find it particularly painful to create aggregated graphs (for example an aggregated network graph of two or more hosts). But I know these issues are being worked on.

Okay, enough talk - let's monitor Bind:

First we need to enable logging. Create a log directory and add log directives to the Bind configuration file (here on Debian):

  # mkdir /var/log/bind9
  # chown bind:bind /var/log/bind9
  # cat /etc/bind/named.conf.options
  ...
  logging {
        channel b_log {
                file "/var/log/bind9/bind.log" versions 30 size 1m;
                print-time yes;
                print-category yes;
                print-severity yes;
                severity info;
        };

        channel b_debug {
                file "/var/log/bind9/debug.log" versions 2 size 1m;
                print-time yes;
                print-category yes;
                print-severity yes;
                severity dynamic;
        };

        channel b_query {
                file "/var/log/bind9/query.log" versions 2 size 1m;
                print-time yes;
                severity info;
        };

        category default { b_log; b_debug; };
        category config { b_log; b_debug; };
        category queries { b_query; };
  };

Restart bind:

  # /etc/init.d/bind9 restart
  Stopping domain name service: named.
  Starting domain name service: named.

You should now see log files being populated under /var/log/bind9/*

Next, configure Munin:

Make sure the Munin user ("munin") can read your bind log files.
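
One way to verify this is a small readability check, sketched below. The function name unreadable_files is made up for this example, and it tests the user it runs as, so it has to be run as "munin" to say anything useful.

```shell
#!/bin/sh
# Sketch: report which of the given files the *current* user cannot read.
# unreadable_files is a hypothetical helper name. A file also counts as
# unreadable when a parent directory cannot be traversed.
unreadable_files() {
    status=0
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "cannot read: $f"
            status=1
        fi
    done
    return $status
}
```

Saved as a script with a final line such as `unreadable_files /var/log/bind9/*.log`, it can be run under the "munin" account (for example with sudo -u munin, assuming sudo is set up). No output and exit status 0 mean the logs are readable.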

We need two additional plugins: "bind" and "bind9_rndc". If you can't find them in your default install, head over here.

The "bind" plugin should work right away. "bind9_rndc", however, needs to read the "rndc.key" file, which is readable only by the user "bind". You have two options: either run the plugin as root, or add the user "munin" to the group "bind" and allow the group "bind" to read the rndc.key file. For the sake of simplicity, I run the plugin as root here. So you need to add:

  # cat /etc/munin/plugin-conf.d/munin-node
  ...
  [bind9_rndc]
  user root
  env.querystats /var/log/bind9/named.stats
  ...

Next restart Munin:

  # /etc/init.d/munin-node restart
  Stopping munin-node: done.
  Starting munin-node: done.

Munin runs every five minutes, so go grab a coffee and wait.
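
If you'd rather not wait, a plugin can also be run by hand with munin-run and its output given a quick sanity check. The helper below is only a sketch: check_plugin_output is a made-up name, and it assumes the usual fetch format of one "<field>.value <number>" pair per line, where "U" means unknown.

```shell
#!/bin/sh
# Sketch: sanity-check plugin fetch output read from stdin.
# check_plugin_output is a hypothetical helper name.
# It flags every ".value" line whose value is not a plain number and
# fails when there are no values at all or any suspect ones.
check_plugin_output() {
    awk '
        $1 ~ /\.value$/ {
            seen++
            if ($2 !~ /^-?[0-9]+(\.[0-9]+)?$/) {
                printf "suspect: %s\n", $0
                bad++
            }
        }
        END {
            printf "%d values, %d suspect\n", seen, bad
            exit (seen == 0 || bad > 0)
        }
    '
}
```

Run as root, `munin-run bind9_rndc | check_plugin_output` should then print a one-line summary. A run of "U" values usually means the plugin could not read its input, which points back at the permissions above.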

After a while, graphs arrive:

And the bind9_rndc plugin:

(Consult the "BIND 9 Administrator Reference Manual" if you have trouble interpreting the results.)

Nice huh?

2 Feb 2008

Linux and Logitech QuickCam Pro 9000

I've been on the lookout for a decent webcam. After some searching, the choice fell on the Logitech QuickCam Pro 9000, which should be supported according to the Linux UVC driver page. It's not one of the cheaper models, but not the most expensive either. It also has "HD-quality" (which in this case translates to resolutions up to 1600x1200). So how does this camera work under Linux?

My first thought after unwrapping was "Is that it?". It was smaller than I had anticipated. But when it comes to webcams, smaller is better I guess.

Ubuntu 7.10 (i386) ships with UVC drivers, but they are too old. So we install new ones from trunk:

(Update! This webcam works out of the box on Ubuntu 8.04)

  $ svn checkout svn://svn.berlios.de/linux-uvc/linux-uvc/trunk
  $ cd trunk
  $ make
  $ sudo make install

When we now plug in the camera, it's detected properly:

  $ dmesg
  ...
  [14323.676000] usb 5-1: new high speed USB device using ehci_hcd and address 7
  [14323.932000] usb 5-1: configuration #1 chosen from 1 choice
  [14324.056000] Linux video capture interface: v2.00
  [14324.168000] usbcore: registered new interface driver snd-usb-audio
  [14324.180000] uvcvideo: Found UVC 1.00 device  (046d:0990)
  [14324.196000] usbcore: registered new interface driver uvcvideo
  [14324.200000] USB Video Class driver (v0.1.0)

  $ lsusb
  ...
  Bus 005 Device 007: ID 046d:0990 Logitech, Inc.

We see the modules are loaded:

  $ lsmod | grep uvc
  uvcvideo               48644  0
  compat_ioctl32          2304  1 uvcvideo
  videodev               29312  1 uvcvideo
  v4l1_compat            15364  2 uvcvideo,videodev
  v4l2_common            18432  2 uvcvideo,videodev
  usbcore               138632  10 snd_usb_audio,uvcvideo,snd_usb_lib,hci_usb,appleir,xpad,usbhid,ehci_hcd,uhci_hcd

The camera also has a built-in microphone, which is detected and works (card 1 here):

  $ cat /proc/asound/cards
   0 [Intel          ]: HDA-Intel - HDA Intel
                        HDA Intel at 0x90440000 irq 21
   1 [U0x46d0x990    ]: USB-Audio - USB Device 0x46d:0x990
                        USB Device 0x46d:0x990 at usb-0000:00:1d.7-1, high speed

Time for testing!

A capable webcam viewer is luvcview. It can take snapshots (photos), record video (avi), change resolution etc. We download and install luvcview from here.

One nice feature is to list all supported resolutions:

  $ luvcview -L
  luvcview version 0.2.1
  Video driver: x11
  A window manager is available
  video /dev/video0
  /dev/video0 does not support read i/o
  { pixelformat = 'MJPG', description = 'MJPEG' }
  { discrete: width = 160, height = 120 }
          Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5,
  { discrete: width = 176, height = 144 }
          Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5,
  { discrete: width = 320, height = 240 }
          Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5,
  { discrete: width = 352, height = 288 }
          Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5,
  { discrete: width = 640, height = 480 }
          Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5,
  { discrete: width = 800, height = 600 }
          Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5,
  { discrete: width = 960, height = 720 }
          Time interval between frame: 1/15, 1/10, 1/5,
  { pixelformat = 'YUYV', description = 'YUV 4:2:2 (YUYV)' }
  { discrete: width = 160, height = 120 }
          Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5,
  { discrete: width = 176, height = 144 }
          Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5,
  { discrete: width = 320, height = 240 }
          Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5,
  { discrete: width = 352, height = 288 }
          Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5,
  { discrete: width = 640, height = 480 }
          Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5,
  { discrete: width = 800, height = 600 }
          Time interval between frame: 1/25, 1/20, 1/15, 1/10, 1/5,
  { discrete: width = 960, height = 720 }
          Time interval between frame: 1/10, 1/5,
  { discrete: width = 1600, height = 1200 }
          Time interval between frame: 1/5,
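
A listing like this can also be filtered to find the largest resolution that still supports a given frame interval. The awk-based helper below is a rough sketch: best_resolution is a made-up name, and it assumes the output has exactly the layout shown above.

```shell
#!/bin/sh
# Sketch: read "luvcview -L" style output on stdin and print the largest
# WIDTHxHEIGHT for pixel format $1 that lists frame interval $2.
# best_resolution is a hypothetical helper name.
best_resolution() {
    awk -v fmt="$1" -v rate="$2" -v q="'" '
        # A pixelformat line switches the section we are looking at.
        /pixelformat/ { active = (index($0, q fmt q) > 0); next }

        # Remember the most recent resolution in the active section.
        active && /discrete:/ {
            gsub(/[^0-9]+/, " ")   # leaves e.g. " 960 720 "
            w = $1; h = $2
            next
        }

        # Intervals are comma terminated, e.g. " 1/30, 1/25,".
        active && /Time interval/ {
            if (index($0, " " rate ",") > 0 && w * h > best) {
                best = w * h
                res = w "x" h
            }
        }

        END {
            if (res == "") exit 1
            print res
        }
    '
}
```

With the listing above, `luvcview -L 2>&1 | best_resolution MJPG 1/30` should print 800x600, since 960x720 only goes up to 1/15 in MJPG. The 2>&1 is there only because I have not checked which stream luvcview writes the listing to.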

1600x1200 is bigger than my screen here, so 960x720 will have to do. I had to disable SDL hardware acceleration to use resolutions above 800x600, or else luvcview crashed:

  $ luvcview -w -s 960x720

The colors look good, it adapts well to light and I've had no stability issues (yet). The camera also works with ekiga (gnomemeeting):

Kopete:

And Skype (2.0 beta) (the microphone also works):