Saturday, August 30, 2008

An introduction to Puppet (config management)

Puppet is a configuration management tool and more. If you have the same configuration, set of packages, or simply files that you'd like to roll out to multiple machines, Puppet is bound to make your life easier.

If it's fewer than half a dozen machines, you can likely get away with clusterssh, which lets you control multiple machines at once via ssh. But if you have more, or you want a more elegant and centralized way of managing configuration, you want Puppet. Yes, there's also cfengine, but Puppet is said to be more flexible. I can't comment on that, since I've only used cfengine briefly and thought it was too complicated to be worth it. Having said that, Puppet has a fairly steep learning curve as well.

Puppet has a client-server architecture. The client is "puppet" and the server is "puppetmaster". Installing puppetmaster will automagically install puppet on the same host. On the other hosts that you want to control from your main puppetmaster host, install just the puppet package.

By default, puppet clients expect their master to be called "puppet" in DNS, but you can change this. If you plan to have multiple puppetmasters (for whatever reason, such as separate networks or sets of clients), it's probably a good idea to change this (see below for how to do that). Having said that, the puppet system is clever enough that it won't just start changing things on any client that you name on the puppetmaster. In fact, it's the clients that poll the server for changes, and a client will only apply a change to itself if it has exchanged keys with the server beforehand.


So how do I get the clients to talk to the master?

On each client do:

puppetd --server yourpuppetmaster --waitforcert 60 --test

The puppetmaster can list which clients have asked to be controlled by it:

puppetca --list

Finally, if the server wants to control that client, it should sign the certificate that the client requested in the previous step:

puppetca --sign puppetclientname

Note that the puppet client on the puppetmaster server itself is already authorized, and doesn't need to go through the above steps.

Ok, so let's test it

Let's first try creating a file. Puppet can push out existing files, but it can also create new ones. For this first example, we'll try the latter.

You put the configs in /etc/puppet/manifests, and by default puppet expects there to be a file called "site.pp". You can split up your configs into other files in the same directory and then pull them in from site.pp, but we'll do that later. For now, just add this to your site.pp file (which you'll create):
# Create "/tmp/testfile" if it doesn't exist.
class test_class {
    file { "/tmp/testfile":
        ensure => present,
        mode   => 644,
        owner  => root,
        group  => root
    }
}

# Tell puppet on which client to run the class
node yourpuppetclient {    # this is the name of one of your puppet clients
    include test_class
}
 
Here's another simple example, this time for running a script.
Notice the "require" statement (commented out here), which is where Puppet's power lies.
class test2 {
    exec { "my test program":
        cwd     => "/var/tmp",
        command => "/var/tmp/test.sh",
        alias   => "testscript",
        # require => User['tibor'],   # require that the user "tibor" exists before running the script
    }
}
# And then specify which client to apply it to.
# (A node can only be defined once, so if you kept the node block from
# the first example, add the include line there instead.)

node yourpuppetclient { include test2 }

So when will the changes be applied?

By default, puppet applies its changes every 30 minutes. If you want to apply an update manually, you can run

puppetd -o -v

Changing the puppet master name from the default "puppet"

This is optional. In /etc/puppet/puppet.conf on each client, add
[puppetd]
server=yourpuppetmasterserver

and on the server only, under the [puppetmasterd] section, add
certname=yourpuppetmasterserver
 
To make sure this post is not too overwhelming, I'll stop here. In the next post about Puppet, I'll include some more complex examples to show its power.
 
-T  

Thursday, August 7, 2008

OpenID and Googlepedia

Googlepedia is a Firefox extension that combines Google search results with the Wikipedia page for that specific search item. How does it do that? It creates a second window pane on the right of your Google result page that contains the Wikipedia article for your search string. And if you navigate the Wikipedia links, it will take those links and Google search them for you. If it gets in your way, you can hide it. I've found it quite useful, as I'm often switching between the two sites.

I've been starting to see OpenID login options on several websites, and always wondered what it was. So I thought I'd try it out. But first, what is it? It's an easier way to log in without the pain of having to remember multiple usernames and passwords. It's also decentralized and free.

Let's say you have a Yahoo account, and you want to post a comment on Blogger (Google's site). By default only people with Google accounts can post, or the blog owner has the choice of opening up comments to anyone, which is just asking for spam trouble.

Enter OpenID. Instead of having to create a new Google account, you enter your OpenID, which is a URL (that you sign up for at the OpenID provider). The site then takes you back to log in to your Yahoo account, which asks you if you want to log in to the new site, and then proceeds. One important detail here is that you can tell the OpenID provider to remember that you've OK'd a certain site, so it doesn't keep prompting you.

And then you're authenticated to the Blogger site and can post your comment. It is all done over SSL, so it's encrypted, and your password is not sent between the two sites, only an authentication token. Clever, aye?

Or, let's say you have a SourceForge account with a unique username and password that you can never remember. Use their new OpenID login instead. The first time you use it, you'll need to log in to the actual SourceForge account using your username and password (to link the two), but after that you can always just log in with the URL (which, again, will prompt you to log in if you're not already logged in to your OpenID provider).

So how do you get an OpenID? From an OpenID provider, or if you have your own server, you can become your own OpenID provider. If you have a Google account, then you already have an OpenID: it's the URL of your blog site, although you'll need to use the beta draft.blogspot.com as your dashboard to enable it for your blog. Yahoo's OpenID site is openid.yahoo.com. For theirs, you go through a couple of steps to create one, but you can make it a custom one (i.e. me.yahoo.com/whateveryouwant_here_that's_not_already_taken). I only mention these two because I have accounts with them. Here's a more complete list of OpenID providers:
http://openid.net/get/

So OpenID is a great idea, but it's just starting to catch on. Some people argue that the password manager within a browser already does what OpenID is attempting to do (i.e. save people from having to remember lots of different passwords). That's true, but OpenID also works if you're away from your usual computer and don't have your saved passwords handy. It also doesn't stop blog spammers, it just slows them down.

I believe the idea will catch on as more and more websites start using it. The extent to which one site will trust another, especially a competitor's OpenID provider, will likely (and sadly) always be limited. A nice exception here is SourceForge, although it appears to be limited in which OpenID providers it will accept.

As a final note, Drupal (a popular CMS application) now has support for OpenID logins, and the OpenID project is offering a $5000 bounty to other projects that implement it. Nice.
-T

Saturday, August 2, 2008

openldap sync replication instead of slurpd

syncrepl is a newer replication mode, first introduced in OpenLDAP 2.2, and the only one available in 2.4, where slurpd has been removed. So if you're running Etch (which ships OpenLDAP 2.3), you can use both methods, side by side even.

So why would you want to use it (besides the fact that slurpd will be gone in Lenny)? Well, it provides a smarter way of replicating, starting with the fact that your replica can start out completely empty, so no more having to copy DBs to slaves. Also, no more having to restart the master or add config changes when you want to set up a new slave. And replication is reportedly more reliable (which I'm keen to see).

There are a couple of concepts in syncrepl that may be confusing at first. First, the "master" is called the "provider" and the slaves are called "consumers". Secondly, the basic setup of syncrepl (called refreshOnly) is pull-based replication: the consumer pulls updates from the provider.

So let's say you already have an ldap master configured, and your slaves are configured with the old slurpd replication. How do you start to migrate? In this example, we'll set up a new slave that will use syncrepl. It assumes you already have a replication user that has full read access to the master (you should have this if you use slurpd). It also assumes that you have the directive "lastmod on" enabled on your master. By default it is on, but to get replication working between etch and sarge ldap instances you may have turned it off. So if you still have sarge boxes in your replica chain, then stop now, otherwise you'll break them :)

First add the following 4 lines to your master:
#Under the Global Directives section
moduleload syncprov.la
#Under your Database definition
overlay syncprov
syncprov-checkpoint 100 10
syncprov-sessionlog 100
--------------------------------------------------
Unlike with slurpd replication, you don't define the new slave on the master.
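
As I understand the syncprov overlay documentation, "syncprov-checkpoint 100 10" tells the provider to write a checkpoint once 100 write operations or 10 minutes have passed since the last one, whichever comes first. Here's a small Python sketch of that rule. The function and parameter names are made up for illustration, and the exact boundary behaviour inside slapd may differ.

```python
def checkpoint_due(ops_since_last, minutes_since_last, max_ops=100, max_minutes=10):
    """Rough model of 'syncprov-checkpoint <ops> <minutes>'.

    Hypothetical helper, not OpenLDAP code: a checkpoint is due when
    either threshold has been reached since the last checkpoint.
    """
    return ops_since_last >= max_ops or minutes_since_last >= max_minutes


# 40 writes in 3 minutes: nothing to do yet
print(checkpoint_due(40, 3))
# only 5 writes, but 12 minutes have passed: time for a checkpoint
print(checkpoint_due(5, 12))
```

So on a quiet directory the time limit does the work, and on a busy one the operation count does.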

On the slave, copy the slapd.conf from the master (minus the replica & replogfile lines), and make sure your slave has all the same schemas (in /etc/ldap/schema) that your master does. Then add the following 12 lines to your new slave. The continuation lines of the syncrepl directive must start with whitespace, and the notes are kept in comment lines above the directive so they don't end up inside it.
#Under the database definition
# rid: identification number for this syncrepl directive on the consumer, max 3 digits long
# provider: your master or rather "provider" ldap server
# type: refreshOnly, since we want pull-based to start with
# interval: schedule a replication event every 5 minutes (dd:hh:mm:ss)
# searchbase: your search base
# filter: get all elements
# schemachecking: ensure schema is not violated
# bindmethod: authentication method
# binddn and credentials: your replication user and its password
syncrepl rid=1
    provider=ldap://ldap
    type=refreshOnly
    interval=00:00:05:00
    searchbase="dc=example,dc=com"
    filter="(objectClass=*)"
    attrs="*"
    scope=sub
    schemachecking=on
    bindmethod=simple
    binddn="cn=replica,dc=example,dc=com"
    credentials="secret"

Now simply restart your slave and watch /var/lib/ldap grow as the data is pulled from the master. Beautiful, aye? If you don't particularly like the 5 minute wait, you can decrease that value, or look at setting up the refreshAndPersist replication "type". I haven't tried that yet, so I can't comment on it.
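
The interval value is in dd:hh:mm:ss format, so 00:00:05:00 is five minutes. If you'd rather think in seconds, here's a small Python sketch that builds the string for you. The function name is my own, it's not part of OpenLDAP.

```python
def syncrepl_interval(total_seconds):
    """Format a number of seconds as a syncrepl interval (dd:hh:mm:ss)."""
    if total_seconds < 0:
        raise ValueError("interval must not be negative")
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    return "%02d:%02d:%02d:%02d" % (days, hours, minutes, seconds)


# The 5 minute interval used in the config above
print(syncrepl_interval(300))
# One minute, if 5 minutes is too long a wait
print(syncrepl_interval(60))
```

Paste the printed value after "interval=" in the syncrepl directive.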

-T

Thursday, July 31, 2008

Splatd, the glue between LDAP and your home directory

LDAP is awesome for central authentication, and even for more advanced things like mail routing and database info. But there are some things that it doesn't handle, like creating and later cleaning up and archiving user home directories, or easily pushing out authorized_keys files for ssh. This is where splatd comes in.

Splatd can create home directories based on criteria that it gathers from ldap (such as minimum and maximum uidNumber), copy your authorized_keys file from ldap, handle .forward files for users (again gathered from ldap), and finally archive and later delete home directories for users, based on the criteria that you specify.

Unfortunately splatd doesn't have a Debian (etch) package, but it's fairly painless to install it from source and then take the config and init script from an Ubuntu package. The only thing to adjust in the init script is the location of the binary, and away you go. You can tell it how often to query ldap for updates and apply its changes (the default is 10 minutes).

Update: To get authorized_keys working, you'll need to copy ooo.schema and ooossh.schema to /etc/ldap/schema on all your ldap instances, which allows you to set the sshAccount objectClass, and under that sshPublicKey. You can have multiple public keys.

In my tests it worked very nicely, and I really liked how easy the config file was. I'm pretty sure all of these actions could be done by something like Puppet (which I'll be blogging about next week), but splatd made it easy.

Update: Speaking of ldap, it appears that slurpd replication no longer works in 2.4 (I'm guessing that means Debian Lenny), so I'll also be investigating changing that to the new "syncrepl" replication.
-T

Thursday, July 24, 2008

Positive Stress

When is stress good? When it's a .deb package :) What does it do?

It allows you to put the CPU, memory, hard disk, or i/o systems (or all at once if you want) into a loop so you can do stress testing on your system. Why would you want to do that? Well, to see how your application performs under load, or to identify a bad piece of hardware. Some examples:

Run a CPU test for 30 seconds
stress -c 10 --timeout 30s

Run a memory test for 60 minutes
stress -m 10 --timeout 60m

Run a combined test for 2 days:
stress -m 10 -c 5 -d 2 -i 9 --timeout 2d

Notice how you can specify the number of "hogs" (love that term) for each subsystem.

Be careful: the disk test (-d) will write files and may even fill up your disk (if it's small). That happened to me, but stress was smart about it: it quickly removed its temp files and exited with an error to let me know what happened.

Also, it goes without saying: watch the load on your system and your logfiles to make sure you haven't DoS-ed any of your services. Of course you shouldn't run this outside a scheduled maintenance window, right? :)
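
If you end up running the same tests regularly, the flags above are easy to build from a script. Here's a minimal Python sketch that assembles the command line from a hog count per subsystem. The function name and its keyword arguments are my own invention, and only the flags shown in the examples above are used.

```python
def stress_command(cpu=0, memory=0, disk=0, io=0, timeout=None):
    """Build the argument list for a stress run from hog counts per subsystem."""
    args = ["stress"]
    for flag, hogs in (("-c", cpu), ("-m", memory), ("-d", disk), ("-i", io)):
        if hogs > 0:
            args += [flag, str(hogs)]
    if len(args) == 1:
        raise ValueError("ask for at least one hog")
    if timeout:
        args += ["--timeout", timeout]
    return args


# The 30 second CPU test from above
print(" ".join(stress_command(cpu=10, timeout="30s")))
```

The function only builds the list. To actually run it you could hand the list to subprocess.call(), with the same maintenance window caveat as above.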

Tuesday, July 1, 2008

vnstat - daily network statistics from the CLI

I found vnstat a few days ago, when I was researching netflow monitors for Cacti. Cacti is great for providing a visual display of almost anything that you can query through SNMP, which, given the extensibility of SNMP, can be the numeric output of any script over time.

Sometimes, though, it's nice to have a CLI tool that can provide both an active and a historical view of traffic on an interface. It would also be an added benefit if you didn't _have_ to be root, and more importantly didn't need to sniff the network interface (which is usually quite CPU/memory intensive). vnstat fills this requirement very nicely, and it is a (k)ubuntu package, so just apt-get install it.

After you install it, you need to run
vnstat -u -i eth0 (or eth1, or whatever interface you want to monitor)

It's possible to monitor multiple interfaces.

Then wait a while for it to gather some data (it reads /proc, btw), and then you can have it report by hour (-h), by day (-d), by month (-m), or top 10 (-t):
Example: vnstat -h -i ath0

 ath0                                                                   22:56
 ^                                                                       r
 |                                                                       r
 |                                                                       r
 |                                                                       r
 |                                                                       r
 |                                                                       r
 |                                                                       r
 |                                                                       rt
 |                                                                       rt
 |                                                                       rt
-+--------------------------------------------------------------------------->
 |  23 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22

 h   rx (kB)   tx (kB)      h   rx (kB)   tx (kB)      h   rx (kB)   tx (kB)
23         0         0     07         0         0     15         0         0
00         0         0     08         0         0     16         0         0
01         0         0     09         0         0     17         0         0
02         0         0     10         0         0     18         0         0
03         0         0     11         0         0     19         0         0
04         0         0     12         0         0     20         0         0
05         0         0     13         0         0     21         0         0
06         0         0     14         0         0     22         3         1
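
The hourly table packs three (hour, rx, tx) triples onto each row, which is compact on screen but awkward to feed into anything else. Here's a minimal Python sketch that unpacks rows like the ones above into a dictionary. The function name is mine, and it assumes the plain kB layout shown here, which may not match what other vnstat versions print.

```python
def parse_hourly_rows(lines):
    """Turn rows like '06 0 0 14 0 0 22 3 1' into {hour: (rx_kB, tx_kB)}."""
    traffic = {}
    for line in lines:
        fields = line.split()
        # Skip blank lines, the header row, and anything that is not all digits.
        if not fields or len(fields) % 3 != 0:
            continue
        if not all(field.isdigit() for field in fields):
            continue
        for i in range(0, len(fields), 3):
            hour, rx, tx = (int(field) for field in fields[i:i + 3])
            traffic[hour] = (rx, tx)
    return traffic


sample = [
    "h rx (kB) tx (kB) h rx (kB) tx (kB) h rx (kB) tx (kB)",
    "05 0 0 13 0 0 21 0 0",
    "06 0 0 14 0 0 22 3 1",
]
hours = parse_hourly_rows(sample)
# Only show the hours that actually saw traffic
busy = {hour: counts for hour, counts in hours.items() if counts != (0, 0)}
print(busy)
```

Feed it only the table rows, not the graph above them, and it doesn't care how many spaces separate the columns.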

Monarch and Childsplay

Monarch is another Nagios configuration tool, like Fruity, which I tried earlier. They're both written by Groundworks OpenSource, although Fruity appears to have stalled in development. In a nutshell, Monarch has a more powerful web interface, but also has more prerequisites to set up.

While Fruity required only MySQL and PHP, Monarch requires MySQL and lots of Perl modules. The package comes as a .tgz (tested with 2.0.2), which includes an install script and a README.TXT file that is fairly easy to follow. The only thing I did differently is that instead of using cpan as the README suggests, I installed the Debian packages for the Perl modules using apt-get. Here's the list of what it needed:

sudo apt-get install libcgi-session-perl libclass-accessor-perl libcgi-ajax-perl libxml-sax-perl libxml-libxml-common-perl libdbi-perl libdbd-mysql-perl libxml-libxml-perl libarchive-tar-perl

The install script asks you several questions, many of which can later be altered in the web interface. After installation, it's easy to import your Nagios configuration, simply by going to
Control->Main Nagios Configuration->Load from file and then to
Control->Load

The "load" dumps the database and reloads it with the new data. This is a bit clunky, as they could've done it with version control, a new set of tables, or perhaps an option to create a new database with an easy way to roll back to the previous one.

The interface is fairly easy to follow, and you can drill down from a hostgroup to hosts, and then their related services. The nice thing about all the levels in the tree is the pre-populated menus. For example when looking at services for a host, you can easily add a new service just by choosing it from the list. On the other hand, I didn't like the "copy" function the way it was implemented.

The export is a bit disappointing in that it's not flexible enough. Much like Fruity, it will take a nicely laid out conf.d directory (and any subdirectories) and squish it into a single fileset. Also, it still requires work.

Conclusion: If you want a nice tool to get a better overview of your existing Nagios configuration (or if you're starting with a fresh configuration), and you'd rather click than edit config files, then Monarch is a good choice.

Childsplay is a nice set of educational games for young children (ages 2-11). It features sound and number identification games, memory card games, easy picture puzzles, and for the older ones, typing, spelling and math.

One of the first things I really liked about Childsplay is that it starts out full screen, so kids can't accidentally click outside the window and lose their game. The interface is easy and graphical, and the icons and games are colorful and fun. And everything has audio feedback. My 22 month old thoroughly enjoyed it for her full attention span (about 10 minutes). Highly recommended.

-T