You need to know how to perform, set up, or provide the following:
We build cid on the server and on a node first so that when we install the images later on, everything is ready to go. Being a fairly standard daemon, cid is much easier to get going. At this stage there are no extra configure options, so the standard configure, make, make install sequence should work fine.
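The build sequence can be wrapped in a small shell function. This is only a sketch: the function name build_cid is our own, and the source directory is passed in as an argument because its location depends on where you unpacked cid.

```shell
# Build and install cid from its source directory.
# $1 = path to the cid sources (hypothetical; adjust to your checkout).
# The body runs in a subshell, so the caller's working directory is untouched.
build_cid() (
    cd "$1" || exit 1
    # Standard autotools-style sequence; stops at the first failing step.
    # 'make install' usually needs root privileges.
    ./configure && make && make install
)
```

Run it once on the server and once on a node, for example: build_cid /path/to/cid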
Server:
3/9/07 16:15:12: cidd.c:73 main() --- Cid Daemon up and running on port 38008
3/9/07 16:15:35: cidd.c:143 server_loop() --- server_loop (handler): Dropping connection to node1.acrl.clusters.umaine.edu
Node:
3/9/07 16:11:59: cid_kid.c:79 main() --- Cidkid up and running!
3/9/07 16:11:59: cid_kid.c:80 main() --- Host (node1), Image (<image_name>-<image_version>-<image_subversion>)
So long as this works correctly, cid is ready to go. Example rc scripts to boot both cidkid and cidd live in notp/etc.
We provide two scripts that should work fine for both Linux and Darwin. You will want to at least check the excluded directory variables to see if there is anything that you need to add or remove. The scripts, mkimg.linux and mkimg.darwin, live in notp/bin.
NOTE: A careful reader will notice that mkimg.darwin needs to be executed from a machine running Linux. We have experimented a great deal with various methods of building images under Darwin, and they all failed to run successfully after being extracted from Blancmange. This does mean that you will probably LOSE all resource forks.
The images must be named <image name>-<version>-<subversion>.gz. As mentioned earlier, each image must live in the correct place as suggested by image_db.
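The naming rule can be checked before an image is copied into place. The following is a minimal sketch, not part of notP: the function name valid_image_name is our own, and it assumes that the version and subversion fields contain no hyphens (the image name itself may).

```shell
# Return 0 if the file name matches <image name>-<version>-<subversion>.gz.
# Assumption: version and subversion contain no '-' and no field is empty.
valid_image_name() {
    name=${1##*/}                 # drop any leading directory
    case "$name" in
        *.gz) ;;                  # must end in .gz
        *) return 1 ;;
    esac
    base=${name%.gz}
    subversion=${base##*-}        # text after the last hyphen
    rest=${base%-*}
    [ "$rest" != "$base" ] || return 1    # no hyphen at all
    version=${rest##*-}           # text after the second-to-last hyphen
    image=${rest%-*}
    [ "$image" != "$rest" ] || return 1   # only one hyphen
    [ -n "$image" ] && [ -n "$version" ] && [ -n "$subversion" ]
}
```

With this sketch, valid_image_name linux-1-0.gz succeeds, while linux-1.gz (missing subversion) and linux-1-0.tar (wrong suffix) are rejected.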
Blancmange ends up being the initramfs that is used to netboot a machine. The process of building this image has been abstracted away by a collection of scripts in blancmange/bin. The file blancmange/top_config contains all of the information that Blancmange needs to build the image. This file is highly commented and should be quite easy to modify to suit your needs.
Blancmange needs to be built on a machine that will be provisioned using notP. The commands that build Blancmange are shown below, with their output omitted.
node1: # cd notp/blancmange
node1: # bin/get_packages
node1: # bin/build_packages
node1: # bin/build_image
At this stage, blancmange-initramfs.gz is ready to be moved to your tftp directory and sent out to the nodes. When the machines are running, they will boot dropbear to provide sshd service. The only user is root and the password is "t00r".
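The three build steps and the copy into the tftp directory can be chained so that a failure in any step stops the process. This is a sketch under assumptions: the function name build_blancmange is our own, both directories are supplied by the caller, and we assume blancmange-initramfs.gz is written to the blancmange directory itself, so adjust the path if your build puts it elsewhere.

```shell
# Build Blancmange and copy the resulting initramfs to the tftp directory.
# $1 = path to the notp tree, $2 = tftp directory (both site-specific).
# Runs in a subshell so the caller's working directory is unchanged.
build_blancmange() (
    notp_dir=$1
    tftp_dir=$2
    cd "$notp_dir/blancmange" || exit 1
    bin/get_packages   || exit 1
    bin/build_packages || exit 1
    bin/build_image    || exit 1
    # Assumed output location: the current (blancmange) directory.
    cp blancmange-initramfs.gz "$tftp_dir/"
)
```

For example: build_blancmange /path/to/notp /path/to/tftp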
A netbooting machine should reboot twice: once to get the initramfs, which is then installed to the first few partitions of the drive, and a second time to boot into this newly installed image. At this point, partitions 1-4 are in use (for grub machines: grub, blancmange, swap, extended partition; for yaboot machines: partition map, yaboot, blancmange, swap).
The progress of each node is also traced in the cidd log.
Currently, notP has only been tested with the Moab scheduler from Cluster Resources, although it should work with Maui as well. To add provisioning support to Moab, the simplest route is to add the following to your Moab configuration file.
CLASSCFG[linux] JOBPROLOG='/usr/local/sbin/cidcapt -j $JOBID -n \
$HOSTLIST -i <image_name>'
CLASSCFG[darwin] JOBPROLOG='/usr/local/sbin/cidcapt -j $JOBID -n \
$HOSTLIST -i <darwin>'
When a job is submitted to the linux queue, cidcapt contacts the cidd server and requests that all of the nodes be put into the linux image. Cidd contacts the cidkid process on each node and begins the provisioning process. When all of the nodes are alive and running the correct image, cidcapt will exit, allowing Moab to continue with running the job.
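To check provisioning without submitting a job, the prolog call can be reproduced by hand. The sketch below is not part of notP: the function name simulate_prolog and the CIDCAPT override variable are our own, and only the -j, -n and -i flags shown in the Moab configuration are used. The job ID, host list and image name you pass in are your own values.

```shell
# Path taken from the JOBPROLOG lines; set CIDCAPT to something else
# (for example a stub) to dry-run without a cluster.
CIDCAPT=${CIDCAPT:-/usr/local/sbin/cidcapt}

# Run cidcapt the way the JOBPROLOG entry would for a single job.
# $1 = job ID, $2 = host list (as Moab would pass in $HOSTLIST), $3 = image name.
# Returns cidcapt's exit status, or 2 on a usage error.
simulate_prolog() {
    if [ "$#" -ne 3 ]; then
        echo "usage: simulate_prolog JOBID HOSTLIST IMAGE" >&2
        return 2
    fi
    "$CIDCAPT" -j "$1" -n "$2" -i "$3"
}
```

As with the real prolog, the call should not return until cidcapt exits, so it can also be used to time how long a node takes to come up in a given image.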