Armada
What Is Armada?
This setup is no longer running. I'm keeping it here since someone may find it useful.
Armada was a collection of diskless, netbooting computers dedicated to distributed computing. Each computer uses PXE to load a Linux distribution called Linux Terminal Server Project (LTSP) from a central server. I use LTSP 4.2 because it supports local apps. I primarily run BOINC and distributed.net on them.
I launched the first version of Armada on October 16, 2003. It was based on a Folding@Home setup called yattamonster. I modified it to run the now-defunct project Find-a-Drug. After Find-a-Drug ended, Armada was dead for over a year until I wrote version 2. Version 2 has the flexibility to run any project I choose. Eventually it was all shut down in November 2010.
How Does It Work?
The Server
The server runs various daemons to provide the services required. The most critical are dhcpd, tftpd, and nfsd. dhcpd is configured to point the clients to the PXE boot image. tftpd serves the image to the clients. nfsd serves the client file system, including each client's home directory.
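As a rough illustration of how the three daemons fit together, here is a minimal sketch that prints a dhcpd host entry and two NFS export lines for one client. It only prints text and does not touch /etc. The function names, the MAC address, the IP addresses, the subnet, and the boot file path are all placeholder assumptions, not values from the actual Armada server.

```shell
#!/bin/sh
# Sketch only: prints config fragments, does not modify any system file.
# SERVER_IP and BOOT_FILE are assumed placeholder values.
SERVER_IP="192.168.0.254"      # assumed address of the server
BOOT_FILE="/lts/pxelinux.0"    # assumed boot image path under the tftpd root
NFS_ROOT="/opt/ltsp/i386"      # LTSP client root on the server

# Print a dhcpd.conf host entry that points one client at the PXE boot image.
dhcp_host_entry() {
    name=$1 mac=$2 ip=$3
    cat <<EOF
host $name {
    hardware ethernet $mac;
    fixed-address $ip;
    option host-name "$name";
    next-server $SERVER_IP;
    filename "$BOOT_FILE";
    option root-path "$SERVER_IP:$NFS_ROOT";
}
EOF
}

# Print one /etc/exports line for the given path, mode (ro or rw), and subnet.
nfs_export_line() {
    path=$1 mode=$2 subnet=$3
    printf '%s %s(%s,no_root_squash,sync)\n' "$path" "$subnet" "$mode"
}

dhcp_host_entry armada1 00:11:22:33:44:55 192.168.0.101
nfs_export_line "$NFS_ROOT" ro 192.168.0.0/255.255.255.0
nfs_export_line /home/armada rw 192.168.0.0/255.255.255.0
```

The host-name option matters here because the per-client directories described later are keyed on the host name the client receives through DHCP.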
The V2 server is configured as follows:
Intel E4300 BSEL modded to run at 2.4GHz
MSI P965 NeoF
2GB Corsair XMS2 DDR2 675 C4
80GB Seagate 7200.10 SATA primary hard drive
400GB Seagate PATA storage hard drive (used for storing all my crap)
Two Intel Pro1000GT network cards
550W CoolerMaster eXtremePower power supply
An old, big, steel Antec midtower case
APC BackUPS XS1200 UPS
The server was eventually migrated to a Windows Server 2008 R2 Hyper-V virtual machine running on a Dell PowerEdge T110 that was on sale.
The Client
Each client is set to use PXE booting in the BIOS. When the client boots, it requests an IP address and the location of the boot image. It uses TFTP to download the boot image and then boots the kernel. The kernel boots just like any other Linux distro at this point. The only difference is that it uses an NFS export instead of a hard drive. After the kernel boots, the client runs the configuration scripts and finally the Armada script. The Armada script determines which distributed computing program is run.
E4400 Client Hardware
Intel E4400 @ 3.2GHz
Scythe Andy Samurai Master HSF
Asus P5K-VM motherboard (was originally Abit F-I90HD but they were unreliable)
1GB generic DDR2 667
Broadcom 5507 gigabit NIC
8MB ATI RagePro PCI video card
400W Enermax Liberty power supply (were originally generic but I wanted more efficiency and fewer cables)
Q6600 Client Hardware
Intel Q6600 @ 3.0GHz
Xigmatek S1283/Kingwin RVT-12025 HSF
Asus P5K-VM motherboard
2GB Corsair XMS2 DDR2 675 C4
Broadcom 5507 gigabit NIC
8MB ATI RagePro PCI video card
400W Enermax Liberty power supply
Installation/Configuration
The majority of the setup instructions can be found in the LTSP 4.1 documentation. I do have some additional items to address though.
LTSP does not allow the time zone to be set. You'll need to go into the BIOS and set the clock to UTC. Remember, UTC does not observe DST.
The stock LTSP kernel does not support SMP. Either compile your own or download my SMP kernel. It's the same as the stock kernel but has 4-core SMP support.
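To confirm that an SMP kernel is actually in use, one option is to count the processors the running kernel reports. This is a minimal sketch, assuming the client exposes /proc/cpuinfo with one "processor" line per logical CPU (the usual x86 layout). The function names are made up for illustration.

```shell
#!/bin/sh
# Count the logical CPUs listed in a cpuinfo file (defaults to /proc/cpuinfo).
cpu_count() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Report whether the kernel sees at least the expected number of CPUs.
# $1 = expected count, $2 = optional cpuinfo file (useful for testing).
check_smp() {
    expected=$1
    found=$(cpu_count "$2" 2>/dev/null)
    found=${found:-0}    # treat an unreadable file as zero CPUs
    if [ "$found" -ge "$expected" ]; then
        echo "OK: kernel sees $found CPU(s)"
    else
        echo "WARNING: kernel sees $found CPU(s), expected $expected"
        return 1
    fi
}

# Example: run "check_smp 4" on a Q6600 client or "check_smp 2" on an E4400.
```

A client still running the stock non-SMP kernel would report a single CPU and trigger the warning.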
You cannot change everything in /etc from the clients after they boot. Some changes must be made during boot. Do this by adding commands to /opt/ltsp/i386/etc/rc.sysinit on the server.
By default, LTSP uses the server for DNS. However, I don't want to configure bind on my server, so I point the clients to OpenDNS. To do this:
Open /opt/ltsp/i386/etc/rc.sysinit
Go to the section labeled "Setup the resolv.conf file"
Comment out or delete the line echo "nameserver ${DNS_SERVER}" >>/tmp/resolv.conf
Add these lines in its place:
echo "nameserver 208.67.222.222" >>/tmp/resolv.conf
echo "nameserver 208.67.220.220" >>/tmp/resolv.conf
While you're in that section, you may want to set the DNS timeout. This is especially useful for working around a BOINC flaw that causes jobs to crash with "No heartbeat from core for 31 seconds." Simply add this line:
echo "options timeout:2" >>/tmp/resolv.conf
This sets a 2 second timeout for each attempt and leaves the number of attempts at its default of 2. The goal is to keep the total lookup time around 25 seconds or less.
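The resolver changes above can be sketched as one helper, together with a back-of-the-envelope check of the total wait. The function names are hypothetical, and the estimate (servers times attempts times timeout) is a simplifying assumption that ignores resolver backoff and search-domain retries, so treat the number as a rough guide only.

```shell
#!/bin/sh
# Append nameserver lines and the timeout option to a resolv.conf style file.
# $1 = output file, $2 = timeout in seconds, remaining args = nameservers.
write_resolv_conf() {
    out=$1 timeout=$2
    shift 2
    for ns in "$@"; do
        echo "nameserver $ns" >> "$out"
    done
    echo "options timeout:$timeout" >> "$out"
}

# Rough estimate of the longest wait for one lookup, in seconds.
# Assumes every server is tried once per attempt for the full timeout.
dns_wait_estimate() {
    servers=$1 attempts=$2 timeout=$3
    echo $((servers * attempts * timeout))
}

# Demonstrate against a scratch file instead of the real /tmp/resolv.conf.
conf=$(mktemp)
write_resolv_conf "$conf" 2 208.67.222.222 208.67.220.220
cat "$conf"
echo "Estimated wait: $(dns_wait_estimate 2 2 2) seconds"
rm -f "$conf"
```

With two OpenDNS servers, 2 attempts, and a 2 second timeout, the estimate comes to 8 seconds per name, comfortably under the 25 second goal.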
The clients run the Armada script, which is placed in /opt/ltsp/i386/etc/screen.d on the server. This script controls client installation, configuration, running the selected program, and even shutting down and rebooting. The key is placing the control files in the right location. The directory structure and control files are listed below.
Armada Directory Structure
/home/armada/
/home/armada/config/
    hosts - Any additional hosts I want to append to /etc/hosts
/home/armada/config/project/ - Install sources and default configs
    x.command - x is the project name. File specifies the command to run the program file
    x.configcommand - File specifies the command to configure the project
    x.configfile - File specifies the project's actual config/settings file
    x.progfile - File specifies the project's main program file
    default - File specifies the default project if one has not been selected
/home/armada/config/project/bin/ - Install sources
/home/armada/config/project/bin/x/ - x is the project name. The install sources and default config files are stored here
/home/armada/nodes/ - Where the client projects and settings are stored
/home/armada/nodes/hostname/ - hostname is the client's host name assigned through DHCP
/home/armada/nodes/hostname/bin/ - Where the projects are stored
/home/armada/nodes/hostname/bin/x/ - x is the project name
/home/armada/nodes/hostname/config/ - Client project selection and some commands
    project - Specifies the selected project
    node.reboot - Tells the client to reboot after the project stops running. 0 byte file
    node.shutdown - Tells the client to shut down after the project stops running. 0 byte file
    x.configupdate - Tells the client to update the config for project x. 0 byte file
    x.progupdate - Tells the client to update the program files for project x. 0 byte file
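To make the control files more concrete, here is a hypothetical sketch of how a script could read them. It is not the original Armada script. The function names, the ARMADA_HOME variable, the choice to delete a flag file once it has been acted on, the priority of shutdown over reboot, and the "restart" fallback are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of the control-file logic, not the original Armada script.
ARMADA_HOME="${ARMADA_HOME:-/home/armada}"

# Print the project selected for a node, falling back to the default file.
selected_project() {
    node_cfg="$ARMADA_HOME/nodes/$1/config"
    if [ -s "$node_cfg/project" ]; then
        cat "$node_cfg/project"
    else
        cat "$ARMADA_HOME/config/project/default"
    fi
}

# Print the command stored in x.command for project x.
project_command() {
    cat "$ARMADA_HOME/config/project/$1.command"
}

# Decide what to do after the project exits, based on the 0 byte flag files.
# Prints "shutdown", "reboot", or "restart" and removes the flag it acted on
# (removing the flag is an assumption).
after_exit_action() {
    node_cfg="$ARMADA_HOME/nodes/$1/config"
    if [ -e "$node_cfg/node.shutdown" ]; then
        rm -f "$node_cfg/node.shutdown"
        echo shutdown
    elif [ -e "$node_cfg/node.reboot" ]; then
        rm -f "$node_cfg/node.reboot"
        echo reboot
    else
        echo restart
    fi
}

# List the pending x.configupdate and x.progupdate flags for a node.
pending_updates() {
    node_cfg="$ARMADA_HOME/nodes/$1/config"
    for flag in "$node_cfg"/*.configupdate "$node_cfg"/*.progupdate; do
        [ -e "$flag" ] && basename "$flag"
    done
    return 0
}

# Example main loop, left commented out so the sketch has no side effects:
# host=$(hostname)
# while :; do
#     project=$(selected_project "$host")
#     cd "$ARMADA_HOME/nodes/$host/bin/$project" && sh -c "$(project_command "$project")"
#     case $(after_exit_action "$host") in
#         reboot)   reboot ;;
#         shutdown) poweroff ;;
#     esac
# done
```

Because the flags are plain empty files on the NFS share, a client can be told to reboot from the server with nothing more than touch /home/armada/nodes/hostname/config/node.reboot.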
Other Info
I like hardware control also. That's why I have some networked I/O controllers. I'm using the Aviosys IP Power 9212. I've modded mine to have 8 normally open (NO) contacts. Newer versions are supposedly selectable. I have them wired to monitor motherboard power and to activate the motherboard's reboot and power headers. I use software I wrote to control them.
I've also hacked up a web page to directly control the outputs using a web browser. I did this primarily so I could control them using my Windows Mobile phone instead of using pliers to short the connectors. Eventually, I'll port my Armada Commander to VB.NET and then target the compact framework so I have some proper control.
I also have a couple other systems running the Armada setup. One is a hacked up benchtop system. The other is my Dell Latitude D520 when it's docked.
Pictures
Two rows of clients: Top row Armada 5-8, E4400s. Bottom row Armada 1-4, Q6600s.
Two rows closeup
Armada 3 closeup
Armada 8 closeup
Server and UPS
UPS monitor on server
Network switch and I/O controllers: I really should upgrade to a gigabit switch. (I finally did in March 2009, to an HP ProCurve 1400-24G.)
Part of my work area
An overview of the mess
Another view of the mess
Armada Commander screenshot: My program for hardware monitoring and control. Green is good. Red is bad. Yellow is no communication.
Armada Commander tray icon: It changes colors also.
Armada Subcommander: Cheap, hacked up HTML version of the above.