- What is Platform Open Cluster Stack (OCS)?
- Pre-installation
- Frontend node installation
- Compute node and appliance installation
- Basic Administration
- Advanced Administration
- Get Technical Support
- Copyright and Trademarks
What is Platform OCS?
Building a Linux® cluster is a challenging and time-consuming task. There are many tools in the community and on the Internet for building, configuring, and managing Linux clusters. However, these tools typically assume a familiarity with Beowulf clusters and the concepts of Linux clusters.
Platform Open Cluster Stack (OCS) is a pre-integrated, vendor certified, software stack that enables the consistent delivery of scale-out application clusters. Platform OCS enables a new class of users by simplifying Linux® cluster application, deployment and management. Backed by global 24x7 enterprise support, Platform OCS is a modular and hybrid stack that transparently integrates open source and commercial software into a single consistent cluster operating environment.
This product includes software developed by the Rocks Cluster Group at the San Diego Supercomputer Center at the University of California, San Diego and its contributors. For more information, visit http://www.rocksclusters.org.
Platform OCS is fully supported by Platform Computing Corporation and requires a Red Hat® based operating system such as Red Hat® Enterprise Linux or CentOS Enterprise Linux.
Where to get Platform OCS?
Platform OCS 4.1.1 is released in two editions: Enterprise and Standard. Before installing, make sure you have the following documentation for your particular edition and have reviewed it:
If you plan on installing other third-party rolls, obtain the CD or DVD containing those rolls.
You can download Platform OCS Standard Edition from the Platform web site at http://my.platform.com/products/platform-ocs.
Contact Platform Computing to purchase Platform OCS Enterprise Edition.
Pre-installation
The following steps summarize the Platform OCS pre-installation process:
Check the hardware configuration
Before Platform OCS is installed, a set of minimal hardware requirements must be satisfied. A typical Platform OCS cluster uses a Beowulf-type cluster setup consisting of the following types of hosts:
Frontend node
The frontend node (or head node) is responsible for the following:
- Administering, managing, and monitoring the cluster
- Installing compute nodes
- Providing user login, compilation, and submission of jobs to the cluster
- Acting as a firewall to shield the cluster from external hosts and networks
- Acting as a server for many important services: DHCP, NFS, DNS, NTP, HTTP, etc.
Minimal hardware requirements for a frontend are as follows:
- 512 MB of physical memory (RAM)
- 17 GB of free disk space
- Two Ethernet interfaces: one connected to the private network (eth0), and one to the public network (eth1)
Compute nodes
One or more compute nodes are responsible for the following:
Minimal hardware requirements for compute nodes are:
- 512 MB of physical memory
- 17 GB of free disk space
- One Ethernet interface connected to the private network (eth0)
Optional hardware for compute nodes:
- Additional Ethernet interfaces for connecting to other networks
- Additional interconnects for high-performance message passing, such as Myrinet or Infiniband
Cluster setup
The following is a diagram that illustrates the cluster setup:
![]()
Check the network configuration
In the figure above, the frontend node connects to the private network through the Ethernet interface mapped to eth0, and to the public network through the Ethernet interface mapped to eth1. The public network refers to the main network in your company or organization. A network switch connects the frontend and compute nodes together to form a completely private network. Other cluster configurations are possible, such as connecting all of the compute nodes directly to the public network instead of hiding them behind the frontend node; however, this type of configuration is not supported at install time.
The private network connecting the frontend and compute nodes is typically a Gigabit or 100Mb Ethernet network. In this simple setup, the private network serves three purposes:
However, it is common practice to perform message passing over a much faster network using a high-speed interconnect such as Myrinet or Infiniband. A fast interconnect provides benefits such as higher throughput and lower latency. For more information about a particular interconnect, please contact the appropriate interconnect vendor.
Testing network configuration
To ensure a successful Platform OCS installation, the Ethernet switches need to be configured properly. There are some installation issues caused by specific switch configurations.
- If spanning tree is enabled on the switch, PXE installation slows down dramatically because each port on the switch tries to determine where it fits in the spanning tree to avoid loops in the network. Use caution when changing the spanning tree configuration options on your switch. A Platform OCS cluster with a single network switch does not need spanning tree configured because there is no possibility of loops in the Ethernet network. However, if the cluster requires multiple switches, spanning tree is needed to ensure that no loops are created in the Ethernet network topology. Platform recommends disabling spanning tree.
- Check whether PortFast is disabled on the switch. PortFast is the forwarding scheme the switch uses; different switch manufacturers may use different names for it. For best installation performance, the switch should begin forwarding packets as it receives them, which speeds up the PXE boot process. Platform recommends enabling PortFast if the switch supports it.
- Check whether multicasting is disabled on the switch. Certain switches may need to be configured to allow multicast traffic on the private network. Certain tools in Platform OCS, such as Ganglia (the cluster monitoring tool), require multicasting to be enabled, so the switch(es) should be configured for multicast traffic for Ganglia to collect data correctly.
- Run diagnostics on the switch to ensure the switch is connected properly, and there are no bad ports or cables in the configuration.
Network information
Information about your network is required during installation. Collect the following items from your company or organization's IT department:
Frontend node installation
The following steps summarize the installation of Platform OCS on your frontend node:
- Start the Platform OCS installer
- Configure your frontend
- Partition your frontend
- Test the frontend node
Start the Platform OCS installer
Perform the following steps to start the Platform OCS installer:
- Insert the Platform OCS DVD into your frontend
After your hardware is set up and connected, you are ready to start installing your frontend.
- Power up the frontend node with the Platform OCS 4.1.1 DVD. If the DVD does not boot, you must configure the frontend's BIOS to boot from the DVD drive.
- You will see a splash screen, accompanied by a boot prompt. Type frontend and press Enter. You need to be quick because the installer will start automatically if you do not type anything in the boot prompt within 10 seconds. If you miss typing frontend, the installer assumes you are installing a compute node, and not a frontend. Simply power down the frontend and start again.
After the splash screen, the installer loads the kernel and initial ramdisk. You can abort the loading process by pressing Ctrl-C when you see "Loading vmlinuz..." or "Loading initrd.img...". This returns you to the boot prompt.
- When you see the Available Rolls dialog, you are ready to configure your frontend, as specified in Configure your frontend.
Optional steps for booting:
The Platform OCS installer can be booted with optional parameters. Some common boot parameters include:
User input
Subsequent installer screens require you to input some values. The following is a list of general tips for navigating between screen elements:
- To navigate between fields, use Tab and Alt-Tab.
- To push a button, use Space.
- Do not press F12 to advance to the next screen. If you do so, the installer will run incorrectly.
- Most screens have an Ok or a Yes button to confirm an action or accept input values. Screens also have a Back button to go back to a previous screen. You can use this Back button to go back and modify a value you typed in.
If you encounter an issue during the installation, you can look for more information in the following locations:
- The first virtual console, tty1 (Alt-F1), is the main console for the installation. Installer screens are displayed on this console, including any error messages.
- The second virtual console, tty2 (Alt-F2), is a command-line prompt allowing you to access the OS.
- The third virtual console, tty3 (Alt-F3), displays all of the messages generated by the Platform OCS installer. To view the entire log, you can switch to the command line and open /tmp/anaconda.log.
- The fourth virtual console, tty4 (Alt-F4), displays all of the messages generated by the kernel. This screen may contain helpful messages to diagnose hardware problems, such as kernel driver issues. To view the entire log, you can switch to the command line and run "dmesg".
Configure your frontend
At the Available Rolls dialog, select the rolls to install on your frontend, and add your cluster information.
About rolls
A roll groups together packages and configuration scripts that are used to install a specific component for a Platform OCS cluster. For example, a roll can install a batch job scheduler, a driver for an interconnect, or a cluster monitoring package. The DVD contains all of the Rolls you need to install a frontend. There are two types of Rolls:
- Required: These Rolls must be installed for Platform OCS to work. These Rolls are pre-selected and cannot be de-selected by the user.
- Optional: These Rolls install optional components. Users must manually select these Rolls to install them.
Selecting the rolls to install
Complete the following steps to select the rolls to install on your frontend node:
- In the Available Rolls dialog, select all of the rolls you want to install on your frontend. We recommend selecting the Lava Roll (batch job scheduler). The table below is a summary of what rolls are included on the DVD, grouped by category. Choose what you require for your new cluster. Note that your DVD may contain other rolls depending on what Platform OCS edition you have (either Enterprise or Standard edition).
- Press OK when you have finished selecting the rolls you wish to install.
- Select Yes to install more rolls on the frontend or No if you are finished adding Rolls.
The installer displays the list of rolls selected for installation from the DVD, and prompts you to insert additional CDs or DVDs to install more rolls. In most cases, the DVD contains all of the rolls you need for installation. However, if you have more rolls to install, select Yes. Otherwise, select No.
Links to additional rolls can be found on the Platform web site at: http://my.platform.com/products/platform-ocs.
- Insert the CDs or DVDs containing the additional Rolls:
Skip this step if you selected No in the previous step. Otherwise, perform the following:
- Select Ok when prompted to insert a CD/DVD.
- A roll selection screen similar to the one for the boot DVD is displayed if the CD/DVD you inserted has multiple rolls. A roll selection screen is not shown if a CD/DVD contains one roll or has only required rolls. If so, the installer automatically selects all of the rolls.
- The installer will continue to prompt you for more CD/DVDs. Select No when you have added all the rolls.
- In the Cluster Information dialog, specify the details of your Platform OCS cluster.
Enter a Fully Qualified Domain Name (FQDN) for the hostname. The domain name should match your company or organization's domain name.
![]()
- When you see the Disk Partitioning Setup dialog, you are ready to partition the hard disk in your frontend, as described in Partition your frontend.
Partition your frontend
Partition the hard disk in your frontend. You need to decide whether to auto-partition your hard disk or manually partition your hard disk.
Auto-partitioning quickly partitions the first disk on your frontend using a default Platform OCS partition scheme. You can select an alternate disk to partition. Auto-partition uses the following partition scheme:
| Partition | Mountpoint | Filesystem type | Minimum size | Default size |
|---|---|---|---|---|
| Root | / | ext3 | 6 GB | 10 GB |
| Swap | None | swap | 1 GB | 4 GB |
| Export | /state/partition1 | ext3 | 10 GB | Rest of disk |
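
As a quick sanity check, the minimum partition sizes listed above add up to the 17 GB of free disk space given in the hardware requirements. The following Python sketch sums the minimums and tests whether a given amount of free space would cover them. The function and variable names are illustrative and not part of Platform OCS, and the installer's own space check may differ in detail.

```python
# Minimum auto-partition sizes in GB, taken from the partition scheme above.
MIN_PARTITION_GB = {"root": 6, "swap": 1, "export": 10}


def minimum_disk_gb(minimums=MIN_PARTITION_GB):
    """Return the total free space (GB) needed for all minimum partition sizes."""
    return sum(minimums.values())


def has_enough_space(free_gb, minimums=MIN_PARTITION_GB):
    """Return True if free_gb can hold every partition at its minimum size."""
    return free_gb >= minimum_disk_gb(minimums)


if __name__ == "__main__":
    print("Minimum free space needed (GB):", minimum_disk_gb())
    print("17 GB free is enough:", has_enough_space(17))
    print("12 GB free is enough:", has_enough_space(12))
```

This matters most when you choose to preserve existing partitions, because only the remaining free space is available to the installer.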
Manual partitioning requires you to manually set up the partition scheme. This includes setting the correct mount-points and specifying appropriate partition sizes.
We recommend Auto-partitioning for most users. You should only select Manual partitioning (Disk Druid) if you want more control over how the disk is partitioned.
At the Disk Partitioning Setup dialog, choose to auto-partition or manually partition your disk:
- To auto-partition your hard disk, follow the steps in Auto-partition your hard disk.
- To manually partition your hard disk, follow the steps in Manually partition your hard disk.
Auto-partition your hard disk
Auto-partition your hard disk using the following steps:
- At the Disk Partitioning Setup dialog, select Autopartition.
- Select the disk to partition and specify whether you want to preserve existing partitions.
The installer supports three options for preserving partitions on the disk in which Platform OCS is installed.
- Remove all Linux partitions.
This preserves any non-Linux partitions, such as Windows partitions (e.g. FAT, FAT32, and NTFS partitions), and any data on those partitions.
- Remove all partitions.
This wipes out all partitions on the disk. All data on those partitions will be lost. Note: On Dell systems the Dell utility partition will be preserved.
- Keep all partitions.
This preserves all existing partitions, including the data on those partitions. Partitions for Platform OCS are added in the available free space.
If you choose the option to preserve non-Linux partitions, or all partitions, the amount of free space on the disk must satisfy the minimum required disk space to install a Platform OCS frontend. Refer to the Install checklist for the minimum requirements.
If there isn't enough space left on the disk, the Platform OCS installer will display an error message to indicate it "Could not allocate requested partitions". The installer will not let you proceed. You have to select Ok to reboot the machine.
You can only select one disk to use for the installation. You have the option of selecting any disk currently attached to the machine, including any externally attached disks. If your machine has SCSI or SATA disks, the first disk is named "sda". If your machine has IDE disks, the first disk is named "hda". If you want to partition more than one disk, you have to select Manual Partitioning. To do so, select Back to return to the Disk Partitioning Setup dialog, select Manual Partitioning, and proceed to Manually partition your hard disk.
Select Ok to proceed to the next screen.
- Specify the sizes for the default Platform OCS partitions.
The default partition scheme creates a root, swap, and export partition. These partitions are required for Platform OCS to function correctly. The root partition is where the Linux OS is installed, and the export partition is used to store the Platform OCS distribution and the Roll files that it uses.
You must set the partition sizes in Megabytes (MB). You have the option of setting the export partition size to a fixed size, or make it grow to fill the remaining space on the disk.
![]()
Select Back to return to the Prepare Disk dialog. Select Ok to proceed to the next dialog.
- Review the automatically created partition layout
- If you select No to review the partitions, you will advance to the Boot Loader Configuration dialog. Proceed to Manually partition your hard disk, but skip the first step.
- If you select Yes, you are taken to the Disk Druid dialog to verify the partition scheme, and make changes if necessary. Proceed to Manually partition your hard disk.
Manually partition your hard disk
In the Disk Druid dialog, verify the partitioning scheme on your hard disk. If you did not choose to auto-partition your hard disk, you need to manually configure the partition scheme in this dialog.
- Update the partitioning layout with Disk Druid.
There are two possible paths that can bring you to this screen:
- You chose Auto-partitioning, and elected to review the partitioning scheme created
- You chose Manual partitioning
Disk Druid allows you to create, delete or modify partitions. If you are Auto-partitioning, you can augment the default scheme by creating new partitions. If you are Manually partitioning, you must create the minimum set of partitions required by Platform OCS. This includes the root, swap and export partitions. When you are satisfied with the partition layout, select Ok.
Do not select RAID as Platform OCS does not support Software RAID partitioning.
![]()
- Select the default partition to boot for the GRUB boot loader
The Platform OCS installer automatically adds boot entries to the GRUB boot menu for any operating systems it finds in partitions that are preserved on the disk. This only occurs if you chose to preserve partitions. Only entries for non-Linux operating systems are added. If you would like to add entries to the GRUB menu for Linux operating systems, you must add them manually after the frontend is installed.
To change the default partition to boot, select the partition and press F2. You can also change the label for a partition by selecting it and pressing Edit.
Completing the Installation
- In the Network Configuration for eth0 and eth1 dialogs, specify the IP address for the Private (eth0) and Public (eth1) Ethernet interfaces
A Platform OCS frontend requires two Ethernet interfaces to work correctly. The next two screens ask the user to enter the IP address and Netmask for the private and public interfaces.
For the private interface, only class-based networks are supported. Classless Inter-Domain Routing (CIDR) is not supported (e.g. subnetting or supernetting). The following is a list of valid Netmask values and the number of hosts each Netmask value supports. Choose the Netmask value that is appropriate for your cluster size.
| Class | Netmask value | Number of hosts in the network |
|---|---|---|
| A | 255.0.0.0 | 16777214 |
| B | 255.255.0.0 | 65534 |
| C | 255.255.255.0 | 254 |
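
The host counts above follow from the netmask: each network has 2^(number of host bits) addresses, minus the network address and the broadcast address. The sketch below reproduces the figures using Python's standard ipaddress module. The helper name usable_hosts is an illustrative choice, not a Platform OCS command.

```python
import ipaddress


def usable_hosts(netmask):
    """Return the number of assignable host addresses for a dotted-quad netmask."""
    # The base address is irrelevant here; only the mask determines the size.
    network = ipaddress.ip_network("0.0.0.0/" + netmask)
    # Exclude the network address and the broadcast address.
    return network.num_addresses - 2


if __name__ == "__main__":
    for mask in ("255.0.0.0", "255.255.0.0", "255.255.255.0"):
        print(mask, usable_hosts(mask))
```

Pick the smallest class whose host count comfortably exceeds the number of nodes and appliances you plan to install on the private network.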
For the public interface, you need to contact your IT department to obtain a static IP address for the frontend, and the corresponding Netmask value. You cannot configure the frontend to use an IP address obtained via DHCP.
- In the Miscellaneous Network Settings dialog, specify your gateway and DNS IP addresses.
You may need to contact your IT department to obtain these addresses for your network.
- In the Time Configuration dialog, select your time zone from the list and specify your network time server. If your node uses UTC time, select System clock uses UTC.
- In the Root Password dialog, select a root password that you will remember.
The installer will format the disk, copy the rolls from the DVD (and any other CD/DVDs you inserted) onto the disk, and install the packages.
After package installation completes, the boot loader is installed and the post-installation is executed. The machine then reboots. You have completed your installation, and are ready to test your frontend node as described in Test the frontend node.
Test the frontend node
Before installing the compute nodes, perform the following tests to verify your frontend is operational. Log in to your frontend as root with the password you used during the installation and perform the following steps:
- Check for hardware issues
In some cases, you might have hardware that is not detected by the running kernel, or you have a kernel driver that fails to load. Look through the following logs to identify any hardware issues:
- Check that the Ethernet network is working:
- Check that both eth0 and eth1 interfaces are up:
#ifconfig
- Verify the routing table is correct.
#route
When verifying the routing table, pay careful attention to the following:
- Traffic for the private network is routed over eth0, while traffic for the public network is routed over eth1.
- The default route will go through the gateway server you specified during installation.
- Multicast packets will be routed over eth0 (using 224.0.0.0 network)
- External hosts can be reached with the ping command
- Check that the High Performance Interconnect is working
If you installed an interconnect, you should verify the driver for the interconnect hardware was loaded correctly. In addition, the interconnect vendor may provide diagnostic tools to determine if the interconnect is working. We suggest you refer to the documentation for your particular interconnect.
- Check the required services
The frontend runs many services that are essential for cluster administration and installing compute nodes. You need to ensure all of the services listed below are running:
- Check the Platform OCS infrastructure
Run some basic Platform OCS commands, seen below, to verify the infrastructure is working. The commands should execute successfully.
- Log in as root and start insert-ethers, select compute node, then press F11 to exit.
#insert-ethers
Important: If you run "insert-ethers", you might see a message that says "Rocks Distribution is not ready. Please wait for rocks-dist to complete". This is normal when you log into a frontend for the first time. A startup script runs rocks-dist in the background during the first boot-up. You have to wait for "rocks-dist" to finish running before you can run "insert-ethers".
- Test rebuilding the Platform OCS distribution
#cd /home/install ; rocks-dist dist
- Check the added rolls
Verify that all of the rolls you selected during the frontend installation are added to the frontend:
#rollops -l
You can use the "rollops" command to install other rolls from the DVD. Simply insert the DVD, and run the following command. This command will display a menu from which you can select the roll you want to install:
#rollops -a
- Start up X Windows
Run the following command to start X Windows:
#startx
This command will automatically probe for your video card, configure the settings for it, and start up X. It may be necessary to run system-config-display to configure the display correctly. You can configure Platform OCS to automatically start X every time you log in by changing the runlevel on the initdefault line from 3 to 5 in the /etc/inittab file.
- Check the Platform OCS Cluster home page
Verify that you can access the cluster home page. The page will load automatically when you start the browser. This Homepage gives you access to all of the Cluster Monitoring tools, and Platform OCS documentation for all of the installed rolls.
Follow the link near the bottom of the Homepage to register your Platform OCS cluster.
![]()
When all the above tests pass, you are ready to proceed with compute node installation. If you experience any issues or errors, contact Platform Support at support@platform.com.
Compute node and appliance installation
Different types of nodes can be installed in a Platform OCS cluster. These different node types are referred to as appliances. The most common type is a compute node. The other appliances are listed in the table below. The set of available appliances will depend on what rolls you install. You can view the list of available appliances by running the insert-ethers command.
Note: Platform OCS provides an optional method to install compute nodes that involves pre-loading host information into the Platform OCS database to speed up the compute node installation process. This also allows system administrators to pre-configure the cluster naming and IP scheme making it independent of the order in which nodes are installed. This method requires a list of MAC addresses. To take advantage of this feature, you must obtain a list of MAC addresses for your compute nodes before installing the compute nodes.
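
When collecting MAC addresses for this method, it can help to validate the list before loading it. The following Python sketch normalizes and checks a plain list with one address per line. That input layout is an assumption made for illustration only; it is not the file format the Platform OCS tools expect, and the function names are hypothetical.

```python
import re

# Six colon-separated pairs of lowercase hex digits.
_MAC_RE = re.compile(r"^[0-9a-f]{2}(:[0-9a-f]{2}){5}$")


def normalize_mac(raw):
    """Return a MAC address as lowercase colon-separated hex, or raise ValueError."""
    mac = raw.strip().lower().replace("-", ":")
    if not _MAC_RE.match(mac):
        raise ValueError("not a valid MAC address: %r" % raw)
    return mac


def check_mac_list(lines):
    """Normalize every non-blank line, preserving order and rejecting duplicates."""
    seen = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        mac = normalize_mac(line)
        if mac in seen:
            raise ValueError("duplicate MAC address: " + mac)
        seen.append(mac)
    return seen


if __name__ == "__main__":
    # Made-up addresses, for illustration only.
    sample = ["00-1A-2B-3C-4D-5E", "", "00:1a:2b:3c:4d:5f"]
    for mac in check_mac_list(sample):
        print(mac)
```

Catching a mistyped or duplicated address at this stage is cheaper than discovering it when a node fails to receive its intended hostname.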
The following steps summarize the installation of Platform OCS on your compute nodes and appliances:
- Prepare your compute node
- Install compute nodes
- Install other appliance types
- Test compute nodes and appliances
- Test the cluster installation
There are two methods for installing compute nodes and other appliance types: using the insert-ethers tool, or using the add-hosts tool. Choose the method that is appropriate for your cluster.
About insert-ethers
Insert-ethers is a tool you run on your frontend to capture the DHCP requests broadcast by the compute nodes. For each DHCP request, insert-ethers generates a hostname and IP address for the node and adds the new information to the Platform OCS database.
The system is then updated to reflect the addition of the new host. Various system configuration files are updated, and DHCP and DNS services are restarted. Once DHCP is updated, a compute node can obtain an IP address, allowing it to network boot, and start the install process. Insert-ethers should be used if you are deploying a small to medium sized cluster (fewer than 128 nodes). Insert-ethers uses a node naming convention based on the assumption that your nodes are assembled in racks. The convention is:
<appliance type>-<rack>-<rank>
where:
- <appliance type> is the short-form name of the node's appliance type
- <rack> is the number of the rack in which the node is located
- <rank> is the location within the rack where the node is located
For example:
- compute-0-0: This is a compute node that is located in the bottom-most position in the first rack.
- lsfhpc-5-5: This is an LSF HPC master candidate host that is located in the sixth row (from the bottom) in the sixth rack.
The insert-ethers command assigns IP addresses to nodes starting from the top-most IP address for your subnet, and iterates through the address space in descending order.
For example, given a frontend address of 10.1.1.1, and a netmask of 255.0.0.0, the first node is assigned 10.255.255.254, the second node is assigned 10.255.255.253, and so on.
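
The naming convention and the descending address order can be illustrated with a short Python sketch. It only reproduces the example above and is not the insert-ethers implementation. The function names are hypothetical, and skipping the frontend's own address is an assumption.

```python
import ipaddress


def node_name(appliance, rack, rank):
    """Build a hostname of the form <appliance type>-<rack>-<rank>."""
    return "%s-%d-%d" % (appliance, rack, rank)


def descending_addresses(frontend_ip, netmask, count):
    """Return `count` host addresses, starting at the top of the subnet."""
    network = ipaddress.ip_network(frontend_ip + "/" + netmask, strict=False)
    frontend = ipaddress.ip_address(frontend_ip)
    # Start just below the broadcast address and walk downward.
    candidate = network.broadcast_address - 1
    assigned = []
    while len(assigned) < count and candidate > network.network_address:
        if candidate != frontend:  # assumption: the frontend's address is skipped
            assigned.append(str(candidate))
        candidate -= 1
    return assigned


if __name__ == "__main__":
    addresses = descending_addresses("10.1.1.1", "255.0.0.0", 3)
    for rank, address in enumerate(addresses):
        print(node_name("compute", 0, rank), address)
```

Because addresses are handed out in the order nodes are discovered, the order in which you power on the nodes determines which name and address each one receives.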
About add-hosts
Add-hosts is a tool that pre-populates the Platform OCS database with host information. The tool enables the user to define their own hostnames and IP addresses for the compute nodes using an XML configuration file. This alleviates the need to run insert-ethers to capture DHCP requests, and auto-assign hostnames and IP addresses.
After the information is loaded into the database, the system is updated to reflect the addition of the new hosts, in the same way that insert-ethers updates the system. Add-hosts should be used if you are deploying a large cluster of more than 128 nodes. Add-hosts requires a list of MAC addresses for your compute nodes. If you are purchasing new hardware for the cluster, the hardware vendor can supply a list of MAC addresses for all nodes.
Prepare your compute node
Before installing your compute nodes, consider customizing them to suit your requirements. The most common customizations are:
- Changing the default partition layout
- Adding additional RPM packages
- Adding additional post-installation configuration scripts
To customize your compute nodes, you need to update the Platform OCS distribution. Customizations are specified using XML files. Every change to an XML file requires a rebuild of the Platform OCS distribution.
The Platform OCS distribution is located in /home/install/rocks-dist. To rebuild it, log in as root and run:
#cd /home/install ; rocks-dist dist
Important: Always rebuild the distribution in the /home/install directory. Rebuilding the distribution in other directories may result in corruption of the permissions in the /home/install directory.
The XML files for compute node customization are located in /home/install/site-profiles/4.1.1/nodes. The XML files can be created manually or generated using automated tools included with Platform OCS. Details are described in the next section.
- Changing the default partition layout
You can change the default partition sizes, or create your own partition layout to override the default Platform OCS partition layout. The default partition layout for compute nodes is the same as the layout for the frontend. Only the first disk is partitioned; other disks are left as is.
| Partition | Mountpoint | Filesystem type | Minimum size | Default size |
|---|---|---|---|---|
| Root | / | ext3 | 6 GB | 10 GB |
| Swap | None | swap | 1 GB | 4 GB |
| Export | /state/partition1 | ext3 | 10 GB | Rest of disk |
- Changing the default partition sizes
If you're satisfied with the default layout, but want to change the root and swap partition sizes, use the custom-partition tool:
#custom-partition -r <root partition size in MB> -s <swap partition size in MB> -b
For example, to change the root partition size to 20 GB, and swap partition size to 2 GB, run the following command:
#custom-partition -r 20000 -s 2000 -b
The "custom-partition" tool creates the /export/home/install/site-profiles/4.1.1/nodes/extend-auto-partition.xml file and rebuilds the Platform OCS distribution.
For more information about the custom-partition tool, refer to the manpage or the Readme for Platform OCS Rolls.
- Changing the default partition layout
To set up more complex partitioning, you need to manually create a replace-auto-partition.xml file that will replace the default layout, and then rebuild the Platform OCS distribution.
Run the following commands:
#cd /home/install/site-profiles/4.1.1/nodes
#cp skeleton.xml replace-auto-partition.xml
Open replace-auto-partition.xml with a text editor and:
- Delete the <package> and <post> sections
- For each partition you want to define, create a line with the <part> tag between the <main> and </main> tags
- Between the <part> and </part> tags, specify the parameters for your partition. The parameters used are the same as those used for the Red Hat Kickstart "part" directive.
- For more information on the different partition parameters, refer to the Advanced Partitioning section of this guide.
Then rebuild the Platform OCS distribution:
#cd /home/install ; rocks-dist dist
For example:
Suppose you want to create a partition layout on the first SCSI disk consisting of a 15 GB root partition, 2 GB swap partition, 5 GB /var partition, and a /data partition that takes up the rest of the disk. Here is what the XML file will look like:
<?xml version="1.0" standalone="no"?>
<kickstart>
  <description> </description>
  <changelog> </changelog>
  <main>
  <!-- Put your partitioning directives here -->
  <part> / --size 15000 --ondisk sda </part>
  <part> swap --size 2000 --ondisk sda </part>
  <part> /var --size 5000 --ondisk sda </part>
  <part> /data --size 1 --grow --ondisk sda </part>
  </main>
</kickstart>
- Adding additional RPM packages
The rocks-compute tool can be used to update the Platform OCS distribution with your own RPM packages. The tool allows you to add, list, or remove packages. It creates the /export/home/install/site-profiles/4.1.1/nodes/extend-compute.xml file and rebuilds the Platform OCS distribution. You can add as many packages as needed.
- To add a custom package to the Platform OCS distribution and rebuild the distribution, run the following:
# rocks-compute -a -p <path to the RPM package> -b
Important: The rocks-compute tool does not check RPM package dependencies. Ensure that you also run the command above for every package that the given package depends on.
- To list all of the packages you added:
# rocks-compute -l p
The above command lists a unique ID for each package. This ID is used to remove a package from the distribution.
- To remove a package from the Platform OCS distribution and rebuild the distribution, run:
# rocks-compute -d -p <package ID> -b
Example: Adding your own RPM package
# rocks-compute -a -p /myshare/package-1.0.0.x86_64.rpm -b
Example: Adding an RPM from the OS roll
The OS roll contains RPMs for the Linux operating system. Some RPMs in the OS roll may not have been installed on the compute nodes. To install one of them:
- Look for the RPM you want to install. For example, to find the "ncompress" package:
# find /home/install/ftp.rocksclusters.org/pub/rocks/rocks-4.1.1/rocks-dist/rolls/os/4.1.1/x86_64/RedHat/RPMS/ -name 'ncompress*'
- Add the ncompress package to the Platform OCS distribution:
# rocks-compute -a -p /home/install/ftp.rocksclusters.org/pub/rocks/rocks-4.1.1/rocks-dist/rolls/os/4.1.1/x86_64/RedHat/RPMS/ncompress-4.2.4-40.x86_64.rpm -b
- Adding additional post-installation configuration scripts
You may want to add your own post-installation scripts to configure a compute node. Examples include turning services on or off, creating or updating configuration files, and creating init scripts. The scripts are executed during post-installation, after all RPM packages have been installed. Create the script in a text editor and save it to a file. The script must be a bash shell script.
The rocks-compute tool can be used to update the Platform OCS distribution with your post-installation scripts. The tool allows you to add, list, or remove post-installation scripts. It creates the /export/home/install/site-profiles/4.1.1/nodes/extend-compute.xml file and rebuilds the Platform OCS distribution. You can add as many scripts as needed.
- To add a post-installation script and rebuild the Platform OCS distribution:
# rocks-compute -a -s <path to script> -b
- To list the post-installation scripts you added:
# rocks-compute -l s
The above command lists a unique ID for each script. This ID is used to remove a script from the distribution.
- To delete a post-installation script and rebuild the Platform OCS distribution:
# rocks-compute -d -s <script ID> -b
Example: Adding a post-install script
Suppose you want a post-installation script that appends a library path to the /etc/ld.so.conf file. The script could look as follows:
#!/bin/bash
# Record each step in /root/compute.log on the compute node.
echo "Appending /mypath/lib to /etc/ld.so.conf" >> /root/compute.log
echo "/mypath/lib" >> /etc/ld.so.conf
# Refresh the dynamic linker cache so that the new path takes effect.
echo "Running ldconfig" >> /root/compute.log
/sbin/ldconfig >> /root/compute.log 2>&1
Let's assume the script is saved in /home/user/myscript.sh. You can add the script by running:
# rocks-compute -a -s /home/user/myscript.sh -b
For more information about the rocks-compute tool, refer to the manpage or the Readme for Platform OCS Rolls.
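Because rocks-compute does not resolve RPM dependencies, a package and its dependencies each need their own add command. The following Python sketch only builds the command lines for a list of RPM paths, using the -a, -p, and -b options shown in this guide. The function name and the example paths are hypothetical, and the sketch does not run rocks-compute itself.

```python
import shlex


def rocks_compute_add_commands(rpm_paths):
    """Return one 'rocks-compute -a -p <rpm> -b' command line per RPM path.

    The order of rpm_paths is preserved and duplicates are dropped, so a
    dependency that is listed twice is only added once.
    """
    seen = set()
    commands = []
    for path in rpm_paths:
        if path in seen:
            continue
        seen.add(path)
        # shlex.quote protects paths that contain spaces or shell characters.
        commands.append(
            " ".join(["rocks-compute", "-a", "-p", shlex.quote(path), "-b"])
        )
    return commands


# Hypothetical paths, for illustration only: a dependency, then the package.
for line in rocks_compute_add_commands(
    ["/myshare/libfoo-1.0.0.x86_64.rpm", "/myshare/package-1.0.0.x86_64.rpm"]
):
    print(line)
```

Review the printed commands before running them as root on the frontend. Each command includes -b, so the distribution is rebuilt after every package.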
Install compute nodes
Install compute nodes using either insert-ethers or add-hosts as follows:
Installing compute nodes using insert-ethers
- Log in to the Platform OCS frontend as root, and run insert-ethers.
- If you have a managed Ethernet switch that sends out DHCP requests, select Ethernet Switches. If you do not, proceed to the next step.
Choosing Ethernet Switches assigns an IP address to the switch. You may need to wait several minutes for the switch to broadcast a DHCP request. When done, press F9 to quit.
- If you are installing a small cluster and you are not concerned about assigning hostnames and IP addresses in the same order as the physical host layout, just run insert-ethers.
If you do care about order, tell the insert-ethers command which rack you are installing by specifying the rack number on the command line. For example, to start with the first rack:
# insert-ethers --cabinet=0
The nodes will be named compute-0-0, compute-0-1, compute-0-2, and so on.
- Choose Compute from the list of appliances.
- Once insert-ethers is waiting for the compute node, you can PXE boot the node by either physically rebooting the node from the console or remotely logging into the console using vendor IPMI or management tools.
To make sure that your compute nodes are assigned hostnames and IP addresses in the correct order, PXE boot each machine, one at a time, in the order of their physical location in the rack you are installing. In other words, power up the bottom-most node in the rack, then work your way up, one node at a time.
- If the node is successfully detected, installation begins and you should see the MAC address and compute node name on the insert-ethers screen.
An asterisk (*) indicates that a Kickstart file was requested by the compute node and installation should proceed normally. If there is no (*), Platform OCS will not install properly on the node and you should see an error on the compute node.
Once a node has a (*), you can PXE boot the next node in the rack. Repeat the process for the rest of the nodes in the rack.
If you see a (503) status, the frontend is too busy to serve a Kickstart file to the node. In this case, try PXE booting the compute node again. If you see a (500) status, an error occurred when generating the Kickstart file for the node. In this case, verify whether the Kickstart file can be generated locally on the frontend.
- You can monitor the installation of a compute node either by switching to the console of the compute node with a KVM switch or by using management tools supplied by the hardware vendor. If you do not have a KVM switch or you have not configured the hardware management utilities, you can still monitor the installation progress by creating a secure shell connection to the compute node:
# ssh compute-0-0 -p 2200
- You should see the install progress on the compute node.
- Once installation is complete, the node will automatically reboot and join the cluster.
- Once you have finished installing all of the compute nodes in rack 0, exit insert-ethers and run it again, incrementing the cabinet number. For example:
# insert-ethers --cabinet=1
Return to Step 4 and repeat the installation process for the rack.
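When you install rack by rack, it can help to write down beforehand which hostname each physical slot should receive. The following Python sketch builds such a plan under the naming pattern described above (compute-<cabinet>-<rank>, with rank counted in boot order). The function name and the slot labels are hypothetical. The sketch does not talk to insert-ethers.

```python
def boot_plan(racks, prefix="compute"):
    """Map each physical slot to the hostname it is expected to receive.

    racks: dict of cabinet number -> list of slot labels, bottom-most first,
           which is the order in which the nodes are PXE booted.
    Returns a list of (cabinet, slot_label, hostname) tuples in boot order.
    """
    plan = []
    for cabinet in sorted(racks):
        for rank, slot in enumerate(racks[cabinet]):
            plan.append((cabinet, slot, f"{prefix}-{cabinet}-{rank}"))
    return plan


# Hypothetical layout: three nodes in rack 0 and two nodes in rack 1.
layout = {0: ["U1", "U2", "U3"], 1: ["U1", "U2"]}
for cabinet, slot, hostname in boot_plan(layout):
    print(f"rack {cabinet} slot {slot}: {hostname}")
```

Comparing this plan with the names shown on the insert-ethers screen is a quick way to notice a node that was booted out of order.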
Installing compute nodes using add-hosts
For small clusters, insert-ethers is the quickest and easiest way to install Platform OCS. However, for larger clusters of 128 nodes and beyond, the add-hosts tool provides better configuration management. A large cluster requires planning the layout of the network, switches, racks, and nodes in the cluster. The add-hosts tool is an easy way to plan the cluster layout when the hardware vendor provides a list of all of the MAC addresses.
The steps are as follows:
- Obtain a list of the MAC addresses for all nodes in the cluster. Save the addresses in a text file, for example /opt/rocks/etc/mac.txt. The MAC addresses must be listed in the order in which you plan to add the hosts. In other words, the first MAC address corresponds to the first node in the first rack, the second MAC address to the second node in the first rack, and so on.
- Create an XML configuration file in /opt/rocks/etc/add-hostsrc to define the names, IP addresses, and appliances that you will be installing.
- For brevity, only a sample add-hostsrc file and MAC address file are shown here. Refer to the Advanced Administration section of this guide for more information on setting up the add-hostsrc file.
Suppose that you are installing 5 compute nodes, located in the same rack, in a class B network (that is, the netmask is 255.255.0.0). Assume that you want to assign IP addresses starting from 10.1.1.5. Your MAC address file and add-hostsrc file will contain:
MAC Address file:
00:11:22:33:44:55 # first compute node
00:11:22:33:44:56 # second compute node
00:11:22:33:45:57 # etc......
00:11:22:33:45:58
00:11:22:33:45:59
Add-hostsrc file:
<?xml version="1.0" standalone="yes"?>
<add-hosts>
  <mac_addr_file value = "/opt/rocks/etc/mac.txt" />
  <num_hosts_per_rack value = "10" />
  <order_by_rack value = "yes" />
  <netmask value = "255.255.0.0" />
  <subnet>
    <host_prefix value = "compute" />
    <baseip value = "10.1.1.5" />
    <num_hosts_in_subnet value = "5" />
    <appliance value = "compute" />
  </subnet>
</add-hosts>
- Run add-hosts to populate the database based on the information in the XML file above:
# add-hosts
The following information is added to the Platform OCS database:
- PXE boot your compute nodes. Note that the order in which your nodes are PXE booted is not important since the node information is already in the database. You can PXE boot several hosts at the same time.
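To preview what a MAC address file and a base IP address imply before running add-hosts, you can compute the assignment yourself. The following Python sketch is not the add-hosts implementation. It assumes that addresses are handed out sequentially from the base IP in file order and that hosts are named <prefix>-<rack>-<rank>, the pattern used by insert-ethers. Both assumptions, and the function names, are illustrative only. Check the result against the database after running add-hosts.

```python
import ipaddress


def parse_mac_file(text):
    """Return the MAC addresses in file order, ignoring '#' comments and blank lines."""
    macs = []
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()
        if entry:
            macs.append(entry.lower())
    return macs


def plan_hosts(macs, base_ip, hosts_per_rack, prefix="compute"):
    """Assign a name and a sequential IP address to each MAC address.

    Assumption: host N (counting from 0) receives base_ip + N and the name
    <prefix>-<rack>-<rank>, where rack and rank follow from hosts_per_rack.
    """
    base = ipaddress.IPv4Address(base_ip)
    plan = []
    for index, mac in enumerate(macs):
        rack, rank = divmod(index, hosts_per_rack)
        plan.append(
            {"name": f"{prefix}-{rack}-{rank}", "ip": str(base + index), "mac": mac}
        )
    return plan


sample = """\
00:11:22:33:44:55 # first compute node
00:11:22:33:44:56 # second compute node
00:11:22:33:45:57
"""
for host in plan_hosts(parse_mac_file(sample), "10.1.1.5", hosts_per_rack=10):
    print(host["name"], host["ip"], host["mac"])
```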
Install other appliance types
In addition to compute nodes, you can install other appliance types:
Install an LSF HPC master candidate host
If you installed the LSF HPC roll, you can install LSF HPC master candidate nodes so that the LSF HPC master host can fail over to another host. This increases cluster uptime and availability. We recommend installing one or more LSF HPC master nodes if you are setting up a large cluster.
Install an LSF HPC master candidate host using the following steps:
- Log into the frontend as root
- Run insert-ethers and select the LSF HPC Master appliance type.
- Install one or more of the LSF HPC Master nodes using PXE boot.
- Exit insert-ethers by pressing F9 to update the lsf.cluster.lsfhpc file.
- Create an NFS shared path on another NFS server, and make sure that this NFS path can be mounted on the new LSF HPC master node.
- On the frontend, run the following:
# cd /home/install/upgrades/lsfhpc
# config-lsf-master
- Answer the dialog questions when prompted by the script.
Install a PVFS2 meta server
If you installed the PVFS2 roll, you can install this appliance type. The PVFS2 appliance installs a server that acts as both a PVFS2 Meta Server and Data Server. It creates a sample PVFS2 filesystem that is mounted under /mnt/pvfs2.
Install a PVFS2 meta server using the following steps:
- Log into the frontend as root
- Run insert-ethers and select the Pvfs2-meta-server appliance type.
- Install the PVFS2 meta server using PXE boot.
- Repeat the process until all the nodes to be used as Data Servers are installed.
- Exit insert-ethers by pressing F9.
- Follow the instructions in the PVFS2 Roll section under Production Cluster Configuration to complete the configuration.
Test compute nodes and appliances
You can test the compute nodes and appliances as follows:
- Check if you can log into the compute node without a password:
# ssh <compute node name>
- Check DNS by resolving the frontend's hostname:
# host <frontend's local name>
- Check if /home/install is auto-mounted:
# ls /home/install/
- Check if 411 can update all of the files on the compute node:
# 411get --all
- If you installed an LSF HPC master candidate host, perform the following tests:
- If you installed a PVFS2 meta server, check that the /mnt/pvfs2 path is mounted. Test that other compute nodes can also mount the /mnt/pvfs2 path.
Test the cluster installation
Before proceeding further, make sure you have completed the post-install tests for the frontend and compute nodes. When done, run the following tests to ensure that your cluster is functioning properly.
- Run cluster-fork to verify that all nodes can be reached:
# cluster-fork hostname
- Test 411 to verify that 411 broadcasts can be sent out to all nodes:
# make -C /var/411 force
- Check Ganglia (if installed). Point your browser to http://localhost/ganglia and verify that all nodes appear on the webpage.
- Check Clumon (if installed). Point your browser to http://localhost/clumon and verify that all nodes appear on the webpage.
- Check the Lava cluster (if installed) to see if all nodes appear in the cluster:
# lsid
# lsload
# bhosts
- Check the LSF HPC cluster (if installed) to see if all nodes appear in the cluster:
# lsid
# lsload
# bhosts
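If you repeat the command-line checks above often, a small wrapper can run them in order and summarize which ones failed. The following Python sketch is one possible way to do this. The labels and function name are hypothetical, and it assumes that each command returns a non-zero exit status on failure, which this guide does not state. A zero exit status only shows that the command ran, so you still need to read the output to confirm that every node is listed.

```python
import subprocess

# Commands taken from the checks above; the labels are illustrative.
CHECKS = [
    ("nodes reachable", ["cluster-fork", "hostname"]),
    ("411 push", ["make", "-C", "/var/411", "force"]),
    ("scheduler identity", ["lsid"]),
    ("scheduler load", ["lsload"]),
    ("batch hosts", ["bhosts"]),
]


def _run(argv):
    """Run one command quietly and return its exit status (127 if not found)."""
    try:
        return subprocess.run(
            argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        ).returncode
    except FileNotFoundError:
        return 127


def run_checks(checks=CHECKS, runner=_run):
    """Run every check and return a list of (label, passed) tuples."""
    return [(label, runner(argv) == 0) for label, argv in checks]


def failed_labels(results):
    """Return the labels of the checks that did not pass."""
    return [label for label, passed in results if not passed]
```

On the frontend, calling failed_labels(run_checks()) as root returns the labels of the checks that need a closer look. The runner parameter lets you substitute a stub when trying the script away from the cluster.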
Basic Administration
The following topics describe basic tasks for administering your Platform OCS cluster:
- Online documentation
- Clumon
- Platform Lava GUI
- Ganglia
- Ntop
- SSH
- Adding, removing, or upgrading rolls
- Adding or removing users
- Firewall/iptables
- Platform OCS services and utilities
- Reinstalling compute nodes
- Log files
- Cron jobs
Online documentation
Documentation is provided online by the frontend node. Start a browser on the frontend. It defaults to the Cluster page, which contains links to the following guides:
Roll-specific documentation is available from the Installed Rolls link, by following the Guide or Readme links beside the roll of interest.
Clumon
Clumon is a cluster job-monitoring tool that lets the administrator view job states, job queues, load information, resource usage, process information, and whether a node's scheduler daemons are up or down.
Clumon represents the system load of each compute node as a coloured bar. Node icons carry indicators of the node's load: red indicates a high or heavy load, and various shades of blue indicate a lighter load. If a node is experiencing problems or is down, its icon becomes a black and red crossbones. If you move your mouse over a node icon, a popup note appears with summary information about the node.
To view Clumon information, go to the main cluster webpage and click the "Cluster Status (CluMon)" link, or point your browser to http://localhost/clumon (on the frontend).
In a screen with running jobs, you can examine each job's state by clicking on the job number, or by examining the queues:
Platform Lava GUI
The Lava web GUI is a frontend to the Lava batch scheduling system. Users can submit jobs and perform actions such as suspending, resuming or killing jobs.
To submit or modify jobs, use the Lava GUI web interface: go to the main cluster webpage and click on "Lava GUI". You will need to log into the interface. The root user is not permitted to log in. Log in using an existing user account or the lavaadmin account.
The following is a Lava GUI dialog window for submitting a job:
Ganglia
Ganglia is a cluster statistics collector which monitors node availability, displays system load, network usage, and other resource information over a period of time. Data is collected for each metric and is stored on the frontend. The data is stored for up to one year.
Ganglia displays detailed information about the usage of each node and gives the administrator a guide to the day-to-day functioning of the cluster.
To view Ganglia information, go to the main cluster webpage and click the "Cluster Status (Ganglia)" link.
The following is a Ganglia display showing the overview of a cluster:
Ntop
Ntop is a network traffic analyzer that shows the administrator the traffic, broken down by protocol, passing through the frontend. Ntop can also show network traffic patterns to help diagnose network problems and network utilization issues.
By default, Ntop is configured for both public and private traffic with one interface always listening. You can switch which network interface ntop should listen on by clicking the Admin menu option and selecting "Switch NIC". From there a new screen will appear and you can then select which network interface to listen on.
Ntop provides several plug-ins that can be enabled or disabled for further analysis of the traffic. See the plugins page within Ntop for more details.
To view the Ntop page, go to the main cluster webpage and click the "Ntop Cluster Monitoring (SSL)" link.
The following is an Ntop display showing active TCP and UDP sessions connected to a frontend on a private network:
SSH
By default, the OpenSSH daemon is configured to enable X11 forwarding. This can sometimes slow down connecting to nodes. You can skip X11 forwarding for a single connection by using the -x option when connecting to a node.
X11 forwarding can also be disabled permanently by editing the /etc/ssh/ssh_config file and changing the line "ForwardX11 Yes" to "ForwardX11 No".
An SSH connection from one node to another may be slow to set up. This is usually caused by a name resolution failure and the subsequent timeout, which can occur if the frontend was installed with an invalid DNS server.
Note: This will also slow MPI jobs.
Adding, removing, or upgrading rolls
Platform OCS provides the rollops tool, which lets you perform roll maintenance on the frontend.
Adding a roll
Using the rollops tool, you can add a roll to the frontend. To do this, you need either a CD/DVD roll or a downloaded ISO image.
- Insert the CD/DVD roll into the drive, or use the -i option of rollops.
- Run either of the following:
Example output:
rollops: Copying Roll: ntop
Copying roll from media (directory "/tmp/tmprcwC0V") into mirror
Copying "ntop" (4.1.1,x86_64) roll...
7645 blocks
chmod a+rx /home/install/ftp.rocksclusters.org
Installing Roll: ntop, please wait...
<Roll installation output>
rollops: The 'ntop' roll has been successfully installed!
Note: If the CD/DVD roll or the ISO image is a meta-roll (a roll that contains many rolls in one), you will see a list of rolls to install.
rollops: Autodetecting CD-ROM/DVD roll...
Rolls found
1) clumon
2) extras
3) ganglia
4) lsfhpc
5) modules
6) myrinet
7) ntop
8) pvfs2
9) ts_ib
q) Quit
To install a roll, type the number or type "q" to quit>
Upgrading a roll
- Insert the CD/DVD roll into the drive or use the rollops -i option.
- Run either of the following:
Example output:
rollops: Copying Roll: dell
Copying roll from media (directory "/tmp/tmprcwC0V") into mirror
Copying "dell" (4.1.1,x86_64) roll...
7645 blocks
chmod a+rx /home/install/ftp.rocksclusters.org
Installing Roll: dell, please wait...
<Roll installation output>
rollops: The 'dell' roll has been successfully upgraded!
Note: If the CD/DVD roll or the ISO image is a meta-roll (a roll that contains many rolls in one), you will see a list of rolls that can be upgraded.
You can upgrade to a roll of the same version or a newer version, but you cannot roll back to an older version.
rollops: Autodetecting CD-ROM/DVD roll...
Rolls found
1) myrinet
2) dell
3) intel_mpirt
4) ts_ib
q) Quit
To upgrade a roll, type the number or type "q" to quit>
Removing a roll
To remove the roll from the frontend, run the following command:
# rollops -e <roll_name>
Example output:
rollops: Removing Roll: 'ntop', please wait...
<Roll removal output>
rollops: The 'ntop' roll has been removed successfully!
Disabling a roll
To prevent a roll from being installed on compute nodes, run the following command:
# rollops -p no -r <roll_name>
Example output:
rollops: Setting permissions for the 'ntop' roll. Please wait...
rollops: Completed updating permissions for the 'ntop' roll.
Adding or removing users
To add a user or delete a user, you must be logged into the frontend as root. After a user is added or removed, 411 automatically updates the user information on all of the nodes in the cluster.
Adding a user
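- To specify the password on the same command line (note that on Red Hat based systems, the -p option expects the password in encrypted form, as returned by crypt(3)):
# adduser -p <password> <user_name>
- To specify the password using the passwd command:
# adduser <user_name>
# passwd <user_name>
# make -C /var/411
Removing a user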
To remove a user, run the following command:
# userdel <user_name>
Firewall/iptables
The frontend is installed with firewalling software (iptables). It is configured with some basic forwarding rules. From a network security standpoint the frontend and nodes are not secure. Evaluate the security risks at your site and create appropriate firewall rules to secure the cluster.
Warning: The frontend should never be connected to the Internet without first restricting the type of packets allowed by customizing the iptables rules.
By default, services are only visible to the private network. However, you may choose to enable HTTP and HTTPS over the public network. Note that this exposes your cluster homepage and Platform OCS database to the external network. To open HTTP and HTTPS access, edit the /etc/sysconfig/iptables file and uncomment the following lines:
# Uncomment the lines below to activate web access to the cluster.
-A INPUT -m state --state NEW -p tcp --dport https -j ACCEPT
-A INPUT -m state --state NEW -p tcp --dport www -j ACCEPT
Then, restart iptables:
# service iptables restart
For details on customizing your firewall, see http://www.netfilter.org.
By default, Platform OCS uses eth0 for private traffic and eth1 for public traffic.
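The manual edit of /etc/sysconfig/iptables described above can also be scripted. The following Python sketch works on the text of the file and uncomments only the two HTTPS and HTTP rules. It assumes the rules are commented out with a leading "#", optionally followed by spaces; the exact layout in your file may differ. The function name and the sample text are hypothetical. The sketch does not write the file or restart iptables.

```python
import re

# A commented-out INPUT rule that accepts traffic to the https or www port.
_WEB_RULE = re.compile(
    r"^\s*#\s*(-A INPUT\b.*--dport (?:https|www)\b.*-j ACCEPT)\s*$"
)


def enable_web_access(iptables_text):
    """Return the text with the commented https/www ACCEPT rules uncommented.

    All other lines, including ordinary comments, are left unchanged.
    """
    result = []
    for line in iptables_text.splitlines():
        match = _WEB_RULE.match(line)
        result.append(match.group(1) if match else line)
    return "\n".join(result) + "\n"


# Hypothetical excerpt of the file, for illustration only.
sample = (
    "# Uncomment the lines below to activate web access to the cluster.\n"
    "#-A INPUT -m state --state NEW -p tcp --dport https -j ACCEPT\n"
    "# -A INPUT -m state --state NEW -p tcp --dport www -j ACCEPT\n"
)
print(enable_web_access(sample), end="")
```

Keep a backup of the original file, review the result before saving it, and then restart iptables as described above.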
Platform OCS services and utilities
411
Platform OCS provides a service called 411, which is very similar to NIS. It is used to synchronize files across a cluster. The frontend multicasts a change notification, and the nodes then download the changed file over an encrypted channel. Users and groups are one example of information passed over 411. Whenever you run useradd or userdel, 411 updates the user information on all nodes in the cluster.
The diagram below depicts the process.
By default the following files are propagated throughout the cluster by 411:
- /etc/passwd
- /etc/shadow
- /etc/group
- /etc/services
- /etc/rpc
- /etc/auto.* files (for example, auto.master and auto.home)
If you have made any changes to the files listed above, run the command make -C /var/411 to push the updated files to the cluster.
You can also have a compute node pull all 411 synchronized files by running 411get --all on that node. To update all compute nodes, run the following command:
# cluster-fork 411get --all
See the Advanced Administration section for how to customize 411.
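Whether a push is needed after an edit depends on whether the edited file is one that 411 propagates. The following Python sketch checks a list of changed paths against the default file list given above and returns the push command when at least one of them is covered. The function names are hypothetical, and the list reflects only the defaults named in this guide, not any site-specific 411 customization.

```python
from fnmatch import fnmatch

# Default files propagated by 411, as listed in this guide.
DEFAULT_411_FILES = [
    "/etc/passwd",
    "/etc/shadow",
    "/etc/group",
    "/etc/services",
    "/etc/rpc",
    "/etc/auto.*",
]


def is_propagated(path, patterns=DEFAULT_411_FILES):
    """Return True if path matches one of the 411 file patterns."""
    return any(fnmatch(path, pattern) for pattern in patterns)


def push_command_for(changed_paths):
    """Return the 411 push command if any changed file is covered, else None."""
    if any(is_propagated(path) for path in changed_paths):
        return ["make", "-C", "/var/411"]
    return None


print(push_command_for(["/etc/auto.home", "/etc/hosts"]))
```

The returned list can be passed to subprocess.run on the frontend. A result of None means that none of the changed files is in the default 411 list.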
Rocks-grub
The rocks-grub service is a tool that forces an appliance such as a compute node to reinstall if the node is powered off incorrectly, for example during a power outage. If the service is turned ON, the node will be reinst