====== Cluster in a Box ======
A VirtualBox Solaris Cluster complete with iSCSI quorum disks!
===== Host System =====
The host system is a speedy laptop:
* Intel(R) Core(TM)2 Duo CPU P8700 @ 2.53GHz
* 6 GB RAM
* GeForce GTX 260M
At the time of this article, I'm running Ubuntu 10.04 (Lucid) with VirtualBox 3.2.8. The cluster software is Solaris Cluster 3.3, running in Solaris 10u9 virtual machines. Here's an {{:prophead:s10u9sc33.tgz|export}} of the appliance if you want to use it; it has the networks defined, so all you need to do is create a hard drive and load the software. VirtualBox will also require a //Host-only Network//, which is configured under //File// -> //Preferences// -> //Network//. I called mine //vboxnet0// and configured it as follows:
^ IPv4 Address|192.168.1.254 |
^ IPv4 Network Mask|255.255.255.0 |
I don't use IPv6 so I just went with the defaults.
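The same host-only interface can also be set up from the command line instead of the GUI. A dry-run sketch that just prints the commands (drop the echos to actually run them on a machine with VirtualBox installed):

```shell
# Create a host-only interface (VirtualBox names the first one vboxnet0)
# and give it the address used in this article.
echo VBoxManage hostonlyif create
echo VBoxManage hostonlyif ipconfig vboxnet0 --ip 192.168.1.254 --netmask 255.255.255.0
```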
===== Shared Storage =====
A cluster requires some sort of shared storage. In VirtualBox this can be accomplished using either iSCSI or fixed-size virtual disks.
===== Fixed Size Virtual Disks =====
If you don't care to set up iSCSI, later releases of VirtualBox support shareable fixed-size virtual disks:
Create the disks.
$ for f in 0 1 2 3 4 5; do
> VBoxManage createhd --filename shared-$f.vdi --size 5000 --format VDI --variant Fixed
> done
Attach the disks to an existing VM.
$ for f in 0 1 2 3 4 5; do
> VBoxManage storageattach node1 --storagectl "SCSI Controller" --port $f --device 0 --type hdd --medium shared-$f.vdi
> done
Modify the disk type.
$ for f in 0 1 2 3 4 5; do
> VBoxManage modifyhd shared-$f.vdi --type shareable
> done
You may now attach these drives to multiple VMs and use them as shared storage.
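Once marked shareable, the same //storageattach// loop works for the other node. A dry-run sketch (//node2// is a hypothetical name for the second VM; drop the echo to run it for real):

```shell
# Attach the now-shareable disks to the second VM on the same SCSI ports.
# The echo makes this a dry run that only prints the commands.
for f in 0 1 2 3 4 5; do
    echo VBoxManage storageattach node2 --storagectl "SCSI Controller" \
        --port $f --device 0 --type hdd --medium shared-$f.vdi
done
```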
===== iSCSI =====
Before the cluster software is installed in the virtual machines, an iSCSI target needs to be installed and configured on the Linux host system.
- **Create Disk Image Files**.\\ iSCSI will require a few disk image files, which can be created with dd. The following creates six 1 GB disk images:
cd /some/place/with/plenty/of/storage
for i in {0..5}; do dd if=/dev/zero of=lun$i.img bs=1024k count=1000; done
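If you'd rather not commit ~6 GB of host disk up front, sparse files are an alternative; IET's fileio type generally accepts them, though fully allocated images avoid surprises once the guests start writing. A sketch (using a temp directory purely for illustration):

```shell
# Seek past 999 blocks and write one 1 MiB block, so each image reports
# the full ~1 GB size while consuming almost no space until written to.
cd "$(mktemp -d)"    # stand-in for your storage directory
for i in 0 1 2 3 4 5; do
    dd if=/dev/zero of=lun$i.img bs=1024k seek=999 count=1 2>/dev/null
done
ls -ls lun0.img      # apparent size ~1 GB, allocated blocks near zero
```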
- **Install iSCSI Packages**
sudo apt-get install iscsitarget
- **Configure iSCSI**\\ Add these entries to /etc/ietd.conf:
Target iqn.2010-04.us.webcave.linux-host:storage
Lun 0 Path=/some/place/with/plenty/of/storage/lun0.img,Type=fileio
Lun 1 Path=/some/place/with/plenty/of/storage/lun1.img,Type=fileio
Lun 2 Path=/some/place/with/plenty/of/storage/lun2.img,Type=fileio
Lun 3 Path=/some/place/with/plenty/of/storage/lun3.img,Type=fileio
Lun 4 Path=/some/place/with/plenty/of/storage/lun4.img,Type=fileio
Lun 5 Path=/some/place/with/plenty/of/storage/lun5.img,Type=fileio
\\ Then edit /etc/default/iscsitarget and enable the target:
ISCSITARGET_ENABLE=true
- **Start the Service**\\ sudo /etc/init.d/iscsitarget start
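The Lun entries above can also be generated with a quick loop instead of typed by hand; a sketch using this article's example IQN and path (append the output to /etc/ietd.conf):

```shell
# Emit a Target stanza with one fileio Lun per image file.
# IQN and directory are the example values used in this article.
target=iqn.2010-04.us.webcave.linux-host:storage
dir=/some/place/with/plenty/of/storage
echo "Target $target"
for i in 0 1 2 3 4 5; do
    echo "        Lun $i Path=$dir/lun$i.img,Type=fileio"
done
```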
===== Virtual Machines =====
- Create two Solaris 10 virtual machines using this appliance {{:prophead:s10u9sc33.tgz|export}}. You will need to attach a 15 GB hard disk to each machine. Install Solaris 10u9 in each machine, selecting at least the //End User// package cluster (I went with //All//). When finishing the install, make sure you select a default router pointing to the host system's virtual network adapter.
- Log into the virtual machines using the command line login option and change a few things before we get started:
- Edit /etc/hosts on both systems and add an entry for the other cluster node(s).
- Disable the graphical login:\\ svcadm disable cde-login
- Set up a decent /.profile
#
# let's use a real shell
#
if [ "$0" = "-sh" -a "$SHELL" = "/sbin/sh" ]; then
    SHELL="/usr/bin/bash"; export SHELL
    exec $SHELL -l
fi

if [ "$SHELL" = "/usr/bin/bash" ]; then
    set -o vi
    export PAGER=less
    export PATH=/usr/sbin:/usr/bin:/usr/cluster/bin:/usr/cluster/lib/sc:/usr/ccs/bin:/usr/proc/bin:/usr/sfw/bin
    export MANPATH=/usr/share/man:/usr/cluster/man:/usr/sfw/man
    if [ "`tty`" != "/dev/console" ]; then
        PROMPT_COMMAND='echo -ne "\033]0;${LOGNAME}@${HOSTNAME}: ${PWD}\007"'
    fi
    export PS1="${LOGNAME}@${HOSTNAME} # "
fi
- Enable root logins by editing /etc/default/login and commenting the //CONSOLE// entry.
vi /etc/default/login
...
# If CONSOLE is set, root can only login on that device.
# Comment this line out to allow remote login by root.
#
#CONSOLE=/dev/console
...
- Enable root ssh logins by editing /etc/ssh/sshd_config and changing //PermitRootLogin// to yes:
vi /etc/ssh/sshd_config
...
# Are root logins permitted using sshd.
# Note that sshd uses pam_authenticate(3PAM) so the root (or any other) user
# may be denied access by a PAM module regardless of this setting.
# Valid options are yes, without-password, no.
PermitRootLogin yes
...
Notify sshd of the change:\\ pkill -HUP sshd
- If you chose to use iSCSI, set up the iSCSI initiator in the VMs:\\
iscsiadm add static-config iqn.2010-04.us.webcave.linux-host:storage,192.168.1.254:3260
iscsiadm modify discovery --static enable
This should add the iSCSI disks, which you can verify by running //format//. If the disks are now present in //format//, label them.\\
# format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
0. c0d0
/pci@0,0/pci-ide@1,1/ide@0/cmdk@0,0
1. c3t1d0
/iscsi/disk@0000iqn.2010-04.us.webcave.linux-host%3Astorage0001,0
2. c3t2d0
/iscsi/disk@0000iqn.2010-04.us.webcave.linux-host%3Astorage0001,1
3. c3t3d0
/iscsi/disk@0000iqn.2010-04.us.webcave.linux-host%3Astorage0001,2
4. c3t4d0
/iscsi/disk@0000iqn.2010-04.us.webcave.linux-host%3Astorage0001,3
5. c3t5d0
/iscsi/disk@0000iqn.2010-04.us.webcave.linux-host%3Astorage0001,4
6. c3t6d0
/iscsi/disk@0000iqn.2010-04.us.webcave.linux-host%3Astorage0001,5
Specify disk (enter its number):
===== Cluster Install =====
Install the cluster software in the VMs but choose the option to //Configure Later// and reboot. Run //scinstall// on the first node:
*** Main Menu ***
Please select from one of the following (*) options:
* 1) Create a new cluster or add a cluster node
2) Configure a cluster to be JumpStarted from this install server
3) Manage a dual-partition upgrade
4) Upgrade this cluster node
* 5) Print release information for this cluster node
* ?) Help with menu options
* q) Quit
Option: 1
*** New Cluster and Cluster Node Menu ***
Please select from any one of the following options:
1) Create a new cluster
2) Create just the first node of a new cluster on this machine
3) Add this machine as a node in an existing cluster
?) Help with menu options
q) Return to the Main Menu
Option: 2
Continue answering the configuration questions, but disable //automatic quorum device selection//. Your install should look like this:
>>> Confirmation <<<
Your responses indicate the following options to scinstall:
scinstall -i \
-C crash \
-F \
-G lofi \
-T node=node1,node=node2,authtype=sys \
-w netaddr=172.16.0.0,netmask=255.255.240.0,maxnodes=64,maxprivatenets=10,numvirtualclusters=12 \
-A trtype=dlpi,name=e1000g1 -A trtype=dlpi,name=e1000g2 \
-B type=switch,name=switch1 -B type=switch,name=switch2 \
-m endpoint=:e1000g1,endpoint=switch1 \
-m endpoint=:e1000g2,endpoint=switch2
Are these the options you want to use (yes/no) [yes]?
Once the first node has rebooted, run //scinstall// on the second node:
*** Main Menu ***
Please select from one of the following (*) options:
* 1) Create a new cluster or add a cluster node
2) Configure a cluster to be JumpStarted from this install server
3) Manage a dual-partition upgrade
4) Upgrade this cluster node
* 5) Print release information for this cluster node
* ?) Help with menu options
* q) Quit
Option: 1
*** New Cluster and Cluster Node Menu ***
Please select from any one of the following options:
1) Create a new cluster
2) Create just the first node of a new cluster on this machine
3) Add this machine as a node in an existing cluster
?) Help with menu options
q) Return to the Main Menu
Option: 3
Answer the install questions; when finished, the scinstall command should look like this:
>>> Confirmation <<<
Your responses indicate the following options to scinstall:
scinstall -i \
-C crash \
-N node1 \
-G lofi \
-A trtype=dlpi,name=e1000g1 -A trtype=dlpi,name=e1000g2 \
-m endpoint=:e1000g1,endpoint=switch1 \
-m endpoint=:e1000g2,endpoint=switch2
Are these the options you want to use (yes/no) [yes]?
After all nodes have been installed and added to the cluster, the shared storage (either iSCSI or shared virtual disks) must be set to use software fencing. Obtain a list of the shared disks so we know which DID devices to change:\\
# cldev list -v
DID Device          Full Device Path
----------          ----------------
d1                  node1:/dev/rdsk/c0d0
d2                  node1:/dev/rdsk/c1t0d0
d3                  node2:/dev/rdsk/c0d0
d4                  node2:/dev/rdsk/c1t0d0
d5                  node1:/dev/rdsk/c3t1d0
d5                  node2:/dev/rdsk/c2t1d0
d6                  node2:/dev/rdsk/c2t2d0
d6                  node1:/dev/rdsk/c3t2d0
d7                  node2:/dev/rdsk/c2t3d0
d7                  node1:/dev/rdsk/c3t3d0
d8                  node2:/dev/rdsk/c2t4d0
d8                  node1:/dev/rdsk/c3t4d0
d9                  node2:/dev/rdsk/c2t5d0
d9                  node1:/dev/rdsk/c3t5d0
d10                 node2:/dev/rdsk/c2t6d0
d10                 node1:/dev/rdsk/c3t6d0
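With more than a handful of disks, picking the shared DIDs out by eye gets error-prone. A DID instance that resolves to paths on more than one node is shared, so a small awk sketch over the //cldev list -v// output can find them (shown here against the saved listing above; on a live node you would pipe //cldev list -v// straight in):

```shell
# Count how many path lines each DID has; a DID appearing on more than
# one line is visible from multiple nodes.  NR > 2 skips the two headers.
shared=$(awk 'NR > 2 { n[$1]++ } END { for (d in n) if (n[d] > 1) print d }' <<'EOF' | sort
DID Device          Full Device Path
----------          ----------------
d1                  node1:/dev/rdsk/c0d0
d2                  node1:/dev/rdsk/c1t0d0
d3                  node2:/dev/rdsk/c0d0
d4                  node2:/dev/rdsk/c1t0d0
d5                  node1:/dev/rdsk/c3t1d0
d5                  node2:/dev/rdsk/c2t1d0
d6                  node2:/dev/rdsk/c2t2d0
d6                  node1:/dev/rdsk/c3t2d0
d7                  node2:/dev/rdsk/c2t3d0
d7                  node1:/dev/rdsk/c3t3d0
d8                  node2:/dev/rdsk/c2t4d0
d8                  node1:/dev/rdsk/c3t4d0
d9                  node2:/dev/rdsk/c2t5d0
d9                  node1:/dev/rdsk/c3t5d0
d10                 node2:/dev/rdsk/c2t6d0
d10                 node1:/dev/rdsk/c3t6d0
EOF
)
echo "$shared"
```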
The shared disks are d5 through d10, so we'll change the fencing option on these devices to //nofencing//.\\
# for i in 5 6 7 8 9 10; do
> cldev set -p default_fencing=nofencing d$i
> done
We can verify this setting:
# for i in 5 6 7 8 9 10; do
> cldev show d$i
> done
=== DID Device Instances ===
DID Device Name: /dev/did/rdsk/d5
Full Device Path: node2:/dev/rdsk/c2t1d0
Full Device Path: node1:/dev/rdsk/c3t1d0
Replication: none
default_fencing: nofencing
=== DID Device Instances ===
DID Device Name: /dev/did/rdsk/d6
Full Device Path: node1:/dev/rdsk/c3t2d0
Full Device Path: node2:/dev/rdsk/c2t2d0
Replication: none
default_fencing: nofencing
=== DID Device Instances ===
DID Device Name: /dev/did/rdsk/d7
Full Device Path: node1:/dev/rdsk/c3t3d0
Full Device Path: node2:/dev/rdsk/c2t3d0
Replication: none
default_fencing: nofencing
=== DID Device Instances ===
DID Device Name: /dev/did/rdsk/d8
Full Device Path: node1:/dev/rdsk/c3t4d0
Full Device Path: node2:/dev/rdsk/c2t4d0
Replication: none
default_fencing: nofencing
=== DID Device Instances ===
DID Device Name: /dev/did/rdsk/d9
Full Device Path: node1:/dev/rdsk/c3t5d0
Full Device Path: node2:/dev/rdsk/c2t5d0
Replication: none
default_fencing: nofencing
=== DID Device Instances ===
DID Device Name: /dev/did/rdsk/d10
Full Device Path: node1:/dev/rdsk/c3t6d0
Full Device Path: node2:/dev/rdsk/c2t6d0
Replication: none
default_fencing: nofencing
Now that we have changed the fencing protocol, add a quorum disk:\\ clq add d5
Check the disk and make sure the //Access Mode// is set to //sq_disk// (software quorum).
# clq show d5
=== Quorum Devices ===
Quorum Device Name: d5
Enabled: yes
Votes: 1
Global Name: /dev/did/rdsk/d5s2
Type: shared_disk
Access Mode: sq_disk
Hosts (enabled): node1, node2
We can now turn off installmode.
# cluster set -p installmode=disabled
# clq reset
# clq show
=== Cluster Nodes ===
Node Name: node1
Node ID: 1
Quorum Vote Count: 1
Reservation Key: 0x4C927F9400000001
Node Name: node2
Node ID: 2
Quorum Vote Count: 1
Reservation Key: 0x4C927F9400000002
=== Quorum Devices ===
Quorum Device Name: d5
Enabled: yes
Votes: 1
Global Name: /dev/did/rdsk/d5s2
Type: shared_disk
Access Mode: sq_disk
Hosts (enabled): node1, node2