A VirtualBox Solaris Cluster complete with iSCSI quorum disks!
The host system is a speedy laptop:
At the time of this article, I'm running Ubuntu 10.04 (Lucid) with VirtualBox 3.2.8. The cluster is Solaris Cluster 3.3 running in Solaris 10u9 virtual machines. Here's an export of the appliance if you want to use it; it has the networks defined, so all you need to do is create a hard drive and load the software. VirtualBox will also require a Host-only Network, which is configured under File → Preferences → Network. I called mine vboxnet0 and configured it as such:
| Setting | Value |
|---|---|
| IPv4 Address | 192.168.1.254 |
| IPv4 Network Mask | 255.255.255.0 |
I don't use IPv6 so I just went with the defaults.
A cluster requires shared storage. In VirtualBox this can be provided either with iSCSI or with fixed-size virtual disks marked shareable.
If you don't care to set up iSCSI, you can use shared virtual storage in later releases of VirtualBox:
Create the disks.
$ for f in 0 1 2 3 4 5 ;do
> VBoxManage createhd --filename shared-$f.vdi --size 5000 --format VDI --variant Fixed
> done
Attach the disks to an existing VM.
$ for f in 0 1 2 3 4 5 ;do
> VBoxManage storageattach node1 --storagectl "SCSI Controller" --port $f --device 0 --type hdd --medium shared-$f.vdi
> done
Modify the disk type.
$ for f in 0 1 2 3 4 5 ;do
> VBoxManage modifyhd shared-$f.vdi --type shareable
> done
You may now attach these drives to multiple VMs and use them as shared storage.
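The same attach loop works for the second VM. A sketch, assuming the second VM is named node2 (matching the node names used later) with a controller called "SCSI Controller"; it prints each command so you can review before running, and dropping the echo applies them for real:

```shell
# Print the storageattach commands for the second VM for review.
# "node2" and "SCSI Controller" are assumptions -- adjust to your setup.
for f in 0 1 2 3 4 5; do
  echo "VBoxManage storageattach node2 --storagectl \"SCSI Controller\" --port $f --device 0 --type hdd --medium shared-$f.vdi"
done
```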
Before the cluster software is installed into the virtual machines, an iSCSI target needs to be installed and configured on the Linux host. First, create the backing files for the LUNs:

cd /some/place/with/plenty/of/storage
for i in {0..5}; do dd if=/dev/zero of=lun$i.img bs=1024k count=1000; done
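If you want to try the pattern without committing ~6 GB of disk, here is a self-contained sketch of the same loop at 1 MiB per file, plus a size check (the 1048576 figure is just 1024k in bytes):

```shell
# Same dd loop scaled down to 1 MiB per backing file, with a size check.
dir=$(mktemp -d)
for i in 0 1 2 3 4 5; do
  dd if=/dev/zero of="$dir/lun$i.img" bs=1024k count=1 2>/dev/null
done
for i in 0 1 2 3 4 5; do
  test "$(stat -c %s "$dir/lun$i.img")" -eq 1048576 && echo "lun$i.img OK"
done
rm -rf "$dir"
# prints "lun0.img OK" through "lun5.img OK"
```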
Install the iSCSI target software:

sudo apt-get install iscsitarget

Then define the target and its LUNs in the iscsitarget configuration file (ietd.conf):
Target iqn.2010-04.us.webcave.linux-host:storage
Lun 0 Path=/some/place/with/plenty/of/storage/lun0.img,Type=fileio
Lun 1 Path=/some/place/with/plenty/of/storage/lun1.img,Type=fileio
Lun 2 Path=/some/place/with/plenty/of/storage/lun2.img,Type=fileio
Lun 3 Path=/some/place/with/plenty/of/storage/lun3.img,Type=fileio
Lun 4 Path=/some/place/with/plenty/of/storage/lun4.img,Type=fileio
Lun 5 Path=/some/place/with/plenty/of/storage/lun5.img,Type=fileio
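The six nearly identical Lun lines can be generated rather than typed, which avoids copy/paste slips in the LUN numbers (the path below is the same placeholder used above; substitute your own):

```shell
# Print the Lun entries for ietd.conf; paste or redirect into the config.
for i in 0 1 2 3 4 5; do
  echo "Lun $i Path=/some/place/with/plenty/of/storage/lun$i.img,Type=fileio"
done
```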
Edit /etc/default/iscsitarget to enable the service:

ISCSITARGET_ENABLE=true

Then start the target:

/etc/init.d/iscsitarget start

On the host, /proc/net/iet/volume lists the exported LUNs if you want to double-check.
With the target running, I make a few quality-of-life tweaks inside each Solaris guest. First, disable the graphical console login:

svcadm disable cde-login
This goes into root's .profile on each node so logins get a usable shell:

#
# let's use a real shell
#
if [ "$0" = "-sh" -a "$SHELL" = "/sbin/sh" ]; then
    SHELL="/usr/bin/bash"; export SHELL
    exec $SHELL -l
fi

if [ "$SHELL" = "/usr/bin/bash" ]; then
    set -o vi
    export PAGER=less
    export PATH=/usr/sbin:/usr/bin:/usr/cluster/bin:/usr/cluster/lib/sc:/usr/ccs/bin:/usr/proc/bin:/usr/sfw/bin
    export MANPATH=/usr/share/man:/usr/cluster/man:/usr/sfw/man
    if [ "`tty`" != "/dev/console" ]; then
        PROMPT_COMMAND='echo -ne "\033]0;${LOGNAME}@${HOSTNAME}: ${PWD}\007"'
    fi
    export PS1="${LOGNAME}@${HOSTNAME} # "
fi
To allow root to log in remotely, comment out the CONSOLE line in /etc/default/login:

vi /etc/default/login
...
# If CONSOLE is set, root can only login on that device.
# Comment this line out to allow remote login by root.
#
#CONSOLE=/dev/console
...
And permit root logins over ssh:

vi /etc/ssh/sshd_config
...
# Are root logins permitted using sshd.
# Note that sshd uses pam_authenticate(3PAM) so the root (or any other) user
# maybe denied access by a PAM module regardless of this setting.
# Valid options are yes, without-password, no.
PermitRootLogin yes
...
Notify sshd of change:
pkill -HUP sshd
Next, point the Solaris iSCSI initiator at the target on the host and enable static discovery:

iscsiadm add static-config iqn.2010-04.us.webcave.linux-host:storage,192.168.1.254:3260
iscsiadm modify discovery --static enable
This should add the iSCSI disks, which you can verify with the format utility. If the disks show up in format, label them.
# format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
0. c0d0 <DEFAULT cyl 1955 alt 2 hd 255 sec 63>
/pci@0,0/pci-ide@1,1/ide@0/cmdk@0,0
1. c3t1d0 <DEFAULT cyl 485 alt 2 hd 64 sec 32>
/iscsi/disk@0000iqn.2010-04.us.webcave.linux-host%3Astorage0001,0
2. c3t2d0 <DEFAULT cyl 485 alt 2 hd 64 sec 32>
/iscsi/disk@0000iqn.2010-04.us.webcave.linux-host%3Astorage0001,1
3. c3t3d0 <DEFAULT cyl 485 alt 2 hd 64 sec 32>
/iscsi/disk@0000iqn.2010-04.us.webcave.linux-host%3Astorage0001,2
4. c3t4d0 <DEFAULT cyl 485 alt 2 hd 64 sec 32>
/iscsi/disk@0000iqn.2010-04.us.webcave.linux-host%3Astorage0001,3
5. c3t5d0 <DEFAULT cyl 485 alt 2 hd 64 sec 32>
/iscsi/disk@0000iqn.2010-04.us.webcave.linux-host%3Astorage0001,4
6. c3t6d0 <DEFAULT cyl 485 alt 2 hd 64 sec 32>
/iscsi/disk@0000iqn.2010-04.us.webcave.linux-host%3Astorage0001,5
Specify disk (enter its number):
Install the cluster software in the VMs but choose the option to Configure Later and reboot. Run scinstall on the first node:
*** Main Menu ***
Please select from one of the following (*) options:
* 1) Create a new cluster or add a cluster node
2) Configure a cluster to be JumpStarted from this install server
3) Manage a dual-partition upgrade
4) Upgrade this cluster node
* 5) Print release information for this cluster node
* ?) Help with menu options
* q) Quit
Option: 1
*** New Cluster and Cluster Node Menu ***
Please select from any one of the following options:
1) Create a new cluster
2) Create just the first node of a new cluster on this machine
3) Add this machine as a node in an existing cluster
?) Help with menu options
q) Return to the Main Menu
Option: 2
Continue answering the configuration questions, but disable automatic quorum device selection; we'll add a quorum device by hand later using software quorum, since these disks don't support SCSI reservations. Your install should look like this:
>>> Confirmation <<<
Your responses indicate the following options to scinstall:
scinstall -i \
-C crash \
-F \
-G lofi \
-T node=node1,node=node2,authtype=sys \
-w netaddr=172.16.0.0,netmask=255.255.240.0,maxnodes=64,maxprivatenets=10,numvirtualclusters=12 \
-A trtype=dlpi,name=e1000g1 -A trtype=dlpi,name=e1000g2 \
-B type=switch,name=switch1 -B type=switch,name=switch2 \
-m endpoint=:e1000g1,endpoint=switch1 \
-m endpoint=:e1000g2,endpoint=switch2
Are these the options you want to use (yes/no) [yes]?
Once the first node has rebooted, run scinstall on the second node:
*** Main Menu ***
Please select from one of the following (*) options:
* 1) Create a new cluster or add a cluster node
2) Configure a cluster to be JumpStarted from this install server
3) Manage a dual-partition upgrade
4) Upgrade this cluster node
* 5) Print release information for this cluster node
* ?) Help with menu options
* q) Quit
Option: 1
*** New Cluster and Cluster Node Menu ***
Please select from any one of the following options:
1) Create a new cluster
2) Create just the first node of a new cluster on this machine
3) Add this machine as a node in an existing cluster
?) Help with menu options
q) Return to the Main Menu
Option: 3
Answer the install questions; when finished, the scinstall command should look like this:
>>> Confirmation <<<
Your responses indicate the following options to scinstall:
scinstall -i \
-C crash \
-N node1 \
-G lofi \
-A trtype=dlpi,name=e1000g1 -A trtype=dlpi,name=e1000g2 \
-m endpoint=:e1000g1,endpoint=switch1 \
-m endpoint=:e1000g2,endpoint=switch2
Are these the options you want to use (yes/no) [yes]?
After all nodes have been installed and added to the cluster, the shared storage (either iSCSI or shared virtual disks) must be set for software fencing. Obtain a list of the shared disks so we know which DID devices to change:
# cldev list -v
DID Device          Full Device Path
----------          ----------------
d1                  node1:/dev/rdsk/c0d0
d2                  node1:/dev/rdsk/c1t0d0
d3                  node2:/dev/rdsk/c0d0
d4                  node2:/dev/rdsk/c1t0d0
d5                  node1:/dev/rdsk/c3t1d0
d5                  node2:/dev/rdsk/c2t1d0
d6                  node2:/dev/rdsk/c2t2d0
d6                  node1:/dev/rdsk/c3t2d0
d7                  node2:/dev/rdsk/c2t3d0
d7                  node1:/dev/rdsk/c3t3d0
d8                  node2:/dev/rdsk/c2t4d0
d8                  node1:/dev/rdsk/c3t4d0
d9                  node2:/dev/rdsk/c2t5d0
d9                  node1:/dev/rdsk/c3t5d0
d10                 node2:/dev/rdsk/c2t6d0
d10                 node1:/dev/rdsk/c3t6d0
The shared disks are d5 through d10 so we'll change the fencing option for these devices to nofencing.
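Before pointing a loop at live DID devices, it's easy to preview exactly what it will run; this sketch just prints the commands, and dropping the echo applies them:

```shell
# Print the fencing changes for the shared DIDs (d5..d10) for review.
for i in 5 6 7 8 9 10; do
  echo "cldev set -p default_fencing=nofencing d$i"
done
```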
# for i in 5 6 7 8 9 10; do
> cldev set -p default_fencing=nofencing d$i
> done
We can verify this setting:
# for i in 5 6 7 8 9 10; do
> cldev show d$i
> done

=== DID Device Instances ===

DID Device Name:          /dev/did/rdsk/d5
  Full Device Path:       node2:/dev/rdsk/c2t1d0
  Full Device Path:       node1:/dev/rdsk/c3t1d0
  Replication:            none
  default_fencing:        nofencing

=== DID Device Instances ===

DID Device Name:          /dev/did/rdsk/d6
  Full Device Path:       node1:/dev/rdsk/c3t2d0
  Full Device Path:       node2:/dev/rdsk/c2t2d0
  Replication:            none
  default_fencing:        nofencing

=== DID Device Instances ===

DID Device Name:          /dev/did/rdsk/d7
  Full Device Path:       node1:/dev/rdsk/c3t3d0
  Full Device Path:       node2:/dev/rdsk/c2t3d0
  Replication:            none
  default_fencing:        nofencing

=== DID Device Instances ===

DID Device Name:          /dev/did/rdsk/d8
  Full Device Path:       node1:/dev/rdsk/c3t4d0
  Full Device Path:       node2:/dev/rdsk/c2t4d0
  Replication:            none
  default_fencing:        nofencing

=== DID Device Instances ===

DID Device Name:          /dev/did/rdsk/d9
  Full Device Path:       node1:/dev/rdsk/c3t5d0
  Full Device Path:       node2:/dev/rdsk/c2t5d0
  Replication:            none
  default_fencing:        nofencing

=== DID Device Instances ===

DID Device Name:          /dev/did/rdsk/d10
  Full Device Path:       node1:/dev/rdsk/c3t6d0
  Full Device Path:       node2:/dev/rdsk/c2t6d0
  Replication:            none
  default_fencing:        nofencing
Now that we have changed the fencing protocol, add a quorum disk:
# clq add d5
Check the disk and make sure the Access Mode is set to sq_disk (software quorum).
# clq show d5

=== Quorum Devices ===

Quorum Device Name:       d5
  Enabled:                yes
  Votes:                  1
  Global Name:            /dev/did/rdsk/d5s2
  Type:                   shared_disk
  Access Mode:            sq_disk
  Hosts (enabled):        node1, node2
We can now turn off installmode.
# cluster set -p installmode=disabled
# clq reset
# clq show

=== Cluster Nodes ===

Node Name:                node1
  Node ID:                1
  Quorum Vote Count:      1
  Reservation Key:        0x4C927F9400000001

Node Name:                node2
  Node ID:                2
  Quorum Vote Count:      1
  Reservation Key:        0x4C927F9400000002

=== Quorum Devices ===

Quorum Device Name:       d5
  Enabled:                yes
  Votes:                  1
  Global Name:            /dev/did/rdsk/d5s2
  Type:                   shared_disk
  Access Mode:            sq_disk
  Hosts (enabled):        node1, node2