A VirtualBox Solaris Cluster complete with shared storage and quorum disk!
Host System
The host system is a Mac running 10.8:
Intel Core i7-3770K Ivy Bridge 3.5GHz
16GB RAM
GeForce GTX 470 1280MB
At the time of this article, I’m running Mac OS X 10.8.2 with VirtualBox 4.2.6. The cluster is Solaris Cluster 3.3, running in a Solaris 10u10 virtual machine.
I have also tested this procedure using Solaris 11.1 and Solaris Cluster 4.1. There are some minor differences:
- The cluster software is installed with the pkg command:
pkg install ha-cluster-full
- root is now a role, not a user.
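For the Solaris 11.1 / Cluster 4.1 variant, those two differences look roughly like this. The repository mount point /mnt/repo is an assumption for this sketch, and converting root back to a normal user is optional:
# pkg set-publisher -g file:///mnt/repo ha-cluster    # /mnt/repo is an assumed mount point for the 4.1 repo ISO
# pkg install ha-cluster-full
# rolemod -K type=normal root                         # optional: turn the root role back into a normal user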
Virtual Machines
1. Create a Solaris 10 virtual machine using this appliance, s10u10sc33u1zfs. You will need to attach a 15GB hard disk to the SATA controller. Boot the Solaris 10u10 DVD and when prompted select option 4:
1. Oracle Solaris Interactive (default)
2. Custom JumpStart
3. Oracle Solaris Interactive Text (Desktop session)
4. Oracle Solaris Interactive Text (Console session)
(Select option 3 or 4 to install a ZFS root file system)
5. Apply driver updates
6. Single user shell
Enter the number of your choice.
Timeout cancelled. Please make a selection (1 - 6).
This allows for a ZFS root. Make sure you select ZFS (not UFS) when asked, and select the default software group, Entire Distribution. When the install has completed, the machine will reboot to the graphical login screen.
2. Log into the virtual machine using the command line login option and configure a few items:
- Modify or create the /.profile:
#
# let's use a real shell
#
if [ "$0" = "-sh" -a "$SHELL" = "/sbin/sh" ]; then
    SHELL="/usr/bin/bash"; export SHELL
    exec $SHELL -l
fi

if [ "$SHELL" = "/usr/bin/bash" ]; then
    set -o vi
    export PAGER=less
    export PATH=/usr/sbin:/usr/bin:/usr/cluster/bin:/usr/cluster/lib/sc:/usr/ccs/bin:/usr/proc/bin:/usr/sfw/bin
    export MANPATH=/usr/share/man:/usr/cluster/man:/usr/sfw/man
    if [ "`tty`" != "/dev/console" ]; then
        PROMPT_COMMAND='echo -ne "\033]0;${LOGNAME}@${HOSTNAME}: ${PWD}\007"'
    else
        export TERM=xterm
        /usr/openwin/bin/resize
    fi
    export PS1="${LOGNAME}@${HOSTNAME} # "
fi
- Disable the graphical login:
# svcadm disable cde-login
- Enable remote root logins by editing /etc/default/login and commenting the CONSOLE entry.
vi /etc/default/login
...
# If CONSOLE is set, root can only login on that device.
# Comment this line out to allow remote login by root.
#
#CONSOLE=/dev/console
...
- Enable root ssh logins by editing /etc/ssh/sshd_config and changing PermitRootLogin to yes:
vi /etc/ssh/sshd_config
...
# Are root logins permitted using sshd.
# Note that sshd uses pam_authenticate(3PAM) so the root (or any other) user
# maybe denied access by a PAM module regardless of this setting.
# Valid options are yes, without-password, no.
PermitRootLogin yes
...
- Notify sshd of the change:
# pkill -HUP sshd
3. Install the Solaris Cluster software.
- Insert the Oracle Solaris Cluster CD (or extract the archive) and run the installer found in <cd mount path>/Solaris_x86.
- Answer no to “Do you want to install the full set of Oracle Solaris Cluster Products and Services?”
- Install components 4 and 6 (Cluster and Agents).
- When asked for the configuration, answer 2, “Configure Later”.
Now we have an installed machine that we can clone to make the other node in the cluster. You might also want to keep an extra clone around as a seed. Run the sys-unconfig command and shut the machine down before cloning it in VirtualBox. When cloning, it’s also a good idea to reinitialize the MAC addresses, and make sure you select Full clone. I typically create two clones from the seed and call them base-node1 and base-node2.
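If you prefer the command line for cloning, VBoxManage clonevm does roughly the same thing as the GUI steps above; the seed VM name base-seed here is just an example:
$ VBoxManage clonevm base-seed --name base-node1 --register
$ VBoxManage clonevm base-seed --name base-node2 --register
A full clone is the clonevm default; it is worth double-checking afterwards that the clones ended up with unique MAC addresses.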
Shared Storage
A cluster requires some sort of shared storage. In VirtualBox this can be accomplished by creating fixed-size virtual disks.
Create the disks
$ for f in 0 1 2 3 4 5 ;do
> VBoxManage createhd --filename shared-$f.vdi --size 5000 --format VDI --variant Fixed
> done
Attach the disks to an existing VM (the VM must already have a storage controller named "SCSI")
$ for f in 0 1 2 3 4 5 ;do
> VBoxManage storageattach node1 --storagectl "SCSI" --port $f --device 0 --type hdd --medium shared-$f.vdi
> done
Modify the disk type to shareable
$ for f in 0 1 2 3 4 5 ;do
> VBoxManage modifyhd shared-$f.vdi --type shareable
> done
You may now attach these drives to multiple VMs and use them as shared storage, as shown below for the second node. If you intend to use ZFS on the shared storage, I would suggest labeling the disks with an EFI label. This can be done with format -e, which prompts for the label type when the disk is labeled. Do this prior to the final cluster installation.
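For example, the same storageattach loop pointed at the second VM (assuming it is named node2 and already has a controller named "SCSI") attaches the shared disks there too:
$ for f in 0 1 2 3 4 5 ;do
> VBoxManage storageattach node2 --storagectl "SCSI" --port $f --device 0 --type hdd --medium shared-$f.vdi
> done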
Cluster Install
You should now have both nodes created in VirtualBox, so boot them. Edit /etc/hosts on both nodes and add an entry for the other node. The cluster software has already been installed, so we just need to run scinstall on the first node:
*** Main Menu ***
Please select from one of the following (*) options:
* 1) Create a new cluster or add a cluster node
2) Configure a cluster to be JumpStarted from this install server
3) Manage a dual-partition upgrade
4) Upgrade this cluster node
* 5) Print release information for this cluster node
* ?) Help with menu options
* q) Quit
Option: 1
*** New Cluster and Cluster Node Menu ***
Please select from any one of the following options:
1) Create a new cluster
2) Create just the first node of a new cluster on this machine
3) Add this machine as a node in an existing cluster
?) Help with menu options
q) Return to the Main Menu
Option: 2
Continue answering the configuration questions, but disable automatic quorum device selection. Your confirmation should look like this:
>>> Confirmation <<<
Your responses indicate the following options to scinstall:
scinstall -i \
-C crash \
-F \
-G lofi \
-T node=node1,node=node2,authtype=sys \
-w netaddr=172.16.0.0,netmask=255.255.240.0,maxnodes=64,maxprivatenets=10,numvirtualclusters=12 \
-A trtype=dlpi,name=e1000g1 -A trtype=dlpi,name=e1000g2 \
-B type=switch,name=switch1 -B type=switch,name=switch2 \
-m endpoint=:e1000g1,endpoint=switch1 \
-m endpoint=:e1000g2,endpoint=switch2
Are these the options you want to use (yes/no) [yes]?
Once the first node has rebooted, run scinstall on the second node (I had to fix mine):
*** Main Menu ***
Please select from one of the following (*) options:
* 1) Create a new cluster or add a cluster node
2) Configure a cluster to be JumpStarted from this install server
3) Manage a dual-partition upgrade
4) Upgrade this cluster node
* 5) Print release information for this cluster node
* ?) Help with menu options
* q) Quit
Option: 1
*** New Cluster and Cluster Node Menu ***
Please select from any one of the following options:
1) Create a new cluster
2) Create just the first node of a new cluster on this machine
3) Add this machine as a node in an existing cluster
?) Help with menu options
q) Return to the Main Menu
Option: 3
Answer the install questions and when finished, the scinstall command should look like this:
>>> Confirmation <<<
Your responses indicate the following options to scinstall:
scinstall -i \
-C crash \
-N node1 \
-G lofi \
-A trtype=dlpi,name=e1000g1 -A trtype=dlpi,name=e1000g2 \
-m endpoint=:e1000g1,endpoint=switch1 \
-m endpoint=:e1000g2,endpoint=switch2
Are these the options you want to use (yes/no) [yes]?
After all nodes have been installed and added to the cluster, the shared storage (either iSCSI or shared virtual disks) must be set for software fencing. Obtain a list of the shared disks so we know which DID devices to change:
# cldev list -v
DID Device Full Device Path
---------- ----------------
d1 node1:/dev/rdsk/c0d0
d2 node1:/dev/rdsk/c1t0d0
d3 node2:/dev/rdsk/c0d0
d4 node2:/dev/rdsk/c1t0d0
d5 node1:/dev/rdsk/c3t1d0
d5 node2:/dev/rdsk/c2t1d0
d6 node2:/dev/rdsk/c2t2d0
d6 node1:/dev/rdsk/c3t2d0
d7 node2:/dev/rdsk/c2t3d0
d7 node1:/dev/rdsk/c3t3d0
d8 node2:/dev/rdsk/c2t4d0
d8 node1:/dev/rdsk/c3t4d0
d9 node2:/dev/rdsk/c2t5d0
d9 node1:/dev/rdsk/c3t5d0
d10 node2:/dev/rdsk/c2t6d0
d10 node1:/dev/rdsk/c3t6d0
# for i in 5 6 7 8 9 10; do
> cldev set -p default_fencing=nofencing d$i
> done
We can verify this setting:
# for i in 5 6 7 8 9 10; do
> cldev show d$i
> done
=== DID Device Instances ===
DID Device Name: /dev/did/rdsk/d5
Full Device Path: node2:/dev/rdsk/c2t1d0
Full Device Path: node1:/dev/rdsk/c3t1d0
Replication: none
default_fencing: nofencing
=== DID Device Instances ===
DID Device Name: /dev/did/rdsk/d6
Full Device Path: node1:/dev/rdsk/c3t2d0
Full Device Path: node2:/dev/rdsk/c2t2d0
Replication: none
default_fencing: nofencing
=== DID Device Instances ===
DID Device Name: /dev/did/rdsk/d7
Full Device Path: node1:/dev/rdsk/c3t3d0
Full Device Path: node2:/dev/rdsk/c2t3d0
Replication: none
default_fencing: nofencing
=== DID Device Instances ===
DID Device Name: /dev/did/rdsk/d8
Full Device Path: node1:/dev/rdsk/c3t4d0
Full Device Path: node2:/dev/rdsk/c2t4d0
Replication: none
default_fencing: nofencing
=== DID Device Instances ===
DID Device Name: /dev/did/rdsk/d9
Full Device Path: node1:/dev/rdsk/c3t5d0
Full Device Path: node2:/dev/rdsk/c2t5d0
Replication: none
default_fencing: nofencing
=== DID Device Instances ===
DID Device Name: /dev/did/rdsk/d10
Full Device Path: node1:/dev/rdsk/c3t6d0
Full Device Path: node2:/dev/rdsk/c2t6d0
Replication: none
default_fencing: nofencing
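As an aside, there is also a cluster-wide global_fencing property that acts as the default for disks whose default_fencing is set to global; the per-disk loop above is what I used, but a single command along these lines should have the same effect (verify against the cluster(1CL) man page on your release):
# cluster set -p global_fencing=nofencing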
Now that we have changed the fencing protocol, add a quorum disk:
# clq add d5
Check the disk and make sure the Access Mode is set to sq_disk (software quorum).
# clq show d5
=== Quorum Devices ===
Quorum Device Name: d5
Enabled: yes
Votes: 1
Global Name: /dev/did/rdsk/d5s2
Type: shared_disk
Access Mode: sq_disk
Hosts (enabled): node1, node2
We can now turn off installmode.
# cluster set -p installmode=disabled
# clq reset
# clq show
=== Cluster Nodes ===
Node Name: node1
Node ID: 1
Quorum Vote Count: 1
Reservation Key: 0x4C927F9400000001
Node Name: node2
Node ID: 2
Quorum Vote Count: 1
Reservation Key: 0x4C927F9400000002
=== Quorum Devices ===
Quorum Device Name: d5
Enabled: yes
Votes: 1
Global Name: /dev/did/rdsk/d5s2
Type: shared_disk
Access Mode: sq_disk
Hosts (enabled): node1, node2
Verify the keys on disk:
# pgre -c pgre_inkeys -d /dev/did/rdsk/d5
key[0]=0x4C927F9400000001.
key[1]=0x4C927F9400000002.
Conclusion
That’s it… you now have a functioning two-node VirtualBox cluster. You can create some zpools on the shared storage and set up some resource groups and HAStoragePlus resources, as sketched below.
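As a starting point, a minimal HA zpool setup might look like the following. The pool name hapool, resource group test-rg, resource name hasp-rs, and the disk c3t2d0 (one of the shared disks as seen from node1) are all just example names for this sketch:
# zpool create hapool c3t2d0                 # one of the shared disks, as seen from this node
# clresourcetype register SUNW.HAStoragePlus
# clresourcegroup create test-rg
# clresource create -g test-rg -t SUNW.HAStoragePlus -p Zpools=hapool hasp-rs
# clresourcegroup online -eM test-rg
A clresourcegroup switch -n node2 test-rg afterwards is a quick way to confirm that the pool fails over between nodes.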
How many NICs? What type of NICs in VirtualBox? Did you configure DNS or anything during network setup? DHCP?
The VM file is now attached to the post (it didn’t migrate to WordPress with the content). To answer your questions:
1 public NIC (default hw)
2 private NICs (default hw; connected to internal net 1 and internal net 2)
DNS isn’t needed but can be used.
DHCP isn’t needed but can be used.
General rule: all cluster nodes need to be resolvable before installing the cluster.
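For reference, here is a rough VBoxManage sketch of that NIC layout for one node; the adapter numbers, the bridged host interface en0, and the internal network names are assumptions (the GUI works just as well):
$ VBoxManage modifyvm node1 --nic1 bridged --bridgeadapter1 en0
$ VBoxManage modifyvm node1 --nic2 intnet --intnet2 "internal net 1"
$ VBoxManage modifyvm node1 --nic3 intnet --intnet3 "internal net 2"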
Before the cluster install, I ran the following to avoid sccheck failures:
# /usr/sbin/cacaoadm enable
# /usr/sbin/svccfg -s system/webconsole setprop options/tcp_listen = true
# /usr/sbin/svcadm refresh svc:/system/webconsole:console
# /usr/sbin/smcwebserver restart
I don’t think the second node can join without tcp_listen enabled.