GFS2 Cluster in VMware
VMware
- bought a copy of VMware Workstation 6 and installed it on my ThinkPad T43 "atro" (running openSUSE 10.2, 2GB of RAM)
- made a new virtual machine: OS: Linux, Version: "Other Linux 2.6.x kernel", Networking: Bridged, Disk: 4GB, split into 2GB files, RAM: 256MB
- installed Fedora 8 in it -- even X worked well with only 256MB of RAM (!) -- the guest is named "guest1"
- yum-installed gfs2-utils and libvolume_id-devel (I also tried cman, cman-devel, openais, openais-devel, and lvm2-cluster, but those packages were out of date even for the stock Fedora kernel, and so are also too old for the pNFS kernels)
- downloaded and installed device-mapper-1.02.22, openais-0.80.3, cluster-2.01.00, and lvm2-2.02.28
- yum-installed AoE initiator (client) aoetools-18-1 on guest1
- downloaded AoE target (server) vblade-15.tgz and installed it on atro
- I set aside a spare partition on atro to export as a block device over AoE:
- [atro] $ sudo ln -s /dev/sda6 /dev/AoE
- [atro] $ sudo vbladed 0 1 eth0 /dev/AoE (shelf 0, slot 1 -- hence the e0.1 device below)
- [guest1] $ sudo modprobe aoe
.. AoE discovers all exported devices on the LAN; mine was the only one, and it immediately appeared as /dev/etherd/e0.1. Mounting it "just worked"; props to AoE!
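Before moving on, it's worth sanity-checking the AoE path; the aoetools package installed above includes a couple of small helpers for exactly this (nothing below is required -- it just confirms that the export is visible from the guest):
- [guest1] $ sudo aoe-discover (re-broadcast a discovery query on all interfaces)
- [guest1] $ aoe-stat (should list e0.1, its size, and the interface it was seen on)
- [guest1] $ ls /dev/etherd/ (the device nodes the aoe driver created)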
LVM and GFS2 setup
- prep physical volume for LVM:
- [guest1] $ sudo pvcreate -M 2 /dev/etherd/e0.1
- create the volume group GuestVolGroup and add the whole AoE "device" to it:
- [guest1] $ sudo vgcreate -M 2 -s 1m -c y GuestVolGroup /dev/etherd/e0.1
- edit /etc/lvm/lvm.conf and make sure to set locking_type = 3, i.e. cluster-wide locking via the DLM (which is what clvmd provides)
- before further stuff can proceed, the cluster needs to be up and clvmd needs to be running everywhere. So, in VMware I cloned guest1 twice: as guest2 and guest3.
- edit /etc/cluster/cluster.conf, name the cluster GuestCluster, and set up the three nodes with manual (read: ignored) fencing (a sketch of the file appears after this list)
- bring up the cluster:
- $ pdsh -w guest[1-3] sudo service cman start && pdsh -w guest[1-3] sudo service clvmd start
- create the logical volume GuestVolume and assign the full volume group to it:
- [guest1] $ sudo lvcreate -n GuestVolume -l 100%VG GuestVolGroup
- .. and make a GFS2 fs therein:
- [guest1] $ sudo gfs2_mkfs -j 3 -p lock_dlm -t GuestCluster:GuestFS /dev/GuestVolGroup/GuestVolume
- restart the daemons, then mount (see the example below), and your VMware GFS2 cluster should be good to go! :)
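For the cluster.conf step above, something along these lines should be enough for this three-node cluster (just a sketch -- the node names, nodeid values, and config_version are illustrative, and fence_manual is the manual, i.e. effectively ignored, fencing agent):

  <?xml version="1.0"?>
  <cluster name="GuestCluster" config_version="1">
    <clusternodes>
      <clusternode name="guest1" nodeid="1">
        <fence><method name="single"><device name="manual" nodename="guest1"/></method></fence>
      </clusternode>
      <clusternode name="guest2" nodeid="2">
        <fence><method name="single"><device name="manual" nodename="guest2"/></method></fence>
      </clusternode>
      <clusternode name="guest3" nodeid="3">
        <fence><method name="single"><device name="manual" nodename="guest3"/></method></fence>
      </clusternode>
    </clusternodes>
    <fencedevices>
      <fencedevice name="manual" agent="fence_manual"/>
    </fencedevices>
  </cluster>

The mount itself is the usual thing (the mount point is arbitrary; /mnt/gfs2 is just an example):
- [guest1] $ sudo mkdir -p /mnt/gfs2
- [guest1] $ sudo mount -t gfs2 /dev/GuestVolGroup/GuestVolume /mnt/gfs2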
Adding disk space to an LVM'ed VMware guest
Having blithely thought that 4GB of disk space per guest (which Fedora sets up as the LVM volume group VolGroup00) would be sufficient, I git-cloned my repo and then didn't have enough space to build my kernels; gak. (Since I'm building things on just one guest and then cloning it, I'm hoping that maybe I can somehow shrink the cloned guests' disks back down to just 4GB.)
- in VMware, I went to Edit Virtual Machine Settings -> Add (device). I created a (virtual) SCSI disk, 3GB, allocate on-demand, and added it to my guest.
- after starting the guest, the disk appeared as /dev/sdb
- create a single partition using the entire device:
- [guest1] $ fdisk /dev/sdb # NB: make sure the partition type is 0x8e (Linux LVM); a non-interactive alternative is sketched at the end of this section
- make a single LVM physical volume on it:
- [guest1] $ pvcreate -M 2 /dev/sdb1
- extend the existing volume group by adding the prepped physical volume:
- [guest1] $ vgextend VolGroup00 /dev/sdb1
- extend the logical volume to use the entire (now-larger) volume group:
- [guest1] $ lvextend -l +100%FREE /dev/VolGroup00/LogVol00
- Inspect things with pvs, vgs, and lvs
- extend the filesystem itself within the logical volume (it can handle online resizing):
- [guest1] $ resize2fs /dev/VolGroup00/LogVol00
At this point, hopefully df -k should show you a larger volume :)
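By the way, for the fdisk step above: if you'd rather not answer fdisk's prompts by hand, the same single-LVM-partition layout can be created non-interactively. This is just a sketch (sfdisk ships with util-linux, so it should already be on the guest; double-check the device name first):
- [guest1] $ echo ',,8e' | sfdisk /dev/sdb (one partition covering the whole disk, type 8e = Linux LVM)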
Update: reactions from Connectathon '08
The purpose of this entire VMware/GFS2 setup in the first place was so I could work on a pNFS/GFS2 MDS at Connectathon '08 with Frank Filz, Dean Hildebrand, and Marc Eshel (all gentlemen from IBM).
On the one hand, once I had a primary guest system set up and could just clone it to make a cluster, it was very easy to make kernel changes, rebuild, push things out to the cluster, and reboot.
The downside came during testing, when we tried doing pNFS writes of several KB or more -- the RPC layer would barf on the packet with a message like "Error: bad tcp reclen". Fortunately, Dean recalled that Ricardo Labiaga had had a similar problem with KVM (or UML?) at the fall 2007 CITI Bakeathon, so we started to suspect VMware. I quickly set up two laptops to act as GFS2 nodes, accessing the shared storage with AoE, shut down the VMware cluster, reconfigured things so that one VMware node and the two new laptops formed a 3-node GFS2 cluster, and brought the new cluster up. Then, using the node in VMware as a pNFS MDS and the two laptops as DSes, we were almost immediately able to pass the Connectathon test suite.
The verdict: VMware Workstation 6 still totally impresses me, but it's probably better to do cluster work on an actual cluster. That said, my I/O troubles may just stem from my laptop, or my particular NIC driver, or whatever -- I can't imagine that there aren't ways to resolve that somehow.