Oracle 10g CheapRAC :: HowTo using OCFS/LVM

OCFS FAQ OCPdba.Net

Table of Contents
============================================================
1. Introduction
2. SCSI Chain
3. Network Cards
4. LVM - Logical Volume Manager
5. OCFS - Oracle Clustered File System

Appendices
A. Pre-requisites
B. Hardware Diagram
C. Volume Group oravg
        
1. Introduction
============================================================

Overview:
=========
The purpose of this exercise is to give a DBA the opportunity to set up OCFS on top of LVM using cheap,
off-the-shelf components. In most cases this hardware can be found at much lower prices than high-end
components. The main goals are:

i)  Set up LVM to achieve shared devices
ii) Set up OCFS to create shared file systems on these devices.

Next Steps:
===========
After successfully installing OCFS on LVM you may wish to look at installing Cluster Ready Services (CRS).
The basic installation is documented here.


What this is not:
=================
This is not a production RAC setup. Please do not set up a production RAC using the steps stated here. For
example, I have chosen to ignore the hangcheck-timer setup, which is used to properly remove a failed node
from a cluster (I/O fencing). In this setup, if one box fails the whole cluster fails, because the SCSI chain
depends on all boxes being alive. There is also no redundancy at any stage of this RAC. If one half of the
interconnect fails, the RAC will fail.

If you do require a production RAC and would like to use this as a guide, please do so. In that case SLES 9 should
be used. Additional, more reliable hardware will of course be required. In that configuration, IBM e Series servers and
NetApp storage are my preferred platform.

Omissions:
==========
Hangcheck-timer: This is used to monitor and remove a node from the cluster if it fails. In this setup this step is 
omitted as both nodes need to be alive for the RAC to function.

        
2. SCSI Chain [Top]
============================================================

1) Install SCSI Cards in both boxes
2) In safa, set the SCSI card id to 15
3) In marwa, set the SCSI card id to 0

The idea here is that the ends of the SCSI chain are in the two boxes, so 
any devices between the two ends will be visible from both machines.

4) In the SCSI tower set the SCSI ids of the drives you are using to values
   from 1 to 14. Do not set termination on the drives.

5) Connect SCSI cables from safa to the SCSI tower, and from marwa to the
   SCSI tower. You should now have a setup that looks like:

   [ SAFA ] -------- [ SCSI Tower ] --------- [ MARWA ]

6) Power up sequence: To get the devices functioning correctly:
   a) Power on the SCSI tower
   b) Power on SAFA, go into the BIOS and wait there
   c) Power on MARWA, go into the BIOS and wait there

   At this point all ends of the SCSI chain are visible. You can now reset (Ctrl-Alt-Del)
   either box and allow it to boot. Then do the same for the other box.

   Please allow one box to boot at a time.

   You should see entries similar to the following in your boot log:

Attached scsi disk sda at scsi0, channel 0, id 4, lun 0
Attached scsi disk sdb at scsi0, channel 0, id 5, lun 0
scsi device sda: 17783240 512-byte hdwr sectors (9105 MB)
 sda: sda1
scsi device sdb: 17783240 512-byte hdwr sectors (9105 MB)
 sdb: sdb1

   In this setup the disks in the SCSI tower are set to ids 4 and 5. The device names MUST be
   the same on both machines or LVM will have problems later on. To ensure that the disks are
   picked up in the same order, place the SCSI card in the same PCI slot sequence on both
   machines. In this case my shared disks are seen as /dev/sda and /dev/sdb.

7) At this point your SCSI chain should be functional.

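The device-name check could be scripted along the following lines. This is only a sketch: it assumes the "Attached scsi disk" lines from each node's boot log have been saved to a file, and the function names and file names are hypothetical, not part of the original setup.

```shell
#!/bin/bash
# Sketch: verify that both nodes attached the shared disks under the same names.
# Assumption: each argument is a file holding one node's boot log lines
# (for example, saved from dmesg). The file names used are illustrative only.

# Print "name id" pairs for every attached SCSI disk, sorted.
scsi_disk_map() {
    # Matches lines such as:
    #   Attached scsi disk sda at scsi0, channel 0, id 4, lun 0
    sed -n 's/^Attached scsi disk \([a-z]*\) at scsi[0-9]*, channel [0-9]*, id \([0-9]*\),.*$/\1 \2/p' "$1" | sort
}

# Succeed only if both logs contain disks and map the same names to the same ids.
same_scsi_layout() {
    local a b
    a=$(scsi_disk_map "$1")
    b=$(scsi_disk_map "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

Usage, with one saved log per node: same_scsi_layout safa-boot.log marwa-boot.log && echo "device names match"
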
3. Network Cards
============================================================

There are basically three types of traffic that you will be passing in this configuration:

a) Database Connections
b) RAC intercommunication
c) OCFS intercommunication

To facilitate this, two network cards are used per box. One card has a publicly visible IP, the other
has an IP used specifically for the interconnect. The interconnect in this configuration handles
traffic for categories b) and c) above. A 100 Mb/s crossover cable is used between safa and marwa on the interconnect.
This allows full duplex 100 Mb/s traffic.

If the OCFS traffic is too much for the interconnect, a third pair of network cards may be used to create a second
interconnect for the OCFS traffic. I have not done this here, as this is a development and testing only
configuration. A crossover connection should be used there as well.

        
4. LVM - Logical Volume Manager
============================================================
a. Set LVM Partition Type
b. Initialize disks for LVM
c. Create LVM Volume Group
d. Create LVM Logical Volumes
e. Enable Volume Group on second node (marwa)


Perform these steps on one box; in this case I did this from safa.

a. Set LVM partition types

   Use fdisk and create a single partition of type 8e, do not format this partition.

safa:/u01 # fdisk -l /dev/sda

Disk /dev/sda: 9105 MB, 9105018880 bytes
255 heads, 63 sectors/track, 1106 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sda1             1      1106   8883913+  8e  Linux LVM

safa:/u01 # fdisk -l /dev/sdb

Disk /dev/sdb: 9105 MB, 9105018880 bytes
255 heads, 63 sectors/track, 1106 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sdb1             1      1106   8883913+  8e  Linux LVM

b. Initialize disks for LVM

pvcreate /dev/sda1 /dev/sdb1

c. Create LVM Volume Group

vgcreate oravg /dev/sda1 /dev/sdb1

   The special device file created is /dev/oravg/group. The major device number for LVM groups is 109.

safa:/u01 # ls -al /dev/oravg/group
crw-r-----    1 root     disk     109,   0 Nov  7 02:04 /dev/oravg/group

d. Create LVM Logical Volumes

   Create several logical volumes for the files that will need to be shared within the RAC:

safa:/u01 # lvcreate -n lv_ctrl1 -L 500m oravg
lvcreate -- rounding size up to physical extent boundary
lvcreate -- doing automatic backup of "oravg"
lvcreate -- logical volume "/dev/oravg/lv_ctrl1" successfully created

safa:/u01 # lvcreate -n lv_ctrl2 -L 500m oravg
lvcreate -- rounding size up to physical extent boundary
lvcreate -- doing automatic backup of "oravg"
lvcreate -- logical volume "/dev/oravg/lv_ctrl2" successfully created

safa:/u01 # lvcreate -n lv_redo1 -L 500m oravg
lvcreate -- rounding size up to physical extent boundary
lvcreate -- doing automatic backup of "oravg"
lvcreate -- logical volume "/dev/oravg/lv_redo1" successfully created

safa:/u01 # lvcreate -n lv_redo2 -L 500m oravg
lvcreate -- rounding size up to physical extent boundary
lvcreate -- doing automatic backup of "oravg"
lvcreate -- logical volume "/dev/oravg/lv_redo2" successfully created

safa:/u01 # lvcreate -n lv_data1 -L 1000m oravg
lvcreate -- doing automatic backup of "oravg"
lvcreate -- logical volume "/dev/oravg/lv_data1" successfully created

safa:/u01 # lvcreate -n lv_data2 -L 1000m oravg
lvcreate -- doing automatic backup of "oravg"
lvcreate -- logical volume "/dev/oravg/lv_data2" successfully created

safa:/u01 # lvcreate -n lv_data3 -L 1000m oravg
lvcreate -- doing automatic backup of "oravg"
lvcreate -- logical volume "/dev/oravg/lv_data3" successfully created

safa:/u01 # lvcreate -n lv_ndx1 -L 1000m oravg
lvcreate -- doing automatic backup of "oravg"
lvcreate -- logical volume "/dev/oravg/lv_ndx1" successfully created

safa:/u01 # lvcreate -n lv_ndx2 -L 1000m oravg
lvcreate -- doing automatic backup of "oravg"
lvcreate -- logical volume "/dev/oravg/lv_ndx2" successfully created

safa:/u01 # lvcreate -n lv_ndx3 -L 1000m oravg
lvcreate -- doing automatic backup of "oravg"
lvcreate -- logical volume "/dev/oravg/lv_ndx3" successfully created

safa:/u01 # lvcreate -n lv_temp1 -L 1000m oravg
lvcreate -- doing automatic backup of "oravg"
lvcreate -- logical volume "/dev/oravg/lv_temp1" successfully created

safa:/u01 # lvcreate -n lv_temp2 -L 1000m oravg
lvcreate -- doing automatic backup of "oravg"
lvcreate -- logical volume "/dev/oravg/lv_temp2" successfully created

safa:/u01 # lvcreate -n lv_undo1 -L 1000m oravg
lvcreate -- doing automatic backup of "oravg"
lvcreate -- logical volume "/dev/oravg/lv_undo1" successfully created

safa:/u01 # lvcreate -n lv_undo2 -L 1000m oravg
lvcreate -- doing automatic backup of "oravg"
lvcreate -- logical volume "/dev/oravg/lv_undo2" successfully created

safa:/u01 # lvcreate -n lv_undo3 -L 1000m oravg
lvcreate -- doing automatic backup of "oravg"
lvcreate -- logical volume "/dev/oravg/lv_undo3" successfully created

safa:~ # lvcreate -n lv_cm -L 500M oravg
lvcreate -- rounding size up to physical extent boundary
lvcreate -- doing automatic backup of "oravg"
lvcreate -- logical volume "/dev/oravg/lv_cm" successfully created

safa:~ # lvcreate -n lv_quorum -L 500M oravg
lvcreate -- rounding size up to physical extent boundary
lvcreate -- doing automatic backup of "oravg"
lvcreate -- logical volume "/dev/oravg/lv_quorum" successfully created

safa:~ # lvcreate -n lv_svrcfg -L 500M oravg
lvcreate -- rounding size up to physical extent boundary
lvcreate -- doing automatic backup of "oravg"
lvcreate -- logical volume "/dev/oravg/lv_svrcfg" successfully created

   The special device files (major device number 58) for all of the logical volumes are now visible:

safa:/u01 # ls -al /dev/oravg/
total 80
dr-xr-xr-x    2 root     root         4096 Nov 10 22:14 .
drwxr-xr-x   34 root     root        73728 Nov  9 22:21 ..
crw-r-----    1 root     disk     109,   0 Nov  9 22:21 group
brw-rw----    1 root     disk      58,  16 Nov  9 22:14 lv_cm
brw-rw----    1 root     disk      58,   1 Nov  9 22:21 lv_ctrl1
brw-rw----    1 root     disk      58,   2 Nov  9 22:21 lv_ctrl2
brw-rw----    1 root     disk      58,   5 Nov  9 22:21 lv_data1
brw-rw----    1 root     disk      58,   6 Nov  9 22:21 lv_data2
brw-rw----    1 root     disk      58,   7 Nov  9 22:21 lv_data3
brw-rw----    1 root     disk      58,   8 Nov  9 22:21 lv_ndx1
brw-rw----    1 root     disk      58,   9 Nov  9 22:21 lv_ndx2
brw-rw----    1 root     disk      58,  10 Nov  9 22:21 lv_ndx3
brw-rw----    1 root     disk      58,  17 Nov  9 22:14 lv_quorum
brw-rw----    1 root     disk      58,   3 Nov  9 22:21 lv_redo1
brw-rw----    1 root     disk      58,   4 Nov  9 22:21 lv_redo2
brw-rw----    1 root     disk      58,  18 Nov  9 22:14 lv_svrcfg
brw-rw----    1 root     disk      58,  11 Nov  9 22:21 lv_temp1
brw-rw----    1 root     disk      58,  12 Nov  9 22:21 lv_temp2
brw-rw----    1 root     disk      58,  13 Nov  9 22:21 lv_undo1
brw-rw----    1 root     disk      58,  14 Nov  9 22:21 lv_undo2
brw-rw----    1 root     disk      58,  15 Nov  9 22:21 lv_undo3

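The repeated lvcreate commands above could also be generated from a small table. The following is a sketch only: it prints the commands rather than running them, and the variable and function names are my own. The 8 MB extent size is the PE Size that vgdisplay reports for oravg, which is why a 500 MB request ends up as a 504 MB volume.

```shell
#!/bin/bash
# Sketch: generate the lvcreate commands from a name:size table.
# The commands are only printed; review them, then run them as root.

VG=oravg
PE_MB=8   # physical extent size reported by vgdisplay for this volume group

# name:size-in-MB pairs, matching the logical volumes created in this HowTo
LVS="lv_ctrl1:500 lv_ctrl2:500 lv_redo1:500 lv_redo2:500
lv_data1:1000 lv_data2:1000 lv_data3:1000
lv_ndx1:1000 lv_ndx2:1000 lv_ndx3:1000
lv_temp1:1000 lv_temp2:1000
lv_undo1:1000 lv_undo2:1000 lv_undo3:1000
lv_cm:500 lv_quorum:500 lv_svrcfg:500"

# Round a size in MB up to the next multiple of the extent size.
round_to_pe() {
    echo $(( ( ($1 + PE_MB - 1) / PE_MB ) * PE_MB ))
}

# Print one lvcreate command per table entry, with the rounded size as a comment.
print_lvcreate_cmds() {
    local entry name size
    for entry in $LVS; do
        name=${entry%%:*}
        size=${entry##*:}
        echo "lvcreate -n $name -L ${size}m $VG   # allocates $(round_to_pe "$size") MB"
    done
}

print_lvcreate_cmds
```
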
e. Enable Volume Group on second node (marwa)

   On marwa, the "vgscan" and "vgchange" commands are used to read the LVM configuration.

   How does marwa see the LVM information? The LVM structure is written to the first disk of the
   volume group. The "vgscan" command reads this information off of the disk and creates the needed
   Volume Group files. "vgchange -ay oravg" then activates the Volume Group on marwa.

marwa:/u01 # ls -al /dev/oravg/
total 80
dr-xr-xr-x    2 root     root         4096 Nov  9 13:53 .
drwxr-xr-x   32 root     root        73728 Nov  9 13:53 ..
crw-r-----    1 root     disk     109,   0 Nov  9 13:53 group

marwa:/u01 # vgscan
vgscan -- reading all physical volumes (this may take a while...)
vgscan -- found inactive volume group "oravg"
vgscan -- "/etc/lvmtab" and "/etc/lvmtab.d" successfully created
vgscan -- WARNING: This program does not do a VGDA backup of your volume group

marwa:/u01 # vgchange -ay oravg
vgchange -- volume group "oravg" successfully activated

marwa:/u01 # ls -al /dev/oravg/
total 80
dr-xr-xr-x    2 root     root         4096 Nov 10 22:14 .
drwxr-xr-x   34 root     root        73728 Nov  9 22:21 ..
crw-r-----    1 root     disk     109,   0 Nov  9 22:21 group
brw-rw----    1 root     disk      58,  16 Nov  9 22:14 lv_cm
brw-rw----    1 root     disk      58,   1 Nov  9 22:21 lv_ctrl1
brw-rw----    1 root     disk      58,   2 Nov  9 22:21 lv_ctrl2
brw-rw----    1 root     disk      58,   5 Nov  9 22:21 lv_data1
brw-rw----    1 root     disk      58,   6 Nov  9 22:21 lv_data2
brw-rw----    1 root     disk      58,   7 Nov  9 22:21 lv_data3
brw-rw----    1 root     disk      58,   8 Nov  9 22:21 lv_ndx1
brw-rw----    1 root     disk      58,   9 Nov  9 22:21 lv_ndx2
brw-rw----    1 root     disk      58,  10 Nov  9 22:21 lv_ndx3
brw-rw----    1 root     disk      58,  17 Nov  9 22:14 lv_quorum
brw-rw----    1 root     disk      58,   3 Nov  9 22:21 lv_redo1
brw-rw----    1 root     disk      58,   4 Nov  9 22:21 lv_redo2
brw-rw----    1 root     disk      58,  18 Nov  9 22:14 lv_svrcfg
brw-rw----    1 root     disk      58,  11 Nov  9 22:21 lv_temp1
brw-rw----    1 root     disk      58,  12 Nov  9 22:21 lv_temp2
brw-rw----    1 root     disk      58,  13 Nov  9 22:21 lv_undo1
brw-rw----    1 root     disk      58,  14 Nov  9 22:21 lv_undo2
brw-rw----    1 root     disk      58,  15 Nov  9 22:21 lv_undo3

5. OCFS - Oracle Clustered File System
============================================================
a. Obtaining OCFS
b. Installing OCFS
c. Configuring OCFS
d. Loading OCFS
e. Mount Points
f. Format Partitions
g. Mount Partitions
h. Test OCFS
i. Monitor OCFS
j. OCFS Performance & Maintenance


a. Obtaining OCFS:

   OCFS may be obtained from http://oss.oracle.com/projects/ocfs/

   OCFS2 is in a beta phase. OCFS1 is used here and fulfills the required clustered file
   system needs. The files you will need are:

   - ocfs-support-1.0.10-1.i386.rpm
   - ocfs-tools-1.0.10-1.i386.rpm

   You also need one of the following kernel module packages, chosen to match your kernel. In this
   case ocfs-2.4.21-241-smp-1.0.13-1.i586.rpm was used, since dual processor boxes are used.

   - ocfs-2.4.21-241-deflt-1.0.13-1.i586.rpm
   - ocfs-2.4.21-241-psmp-1.0.13-1.i586.rpm
   - ocfs-2.4.21-241-smp-1.0.13-1.i586.rpm



b. Installing OCFS:

   Install the required packages in the following sequence:

   rpm -ivh ocfs-support-1.0.10-1.i386.rpm
   rpm -ivh ocfs-tools-1.0.10-1.i386.rpm
   rpm -ivh --nodeps ocfs-2.4.21-241-smp-1.0.13-1.i586.rpm

   The reason --nodeps is used is that the ocfs package asks for a specific version of k_smp, whereas
   k_smp4G-2.4.21-99, a higher version, is installed with SuSE 9.0. If --nodeps is not used, rpm will
   complain that the specific lower version is required.

   Again please be warned that this is not a production setup and should not be used for production systems.



c. Configuring OCFS:

   Run the configuration tool to generate the default /etc/ocfs.conf file on each node:

   /usr/sbin/ocfstool

   From the dropdown menu choose "Tasks -> Generate Config"

   Ensure that the network card used is the private interconnect. If you have a dedicated
   OCFS interconnect, it should be used here.

   This is what the /etc/ocfs.conf file looks like on safa and marwa:

safa:/u01 # cat /etc/ocfs.conf
#
# ocfs config
# Ensure this file exists in /etc
#
        node_name = safa
        ip_address = 10.0.0.41
        ip_port = 7000
        comm_voting = 1
        guid = 2D3BDD6ACB53F1A62573005004845D0C

marwa:/u01 # cat /etc/ocfs.conf
#
# ocfs config
# Ensure this file exists in /etc
#
        node_name = marwa
        ip_address = 10.0.0.51
        ip_port = 7000
        comm_voting = 1
        guid = 30488860FA8B7BDFB35D00105A286603

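Before loading the module it may be worth confirming that the generated file contains every key shown above. The following is a minimal sketch of such a check; the function names are hypothetical, and it assumes the file uses the "key = value" layout, one key per line.

```shell
#!/bin/bash
# Sketch: check that an ocfs.conf file defines every key shown in this HowTo.
# Assumption: keys are written as "key = value", one per line.

REQUIRED_KEYS="node_name ip_address ip_port comm_voting guid"

# Print the value for a key, or nothing if the key is absent.
# $1 = config file, $2 = key
ocfs_conf_get() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*\(.*\)$/\1/p" "$1" | head -n 1
}

# Report missing keys; return non-zero if any key is missing.
ocfs_conf_check() {
    local file=$1 key missing=0
    for key in $REQUIRED_KEYS; do
        if [ -z "$(ocfs_conf_get "$file" "$key")" ]; then
            echo "missing: $key"
            missing=1
        fi
    done
    return $missing
}
```

Usage on each node: ocfs_conf_check /etc/ocfs.conf && ocfs_conf_get /etc/ocfs.conf ip_address
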
d. Loading OCFS:

   Place the following lines in your /etc/modules.conf to have the ocfs module loaded at boot:

#########################################
# OCFS Setup
#########################################
install ocfs load_ocfs

e. Create mount points for OCFS file systems on both nodes:

mkdir -p /u01/racdata/ctrl1
mkdir -p /u01/racdata/ctrl2
mkdir -p /u01/racdata/redo1
mkdir -p /u01/racdata/redo2
mkdir -p /u01/racdata/data1
mkdir -p /u01/racdata/data2
mkdir -p /u01/racdata/data3
mkdir -p /u01/racdata/ndx1
mkdir -p /u01/racdata/ndx2
mkdir -p /u01/racdata/ndx3
mkdir -p /u01/racdata/temp1
mkdir -p /u01/racdata/temp2
mkdir -p /u01/racdata/undo1
mkdir -p /u01/racdata/undo2
mkdir -p /u01/racdata/undo3
mkdir -p /u01/racdata/cm
mkdir -p /u01/racdata/svrcfg
mkdir -p /u01/racdata/quorum
chown -R oracle:dba /u01/racdata

f. Format the partitions:

   From either safa or marwa, format the logical volumes created earlier. The command
   options used are as follows:

   -F   force format (of an existing partition, if it exists)
   -b   block size in kilobytes, determines the maximum file system size possible
   -L   a label which references the partition (mount command)
   -m   the mount point
   -u   oracle UID (same on both safa and marwa)
   -g   oracle GID (same on both safa and marwa, in this case the dba GID)
   -p   permissions on the file systems

   The device path to the logical volume is the last option.

safa:/u01 # mkfs.ocfs -F -b 128 -L ocfs_ctrl1 -m /u01/racdata/ctrl1 -u 105 -g 111 -p 0775 /dev/oravg/lv_ctrl1
Cleared volume header sectors
Cleared node config sectors
Cleared publish sectors
Cleared vote sectors
Cleared bitmap sectors
Cleared data block
Wrote volume header

   Similar commands are issued for the remaining file systems:

mkfs.ocfs -F -b 128 -L ocfs_ctrl2 -m /u01/racdata/ctrl2 -u 105 -g 111 -p 0775 /dev/oravg/lv_ctrl2
mkfs.ocfs -F -b 128 -L ocfs_redo1 -m /u01/racdata/redo1 -u 105 -g 111 -p 0775 /dev/oravg/lv_redo1
mkfs.ocfs -F -b 128 -L ocfs_redo2 -m /u01/racdata/redo2 -u 105 -g 111 -p 0775 /dev/oravg/lv_redo2
mkfs.ocfs -F -b 128 -L ocfs_data1 -m /u01/racdata/data1 -u 105 -g 111 -p 0775 /dev/oravg/lv_data1
mkfs.ocfs -F -b 128 -L ocfs_data2 -m /u01/racdata/data2 -u 105 -g 111 -p 0775 /dev/oravg/lv_data2
mkfs.ocfs -F -b 128 -L ocfs_data3 -m /u01/racdata/data3 -u 105 -g 111 -p 0775 /dev/oravg/lv_data3
mkfs.ocfs -F -b 128 -L ocfs_ndx1 -m /u01/racdata/ndx1 -u 105 -g 111 -p 0775 /dev/oravg/lv_ndx1
mkfs.ocfs -F -b 128 -L ocfs_ndx2 -m /u01/racdata/ndx2 -u 105 -g 111 -p 0775 /dev/oravg/lv_ndx2
mkfs.ocfs -F -b 128 -L ocfs_ndx3 -m /u01/racdata/ndx3 -u 105 -g 111 -p 0775 /dev/oravg/lv_ndx3
mkfs.ocfs -F -b 128 -L ocfs_temp1 -m /u01/racdata/temp1 -u 105 -g 111 -p 0775 /dev/oravg/lv_temp1
mkfs.ocfs -F -b 128 -L ocfs_temp2 -m /u01/racdata/temp2 -u 105 -g 111 -p 0775 /dev/oravg/lv_temp2
mkfs.ocfs -F -b 128 -L ocfs_undo1 -m /u01/racdata/undo1 -u 105 -g 111 -p 0775 /dev/oravg/lv_undo1
mkfs.ocfs -F -b 128 -L ocfs_undo2 -m /u01/racdata/undo2 -u 105 -g 111 -p 0775 /dev/oravg/lv_undo2
mkfs.ocfs -F -b 128 -L ocfs_undo3 -m /u01/racdata/undo3 -u 105 -g 111 -p 0775 /dev/oravg/lv_undo3
mkfs.ocfs -F -b 128 -L ocfs_cm -m /u01/racdata/cm -u 105 -g 111 -p 0775 /dev/oravg/lv_cm
mkfs.ocfs -F -b 128 -L ocfs_svrcfg -m /u01/racdata/svrcfg -u 105 -g 111 -p 0775 /dev/oravg/lv_svrcfg
mkfs.ocfs -F -b 128 -L ocfs_quorum -m /u01/racdata/quorum -u 105 -g 111 -p 0775 /dev/oravg/lv_quorum

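Since every mkfs.ocfs command follows the same naming pattern (label ocfs_NAME, mount point /u01/racdata/NAME, device /dev/oravg/lv_NAME), the list could be derived from the short names. This is a sketch under that assumption; the variable names are my own, and the commands are printed rather than executed because formatting destroys existing data.

```shell
#!/bin/bash
# Sketch: derive the mkfs.ocfs command for each logical volume from its short name.
# Commands are printed, not executed, because mkfs.ocfs -F overwrites existing data.

VG=oravg
BASE=/u01/racdata
ORA_UID=105    # oracle uid used in this setup
ORA_GID=111    # dba gid used in this setup
NAMES="ctrl1 ctrl2 redo1 redo2 data1 data2 data3 ndx1 ndx2 ndx3 temp1 temp2 undo1 undo2 undo3 cm svrcfg quorum"

# Print one mkfs.ocfs command per short name.
print_mkfs_cmds() {
    local n
    for n in $NAMES; do
        echo "mkfs.ocfs -F -b 128 -L ocfs_$n -m $BASE/$n -u $ORA_UID -g $ORA_GID -p 0775 /dev/$VG/lv_$n"
    done
}

print_mkfs_cmds
```

Note that the uid and gid are kept in ORA_UID and ORA_GID because UID is a read-only variable in bash.
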
g. Mount the Partitions:

   On both nodes, issue the following commands to start OCFS:

/sbin/load_ocfs
/etc/init.d/ocfs start

   On both nodes mount the file systems:

mount -t ocfs /dev/oravg/lv_ctrl1 /u01/racdata/ctrl1
mount -t ocfs /dev/oravg/lv_ctrl2 /u01/racdata/ctrl2
mount -t ocfs /dev/oravg/lv_data1 /u01/racdata/data1
mount -t ocfs /dev/oravg/lv_data2 /u01/racdata/data2
mount -t ocfs /dev/oravg/lv_data3 /u01/racdata/data3
mount -t ocfs /dev/oravg/lv_ndx1 /u01/racdata/ndx1
mount -t ocfs /dev/oravg/lv_ndx2 /u01/racdata/ndx2
mount -t ocfs /dev/oravg/lv_ndx3 /u01/racdata/ndx3
mount -t ocfs /dev/oravg/lv_temp1 /u01/racdata/temp1
mount -t ocfs /dev/oravg/lv_temp2 /u01/racdata/temp2
mount -t ocfs /dev/oravg/lv_undo1 /u01/racdata/undo1
mount -t ocfs /dev/oravg/lv_undo2 /u01/racdata/undo2
mount -t ocfs /dev/oravg/lv_undo3 /u01/racdata/undo3
mount -t ocfs /dev/oravg/lv_cm /u01/racdata/cm
mount -t ocfs /dev/oravg/lv_svrcfg /u01/racdata/svrcfg
mount -t ocfs /dev/oravg/lv_quorum /u01/racdata/quorum

   The filesystems are now mounted on both nodes and are available for use:

safa:/u01 # mount | grep ocfs
/dev/oravg/lv_ctrl1 on /u01/racdata/ctrl1 type ocfs (rw)
/dev/oravg/lv_ctrl2 on /u01/racdata/ctrl2 type ocfs (rw)
/dev/oravg/lv_data1 on /u01/racdata/data1 type ocfs (rw)
/dev/oravg/lv_data2 on /u01/racdata/data2 type ocfs (rw)
/dev/oravg/lv_data3 on /u01/racdata/data3 type ocfs (rw)
/dev/oravg/lv_ndx1 on /u01/racdata/ndx1 type ocfs (rw)
/dev/oravg/lv_ndx2 on /u01/racdata/ndx2 type ocfs (rw)
/dev/oravg/lv_ndx3 on /u01/racdata/ndx3 type ocfs (rw)
/dev/oravg/lv_temp1 on /u01/racdata/temp1 type ocfs (rw)
/dev/oravg/lv_temp2 on /u01/racdata/temp2 type ocfs (rw)
/dev/oravg/lv_undo1 on /u01/racdata/undo1 type ocfs (rw)
/dev/oravg/lv_undo2 on /u01/racdata/undo2 type ocfs (rw)
/dev/oravg/lv_undo3 on /u01/racdata/undo3 type ocfs (rw)
/dev/oravg/lv_cm on /u01/racdata/cm type ocfs (rw)
/dev/oravg/lv_svrcfg on /u01/racdata/svrcfg type ocfs (rw)
/dev/oravg/lv_quorum on /u01/racdata/quorum type ocfs (rw)

marwa:/u01 # mount | grep ocfs
/dev/oravg/lv_ctrl1 on /u01/racdata/ctrl1 type ocfs (rw)
/dev/oravg/lv_ctrl2 on /u01/racdata/ctrl2 type ocfs (rw)
/dev/oravg/lv_data1 on /u01/racdata/data1 type ocfs (rw)
/dev/oravg/lv_data2 on /u01/racdata/data2 type ocfs (rw)
/dev/oravg/lv_data3 on /u01/racdata/data3 type ocfs (rw)
/dev/oravg/lv_ndx1 on /u01/racdata/ndx1 type ocfs (rw)
/dev/oravg/lv_ndx2 on /u01/racdata/ndx2 type ocfs (rw)
/dev/oravg/lv_ndx3 on /u01/racdata/ndx3 type ocfs (rw)
/dev/oravg/lv_temp1 on /u01/racdata/temp1 type ocfs (rw)
/dev/oravg/lv_temp2 on /u01/racdata/temp2 type ocfs (rw)
/dev/oravg/lv_undo1 on /u01/racdata/undo1 type ocfs (rw)
/dev/oravg/lv_undo2 on /u01/racdata/undo2 type ocfs (rw)
/dev/oravg/lv_undo3 on /u01/racdata/undo3 type ocfs (rw)
/dev/oravg/lv_cm on /u01/racdata/cm type ocfs (rw)
/dev/oravg/lv_svrcfg on /u01/racdata/svrcfg type ocfs (rw)
/dev/oravg/lv_quorum on /u01/racdata/quorum type ocfs (rw)

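Rather than comparing the listings by eye, a small filter could report which of the expected mount points are absent. This is a sketch only; the function name is hypothetical, and it assumes the "device on mountpoint type fstype (options)" layout of the mount output shown here.

```shell
#!/bin/bash
# Sketch: list expected OCFS mount points that do not appear in `mount` output.
# The mount output is read from stdin, so saved listings can be checked as well.

BASE=/u01/racdata
NAMES="ctrl1 ctrl2 redo1 redo2 data1 data2 data3 ndx1 ndx2 ndx3 temp1 temp2 undo1 undo2 undo3 cm svrcfg quorum"

missing_ocfs_mounts() {
    local mounted n
    # Field 3 is the mount point, field 5 the file system type.
    mounted=$(awk '$5 == "ocfs" { print $3 }')
    for n in $NAMES; do
        printf '%s\n' "$mounted" | grep -qxF "$BASE/$n" || echo "$BASE/$n"
    done
}
```

Usage on each node: mount | missing_ocfs_mounts

Fed the listings shown here, this sketch would report /u01/racdata/redo1 and /u01/racdata/redo2, since the mount commands in this HowTo do not include the two redo volumes.
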
h. A small test

   On safa, check /u01/racdata/data1:

safa:/u01 # ls -al racdata/data1
total 65540
drwxrwxr-x    1 oracle   dba        131072 Nov  9 22:25 .
drwxr-xr-x   17 oracle   dba          4096 Nov  9 12:27 ..

   On marwa, create a file in the same directory:

marwa:/u01 # cat /etc/ocfs.conf > racdata/data1/testocfs.txt
marwa:/u01 # ls -al racdata/data1
total 65541
drwxrwxr-x    1 oracle   dba        131072 Nov  9 15:27 .
drwxr-xr-x   17 oracle   dba          4096 Nov  9 05:32 ..
-rw-r--r--    1 root     root          170 Nov  9 15:34 testocfs.txt

   On safa, check /u01/racdata/data1 again; note that the file now exists and is viewable:

safa:/u01 # ls -al racdata/data1
total 65540
drwxrwxr-x    1 oracle   dba        131072 Nov  9 22:25 .
drwxr-xr-x   17 oracle   dba          4096 Nov  9 12:27 ..
safa:/u01 # ls -al racdata/data1
total 65541
drwxrwxr-x    1 oracle   dba        131072 Nov  9 22:25 .
drwxr-xr-x   17 oracle   dba          4096 Nov  9 12:27 ..
-rw-r--r--    1 root     root          170 Nov  9 15:34 testocfs.txt
safa:/u01 # cat racdata/data1/testocfs.txt
#
# ocfs config
# Ensure this file exists in /etc
#
        node_name = marwa
        ip_address = 10.0.0.51
        ip_port = 7000
        comm_voting = 1
        guid = 30488860FA8B7BDFB35D00105A286603

   It works!

i. Monitor OCFS

   OCFS starts a node monitor process for each mount point; these are seen as "ocfsnm-x"
   processes, where x represents a number. OCFS also starts a single listener process, seen as
   "ocfslsnr". Here is a sample listing from safa:

safa:~ # ps -ef|grep ocfs
root     11471     1  0 15:34 ?        00:00:00 [ocfsnm-15]
root     11472     1  0 15:34 ?        00:00:00 [ocfslsnr]
root     11474     1  0 15:34 ?        00:00:00 [ocfsnm-16]
root     11477     1  0 15:34 ?        00:00:00 [ocfsnm-17]
root     11480     1  0 15:34 ?        00:00:00 [ocfsnm-18]
root     11482     1  0 15:34 ?        00:00:00 [ocfsnm-19]
root     11485     1  0 15:34 ?        00:00:00 [ocfsnm-20]
root     11488     1  0 15:34 ?        00:00:00 [ocfsnm-21]
root     11491     1  0 15:34 ?        00:00:00 [ocfsnm-22]
root     11494     1  0 15:34 ?        00:00:00 [ocfsnm-23]
root     11496     1  0 15:34 ?        00:00:00 [ocfsnm-24]
root     11499     1  0 15:34 ?        00:00:00 [ocfsnm-25]
root     11502     1  1 15:34 ?        00:00:00 [ocfsnm-26]
root     11504     1  0 15:34 ?        00:00:00 [ocfsnm-27]
root     11508     1  0 15:34 ?        00:00:00 [ocfsnm-28]
root     11511     1  2 15:34 ?        00:00:00 [ocfsnm-29]
root     11514     1  0 15:34 ?        00:00:00 [ocfsnm-30]

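To compare the number of node monitor processes with the number of mounted file systems, the process listing could be summarised as follows. This is a minimal sketch with a hypothetical function name; it only counts the bracketed process names shown in the listing.

```shell
#!/bin/bash
# Sketch: count OCFS node monitor and listener processes in `ps -ef` output
# read from stdin.

count_ocfs_threads() {
    awk '
        /\[ocfsnm-[0-9]+\]/ { nm++ }      # one node monitor per mount point
        /\[ocfslsnr\]/      { lsnr++ }    # the single listener process
        END { printf "ocfsnm=%d ocfslsnr=%d\n", nm, lsnr }
    '
}
```

Usage: ps -ef | count_ocfs_threads

For the sample listing from safa this gives ocfsnm=16 ocfslsnr=1, which matches the 16 OCFS file systems reported by mount.
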
j. OCFS Performance & Maintenance

   Here are a few factors to take into consideration when using OCFS:

   1) The block size used in OCFS is critical. It affects the number of files that will be
      available in an OCFS partition as follows:

      ( partition size - 8 MB ) / ocfs_block_size = maximum number of files available

   2) The block size is the minimum space allocated for a file. For example, a 1 byte file would
      be allocated 128k in our configuration. 128k is an Oracle recommended setting. There are
      exceptions to this, and other sizes should be used as needed.

   3) OCFS file systems can become fragmented, so proper datafile sizing at the beginning is
      important. Automatic extensions should be avoided; try to pre-allocate the space ahead of time.

   4) To check for the largest contiguous extent of space available in a partition use the
      ocfsextfinder tool:

safa:~ # /usr/sbin/ocfsextfinder /dev/oravg/lv_redo2
/usr/sbin/ocfsextfinder 1.0.10-PROD1 Fri Mar  5 15:52:10 PST 2004 (build fcb0206676afe0fcac47a99c90de0e7b)
Runs of contiguous free space available (decending order)
Run #    Length (KB)    Starting bit number
=====    ===========    ===================
1        505984         0

   There is no defragmentation utility. If a partition becomes fragmented, it will need to be
   unmounted and reformatted, and the datafiles restored.

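The sizing rule in point 1) can be worked through with shell arithmetic. This is a sketch that simply applies the formula as stated in this HowTo to the two volume sizes used here; the function name is my own.

```shell
#!/bin/bash
# Sketch of the sizing rule from point 1):
#   (partition size - 8 MB) / block size = maximum number of files

# $1 = partition size in MB, $2 = OCFS block size in KB
max_ocfs_files() {
    echo $(( ($1 - 8) * 1024 / $2 ))
}

max_ocfs_files 504 128    # a 504 MB volume such as lv_ctrl1
max_ocfs_files 1000 128   # a 1000 MB volume such as lv_data1
```

With the 128k block size used here, the rule gives 3968 files for the 504 MB volumes and 7936 files for the 1000 MB volumes.
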
Appendix A.  Pre-requisites
============================================================
a. oracle account
b. kernel settings
c. rsh setup


a. oracle account
============================================================

i) Please ensure that the environment settings are the same on both boxes for:
  1. oracle uid (cat /etc/passwd | grep oracle)
  2. oracle gid (cat /etc/passwd | grep oracle)
  3. oracle user home directory (cd ~oracle)
  4. oracle user ulimit settings (ulimit -a)
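
The uid, gid and home directory can be pulled out of /etc/passwd in one step, which makes the two nodes easier to compare. This is a minimal sketch; the function name is hypothetical.

```shell
#!/bin/bash
# Sketch: print "uid gid home" for a user from a passwd-format file, so the
# values from both nodes can be compared side by side.

# $1 = passwd-format file, $2 = user name
passwd_ids() {
    awk -F: -v u="$2" '$1 == u { print $3, $4, $6 }' "$1"
}
```

Usage: run passwd_ids /etc/passwd oracle on safa and on marwa; the two output lines should be identical. In this setup the uid and gid are 105 and 111, the values passed to mkfs.ocfs.
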

b. kernel settings
============================================================

The following script will set the RAC kernel parameters for semaphores and the largest shared
memory segment that can be allocated:

#!/bin/bash
# Author: Ahbaid Gaffoor, OCPdba.Net
# Date: Sunday 7th November 2004
# File: rac-kernel
# Use: To set kernel parameters on boot for RAC
#      Customize to suit your taste
#
# Kernel Parameters before Oracle Configuration, make a note of yours
#
# /proc/sys/kernel/sem
# 250 256000 32 1024
#
# /proc/sys/kernel/shmmax
# 33554432

echo "Setting Oracle RAC Kernel Parameters"

#
# SEMMSL SEMMNS SEMOPM SEMMNI Semaphore Parameters
#
# SEMMSL : Max number of semaphores per id
# SEMMNS : Max number of semaphores in a system
# SEMOPM : Max number of ops per semop call
# SEMMNI : Max number of semaphore identifiers
#
echo "Setting SEMMSL SEMMNS SEMOPM SEMMNI in /proc/sys/kernel/sem"
echo 250 256000 100 1024 > /proc/sys/kernel/sem

#
# SHMMAX : Max amount of memory to be allocated for shared memory
#          Should be set to half of your physical Memory
#
echo "Setting SHMMAX in /proc/sys/kernel/shmmax"
echo 536870912 > /proc/sys/kernel/shmmax

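The SHMMAX value in the script is hard-coded. A value equal to half of physical memory could instead be computed as in the following sketch, which assumes the MemTotal line of /proc/meminfo is reported in kB; the function name is my own, and nothing is written to /proc.

```shell
#!/bin/bash
# Sketch: compute a SHMMAX value equal to half of physical memory.
# Nothing is written to /proc/sys; the value is only printed.

# $1 = physical memory in kB (the MemTotal figure from /proc/meminfo)
half_mem_bytes() {
    echo $(( $1 * 1024 / 2 ))
}

# Read MemTotal from /proc/meminfo when it is available.
if [ -r /proc/meminfo ]; then
    mem_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    [ -n "$mem_kb" ] && echo "Suggested SHMMAX: $(half_mem_bytes "$mem_kb")"
fi
```

For a box with exactly 1 GB of memory (1048576 kB) this gives 536870912, the value used in the script.
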
c. rsh setup
============================================================

On both boxes:
  1. Install rsh-server-0.17-451 (rpm -ivh rsh-server-0.17-451.rpm)
  2. Activate rsh-server as follows:

chkconfig rsh on
chkconfig rlogin on
/etc/init.d/xinetd reload

  3. /etc/hosts.equiv should be as follows:

#
# hosts.equiv   This file describes the names of the hosts which are
#               to be considered "equivalent", i.e. which are to be
#               trusted enough for allowing rsh(1) commands.
#
# hostname
safa oracle
safa-rac oracle
marwa oracle
marwa-rac oracle

Appendix B.  Hardware Diagram
============================================================

        
Appendix C.  Volume Group oravg
============================================================
vgdisplay -v oravg
--- Volume group ---
VG Name               oravg
VG Access             read/write
VG Status             available/resizable
VG #                  0
MAX LV                256
Cur LV                18
Open LV               16
MAX LV Size           511.98 GB
Max PV                256
Cur PV                2
Act PV                2
VG Size               16.92 GB
PE Size               8 MB
Total PE              2166
Alloc PE / Size       1816 / 14.19 GB
Free  PE / Size       350 / 2.73 GB
VG UUID               c2OfJG-zcr9-3VBO-BPuh-0VJd-2pDI-eTFa9g

--- Logical volume ---
LV Name                /dev/oravg/lv_ctrl1
VG Name                oravg
LV Write Access        read/write
LV Status              available
LV #                   1
# open                 1
LV Size                504 MB
Current LE             63
Allocated LE           63
Allocation             next free
Read ahead sectors     1024
Block device           58:1

--- Logical volume ---
LV Name                /dev/oravg/lv_ctrl2
VG Name                oravg
LV Write Access        read/write
LV Status              available
LV #                   2
# open                 1
LV Size                504 MB
Current LE             63
Allocated LE           63
Allocation             next free
Read ahead sectors     1024
Block device           58:2

--- Logical volume ---
LV Name                /dev/oravg/lv_redo1
VG Name                oravg
LV Write Access        read/write
LV Status              available
LV #                   3
# open                 0
LV Size                504 MB
Current LE             63
Allocated LE           63
Allocation             next free
Read ahead sectors     1024
Block device           58:3

--- Logical volume ---
LV Name                /dev/oravg/lv_redo2
VG Name                oravg
LV Write Access        read/write
LV Status              available
LV #                   4
# open                 0
LV Size                504 MB
Current LE             63
Allocated LE           63
Allocation             next free
Read ahead sectors     1024
Block device           58:4

--- Logical volume ---
LV Name                /dev/oravg/lv_data1
VG Name                oravg
LV Write Access        read/write
LV Status              available
LV #                   5
# open                 1
LV Size                1000 MB
Current LE             125
Allocated LE           125
Allocation             next free
Read ahead sectors     1024
Block device           58:5

--- Logical volume ---
LV Name                /dev/oravg/lv_data2
VG Name                oravg
LV Write Access        read/write
LV Status              available
LV #                   6
# open                 1
LV Size                1000 MB
Current LE             125
Allocated LE           125
Allocation             next free
Read ahead sectors     1024
Block device           58:6

--- Logical volume ---
LV Name                /dev/oravg/lv_data3
VG Name                oravg
LV Write Access        read/write
LV Status              available
LV #                   7
# open                 1
LV Size                1000 MB
Current LE             125
Allocated LE           125
Allocation             next free
Read ahead sectors     1024
Block device           58:7

--- Logical volume ---
LV Name                /dev/oravg/lv_ndx1
VG Name                oravg
LV Write Access        read/write
LV Status              available
LV #                   8
# open                 1
LV Size                1000 MB
Current LE             125
Allocated LE           125
Allocation             next free
Read ahead sectors     1024
Block device           58:8

--- Logical volume ---
LV Name                /dev/oravg/lv_ndx2
VG Name                oravg
LV Write Access        read/write
LV Status              available
LV #                   9
# open                 1
LV Size                1000 MB
Current LE             125
Allocated LE           125
Allocation             next free
Read ahead sectors     1024
Block device           58:9

--- Logical volume ---
LV Name                /dev/oravg/lv_ndx3
VG Name                oravg
LV Write Access        read/write
LV Status              available
LV #                   10
# open                 1
LV Size                1000 MB
Current LE             125
Allocated LE           125
Allocation             next free
Read ahead sectors     1024
Block device           58:10

--- Logical volume ---
LV Name                /dev/oravg/lv_temp1
VG Name                oravg
LV Write Access        read/write
LV Status              available
LV #                   11
# open                 1
LV Size                1000 MB
Current LE             125
Allocated LE           125
Allocation             next free
Read ahead sectors     1024
Block device           58:11

--- Logical volume ---
LV Name                /dev/oravg/lv_temp2
VG Name                oravg
LV Write Access        read/write
LV Status              available
LV #                   12
# open                 1
LV Size                1000 MB
Current LE             125
Allocated LE           125
Allocation             next free
Read ahead sectors     1024
Block device           58:12

--- Logical volume ---
LV Name                /dev/oravg/lv_undo1
VG Name                oravg
LV Write Access        read/write
LV Status              available
LV #                   13
# open                 1
LV Size                1000 MB
Current LE             125
Allocated LE           125
Allocation             next free
Read ahead sectors     1024
Block device           58:13

--- Logical volume ---
LV Name                /dev/oravg/lv_undo2
VG Name                oravg
LV Write Access        read/write
LV Status              available
LV #                   14
# open                 1
LV Size                1000 MB
Current LE             125
Allocated LE           125
Allocation             next free
Read ahead sectors     1024
Block device           58:14

--- Logical volume ---
LV Name                /dev/oravg/lv_undo3
VG Name                oravg
LV Write Access        read/write
LV Status              available
LV #                   15
# open                 1
LV Size                1000 MB
Current LE             125
Allocated LE           125
Allocation             next free
Read ahead sectors     1024
Block device           58:15

--- Logical volume ---
LV Name                /dev/oravg/lv_cm
VG Name                oravg
LV Write Access        read/write
LV Status              available
LV #                   16
# open                 1
LV Size                504 MB
Current LE             63
Allocated LE           63
Allocation             next free
Read ahead sectors     1024
Block device           58:16

--- Logical volume ---
LV Name                /dev/oravg/lv_quorum
VG Name                oravg
LV Write Access        read/write
LV Status              available
LV #                   17
# open                 1
LV Size                504 MB
Current LE             63
Allocated LE           63
Allocation             next free
Read ahead sectors     1024
Block device           58:17

--- Logical volume ---
LV Name                /dev/oravg/lv_svrcfg
VG Name                oravg
LV Write Access        read/write
LV Status              available
LV #                   18
# open                 1
LV Size                504 MB
Current LE             63
Allocated LE           63
Allocation             next free
Read ahead sectors     1024
Block device           58:18


--- Physical volumes ---
PV Name (#)           /dev/sda1 (1)
PV Status             available / allocatable
Total PE / Free PE    1083 / 0

PV Name (#)           /dev/sdb1 (2)
PV Status             available / allocatable
Total PE / Free PE    1083 / 350
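
The extent figures in the vgdisplay output can be cross-checked with a little arithmetic. This is a sketch that uses only numbers from the listing: 8 MB extents, seven volumes of 63 extents, eleven volumes of 125 extents, and 1083 extents per physical volume. The variable and function names are my own.

```shell
#!/bin/bash
# Sketch: cross-check the extent arithmetic in the vgdisplay output.

PE_MB=8
SMALL_LVS=7;  SMALL_LE=63    # ctrl1-2, redo1-2, cm, quorum, svrcfg
LARGE_LVS=11; LARGE_LE=125   # data1-3, ndx1-3, temp1-2, undo1-3

alloc_pe=$(( SMALL_LVS * SMALL_LE + LARGE_LVS * LARGE_LE ))
total_pe=$(( 1083 + 1083 ))  # Total PE on /dev/sda1 plus /dev/sdb1
free_pe=$(( total_pe - alloc_pe ))

# Convert a number of extents to GB with two decimals.
pe_to_gb() {
    awk -v pe="$1" -v mb="$PE_MB" 'BEGIN { printf "%.2f\n", pe * mb / 1024 }'
}

echo "Alloc PE: $alloc_pe ($(pe_to_gb "$alloc_pe") GB)"
echo "Free PE:  $free_pe ($(pe_to_gb "$free_pe") GB)"
echo "Total PE: $total_pe ($(pe_to_gb "$total_pe") GB)"
```

The results agree with the listing: 1816 extents allocated (14.19 GB), 350 free (2.73 GB) and 2166 in total (16.92 GB).
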
        

