Here is some important information about OCFS2 timeouts, taken from the Oracle Technology Network.
In the case of iSCSI on a NetAll cluster, failover can take 10 to 30 seconds, so the OCFS2 timeout must be higher than the default. I have not yet worked out the best timeout, but I recommend a threshold of 46, which gives a fence time of 90 seconds: (46 - 1) * 2 = 90:
vi /etc/sysconfig/o2cb
O2CB_HEARTBEAT_THRESHOLD=46
The script 019-edit-o2cb.sh makes this change.
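The contents of 019-edit-o2cb.sh are not reproduced here. As a rough sketch only, an edit of this kind could be written as the shell function below. The function name set_o2cb_threshold is hypothetical, and the sketch assumes the file uses plain KEY=value lines as shown above.

```shell
#!/bin/sh
# Hypothetical helper (not the actual 019-edit-o2cb.sh):
# set O2CB_HEARTBEAT_THRESHOLD in an o2cb sysconfig file.
# Usage: set_o2cb_threshold FILE VALUE
set_o2cb_threshold() {
    file=$1
    value=$2
    if grep -q '^O2CB_HEARTBEAT_THRESHOLD=' "$file"; then
        # An assignment already exists: rewrite it in place, keeping a .bak copy.
        sed -i.bak "s/^O2CB_HEARTBEAT_THRESHOLD=.*/O2CB_HEARTBEAT_THRESHOLD=$value/" "$file"
    else
        # No assignment yet: append one at the end of the file.
        echo "O2CB_HEARTBEAT_THRESHOLD=$value" >> "$file"
    fi
}

# Example (run as root on every node in the cluster):
# set_o2cb_threshold /etc/sysconfig/o2cb 46
```

The new value only takes effect after the o2cb stack is reconfigured, as described in the quoted text further down.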
The text below is taken from the ‘Oracle on FireWire’ document (see here: http://www.oracle.com/technology/pub/articles/hunter_rac10gr2_2.html#16).
Adjust the O2CB Heartbeat Threshold
This is a very important section when configuring OCFS2 for use by
Oracle Clusterware's two shared files on our FireWire drive. During testing, I
was able to install and configure OCFS2, format the new volume, and finally
install Oracle Clusterware (with its two required shared files; the voting disk
and OCR file), located on the new OCFS2 volume. I was able to install Oracle
Clusterware and see the shared drive; however, during my evaluation I was
receiving many lock-ups and hanging after about 15 minutes when the Clusterware
software was running on both nodes. It always varied on which node would hang
(either linux1 or linux2 in my example). It also didn't matter whether there was a high
I/O load or none at all for it to crash (hang).
Keep in mind that the configuration you are creating is a rather
low-end setup being configured with slow disk access with regards to the
FireWire drive. This is by no means a high-end setup and is susceptible to
bogus timeouts.
After looking through the trace files for OCFS2, it was apparent
that access to the voting disk was too slow (exceeding the O2CB heartbeat
threshold) and causing the Oracle Clusterware software (and the node) to crash.
The solution I used was to simply increase the O2CB heartbeat
threshold from its default setting of 7, to 601 (and in some cases as high as
900). This is a configurable parameter that is used to compute the time it
takes for a node to "fence" itself.
First, let's see how to determine what the O2CB heartbeat
threshold is currently set to. This can be done by querying the /proc file system as follows:
# cat /proc/fs/ocfs2_nodemanager/hb_dead_threshold
7
The value is 7, but what
does this value represent? Well, it is used in the formula below to determine
the fence time (in seconds):
[fence time in seconds] = (O2CB_HEARTBEAT_THRESHOLD - 1) * 2
So, with an O2CB heartbeat
threshold of 7, you would have a fence time of:
(7 - 1) * 2 = 12 seconds
You need a much larger
threshold (1200 seconds to be exact) given your slower FireWire disks. For 1200
seconds, you will want an O2CB_HEARTBEAT_THRESHOLD of 601 as shown below:
(601 - 1) * 2 = 1200 seconds
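The arithmetic above can be checked in either direction with a couple of shell functions. This is only a sketch of the formula quoted in the article; the names fence_time and threshold_for are made up for illustration, and threshold_for rounds up so that the resulting fence time is never shorter than the requested one.

```shell
#!/bin/sh
# fence_time THRESHOLD: fence time in seconds for a given heartbeat threshold,
# using the formula (threshold - 1) * 2.
fence_time() {
    echo $(( ($1 - 1) * 2 ))
}

# threshold_for SECONDS: smallest threshold whose fence time is at least SECONDS.
# Odd values are rounded up to the next even fence time.
threshold_for() {
    echo $(( ($1 + 1) / 2 + 1 ))
}

fence_time 7         # prints 12
fence_time 601       # prints 1200
threshold_for 90     # prints 46
threshold_for 1200   # prints 601
```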
Let's see now how to
increase the O2CB heartbeat threshold from 7 to 601. This will need to be
performed on both nodes in the cluster. You first need to modify the file /etc/sysconfig/o2cb and set
O2CB_HEARTBEAT_THRESHOLD to 601:
# O2CB_ENABELED: 'true' means to load the driver on boot.
O2CB_ENABLED=true

# O2CB_BOOTCLUSTER: If not empty, the name of a cluster to start.
O2CB_BOOTCLUSTER=ocfs2

# O2CB_HEARTBEAT_THRESHOLD: Iterations before a node is considered dead.
O2CB_HEARTBEAT_THRESHOLD=601
After modifying the file /etc/sysconfig/o2cb, you need to alter the o2cb configuration. Again,
this should be performed on all nodes in the cluster.
# umount /u02/oradata/orcl/
# /etc/init.d/o2cb unload
# /etc/init.d/o2cb configure
Load O2CB driver on boot (y/n) [y]: y
Cluster to start on boot (Enter "none" to clear) [ocfs2]: ocfs2
Writing O2CB configuration: OK
Loading module "configfs": OK
Mounting configfs filesystem at /config: OK
Loading module "ocfs2_nodemanager": OK
Loading module "ocfs2_dlm": OK
Loading module "ocfs2_dlmfs": OK
Mounting ocfs2_dlmfs filesystem at /dlm: OK
Starting cluster ocfs2: OK
You can now check again
to make sure the settings took effect for the o2cb cluster stack:
# cat /proc/fs/ocfs2_nodemanager/hb_dead_threshold
601
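When several nodes have to be checked, the comparison between the configured and the running value can be scripted. The function below is a sketch under stated assumptions: check_threshold is a hypothetical name, and both paths are passed as arguments so that nothing beyond the two files mentioned in the text is assumed.

```shell
#!/bin/sh
# Hypothetical helper: compare the configured threshold with the running one.
# Usage: check_threshold CONFIG_FILE PROC_FILE
check_threshold() {
    # Use the last assignment in the config file, since a later line
    # overrides an earlier one when the file is sourced.
    want=$(sed -n 's/^O2CB_HEARTBEAT_THRESHOLD=//p' "$1" | tail -n 1)
    have=$(cat "$2")
    if [ "$want" = "$have" ]; then
        echo "OK: threshold is $have"
    else
        echo "MISMATCH: configured $want, running $have"
        return 1
    fi
}

# Example (run on every node):
# check_threshold /etc/sysconfig/o2cb /proc/fs/ocfs2_nodemanager/hb_dead_threshold
```

A mismatch usually means the o2cb stack has not been unloaded and reconfigured on that node since the file was edited.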
Important Note: The value of 601 used for the O2CB heartbeat threshold will not work for all the FireWire drives listed in this guide. Use the following chart to determine the O2CB heartbeat threshold value that should be used.