Saturday, May 16, 2009

Problem determination PowerHA / HACMP

Collect the logs: clsnap -d '/tmp' -p2 -n 'node1,node2'

or

snap -e

If you have a C-SPOC problem: /tmp/cspoc.log has more details
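A few other log files are worth a look (these are the usual default HACMP paths; adjust them if your logs have been redirected):

tail -f /tmp/hacmp.out <= event script output, the first place to look during a failed takeover
more /usr/es/adm/cluster.log <= cluster daemon messages
grep -i error /tmp/cspoc.log <= C-SPOC results, per node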

If the VG configuration is inconsistent between nodes:

1) Validate that all disks in the VG are known on both nodes

node1 # lspv | grep vg1

hdisk3 005a2b2a4dc045f3 vg1 active

hdisk4 005a2b2ab58a59b2 vg1 active

node2 # lspv | grep vg1

hdisk3 005a2b2a4dc045f3 vg1

hdisk4 is missing ....

If HACMP is > 5.4 and the VG is not enhanced concurrent (as is the case here: lspv shows the VG as "active" rather than "concurrent"), then:

node2 # lspv | grep 005a2b2ab58a59b2

hdisk4 005a2b2ab58a59b2 None
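As a side note, a quick way to compare the whole VG between nodes (a sketch only, assuming rsh or ssh works from node1 to node2 and the VG is named vg1):

node1 # lsvg vg1 | grep -i concurrent <= shows "Enhanced-Capable" only if the VG is enhanced concurrent
node1 # for pvid in $(lspv | grep vg1 | awk '{print $2}'); do rsh node2 "lspv | grep $pvid"; done <= every PVID of vg1 should come back from node2, with the right VG name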

2) Reintegrate it into the VG correctly

node1 # lqueryvg -p hdisk3 -T > /usr/es/sbin/cluster/etc/vg/vg1 <= this saves the correct timestamp for the cluster

node1 # varyonvg -ub vg1 <= from now on, NO manipulation of vg1 must occur on node1...

node2 # importvg -L vg1 hdisk3

vg1

node2 # lqueryvg -p hdisk3 -T > /usr/es/sbin/cluster/etc/vg/vg1 <= This way, the timestamp is correct on both nodes.

node1 # varyonvg vg1 <= things are back to normal now
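At that point, a quick check that both nodes agree again does not hurt (same disk and VG names as above):

node1 # lspv | grep vg1 <= hdisk3 and hdisk4, both "active"
node2 # lspv | grep vg1 <= same PVIDs and same VG name, but not active, since vg1 is only varied on on node1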

This is the simplest way to correctly redefine the VG on the backup node... But that is when things go smoothly, and it's not always that way...

First, if the PVID is not known on node2 at all: is the zoning correctly defined? If it is, you MUST see a disk showing "none None" on your backup node. To get it correctly defined on the second node, you must do a rmdev/cfgmgr while vg1 is in 'unlocked' mode on node1, via the varyonvg -ub command.
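In commands, that gives something like this (a sketch; the hdisk number seen on node2 may differ from node1's):

node1 # varyonvg -ub vg1 <= release the reserve so node2 can read the disk
node2 # rmdev -dl hdisk4 <= remove the stale definition, if any
node2 # cfgmgr <= rediscover the disk, it should come back with its PVID
node2 # lspv | grep 005a2b2ab58a59b2
node1 # varyonvg vg1 <= put the reserve back once node2 is correct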

If the PVID was known before and is not anymore, it means you have 'phantom' disks: some disks are stuck in the "Defined" state on node2, while others have taken their place with no definition (none = no PVID, None = no VG defined). The right way to redefine them correctly is to remove the "none None" disk, and to mkdev the Defined one again.
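A sketch of that cleanup (the hdisk numbers are just examples, check the lsdev and lspv output carefully before removing anything):

node2 # lsdev -Cc disk <= spot the disk stuck in "Defined" and the duplicate that is Available but shows "none None" in lspv
node2 # rmdev -dl hdisk7 <= remove the duplicate, "none None" disk
node2 # mkdev -l hdisk4 <= bring the original, "Defined" disk back to Available
node2 # lspv | grep vg1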

As for the timestamp: since HACMP 5.4, it is synchronised via the clverify command.
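If you want to check the timestamps by hand, compare what is on the disk with what the cluster has saved, on each node (same file as above):

node1 # lqueryvg -p hdisk3 -T; cat /usr/es/sbin/cluster/etc/vg/vg1 <= the two values should match
node2 # lqueryvg -p hdisk3 -T; cat /usr/es/sbin/cluster/etc/vg/vg1 <= and be identical to node1's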

Wednesday, May 6, 2009

Beware of the storm...

While configuring two VIO servers the other day, I wanted to switch both VIO servers' SEA adapters to SEA failover mode. I fell into the following trap:


If one VIO server is configured as follows:

one virtual Ethernet adapter on VLAN / PVID 1, with external network access set to yes, and trunk priority 1

an SEA created between the physical adapter and this virtual adapter, and an internal address configured on the SEA.


Then, while configuring the other VIO server the same way as the first one, intending to switch it to failover mode later, the moment you create the SEA (with a virtual adapter on the same VLAN / PVID as the first VIO server) you generate a big ARP / broadcast storm that can take down your VLAN, and more.


So the right way to do it is to enable failover mode directly when you create the SEA, or to switch the first SEA to failover mode before creating the second SEA on the second VIO server.


NOT: mkvdev -sea ent1 -vadapter ent4 -default ent4 -defaultid 3

but directly

mkvdev -sea ent1 -vadapter ent4 -default ent4 -defaultid 3 -attr ha_mode=auto ctl_chan=ent3


Beforehand, the virtual adapter ent3 must already have been created, on VLAN 3.
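To double-check the control channel adapter before creating the SEA, entstat on the VIO server shows the PVID of a virtual adapter (as padmin; ent3 being the control channel adapter in this example):

lsdev -type adapter | grep -i ent <= list the ent adapters known to the vio server
entstat -all ent3 | grep -i "Port VLAN ID" <= should return 3, the VLAN reserved for the control channel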


OR, if you just need to modify your existing SEA into failover mode:


chdev -dev ent3 -attr ha_mode=auto ctl_chan=ent4
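Afterwards, you can check that the attribute took and which side is currently active (as padmin; ent3 is the SEA in this example):

lsdev -dev ent3 -attr ha_mode <= should now show auto
entstat -all ent3 | grep -i state <= PRIMARY on one vio server, BACKUP on the other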


There also seems to be a workaround at the switch level that could be helpful: the BPDU guard setting, which disables the port if a bridging loop or packet storm occurs.



This is what it looks like in the end (2 different networks for every partition):



SEA failover between 2 VIO servers

Network configuration of the VIO servers, when each VIO server has 2 physical network cards. This only concerns the virtual interfaces presented to the partitions:
This configuration provides failover between the VIO servers: the hosted partitions are given a single adapter, and that adapter is highly available.
1) In each VIO server, create 4 virtual interfaces with the following parameters:
ent2 on VLAN (PVID) 1, external network access, trunk priority 1 on the 1st VIO server and trunk priority 2 on the 2nd VIO server
ent3 on VLAN (PVID) 2, external network access, trunk priority 1 on the 1st VIO server and trunk priority 2 on the 2nd VIO server
ent4 on VLAN 99
ent5 on VLAN 98

2) Then create the SEAs DIRECTLY in FAILOVER mode, not in several steps (otherwise beware of the ARP storm), with the following commands:
For the production network:
mkvdev -sea ent0 -vadapter ent2 -default ent2 -defaultid 99 -attr ha_mode=auto ctl_chan=ent4
Explanation: this creates the ent6 interface, which will be the SEA (Shared Ethernet Adapter), on top of the physical interface ent0, using the virtual adapter ent2. It is configured directly in failover mode, and the heartbeat goes through ent4.
For the backup network:
mkvdev -sea ent1 -vadapter ent3 -default ent3 -defaultid 98 -attr ha_mode=auto ctl_chan=ent5
which will create ent7
3) Then configure the IP address used to reach the VIO servers:
cfgassist => configure the SEA interfaces, i.e. ent6 and ent7, with the correct IP addresses.
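Once everything is in place, a few checks and a manual failover test (as padmin on the VIO servers; ent6 and ent7 as created above):

lsmap -all -net <= shows each SEA with its physical and virtual adapters
entstat -all ent6 | grep -i state <= PRIMARY on vio1, BACKUP on vio2
chdev -dev ent6 -attr ha_mode=standby <= run on the primary vio server: forces the traffic over to the other one
chdev -dev ent6 -attr ha_mode=auto <= and back, once you have checked the partitions did not lose the network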