Comments on The Linux Juggernaut: Heartbeat Clustering

@Bharat.. Can you be more info.. I am not able to ...

2010-11-03T22:12:53.756-07:00

@Bharat, can you give more information? I am not able to understand your question. Please give as much detail as possible so that I can help you.

Thanks,
Surendra.

I am having a 4 server private network with one pu...

2010-10-29T04:42:30.450-07:00

I have a private network of 4 servers with one public virtual IP address. The problem is that this IP does not bind to the primary server directly: it floats to another node first and only then binds to the primary server, and we lose data as a result.
I am using a SUSE Linux server.

my virtual ip address is not stabling in n nodes t...

2010-10-29T04:39:27.590-07:00

My virtual IP address is not stable across the n nodes: it first binds to another node and only then to the primary server. How can I stop this? Please tell me.
Thanks

@anony.. ya thats true.. you can just keep httpd ...

2010-09-14T09:00:12.113-07:00

@anony, yes, that's true. You can just put httpd there.
Then check that you have an httpd script in /etc/ha.d/resource.d/; only then will it work.
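
A minimal sketch of that check, in Python. It only assumes what the reply above says: the service named in the cluster config needs a script of the same name in /etc/ha.d/resource.d/. The function name and the executable-bit test are my own assumptions, not part of heartbeat.

```python
import os


def resource_script_ok(service, resource_dir="/etc/ha.d/resource.d"):
    """Return True if an executable script named `service` exists in resource_dir."""
    path = os.path.join(resource_dir, service)
    # The script must be a regular file and must be executable to be started.
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == "__main__":
    for name in ("httpd", "squid"):
        state = "found" if resource_script_ok(name) else "missing"
        print(f"{name}: {state}")
```

If it reports httpd as missing, the config change alone will probably not be enough on that node.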

How can i setup heartbeat cluster using web servic...

2010-09-14T01:58:29.534-07:00

How can I set up a heartbeat cluster for a web service instead of squid? Do I change this parameter, from "rp1.linuxnix.com 10.77.225.20 squid" to "rp1.linuxnix.com 10.77.225.20 httpd"? Thanks

How can i configure Heartbeat cluster web service ...

2010-09-14T01:56:05.030-07:00

How can I configure a Heartbeat cluster for a web service instead of squid? Do I change "rp1.linuxnix.com 10.77.225.20 squid" to "rp1.linuxnix.com 10.77.225.20 httpd"? Thanks!
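
The change asked about here can be sketched as a small Python helper. It assumes the line layout quoted in the question (node name, cluster IP, then one or more service names); swap_service is a hypothetical helper for illustration, not a heartbeat tool.

```python
def swap_service(line, old, new):
    """Replace service `old` with `new` in one resource line.

    The first two fields (node name and cluster IP) are left untouched.
    """
    fields = line.split()
    node, ip, services = fields[0], fields[1], fields[2:]
    services = [new if s == old else s for s in services]
    return " ".join([node, ip] + services)


print(swap_service("rp1.linuxnix.com 10.77.225.20 squid", "squid", "httpd"))
# prints: rp1.linuxnix.com 10.77.225.20 httpd
```

The same edited line would have to go on every node, since the config is expected to be identical across the cluster.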

Thanks Surender for timely response. I have fixed...

2010-02-18T06:54:56.694-08:00

Thanks Surender for the timely response.

I have fixed all the issues. The problem was that the IPs on my crossover cable link were not communicating properly. There is some issue with that network, so I finished the cluster configuration with a single NIC on each node.

Regards,
UK

Please see in line.. Regd. issue#1, as you mentio...

2010-02-15T21:34:51.170-08:00

Please see in line..

Regarding issue #1: as you mentioned in step 7 of your blog post, I started heartbeat on both node1 and node2. Somehow both machines show eth0:0; the node started last (i.e. node2) is active, and I can access the server through the cluster IP. But when I stop heartbeat on node2, node1 does not pick up the service even though it shows eth0:0. It looks like it is not switching over properly...

>> THAT'S COOL, YOUR HEARTBEAT CLUSTER IS WORKING VERY WELL. FOR YOUR ISSUE, THINK OF IT THIS WAY: I HAVE TWO NODES, ONE ACTIVE AND ONE PASSIVE. WHEN DOES THE PASSIVE NODE TAKE OVER? IT TAKES OVER WHEN IT STOPS RECEIVING THE HEARTBEAT PULSE FROM THE ACTIVE NODE (THROUGH ETH1). SO WHAT IS YOUR PASSIVE NODE THINKING? IT THINKS THAT THE ACTIVE NODE WENT DOWN, SO IT HAS TO START THE SMB SERVICE ITSELF, AND IT ALSO TAKES THE INITIATIVE OF CREATING ETH0:0. THAT IS THE REASON YOU ARE SEEING ETH0:0 ON BOTH NODES.
SO HOW DO YOU RESOLVE THIS ISSUE?
FROM MY UNDERSTANDING, THERE IS NO PROPER COMMUNICATION BETWEEN ETH1 ON THE TWO NODES. PLEASE CHECK THAT CONFIGURATION. DID YOU USE A CROSSOVER CABLE TO CONNECT ETH1 OF BOTH NODES?
LET ME KNOW.

I WILL PUBLISH ONE MORE POST ON HOW TO TROUBLESHOOT A HEARTBEAT CLUSTER BY THIS WEEKEND. MAYBE THAT WILL BE MORE USEFUL TO YOU.
AND THANKS FOR WRITING TO LINUXNIX.COM
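
To make the reasoning above concrete, here is a toy Python model of the decision each node makes. It is only a sketch: the PeerMonitor class, the 30-second deadtime, and the 2-second pulse interval are illustrative assumptions, not values from this setup. The point is that when the eth1 link carries no pulses, both nodes reach the same conclusion and both bring up eth0:0.

```python
class PeerMonitor:
    """Toy model of how a node decides that its peer is dead."""

    def __init__(self, deadtime, start=0):
        self.deadtime = deadtime   # seconds of silence tolerated (assumed value)
        self.last_seen = start     # time of the last pulse received on eth1

    def heartbeat_received(self, now):
        self.last_seen = now

    def peer_is_dead(self, now):
        # The peer is declared dead once the silence exceeds deadtime.
        return (now - self.last_seen) > self.deadtime


def simulate(link_ok, deadtime=30, duration=60, interval=2):
    """Run two nodes for `duration` seconds; return (a_thinks_b_dead, b_thinks_a_dead)."""
    a = PeerMonitor(deadtime)
    b = PeerMonitor(deadtime)
    for t in range(0, duration + 1, interval):
        if link_ok:                # pulses only arrive if eth1 actually works
            a.heartbeat_received(t)
            b.heartbeat_received(t)
    return a.peer_is_dead(duration), b.peer_is_dead(duration)


print(simulate(link_ok=True))    # (False, False)
print(simulate(link_ok=False))   # (True, True): both nodes take over the IP
```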

Please see in line.. Please update step 4 (a &...

2010-02-15T21:23:35.386-08:00

Please see in line..

Please update step 4 (a & b). Both are mentioned for node 1. I think 4(b) is for node 2.

>>> THE CONFIGURATION SHOULD BE THE SAME ON BOTH NODES, BECAUSE IT NAMES MY MASTER NODE. LET ME PUT IT THIS WAY: SUPPOSE NODE1 (ACTIVE) GOES DOWN; NODE2 (PASSIVE) WILL TAKE OVER SERVING SMB. NODE2 WILL CONTINUOUSLY SEND HEARTBEAT PULSES TO THE MASTER NODE TO CHECK ITS STATUS. ONCE NODE1 IS UP AGAIN, NODE2 WILL USE THE CONFIG FROM STEP 4 TO DECIDE HOW TO TRANSFER CONTROL BACK.
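
A rough Python sketch of why the same line works on both nodes: every node reads the same resource line, and the node named first is treated as the master. The resource_owner function and its auto_failback flag are a simplified model under my own assumptions, not heartbeat's actual code; the sample line is the resource group that appears in the logs posted in this thread.

```python
def resource_owner(line, alive, current_owner=None, auto_failback=True):
    """Pick which node should hold the resources described by `line`.

    line:  "<master node> <cluster IP> <service>", identical on all nodes
    alive: set of node names that are currently up
    """
    preferred = line.split()[0]
    if not alive:
        return None
    if preferred in alive and auto_failback:
        return preferred           # the master takes control back when it returns
    if current_owner in alive:
        return current_owner       # otherwise resources stay where they are
    if preferred in alive:
        return preferred
    return sorted(alive)[0]        # any surviving node takes over


line = "nft80fs01a 172.25.41.153 smb"
# node1 is down: node2 serves smb
print(resource_owner(line, {"nft80fs01b"}, current_owner="nft80fs01a"))
# node1 is back: control returns to the master
print(resource_owner(line, {"nft80fs01a", "nft80fs01b"}, current_owner="nft80fs01b"))
```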

Please update step 4 (a & b). Both are mention...

2010-02-15T15:42:56.460-08:00

Please update step 4 (a & b). Both are mentioned for node 1. I think 4(b) is for node 2.

******************************** /var/log/ha-log ...

2010-02-15T10:13:53.175-08:00

********************************
/var/log/ha-log on node2 (next started):
info: Heartbeat generation: 1265931585
info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth1
info: glib: UDP Broadcast heartbeat closed on port 694 interface eth1 - Status: 1
info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth1
info: glib: ucast: bound send socket to device: eth1
info: glib: ucast: bound receive socket to device: eth1
info: glib: ucast: started on port 694 interface eth1 to 10.14.2.132
info: G_main_add_TriggerHandler: Added signal manual handler
info: G_main_add_TriggerHandler: Added signal manual handler
info: G_main_add_SignalHandler: Added signal handler for signal 17
info: Local status now set to: 'up'
info: Link nft80fs01b:eth1 up.
WARN: node nft80fs01a: is dead
info: Comm_now_up(): updating status to active
info: Local status now set to: 'active'
WARN: No STONITH device configured.
WARN: Shared disks are not protected.
info: Resources being acquired from nft80fs01a.
info: Running /etc/ha.d/rc.d/status status
info: No local resources [/usr/share/heartbeat/ResourceManager listkeys nft80fs01b] to acquire.
info: Taking over resource group 172.25.41.153
ResourceManager[22155]: 2010/02/15_17:56:58 info: Acquiring resource group: nft80fs01a 172.25.41.153 smb
IPaddr[22182]: 2010/02/15_17:56:58 INFO: Resource is stopped
ResourceManager[22155]: 2010/02/15_17:56:58 info: Running /etc/ha.d/resource.d/IPaddr 172.25.41.153 start
IPaddr[22255]: 2010/02/15_17:56:58 INFO: Using calculated nic for 172.25.41.153: eth0
IPaddr[22255]: 2010/02/15_17:56:58 INFO: Using calculated netmask for 172.25.41.153: 255.255.255.0
IPaddr[22255]: 2010/02/15_17:56:58 INFO: eval ifconfig eth0:0 172.25.41.153 netmask 255.255.255.0 broadcast 172.25.41.255
IPaddr[22238]: 2010/02/15_17:56:58 INFO: Success
ResourceManager[22155]: 2010/02/15_17:56:58 info: Running /etc/ha.d/resource.d/smb start
mach_down[22129]: 2010/02/15_17:56:58 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
mach_down[22129]: 2010/02/15_17:56:58 info: mach_down takeover complete for node nft80fs01a.
info: mach_down takeover complete.
info: Initial resource acquisition complete (mach_down)
info: Local Resource acquisition completed. (none)
info: local resource transition completed.

see /var/log/ha-log on node1 (1st started): info...

2010-02-15T10:13:35.647-08:00

see /var/log/ha-log on node1 (1st started):
info: Heartbeat generation: 1265931455
info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth1
info: glib: UDP Broadcast heartbeat closed on port 694 interface eth1 - Status: 1
info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth1
info: glib: ucast: bound send socket to device: eth1
info: glib: ucast: bound receive socket to device: eth1
info: glib: ucast: started on port 694 interface eth1 to 10.14.2.131
info: G_main_add_TriggerHandler: Added signal manual handler
info: G_main_add_TriggerHandler: Added signal manual handler
info: G_main_add_SignalHandler: Added signal handler for signal 17
info: Local status now set to: 'up'
info: Link nft80fs01a:eth1 up.
WARN: node nft80fs01b: is dead
info: Comm_now_up(): updating status to active
info: Local status now set to: 'active'
WARN: No STONITH device configured.
WARN: Shared disks are not protected.
info: Resources being acquired from nft80fs01b.
info: Running /etc/ha.d/rc.d/status status
info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
info: mach_down takeover complete for node nft80fs01b.
info: mach_down takeover complete.
info: Initial resource acquisition complete (mach_down)
IPaddr[17702]: INFO: Resource is stopped
heartbeat[17611]: info: Local Resource acquisition completed.
harc[17753]: info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
ip-request-resp[17753]: received ip-request-resp 172.25.41.153 OK yes
ResourceManager[17774]: info: Acquiring resource group: nft80fs01a 172.25.41.153 smb
IPaddr[17801]: INFO: Resource is stopped
ResourceManager[17774]: info: Running /etc/ha.d/resource.d/IPaddr 172.25.41.153 start
IPaddr[17874]: INFO: Using calculated nic for 172.25.41.153: eth0
IPaddr[17874]: INFO: Using calculated netmask for 172.25.41.153: 255.255.255.0
IPaddr[17874]: INFO: eval ifconfig eth0:0 172.25.41.153 netmask 255.255.255.0 broadcast 172.25.41.255
IPaddr[17857]: INFO: Success
ResourceManager[17774]: info: Running /etc/ha.d/resource.d/smb start
info: Local Resource acquisition completed. (none)
info: local resource transition completed.
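
Reading the two logs side by side, each node declares the other dead and then acquires the same resource group. A small Python sketch that pulls exactly those two facts out of ha-log text is below. The regular expressions are written against the lines shown in these two logs and are an assumption about the format in general; a match is only a hint of split brain, because it ignores timestamps.

```python
import re

DEAD_RE = re.compile(r"WARN: node (\S+?): is dead")
ACQUIRE_RE = re.compile(r"Acquiring resource group: (\S+) (\S+)")


def summarize(log_text):
    """Return (nodes declared dead, cluster IPs acquired) found in one ha-log."""
    dead = set(DEAD_RE.findall(log_text))
    acquired = {ip for _node, ip in ACQUIRE_RE.findall(log_text)}
    return dead, acquired


def looks_like_split_brain(log_a, log_b):
    """True if both logs declare a dead peer and acquire a common cluster IP."""
    dead_a, ips_a = summarize(log_a)
    dead_b, ips_b = summarize(log_b)
    return bool(dead_a) and bool(dead_b) and bool(ips_a & ips_b)


node1_log = """WARN: node nft80fs01b: is dead
ResourceManager[17774]: info: Acquiring resource group: nft80fs01a 172.25.41.153 smb
"""
node2_log = """WARN: node nft80fs01a: is dead
ResourceManager[22155]: 2010/02/15_17:56:58 info: Acquiring resource group: nft80fs01a 172.25.41.153 smb
"""
print(looks_like_split_brain(node1_log, node2_log))
```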

Excellent. Thanks for your quick response. Regd. ...

2010-02-15T10:04:12.739-08:00

Excellent. Thanks for your quick response.

Regarding issue #1: as you mentioned in step 7 of your blog post, I started heartbeat on both node1 and node2. Somehow both machines show eth0:0; the node started last (i.e. node2) is active, and I can access the server through the cluster IP. But when I stop heartbeat on node2, node1 does not pick up the service even though it shows eth0:0. It looks like it is not switching over properly...

Start heartbeat on node1
Start heartbeat on node2
Observation:
The cluster IP on node2 is active. If I stop heartbeat on node2, the cluster IP is still not accessible on node1.

FYI: I am using heartbeat for the smb service.

Please let me know if there is any issue in my configs.

-UK

>>PLEASE SEE IN LINE.. I have configured a 2...

2010-02-12T08:11:25.098-08:00

>> PLEASE SEE INLINE..
I have configured a 2-node cluster as per the steps and everything looks good.
But in my setup, both nodes show the floating IP on eth0:0. Is that OK, or is something wrong in my config? Please clarify.
>>> IT SHOULD NOT HAPPEN. AT ANY GIVEN TIME ONLY THE ACTIVE NODE SHOULD HAVE eth0:0 CONFIGURED.
TO TROUBLESHOOT THE CLUSTER, DO AS BELOW:
1) STOP HEARTBEAT ON THE ACTIVE NODE. THE SECONDARY NODE SHOULD TAKE CARE OF STARTING THE DEPENDENT SERVICE,
AND eth0:0 SHOULD COME UP THERE. IF THIS WORKS, I THINK EVERYTHING IS FINE.
CHECK THE HEARTBEAT LOGS. MAYBE YOU WILL GET SOME INFO THERE.

One more thing: how can I tell whether the setup is configured as Active-Active or Active-Passive? Please clarify.
>>> A LOAD-BALANCING CLUSTER IS CALLED ACTIVE-ACTIVE, WHEREAS AN HA CLUSTER IS CALLED ACTIVE-PASSIVE.
Regards,
UK
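
For the check described above (only the active node should have eth0:0), a minimal Python sketch that looks for the alias in ifconfig output follows. The sample text is hypothetical: its layout imitates classic ifconfig output, and the eth0 address and MAC are made up; only the 172.25.41.153 cluster IP comes from this thread.

```python
def has_interface(ifconfig_output, name):
    """Return True if `name` (e.g. "eth0:0") appears as an interface in ifconfig output."""
    for line in ifconfig_output.splitlines():
        # Interface names start in column 0; their detail lines are indented.
        if line and not line[0].isspace():
            if line.split()[0].rstrip(":") == name:
                return True
    return False


# Hypothetical sample; on a real node, feed in the output of `ifconfig` instead.
SAMPLE = """\
eth0      Link encap:Ethernet  HWaddr 00:00:00:00:00:01
          inet addr:172.25.41.151  Bcast:172.25.41.255  Mask:255.255.255.0
eth0:0    Link encap:Ethernet  HWaddr 00:00:00:00:00:01
          inet addr:172.25.41.153  Bcast:172.25.41.255  Mask:255.255.255.0
"""

print(has_interface(SAMPLE, "eth0:0"))
```

Running the same check on both nodes and getting True twice would match the symptom reported here.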

Hi Surender, First of all thank you very much for ...

2010-02-12T07:32:43.567-08:00

Hi Surender,
First of all thank you very much for the detailed steps.

I have configured a 2-node cluster as per the steps and everything looks good.
But in my setup, both nodes show the floating IP on eth0:0. Is that OK, or is something wrong in my config? Please clarify.

One more thing: how can I tell whether the setup is configured as Active-Active or Active-Passive? Please clarify.

Regards,
UK

Please see in line.. 1) wether web server sevi...

2010-02-08T08:57:47.629-08:00

Please see inline..
1) Does the web server service have to run on both machines or not?
>> YOU DO NOT NEED TO START THE SERVER SERVICE YOURSELF ON ANY NODE. THE HEARTBEAT CLUSTER WILL TAKE CARE OF RUNNING THE SERVICE FOR YOU: IT STARTS THE SERVICE ON THE ACTIVE NODE AND STOPS IT ON THE PASSIVE NODE.
2) How do I connect those two Ethernet interfaces to each other and to the local switch? (I need a diagram of the LAN cabling between the servers and the switch.)
>>> ETH1 ON BOTH SYSTEMS IS CONNECTED DIRECTLY TO THE OTHER NODE WITH A CROSSOVER CABLE, AND ETH0 ON EACH SYSTEM IS CONNECTED DIRECTLY TO A SWITCH.
3) Configuration related to squid
>>> I HAVE EDITED THE POST TO POINT TO THE SQUID CONFIGURATION.
If you provide me this information, I think I can set up a cluster with ease.

i have gone through your doc, you made it really l...

2010-02-03T00:24:07.509-08:00

I have gone through your doc. You made it look really simple, and now I am all set to build a cluster for a web server, but there are some points I still need to clear up.
1) Does the web server service have to run on both machines or not?
2) How do I connect those two Ethernet interfaces to each other and to the local switch? (I need a diagram of the LAN cabling between the servers and the switch.)
3) Configuration related to squid
If you provide me this information, I think I can set up a cluster with ease.