Why would my Linux host suddenly stop receiving multicast? All other NICs on the private network are receiving

I'm not aware of any expiration policy for IGMP group membership within the Linux stack. It may happen, but I doubt it, since there are at least two ways for the kernel to be told (one explicit, the other implicit) when a program's IGMP membership should be dropped.
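As a rough illustration, and assuming the two mechanisms meant here are an explicit `IP_DROP_MEMBERSHIP` call and the implicit drop that happens when the joining socket is closed, a listener's membership lifecycle might look like the sketch below. The group address is the one from the question's `netstat -g` output; the port and the helper names are hypothetical.

```python
import socket

GROUP = "224.2.2.4"   # group shown in the question's netstat -g output
PORT = 5000           # hypothetical port, not taken from the question


def build_mreq(group, iface_ip="0.0.0.0"):
    """Pack a struct ip_mreq: group address followed by local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(iface_ip)


def join_group(group, port, iface_ip="0.0.0.0"):
    """Open a UDP socket and ask the kernel to join the group (sends an IGMP report)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_mreq(group, iface_ip))
    return sock


def leave_group(sock, group, iface_ip="0.0.0.0"):
    """Explicit drop: tell the kernel this socket no longer wants the group."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP,
                    build_mreq(group, iface_ip))


# Typical use on a host with a multicast-capable interface:
#   sock = join_group(GROUP, PORT)
#   leave_group(sock, GROUP)   # explicit drop
#   sock.close()               # closing the socket also drops any remaining membership
```

Passing a specific `iface_ip` (the eth1 address, say) instead of `0.0.0.0` pins the join to one interface, which matters on a multi-homed host where the kernel might otherwise pick a different NIC.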

Therefore, I think you have a bug in the software listening for the multicast packets. (Care to name it?) The program receiving the multicast has somehow either dropped its own membership or neglected to add its membership on starting up. On restarting the multicast listener program, tcpdump should see the IGMPv2+ group add membership request go out on the network.
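To make it easier to recognize the join in a capture, here is a minimal sketch that decodes the fixed 8-byte IGMPv1/v2 message layout from RFC 2236. The function name is hypothetical, the checksum is not verified, and IGMPv3 reports (type 0x22) use a different layout, so no group is returned for them.

```python
import socket
import struct

# Message type codes from RFC 2236 (IGMPv2) and RFC 3376 (IGMPv3).
IGMP_TYPES = {
    0x11: "membership query",
    0x12: "IGMPv1 membership report",
    0x16: "IGMPv2 membership report",
    0x17: "IGMPv2 leave group",
    0x22: "IGMPv3 membership report",
}


def describe_igmp(payload):
    """Return (message name, group address or None) for a raw IGMP payload."""
    if len(payload) < 8:
        raise ValueError("IGMP message must be at least 8 bytes")
    msg_type, _max_resp, _checksum, group = struct.unpack("!BBH4s", payload[:8])
    name = IGMP_TYPES.get(msg_type, "unknown")
    if msg_type == 0x22:
        # IGMPv3 reports carry group records instead of a single group field.
        return name, None
    return name, socket.inet_ntoa(group)
```

A join for the group in the question would decode as an "IGMPv2 membership report" for 224.2.2.4, assuming the host is speaking IGMPv2.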

You may well have never noticed this bug when testing on a small LAN since cheap network switches don't understand IGMP. The feature is called IGMP snooping, and it's only found in switches roughly 5× the cost, per port, of the cheapest units, or more. A switch without IGMP snooping ability — or one with the feature turned off — turns multicast into broadcast, so IGMP group-add messages aren't necessary.

Your hosting provider apparently has IGMP snooping enabled on their network fabric, since multicast messages stopped coming in after the IGMP group membership went away on the troubled machine's network stack.

It may also be that the hosting provider's IGMP snooping options are misconfigured in the switches, so they're dropping the group membership, but that doesn't explain the netstat -g result.
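The membership that `netstat -g` reports can also be checked from a script. The sketch below parses `/proc/net/igmp`, assuming the usual layout of one row per interface followed by indented rows per group, and assuming a little-endian host (the kernel prints the group address as a native integer, so the hex digits appear byte-reversed on x86). The function names are hypothetical.

```python
import socket
import struct


def parse_proc_net_igmp(text):
    """Map each interface name to the list of IPv4 groups it has joined."""
    groups = {}
    device = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if not line[0].isspace():
            # Interface row looks like "<idx> <device> : <count> <querier>";
            # the column-header row does not start with a number and is skipped.
            device = fields[1].rstrip(":") if fields[0].isdigit() else None
            if device is not None:
                groups.setdefault(device, [])
        elif device is not None:
            # Group row: first field is the address in hex, host byte order.
            # Assumption: little-endian host, so repack as little-endian.
            raw = struct.pack("<I", int(fields[0], 16))
            groups[device].append(socket.inet_ntoa(raw))
    return groups


def is_member(group, device, path="/proc/net/igmp"):
    """True if the kernel currently lists `group` as joined on `device`."""
    with open(path) as handle:
        return group in parse_proc_net_igmp(handle.read()).get(device, [])


# Example on the affected node: is_member("224.2.2.4", "eth1")
```

If this returns False on the troubled node while the listener is running, the join never reached the kernel, which points at the application rather than the switch.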

Author: Sharad

Updated on September 18, 2022

Comments

  • Sharad
    Sharad over 1 year

    Here is my dilemma. Suddenly, as of yesterday, multicast packets are no longer being received from eth1 (private gigabit network) from one node. Routing between all the nodes is fine, no collisions, packet loss, etc.

    The ifconfig info (inet addr, Bcast, Mask) is all fine: every node shares the same broadcast address and netmask. They also all show UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 on eth1.

    These nodes are all hosted by a Xen VM provider. All the guests can see each other's private IP addresses. There are no iptables rules involved. Using tcpdump, multicast packets can be seen between all the nodes (20+) except one. The affected system has already been restarted.

    Just to add: according to netstat -g, the affected node is not being assigned the multicast group "eth1 1 224.2.2.4" the way all the others are.

    What would cause something like this? It seems that one node is no longer part of the multicast group. I have a ticket open, but I have a feeling they are stumped.

  • Sharad
    Sharad almost 12 years
    You're correct regarding netstat -g. We are using Hazelcast for this. I just realized that, according to tcpdump, nodes without Hazelcast, which don't even need to be listening to those multicasts, are still getting them even though they are not part of the group. That is fine, but it doesn't explain why one node doesn't seem to be receiving them properly.
  • Warren Young
    Warren Young almost 12 years
    The continuing reception of mcast packets means one of two things: 1. You once ran Hazelcast on that node, the switch saw the IGMP group join, didn't see a group drop, and so is still sending packets. Some switches have a "fast leave" feature to fix this, but it can cause more problems than it solves. 2. The hosting provider hasn't enabled IGMP snooping, in which case you have some problem with the node itself. Try the tcpdump test, looking for the IGMP group add request.