Help on configuring a switch to make 802.3ad link aggregation work

Solution 1

Just because the switch supports LACP doesn't mean that it's expecting it.

Make sure that you configure LACP groups on the switch side for the appropriate ports.

Then check the LACP status on both the switch and the server, and make sure that the 802.3ad link is up.
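
For the server half of that check, here is a rough Python sketch that looks at the text of /proc/net/bonding/bond0. It assumes the usual Linux bonding layout, where the bond section carries a "Partner Mac Address" line and each slave stanza carries an "Aggregator ID" line; an all-zero partner MAC, or slaves sitting in different aggregators, usually means the switch is not answering LACP on those ports. The function names and the SAMPLE text are made up for illustration, and on newer kernels you may need root to see the partner details at all.

```python
import re

ZERO_MAC = "00:00:00:00:00:00"

# Illustrative sample only, trimmed to the lines this sketch reads.
SAMPLE = """\
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
MII Status: up

802.3ad info
Active Aggregator Info:
    Aggregator ID: 1
    Partner Mac Address: 00:00:00:00:00:00

Slave Interface: eth0
MII Status: up
Aggregator ID: 1

Slave Interface: eth1
MII Status: up
Aggregator ID: 2
"""


def lacp_partner_summary(bonding_text):
    """Return (partner_mac, {slave_name: aggregator_id}) from the bonding text."""
    partner = re.search(r"Partner Mac Address:\s*([0-9A-Fa-f:]{17})", bonding_text)
    # The first chunk describes the bond itself; the rest are slave stanzas.
    chunks = re.split(r"^Slave Interface:\s*", bonding_text, flags=re.MULTILINE)
    slave_aggs = {}
    for chunk in chunks[1:]:
        name = chunk.split()[0]
        agg = re.search(r"Aggregator ID:\s*(\d+)", chunk)
        slave_aggs[name] = int(agg.group(1)) if agg else None
    partner_mac = partner.group(1).lower() if partner else None
    return partner_mac, slave_aggs


def lacp_negotiated(bonding_text):
    """True if a real partner MAC is present and all slaves share one aggregator."""
    partner_mac, slave_aggs = lacp_partner_summary(bonding_text)
    ids = set(slave_aggs.values())
    same_aggregator = len(ids) == 1 and None not in ids
    return partner_mac not in (None, ZERO_MAC) and same_aggregator
```

On a real server you would pass in open("/proc/net/bonding/bond0").read() instead of SAMPLE. With SAMPLE as written, lacp_negotiated returns False, which is the "switch supports LACP but isn't expecting it" situation described above.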

Solution 2

Web Configuration Guide for Linksys Switches:

http://www.cisco.com/en/US/docs/switches/lan/csbms/srw2048/administration/guide/SRW-US_v10_UG_A-Web.pdf

Pages 25-26 describe where to go to set up LACP on the switch side. Make sure you have your admin keys set on the two ports going to the switch.

Sorry I can't be of more help; I've only dealt with Catalyst, ProCurve, and Juniper EX switches for things like this.

Solution 3

First of all, you have to troubleshoot each LAG, one at a time. It sounds like you just plugged everything in, and you didn't walk through the setup process with one server, first. Otherwise, it sounds like you're asking us to read the manual for you. :-)

Regarding switch configuration:

On the switch, you need to create a separate link aggregation group (or "bond" or "LAG") for each individual server. So if you have Server #1 and Server #2, you need to configure LAG #1 and LAG #2 on the switch.

Most "smart" switches (web interface) have a separate configuration page for assigning switch ports to LAGs. Command line interfaces differ, but generally have a configuration sub-tree specifically to handle this. Check your switch's manual--there will be a chapter dedicated to this topic.

Specifically, you'll need to assign each server's real (physical) switch ports to that server's LAG. If Server #1 plugs into switch ports 5 and 6, then you assign switch ports 5 and 6 to LAG #1. Server #2 gets the same treatment, except its switch ports get assigned to LAG #2.
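
If it helps to write the plan down before touching the switch, here is a small Python sketch of the port-to-LAG assignment. The only thing it checks is that no physical port ends up in two LAGs. Ports 5 and 6 for LAG #1 come from the example above; ports 7 and 8 for LAG #2 are an assumption, as is the function name.

```python
def validate_lag_plan(plan):
    """plan maps a LAG number to the list of switch ports assigned to it.

    Returns a list of human-readable problems (empty if the plan is clean).
    """
    problems = []
    owner = {}  # switch port -> LAG that already claimed it
    for lag_id, ports in sorted(plan.items()):
        for port in ports:
            if port in owner:
                problems.append(
                    f"port {port} is in both LAG {owner[port]} and LAG {lag_id}"
                )
            else:
                owner[port] = lag_id
    return problems


# Server #1 on switch ports 5 and 6 (from the text);
# Server #2 on ports 7 and 8 (hypothetical).
plan = {1: [5, 6], 2: [7, 8]}
problems = validate_lag_plan(plan)
```

An empty list means each server has its own LAG with its own ports, which is the layout the switch needs.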

Configure the LAG-specific parameters for each LAG. Make sure you configure the LACP timeout parameter identically on both sides of each LAG/server pair. Generally, you want to use the "short" LACP timeout (the partner sends LACPDUs every 1 second, and the link times out after 3 seconds without one), but it's most important that the settings are the same on both sides.

You'll also want to make sure that the LAG type is correct. Many switches support multiple link aggregation/bonding types, chiefly Cisco's Portchannel and 802.3ad. You must configure your LAGs for DYNAMIC 802.3ad (LACP) operation, to match how your Linux machines are configured.
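
To confirm what the Linux side is actually set to before comparing it with the switch, something like the following Python sketch can help. It assumes the bonding driver's sysfs attributes /sys/class/net/bond0/bonding/mode and /sys/class/net/bond0/bonding/lacp_rate, which read as a name followed by a number (for example "802.3ad 4" and "fast 1"). The function names are made up here, and "fast" is taken to be the Linux counterpart of the short timeout.

```python
from pathlib import Path


def parse_bonding_attr(raw):
    """sysfs bonding attributes look like 'name number'; return just the name."""
    parts = raw.split()
    return parts[0] if parts else ""


def check_bond_settings(mode_raw, lacp_rate_raw, want_rate="fast"):
    """Return a list of mismatches against 802.3ad with the wanted LACP rate."""
    issues = []
    mode = parse_bonding_attr(mode_raw)
    rate = parse_bonding_attr(lacp_rate_raw)
    if mode != "802.3ad":
        issues.append(f"bonding mode is {mode!r}, expected '802.3ad'")
    if rate != want_rate:
        issues.append(f"lacp_rate is {rate!r}, expected {want_rate!r}")
    return issues


def read_bond_settings(bond="bond0"):
    """Read the raw attribute strings for one bond device (Linux only)."""
    base = Path("/sys/class/net") / bond / "bonding"
    return (base / "mode").read_text(), (base / "lacp_rate").read_text()
```

On the server, check_bond_settings(*read_bond_settings("bond0")) should come back empty if the bond is in mode 4 with the fast rate. Whatever it reports, set the switch LAG to match.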

Finally, you should configure any VLAN, trunking, or other port-specific parameters for each LAG. For these parameters, your switch will treat each LAG as if it were just another switch port--it can be tagged or trunked, you can turn on jumbo frames, you can filter traffic, etc. Whatever settings you gave the underlying, real member ports are ignored while those ports are assigned to the LAG.

After you've configured your LAGs and assigned their port settings, you should be able to check the status of each one through the switch interface. It will report some kind of link state, probably an overall state for the whole bonded group plus the states of the individual real links in the group. You may get more information, depending on your switch interface.

On the Linux server, run cat /proc/net/bonding/bond0 (change 'bond0' to whatever your bond device name is) to see the status of the whole bond and the member links. This shows a stanza for the bond and each member link, and each stanza will have a line like 'MII Status: up' if it's healthy and functioning.
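
If you'd rather not eyeball the output on every server, here is a minimal Python sketch that pulls the 'MII Status' line out of each stanza. It assumes the layout described above (one stanza for the bond, then one per "Slave Interface"); the function names and the "(bond)" label for the first stanza are just choices made for this example.

```python
import re


def mii_status_by_stanza(bonding_text):
    """Map each stanza ('(bond)' or a slave interface name) to its MII status."""
    statuses = {}
    current = "(bond)"  # lines before the first slave stanza belong to the bond
    for line in bonding_text.splitlines():
        line = line.strip()
        match = re.match(r"Slave Interface:\s*(\S+)", line)
        if match:
            current = match.group(1)
        elif line.startswith("MII Status:"):
            statuses[current] = line.split(":", 1)[1].strip()
    return statuses


def unhealthy_stanzas(bonding_text):
    """Return the sorted names of stanzas whose MII status is not 'up'."""
    statuses = mii_status_by_stanza(bonding_text)
    return sorted(name for name, status in statuses.items() if status != "up")
```

Feed it open("/proc/net/bonding/bond0").read() on the server; an empty result from unhealthy_stanzas means the bond and every member link report 'up'. Keep in mind that MII status only reflects the physical link, so it can be 'up' even when LACP has not been negotiated with the switch.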

Author: James

Updated on September 17, 2022

Comments

  • James, over 1 year

    I have a switch (SRW2024) which supports both jumbo frames and link aggregation.

    I've got 2 servers (each has two Gbit NICs that work under the kernel) that I want to connect to a file storage backend (iSCSI, Openfiler).

    I've set up bonding on each server (eth0+eth1) as bond0 and configured the subnet for it. The file server is on the same network.

    Bonding mode is 4 (802.3ad Dynamic link aggregation) on every node on the network.

    But I'm unable to ping any host.

    Using tcpdump on bond0, I see an ARP request 'who has x.x.3.1 tell x.x.3.2', but the target machine does not answer.

    No firewall, no special policies.

    I've spent hours trying different configurations, with no success.

    I'm looking for someone to get me started; I'm just lost.

    Any help would be really appreciated.