2011/06/13

Link aggregation in a cross-platform environment

Everybody in the world knows that LACP (802.3ad) is the standard for link aggregation, right? Well, not exactly.

We have VMware ESX and Solaris servers connected to our Cisco edge switches. Sounds good, right? We'd like to bond the multiple gig-E NICs on each server into a multi-gigabit aggregate. Sounds good, right? Well, it's not so easy.

ESX doesn't support true 802.3ad aggregation, because it doesn't speak LACP at all. VMware fakes it with the vSwitch NIC teaming properties. The result is the same as a Layer-3 load-balanced port channel (the link is picked by a hash of the source and destination IPs), but they don't call it that. Fortunately, they use the same hash algorithm as Cisco, so we can work with it.

On the Cisco side, we add the interfaces to a channel-group with mode "on", which builds a static port channel with no LACP negotiation. The channel then uses the switch-wide port-channel load-balance setting, which we had to set to src-dst-ip.

Unfortunately, that setting is a global switch option and can't be set per port channel, so our Solaris boxes (which speak LACP properly) can't use Layer-4 balancing (a hash of source and destination IPs and ports). This sucks, because our Solaris boxes are the heavy network hitters (backup servers) that could really use the extra bandwidth from spreading multiple TCP connections across multiple links. With an IP-only hash, every connection between the same two hosts lands on the same link.
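To see why this hurts, here is a rough Python sketch of the two hashing styles. It is a simplified model, not Cisco's or VMware's actual implementation: the real hash is platform-specific, and the function names, addresses, ports, and four-link bundle below are all made up for illustration. It only assumes that the link is picked by XORing the fields together and reducing the result to a link index.

```python
import ipaddress


def l3_link(src_ip, dst_ip, n_links):
    """Simplified src-dst-ip model: only the two addresses feed the hash."""
    key = int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    return key % n_links


def l4_link(src_ip, src_port, dst_ip, dst_port, n_links):
    """Simplified Layer-4 model: addresses and TCP ports feed the hash."""
    key = int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    key ^= src_port ^ dst_port
    return key % n_links


# Hypothetical example: one client opens four TCP connections to one
# backup server over a four-link bundle. Only the source port differs.
N_LINKS = 4
flows = [("10.0.0.5", 40000 + i, "10.0.0.9", 10000) for i in range(4)]

l3_choices = {l3_link(src, dst, N_LINKS) for src, _, dst, _ in flows}
l4_choices = {l4_link(src, sp, dst, dp, N_LINKS) for src, sp, dst, dp in flows}

print("links used with IP-only hash:", sorted(l3_choices))
print("links used with IP+port hash:", sorted(l4_choices))
```

In this toy model the IP-only hash puts all four connections on a single link, while the IP-and-port hash spreads them out. That is exactly the difference our backup servers care about.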

I'm not sure who to blame here: VMware for not doing LACP, or Cisco for not allowing different load-balancing methods on different port channels.

--Joe
