[rbridge] Proposed resolution of DRB election/MTU testing
d3e3e3 at gmail.com
Fri May 1 11:42:59 PDT 2009
On Fri, May 1, 2009 at 2:01 AM, Ali Sajassi (sajassi) <sajassi at cisco.com> wrote:
> Hi Radia,
> Regarding your comment that loop can happen as the result of one-way
> connectivity via partitioned VLAN, AFAIK DRB selection is done over
> designated VLAN and if that VLAN is partitioned, then the bridged
> network is partitioned because this VLAN is expected to be reachable via
> all the access ports. And if the bridged network is partitioned, then it
> is O.K. to have a DRB for each of the partitions. So, where is the loop?
The DRB selection has to be done on more than "designated VLAN". What
is the "designated VLAN" if all the N different RBridges with ports
connected to an arbitrary bridged LAN are configured to want a
different designated VLAN? The answer is that the DRB gets to dictate
the designated VLAN.
Exactly which VLANs RBridges need to send "hellos" on was an
area that the TRILL working group considered quite carefully. The set
of VLANs specified in the current draft for protective hellos (which
is the set of VLANs the one new "TRILL Hello" Radia suggested would be
sent on) was the result. While you could send "hellos" on even more
VLANs, a sketch of a proof that the set of VLANs specified in the
current draft is safe is given in Section 4 of
draft-eastlake-trill-rbridge-notes-01.txt (this is not a WG draft and
has not been updated for the latest base protocol specification).
> Besides, how does a partitioned VLAN result in a one-way connectivity
> (in bridged network, it should result in a two-way disconnectivity).
I'm not sure exactly what Radia meant but, as far as I know, if you
don't turn on bridge VLAN ingress filtering, it is quite easy to
configure a bridge port so that it acts as a VLAN diode, letting
frames through one way and not the other for a particular VLAN or
VLANs. The design of the protective hello mechanism in the current
draft allows for arbitrary VLAN diodes in the arbitrary bridged LAN
interconnecting a set of RBridges.
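
A minimal sketch of the VLAN diode effect described above. The BridgePort
class and its field names are hypothetical, chosen only for illustration;
the sketch assumes that egress is always limited to the port's VLAN member
set, while ingress is limited to it only when ingress filtering is enabled.

```python
from dataclasses import dataclass, field


@dataclass
class BridgePort:
    """Hypothetical model of one bridge port's per-VLAN behaviour."""
    member_vlans: set = field(default_factory=set)
    ingress_filtering: bool = False

    def accepts(self, vlan: int) -> bool:
        # With ingress filtering off, a tagged frame is accepted on receipt
        # even if the port is not in that VLAN's member set.
        return (not self.ingress_filtering) or vlan in self.member_vlans

    def transmits(self, vlan: int) -> bool:
        # Frames are only sent out of ports in the VLAN's member set.
        return vlan in self.member_vlans


def is_vlan_diode(port: BridgePort, vlan: int) -> bool:
    """True if frames in `vlan` can enter through the port but never leave."""
    return port.accepts(vlan) and not port.transmits(vlan)


port = BridgePort(member_vlans={10}, ingress_filtering=False)
print(is_vlan_diode(port, 20))  # True: VLAN 20 passes one way only
print(is_vlan_diode(port, 10))  # False: VLAN 10 passes both ways
```

Turning on ingress filtering for the same port makes accepts() and
transmits() agree for every VLAN, so the diode disappears.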
> I have no desire to prolong this discussion; however, I am not clear of
> the failure scenarios that warrant the modifications to IS-IS and I
> would very much prefer to keep the things simple and don't modify IS-IS
> if we don't have to. I agree with you that the proposed solution is more
> flexible, I just don't see where and why such flexibility is needed. So,
> under which of the following the loop is expected:
There is a potential loop in TRILL whenever you have two RBridge
ports, A and B, such that A can emit a native frame in VLAN x that
will get to B but when A sends a hello in VLAN x, it does not get to
B.
> A) loss of connectivity over .1Q bridged network that supports standard
> 1522 bytes frame size: I listed all the scenarios that I could think of
> in my previous email (e.g., different combinations of access and trunk
> ports) and showed that we won't have loops in any of those scenarios.
In my response, I explained why I did not find your message persuasive.
> B) loss of connectivity over .1Q bridged network that doesn't support
> 1522 byte frame size (e.g., it only supports 1100 bytes). Is that the
> scenario of interest ? If so, then this cannot happen because by
> definition a .1Q network MUST support frame size of 1522 bytes. If it
> doesn't support standard frame size, then loop can happen within that
> .1Q network itself that runs MSTP.
While I agree that it is reasonable to assume support of some minimal
message size, there is no assumption that the arbitrary bridged LAN
connecting RBridges is "a .1Q network".
> C) loss of connectivity over .1Q bridge network because of one-way
> connectivity - e.g., partitioned VLAN. This is what you mentioned in
> this email and the question is how partition of designated VLAN will
> result in loop. And how a one-way connectivity can happen in a bridged
> network because if that can happen, then there will be a loop within the
> bridged network that runs MSTP.
It's not TRILL's problem whether or not bridges in the arbitrary
bridged LAN between RBridges support MSTP or whether or not MSTP works
when bridge ports are configured as VLAN diodes. I see no purpose in
discussing this here.
> So, I would appreciate if anyone can clearly describe a scenario in
> which a loop can be created by TRILL over a bridge network.
>> -----Original Message-----
>> From: rbridge-bounces at postel.org
>> [mailto:rbridge-bounces at postel.org] On Behalf Of Radia Perlman
>> Sent: Thursday, April 30, 2009 3:08 PM
>> To: rbridge at postel.org
>> Subject: Re: [rbridge] Proposed resolution of DRB election/MTU testing
>> Re: Ali Sajassi's question about how loops could form:
>> To simplify, with any sort of one-way connectivity, the
>> current IS-IS LAN Hello protocol says to ignore (for purpose
>> of DR election) any router with which you don't have 2-way
>> connectivity.
>> One-way connectivity can happen due to lots of reasons -- not
>> just MTU.
>> For example,
>> because of partitioned VLANs.
>> So therefore, it's pretty clear that if you want only a
>> single DRB, you need to defer to anyone you hear from (or
>> about) with better (ID, priority).
>> And separating out the MTU testing into a separate, optional,
>> very infrequent protocol allows flexibility that might be
>> very useful in data centers (actually measuring the MTU, not
>> just saying yes/no). Plus if padded hellos are guaranteed to
>> get through, you don't need to pad them. And if padding them
>> might cause them not to get through, then you'll get loops.
>> (sajassi) wrote:
>> > Hi Radia,
>> > Thanks for the description of the solution. It was very nice and
>> > concise. However, I'd like to get some clarification regarding the
>> > scenario(s) in which such solution is required.
>> > The original problem was that the IS-IS hello messages can get lost
>> > over 802.1Q (because of Eth frame size exceeding 1522 bytes) and thus
>> > resulting in multiple AFs selection and creating loop over 802.1Q
>> > network. This was unacceptable because TRILL was forming a loop over
>> > 802.1Q network with standard-sized frame of 1522. I can think of three
>> > scenarios that can be considered here:
>> > 1) 802.1Q network is only used to transport native Ethernet packets
>> > (e.g., Rbridge ports connected to 802.1Q network are configured as
>> > access ports only). In this scenario, no TRILL encapsulated packets
>> > are sent over the bridged network and thus IIHs that are sent over
>> > 802.1Q network should not be padded with any consideration for TRILL
>> > header overhead - e.g., IIHs needs only to get padded to 1500 bytes.
>> > In this scenario, since the padded IIHs adhere to 1500 byte-limit of
>> > 802.1Q network, there is no issue with IIHs getting dropped by the
>> > bridged network. The TRILL mechanism for DRB & AF selection and
>> > root-bridge collision detect should work just fine and there should be
>> > no transient loop created either in the bridged network or the TRILL
>> > network.
>> > 2) 802.1Q network is only used as transit network for TRILL
>> > encapsulated packets (e.g., Rbridge ports connected to 802.1Q network
>> > are configured as trunk ports only). In this scenario, IIHs should
>> > get padded with consideration for TRILL header overhead in mind which
>> > means if the bridged network doesn't have support for bigger MTU, then
>> > these IIHs get dropped. However, dropping of these IIHs doesn't cause
>> > any harm since there is no access ports over this bridged network and
>> > all it means is that TRILL nodes don't see each other over the bridged
>> > network and thus they don't form adjacencies with each other. Since
>> > there is no access ports and there is no AFs (Appointed Forwarders),
>> > there will be no loop over this bridged network.
>> > 3) 802.1Q network can be used to transport both Native Ethernet
>> > packets and TRILL encapsulated packets (e.g., Rbridge ports connected
>> > to 802.1Q network are configured as both access and trunk ports). In
>> > this scenario, IIHs MTU size should default to the one for access port
>> > - e.g., without any consideration for TRILL header overhead. In other
>> > words, it should work just like scenario-1. In such scenario, no IIHs
>> > can get dropped over the 802.1Q network because the TRILL never pads
>> > them to more than 1500 bytes of payload.
>> > So, in summary, I cannot see how a loop can be formed using
>> > standard-based 802.1Q frame size of 1522 based on the above three
>> > scenarios. If you have some other scenario(s) in mind, can you
>> > describe it or them. It should be noted that I am talking about
>> > standard 802.1Q frame size for the above scenarios (and not jumbo
>> > frame size based on 802.3as).
>> > Cheers,
>> > Ali
>> >> -----Original Message-----
>> >> From: rbridge-bounces at postel.org
>> >> [mailto:rbridge-bounces at postel.org] On Behalf Of Radia Perlman
>> >> Sent: Wednesday, April 29, 2009 9:21 PM
>> >> To: rbridge at postel.org
>> >> Subject: [rbridge] Proposed resolution of DRB election/MTU testing
>> >> Moving to a new thread, since the "Why is MTU discovery important"
>> >> thread was getting too long (and the subject line had nothing to do
>> >> with most of the discussion).
>> >> Hopefully we can quickly close the issue of DRB election/MTU
>> >> discovery.
>> >> And then finally close on the base protocol document...
>> >> I was participating in an off-list
>> >> group of people discussing the issue, which was useful in
>> >> certain aspects.
>> >> I will summarize the proposed solution.
>> >> First, refreshing people as to what the issue is:
>> >> The problem: The Hello protocol in IS-IS may elect multiple DRs,
>> >> since routers ignore routers with whom they do not have 2-way
>> >> connectivity. Somewhat orthogonally, IS-IS LAN Hellos are padded, to
>> >> avoid forming adjacencies with neighbors that you can't speak the
>> >> minimum acceptable size with.
>> >> This behavior is fine for layer 3, but not for layer 2, where it will
>> >> form loops if there are multiple DRBs.
>> >> So ignoring this issue for TRILL is not an option.
>> >> ***********************
>> >> A bit of background on IS-IS
>> >> IS-IS is modular, in that there are two sublayers, that in DECnet we
>> >> called the "subnetwork independent sublayer"
>> >> and the "subnetwork dependent sublayer". The subnetwork dependent
>> >> sublayer has neighbor adjacency forming protocols for different types
>> >> of links.
>> >> What we are proposing for TRILL is support for a new type of link
>> >> within IS-IS's "link dependent sublayer". This is for Ethernet links
>> >> that are not explicitly pt-to-pt.
>> >> What we need to accomplish with this protocol:
>> >> a) elect exactly one DRB
>> >> b) figure out what the campus-wide minimum MTU size, "S", is (to know
>> >> what the minimum acceptable link MTU size is). LSP fragment sizes
>> >> must not be larger than this minimum
>> >> c) test neighbor-neighbor links to see if they support size S
>> >> d) remove links from the topology that do not support the campus
>> >> minimum size S
>> >> *****************************
>> >> Electing exactly one DRB
>> >> This will be based on periodic messages, which we'll call
>> >> TRILL-Hellos, similar to IS-IS LAN Hellos, but they are different in
>> >> two aspects: they are not padded, and election is based solely on
>> >> (ID, priority), and not on whether connectivity is 2-way. In other
>> >> words, R2 defers to R1 if R1 has higher (ID, priority), whether or
>> >> not there is 2-way connectivity between R2 and R1.
>> >> TRILL-Hellos must be periodic and frequent, so as to avoid having RBs
>> >> not know about each other. They will be sent on the set of VLANs that
>> >> the TRILL spec already says Hellos would be sent on.
>> >> TRILL-Hellos contain all the same information that the TRILL spec
>> >> already claims is in Hello messages. (other than the difference that
>> >> they won't be padded).
>> >> So basically, the changes in the TRILL spec so far are
>> >> 1) rename Hello to "TRILL-Hello message"
>> >> 2) do not pad the message
>> >> 3) election will be based solely on the fields (ID, priority).
>> >> Note: Although the neighbor list is included in a TRILL-Hello, (as it
>> >> is in an IS-IS LAN Hello), it does not affect selection of DRB.
>> >> But the neighbor list still needs to be there for all the RBridges to
>> >> know which neighbors they have 2-way connectivity with, for the
>> >> purpose of reporting links in LSPs.
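
A minimal sketch of the election rule proposed above: the DRB is chosen
from everyone heard, whether or not connectivity is 2-way. The Hello tuple
and the function names are hypothetical. The ordering is an assumption:
the text only says "(ID, priority)", and this sketch compares priority
first and uses the ID as a tie-breaker, as IS-IS DR election does.

```python
from typing import Iterable, NamedTuple


class Hello(NamedTuple):
    """Hypothetical subset of the fields carried in a TRILL-Hello."""
    rbridge_id: str
    priority: int
    neighbors: tuple = ()  # kept for LSP reporting, ignored by the election


def election_key(hello: Hello) -> tuple:
    # Assumed ordering: higher priority wins, higher ID breaks ties.
    return (hello.priority, hello.rbridge_id)


def elect_drb(me: Hello, heard: Iterable[Hello]) -> str:
    """Return the ID of the DRB from this RBridge's point of view.

    Every Hello that was heard takes part, even if the sender does not
    list this RBridge as a neighbor (one-way connectivity).
    """
    return max([me, *heard], key=election_key).rbridge_id


r1 = Hello("R1", priority=64, neighbors=())  # R1 has not heard R2
r2 = Hello("R2", priority=32, neighbors=("R1",))
print(elect_drb(r2, [r1]))  # R1: R2 defers despite one-way connectivity
print(elect_drb(r1, []))    # R1: R1 hears no one and elects itself
```

Both RBridges settle on R1 here, whereas an election that ignored one-way
neighbors would leave R2 believing it was DRB as well.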
>> >> *******************************
>> >> figuring out what the campus-wide minimum MTU size is:
>> >> This will be done based on a TLV in LSPs (which already exists in
>> >> IS-IS -- the originatingLSPBufferSize, TLV 14).
>> >> If that TLV is absent,
>> >> it is the same as requesting "1470". The campus-wide minimum MTU size
>> >> chosen is the smallest size "S" reported in any LSP.
>> >> LSP fragments must not be bigger than S, and links that cannot
>> >> support S will not appear in the topology (meaning, they will not be
>> >> reported in LSPs)
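
A minimal sketch of how S could be derived from the sizes reported in
LSPs, treating a missing TLV as a request for 1470 as stated above. The
function name and the dictionary layout are hypothetical.

```python
DEFAULT_SIZE = 1470  # value assumed when the TLV is absent from an LSP


def campus_min_mtu(reported_sizes: dict) -> int:
    """Return S, the smallest size reported in any LSP.

    `reported_sizes` maps an RBridge ID to the size it advertised, or to
    None if its LSP carried no size TLV.
    """
    return min(
        (DEFAULT_SIZE if size is None else size
         for size in reported_sizes.values()),
        default=DEFAULT_SIZE,
    )


print(campus_min_mtu({"R1": 9000, "R2": None, "R3": 1600}))  # 1470
print(campus_min_mtu({"R1": 9000, "R2": 1400}))              # 1400
```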
>> >> ***************************************************
>> >> testing neighbor-neighbor links to ensure they support "S".
>> >> We will have new messages: MTU probe, and MTU ack. Both are padded to
>> >> size S.
>> >> It will be optional whether and when to send an MTU probe, but
>> >> mandatory to send an ack in response to receipt of an MTU probe. The
>> >> ack is padded to the same size that the probe was padded to, and is
>> >> unicast to the RBridge from which the probe was received. Probes may
>> >> be unicast or multicast. They may be sent periodically (but far less
>> >> frequently than DRB election messages). Or they might be sent only in
>> >> response to an event such as hearing from a new neighbor RBridge.
>> >> Both MTU probes and acks are sent only on the Designated VLAN.
>> >> If R1 fails to get an ack from R2, R1 still reports R2 in its
>> >> neighbor list, but with a flag saying "failed minimum MTU test".
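
A minimal sketch of the probe/ack outcome described above, as a toy
simulation rather than a wire protocol. The per-direction path MTU
arguments and all names are hypothetical; the only rules taken from the
text are that the ack is padded to the probe's size and that a missing ack
flags the neighbor instead of removing it.

```python
from dataclasses import dataclass


@dataclass
class NeighborEntry:
    """Hypothetical neighbor-list entry kept by the probing RBridge."""
    rbridge_id: str
    failed_mtu_test: bool = False


def frame_delivered(size: int, path_mtu: int) -> bool:
    # Toy link model: a frame gets through only if it fits the path MTU.
    return size <= path_mtu


def probe_neighbor(entry: NeighborEntry, s: int,
                   mtu_to_neighbor: int, mtu_from_neighbor: int) -> NeighborEntry:
    """Send one probe padded to S and record whether an ack came back."""
    probe_ok = frame_delivered(s, mtu_to_neighbor)
    # The ack is padded to the same size as the probe it answers.
    ack_ok = probe_ok and frame_delivered(s, mtu_from_neighbor)
    # The neighbor stays in the list; only the flag changes.
    entry.failed_mtu_test = not ack_ok
    return entry


good = probe_neighbor(NeighborEntry("R2"), 1470, 1500, 1500)
bad = probe_neighbor(NeighborEntry("R3"), 1470, 1500, 1400)
print(good.failed_mtu_test)  # False
print(bad.failed_mtu_test)   # True: the padded ack does not fit on the way back
```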
>> >> ****************
>> >> Links that are not 2-way for any reason, including not supporting the
>> >> minimum campus-wide MTU S, are not reported in LSPs.
>> >> That means that if R1 is DRB, and it does not have 2-way connectivity
>> >> to R2, R1 does not list R2 as a neighbor, in the pseudonode LSP.
>> >> R2 does not report a link to the pseudonode.
>> >> If neither R2 nor R3 are DRB, they both have 2-way connectivity to
>> >> the DRB, but not to each other, then they do both report connectivity
>> >> to the pseudonode. However, if R2 receives a packet that needs to be
>> >> forwarded to R3 across that link, R2 sends the packet to the DRB
>> >> instead. (Note:
>> >> This behavior is already specified in IS-IS)
>> >> *******************
>> >> Concern was raised about the size of TRILL-hellos. Might they wind up
>> >> being too big to fit? This concern would apply whether Hellos are
>> >> padded or not.
>> >> For instance, one topology
>> >> people envision is a core that connects hundreds of customer sites
>> >> into a giant Ethernet. The technology that creates the core is
>> >> irrelevant to TRILL, other than having Ethernet-like
>> >> in terms of being multiaccess and supporting multicast.
>> >> In that case, if the customer's Ethernet is running TRILL, the core
>> >> would appear to TRILL to be a giant Ethernet with hundreds of
>> >> neighbors. In other words, all the hundreds of RBridges connected to
>> >> the core would see all of the other RBridges on the core as neighbors
>> >> on this "link".
>> >> IS-IS has carefully designed packet formats for LSPs and CSNPs, so
>> >> that they can be arbitrarily large, transmitted in pieces, with each
>> >> piece being able to be independently processed. For some reason
>> >> though we didn't design Hellos that way. We should take the
>> >> opportunity to design TRILL-Hello messages with that effect.
>> >> There are some things in TRILL-Hellos that don't really need to be
>> >> reported frequently (like which VLANs you support). And some things
>> >> (like the neighbor list) that might wind up being too large to fit
>> >> into a single packet.
>> >> Other information (such as ID and priority) really should go in every
>> >> TRILL-Hello.
>> >> I'd suggest we do two things in the encoding of TRILL-Hellos
>> >> a) figure out which fields can be left out some of the time, and
>> >> specify that those fields, if absent, just mean they are absent, not,
>> >> for instance, that you don't support any VLANs.
>> >> b) for information like the neighbor list that can be arbitrarily
>> >> large, figure out a way of encoding it in the spirit of CSNPs, so
>> >> that partial information can be included.
>> >> For instance, CSNPs say "this CSNP refers to all the LSPs from IDs
>> >> between x and y". We could do the same for the TRILL-Hellos, as in
>> >> "this TRILL-Hello neighbor list includes all neighbors with IDs
>> >> between x and y".
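
A minimal sketch of the CSNP-style idea above: the neighbor list is split
into pieces, each labelled with the ID range it is complete for. The piece
layout, the per-Hello limit, and the function names are all hypothetical;
no actual TRILL-Hello encoding is implied.

```python
def split_neighbor_list(neighbor_ids, max_per_hello: int) -> list:
    """Split a neighbor list into pieces that each cover an ID range.

    Each piece claims to list every neighbor with an ID between its
    range start and range end, inclusive.
    """
    if max_per_hello < 1:
        raise ValueError("max_per_hello must be at least 1")
    ids = sorted(neighbor_ids)
    pieces = []
    for start in range(0, len(ids), max_per_hello):
        chunk = ids[start:start + max_per_hello]
        pieces.append({"range": (chunk[0], chunk[-1]), "neighbors": chunk})
    return pieces


def known_absent(piece: dict, rbridge_id) -> bool:
    """True if the piece covers this ID's range and does not list it."""
    low, high = piece["range"]
    return low <= rbridge_id <= high and rbridge_id not in piece["neighbors"]


pieces = split_neighbor_list([5, 1, 9, 3, 7], max_per_hello=2)
print([p["range"] for p in pieces])  # [(1, 3), (5, 7), (9, 9)]
print(known_absent(pieces[0], 2))    # True: ID 2 is in range but not listed
print(known_absent(pieces[0], 6))    # False: this piece says nothing about ID 6
```

Each piece can be processed on its own, which is the property the text
asks for. In this sketch IDs that fall between two ranges (4 and 8 above)
are covered by no piece, so a real encoding would have to decide how
adjacent ranges meet.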
>> >> ***********************************
>> >> Radia
>> >> _______________________________________________
>> >> rbridge mailing list
>> >> rbridge at postel.org
>> >> http://mailman.postel.org/mailman/listinfo/rbridge
Donald E. Eastlake 3rd +1-508-634-2066 (home)
155 Beaver Street
Milford, MA 01757 USA
d3e3e3 at gmail.com