From Donald.Eastlake at motorola.com Mon Dec 3 14:27:07 2007
From: Donald.Eastlake at motorola.com (Eastlake III Donald-LDE008)
Date: Mon, 3 Dec 2007 17:27:07 -0500
Subject: [rbridge] TRILL Meeting Tomorrow
Message-ID: <3870C46029D1F945B1472F170D2D9790034C8ACB@de01exm64.ds.mot.com>

Hi,

An updated agenda and more, but not all, presentations have been uploaded to the Meeting Materials site https://datatracker.ietf.org/meeting/70/materials.html.

As usual, we will need a secretary to take notes from the meeting. I'm asking in advance so that people will think about this and maybe we'll have a volunteer...

Thanks,
Donald

From eric.gray at ericsson.com Tue Dec 4 10:57:50 2007
From: eric.gray at ericsson.com (Eric Gray)
Date: Tue, 4 Dec 2007 12:57:50 -0600
Subject: [rbridge] (no subject)
Message-ID: <941D5DCD8C42014FAF70FB7424686DCF020EE8A1@eusrcmw721.eamcs.ericsson.se>

Folks,

Here is the problem that occurs when VLAN state is inferred for one VLAN from connectivity provided by another. This is a general problem, but has specific applicability to the current set of assumptions used in the protocol specification. This came up in off-line discussions with Anoop Ghanwani (and others) at an IEEE meeting a couple of weeks ago.

A key thing to understand in looking at this problem is that it is a comparison between how a network works with 802.1Q bridges and how the same network works after some 802.1Q bridges have been replaced with RBridges. The example shows a partial RBridge deployment, and this is compared with how it would have worked with 802.1Q bridges where the example shows RBridges (i.e. - it is an after-the-fact comparison of L2 forwarding functionality).

The network looks like this (initially):

 \\\\|////  \\\\\|||/////    \\\|///
   __|__        __|__         __|__
  | B-1 |      | B-2 |       | B-3 |
  |_____|      |_____|       |_____|
     | \       /      \_____  / |
     |  \_____/       | B-5 |   |
     |  | B-4 |       |_____|   |
     |  |_____|          |      |
   __|__       \         |      |
  | B-6 |_______\ _______|    __|__
  |_____|        \___________| B-7 |
     |                       |_____|
     |                          |
   __|___                     __|___
  | RB-1 |                   | RB-2 |
  |______|                   |______|
          \        _        /
           \__ __(   )___ _/
              \_( Core )_/
             (_ RBridge _)
              (_ Cloud _)
                (_   _)
                  (_)

In this figure, B-1, B-2 and B-3 are aggregation bridges with multiple (lots and lots) of VIDs.

B-4 is a special purpose bridge used for VLAN-A only, and B-5 is a special purpose bridge used for VLAN-B only.

All remaining bridges are configured to participate in VLAN-A, VLAN-B and an arbitrary set of zero or more other VLANs. Since B-4 and B-5 are configured for specific VLANs only, the ports on their adjacent bridging peers are configured only for those VLANs.

To be clear, the links between B-4 and B-7, and between B-6 and B-5 are not connected (they merely overlap in the drawing).

In this network, RB-1 and RB-2 are both RBridges and both have access to VLAN A and VLAN B and each other (via both VLAN A and VLAN B) - as well as the same arbitrary set of zero or more VLANs that any of the other bridges in the drawing have. However, the two RBridges use another VLAN - say VLAN C - to exchange hellos. Under the normal operating conditions intended, this works fine and RB-1 and RB-2 may separately be elected DRB for either of VLAN A or VLAN B.

If bridge B-4 fails, then VLAN A would be segmented, at least temporarily. Hellos continue to work, however, so the 2 RBridges do not discover the partition and the DRB election remains unchanged. In this case, part of the VLAN is orphaned - particularly from the perspective of any locally attached end-stations.

However, this is not acceptable behavior.
No misconfiguration exists and the ideal (and reasonably expected) behavior would be for the RBridges to discover the partition and redo the DRB election - making RB-1 the DRB for one partition and RB-2 the DRB for the other partition.

If (when) RB-1 and RB-2 were 802.1Q bridges, using MSTP for multiple VLANs (in particular, for VLAN A and B), this failure would have resulted in re-running (M)STP for the affected VLAN connectivity and the segmentation (partitioning) would be healed. In the same way, if the status of VLANs A and B were derived directly from messages that use VLAN A and VLAN B (as opposed to using VLAN C), this same robust behavior would occur.

--
Eric Gray
Principal Engineer
Ericsson

From Radia.Perlman at sun.com Tue Dec 4 19:02:45 2007
From: Radia.Perlman at sun.com (Radia Perlman)
Date: Tue, 04 Dec 2007 19:02:45 -0800
Subject: [rbridge] Critical bits for options
Message-ID: <475614D5.9000207@sun.com>

I'd like to make sure the decision about whether to define critical bits for options is made with "informed consent" of the WG. I will write this with no opinion -- just the tradeoffs.

Currently the TRILL spec says to ignore all options -- the only thing the spec says is how to skip the options, if any.

A "critical" option (some people call it "mandatory") is an option that if it appears and you don't understand it, you MUST drop the packet. A noncritical option is one that you are allowed to ignore and skip over.

The way the spec is now precludes critical options, because RBridges following the current spec will skip all options.

An alternative is for TRILL to define two bits at the beginning of the options (these bits only appear if the options length is greater than 0).

The two bits are:
a) a critical hop-by-hop option exists
b) a critical end-to-end option exists.

If we define these bits, then an egress RBridge MUST look to see if either of those bits is set, and if so, parse the options.
A transit RBridge MUST look to see if the hop-by-hop bit is set, and if so, MUST parse the hop-by-hop options.

Though I suppose an RBridge that doesn't support ANY critical options would know based on the presence of a critical option that it should drop the packet, without having to parse to find the critical option.

So, our choices as a WG:

Choice A: Define NOW that the first 2 bits in the option portion, if the option length is nonzero, are those two critical bits. And that if you are forwarding an encapsulated data packet (i.e., you are acting as a transit RBridge for this packet), and the first (hop-by-hop critical) bit is set, you MUST drop the packet. And if you are the egress RBridge, and either of the first two bits is set, you MUST drop the packet.

Choice B: Leave the spec as it is.

The cost of choice A is a bit more complexity, and a bit more overhead of forwarding because of having to set the bit.

The cost of choice B is that we cannot ever define critical options.

The other arguments might be
. how likely is it that we'd ever need critical options? Can we imagine some examples that we'd ever really want/need?
. we could support such things by using a new version TRILL header when including critical options, so we are not really precluding critical options.
. we could advertise (in LSPs for end-to-end critical options, in Hellos for hop-by-hop critical options), support for critical options, and the previous hop RBridge can throw the packet away if there's no way to forward it without the critical option, so we don't really need for the receiving RBridge to discard the packet.
. how bad would it be for an RBridge to ignore a critical option?

So...what do people think?
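
To make Choice A concrete, here is a minimal Python sketch of the drop rule for an RBridge that implements no critical options at all. It is only an illustration: the bit positions (top two bits of the first options byte), the constant names and the function name are assumptions made for this example and are not taken from the TRILL spec.

```python
# Hypothetical layout (an assumption for illustration): the two summary bits
# occupy the two most significant bits of the first byte of the options area.
CRIT_HOP_BY_HOP = 0x80  # "a critical hop-by-hop option exists"
CRIT_END_TO_END = 0x40  # "a critical end-to-end option exists"


def must_drop(options: bytes, is_egress: bool) -> bool:
    """Choice A for an RBridge that understands no critical options.

    options   -- raw options area of the TRILL header (may be empty)
    is_egress -- True if this RBridge decapsulates the frame,
                 False if it is only a transit RBridge
    """
    if not options:
        # Options length is zero, so the summary bits are not present.
        return False
    flags = options[0]
    if is_egress:
        # The egress RBridge looks at both summary bits.
        return bool(flags & (CRIT_HOP_BY_HOP | CRIT_END_TO_END))
    # A transit RBridge only cares about the hop-by-hop bit.
    return bool(flags & CRIT_HOP_BY_HOP)
```

With this layout a transit RBridge never has to walk the option list: one byte test decides between "forward and skip the options" and "drop".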
Radia

From Radia.Perlman at sun.com Tue Dec 4 19:14:40 2007
From: Radia.Perlman at sun.com (Radia Perlman)
Date: Tue, 04 Dec 2007 19:14:40 -0800
Subject: [rbridge] Orphaned endnodes with partitioned VLANs on a cloud
In-Reply-To: <941D5DCD8C42014FAF70FB7424686DCF020EE8A1@eusrcmw721.eamcs.ericsson.se>
References: <941D5DCD8C42014FAF70FB7424686DCF020EE8A1@eusrcmw721.eamcs.ericsson.se>
Message-ID: <475617A0.4010302@sun.com>

Putting in a subject line since Eric didn't.

To restate Eric's concern -- it is possible that VLAN A might be partitioned on a link, and since the DRB selects only a single VLAN-A-forwarder, some VLAN A endnodes on the cloud might get orphaned (since they are on the other side of the partition from the appointed VLAN A forwarder on the link).

And that is correct.

So, what is the alternative? Everything is an engineering tradeoff. We could run DRB elections on every possible VLAN, and with a partitioned VLAN, wind up with multiple DRBs for that VLAN. If there were n VLANs, we'd wind up with n DRBs (one per VLAN), n pseudonodes, n times as many Hello messages, etc.

We chose to consider a partitioned VLAN, caused by bridges configured to not pass data traffic for a particular VLAN, as a misconfiguration. This can happen with bridges anyway, if you mark a bunch of your potential through links as not being allowed for VLAN A transit. The solution in the current spec is low overhead (at most one pseudonode per link, one Hello per RBridge except for the DRB) and relatively simple. Hopefully eventually more bridges will be replaced by RBridges and this sort of thing won't happen.

Radia

Eric Gray wrote:
> Folks,
>
> Here is the problem that occurs when VLAN state is inferred
> for one VLAN from connectivity provided by another. This is a
> general problem, but has specific applicability to the current
> set of assumptions used in the protocol specification.
This came > up in off-line discussions with Anoop Ghanwani (and others) at an > IEEE meeting a couple of weeks ago. > > A key thing to understand in looking at this problem is it > is a comparison between how a network works with 802.1Q bridges > and the same network after some 802.1Q bridges have been replaced > with RBridges. The example shows a partial RBridge deployment > and this is compared with how it will have worked with 802.1Q > bridges where the example shows RBridges (i.e. - it is an after- > the-fact comparison of L2 forwarding functionality). > > The network looks like this (initially): > > \\\\|//// \\\\\|||///// \\\|/// > __|__ __|__ __|__ > | B-1 | | B-2 | | B-3 | > |_____| |_____| |_____| > | \ / \_____ / | > | \_____/ | B-5 | | > | | B-4 | |_____| | > | |_____| | | > __|__ \ | | > | B-6 |_______\ _______| __|__ > |_____| \___________| B-7 | > | |_____| > | | > __|___ __|___ > | RB-1 | | RB-2 | > |______| |______| > \ _ / > \__ __( )___ _/ > \_( Core )_/ > (_ RBridge _) > (_ Cloud _) > (_ _) > (_) > > In this figure, B-1, B-2 and B-3 are aggregation bridges with > multiple (lots and lots) of VIDs. > > B-4 is a special purpose bridge used for VLAN-A only, and B-5 > is a special purpose bridge used for VLAN-B only. > > All remaining bridges are configured to participate in VLAN-A, > VLAN-B and an arbitrary set of zero or more other VLANs. Since B-4 > and B-5 are configured for specific VLANs only, the ports on their > adjacent bridging peers are configured only for those VLANs. > > To be clear, the links between B-4 and B-7, and between B-6 > and B-5 are not connected (they merely overlap in the drawing). > > In this network, RB-1 and RB-2 are both RBridges and both have > access to VLAN A and VLAN B and each other (via both VLAN A and VLAN > B) - as well as the same arbitrary set of zero or more VLANs that > any of the other bridges in the drawing have. However, the two > RBridges use another VLAN - say VLAN C - to exchange hellos. 
Under > the normal operating conditions intended, this works fine and RB-1 > and RB-2 may separately be elected DRB for either of VLAN A or VLAN B. > > If bridge B-4 fails, then VLAN A would be segmented, at least > temporarily. Hellos continue to work, however, so the 2 RBridges > do not discover the partition and the DRB election remains unchanged. > In this case, part of the VLAN is orphaned - particularly from the > perspective of any locally attached end-stations. > > However, this is not acceptable behavior. No misconfiguration > exists and the ideal (and reasonably expected) behavior would be for > the RBridges to discover the partition and redo the DRB election - > making RB-1 the DRB for one partition and RB-2 the DRB for the other > partition. > > If (when) RB-1 and RB-2 were 802.1Q bridges, using MSTP for > multiple VLANs (in particular, for VLAN A and B), this failure will > have resulted in re-running (M)STP for the affected VLAN connectivity > and the segmentation (partitioning) would be healed. In the same way, > if the status of VLAN's A and B were derived directly from messages > that use VLAN A and VLAN B (as opposed to using VLAN C), this same > robust behavior would occur. > > > -- > Eric Gray > Principal Engineer > Ericsson > > _______________________________________________ > rbridge mailing list > rbridge at postel.org > http://mailman.postel.org/mailman/listinfo/rbridge > From touch at ISI.EDU Tue Dec 4 20:23:20 2007 From: touch at ISI.EDU (Joe Touch) Date: Tue, 04 Dec 2007 20:23:20 -0800 Subject: [rbridge] Critical bits for options In-Reply-To: <475614D5.9000207@sun.com> References: <475614D5.9000207@sun.com> Message-ID: <475627B8.10603@isi.edu> Radia Perlman wrote: > I'd like to make sure the decision about whether to define critical bits > for options is made with "informed consent" of the WG. I will write > this with no opinion -- just the tradeoffs. 
> > Currently the TRILL spec says to ignore all options -- the only thing > the spec says is how to skip the options, if any. > > A "critical" option (some people call it "mandatory") is an option that > if it appears and you don't understand it, you MUST drop > the packet. A noncritical option is one that you are allowed to ignore > and skip over. > > The way the spec is now precludes critical options, because RBridges > following the current spec will skip all options. > > An alternative is for TRILL to define two bits at the beginning of the > options (these bits only appear if the options length is > greater than 0). > > The two bits are: > a) a critical hop-by-hop option exists > b) a critical end-to-end option exists. > > If we define these bits, then an egress RBridge MUST look to see if > either of those bits are set, and if so, parse the options. I believe we're talking about a "summary" bit in each case. As a result, the goal of these bits is to avoid parsing the options in detail. There is benefit ONLY for hop-by-hop bits, i.e., to accelerate forwarding. Egress rbridges would need to parse the entire header anyway. That said, I'm not sure this is the right meaning for these bits. They are useful as a summary ONLY if they indicate whether "CAN IGNORE" in fast path rbridges. Let's presume that's NOT what the bit means, i.e., the bit means there is a critical option (as defined above). The rbridge still needs to parse the entire option set to find out WHICH options are listed, and to decide whether each is supported or not anyway. > Though I suppose an RBridge that doesn't support ANY critical options > would know based on the presence of a critical > option that it should drop the packet, without having to parse to find > the critical option. That works ONLY for the case where an rbridge supports NO critical options. If it supports any, then it needs to parse the entire header anyway. Is that really worth this? 
Joe

From Donald.Eastlake at motorola.com Tue Dec 4 23:58:30 2007
From: Donald.Eastlake at motorola.com (Eastlake III Donald-LDE008)
Date: Wed, 5 Dec 2007 02:58:30 -0500
Subject: [rbridge] Vancouver Presentations on line
Message-ID: <3870C46029D1F945B1472F170D2D97900350C7A2@de01exm64.ds.mot.com>

The as-presented versions of all the TRILL presentations have been uploaded to the Meeting Materials site.

Donald

From james.d.carlson at Sun.COM Wed Dec 5 06:14:23 2007
From: james.d.carlson at Sun.COM (James Carlson)
Date: Wed, 5 Dec 2007 09:14:23 -0500
Subject: [rbridge] Critical bits for options
In-Reply-To: <475614D5.9000207@sun.com>
References: <475614D5.9000207@sun.com>
Message-ID: <18262.45631.542916.231584@gargle.gargle.HOWL>

Radia Perlman writes:
> Currently the TRILL spec says to ignore all options -- the only thing
> the spec says is how to skip the options, if any.

I think the part that's actually broken here is not the lack of some summary bits for the option list, but the fact that we haven't defined the basic option format. We should do that, and not just leave it to some "later documents."

*Then* we can say that transit RBridges must obey the critical hop-by-hop options, and the egress RBridge must obey all critical options. (Where "obey" means "process if you understand, drop if you don't.")

Yes, that means that all option creators need to weigh carefully the costs of creating a new option (in terms of slow-path expense in all systems) versus the benefit the option claims to provide. I think that's a good thing, and the warning in section 3.5 makes it pretty clear.

--
James Carlson, Solaris Networking
Sun Microsystems / 35 Network Drive        71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677

From touch at ISI.EDU Wed Dec 5 09:24:56 2007
From: touch at ISI.EDU (Joe Touch)
Date: Wed, 05 Dec 2007 09:24:56 -0800
Subject: [rbridge] Critical bits for options
In-Reply-To: <18262.45631.542916.231584@gargle.gargle.HOWL>
References: <475614D5.9000207@sun.com> <18262.45631.542916.231584@gargle.gargle.HOWL>
Message-ID: <4756DEE8.3030504@isi.edu>

A few lessons from IP (which we may or may not want to emulate):

--------------------------
V6 rules:

- hop-by-hop options MUST come before E2E ones

- unrecognized options have 4 variants indicated via flags:
    - silent pass (skip over silently)
    - silent discard
    - discard w/ICMP
    - discard w/ICMP iff destination != multicast

- an additional per-option flag indicates:
    - immutable en-route
    - mutable en-route (i.e., ignore in Auth)

--------------------------
V4 rules:

- unrecognized options
    - silent pass

-------------------------

I'll note that IPv6 doesn't have a summary bit about critical HBH options. I don't see why we would anticipate being more complex than IPv6 in this regard.

IMO, the IPv6 rules are probably sufficient to use as-is.

Joe

From eric.gray at ericsson.com Wed Dec 5 10:41:41 2007
From: eric.gray at ericsson.com (Eric Gray)
Date: Wed, 5 Dec 2007 12:41:41 -0600
Subject: [rbridge] Orphaned endnodes with partitioned VLANs on a cloud
In-Reply-To: <475617A0.4010302@sun.com>
References: <941D5DCD8C42014FAF70FB7424686DCF020EE8A1@eusrcmw721.eamcs.ericsson.se> <475617A0.4010302@sun.com>
Message-ID: <941D5DCD8C42014FAF70FB7424686DCF0211FBD6@eusrcmw721.eamcs.ericsson.se>

Radia,

I've been sort of wondering what the point is in having a subject line anyway. After all, the subject surely must be obvious from the text in the message, right?

:-)

Thanks!

--
Eric Gray
Principal Engineer
Ericsson

> -----Original Message-----
> From: Radia Perlman [mailto:Radia.Perlman at sun.com]
> Sent: Tuesday, December 04, 2007 10:15 PM
> To: Eric Gray
> Cc: Developing a hybrid router/bridge.
> Subject: Orphaned endnodes with partitioned VLANs on a cloud
> Importance: High
>
> Putting in a subject line since Eric didn't.
>
> To restate Eric's concern -- it is possible that VLAN A might be
> partitioned on a link, and since the DRB selects only
> a single VLAN-A-forwarder, some VLAN A endnodes on the cloud
> might get orphaned (since they are on the
> other side of the partition from the appointed VLAN A
> forwarder on the link).
>
> And that is correct.
>
> So, what is the alternative? Everything is an engineering tradeoff.
> We could run DRB elections on every possible VLAN, and with a
> partitioned VLAN, wind up with multiple DRBs for
> that VLAN. If there were n VLANs, we'd wind up with n DRBs (one per
> VLAN), n pseudonodes, n times as many Hello
> messages, etc.
>
> We chose to consider a partitioned VLAN, caused by bridges
> configured to not pass data traffic for a particular VLAN,
> as a misconfiguration.
This can happen with bridges anyway, if > you mark a > bunch of your potential through links as not > being allowed for VLAN A transit. The solution in the current spec is > low overhead (at most one pseudonode per link, > one Hello per RBridge except for the DRB) and relatively simple. > Hopefully eventually more bridges will be replaced by > RBridges and this sort of thing won't happen. > > Radia > > > Eric Gray wrote: > > Folks, > > > > Here is the problem that occurs when VLAN state is inferred > > for one VLAN from connectivity provided by another. This is a > > general problem, but has specific applicability to the current > > set of assumptions used in the protocol specification. This came > > up in off-line discussions with Anoop Ghanwani (and others) at an > > IEEE meeting a couple of weeks ago. > > > > A key thing to understand in looking at this problem is it > > is a comparison between how a network works with 802.1Q bridges > > and the same network after some 802.1Q bridges have been replaced > > with RBridges. The example shows a partial RBridge deployment > > and this is compared with how it will have worked with 802.1Q > > bridges where the example shows RBridges (i.e. - it is an after- > > the-fact comparison of L2 forwarding functionality). > > > > The network looks like this (initially): > > > > \\\\|//// \\\\\|||///// \\\|/// > > __|__ __|__ __|__ > > | B-1 | | B-2 | | B-3 | > > |_____| |_____| |_____| > > | \ / \_____ / | > > | \_____/ | B-5 | | > > | | B-4 | |_____| | > > | |_____| | | > > __|__ \ | | > > | B-6 |_______\ _______| __|__ > > |_____| \___________| B-7 | > > | |_____| > > | | > > __|___ __|___ > > | RB-1 | | RB-2 | > > |______| |______| > > \ _ / > > \__ __( )___ _/ > > \_( Core )_/ > > (_ RBridge _) > > (_ Cloud _) > > (_ _) > > (_) > > > > In this figure, B-1, B-2 and B-3 are aggregation bridges with > > multiple (lots and lots) of VIDs. 
> > > > B-4 is a special purpose bridge used for VLAN-A only, and B-5 > > is a special purpose bridge used for VLAN-B only. > > > > All remaining bridges are configured to participate in VLAN-A, > > VLAN-B and an arbitrary set of zero or more other VLANs. Since B-4 > > and B-5 are configured for specific VLANs only, the ports on their > > adjacent bridging peers are configured only for those VLANs. > > > > To be clear, the links between B-4 and B-7, and between B-6 > > and B-5 are not connected (they merely overlap in the drawing). > > > > In this network, RB-1 and RB-2 are both RBridges and both have > > access to VLAN A and VLAN B and each other (via both VLAN A > and VLAN > > B) - as well as the same arbitrary set of zero or more VLANs that > > any of the other bridges in the drawing have. However, the two > > RBridges use another VLAN - say VLAN C - to exchange hellos. Under > > the normal operating conditions intended, this works fine and RB-1 > > and RB-2 may separately be elected DRB for either of VLAN A > or VLAN B. > > > > If bridge B-4 fails, then VLAN A would be segmented, at least > > temporarily. Hellos continue to work, however, so the 2 RBridges > > do not discover the partition and the DRB election remains > unchanged. > > In this case, part of the VLAN is orphaned - particularly from the > > perspective of any locally attached end-stations. > > > > However, this is not acceptable behavior. No misconfiguration > > exists and the ideal (and reasonably expected) behavior > would be for > > the RBridges to discover the partition and redo the DRB election - > > making RB-1 the DRB for one partition and RB-2 the DRB for > the other > > partition. > > > > If (when) RB-1 and RB-2 were 802.1Q bridges, using MSTP for > > multiple VLANs (in particular, for VLAN A and B), this failure will > > have resulted in re-running (M)STP for the affected VLAN > connectivity > > and the segmentation (partitioning) would be healed. 
In > the same way, > > if the status of VLAN's A and B were derived directly from messages > > that use VLAN A and VLAN B (as opposed to using VLAN C), this same > > robust behavior would occur. > > > > > > -- > > Eric Gray > > Principal Engineer > > Ericsson > > > > _______________________________________________ > > rbridge mailing list > > rbridge at postel.org > > http://mailman.postel.org/mailman/listinfo/rbridge > > > > From eric.gray at ericsson.com Wed Dec 5 11:19:38 2007 From: eric.gray at ericsson.com (Eric Gray) Date: Wed, 5 Dec 2007 13:19:38 -0600 Subject: [rbridge] Orphaned endnodes with partitioned VLANs on a cloud In-Reply-To: <475617A0.4010302@sun.com> References: <941D5DCD8C42014FAF70FB7424686DCF020EE8A1@eusrcmw721.eamcs.ericsson.se> <475617A0.4010302@sun.com> Message-ID: <941D5DCD8C42014FAF70FB7424686DCF0211FC48@eusrcmw721.eamcs.ericsson.se> Radia, I don't think this is either as hard to solve as you make it, nor as easy to ignore as it seems you would like us to think. From what you say, we have apparently decided in the protocol specification that we're going to declare something to be a misconfiguration once we replace one or more 802.1Q bridges with RBridges on a one-for-one basis - that was not a misconfiguration previously - simply because doing so makes it easier to do the protocol specification. Frankly, in my opinion, that is an excellent example of why that type of decision is not usually made in the protocol specification process - except maybe as a last resort (clearly nobody expects a protocol specification to do the impossible). So, to be clear, what you're proposing to do is to make yet another decision not to support existing L2 forwarding capabilities with RBridges in the interest of expediency. Is this decision explicitly supported somewhere by WG consensus? Or is this yet another thing that we're going to let slip through the cracks? 
To show that this is not really as difficult as you make it sound, let's look at something Erik said during the meeting. Erik said that this situation would be a mis-configuration if there was a single spanning tree for all VLANs anyway. Yes, that is true, but one of the things that a lot of the people in the TRILL working group are apparently unaware of is the fact that Multiple Spanning Trees (MSTP - or 802.1S at one time) is now a part of the base 802.1Q specification. While this does not guarantee that multiple spanning trees would actually be configured - and it never guaranteed that there would be a spanning tree for each VLAN - it does allow VLAN's to be correctly configured as I showed in my figure earlier (see in the forwarded text below). To allow a similar level of support in RBridges, it is only necessary to _allow_ per VLAN (or per-VLAN group - though that is a separate, and probably orthogonal, issue to this example) IS-IS peering - which the protocol specification now seems to explicitly forbid. In the example, VLAN A and VLAN B IS-IS hello messages being propagated via VLAN A and VLAN B explicitly, with all other VLAN IS-IS hello messages using a single VLAN (VLAN C, for instance) is all that would be required to correctly support this scenario - and achieve the scalability that people apparently are concerned about. In fact, a more general statement can be made. It will be the case generally that it should be possible to support some number of per-VLAN (or VLAN-group) IS-IS peering (or at least IS-IS hello messaging) relationships as needed to support a topology in which specific links are excluded (by configuration) from VLAN used for the default VLAN group peering/hello-messaging. In order for a configuration to require any number N of distinct peering relationships on a per-VLAN, or per VLAN group, basis - it is necessary to have N or more distinct links that exclude one or more VLANs from each of all of other relationships. 
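
The effect of choosing which VLAN carries the hellos can be sketched in a few lines of Python. This is a toy model, not the election procedure from the TRILL spec: the function name, the priority values and the tie-break on name are all assumptions made for illustration. Each group of RBridges that can still hear each other's hellos elects exactly one DRB.

```python
def elect_drbs(priorities, hello_adjacencies):
    """Return one DRB per group of RBridges that can exchange hellos.

    priorities        -- dict: RBridge name -> priority (hypothetical values)
    hello_adjacencies -- iterable of (a, b) pairs that currently hear
                         each other's hellos on the VLAN used for hellos
    """
    neighbors = {rb: set() for rb in priorities}
    for a, b in hello_adjacencies:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen, drbs = set(), []
    for rb in sorted(priorities):
        if rb in seen:
            continue
        # Collect everything reachable from rb via hello adjacencies.
        stack, group = [rb], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbors[node] - group)
        seen |= group
        # Assumed rule: highest priority wins, name breaks ties.
        drbs.append(max(group, key=lambda r: (priorities[r], r)))
    return sorted(drbs)


priorities = {"RB-1": 10, "RB-2": 20}

# Hellos sent on VLAN C still get through after B-4 fails:
print(elect_drbs(priorities, [("RB-1", "RB-2")]))  # ['RB-2']

# Hellos sent on VLAN A itself stop getting through after B-4 fails:
print(elect_drbs(priorities, []))                  # ['RB-1', 'RB-2']
```

In the first call the partition of VLAN A is invisible and a single DRB remains; in the second call the loss of VLAN A hellos splits the RBridges into two groups and each side gets its own DRB, which is the behavior argued for above.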

Without spending too much energy or thought on this, it is obvious (or not very difficult to realize) that the complexity of the peering relationships is then bounded by the complexity of the configured topology. This argues that a customer would have to have really good reasons to want to make this very complex.

And, the absolute upper limit on the peering complexity is directly analogous to the complexity of "peering" in MSTP - i.e. - if MSTP implementations currently only support a relatively small number of separate spanning trees, then RBridges would not be expected to necessarily support a larger number of peering relationships.

Based on all of this, I would argue that the problem is not hard to solve. Also, the cost of not solving it is not small.

--
Eric Gray
Principal Engineer
Ericsson

> -----Original Message-----
> From: Radia Perlman [mailto:Radia.Perlman at sun.com]
> Sent: Tuesday, December 04, 2007 10:15 PM
> To: Eric Gray
> Cc: Developing a hybrid router/bridge.
> Subject: Orphaned endnodes with partitioned VLANs on a cloud
> Importance: High
>
> Putting in a subject line since Eric didn't.
>
> To restate Eric's concern -- it is possible that VLAN A might be
> partitioned on a link, and since the DRB selects only
> a single VLAN-A-forwarder, some VLAN A endnodes on the cloud
> might get orphaned (since they are on the
> other side of the partition from the appointed VLAN A
> forwarder on the link).
>
> And that is correct.
>
> So, what is the alternative? Everything is an engineering tradeoff.
> We could run DRB elections on every possible VLAN, and with a
> partitioned VLAN, wind up with multiple DRBs for
> that VLAN. If there were n VLANs, we'd wind up with n DRBs (one per
> VLAN), n pseudonodes, n times as many Hello
> messages, etc.
>
> We chose to consider a partitioned VLAN, caused by bridges
> configured to not pass data traffic for a particular VLAN,
> as a misconfiguration.
This can happen with bridges anyway, if > you mark a > bunch of your potential through links as not > being allowed for VLAN A transit. The solution in the current spec is > low overhead (at most one pseudonode per link, > one Hello per RBridge except for the DRB) and relatively simple. > Hopefully eventually more bridges will be replaced by > RBridges and this sort of thing won't happen. > > Radia > > > Eric Gray wrote: > > Folks, > > > > Here is the problem that occurs when VLAN state is inferred > > for one VLAN from connectivity provided by another. This is a > > general problem, but has specific applicability to the current > > set of assumptions used in the protocol specification. This came > > up in off-line discussions with Anoop Ghanwani (and others) at an > > IEEE meeting a couple of weeks ago. > > > > A key thing to understand in looking at this problem is it > > is a comparison between how a network works with 802.1Q bridges > > and the same network after some 802.1Q bridges have been replaced > > with RBridges. The example shows a partial RBridge deployment > > and this is compared with how it will have worked with 802.1Q > > bridges where the example shows RBridges (i.e. - it is an after- > > the-fact comparison of L2 forwarding functionality). > > > > The network looks like this (initially): > > > > \\\\|//// \\\\\|||///// \\\|/// > > __|__ __|__ __|__ > > | B-1 | | B-2 | | B-3 | > > |_____| |_____| |_____| > > | \ / \_____ / | > > | \_____/ | B-5 | | > > | | B-4 | |_____| | > > | |_____| | | > > __|__ \ | | > > | B-6 |_______\ _______| __|__ > > |_____| \___________| B-7 | > > | |_____| > > | | > > __|___ __|___ > > | RB-1 | | RB-2 | > > |______| |______| > > \ _ / > > \__ __( )___ _/ > > \_( Core )_/ > > (_ RBridge _) > > (_ Cloud _) > > (_ _) > > (_) > > > > In this figure, B-1, B-2 and B-3 are aggregation bridges with > > multiple (lots and lots) of VIDs. 
> > > > B-4 is a special purpose bridge used for VLAN-A only, and B-5 > > is a special purpose bridge used for VLAN-B only. > > > > All remaining bridges are configured to participate in VLAN-A, > > VLAN-B and an arbitrary set of zero or more other VLANs. Since B-4 > > and B-5 are configured for specific VLANs only, the ports on their > > adjacent bridging peers are configured only for those VLANs. > > > > To be clear, the links between B-4 and B-7, and between B-6 > > and B-5 are not connected (they merely overlap in the drawing). > > > > In this network, RB-1 and RB-2 are both RBridges and both have > > access to VLAN A and VLAN B and each other (via both VLAN A > and VLAN > > B) - as well as the same arbitrary set of zero or more VLANs that > > any of the other bridges in the drawing have. However, the two > > RBridges use another VLAN - say VLAN C - to exchange hellos. Under > > the normal operating conditions intended, this works fine and RB-1 > > and RB-2 may separately be elected DRB for either of VLAN A > or VLAN B. > > > > If bridge B-4 fails, then VLAN A would be segmented, at least > > temporarily. Hellos continue to work, however, so the 2 RBridges > > do not discover the partition and the DRB election remains > unchanged. > > In this case, part of the VLAN is orphaned - particularly from the > > perspective of any locally attached end-stations. > > > > However, this is not acceptable behavior. No misconfiguration > > exists and the ideal (and reasonably expected) behavior > would be for > > the RBridges to discover the partition and redo the DRB election - > > making RB-1 the DRB for one partition and RB-2 the DRB for > the other > > partition. > > > > If (when) RB-1 and RB-2 were 802.1Q bridges, using MSTP for > > multiple VLANs (in particular, for VLAN A and B), this failure will > > have resulted in re-running (M)STP for the affected VLAN > connectivity > > and the segmentation (partitioning) would be healed. 
In > the same way, > > if the status of VLAN's A and B were derived directly from messages > > that use VLAN A and VLAN B (as opposed to using VLAN C), this same > > robust behavior would occur. > > > > > > -- > > Eric Gray > > Principal Engineer > > Ericsson > > > > _______________________________________________ > > rbridge mailing list > > rbridge at postel.org > > http://mailman.postel.org/mailman/listinfo/rbridge > > > > From eric.gray at ericsson.com Wed Dec 5 11:24:55 2007 From: eric.gray at ericsson.com (Eric Gray) Date: Wed, 5 Dec 2007 13:24:55 -0600 Subject: [rbridge] Orphaned endnodes with partitioned VLANs on a cloud In-Reply-To: <4C94DE2070B172459E4F1EE14BD2364EB4EF13@HQ-EXCH-5.corp.brocade.com> References: <941D5DCD8C42014FAF70FB7424686DCF020EE8A1@eusrcmw721.eamcs.ericsson.se> <475617A0.4010302@sun.com> <4C94DE2070B172459E4F1EE14BD2364EB4EF13@HQ-EXCH-5.corp.brocade.com> Message-ID: <941D5DCD8C42014FAF70FB7424686DCF0211FC57@eusrcmw721.eamcs.ericsson.se> Anoop/Radia, This is not exactly the note on which we left off in the off-line discussion. As I recall (isn't E-Mail great!), I replied to this with the following: ============================================================ If the RBridges were inserted to replace 802.1Q bridges, and each was configured as I've described for access to both VLAN A and B, then the bridge failure I described would simply result in a different spanning tree for VLAN A. VLAN A would not be broken, it would simply be (potentially) less optimal. ============================================================ This was specifically in response to your comment (to the effect that this wouldn't have worked in an 802.1Q LAN - so it would have been a misconfiguration there as well). In response to the above observation you said that you agreed that this would not have been a misconfiguration error in that case. Actually, what you said exactly was "Yes, you are right." More on this in a separate mail message response to Radia...
Thanks! -- Eric Gray Principal Engineer Ericsson > -----Original Message----- > From: Anoop Ghanwani [mailto:aghanwan at brocade.com] > Sent: Tuesday, December 04, 2007 11:27 PM > To: Radia Perlman; Eric Gray > Cc: Developing a hybrid router/bridge. > Subject: RE: [rbridge] Orphaned endnodes with partitioned > VLANs on a cloud > Importance: High > > > Radia, > > This is what I told Eric during the offline discussion > as well - that this is a misconfiguration and as long > as bad things aren't happening, it's up to the administrator > to configure things so that they work correctly when > such failures happen. > > It is no worse than what can happen in a misconfigured > bridged network. > > Anoop > > > -----Original Message----- > > From: rbridge-bounces at postel.org > > [mailto:rbridge-bounces at postel.org] On Behalf Of Radia Perlman > > Sent: Tuesday, December 04, 2007 7:15 PM > > To: Eric Gray > > Cc: Developing a hybrid router/bridge. > > Subject: [rbridge] Orphaned endnodes with partitioned VLANs > on a cloud > > > > Putting in a subject line since Eric didn't. > > > > To restate Eric's concern -- it is possible that VLAN A might > > be partitioned on a link, and since the DRB selects only a > > single VLAN-A-forwarder, some VLAN A endnodes on the cloud > > might get orphaned (since they are on the other side of the > > partition from the appointed VLAN A forwarder on the link). > > > > And that is correct. > > > > So, what is the alternative? Everything is an engineering tradeoff. > > We could run DRB elections on every possible VLAN, and with a > > partitioned VLAN, wind up with multiple DRBs for that VLAN. > > If there were n VLANs, we'd wind up with n DRBs (one per > > VLAN), n pseudonodes, n times as many Hello messages, etc. > > > > We chose to consider a partitioned VLAN, caused by bridges > > configured to not pass data traffic for a particular VLAN, as > > a misconfiguration.
This can happen with bridges anyway, if > > you mark a bunch of your potential through links as not being > > allowed for VLAN A transit. The solution in the current spec > > is low overhead (at most one pseudonode per link, one Hello > > per RBridge except for the DRB) and relatively simple. > > Hopefully eventually more bridges will be replaced by > > RBridges and this sort of thing won't happen. > > > > Radia > > > > > > Eric Gray wrote: > > > Folks, > > > > > > Here is the problem that occurs when VLAN state is > > inferred for one > > > VLAN from connectivity provided by another. This is a general > > > problem, but has specific applicability to the current set of > > > assumptions used in the protocol specification. This came up in > > > off-line discussions with Anoop Ghanwani (and others) at an IEEE > > > meeting a couple of weeks ago. > > > > > > A key thing to understand in looking at this problem is it is a > > > comparison between how a network works with 802.1Q > bridges and the > > > same network after some 802.1Q bridges have been replaced with > > > RBridges. The example shows a partial RBridge deployment > > and this is > > > compared with how it will have worked with 802.1Q bridges > where the > > > example shows RBridges (i.e. - it is an after- the-fact > > comparison of > > > L2 forwarding functionality). 
> > > > > > The network looks like this (initially): > > > > > > \\\\|//// \\\\\|||///// \\\|/// > > > __|__ __|__ __|__ > > > | B-1 | | B-2 | | B-3 | > > > |_____| |_____| |_____| > > > | \ / \_____ / | > > > | \_____/ | B-5 | | > > > | | B-4 | |_____| | > > > | |_____| | | > > > __|__ \ | | > > > | B-6 |_______\ _______| __|__ > > > |_____| \___________| B-7 | > > > | |_____| > > > | | > > > __|___ __|___ > > > | RB-1 | | RB-2 | > > > |______| |______| > > > \ _ / > > > \__ __( )___ _/ > > > \_( Core )_/ > > > (_ RBridge _) > > > (_ Cloud _) > > > (_ _) > > > (_) > > > > > > In this figure, B-1, B-2 and B-3 are aggregation bridges with > > > multiple (lots and lots) of VIDs. > > > > > > B-4 is a special purpose bridge used for VLAN-A only, > > and B-5 is a > > > special purpose bridge used for VLAN-B only. > > > > > > All remaining bridges are configured to participate in > > VLAN-A, VLAN-B > > > and an arbitrary set of zero or more other VLANs. Since > > B-4 and B-5 > > > are configured for specific VLANs only, the ports on > their adjacent > > > bridging peers are configured only for those VLANs. > > > > > > To be clear, the links between B-4 and B-7, and between > > B-6 and B-5 > > > are not connected (they merely overlap in the drawing). > > > > > > In this network, RB-1 and RB-2 are both RBridges and > > both have access > > > to VLAN A and VLAN B and each other (via both VLAN A and VLAN > > > B) - as well as the same arbitrary set of zero or more > > VLANs that any > > > of the other bridges in the drawing have. However, the two > > RBridges > > > use another VLAN - say VLAN C - to exchange hellos. Under > > the normal > > > operating conditions intended, this works fine and RB-1 and > > RB-2 may > > > separately be elected DRB for either of VLAN A or VLAN B. > > > > > > If bridge B-4 fails, then VLAN A would be segmented, at least > > > temporarily. 
Hellos continue to work, however, so the 2 > > RBridges do > > > not discover the partition and the DRB election remains unchanged. > > > In this case, part of the VLAN is orphaned - particularly > from the > > > perspective of any locally attached end-stations. > > > > > > However, this is not acceptable behavior. No > > misconfiguration exists > > > and the ideal (and reasonably expected) behavior would be for the > > > RBridges to discover the partition and redo the DRB > > election - making > > > RB-1 the DRB for one partition and RB-2 the DRB for the other > > > partition. > > > > > > If (when) RB-1 and RB-2 were 802.1Q bridges, using MSTP > > for multiple > > > VLANs (in particular, for VLAN A and B), this failure will have > > > resulted in re-running (M)STP for the affected VLAN > > connectivity and > > > the segmentation (partitioning) would be healed. In the > > same way, if > > > the status of VLAN's A and B were derived directly from > > messages that > > > use VLAN A and VLAN B (as opposed to using VLAN C), this > > same robust > > > behavior would occur. 
> > > > > > > > > -- > > > Eric Gray > > > Principal Engineer > > > Ericsson > > > > > > _______________________________________________ > > > rbridge mailing list > > > rbridge at postel.org > > > http://mailman.postel.org/mailman/listinfo/rbridge > > > > > > > _______________________________________________ > > rbridge mailing list > > rbridge at postel.org > > http://mailman.postel.org/mailman/listinfo/rbridge > > > From Donald.Eastlake at motorola.com Wed Dec 5 12:52:32 2007 From: Donald.Eastlake at motorola.com (Eastlake III Donald-LDE008) Date: Wed, 5 Dec 2007 15:52:32 -0500 Subject: [rbridge] Critical bits for options In-Reply-To: <475627B8.10603@isi.edu> References: <475614D5.9000207@sun.com> <475627B8.10603@isi.edu> Message-ID: <3870C46029D1F945B1472F170D2D97900350CADC@de01exm64.ds.mot.com> Hi Joe, See comments below at @@@ -----Original Message----- From: rbridge-bounces at postel.org [mailto:rbridge-bounces at postel.org] On Behalf Of Joe Touch Sent: Tuesday, December 04, 2007 11:23 PM To: Radia Perlman Cc: rbridge at postel.org Subject: Re: [rbridge] Critical bits for options Radia Perlman wrote: > I'd like to make sure the decision about whether to define critical bits > for options is made with "informed consent" of the WG. I will write > this with no opinion -- just the tradeoffs. > > Currently the TRILL spec says to ignore all options -- the only thing > the spec says is how to skip the options, if any. > > A "critical" option (some people call it "mandatory") is an option that > if it appears and you don't understand it, you MUST drop > the packet. A noncritical option is one that you are allowed to ignore > and skip over. > > The way the spec is now precludes critical options, because RBridges > following the current spec will skip all options. > > An alternative is for TRILL to define two bits at the beginning of the > options (these bits only appear if the options length is > greater than 0). 
> > The two bits are: > a) a critical hop-by-hop option exists > b) a critical end-to-end option exists. > > If we define these bits, then an egress RBridge MUST look to see if > either of those bits are set, and if so, parse the options. I believe we're talking about a "summary" bit in each case. As a result, the goal of these bits is to avoid parsing the options in detail. @@@ Yes, "summary" is a reasonable description of these bits. And, if one is trying to postpone defining the structure of options in detail, as the current protocol draft does, making it possible to avoid parsing them seems like a good idea :-) There is benefit ONLY for hop-by-hop bits, i.e., to accelerate forwarding. Egress rbridges would need to parse the entire header anyway. @@@ If by "parse the entire header" you mean look at each option individually, I disagree. If all options present are non-critical, the egress Rbridge, like all Rbridges, need not look at any of them. @@@ If by "parse the entire header" you mean examine stuff beyond the TRILL Header, the egress Rbridge has to do that to decapsulate but transit Rbridges have to do this also. While there are lots of other reasons to look beyond the TRILL header for various cases of multi-destination tree pruning, the most general reason is to get the frame priority from the inner VLAN tag. This is necessary even for known unicast frames for robustness since the outer VLAN tag may have been stripped. However, skipping over the options without parsing them is easy since their length is given and, if you know there were no critical options present, safe. That said, I'm not sure this is the right meaning for these bits. They are useful as a summary ONLY if they indicate whether "CAN IGNORE" in fast path rbridges. Let's presume that's NOT what the bit means, i.e., the bit means there is a critical option (as defined above). 
The rbridge still needs to parse the entire option set to find out WHICH options are listed, and to decide whether each is supported or not anyway. @@@ Well, I expect that many Rbridges will, for a long time, not implement any options, so they clearly wouldn't need to parse them if these summary bits were available. Even with Rbridges that do support some options, I would expect the more detailed TRILL option feature specification to require that all hop-by-hop options to precede all end-to-end options (and perhaps, within those categories, to have critical options precede non-critical options) so a transit Rbridge could stop parsing as soon as it hits an end-to-end option. You could also add a summary bit for the existence of non-critical hop-by-hop options so a transit Rbridge would know if all the options present are end-to-end. > Though I suppose an RBridge that doesn't support ANY critical options > would know based on the presence of a critical > option that it should drop the packet, without having to parse to find > the critical option. That works ONLY for the case where an rbridge supports NO critical options. If it supports any, then it needs to parse the entire header anyway. Is that really worth this? @@@ Transit Rbridges that support no hop-by-hop options, critical or non-critical, are expected to be very common. @@@ It is also good to keep in mind that, at least for unicast, these summary bits are probably just a safety measure. That's because I would expect that Rbridges will be required to announce in their LSP what options they support (at least for critical options and probably for all). Thus, by imposing requirements on the sender, you could claim that no unicast frame with a critical option should ever be sent to an Rbridge that does not support it. However, in the real world, errors of all types happen. 
If, say, a transit Rbridge that supported no options did receive a frame with a critical hop-by-hop option in it and you did not have these summary bits (or they were set wrong), it will screw up. But there is no way in the real world to guarantee that something like that could never happen. @@@ An argument in favor of the bits is that you might have a case where you are forwarding a multi-destination frame down a distribution tree from Rbridge Z and out one port there are three downstream Rbridges one hop away, A, B, and C, where the frame has a critical option that you know from the link state database is supported by A and B but not C. It might be nice, for some options, to be able to just multicast the frame knowing that C will just discard it. This would be a decision by RBridge Z (which must understand the option or it would not have created or successfully received the frame). Joe @@@ Thanks, @@@ Donald From erik.nordmark at sun.com Wed Dec 5 18:16:25 2007 From: erik.nordmark at sun.com (Erik Nordmark) Date: Wed, 05 Dec 2007 18:16:25 -0800 Subject: [rbridge] FYI: L2VPN presentation Message-ID: <47575B79.9070702@sun.com> Since your charter says we should work with L2VPN Donald put together a few slides and I presented it. It was past the end of the time slot so there wasn't time for any feedback, but we'll pursue feedback with the L2VPN chairs. The presentation is at http://www3.ietf.org/proceedings/07dec/slides/l2vpn-4.ppt Erik From touch at ISI.EDU Wed Dec 5 23:45:19 2007 From: touch at ISI.EDU (Joe Touch) Date: Wed, 05 Dec 2007 23:45:19 -0800 Subject: [rbridge] Critical bits for options In-Reply-To: <3870C46029D1F945B1472F170D2D97900350CADC@de01exm64.ds.mot.com> References: <475614D5.9000207@sun.com> <475627B8.10603@isi.edu> <3870C46029D1F945B1472F170D2D97900350CADC@de01exm64.ds.mot.com> Message-ID: <4757A88F.90100@isi.edu> Eastlake III Donald-LDE008 wrote: > Hi Joe, > > See comments below at @@@ > ... 
>> The two bits are: >> a) a critical hop-by-hop option exists >> b) a critical end-to-end option exists. >> >> If we define these bits, then an egress RBridge MUST look to see if >> either of those bits are set, and if so, parse the options. > > I believe we're talking about a "summary" bit in each case. As a result, > the goal of these bits is to avoid parsing the options in detail. > > @@@ Yes, "summary" is a reasonable description of these bits. And, if > one is trying to postpone defining the structure of options in detail, > as the current protocol draft does, making it possible to avoid parsing > them seems like a good idea :-) > > There is benefit ONLY for hop-by-hop bits, i.e., to accelerate > forwarding. Egress rbridges would need to parse the entire header > anyway. > > @@@ If by "parse the entire header" you mean look at each option > individually, Yes. > I disagree. If all options present are non-critical, the > egress Rbridge, like all Rbridges, need not look at any of them. Optional non-critical options? Here's the issue: let's say you have a hop option that IS implemented at the internal hops, but is NOT critical. Which hops would ever see them? If nobody looks at them, why bother putting them in? IMO, every internal hop must examine all hop options, and an egress must examine all end options - that's the only way for either to know which ones they implement that they CAN handle. The only short-cut here is that internal hops could skip examining options if there were no hop options. > @@@ If by "parse the entire header" you mean examine stuff beyond the > TRILL Header, the egress Rbridge has to do that to decapsulate but > transit Rbridges have to do this also. While there are lots of other > reasons to look beyond the TRILL header for various cases of > multi-destination tree pruning, the most general reason is to get the > frame priority from the inner VLAN tag. 
This is necessary even for known > unicast frames for robustness since the outer VLAN tag may have been > stripped. However, skipping over the options without parsing them is > easy since their length is given and, if you know there were no critical > options present, safe. Yes, safe, but again what's the point of putting things in that nodes can ignore even if they implement? If the node doesn't examine the options, how does it know it does or does not support them? > That said, I'm not sure this is the right meaning for these bits. They > are useful as a summary ONLY if they indicate whether "CAN IGNORE" in > fast path rbridges. > > Let's presume that's NOT what the bit means, i.e., the bit means there > is a critical option (as defined above). The rbridge still needs to > parse the entire option set to find out WHICH options are listed, and to > decide whether each is supported or not anyway. > > @@@ Well, I expect that many Rbridges will, for a long time, not > implement any options, so they clearly wouldn't need to parse them if > these summary bits were available. The bits are relevant only to rbridges that don't implement the options used. As you point out, that is useful for rbridges that implement NO options. An rbridge that implements ANY options would then need to examine them ALL to see which ones to handle. > Even with Rbridges that do support > some options, I would expect the more detailed TRILL option feature > specification to require that all hop-by-hop options to precede all > end-to-end options (and perhaps, within those categories, to have > critical options precede non-critical options) so a transit Rbridge > could stop parsing as soon as it hits an end-to-end option. You could > also add a summary bit for the existence of non-critical hop-by-hop > options so a transit Rbridge would know if all the options present are > end-to-end. 
You don't need that bit - you get that info from examining the first option (if it's not HBH, then there are no HBH options!). >> Though I suppose an RBridge that doesn't support ANY critical options >> would know based on the presence of a critical >> option that it should drop the packet, without having to parse to find >> the critical option. > > That works ONLY for the case where an rbridge supports NO critical > options. If it supports any, then it needs to parse the entire header > anyway. Is that really worth this? > > @@@ Transit Rbridges that support no hop-by-hop options, critical or > non-critical, are expected to be very common. Sure - for nodes that support NO options, these flags might be useful. B > @@@ It is also good to keep in mind that, at least for unicast, these > summary bits are probably just a safety measure. That's because I would > expect that Rbridges will be required to announce in their LSP what > options they support (at least for critical options and probably for > all). Thus, by imposing requirements on the sender, you could claim that > no unicast frame with a critical option should ever be sent to an > Rbridge that does not support it. However, in the real world, errors of > all types happen. If, say, a transit Rbridge that supported no options > did receive a frame with a critical hop-by-hop option in it and you did > not have these summary bits (or they were set wrong), it will screw up. > But there is no way in the real world to guarantee that something like > that could never happen. > > @@@ An argument in favor of the bits is that you might have a case where > you are forwarding a multi-destination frame down a distribution tree > from Rbridge Z and out one port there are three downstream Rbridges one > hop away, A, B, and C, where the frame has a critical option that you > know from the link state database is supported by A and B but not C. 
It > might be nice, for some options, to be able to just multicast the frame > knowing that C will just discard it. This would be a decision by RBridge > Z (which must understand the option or it would not have created or > successfully received the frame). Overall, I'd like to see a case for using these bits. You've made the case for IPv6 rules - ordering the options, and there clearly also need to be IPv6-like flags inside the options. If the only reason for the summary flag is for transit rbridges implementing NO options, that's an extraordinarily specific corner case, IMO, for such a flag. The flag isn't particularly useful at the egress, IMO. Joe -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 250 bytes Desc: OpenPGP digital signature Url : http://mailman.postel.org/pipermail/rbridge/attachments/20071205/e5749c67/signature.bin From ddutt at cisco.com Thu Dec 6 10:25:50 2007 From: ddutt at cisco.com (Dinesh G Dutt) Date: Thu, 06 Dec 2007 10:25:50 -0800 Subject: [rbridge] Critical bits for options In-Reply-To: <4756DEE8.3030504@isi.edu> References: <475614D5.9000207@sun.com> <18262.45631.542916.231584@gargle.gargle.HOWL> <4756DEE8.3030504@isi.edu> Message-ID: <47583EAE.8000907@cisco.com> v6 rules are more complicated than we need, aren't they? All we need is the ability for an Rbridge to say, "I don't support any options, but this frame has options which I understand are fine to ignore and so I'll process the frame in fast path AND not delete the options from the outgoing frame". This allows us to define options that can pass through transit Rbridges. If we don't do this, I'd say that options would be "dead in the water", pretty much like IP options.
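For concreteness, the fast-path behavior described above, combined with the two summary bits proposed earlier in this thread, might look roughly like the sketch below. It is only an illustration: the bit positions, the idea that the bits sit in the first octet of the options area, and the function names are all assumptions, since the draft under discussion defines none of them. It covers only an RBridge that implements no options at all.

```python
# Sketch only. The bit values and the "first octet of the options area"
# layout are assumptions for illustration; nothing here comes from the
# TRILL draft itself.
CRIT_HOP_BY_HOP = 0x80  # assumed flag: a critical hop-by-hop option exists
CRIT_END_TO_END = 0x40  # assumed flag: a critical end-to-end option exists


def transit_action(options: bytes) -> str:
    """Action for a transit RBridge that implements no options.

    `options` is the raw options area (empty when the options length is 0).
    On "forward" the options area is passed on unchanged.
    """
    if len(options) == 0:
        return "forward"  # no options, so the summary bits are absent
    if options[0] & CRIT_HOP_BY_HOP:
        return "drop"  # a critical hop-by-hop option we cannot process
    return "forward"  # only skippable or end-to-end options are present


def egress_action(options: bytes) -> str:
    """Action for an egress RBridge that implements no options."""
    if len(options) == 0:
        return "deliver"
    if options[0] & (CRIT_HOP_BY_HOP | CRIT_END_TO_END):
        return "drop"  # at egress, either kind of critical option applies
    return "deliver"
```

An RBridge that implements some options could not stop at the summary octet; it would still have to walk the option list, which is the objection raised elsewhere in this thread.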
Dinesh Joe Touch wrote: > A few lessons from IP (which we may or may not want to emulate): > > -------------------------- > V6 rules: > > - hop-by-hop options MUST come before E2E ones > > - unrecognized options have 4 variants indicated via flags: > - silent pass (skip over silently) > - silent discard > - discard w/ICMP > - discard w/ICMP iff source != multicast > > - an additional per-option flag indicates: > - immutable en-route > - mutable en-route (i.e., ignore in Auth) > > -------------------------- > V4 rules: > > - unrecognized options > - silent pass > > ------------------------- > > I'll note that IPv6 doesn't have a summary bit about critical HBH > options. I don't see why we would anticipate being more complex than > IPv6 in this regard. > > IMO, the IPv6 rules are probably sufficient to use as-is. > > Joe > > > > ------------------------------------------------------------------------ > > _______________________________________________ > rbridge mailing list > rbridge at postel.org > http://mailman.postel.org/mailman/listinfo/rbridge > -- We make our world significant by the courage of our questions and by the depth of our answers. - Carl Sagan From ddutt at cisco.com Thu Dec 6 10:29:49 2007 From: ddutt at cisco.com (Dinesh G Dutt) Date: Thu, 06 Dec 2007 10:29:49 -0800 Subject: [rbridge] (no subject) In-Reply-To: <941D5DCD8C42014FAF70FB7424686DCF020EE8A1@eusrcmw721.eamcs.ericsson.se> References: <941D5DCD8C42014FAF70FB7424686DCF020EE8A1@eusrcmw721.eamcs.ericsson.se> Message-ID: <47583F9D.4070302@cisco.com> This is another reason for me to be nervous about making these changes now. The old method, where we did DRB election per VLAN with a single pseudo-node for all VLANs that the Rbridge was DRB for, seemed sufficient to me and avoided all these issues. We've gotten into this mess because someone decided that sending multiple Hellos (one per active VLAN) is a huge burden on the CPU.
I disagree with that assumption and would like to avoid all these issues by falling back to what we had earlier. Dinesh Eric Gray wrote: > Folks, > > Here is the problem that occurs when VLAN state is inferred > for one VLAN from connectivity provided by another. This is a > general problem, but has specific applicability to the current > set of assumptions used in the protocol specification. This came > up in off-line discussions with Anoop Ghanwani (and others) at an > IEEE meeting a couple of weeks ago. > > A key thing to understand in looking at this problem is it > is a comparison between how a network works with 802.1Q bridges > and the same network after some 802.1Q bridges have been replaced > with RBridges. The example shows a partial RBridge deployment > and this is compared with how it will have worked with 802.1Q > bridges where the example shows RBridges (i.e. - it is an after- > the-fact comparison of L2 forwarding functionality). > > The network looks like this (initially): > > \\\\|//// \\\\\|||///// \\\|/// > __|__ __|__ __|__ > | B-1 | | B-2 | | B-3 | > |_____| |_____| |_____| > | \ / \_____ / | > | \_____/ | B-5 | | > | | B-4 | |_____| | > | |_____| | | > __|__ \ | | > | B-6 |_______\ _______| __|__ > |_____| \___________| B-7 | > | |_____| > | | > __|___ __|___ > | RB-1 | | RB-2 | > |______| |______| > \ _ / > \__ __( )___ _/ > \_( Core )_/ > (_ RBridge _) > (_ Cloud _) > (_ _) > (_) > > In this figure, B-1, B-2 and B-3 are aggregation bridges with > multiple (lots and lots) of VIDs. > > B-4 is a special purpose bridge used for VLAN-A only, and B-5 > is a special purpose bridge used for VLAN-B only. > > All remaining bridges are configured to participate in VLAN-A, > VLAN-B and an arbitrary set of zero or more other VLANs. Since B-4 > and B-5 are configured for specific VLANs only, the ports on their > adjacent bridging peers are configured only for those VLANs. 
> > To be clear, the links between B-4 and B-7, and between B-6 > and B-5 are not connected (they merely overlap in the drawing). > > In this network, RB-1 and RB-2 are both RBridges and both have > access to VLAN A and VLAN B and each other (via both VLAN A and VLAN > B) - as well as the same arbitrary set of zero or more VLANs that > any of the other bridges in the drawing have. However, the two > RBridges use another VLAN - say VLAN C - to exchange hellos. Under > the normal operating conditions intended, this works fine and RB-1 > and RB-2 may separately be elected DRB for either of VLAN A or VLAN B. > > If bridge B-4 fails, then VLAN A would be segmented, at least > temporarily. Hellos continue to work, however, so the 2 RBridges > do not discover the partition and the DRB election remains unchanged. > In this case, part of the VLAN is orphaned - particularly from the > perspective of any locally attached end-stations. > > However, this is not acceptable behavior. No misconfiguration > exists and the ideal (and reasonably expected) behavior would be for > the RBridges to discover the partition and redo the DRB election - > making RB-1 the DRB for one partition and RB-2 the DRB for the other > partition. > > If (when) RB-1 and RB-2 were 802.1Q bridges, using MSTP for > multiple VLANs (in particular, for VLAN A and B), this failure will > have resulted in re-running (M)STP for the affected VLAN connectivity > and the segmentation (partitioning) would be healed. In the same way, > if the status of VLAN's A and B were derived directly from messages > that use VLAN A and VLAN B (as opposed to using VLAN C), this same > robust behavior would occur. > > > -- > Eric Gray > Principal Engineer > Ericsson > > _______________________________________________ > rbridge mailing list > rbridge at postel.org > http://mailman.postel.org/mailman/listinfo/rbridge > > -- We make our world significant by the courage of our questions and by the depth of our answers. 
- Carl Sagan From touch at ISI.EDU Thu Dec 6 10:30:47 2007 From: touch at ISI.EDU (Joe Touch) Date: Thu, 06 Dec 2007 10:30:47 -0800 Subject: [rbridge] Critical bits for options In-Reply-To: <47583EAE.8000907@cisco.com> References: <475614D5.9000207@sun.com> <18262.45631.542916.231584@gargle.gargle.HOWL> <4756DEE8.3030504@isi.edu> <47583EAE.8000907@cisco.com> Message-ID: <47583FD7.3060205@isi.edu> Dinesh G Dutt wrote: > v6 rules are more complicated than we need, aren't they? All we need is > the ability for an Rbridge to say, "I don't support any options, but > this frame has options which I understand are fine to ignore and so I'll > process the frame in fast path AND not delete the options from the > outgoing frame". If there are two variants: - NO options implemented - some options implemented AND we're trying to optimize the no-options case, then yes, a single bit would be useful. I am concerned that "no options" is a corner case; it's more likely that over time there will be some options in most devices. If that's the case, then a single bit doesn't help. > This allows us to define options that can pass through > transit Rbridges. If we don't do this, I'd say that options would be > "dead in the water", pretty much like IP options. If we need to optimize for the 'no critical options' case, then ALL critical options are dead in the water, by the above logic...
Joe > > Dinesh > Joe Touch wrote: >> A few lessons from IP (which we may or may not want to emulate): >> >> -------------------------- >> V6 rules: >> >> - hop-by-hop options MUST come before E2E ones >> >> - unrecognized options have 4 variants indicated via flags: >> - silent pass (skip over silently) >> - silent discard >> - discard w/ICMP >> - discard w/ICMP iff source != multicast >> >> - an additional per-option flag indicates: >> - immutable en-route >> - mutable en-route (i.e., ignore in Auth) >> >> -------------------------- >> V4 rules: >> >> - unrecognized options >> - silent pass >> >> ------------------------- >> >> I'll note that IPv6 doesn't have a summary bit about critical HBH >> options. I don't see why we would anticipate being more complex than >> IPv6 in this regard. >> >> IMO, the IPv6 rules are probably sufficient to use as-is. >> >> Joe >> >> >> >> ------------------------------------------------------------------------ >> >> _______________________________________________ >> rbridge mailing list >> rbridge at postel.org >> http://mailman.postel.org/mailman/listinfo/rbridge >> > -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 250 bytes Desc: OpenPGP digital signature Url : http://mailman.postel.org/pipermail/rbridge/attachments/20071206/f08c135a/signature.bin From ddutt at cisco.com Thu Dec 6 11:03:38 2007 From: ddutt at cisco.com (Dinesh G Dutt) Date: Thu, 06 Dec 2007 11:03:38 -0800 Subject: [rbridge] Critical bits for options In-Reply-To: <47583FD7.3060205@isi.edu> References: <475614D5.9000207@sun.com> <18262.45631.542916.231584@gargle.gargle.HOWL> <4756DEE8.3030504@isi.edu> <47583EAE.8000907@cisco.com> <47583FD7.3060205@isi.edu> Message-ID: <4758478A.8050007@cisco.com> You're saying that if we make options support possible in the fast path, most likely over time there will be more devices that support some options but not others, so a single bit doesn't suffice because the device that supports some options cannot decide if a specific option is critical or not. So you're saying if we define within the options two additional containers, hop-by-hop and e2e, and state that one comes before the other, we're in better shape. I agree with your logic. I'm still not sure whether we need all the logic of IPv6 options, such as stating whether an option may change hop by hop. We can probably use a single bit to decide whether we can skip over the option or discard the frame; we can ignore the ICMP options piece. Dinesh Joe Touch wrote: > Dinesh G Dutt wrote: > >> v6 rules are more complicated than we need, aren't they ? All we need is >> the ability for an Rbridge to say, "I don't support any options, but >> this frame has options which I understand are fine to ignore and so I'll >> process the frame in fast path AND not delete the options from the >> outgoing frame". >> > > If there are two variants: > - NO options implemented > - some options implemented > > AND we're trying to optimize the no-options case, then yes, a single bit > would be useful.
I am concerned that "no options" is a corner case; it's > more likely that over time there will be some options in most devices. > > If that's the case, then a single bit doesn't help. > > >> This allows us to define options that can pass through >> transit Rbridges. If we don't do this, I'd say that options would be >> "dead in the water", pretty much like IP options. >> > > If we need to optimize for the 'no critical options' case, then ALL > critical options are dead in the water, by the above logic... > > Joe > >> Dinesh >> Joe Touch wrote: >> >>> A few lessons from IP (which we may or may not want to emulate): >>> >>> -------------------------- >>> V6 rules: >>> >>> - hop-by-hop options MUST come before E2E ones >>> >>> - unrecognized options have 4 variants indicated via flags: >>> - silent pass (skip over silently) >>> - silent discard >>> - discard w/ICMP >>> - discard w/ICMP iff destination != multicast >>> >>> - an additional per-option flag indicates: >>> - immutable en-route >>> - mutable en-route (i.e., ignore in Auth) >>> >>> -------------------------- >>> V4 rules: >>> >>> - unrecognized options >>> - silent pass >>> >>> ------------------------- >>> >>> I'll note that IPv6 doesn't have a summary bit about critical HBH >>> options. I don't see why we would anticipate being more complex than >>> IPv6 in this regard. >>> >>> IMO, the IPv6 rules are probably sufficient to use as-is. >>> >>> Joe >>> >>> > > -- We make our world significant by the courage of our questions and by the depth of our answers.
- Carl Sagan From Donald.Eastlake at motorola.com Thu Dec 6 11:17:19 2007 From: Donald.Eastlake at motorola.com (Eastlake III Donald-LDE008) Date: Thu, 6 Dec 2007 14:17:19 -0500 Subject: [rbridge] Critical bits for options In-Reply-To: <18262.45631.542916.231584@gargle.gargle.HOWL> References: <475614D5.9000207@sun.com> <18262.45631.542916.231584@gargle.gargle.HOWL> Message-ID: <3870C46029D1F945B1472F170D2D97900350CE8B@de01exm64.ds.mot.com> Hi James, See below at @@@ -----Original Message----- From: rbridge-bounces at postel.org [mailto:rbridge-bounces at postel.org] On Behalf Of James Carlson Sent: Wednesday, December 05, 2007 9:14 AM To: Radia Perlman Cc: rbridge at postel.org Subject: Re: [rbridge] Critical bits for options Radia Perlman writes: > Currently the TRILL spec says to ignore all options -- the only thing > the spec says is how to skip the options, if any. I think the part that's actually broken here is not the lack of some summary bits for the option list, but the fact that we haven't defined the basic option format. We should do that, and not just leave it to some "later documents." *Then* we can say that transit RBridges must obey the critical hop-by-hop options, and the egress RBridge must obey all critical options. (Where "obey" means "process if you understand, drop if you don't.") @@@ There is clearly a spectrum of possibilities for the details in the "base protocol" specification. The current document is pretty much at one extreme, having just a size field and saying nothing about the content of the options area. @@@ Adding a specification of a few summary bits at the start of the options area if the options length is non-zero would be the next step. While I personally favor this, I used to think it didn't make much difference because you would be able to tell what options an Rbridge supports by looking in the link state database and not send an Rbridge any options it does not support. But that only works well for unicast. 
Because of multi-destination frames (broadcast, unknown unicast, and non-IP-derived multicast) I'm even more convinced that it would be useful to specify these summary bits now. @@@ The next stage after summary bits would be to specify the format, ordering, etc., of options, which is what you are suggesting. Unfortunately, I think that could be fairly controversial. Furthermore, I believe that the design of the options format would be greatly improved by designing a number of specific options at the same time. Some such options, like a data frame security option, might be fairly complex. Furthermore, I believe there are multiple efforts working towards high speed Rbridges that implement no options and which would benefit from progressing the specification to the stability of an RFC. @@@ All in all, I think that trying to put the detailed option format in the specification could delay things by 6 months. A separate document seems to me like the way to go. Yes, that means that all option creators need to weigh carefully the costs of creating a new option (in terms of slow-path expense in all systems) versus the benefit the option claims to provide. I think that's a good thing, and the warning in section 3.5 makes it pretty clear. @@@ Thanks, @@@ Donald -- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From james.d.carlson at Sun.COM Thu Dec 6 11:20:15 2007 From: james.d.carlson at Sun.COM (James Carlson) Date: Thu, 6 Dec 2007 14:20:15 -0500 Subject: [rbridge] Critical bits for options In-Reply-To: <47583EAE.8000907@cisco.com> References: <475614D5.9000207@sun.com> <18262.45631.542916.231584@gargle.gargle.HOWL> <4756DEE8.3030504@isi.edu> <47583EAE.8000907@cisco.com> Message-ID: <18264.19311.851212.263849@gargle.gargle.HOWL> Dinesh G Dutt writes: > v6 rules are more complicated than we need, aren't they ? 
All we need is > the ability for an Rbridge to say, "I don't support any options, but > this frame has options which I understand are fine to ignore and so I'll > process the frame in fast path AND not delete the options from the > outgoing frame". This allows us to define options that can pass through > transit Rbridges. If we don't do this, I'd say that options would be > "dead in the water", pretty much like IP options. I don't understand that quoted section. Why would we define options that "everybody" is going to ignore? What would be the point in having always-ignored options? If an implementation actually has support for a given option, then I would expect it to be obligated[1] to process that option, and it should not just "ignore" it. Ignoring is equivalent to _not_ implementing. I think the only question is whether a node receiving a TRILL frame can determine if there are options that it must process. The rules sound to me like: - If you're a transit node, then process all options up to either the end of the list or the first end-to-end option, if any. - If you're an end node, then process all the options. We can "special case" the critical options by tweaking the order, if necessary. Just make the order: 1. Critical hop-by-hop 2. Non-critical hop-by-hop 3. Critical end-to-end 4. Non-critical end-to-end Then, if there are implementations that don't implement any options and if those implementations need to "optimize" the discard/pass action (!), they can simply look to see if the first option on the list is critical, and bail if so, and pass otherwise. (Decapsulating end nodes would have to span over non-critical hop-by-hop options, but that seems like a small price.) [1] Choosing my words carefully here -- "obligated" in a moral sense, rather than "required" in an RFC 2119 sense.
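The ordering and processing rules above can be sketched roughly as follows. This is only an illustration under stated assumptions, not anything taken from the TRILL draft: the `Option` fields, the `Action` values and all function names are hypothetical, since no option format had been defined at the time of this thread. One refinement is assumed for the no-options transit case: the first option causes a drop only if it is both critical and hop-by-hop, because a list with no hop-by-hop options would start with a critical end-to-end option that a transit node need not obey.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    PASS = "pass"
    DROP = "drop"


@dataclass(frozen=True)
class Option:
    # Hypothetical fields; the thread had not agreed on an option format.
    opt_type: int
    critical: bool
    hop_by_hop: bool


def sort_key(opt):
    # 0: critical HBH, 1: non-critical HBH, 2: critical E2E, 3: non-critical E2E
    return (0 if opt.hop_by_hop else 2) + (0 if opt.critical else 1)


def order_options(options):
    # sorted() is stable, so options within one class keep their relative order.
    return sorted(options, key=sort_key)


def no_options_transit(options):
    """Transit node that implements no options: inspect only the first option."""
    if options and options[0].critical and options[0].hop_by_hop:
        return Action.DROP
    return Action.PASS


def transit(options, supported):
    """Transit node: walk options until the end or the first end-to-end option."""
    for opt in options:
        if not opt.hop_by_hop:
            break  # end-to-end options are not a transit node's business
        if opt.opt_type not in supported and opt.critical:
            return Action.DROP  # "process if you understand, drop if you don't"
    return Action.PASS


def egress(options, supported):
    """Egress (decapsulating) node: every critical option must be understood."""
    for opt in options:
        if opt.opt_type not in supported and opt.critical:
            return Action.DROP
    return Action.PASS
```

With this ordering, a node that supports nothing never has to scan past the first option, which is the point of the argument above.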
-- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From riw at cisco.com Thu Dec 6 11:58:04 2007 From: riw at cisco.com (Russ White) Date: Thu, 06 Dec 2007 14:58:04 -0500 Subject: [rbridge] Fixing Broken Connections in the Middle In-Reply-To: <47583F9D.4070302@cisco.com> References: <941D5DCD8C42014FAF70FB7424686DCF020EE8A1@eusrcmw721.eamcs.ericsson.se> <47583F9D.4070302@cisco.com> Message-ID: <4758544C.2030901@cisco.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 > This is another reason for me to be nervous about making these changes > now. The old method where we did DRB per VLAN, a single pseudo-node for > all VLANs that the Rbridge was DRB for seemed sufficient to me and > avoided all these issues. We've gotten into this mess because someone > decided that sending multiple Hellos (one per active VLAN) is a huge > burden on the CPU. I disagree with that assumption and would like to > avoid all these issues by falling back to what we had earlier. IMHO, the mechanism proposed is broken in multiple ways.... I would be glad to send out a few emails explaining specific situations where this mechanism of "fixing broken connections in the bridged cloud in the middle" is, itself, broken. I actually don't think you can fix this problem--people will find more ways to misconfigure this stuff than you can account for, or try to work around. Again, IMHO, it shouldn't be a part of the rbridge spec to fix broken connections in the middle of a cloud of rbridges. If folks want to cogitate on the problems involved, and write docs in that area, it's fine, but it shouldn't be a part of any sort of base specification. First, there are a ton of different media types involved. Rbridges could run over wireless media, MPLS, ATM, Frame, IP, and all sorts of other things. Even if you say: "Ethernet only," there's Ethernet, and then there's Ethernet. 
Let's stop getting hung up on this problem, and figure out how to make rbridges work at a base level.
:-) Russ - -- riw at cisco.com CCIE <>< Grace Alone -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHWFRMER27sUhU9OQRAu+1AJ4vVr9thSqhqcku67nF+dCIfdvw3QCdGjcO haQUK0OvcQdYLY6i5QDmzGQ= =GPHz -----END PGP SIGNATURE----- From james.d.carlson at sun.com Thu Dec 6 11:39:44 2007 From: james.d.carlson at sun.com (James Carlson) Date: Thu, 6 Dec 2007 14:39:44 -0500 Subject: [rbridge] Critical bits for options In-Reply-To: <3870C46029D1F945B1472F170D2D97900350CE8B@de01exm64.ds.mot.com> References: <475614D5.9000207@sun.com> <18262.45631.542916.231584@gargle.gargle.HOWL> <3870C46029D1F945B1472F170D2D97900350CE8B@de01exm64.ds.mot.com> Message-ID: <18264.20480.290383.880523@gargle.gargle.HOWL> Eastlake III Donald-LDE008 writes: > Hi James, > > See below at @@@ Oh, how I wish the quoting styles were common ... > @@@ There is clearly a spectrum of possibilities for the details in the > "base protocol" specification. The current document is pretty much at > one extreme, having just a size field and saying nothing about the > content of the options area. Indeed. The problem that causes is that you can't really have a meaningful conversation about when to drop or pass based on options, as there's currently no understanding of what's *in* the options. > @@@ Adding a specification of a few summary bits at the start of the > options area if the options length is non-zero would be the next step. I don't agree. That's a *possible* next step, but not *the* next step. The assumption you're making is that summary bits are either useful or necessary, and until we have some clue about the semantics we want, we don't know that. So, for example, it's possible to arrange the options such that a transit node that implements _no_ options at all can determine whether to pass or drop a frame based on the flags on the first option alone. 
In that case, there's really very little reason to define "summary bits" because all we're doing is very slightly and trivially optimizing that one special edge case -- while at the same time adding another point where things can be broken. What happens if the summary bits disagree with the options contained? Obviously, the system fails, but the point is that duplicating data -- bits from the options into bits in the header -- *always* causes a risk of data synchronization errors. Consider for example a node that must remove options because of the way the options work or because of local configuration that dictates how optioned-packets are forwarded. What will it do? There's a good risk here that the summary bits will become wrong, as they need to be recalculated during changes. > While I personally favor this, I used to think it didn't make much > difference because you would be able to tell what options an Rbridge > supports by looking in the link state database and not send an Rbridge > any options it does not support. But that only works well for unicast. > Because of multi-destination frames (broadcast, unknown unicast, and > non-IP-derived multicast) I'm even more convinced that it would be > useful to specify these summary bits now. I think we need a solid framework for options first, and *then* decide whether it's worthwhile to add extra flags to optimize corner cases. The summary bits aren't primary functionality. They're secondary. They're derived from the information in the options themselves. As such, they're expendable and merely performance hacks. The life of a performance hack depends pretty strongly on the performance case for it, and I haven't seen that argument made yet. > @@@ The next stage after summary bits would be to specify the format, > ordering, etc., of options, which is what you are suggesting. > Unfortunately, I think that could be fairly controversial. 
It might be controversial, but what we have right now is a #include in the specification that names a non-existent file. We can pretend that's not a problem by papering around it, but it's going to bite us eventually. > Furthermore, > I believe that the design of the options format would be greatly > improved by designing a number of specific options at the same time. > Some such options, like a data frame security option, might be fairly > complex. Agreed; having some idea of what might be in options would help, but it's also true (as others have pointed out) that we can borrow ideas from other successful option mechanisms. Having critical/non-critical and hop-by-hop/end-to-end seems like a sufficient diversity -- but simple specification -- that we can layer other mechanisms on top later. > Furthermore, I believe there are multiple efforts working > towards high speed Rbridges that implement no options and which would > benefit from progressing the specification to the stability of an RFC. I don't see that as a serious issue. What's stopping people from building implementations before there's an RFC? What makes them think that there won't be changes _after_ there's an RFC? This isn't an IEEE spec. I agree that all implementations will benefit from getting this right. I think they'll benefit more if we define it correctly, so that it doesn't cause more trouble down the road. > @@@ All in all, I think that trying to put the detailed option format in > the specification could delay things by 6 months. A separate document > seems to me like the way to go. Six months seems unlikely to me ... but even if it is that long, I'd rather have a more robust document than a "quick" one that (years from now) has vestigial bits relating to performance hacks for obscure implementations that've been long forgotten. Protocols are forever, hardware is not. 
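The synchronization risk raised earlier in this message (summary bits that disagree with the options they summarize) can be illustrated with a small sketch. Everything here is an assumption for illustration only: the two bit values, the `Option` fields and the helper names are hypothetical and do not come from any TRILL document. The sketch shows that summary bits are purely derived data, and that a node which removes an option without recomputing them leaves the header in an inconsistent state.

```python
from dataclasses import dataclass

# Hypothetical bit assignments, chosen only for this illustration.
CRITICAL_HBH = 0x1
CRITICAL_E2E = 0x2


@dataclass(frozen=True)
class Option:
    # Hypothetical fields; no option format had been specified.
    opt_type: int
    critical: bool
    hop_by_hop: bool


def derive_summary(options):
    """Summary bits are derived entirely from the options themselves."""
    bits = 0
    for opt in options:
        if opt.critical:
            bits |= CRITICAL_HBH if opt.hop_by_hop else CRITICAL_E2E
    return bits


def strip_option(summary, options, opt_type, recompute=True):
    """Remove one option; a careless node might skip the recomputation."""
    remaining = [o for o in options if o.opt_type != opt_type]
    if recompute:
        summary = derive_summary(remaining)
    return summary, remaining


def is_consistent(summary, options):
    """True when the carried summary bits match what the options imply."""
    return summary == derive_summary(options)
```

A receiver that trusts stale bits would drop (or punt) a frame that no longer carries any critical option, which is the kind of failure that duplicated data invites.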
-- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From riw at cisco.com Thu Dec 6 12:20:06 2007 From: riw at cisco.com (Russ White) Date: Thu, 06 Dec 2007 15:20:06 -0500 Subject: [rbridge] Fixing Broken Connections in the Middle In-Reply-To: <4758544C.2030901@cisco.com> References: <941D5DCD8C42014FAF70FB7424686DCF020EE8A1@eusrcmw721.eamcs.ericsson.se> <47583F9D.4070302@cisco.com> <4758544C.2030901@cisco.com> Message-ID: <47585976.5010904@cisco.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 >> This is another reason for me to be nervous about making these changes >> now. The old method where we did DRB per VLAN, a single pseudo-node for >> all VLANs that the Rbridge was DRB for seemed sufficient to me and >> avoided all these issues. We've gotten into this mess because someone >> decided that sending multiple Hellos (one per active VLAN) is a huge >> burden on the CPU. I disagree with that assumption and would like to >> avoid all these issues by falling back to what we had earlier. BTW--I actually agree with Dinesh's message here--there is no "massive burden" of sending one hello per vlan. That's what I get for picking a random message to reply to. :-) In the real world, if someone configured 4000 parallel transit vlans, we'd tell them to please redesign their network. In fact, most of the routing sorts of devices we're talking about will only hold 256 or so forwarding entries for any given destination--so 4000 is something of a moot point, I think. I'm in agreement that we should go back to the original concept--let the user decide which logical links rbridge traffic should run on, and let them configure it. If they misconfigure it, they do so at their own cost. As I said before--they're going to find more ways to misconfigure it than you can fix, and they're going to find them faster than you can think of ways to fix them.
One of the things they'll need to consider when configuring this is the overhead of multiple IS-IS Hellos. It's already something they have to think about in routing, so I don't see it as any great burden. :-) Russ - -- riw at cisco.com CCIE <>< Grace Alone -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHWFl2ER27sUhU9OQRAnbCAJ4kEsDAL+gaRwgmLujKjUzhNm/edgCgsOjJ ZmjxL2UJ0OmswgMxggQ/Nag= =h22V -----END PGP SIGNATURE----- From riw at cisco.com Thu Dec 6 13:05:39 2007 From: riw at cisco.com (Russ White) Date: Thu, 06 Dec 2007 16:05:39 -0500 Subject: [rbridge] Critical bits for options In-Reply-To: <18264.20480.290383.880523@gargle.gargle.HOWL> References: <475614D5.9000207@sun.com> <18262.45631.542916.231584@gargle.gargle.HOWL> <3870C46029D1F945B1472F170D2D97900350CE8B@de01exm64.ds.mot.com> <18264.20480.290383.880523@gargle.gargle.HOWL> Message-ID: <47586423.9090401@cisco.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I'm going to try a little summary here, just to see if I'm getting this.... :-) I'll include some questions about things I really don't know (?).... 1. For each packet processed, there will be some set of options contained within the header. 2. Some platforms might be able to switch some or no options in a faster switching path (hardware, for instance). 3. There are some options which must be switched because they are hop-by-hop, no matter what they do to the switching speed. So, the problem is--how do you signal, without looking for the options, whether or not the faster switching path the packet comes in on should punt the packet to some slower path because of the presence of some option or another. Now, if this is all correct, the questions: 1. What are these options for? I'm a bit confused as to why there are such things as options (?).... 2.
Will there be a consistent list, through time, of which options a given platform will switch in some faster path, and which they will switch in some slower path? 3. Will it be faster to sort through an option summary per option rather than through the options themselves? Trying to sort this out: 1. It seems like if 2 is false, then there's no point in trying to put the options in any sort of order (?).... How can the sender know what some processing node in the middle can switch faster or slower? 2. It also seems like if 2 is false, there's no point in trying to put some sort of "I can be switched in the faster path," bit, because we don't really know what's switchable in the "faster path." I'm uncertain on this one, so please straighten me out if I'm wrong. :-) 3. If 3 is no, then I'm not certain what we would gain by these things? Overall, I think it might be useful to have a bit saying: "This packet has options you really need to process, so don't just switch it without paying attention." A bit that says: "I have no options" might also be useful. I'm not certain I understand the rest. :-) Russ - -- riw at cisco.com CCIE <>< Grace Alone -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHWGQjER27sUhU9OQRAs9RAJ97C8T5Jlu7dXE+ocZ4XBTcG1+CfACg5zhk WZirYH3Ou4AIW4Z8P27URSQ= =tBbR -----END PGP SIGNATURE----- From Radia.Perlman at sun.com Thu Dec 6 13:57:34 2007 From: Radia.Perlman at sun.com (Radia Perlman) Date: Thu, 06 Dec 2007 13:57:34 -0800 Subject: [rbridge] Orphaned endnodes with partitioned VLANs on a cloud In-Reply-To: <475617A0.4010302@sun.com> References: <941D5DCD8C42014FAF70FB7424686DCF020EE8A1@eusrcmw721.eamcs.ericsson.se> <475617A0.4010302@sun.com> Message-ID: <4758704E.70604@sun.com> The two possible designs people are thinking about are: a) separate DRB election per VLAN, possibly a different DRB per VLAN, and a different pseudonode (visible outside that link) for each VLAN. 
Per-VLAN adjacencies to each neighbor on that link. And, get rid of the safety check of "report root bridge, and RB2 should not decapsulate something from ingress RB1 if RB1 reports the same root bridge", because with a partitioned VLAN, both RB1 and RB2 will see the same root bridge, and it looks like people advocating this option want to make sure both RB1 and RB2 will encapsulate/decapsulate from the same link. Tough luck for orphaned endnodes if a VLAN is partitioned and there is no RBridge in the partition with the endnode. b) one DRB, one pseudonode, tough luck for orphaned endnodes if a VLAN is partitioned, except for endnodes in the same partition as the RBridge appointed forwarder for that VLAN. The downside of a) is scalability. And if there is a link that is willing for any VLAN to traverse, wouldn't that be 4000 VLANs to send Hellos on, on that link? The downside of b) is that if there is a partitioned VLAN, then some endnodes will be orphaned. But a partitioned VLAN will only occur if someone has configured bridges to refuse to pass certain VLANs, and can be easily fixed through a number of ways (like, for instance, changing the configuration of the bridges). And note that orphaned endnodes can happen no matter what TRILL does because it could happen that VLAN A is partitioned and there is no RBridge on one of the partitions. Endnodes in that partition would be orphaned even if we did a) Note that if TRILL is trying to exactly copy all bridge functionality, we are not doing that. Some people apparently configure partitioned VLANs, I've heard, because they are reusing VLAN numbers. TRILL is specifically not keeping different instances of VLAN A separate, but making the assumption that VLAN A = VLAN A. I think it is perfectly reasonable to do b), which is safer (because of the root bridge check), and more scalable, and consider a link with a partitioned VLAN as a configuration mistake.
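As a rough back-of-the-envelope sketch of the scalability gap between a) and b): the functions below count Hellos per Hello interval and pseudonodes on one link. The simplifications are assumptions made only for this illustration: under a) every RBridge is taken to send one Hello on every VLAN, under b) every RBridge sends a single Hello on the one common VLAN, and any extra Hellos the DRB itself might send are ignored. The function names and the figure of 10 RBridges are hypothetical.

```python
def hellos_per_vlan_election(rbridges, vlans):
    """Design a): separate election per VLAN, so one Hello per RBridge per VLAN."""
    return rbridges * vlans


def hellos_single_election(rbridges):
    """Design b): one election on one VLAN, so one Hello per RBridge."""
    return rbridges


def pseudonodes(vlans, per_vlan_election):
    """Design a) can yield a pseudonode per VLAN; design b) yields one per link."""
    return vlans if per_vlan_election else 1


if __name__ == "__main__":
    rbridges, vlans = 10, 4000  # 10 RBridges is an arbitrary example figure
    print("a) Hellos per interval:", hellos_per_vlan_election(rbridges, vlans))
    print("b) Hellos per interval:", hellos_single_election(rbridges))
    print("a) pseudonodes:", pseudonodes(vlans, per_vlan_election=True))
    print("b) pseudonodes:", pseudonodes(vlans, per_vlan_election=False))
```

The multiplicative term in a) is what makes the 4000-VLAN case look expensive; b) stays linear in the number of RBridges regardless of how many VLANs the link carries.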
It seems like a better way of accomplishing whatever it is that the customers were attempting to accomplish with restricting certain VLANs from traversing certain links, would be doing all the fancy traffic engineering stuff using layer 3 techniques that have been around for awhile. That being said---I'm always anxious to please everyone, here's a potential compromise to solve the partitioned VLAN problem. The rule is: If you are appointed VLAN A forwarder, then you send Hellos, not just on the one VLAN specified by the DRB, but also on VLAN A. If the DRB announces its VLAN forwarder assignments in its Hello, then if RB2 attaches to VLAN A, and knows that RB1 is appointed VLAN A forwarder, then RB2, if it does not receive some VLAN A tagged Hellos from RB1 knows there is a problem. What to do when you know there's a problem? Here are some suggestions: a) log a network management event b) do separate DRB elections on any VLANs that have been detected as being partitioned. Wind up with a DRB, and pseudonode for each partition. And get rid of the root bridge safety check because all the RBridges in all VLAN A partitions will see the same root bridge anyway. Note: I find this scary (I like the safety check). It is more scalable than doing separate DRB elections for every VLAN, since except in the (hopefully very rare) case of a partitioned VLAN, there will be just a single Hello election and single pseudonode. I strongly think we should have just a single DRB for a link. I'd prefer that DRB be the only one that forwards data to/from the link, but I don't think it's that complicated or unsafe to have the DRB data packet forwarders per VLAN in order to do load splitting. I prefer just not worrying at all about endnodes orphaned by partitioned VLANs. However, I think it's OK to attempt to detect partitioned VLANs by having the VLAN X forwarder send Hellos on VLAN X (in addition to the common VLAN).
If indeed there are 4000 VLANs on the link, that does mean that there will be 4000 Hellos, but not multiplied by the number of RBridges. All RBridges will send Hellos on the single common VLAN specified by the DRB, plus whoever is VLAN x forwarder, for each value of x, will send an additional Hello tagged with x.

Radia

Dinesh said:
>>This is another reason for me to be nervous about making these changes
>>now. The old method where we did DRB per VLAN, a single pseudo-node for
>>all VLANs that the Rbridge was DRB for seemed sufficient to me and
>>avoided all these issues. We've gotten into this mess because someone
>>decided that sending multiple Hellos (one per active VLAN) is a huge
>>burden on the CPU. I disagree with that assumption and would like to
>>avoid all these issues by falling back to what we had earlier.

Radia Perlman wrote:
> Putting in a subject line since Eric didn't.
>
> To restate Eric's concern -- it is possible that VLAN A might be
> partitioned on a link, and since the DRB selects only
> a single VLAN-A-forwarder, some VLAN A endnodes on the cloud might get
> orphaned (since they are on the
> other side of the partition from the appointed VLAN A forwarder on the
> link).
>
> And that is correct.
>
> So, what is the alternative? Everything is an engineering tradeoff.
> We could run DRB elections on every possible VLAN, and with a
> partitioned VLAN, wind up with multiple DRBs for
> that VLAN. If there were n VLANs, we'd wind up with n DRBs (one per
> VLAN), n pseudonodes, n times as many Hello
> messages, etc.
>
> We chose to consider a partitioned VLAN, caused by bridges configured to
> not pass data traffic for a particular VLAN,
> as a misconfiguration. This can happen with bridges anyway, if you mark a
> bunch of your potential through links as not
> being allowed for VLAN A transit. The solution in the current spec is
> low overhead (at most one pseudonode per link,
> one Hello per RBridge except for the DRB) and relatively simple.
> Hopefully eventually more bridges will be replaced by > RBridges and this sort of thing won't happen. > > Radia > > > Eric Gray wrote: > >> Folks, >> >> Here is the problem that occurs when VLAN state is inferred >> for one VLAN from connectivity provided by another. This is a >> general problem, but has specific applicability to the current >> set of assumptions used in the protocol specification. This came >> up in off-line discussions with Anoop Ghanwani (and others) at an >> IEEE meeting a couple of weeks ago. >> >> A key thing to understand in looking at this problem is it >> is a comparison between how a network works with 802.1Q bridges >> and the same network after some 802.1Q bridges have been replaced >> with RBridges. The example shows a partial RBridge deployment >> and this is compared with how it will have worked with 802.1Q >> bridges where the example shows RBridges (i.e. - it is an after- >> the-fact comparison of L2 forwarding functionality). >> >> The network looks like this (initially): >> >> \\\\|//// \\\\\|||///// \\\|/// >> __|__ __|__ __|__ >> | B-1 | | B-2 | | B-3 | >> |_____| |_____| |_____| >> | \ / \_____ / | >> | \_____/ | B-5 | | >> | | B-4 | |_____| | >> | |_____| | | >> __|__ \ | | >> | B-6 |_______\ _______| __|__ >> |_____| \___________| B-7 | >> | |_____| >> | | >> __|___ __|___ >> | RB-1 | | RB-2 | >> |______| |______| >> \ _ / >> \__ __( )___ _/ >> \_( Core )_/ >> (_ RBridge _) >> (_ Cloud _) >> (_ _) >> (_) >> >> In this figure, B-1, B-2 and B-3 are aggregation bridges with >> multiple (lots and lots) of VIDs. >> >> B-4 is a special purpose bridge used for VLAN-A only, and B-5 >> is a special purpose bridge used for VLAN-B only. >> >> All remaining bridges are configured to participate in VLAN-A, >> VLAN-B and an arbitrary set of zero or more other VLANs. Since B-4 >> and B-5 are configured for specific VLANs only, the ports on their >> adjacent bridging peers are configured only for those VLANs. 
>> >> To be clear, the links between B-4 and B-7, and between B-6 >> and B-5 are not connected (they merely overlap in the drawing). >> >> In this network, RB-1 and RB-2 are both RBridges and both have >> access to VLAN A and VLAN B and each other (via both VLAN A and VLAN >> B) - as well as the same arbitrary set of zero or more VLANs that >> any of the other bridges in the drawing have. However, the two >> RBridges use another VLAN - say VLAN C - to exchange hellos. Under >> the normal operating conditions intended, this works fine and RB-1 >> and RB-2 may separately be elected DRB for either of VLAN A or VLAN B. >> >> If bridge B-4 fails, then VLAN A would be segmented, at least >> temporarily. Hellos continue to work, however, so the 2 RBridges >> do not discover the partition and the DRB election remains unchanged. >> In this case, part of the VLAN is orphaned - particularly from the >> perspective of any locally attached end-stations. >> >> However, this is not acceptable behavior. No misconfiguration >> exists and the ideal (and reasonably expected) behavior would be for >> the RBridges to discover the partition and redo the DRB election - >> making RB-1 the DRB for one partition and RB-2 the DRB for the other >> partition. >> >> If (when) RB-1 and RB-2 were 802.1Q bridges, using MSTP for >> multiple VLANs (in particular, for VLAN A and B), this failure will >> have resulted in re-running (M)STP for the affected VLAN connectivity >> and the segmentation (partitioning) would be healed. In the same way, >> if the status of VLAN's A and B were derived directly from messages >> that use VLAN A and VLAN B (as opposed to using VLAN C), this same >> robust behavior would occur. 
>>
>>
>> --
>> Eric Gray
>> Principal Engineer
>> Ericsson
>>
>> _______________________________________________
>> rbridge mailing list
>> rbridge at postel.org
>> http://mailman.postel.org/mailman/listinfo/rbridge
>>
>>
>

From Radia.Perlman at sun.com Thu Dec 6 14:39:56 2007
From: Radia.Perlman at sun.com (Radia Perlman)
Date: Thu, 06 Dec 2007 14:39:56 -0800
Subject: [rbridge] Critical bits for options
In-Reply-To: <18264.19311.851212.263849@gargle.gargle.HOWL>
References: <475614D5.9000207@sun.com> <18262.45631.542916.231584@gargle.gargle.HOWL> <4756DEE8.3030504@isi.edu> <47583EAE.8000907@cisco.com> <18264.19311.851212.263849@gargle.gargle.HOWL>
Message-ID: <47587A3C.3090208@sun.com>

I think the possibilities are

a) what's in the spec now: a way to skip over all options. This means we can't define a critical option, because current RBridges will skip over it, and not realize that they are supposed to drop the packet because it has a "critical" option in it

b) define option format, including marking each option as critical or not

c) everything in b, but also have two summary bits "critical hop by hop option is in this packet" and "critical end to end option is in this packet"

d) just define the two bits "critical hop by hop" and "critical end to end" option exists.

The issues in each

a) we won't be able to define an option that is "critical" without using a new version of the TRILL header (which is actually a plausible work-around), because current RBridges will just skip all options

b) Might slow the WG down, but maybe the format won't be controversial.
If the two critical bits are not there, then it will slow down RBridges that don't support any critical options, because they will have to parse through all the options.

c) the two summary bits are, as someone pointed out, redundant information if each option is marked as critical or not. However, the summary bits do make forwarding faster for bridges that don't support any options, because they don't have to look through the options

d) this is the minimal thing we'd need to do if we want to allow critical options

At this point I favor d), because it's really easy to specify, and then we don't have to argue about the format of options. Don Eastlake actually had designed a format, and has verbiage for it, but I had perhaps bullied him into taking it out of the spec because at the time I didn't think it was worthwhile defining the option formats if we weren't defining any options.

Radia

From james.d.carlson at sun.com Thu Dec 6 14:27:57 2007
From: james.d.carlson at sun.com (James Carlson)
Date: Thu, 6 Dec 2007 17:27:57 -0500
Subject: [rbridge] Critical bits for options
In-Reply-To: <47586423.9090401@cisco.com>
References: <475614D5.9000207@sun.com> <18262.45631.542916.231584@gargle.gargle.HOWL> <3870C46029D1F945B1472F170D2D97900350CE8B@de01exm64.ds.mot.com> <18264.20480.290383.880523@gargle.gargle.HOWL> <47586423.9090401@cisco.com>
Message-ID: <18264.30573.551831.427398@gargle.gargle.HOWL>

Russ White writes:
> So, the problem is--how do you signal, without looking for the options,
> whether or not the faster switching path the packet comes in on should
> punt the packet to some slower path because of the presence of some
> option or another.

Yes. In a particular degenerate case that appears to be of interest to several posters here, systems that implement _no_ options at all would like to know if any "critical" (must-comply) ones are present. If they are, then the only thing the node can do is drop.

> 1. What are these options for? I'm a bit confused as to why there are
> such things as options (?)....

Good question. Presumably, they'd be there to handle special features, such as encryption in transit and labeling for security purposes, and perhaps for debug options (trill-traceroute). It's hard to imagine too many _other_ purposes here, but I suppose others are possible.

> 2. Will there be a consistent list, through time, of which options a
> given platform will switch in some faster path, and which they will
> switch in some slower path?

Almost certainly not. The only assertion I've seen here is that some implementations will be completely unable to support options at all. They need to know whether to drop or pass a packet with options.

> 1. It seems like if 2 is false, then there's no point in trying to put
> the options in any sort of order (?).... How can the sender know what
> some processing node in the middle can switch faster or slower?

There's a difference between critical and non-critical hop-by-hop options. If there are any critical options, then all nodes on the path must process them. Any nodes that don't understand the option must drop the packet. That's the reason some are requesting a "summary" bit.

One way to resolve the whole problem is to outlaw critical hop-by-hop options. Maybe not the _best_ way, though.

> Overall, I think it might be useful to have a bit saying: "This packet
> has options you really need to process, so don't just switch it without
> paying attention." A bit that says: "I have no options" might also be
> useful.

The latter we have -- Op-Length will be zero when there are no options.

The former we can get in one of at least two ways:

  - Summary bits (some like this, I don't)

  - Option ordering such that must-parse options are always first.
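
The two alternatives just listed can be sketched as forwarding decisions, in Python. No option format has been defined, so everything here is hypothetical: the `Option` record, its per-option `critical` flag, the `critical_hop_by_hop` summary bit, and the function names are assumptions for illustration only. Op-Length is the one field taken from the discussion.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass
class Option:
    kind: int       # hypothetical option type code
    critical: bool  # hypothetical per-option "must comply" flag


def forward_with_summary_bit(op_length: int, critical_hop_by_hop: bool) -> str:
    """Node that implements no options: decide from fixed header fields only."""
    if op_length == 0:
        return "forward"  # no options present at all
    # Options are present but never parsed; the summary bit alone decides.
    return "drop" if critical_hop_by_hop else "forward"


def forward_with_ordering(options: Iterable[Option], supported: Set[int]) -> str:
    """Ordering rule: critical options come first, so stop at the first non-critical one."""
    for opt in options:
        if not opt.critical:
            break  # everything after this point is non-critical and can be skipped
        if opt.kind not in supported:
            return "drop"  # unknown critical option: the node must not forward
    return "forward"


print(forward_with_summary_bit(8, True))
print(forward_with_ordering([Option(1, True), Option(2, False)], supported=set()))
```

Under the ordering rule, a node that supports no options still has to look at the first option to learn whether it is critical, which is the cost the summary bits avoid.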
-- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From riw at cisco.com Thu Dec 6 15:07:53 2007 From: riw at cisco.com (Russ White) Date: Thu, 06 Dec 2007 18:07:53 -0500 Subject: [rbridge] Orphaned endnodes with partitioned VLANs on a cloud In-Reply-To: <4758704E.70604@sun.com> References: <941D5DCD8C42014FAF70FB7424686DCF020EE8A1@eusrcmw721.eamcs.ericsson.se> <475617A0.4010302@sun.com> <4758704E.70604@sun.com> Message-ID: <475880C9.8060805@cisco.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 > I strongly think we should have just a single DRB for a link. I'd prefer > that DRB be the only one that forwards data to/from the link, but > I don't think it's that complicated or unsafe to have the DRB data > packet forwarders per VLAN in order to do load splitting. I'd prefer to let the user configure the broadcast links in the middle the way they want to.... :-) I'd prefer to have a DRB per vlan, rather than playing with different schemes to have a DRB per link, no matter how many vlans you might happen to have. There are three situations where you might have multiple parallel vlans across a link: 1. It's a trunk. Well, that's a special case. 2. It's 4000 parallel transit links. Please send this person to the nuthouse. I don't want them designing any network my data runs on. 3. It's 4000 parallel host attachment points. In this case, they don't need to be transit links, and hence, they don't need to be carrying any IS-IS hellos at all. This is why most all implementations have "passive interfaces." > If indeed there are 4000 VLANs on the link, that does mean that there > will be 4000 Hellos, but not multiplied by the number of RBridges. > All RBridges will send Hellos on the single common VLAN specified by the > DRB, plus whoever is VLAN x forwarder, for each value of x, > will send an additional Hello tagged with x. 
I don't think the DRB should specify the "hello vlan" at all, honestly.... There are a lot of reasons that I think we should do a hello per vlan, and leave it at that. I'll be glad to give some examples, but they are long, and I should really send them in three separate emails, probably, so people can try and understand them all really. :-) Russ - -- riw at cisco.com CCIE <>< Grace Alone -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHWIDJER27sUhU9OQRAv2pAKCyREqUNSju1VwvM5shRrKYGLjA0gCfbWWk BwcdKZ65gdXMmF+00nZIv9Y= =m5b7 -----END PGP SIGNATURE----- From james.d.carlson at Sun.COM Thu Dec 6 15:01:06 2007 From: james.d.carlson at Sun.COM (James Carlson) Date: Thu, 6 Dec 2007 18:01:06 -0500 Subject: [rbridge] Critical bits for options In-Reply-To: <47587A3C.3090208@sun.com> References: <475614D5.9000207@sun.com> <18262.45631.542916.231584@gargle.gargle.HOWL> <4756DEE8.3030504@isi.edu> <47583EAE.8000907@cisco.com> <18264.19311.851212.263849@gargle.gargle.HOWL> <47587A3C.3090208@sun.com> Message-ID: <18264.32562.684878.142741@gargle.gargle.HOWL> Radia Perlman writes: > I think the possibilities are [a...d] e) Specify that, although the option format is not yet defined, all options are assumed to be non-critical. All implementations are expected to forward properly regardless of Op-Length. This allows us to define the option format later in order to introduce non-critical options without disturbing any existing implementations that have no options. If we ever need critical options in the future, then we bump the V (version) field, and define the option format at the same time. This creates a flag day for the old implementations. But if we're creating "critical" options that all implementations must have, then there's a darned good reason for it, and those old machines need to be upgraded or retired anyway. 
I'd still prefer to see an option format spec, but the solution above looks slightly better to me than adding summary flags, which I really don't like.

--
James Carlson, Solaris Networking
Sun Microsystems / 35 Network Drive        71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677

From riw at cisco.com Thu Dec 6 15:20:23 2007
From: riw at cisco.com (Russ White)
Date: Thu, 06 Dec 2007 18:20:23 -0500
Subject: [rbridge] Orphaned endnodes with partitioned VLANs on a cloud (Example 1)
Message-ID: <475883B7.3090208@cisco.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Suppose we have a simple network, where A, B, and C are all connected to a single link, a broadcast of some type:

A--+
   |
   +--B
   |
C--+

Now, assume all three are members of VLANs 1, 2, and 3 on this link. All three are properly configured to use VLAN 1 as the rbridge VLAN. A is designated as the DRB.

Now, things are running along fine, but, for some reason, A malfunctions, or some such, and can no longer send multicast packets on VLAN 1.

The result is that B and C detect they've lost their connection to A, but they won't elect a new DRB at all. Why? Because they are still receiving hellos with A claiming to be the DRB, and with A's original priority and ID, on the other two VLANs.

Now the link is completely out of the SPF tree, and not used at all--because of a partial failure on the DRB.

This is broken and unexpected behavior. The IS-IS hellos are there to find per link bidirectional reachability. When you take that away, and assume reachability on one logical link because you have reachability on another logical link, you wind up in a mess that's harder to fix than the original problem you were trying to fix.

:-)

Russ

- --
riw at cisco.com CCIE <>< Grace Alone

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHWIO3ER27sUhU9OQRAg9CAJ9/HpqRa2601eos5cSV83lDq8w6vgCdExix
yeLuu7HePC3NjcDtCKypvWw=
=wLRQ
-----END PGP SIGNATURE-----

From riw at cisco.com Thu Dec 6 15:28:24 2007
From: riw at cisco.com (Russ White)
Date: Thu, 06 Dec 2007 18:28:24 -0500
Subject: [rbridge] Orphaned endnodes with partitioned VLANs on a cloud (Example 2)
In-Reply-To: <475883B7.3090208@cisco.com>
References: <475883B7.3090208@cisco.com>
Message-ID: <47588598.6040305@cisco.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Suppose you have a simple network where A, B, and C are on the same broadcast segment, but there are several possible VLANs on which rbridge traffic could run:

A--+
   |
   +--B
   |
C--+

Now, A is able to talk on VLANs 1, 2, and 3, while B and C are able to talk on VLANs 2, 3, and 4. The "correct" VLAN for rbridge traffic to flow across is 2, which A and C are configured for. B, however, is configured to pass rbridge traffic across VLAN 4, which is a misconfiguration.

Suppose C is chosen as the DRB in this case. It advertises VLAN 2 as the correct rbridge VLAN, and hence, B's misconfiguration is "covered up." The system works as designed.

However, when C fails, B is chosen as the DRB on this link. Since B is misconfigured, the link now falls off the network.... Hmmm.... This isn't what we were after, was it?

With a large number of routers on the link, if the rbridge which happens to be elected DRB is misconfigured, the entire link is tossed for transit use, rather than just taking those misconfigured rbridges off the link. That's an unexpected, and bad, side effect.

The problem here is that you're assuming that while all the other nodes might be misconfigured, the node selected as the DRB won't be misconfigured. That's a rather odd assumption--especially since the DRB is elected, not manually configured.
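
Example 2 can be replayed with a small Python sketch. The VLAN memberships and configured Hello VLANs come from the example; the priorities and the "highest (priority, name) wins" election rule are assumptions chosen only so that C wins first and B wins after C fails, and the names `RBridge`, `elect_drb` and `hello_reachable` are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class RBridge:
    name: str
    priority: int          # assumed election priority
    vlans: FrozenSet[int]  # VLANs this RBridge can talk on
    hello_vlan: int        # VLAN it would designate for Hellos if elected DRB


def elect_drb(bridges: List[RBridge]) -> RBridge:
    # Assumed rule: highest priority wins, name breaks ties.
    return max(bridges, key=lambda b: (b.priority, b.name))


def hello_reachable(bridges: List[RBridge]) -> List[str]:
    """Names of the RBridges that can exchange Hellos on the DRB's chosen VLAN."""
    drb = elect_drb(bridges)
    return sorted(b.name for b in bridges if drb.hello_vlan in b.vlans)


a = RBridge("A", 1, frozenset({1, 2, 3}), hello_vlan=2)
b = RBridge("B", 2, frozenset({2, 3, 4}), hello_vlan=4)  # the misconfigured one
c = RBridge("C", 3, frozenset({2, 3, 4}), hello_vlan=2)

print(hello_reachable([a, b, c]))  # C is DRB and designates VLAN 2
print(hello_reachable([a, b]))     # C has failed; B is DRB and designates VLAN 4
```

With C present, all three RBridges share the Hello VLAN; once B takes over, A can no longer take part, which is the "link falls off the network" outcome described above.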
Most folks have to go look when you ask them which router on their network is the DIS, I don't expect it will be much different with rbridges. There's at least one more failure mode in this scheme, I'll try and cover it later. :-) Russ - -- riw at cisco.com CCIE <>< Grace Alone -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHWIWYER27sUhU9OQRAm4XAJ94WqW55xm8eH4si5IO2599YbkNegCgsBZs 4XbzeaK34bje0Vvjdm1Vk7Q= =YqzR -----END PGP SIGNATURE----- From anoop at brocade.com Thu Dec 6 18:32:37 2007 From: anoop at brocade.com (Anoop Ghanwani) Date: Thu, 6 Dec 2007 18:32:37 -0800 Subject: [rbridge] Orphaned endnodes with partitioned VLANs on a cloud In-Reply-To: <941D5DCD8C42014FAF70FB7424686DCF0211FC57@eusrcmw721.eamcs.ericsson.se> References: <941D5DCD8C42014FAF70FB7424686DCF020EE8A1@eusrcmw721.eamcs.ericsson.se> <475617A0.4010302@sun.com> <4C94DE2070B172459E4F1EE14BD2364EB4EF13@HQ-EXCH-5.corp.brocade.com> <941D5DCD8C42014FAF70FB7424686DCF0211FC57@eusrcmw721.eamcs.ericsson.se> Message-ID: <4C94DE2070B172459E4F1EE14BD2364EBC8DC6@HQ-EXCH-5.corp.brocade.com> (There was some problem with the outgoing alias of my email and as a result my posts haven't been making it to the list; just to private parties listed in the recipient list. Hopefully that should be fixed at the time I send this message.) Hi Eric, I can't fully remember the example. Would it be possible for you to put the VLAN memberships on the various ports in your figure? That would help me understand the picture better. (I know I may be asking for much but please be patient with me because I'd like to address all your concerns.) It is possible to misconfigure bridged LANs so that VLANs end up getting partitioned. The automated way to avoid that is using MVRP. 

Anoop

> -----Original Message-----
> From: Eric Gray [mailto:eric.gray at ericsson.com]
> Sent: Wednesday, December 05, 2007 2:25 PM
> To: Anoop Ghanwani; Radia Perlman
> Cc: Developing a hybrid router/bridge.
> Subject: RE: [rbridge] Orphaned endnodes with partitioned
> VLANs on a cloud
>
> Anoop/Radia,
>
> This is not exactly the note on which we left off in
> the off-line discussion. As I recall (isn't E-Mail great!),
> I replied to this with the following:
>
> ============================================================
> If the RBridges were inserted to replace 802.1Q
> bridges, and each was configured as I've described for access
> to both VLAN A and B, then the bridge failure I described
> would simply result in a different spanning tree for VLAN A.
> VLAN A would not be broken, it would simply be (potentially)
> less optimal.
> ============================================================
>
> This was specifically in response to your comment (to
> the effect that this wouldn't have worked in an 802.1Q LAN -
> so it would have been a misconfiguration there as well).
>
> In response to the above observation you said that you
> agreed that this would not have been a misconfiguration error
> in that case. Actually, what you said exactly was "Yes, you
> are right."
>
> More on this in a separate mail message response to Radia...
>
> Thanks!
>
> --
> Eric Gray
> Principal Engineer
> Ericsson
>
> > -----Original Message-----
> > From: Anoop Ghanwani [mailto:aghanwan at brocade.com]
> > Sent: Tuesday, December 04, 2007 11:27 PM
> > To: Radia Perlman; Eric Gray
> > Cc: Developing a hybrid router/bridge.
> > Subject: RE: [rbridge] Orphaned endnodes with partitioned > VLANs on a > > cloud > > Importance: High > > > > > > Radia, > > > > This is what I told Eric during the offline discussion as > well - that > > this is a misconfiguration and as long as bad things aren't > happening, > > it's up to the administrator to configure things so that they work > > correctly when such failures happen. > > > > It is no worse than what can happen in a misconfigured bridged > > network. > > > > Anoop > > > > > -----Original Message----- > > > From: rbridge-bounces at postel.org > > > [mailto:rbridge-bounces at postel.org] On Behalf Of Radia Perlman > > > Sent: Tuesday, December 04, 2007 7:15 PM > > > To: Eric Gray > > > Cc: Developing a hybrid router/bridge. > > > Subject: [rbridge] Orphaned endnodes with partitioned VLANs > > on a cloud > > > > > > Putting in a subject line since Eric didn't. > > > > > > To restate Eric's concern -- it is possible that VLAN A might be > > > partitioned on a link, and since the DRB selects only a single > > > VLAN-A-forwarder, some VLAN A endnodes on the cloud might get > > > orphaned (since they are on the other side of the > partition from the > > > appointed VLAN A forwarder on the link). > > > > > > And that is correct. > > > > > > So, what is the alternative? Everything is an engineering > tradeoff. > > > We could run DRB elections on every possible VLAN, and with a > > > partitioned VLAN, wind up with multiple DRBs for that VLAN. > > > If there were n VLANs, we'd wind up with n DRBs (on