[rbridge] long-awaited review comments on draft-ietf-trill-rbridge-arch-05
James Carlson
james.d.carlson at sun.com
Wed May 14 14:21:34 PDT 2008
Here are my long-awaited review comments on the TRILL architecture
document. I intentionally read through it as someone who wanted an
introduction to RBridges -- not someone who already knows the contents
of the protocol draft, or who has spent too many hours staring at IEEE
documents. Many of the things I found were related to that.
In the notes below, individual notes are separated by a single "-" on
a line. The notes are indented two spaces. The final section has
minor editorial nits.
-
How do STP and RBridges interact? We need to make it clear fairly
early on that neither regular bridges nor RBridges will forward STP
messages, but that regular bridges will forward TRILL IS-IS traffic
while RBridges will not. Thus the expected model is that:
- Directly connected STP-speakers see each other, and do tree
computation as usual. Those separated by RBridges are effectively
on different networks.
- RBridges don't see any of the STP-only speakers as part of the
topology, and thus consider any sequence of regular bridges to be
one hop.
This is eventually described in section 5.5, but that's quite a ways
down in the document.
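
As a rough illustration of the model above, the sketch below treats the campus as a graph. It derives the RBridge adjacencies (any run of regular bridges collapses to one hop) and the separate STP domains (regular bridges split wherever an RBridge sits between them). The function names, the node names, and the point-to-point link model are all assumptions made for the example; none of them come from the draft.

```python
from collections import deque
from typing import Dict, List, Set


def rbridge_neighbors(links: Dict[str, Set[str]], rbridges: Set[str]) -> Dict[str, Set[str]]:
    """RBridge view: regular bridges relay TRILL IS-IS, RBridges terminate it."""
    neighbors: Dict[str, Set[str]] = {rb: set() for rb in rbridges}
    for rb in rbridges:
        seen = {rb}
        queue = deque([rb])
        while queue:
            node = queue.popleft()
            for peer in links.get(node, ()):
                if peer in seen:
                    continue
                seen.add(peer)
                if peer in rbridges:
                    neighbors[rb].add(peer)  # reached an RBridge: stop here
                else:
                    queue.append(peer)       # regular bridge: keep walking
    return neighbors


def stp_domains(links: Dict[str, Set[str]], rbridges: Set[str]) -> List[Set[str]]:
    """STP view: bridges separated by an RBridge never see each other's BPDUs."""
    domains: List[Set[str]] = []
    seen: Set[str] = set()
    for start in sorted(links):
        if start in rbridges or start in seen:
            continue
        domain: Set[str] = set()
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            domain.add(node)
            for peer in links[node]:
                if peer not in rbridges and peer not in seen:
                    seen.add(peer)
                    queue.append(peer)
        domains.append(domain)
    return domains


# Hypothetical campus: RB1 - B1 - B2 - RB2 - RB3 - B3
links = {
    "RB1": {"B1"},
    "B1": {"RB1", "B2"},
    "B2": {"B1", "RB2"},
    "RB2": {"B2", "RB3"},
    "RB3": {"RB2", "B3"},
    "B3": {"RB3"},
}
rbridges = {"RB1", "RB2", "RB3"}
neighbors = rbridge_neighbors(links, rbridges)
domains = stp_domains(links, rbridges)
```

In this example RB1 and RB2 come out as direct neighbors even though two regular bridges sit between them, while B3 ends up in its own STP domain.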
-
Section 2.2 page 11:
The term "R-tree" is defined, but then never used again.
-
3.2.1 doesn't give enough detail about the nature of the unicast
forwarding database. There must be entries of at least these forms:
{ Destination MAC } -> { TRILL egress address }
{ TRILL address } -> { output link, MAC next hop }
The first represents ingress behavior, and the second is TRILL
transit. It's possible to compose together the first entry with the
second, avoiding a double look-up, so that the first entry looks
like this:
{ Destination MAC } -> { TRILL egress address, output link, MAC next hop }
But the second type of entry must exist for TRILL forwarding within
a campus. It's also possible to optimize the case where the egress
is the local system and thus normal bridge forwarding is needed.
That case looks like:
{ Destination MAC } -> { output link }
However, a system must always recognize its TRILL address and use
that to select an action of decapsulation followed by normal
bridging behavior, which means look-up based on the inner MAC header
to find a local entry of the above form.
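
To make the entry forms above concrete, here is a minimal sketch of the look-ups, assuming that a nickname stands in for the "TRILL address". The class and method names, the "flood" result for an unknown destination, and the "drop" result for an unknown nickname are assumptions for illustration only. The composed single look-up entry is left out; the sketch does the double look-up.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Transit:
    link: str          # output link
    next_hop_mac: str  # MAC address of the next-hop RBridge


class Rbridge:
    """Hypothetical unicast forwarding database for one RBridge."""

    def __init__(self, nickname: str) -> None:
        self.nickname = nickname
        self.ingress: Dict[str, str] = {}      # destination MAC -> egress nickname
        self.transit: Dict[str, Transit] = {}  # nickname -> output link, next hop
        self.local: Dict[str, str] = {}        # destination MAC -> output link

    def forward_native(self, dst_mac: str) -> Tuple:
        """Unencapsulated frame arriving from an end station."""
        if dst_mac in self.local:
            # Egress is the local system: normal bridge forwarding.
            return ("bridge", self.local[dst_mac])
        egress = self.ingress.get(dst_mac)
        if egress is None or egress not in self.transit:
            return ("flood",)  # assumption: unknown destinations are flooded
        hop = self.transit[egress]
        return ("encapsulate", egress, hop.link, hop.next_hop_mac)

    def forward_trill(self, egress: str, inner_dst_mac: str) -> Tuple:
        """TRILL-encapsulated frame arriving from another RBridge."""
        if egress == self.nickname:
            # Our own address: decapsulate, then look up the inner MAC header.
            if inner_dst_mac in self.local:
                return ("decapsulate", self.local[inner_dst_mac])
            return ("decapsulate-flood",)  # assumption, as above
        hop = self.transit.get(egress)
        if hop is None:
            return ("drop",)  # assumption: no route to that nickname
        return ("transit", hop.link, hop.next_hop_mac)


# Hypothetical tables for an RBridge with nickname RB1.
rb = Rbridge("RB1")
rb.local["00:00:00:00:00:0a"] = "port1"
rb.ingress["00:00:00:00:00:0b"] = "RB3"
rb.transit["RB3"] = Transit("port2", "00:00:00:00:01:02")

to_remote = rb.forward_native("00:00:00:00:00:0b")
to_local = rb.forward_native("00:00:00:00:00:0a")
for_us = rb.forward_trill("RB1", "00:00:00:00:00:0a")
passing = rb.forward_trill("RB3", "00:00:00:00:00:0b")
```

The point of the sketch is the last requirement in the note: the check against the system's own nickname has to come first on the TRILL path, and only then does the inner MAC look-up against the local table apply.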
-
Section 3.2.2., on page 15, the term "Egress RBridge" is defined for
the multi-destination case in part with this text:
o Egress RBridge - an RBridge that is the tail end of a path
corresponding to a specific Multi-destination TRILL
Forwarding Database entry. All RBridges within a TRILL Campus
I think this gets a bit confusing, because there are also "Egress
RBridges" in the unicast case, and using the same formal term for
two potentially different things seems like a mistake. As
alternatives, I suggest:
- Change "Egress RBridge" into a role that an RBridge plays in the
network, and define it in terms of the responsibilities of that
role outside of this section. The section on multi-destination
traffic can then clarify how a node 'knows' it must play that
role. (For unicast, it's easy. The nickname in the destination
field is *your* nickname.)
- Create a new, distinct term that encompasses the specific behavior
and role of a multi-destination egress.
-
Section 3.2.2., on page 16, it says:
Multi-destination TRILL Forwarding Database entries may also
include Multicast-Group Address specific information relative to
each egress RBridge that is a member of a given well-known
multicast group, to allow scoping of multicast forwarding by
multicast group.
Why are the words "well-known" used here? The point of well-known
group addresses is that the handling is already defined --
membership isn't really needed. Shouldn't this say "of a given
multicast group"? (If not, then what exactly is the significance of
limiting these entries to *just* those for well-known group
addresses?)
-
Section 4.1, page 17:
At an architectural level, it is sufficient to note that every
end station attached to a TRILL Campus should have a primary
point of attachment to the TRILL Campus, as might be defined
(for example) by a Designated RBridge. Furthermore, if it is
I read that several times, and then had to refer to other sections
before the actual meaning of this text became clear. "Primary point
of attachment" doesn't mean to me what it must have meant to the
author. When I first read it, I thought it was a wire or subnet.
Then I started thinking in terms of DLPI PPAs. Then I got _really_
confused. ;-}
The apparent meaning of this text is that, for each end station on
each VLAN, there must be at most one RBridge that acts as the TRILL
encapsulation/decapsulation gateway when talking to other nodes in
the rest of the campus. And one way to do that is to have a
per-VLAN Designated RBridge.
There's no actual "attachment" of any sort.
Unfortunately, the text takes several unclear paragraphs to say
that. It seems that part of the reason it's so unclear is that the
document is going far out of its way in order to be "fair" to
possible proposals other than having a Designated RBridge.
Perhaps we could even do per-end-station elections.
I think the text should be shortened up considerably and clarified,
because this point is effectively drowned out by too many words.
(This comment applies to similarly affected sections, such as 5.2,
which seems to be crawling with degenerates. ;-})
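
The "at most one gateway per end station per VLAN" reading above can be sketched as a per-VLAN election among the RBridges that can hear each other on a link. The election rule used here (highest priority, then lowest nickname) is purely an assumption for the example; the architecture text quoted above does not specify one, and all of the names are hypothetical.

```python
from typing import Dict, List, NamedTuple, Set


class Candidate(NamedTuple):
    nickname: str
    priority: int    # hypothetical: higher value wins
    vlans: Set[int]  # VLANs enabled on this RBridge's port


def elect_designated(candidates: List[Candidate]) -> Dict[int, str]:
    """Return {vlan: nickname} with exactly one RBridge per VLAN in use."""
    designated: Dict[int, str] = {}
    for vlan in sorted({v for c in candidates for v in c.vlans}):
        eligible = [c for c in candidates if vlan in c.vlans]
        # Hypothetical tie-break: highest priority, then lowest nickname.
        winner = min(eligible, key=lambda c: (-c.priority, c.nickname))
        designated[vlan] = winner.nickname
    return designated


def may_encapsulate(nickname: str, vlan: int, designated: Dict[int, str]) -> bool:
    """Only the Designated RBridge encapsulates or decapsulates on a VLAN."""
    return designated.get(vlan) == nickname


# Two RBridges on the same link, with overlapping VLANs.
candidates = [
    Candidate("RB1", priority=10, vlans={1, 2}),
    Candidate("RB2", priority=20, vlans={2, 3}),
]
designated = elect_designated(candidates)
```

Whatever rule is used, every RBridge on the link has to reach the same answer, which is why the RBridges must be able to exchange TRILL traffic across whatever sits between them.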
-
In general, I think section 4.1 worries itself too much about the
definitions of bridges (802 references and such) and far too little
about the architectural implications for RBridges.
We (those creating TRILL-based RBridges) don't care about bridges.
We should not have to. We shouldn't have to specify that bridges
need to "be consistent" with 802.1D or 802.1Q -- they either are or
aren't, and that's the problem of the bridge vendor.
I note that the document didn't spend any time talking about the
standards for repeaters. Those have about as much bearing on the
matter here.
The *important* part is that any equipment forming a non-RBridge L2
data path between RBridge ports must allow TRILL communication
between those ports, so that RBridges can safely elect or determine
a single Designated RBridge. It doesn't matter how that path is
formed (802.1D is one possibility), just that it exists.
-
Section 5.2, page 21:
As described previously, RBridge learning is similar to typical
bridge learning - i.e. - all RBridges listen promiscuously to L2
Frames on each local LAN and acquire end station location
information associated with source MAC addresses in L2 frames
they observe.
All egress RBridges should also learn from the L2 frames that they
decapsulate. The two cases are distinct and important parts of
learning:
- The ingress learns on which local port the end station exists,
just like any ordinary bridge would do.
- The egress learns which nickname is the remote encapsulator (and
thus per section 4.1, decapsulator) for that end station. This
part is unlike an ordinary bridge.
This latter bit is crucial. It's what requires the encapsulator
(which fills in a source nickname) and decapsulator (which will be
the target of return traffic) to be the same node, or at least
requires the encapsulator to fill in the decapsulator's nickname as
the "sender."
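
The two learning cases can be sketched as follows. The class, its method names, and the station and nickname values are hypothetical; the sketch only shows why return traffic ends up addressed to whatever nickname the encapsulator wrote into the source field.

```python
from typing import Dict, Optional


class LearningRbridge:
    """Hypothetical learning tables for one RBridge."""

    def __init__(self, nickname: str) -> None:
        self.nickname = nickname
        self.local_port: Dict[str, str] = {}  # end-station MAC -> local port
        self.remote_rb: Dict[str, str] = {}   # end-station MAC -> remote nickname

    def learn_native(self, src_mac: str, port: str) -> None:
        """Ingress case: learn the local port, as an ordinary bridge would."""
        self.local_port[src_mac] = port

    def learn_decapsulated(self, ingress_nickname: str, inner_src_mac: str) -> None:
        """Egress case: learn which nickname encapsulated for this station."""
        self.remote_rb[inner_src_mac] = ingress_nickname

    def egress_for(self, dst_mac: str) -> Optional[str]:
        """Nickname to address return traffic to, if anything is known."""
        if dst_mac in self.local_port:
            return self.nickname
        return self.remote_rb.get(dst_mac)


STATION_A = "00:00:00:00:00:0a"  # attached to RB1
STATION_B = "00:00:00:00:00:0b"  # attached to RB2

rb1 = LearningRbridge("RB1")
rb2 = LearningRbridge("RB2")

# A sends to B: RB1 is the ingress, RB2 decapsulates.
rb1.learn_native(STATION_A, "port1")
rb2.learn_decapsulated("RB1", STATION_A)

# B replies: RB2 now knows where to send traffic for A.
rb2.learn_native(STATION_B, "port4")
reply_egress = rb2.egress_for(STATION_A)
```

If some other node had encapsulated A's frame under its own nickname, RB2 would have learned that nickname instead, and the reply would be sent to a node that may not be A's decapsulator.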
-
Section 5.2, page 22:
The trade-off is between the complexity associated with flooding
data verses the complexity associated with flooding reachability
information.
This duplicates the information already in 3.2.3, and could be
trimmed down.
-
Section 5.2, page 23:
Note that an egress RBridge will - in most case - be the RBridge
determined to be the primary point of attachment for a
destination end station on the local link or VLAN accessed via
its egress interface(s). Exceptions to this might exist under
circumstances in which use of distinct RBridges for ingress and
I think this digression should just be removed. Not only is it in
conflict with the intent of section 4.1, but (like the whole "point
of attachment" thing) it's a point of unnecessary confusion.
If it becomes feasible for some RBridge implementation strategy to
allow for distinct ingress/egress nodes in some cases, then I think
it's that other document's problem to describe how the deviation
that document describes is consistent with the overall story,
including (particularly) the egress node's learning capability.
By this same token, any implementation could be arbitrarily strange
in areas not specified by the architectural document. For instance,
someone could implement all of this with ATM and map nicknames into
VCIs. It's not really possible (or even useful) to describe all the
ways one could go strange.
The architecture document should describe how the system is intended
to operate and what the parts should do. I don't see a reason to
insert loopholes that allow for unspecified future variations. At
best, it's a distraction, because we don't know how to make that
work. (And, in fact, I suspect it does _NOT_ work in any case,
because it breaks learning.)
-
Sections 5.3.2-2 and 5.3.2-3, pages 28 and 29: there's a lot of
duplication here.
-
I'm surprised that section 5.4 doesn't discuss why IS-IS was chosen,
or what special things need to be done with it in order to make it
work here (such as setting a fixed "area" value).
-
Section 5.5, page 30:
o Transparent Participation (Transparent-STP)
o Active Participation (Participate-STP)
o Blocking Participation (Block-STP)
I don't see that these terms are defined anywhere. It seems
somewhat obvious what they mean -- *if* you already understand
RBridges -- but they're likely to confuse.
This also looks like material that's in the same category as the 5.2
advice about separate ingress/egress. It's possible that someone
could define a "new" version of RBridges that either forwards STP
messages (!) or has each RBridge acting as an STP node in a single
network (!!), but neither of those is really the solution we're
trying to describe. It's not part of the architecture.
-
Section 5.5.1, page 32:
Finally, note that there is a chicken-and-egg problem associated
with RBridge participation in STP where RBridges may themselves
be connected by spanning trees.
I'm not positive that this problem actually occurs. If an RBridge
runs STP, the port will be blocked until STP finishes its usual set
of listening/learning/forwarding timers, so the RBridge network
won't see or use the link either.
STP is the egg, and TRILL is the chicken. I think.
- Editorial nits
Section 1, page 4:
The principal objectives of this architecture is to provide an
                                              ^^ are
allow some level of optimization support to be provided in
compliant implementations, in as many case as possible.
                                      ^^^^ cases
Section 3.2.1, page 14:
for each VLAN, if this is supported by configuration. Note that
scaling concerns may dictate otherwise, either in specific of
                                                  ^^^^^^^^ ?
RBridge protocol specification, or in deployment. The Unicast
Section 3.2.2., page 15:
o Zero or more entries grouped for each root RBridge - keyed by
some root RBridge identifier - used to determine forwarding
of broadcast, multicast, and flooded frames originally
RBridge encapsulated by that ingress within the TRILL Campus.
^^^^^^^ TRILL
Each entry would contain an indication of which single interface
a broadcast, multicast or flooded frame would be forwarded for
(The text suddenly jumps into subjunctive mood rather than
staying in future tense. It's unclear to me why this is so,
but it looks like an error in the text.)
Section 3.2.3, page 16:
The Ingress TRILL Forwarding Database determines how arriving
traffic will be encapsulated, for forwarding toward the egress
                            ^
RBridge, via the TRILL Campus. It becomes configured in much the
       ^
Section 4.6, page 20:
It is the combination of the local MAC desitnation (which is for
                                       ^^^^^^^^^^^
a locally attached RBridge) and the TRILL encapsulation that
--
James Carlson, Solaris Networking <james.d.carlson at sun.com>
Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677