<div dir="ltr">Hi Francois,<div><br></div><div>See inline.<br><br><div class="gmail_quote">On Mon, Jul 7, 2008 at 9:04 PM, Francois Tallet (ftallet) &lt;<a href="mailto:ftallet@cisco.com" target="_blank">ftallet@cisco.com</a>&gt; wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>
<div dir="ltr" align="left"><font face="Arial" color="#0000ff" size="2"><span>Hi Donald,</span></font></div>
<div dir="ltr" align="left"><font face="Arial" color="#0000ff" size="2"><span>Please see inline.</span></font></div>
<blockquote style="padding-left:5px;margin-left:5px;border-left:#000000 2px solid;margin-right:0px">
<div class="gmail_quote"><div>
<blockquote class="gmail_quote" style="padding-left:1ex;margin:0px 0px 0px 0.8ex;border-left:#ccc 1px solid">2.3
page 7:<br><br>There is no forwarding loop in the case described in figure
4. I met with some of the authors of [5] and I'm relatively sad that this
paper was published when the conclusion is based on a bug present in their
simulator. However, RSTP can indeed suffer from the usual count to infinity
issue specific to distance vector protocols that can delay the convergence
by a few seconds.<br></blockquote>
</div><div><div><br>Can you provide some pointer to a discussion of this issue or explain
what is wrong with the description of the problem in that paper? Has a
retraction or counter paper been published? Just saying there was a bug in
their simulator does not explain to me why the problem they describe in text
does not occur.<br></div><font color="#0000ff"><br><span><font face="Arial" size="2">[FT>] I got
confirmation from Prof. Eugene that there was a race
condition in a state machine of their simulator. Don't know of any public
paper. I can certainly show you what is wrong in the convergence described in
the paper if you want. I think you will find an email from Mick
Seaman regarding this in the archive of 802.1. In fact, STP/RSTP/MST make
a lot of effort to ensure there is never a forwarding loop. Again, the count
to infinity problem is real, and the solution proposed in the paper is
relevant.</font></span></font></div></div></blockquote></div></blockquote><div>Yes, that was what I wanted, for you to, as I said, "explain what is wrong with the description of the problem in that paper?" I have no interest in searching through 802.1 mail archives. I'm guessing that the problem with the paper is that the stale information can't cycle for as long as the paper implies because the distance hits a maximum...<br>
<br>I understand that spanning tree protocols try hard to minimize loops and that modern equipment and links are quite reliable, so even transient loops are rare. But, as far as I can tell, the claim you repeat above, that there is "never a forwarding loop" with STP, just isn't true. I really hate arguing based on "authority" rather than the real technical facts, but I have spoken with Mick Seaman and Norm Finn, and they agree that transient loops are possible with loss of a sufficient number of BPDUs or temporarily undetected changes in topology although, as I say, this is obviously a rare occurrence in the real world.</div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><blockquote style="padding-left:5px;margin-left:5px;border-left:#000000 2px solid;margin-right:0px"><div class="gmail_quote">
<div>
<blockquote class="gmail_quote" style="padding-left:1ex;margin:0px 0px 0px 0.8ex;border-left:#ccc 1px solid">I
don't exactly know where the 45 seconds mentioned here come from. With
default timers, a failure is detected in max 20 seconds by STP and
convergence is a matter of 30 seconds. With RSTP/MST, it's 6 seconds to
detect a failure (when the protocol does not see a link going down) and the
convergence is virtually immediate (does not depend on any
timer).</blockquote>
</div><div><div><font face="Arial" size="2"></font><br>I believe that the earliest version of
the spanning tree protocol waited 45 seconds before going into the forwarding
state. If your figure of 30 seconds is correct for 802.1 STP, then I would
think the figure should be adjusted.<br></div><font color="#0000ff"><span><font face="Arial" size="2">[FT>] Currently, a port goes to forwarding in
2xforward_delay, and forward_delay is 15 seconds by
default. </font></span><br></font></div></div></blockquote></div></blockquote><div>Then the number in the draft should probably be changed to 30 seconds. </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div><blockquote style="padding-left:5px;margin-left:5px;border-left:#000000 2px solid;margin-right:0px"><div class="gmail_quote"><div>
<blockquote class="gmail_quote" style="padding-left:1ex;margin:0px 0px 0px 0.8ex;border-left:#ccc 1px solid">...<br><br>2.5
page 8<br>MST arguably provides a max of 65 instances: one CIST and up to 64
MSTIs. I don't understand "one per group of vlans" -> a given vlan is
mapped to a unique instance.</blockquote>
</div><div><div><br>Now that there have been a few comments on this, can you suggest
better wording?<br></div><font color="#0000ff"><span><font face="Arial" size="2">[FT>] The comment was mainly on the 65 instead of 64
(which means that up to 65 different topologies are available). How about
something simple like: added support for multiple spanning trees, up to a
maximum of 65. Each vlan is mapped to one of
those. </font></span></font></div></div></blockquote></div></blockquote><div>OK. Or I suppose it could say something like "one per non-overlapping group of VLANs". </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div><blockquote style="padding-left:5px;margin-left:5px;border-left:#000000 2px solid;margin-right:0px"><div class="gmail_quote"><div> <br></div><div>
<blockquote class="gmail_quote" style="padding-left:1ex;margin:0px 0px 0px 0.8ex;border-left:#ccc 1px solid">3.1
page 9-10:<br>I think that using STP, you are guaranteed that there is no
duplication and out of order frames. Frames in transit could be duplicated
or re-ordered while RSTP/MST converges. I think that if TRILL relies on TTL
to mitigate loops, we'll certainly get more duplication of frames, and more
often.</blockquote>
</div><div><div><br>What are your assumptions? A topology change like a link to a hub
coming up could result in frame duplication until detected.<br></div><font color="#0000ff"><span><font face="Arial" size="2">[FT>] A single hub would not be enough to cause a
problem, but a link coming up between two hubs would definitely. That's
explicitly mentioned several times in 802.1. That's a case that can
be avoided by design. I don't think it is correct to define STP based on this
topology, that is explicitly forbidden in the spec. To push this to the
extreme, I can build on purpose a scenario where BPDUs are filtered
and STP does not converge at all. It would not be honest to say STP
does not converge based on this
example;-)</font></span></font><br></div></div></blockquote></div></blockquote><div> </div><div>Could you provide citations for these multiple explicit mentions in "802.1", and say exactly which 802.1 standard forbids having more than one hub in a bridged LAN, and where? A quick search shows no occurrences of "hub" in 802.1Q and exactly one in 802.1D. That occurrence is in informational Annex K.1, which mentions the undetected appearance of connectivity as one way you could get frame duplication or re-ordering and gives the appearance of a link between two hubs as one example. That annex is not normative text anyway. </div>
<div><br></div><div>I do not agree that there is any equivalence between (1) BPDUs being dropped or garbled by noise or other line glitches, or a link coming up because, for example, an intermittently failing hub or repeater transitions from the failed to the non-failed state, and (2) the deliberate insertion of a BPDU filter. The first is a normal, albeit unlikely, random occurrence due to the imperfections of the real world. The second is a deliberate attempt to cause failure. In any case, I don't think anyone said that "STP does not converge". I merely made the true statement that transient loops are possible with STP.</div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><blockquote style="padding-left:5px;margin-left:5px;border-left:#000000 2px solid;margin-right:0px">
<div class="gmail_quote"><div><br></div><div>
<blockquote class="gmail_quote" style="padding-left:1ex;margin:0px 0px 0px 0.8ex;border-left:#ccc 1px solid">3.3
page 10:<br>A new link coming up should not be introducing any temporary
loop. It's true that you can achieve that by bringing up a link between
devices (like hubs) that are not running STP.</blockquote>
</div><div><div><br>Again, what are your assumptions? If you have three bridges A-B-C and
a source of broadcast or multicast frames is sending them into A and a link
appears between A and C, why don't you have a loop until the link is
noticed?<br></div><font color="#0000ff"><span><font face="Arial" size="2">[FT>] STP is much more than a simple distance
vector protocol. In fact, most of the complexity of STP is precisely to handle
those cases. When the link appears between A and C, at least one port stays
blocked until STP has determined it can safely go to forwarding.</font></span></font></div></div></blockquote></div></blockquote><div>Why do you believe this is true in all cases? Once again, I must ask what your assumptions are. Are you assuming some particular technology for the link that was down and comes up, one that signals the ports at A and C so that, if they were in the forwarding state, they transition out of that state? If you are assuming that, how can you know it to be true for all link technologies?<br>
</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><blockquote style="padding-left:5px;margin-left:5px;border-left:#000000 2px solid;margin-right:0px"><div class="gmail_quote">
<div><font color="#0000ff"><span><font face="Arial" size="2">STP
and RSTP work differently, but the goal is definitely to avoid any
temporary loop, even the shortest. Just to give you an idea,
I once worked on a bug that introduced a bridging loop during
reconvergence in a customer network. This loop lasted less than 10ms but
created a really critical situation over there. I guess not all customers are
equally sensitive to this requirement, but some do really expect that the
network will never introduce a forwarding loop.</font></span><span><font face="Arial" size="2"> So I don't know if the
goal is to ease the requirements for TRILL, but again, it is not correct to
say that a temporary bridging loop is expected or even acceptable in
a bridged network handled by STP.</font></span></font></div></div></blockquote></div></blockquote><div>I don't know what you mean exactly by "expected". And I certainly don't claim loops are common. Pretty much all routing and forwarding protocols have as a goal to minimize the probability and duration of temporary loops but, while they should have as an absolute goal the elimination of persistent loops under reasonable assumptions, I have seen no evidence that it is practical to completely eliminate the possibility of transient loops.</div>
<div><br></div><div>The impression I have is that the problem presented in the paper to which you object can occur but cannot persist for tens of seconds, as their flawed simulation showed; that it involves a count-to-infinity, but infinity is only ~8 in this case; and that, as a result, if RSTP BPDU transmission rate limiting were in effect when this occurred, a transient loop could exist for a lot longer than the 10ms you cite as causing a problem (in fact, earlier in your message you say "few seconds", which can clearly be fatal for some networks).</div>
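<div><br></div><div>To make that bound concrete, here is a minimal sketch under simplifying assumptions that come from neither the paper nor the exact 802.1D state machines: stale information is relayed around a cycle, each relaying bridge adds one to its message age, and the information is discarded once the age reaches the maximum. The function names <code>relays_before_aged_out</code> and <code>circulation_time_s</code>, and both per-relay delays, are hypothetical and chosen only for illustration.</div>
<pre>
```python
def relays_before_aged_out(max_age=20, start_age=0):
    """Count how many times stale information is accepted and relayed.

    Toy model (an assumption, not the exact 802.1D state machines):
    each relaying bridge adds 1 to the message age, and information
    whose age has reached max_age is discarded instead of relayed.
    """
    relays = 0
    for _age in range(start_age, max_age):
        relays += 1  # accepted at this age, passed on one unit older
    return relays


def circulation_time_s(relays, per_relay_delay_s):
    """Rough time the stale information keeps circulating."""
    return relays * per_relay_delay_s


if __name__ == "__main__":
    relays = relays_before_aged_out(max_age=8)
    # Fast relaying versus relaying held back by transmit rate limiting;
    # both per-relay delays are illustrative guesses, not measured values.
    print(relays, circulation_time_s(relays, 0.001), circulation_time_s(relays, 1.0))
```
</pre>
<div>In this toy model the stale information always dies out after a bounded number of relays, so the loop cannot be persistent. How long it lasts depends almost entirely on the per-relay delay, which is why rate limiting matters to the argument.</div>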
<div><br></div><div>As far as I can tell, the very common statement that "there is never a forwarding loop with spanning tree" is, in its unqualified form, not true. On closer examination, I have always found that the speaker was speaking informally and what they really meant was one of the following two true statements: (1) "given reasonable assumptions, there is never a persistent loop with spanning tree" or (2) "there is never a transient loop with spanning tree unless one of the unlikely conditions that would cause such a transient loop occurs".</div>
<div><br></div><div>While the wording of the draft may be overly critical of STP and should probably be adjusted, I really get tired of these sloppy claims of magical perfection for STP.</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div><blockquote style="padding-left:5px;margin-left:5px;border-left:#000000 2px solid;margin-right:0px"><div class="gmail_quote">
<div><font face="Arial" color="#0000ff" size="2"><span>Regards,</span></font></div>
<div><font face="Arial" color="#0000ff" size="2"><span>Francois</span></font></div><div>
<div><span><font face="Arial" size="2"></font></span> </div>
<blockquote class="gmail_quote" style="padding-left:1ex;margin:0px 0px 0px 0.8ex;border-left:#ccc 1px solid">3.4
page 10:<br>"reference not found"<br><br>3.6 page 11:<br>802.1v and 802.1s
(lower case) are amendments to 802.1Q (capital). Mentioning 802.1Q is
probably enough;-)<br><br>Thanks and regards,<br>François<br>...</blockquote>
<div><br>Thanks,<br>Donald </div></div></div><div>=============================<br>Donald
E. Eastlake 3rd +1-508-634-2066 (home)<br>155 Beaver Street<br>Milford, MA 01757 USA<br><a href="mailto:d3e3e3@gmail.com" target="_blank">d3e3e3@gmail.com</a> </div></blockquote></div>
</blockquote></div>Donald<br>=============================<br> Donald E. Eastlake 3rd +1-508-634-2066 (home)<br> 155 Beaver Street<br> Milford, MA 01757 USA<br> <a href="mailto:d3e3e3@gmail.com" target="_blank">d3e3e3@gmail.com</a>
</div>
</div>