[e2e] Some thoughts on WLAN etc., was: Re: RES: Why Buffering?
detlef.bosau at web.de
Sat Jul 4 13:51:35 PDT 2009
Lachlan Andrew wrote:
> There are many concepts of "connection" at different layers of the
> protocol stack which runs on top of a wireless physical layer, and
> many protocols/layers are not connection oriented. However, some are.
> In the e2e context, a "disconnection" of a lower layer roughly means
> "a period of time over which all packets are lost, which extends for
> more than a few average RTTs plus a few hundred milliseconds". That
> is what it means in a phrase like "transport protocols need to handle
> short time disconnections".
Isn't this exactly the problem discussed by Raj Jain and Lixia Zhang in
the late eighties in their works on the weaknesses of timeouts?
I totally agree with you here for fixed networks; indeed, we use
hello packets and the like in protocols such as OSPF. But what about
RTT outliers in wireless networks, like my 80 ms example?
Most likely, the packet underwent a number of retransmissions, and
perhaps one or more of the transmission attempts incurred MAC latencies
as well.
(I did not have the appropriate instruments to see this.)
Was there a "short time disconnection" then?
Certainly not, because the system was busy delivering the packet all
the time.
So the problem is not a "short time disconnection"; the problem is that
timeouts don't work - or at least suffer from some shortcomings. (Which
was discussed in the aforementioned works by Lixia Zhang and Raj Jain.)
Hence we see a "bogus disconnection", or to refer to the wonderful word
used by Randy Katz, Reiner Ludwig, Andrej Gurtov and many others:
"spurious disconnections". Actually, an excessively delayed "hello
reply" is a typical example of a spurious timeout.
Actually, in TCP for example, we don't deal with "short time
disconnections" anyway. We use sufficiently large RTO values, and in
the rare case of an "accepted spurious timeout" we simply do a
retransmission. (If you read the original work by Stephen W. Edge,
you'll agree that the RTO is a kind of confidence interval which
deliberately accepts a certain residual probability of spurious
timeouts - finding a suitable design for this value is a matter of
engineering.)
Actually, the number of spurious timeouts is extremely low - if
detectable at all.
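To make the RTO argument concrete, here is a sketch of the classic
Jacobson/Karels retransmission-timer computation as standardized in
RFC 6298. The constants (1/8, 1/4, K=4, 1-second minimum) are the
RFC's; the class itself is merely an illustration, not any real
stack's implementation:

```python
# Sketch of the classic TCP RTO estimator (Jacobson/Karels, as
# standardized in RFC 6298), with the usual constants alpha = 1/8,
# beta = 1/4, K = 4 and the 1-second minimum RTO. Illustrative only;
# real stacks add clock granularity, Karn's algorithm and backoff.

class RtoEstimator:
    def __init__(self, first_rtt):
        # RFC 6298, section 2.2: initialization from the first sample.
        self.srtt = first_rtt
        self.rttvar = first_rtt / 2.0

    def update(self, rtt):
        # RFC 6298, section 2.3: RTTVAR is updated before SRTT.
        self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - rtt)
        self.srtt = 0.875 * self.srtt + 0.125 * rtt

    @property
    def rto(self):
        # RTO = SRTT + 4 * RTTVAR, clamped to at least 1 second.
        return max(1.0, self.srtt + 4.0 * self.rttvar)

est = RtoEstimator(0.005)      # steady RTTs around 5 ms
for _ in range(20):
    est.update(0.005)

print(est.rto)                 # prints 1.0: the floor dominates
print(0.080 < est.rto)         # prints True: an 80 ms outlier does not expire the timer
```

With smooth 5 ms samples, the 1-second floor dominates the estimate,
so an 80 ms outlier never even comes close to expiring the timer.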
So, the basic strategy of "upper layers" to deal with short time
disconnections, or above-average latencies, is simply not to deal
with them - but to ignore them.
What about a path change? Do we talk about a "short time disconnection"
in TCP when a link on the path fails and the flow is redirected?
We typically don't worry.
To me, the problem is not the existence - or non-existence - of short
time disconnections at all, but the question why we should _explicitly_
deal with a phenomenon that no one worries about.
>> From an end to end point of view, I would like to abandon terms like
>> "channel" and "connection" in a wireless environment.
> We shouldn't throw the baby out with the bathwater. The concept of "a
> period in which most packets are delivered within a (vague) short
> period of time" is useful.
However, the question is how we should deal with a packet which is not
delivered within a certain period of time.
I have to reread the "Freeze TCP" approach which was published a couple
of years ago and which, IIRC, attempted to offer an explicit treatment
for short time disconnections of links.
The problem is, once more, an extremely basic problem well known to all
kinds of science and therefore well discussed in epistemology. It's the
basic problem to find the correct reason for an observed phenomenon.
It's the basic problem of reasoning ex post.
(A pretty well known error in wireless networking, which I made myself,
is to conjecture heavy load from large delays. There are quite some
reasons for large delays or large delay variations in wireless networks,
e.g. varying noise / disturbance. Of course, varying load may lead to
the same phenomenon. But it's simply impossible to infer from a single
observation which of the several possible causes applies.
Another instance of the same problem is the huge amount of loss
differentiation literature, which fails for the same reason: Reasoning
ex post. Actually, this problem should be overcome in high school
already; it's a matter of education. We shouldn't first learn
epistemology at university. We should learn it at school - and
shouldn't attend a university without a solid education in
epistemology at all.
However, I'm quite ashamed when I see how often I made the
aforementioned mistake myself.)
Actually, the outlier from Wednesday was a minor one. The average RTT
was about 5 ms; the outlier was about 80 ms. In HSDPA, a variation of
two or three orders of magnitude is by all means possible.
So, does an outlier indicate a "disconnection" then?
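A back-of-the-envelope check, assuming the RFC 6298 lower bound of
1 second on the RTO - only the 5 ms average and the 80 ms outlier are
from the text; the three-orders-of-magnitude value is illustrative:

```python
# Back-of-the-envelope check on the outlier, assuming the RFC 6298
# lower bound of 1 second on the RTO. Only the 5 ms average and the
# 80 ms outlier are measured values; the rest is illustrative.

avg_rtt = 0.005                  # average RTT: 5 ms
outlier = 0.080                  # observed outlier: 80 ms
rto_floor = 1.0                  # minimum RTO per RFC 6298

print(outlier / avg_rtt)         # prints 16.0: sixteen times the average...
print(outlier < rto_floor)       # prints True: ...yet far below the timer

# A variation of three orders of magnitude, as possible in HSDPA,
# would exceed the floor and hence be indistinguishable, to the
# sender, from a disconnection:
hsdpa_outlier = avg_rtt * 1000   # 5 seconds
print(hsdpa_outlier > rto_floor) # prints True
```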
> If we don't call it a "connection", we'd
> come up with another word for it.
Isn't it sufficient to describe the corruption probability?
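As a minimal sketch of what "describing the corruption probability"
could mean: assuming independent bit errors - which real fading
channels notoriously violate, since errors come in bursts - the
per-packet corruption probability follows directly from the bit error
rate; the BER values below are purely illustrative:

```python
# Minimal sketch of characterizing a wireless link by its corruption
# probability instead of a binary "connected/disconnected". It assumes
# independent bit errors, which real fading channels violate (errors
# are bursty); the BER values are purely illustrative.

def packet_corruption_probability(ber, packet_bits):
    # P(at least one bit error) = 1 - (1 - BER)^n
    return 1.0 - (1.0 - ber) ** packet_bits

bits = 1500 * 8                                  # a 1500-byte packet
good = packet_corruption_probability(1e-7, bits)
bad = packet_corruption_probability(1e-4, bits)
print(round(good, 4))   # roughly 0.0012: almost every packet gets through
print(round(bad, 4))    # roughly 0.6988: most packets are corrupted
```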
Actually, some network operators (NOs) suspend a line when the SNR
drops too low. This is a disconnection "by definition" then.
Admittedly, I have a problem with this attitude. As long as the user
pays for the service, it is up to the user to decide whether a packet
should be sent even if the channel is bad. Of course, the bad channel
will result in bad throughput, and of course the NO will indicate this
to the user. However, the decision whether the line should be suspended
or not is not up to the NO.
> Rather than pointing out weaknesses in current terminology, it may be
> better to propose a concept which better models (dis)connectivity, and
> then do a useful design/calculation which is possible using the new
> concept but was impossible with the old concepts. Without that
> validation, the new concept won't replace the old.
I totally agree. However, I did not propose a new concept, but I asked
Detlef Bosau Galileistraße 30 70565 Stuttgart
phone: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau
ICQ: 566129673 http://email@example.com