[e2e] What's wrong with this picture?

Sun Sep 13 18:58:02 PDT 2009

Lachlan Andrew wrote:
>> When I send you a book and ship the item "best effort delivery", this means:
>> I will take the item to the parcel service. And I don't care for the rest.
>>     
>
> Yep.  I find that an odd use of "best" too...
>   

It's a realistic one.

Next week, I'm going to have an appointment at the employment exchange. 
And they offer "best effort" care for me.
I.e.: I'm aged 46, so I will get no job offers.
When I apply for a job, I'm considered as too old - and I will not get 
an answer.
My diploma is too old, I could not do my job etc. etc. etc.

I'm simply tired of hearing these excuses, while in Stuttgart some "ten 
thousand" (no joke, but what our local government says) IT experts 
are sought.

However, I'm too old, my diploma is too old, my knowledge is too bad 
etc. etc. etc.

That's best effort care - and that's the reason why I'm answering 
your post at half past two in the morning - I cannot sleep. :-(

I apologize for this personal remark, but it's a practical example for 
"best effort care".

>> When each thoroughly designed optimal solution for a problem would at least
>> be half as successful as VJ's "hack", the world would be a better one.
>> In my opinion, the congavoid paper is not a "quick hack" but simply a stroke
>> of genius.
>>     
>
> Absolutely!  It was brilliant to realise that loss was telling us
> something about the network.  He also proposed a very robust response
> to it, over a wide range of conditions.
>   

Actually, the network tells us quite a few things. Of course, I'm too 
old and my diploma is too old, so I'm too stupid to listen to this.... 
(I cannot ignore my bitterness here....) and that was the reason for the 
proposal I wrote some days ago.

E.g. the network tells us something about possible throughput, network 
congestion, network conditions, even when packets are not delivered 
successfully. Or, to be precise, one can make the network tell us these 
things.

So, we can of course consider TCP modifications which make TCP run 
better over lossy networks.

And perhaps, we can discuss TCP flavours for particular network 
conditions. This must be done carefully - however, it may be reasonable 
in some cases.

> The only reason I call it a hack is to counter the view that it is a
> carefully engineered solution,

What are you missing there for a "carefully engineered" solution?

Would the congavoid algorithm be more compelling, if Van had added ten 
pages with formulae and greek symbols to his work? ;-)

The idea was in fact shamelessly simple :-) The network tells us that it 
cannot carry the amount of data we put into it - and we simply halve the 
amount of data that we put into the network.

The bucket is not large enough to hold all the water we pour into it - 
so we simply pour less water into the poor thing ;-)
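
To show how little is needed, here is a minimal sketch in Python of the 
"halve on loss, otherwise probe a bit more" idea. The function names, the 
window unit (segments) and the toy path with a fixed capacity are my own 
assumptions for illustration; this is not VJ's code, and it leaves out 
slow start, timeouts and everything else a real TCP does.

```python
def aimd_step(cwnd, loss_detected, increase=1.0, decrease=0.5, min_cwnd=1.0):
    """One round-trip update of a congestion window (in segments)."""
    if loss_detected:
        # The network could not carry the load: halve what we put in.
        return max(min_cwnd, cwnd * decrease)
    # No complaint from the network: probe gently for more capacity.
    return cwnd + increase


def simulate(capacity=20.0, rounds=30, cwnd=1.0):
    """Toy path (an assumption) that drops whenever cwnd exceeds capacity."""
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        cwnd = aimd_step(cwnd, loss_detected=cwnd > capacity)
    return trace
```

Calling simulate() yields the familiar sawtooth: the window climbs one 
segment per round until the bucket overflows, is halved, and climbs again.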

Of course, Van could have added one page in Greek, one page in Latin and 
one page in Hebrew to his work - however, this would not make a 
substantial difference.

Quite the contrary, the work would be worse then, because it's extremely 
hard to find an analytical, or at least any formal, description of what 
is happening here.  And that can even be seen in Kelly's paper "Charging 
and Rate Control for Elastic Traffic" from 1997, which is often referred 
to in this context. It is extremely difficult to apply this work to 
wireless networks.

The difficulty is in the description of
- throughput,
- rate,
- link capacity,
- service time
in wireless networks. It's extremely hard to find analytical, 
_quantitative_ descriptions for that. So, in that case, it's simply a 
great idea to have _qualitative_ approaches, which offer a simple 
solution when the network, and I'm using your words here, cries for help :-)

>  and that networks should be designed to
> show a particular undesirable symptom of congestion just because "TCP
> needs it".
>
>   

I don't agree here.

We do not intentionally introduce packet drops because we need them for 
TCP. It's quite the other way round: because there is, generally speaking, 
no scheduling and no central congestion control (please refer to 
Keshav's PhD thesis for an alternative approach, which is definitely 
_not_ the best effort approach IIRC), we encounter drops.

So, in the beginning, there were the drops. And TCP was introduced later.

And now, as you say, we've seen that a network may cry for help by 
dropping packets - and we made a virtue of necessity then and used these 
drops for congestion control.

And drops are outstandingly well suited for this purpose! There is no 
concern that "drop signaling" may get lost - because a loss cannot get 
lost - so we have a reliable signaling mechanism without any further 
ado. We can simply use it.

>   
>>> It is remarkable that it has worked as well as it has, but why do we
>>> have to keep designing networks to support the hack?
>>>       
>> First: Because it works.
>> Second: Up to know, nothing better is known.
>>     
>
>   
I forgot a third reason: We do not even design networks in such a way that 
they produce drops. The truth is: Packets are dropped - and we can't 
help it! (Except by use of central scheduling and rate allocation, see 
Keshav's work.)

> It works, except on links with large buffers (which exist, whether or
> not they "should") or for large BDP flows (which exist, and will
> become more widespread), or for links with non-congestion losses
> (which exist, and will continue to without "heroic" ARQ).
>
>   

And that's the problem of "one size fits all". You're right that in 
LFNs the startup behaviour may be nasty. And there are quite a few 
approaches around, Westwood and FAST, to mention only two. Most of 
them are more or less rate controlled. The problem is to obtain the 
correct rate for a - typically - unknown path. As soon as you employ 
some kind of probing for this purpose, the problems are similar to the 
ones you'll encounter with VJCC.

On the other hand: If you have a certain knowledge of the appropriate 
rates, there's no reason not to use it.

The problem is "one size fits all". When I buy a t-shirt, it's most 
likely to be too short and everyone gets upset.
However, it's the largest size available in the store - and once, an 
acquaintance of mine said she could use a t-shirt in my size as an 
evening gown....

So, perhaps, I should visit a store for evening gowns to get something 
appropriate.... However, there's some difficulty with my unemployment 
compensation, because even _that_ is calculated "one size fits all". No 
matter whether you are an unemployed mouse - or an unemployed elephant. 
One size fits all, and so, everybody gets a t-shirt in "the size" - and 
everybody gets upset, when the t-shirt is too small for an elephant or 
too large for a mouse.

> Someone has pointed out that simply the binary backoff of the RTO may
> be enough to prevent congestion collapse. 

Certainly it is. And certainly a lobotomy is enough to cure some kinds 
of mental illness. Unfortunately, there are these regrettable side effects.

Binary backoff is a drastic measure.

And sometimes _too_ drastic. If you encounter some transient link outage 
with your mobile, the RTO rapidly increases into the range of minutes. And 
it takes quite some time - I think you mentioned the problem yourself 
some weeks ago - to reestablish a proper ACK clock when the link is 
available again.
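
A minimal sketch in Python of how quickly the doubling adds up. The 
function name, the initial RTO of one second and the optional upper bound 
are assumptions of mine for illustration; real stacks differ in how they 
compute and bound the timer.

```python
def rto_after_timeouts(initial_rto, timeouts, max_rto=None):
    """RTO in seconds after consecutive timeouts, doubling on each one."""
    rto = initial_rto
    for _ in range(timeouts):
        rto *= 2
        if max_rto is not None:
            # Optional cap (an assumption; the bound varies between stacks).
            rto = min(rto, max_rto)
    return rto
```

Starting from one second, seven consecutive timeouts without a cap give an 
RTO of 128 seconds, so a short outage can leave the sender idle long after 
the link is back.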

>  Who knows what aspect of
> VJ's algorithm is really responsible for making the internet "work",
> and how much is simply that we don't see all the details?
>   

Who _cares_?

I have sometimes referred to the well-known "Galloping Gertie", i.e. the 
Tacoma Narrows Bridge disaster.

How do we prevent this kind of disaster? With thousands of sheets of 
paper with formulae?

No. Unfortunately, we could not stop the wind from blowing when Gertie 
started its gallop, but actually, this is quite often the way to go: 
When a system starts oscillating, try to get energy out of it.

In some buildings, there are compensators for this purpose. And even if 
they don't exactly match the building's eigenfrequency, the main thing 
is that they kill energy.

This may not be an elegant mathematical solution, but it protects lives.

(Oh, Wikipedia is nice :-) I found the correct English word for it: 
Tuned mass damper.
http://www.popularmechanics.com/technology/industry/1612252.html
)

And be assured: No one counts the people who are actually in the Taipei 
101 tower, even though the number of persons in the building might 
shift the tower's eigenfrequency.

And that's the same in computer networks. If they start oscillating or 
are becoming unstable (i.e. the queues grow too large) - you kill energy, 
i.e. drop packets.
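
As a minimal sketch in Python of what "killing energy" means for a queue: 
a bounded FIFO that simply discards arrivals once it is full. The class and 
method names are hypothetical, and real routers may use other queue 
management schemes; this only illustrates plain tail drop.

```python
from collections import deque


class DropTailQueue:
    """Bounded FIFO that discards arrivals once it is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.packets = deque()
        self.dropped = 0

    def enqueue(self, packet):
        """Return True if the packet was queued, False if it was dropped."""
        if len(self.packets) >= self.capacity:
            # Queue is full: take energy out of the system.
            self.dropped += 1
            return False
        self.packets.append(packet)
        return True

    def dequeue(self):
        """Return the oldest queued packet, or None if the queue is empty."""
        return self.packets.popleft() if self.packets else None
```

The queue never grows beyond its capacity, whatever the senders do, and 
each dropped packet is exactly the signal the sender reacts to.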

>> TCP/IP is a generic protocol suite with hardly any assumptions at all.
>>     
>
> Exactly my point.  I don't think TCP should assume that routers drop
> packets instead of buffering them.  We can still use VJ's insight
> (that we should look for symptoms of congestion, and then back off)
> without that assumption.
>
>   

O.k., so you don't kill energy but tune the system ;-)

(The analogy to an oscillating system may become more obvious if you 
recall Newton's cradle. I'm not quite sure about VJ's education but from 
that analogy and the "conservation principle", I strongly presume that 
he is a physicist.)

>> When I review the proposals for TCP changes made in the last decade, I'm not
>> convinced that no one is willing to consider changes to TCP.
>>
>> However, a recent paper submission of mine was rejected, amongst others,
>> with the remark: "Ouch! You're going to change TCP here!".
>>
>> When there are valid reasons to change protocols, we should consider doing
>> so.
>>     
>
> Absolutely.  Many in the academic research community are (too?)
> willing to change TCP.  However, it is hard for the academics to make
> the changes without the IETF's support.
>
>   

Now: In the academic research community, one can hardly be too willing 
to change TCP.
It's simply our job to ask questions and assess answers and solutions.

Whether some change or some protocol will be deployed is a different story.

I don't know how "large" the Internet was, when VJ proposed VJCC. 40 
nodes? 100 nodes?
So, it wasn't a big deal to simply try this "hack" and see what happens.

Nowadays, there are a few more nodes than back then - therefore, 
deployment of new protocols may be a bit more difficult.

> 2009/9/14 Detlef Bosau <detlef.bosau at web.de>:
>   
>>> Maximum Segment Lifetime.
>>>       
>> However, the story remains the same. What is the reason to keep a segment in
>> the network that long?
>>     
>
> The MSL is not that we should try to keep segments in the network that
> long, but that protocols should still work if, by mistake, a packet
> does survive that long.  We don't want a misconfigured router
> somewhere to cause ambiguity between two IP fragments, for example.
>
> It was perhaps misleading of me to bring the MSL into the discussion
> in the first place...  (We want the network to be "safe" under those
> conditions, but shouldn't optimise for them.)  The point was that a
> few seconds of delay is not "wrong", even though it is undesirable.
>
> Cheers,
> Lachlan
>
>   

-- 
Detlef Bosau		Galileistraße 30	70565 Stuttgart
phone: +49 711 5208031	mobile: +49 172 6819937	skype: detlef.bosau	
ICQ: 566129673		http://detlef.bosau@web.de