[e2e] New approach to diffserv...
smd at ab.use.net
Mon Jun 17 08:40:24 PDT 2002
| 1. Why is routing done with middleboxes?
Convenience. The packets have to flow through the middle-box
anyway, and in a hop-by-hop next-hop-forwarding model, it's
easier to have anything in the forwarding path tightly coupled
to the routing "brain".
(A router could upload static routes at intervals, or whatnot,
and this is effectively what would happen in an integration between an
edge router and the sort of architecture the Cisco 12xxx series uses).
| 2. Where is the financial incentive to build networks if the basic
| network architecture FORCES you to deliver a commodity with no value-added?
How to get (almost) thrown out of a taxi:
1. Get asked this question by a brilliant end-to-end researcher
Where "mile" is shorthand for topological distance in
money terms, roughly speaking.
Costs are closely related to geographical mileage
in terrestrial networks; value is generally believed to
be coupled to geographical footprint. However,
the abstract mile can be fitted to a combination of some
sort of distance and the set of equipment in the path.
Users (singular or in aggregate) are ALWAYS in control of
how much traffic they send, and where they send it to.
We like MTU-sized packets; they let us focus on handling
fewer headers per second, which are harder to cope with
right now than bits per second.
Next, offer a simple implementational outline:
a. routers talking a link-state protocol know the topology
of the local WAN, and can have several metrics associated
with each link; there is no reason why physical distance
could not be discovered this way
b. when you build your forwarding table ("FIB") you
look at your BGP data and assign each prefix a
next-hop; the next-hops, if not directly connected,
are "recursively" looked up in the table mapping
potential BGP next-hops to the next router closer
to that BGP next-hop. That table is constructed
via the IGP.
Thus, from IGP and BGP you know:
prefix BGP-next-hop next-router-info metricA metricB ...
where the next-router-info is essentially an
(interface, subinterface, MAC, label, ...) vector
and metricA is the one fed into the SPF algorithm.
metricB could be, for example, the dollar cost of
forwarding a packet from "here" to the BGP-next-hop,
and is really only important at the edge of your
c. you could work out a settlement mechanism by
propagating your "metricB" from a BGP router
towards your peers, suppliers, or customers
d. you don't want micropayments. you probably don't
want to argue about accounting irregularities,
so offer a prepayment mechanism and implement
it as a simple token bucket
e. your implementation then offers you a service such that
i. you offer a tariff (you can be as dynamic as needed)
indicating the cost in tokens of sending one packet
to destination X from interface Y
ii. you refill the token bucket at some interval
(you can do this as often as needed; perhaps
monthly, perhaps every tenth of a second...)
with a fixed number of tokens
iii. overpayments can add extra tokens as needed;
oversubscription can be fixed by adding fewer
tokens per interval while adjusting the bill down
f. Observe that the mechanism exists ALREADY, and that
the sole difference is contractual; Cisco's CAR and
rate-shape/rate-limit and routing protocols would
trivially support this implementation, and effectively already do,
except that any given packet typically is tariffed at
one token, independent of destination.
That is, if you have a 2Gbps connection to network X,
but only want to pay for a 500Mbps one, a bucket
mechanism is used to keep the amount of traffic you
can move into the network at 500Mbps (+/- epsilon).
This already enjoys widespread deployment, and is,
in Internet terms, a venerable product.
g. Market this as a cost savings for people who send
mostly local traffic out from their networks,
so that these silly NAPs and exchange points can
finally be done away with -- they can buy fewer
tokens over time.
h. Help customers who want to buy exactly enough tokens.
There are various ways of doing this.
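
As a rough illustration of step (b) above, here is a minimal Python sketch
of the recursive lookup: each prefix gets a BGP next-hop, and that next-hop
is resolved through an IGP-built table to yield the
"prefix BGP-next-hop next-router-info metricA metricB" row. The names
IgpEntry, FibEntry and build_fib, and all addresses and metric values, are
hypothetical and chosen only for illustration. A real FIB does
longest-prefix matching and may recurse more than once; this sketch uses
plain dictionaries and a single level of resolution.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class IgpEntry:
    """What the IGP knows about reaching one potential BGP next-hop."""
    next_router_info: tuple  # stand-in for the (interface, MAC, ...) vector
    metric_a: int            # the metric fed into the SPF algorithm
    metric_b: float          # e.g. cost of forwarding from "here" to the next-hop


@dataclass(frozen=True)
class FibEntry:
    """One row: prefix, BGP-next-hop, next-router-info, metricA, metricB."""
    prefix: str
    bgp_next_hop: str
    next_router_info: tuple
    metric_a: int
    metric_b: float


def build_fib(bgp_routes: Dict[str, str],
              igp_table: Dict[str, IgpEntry]) -> Dict[str, FibEntry]:
    """Resolve each prefix's BGP next-hop through the IGP-derived table."""
    fib = {}
    for prefix, next_hop in bgp_routes.items():
        igp = igp_table.get(next_hop)
        if igp is None:
            # Next-hop not reachable via the IGP: install nothing.
            continue
        fib[prefix] = FibEntry(prefix, next_hop, igp.next_router_info,
                               igp.metric_a, igp.metric_b)
    return fib


# Hypothetical data: documentation prefixes and made-up metrics.
bgp_routes = {
    "192.0.2.0/24": "10.0.0.1",
    "198.51.100.0/24": "10.0.0.2",
    "203.0.113.0/24": "10.0.0.9",   # next-hop unknown to the IGP
}
igp_table = {
    "10.0.0.1": IgpEntry(("ge-0/0/0", "00:00:5e:00:53:01"), 10, 3.0),
    "10.0.0.2": IgpEntry(("ge-0/0/1", "00:00:5e:00:53:02"), 25, 7.5),
}
fib = build_fib(bgp_routes, igp_table)
for entry in fib.values():
    print(entry)
```

Carrying metricB in the same row as the forwarding information is what
would let an edge router price a packet at lookup time, without a second
table.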
Note that we do not charge for INCOMING traffic, since
there is no practical way to control that, and we don't
want victims paying for being the target of a DDOS attack.
On the other hand, we do want DDOS magnifiers to fix their
networks or pay for infrastructure (and other) improvements
to support floods, so fully charging magnifiers by letting
their tokens exhaust is a nice thing.
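
To make steps (d) through (f) concrete, here is a minimal Python sketch of
a prepaid token bucket with a per-destination tariff. It charges only when
a packet is sent, never when one is received. The class name PrepaidBucket,
the interface and destination labels, and the numbers are all hypothetical;
capping the bucket at a fixed capacity is also an assumption of this
sketch, not something the outline above requires. With an empty tariff
every packet costs one token, which is the flat-rate case in (f).

```python
from typing import Dict, Tuple

Tariff = Dict[Tuple[str, str], int]  # (interface, destination) -> tokens per packet


class PrepaidBucket:
    """Prepaid token bucket; tokens are spent per packet sent."""

    def __init__(self, refill_tokens: int, capacity: int) -> None:
        self.refill_tokens = refill_tokens  # tokens added per interval
        self.capacity = capacity            # assumed upper bound on stored tokens
        self.tokens = min(refill_tokens, capacity)

    def _add(self, amount: int) -> None:
        self.tokens = min(self.capacity, self.tokens + amount)

    def refill(self) -> None:
        """Called once per interval (monthly, every tenth of a second, ...)."""
        self._add(self.refill_tokens)

    def top_up(self, extra: int) -> None:
        """An overpayment adds extra tokens outside the regular interval."""
        self._add(extra)

    def send(self, interface: str, destination: str, tariff: Tariff,
             flat_rate: int = 1) -> bool:
        """Spend tokens for one outgoing packet; False means it is not forwarded."""
        cost = tariff.get((interface, destination), flat_rate)
        if cost > self.tokens:
            return False
        self.tokens -= cost
        return True


# Hypothetical tariff: "far" costs four times as much per packet as "near".
tariff = {("ge-0", "near"): 1, ("ge-0", "far"): 4}
bucket = PrepaidBucket(refill_tokens=10, capacity=20)
results = [bucket.send("ge-0", dest, tariff)
           for dest in ("far", "far", "far", "near")]
print(results, bucket.tokens)
```

A sender who runs the bucket dry simply stops being forwarded until the
next refill or top-up, which is the behaviour wanted for DDOS magnifiers.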
The objection from the researcher I was in the taxi with is that
one of the major strengths of the Internet is that you do not care
where the entity you are talking to is located. I agree that this
is a strength -- I enjoy communicating with people far away from me,
but I would certainly think about paying by the packet-mile rather
than paying a flat rate (which is point (f) above), and would use
the cheaper option.
Indeed, flat-rate won't go away even if there is widespread acceptance
of this kind of charging mechanism. The price-point will move though,
depending on the demographics.
However, this is more likely to be a virtuous circle than not --
congestion suddenly becomes something that a provider will be DESPERATE
to avoid, because it would mean an immediate drop in revenue as TCPs
back down. Senders will refine techniques summarized in RFC 2001 to
minimize their token consumption. They will also send more data per
overhead byte, and finally take advantage of native multicast.
Finally, in direct answer to your question: the value is in
transporting a datagram from input interface to the correct
output interface with essentially zero loss, very low delay
(and little delay variance, ideally), and in making places
far away maximally reachable.
| 3. Why is the network engineered in isolation from applications?
Almost all applications are likely to be able to take advantage
of a well-engineered IP network; the only upper layer consideration
is likely to be the ratio of congestion-avoiding vs
non-congestion-avoiding traffic, and what to do about in-network
If someone comes up with an application which REALLY can't
be done across the Internet, well, there are other network
technologies on offer (SDH/SONET, for example) which
have not yet gone the way of the dinosaur.
Essentially the answer is that this approach distributes
the engineering cost of supporting applications away from
the teams which are busily trying to cope with ongoing
huge traffic growth in the "core".
| 4. Isn't e2e just a clever logical deception? It's of course obvious that
| an engineered artifact will have the maximum longevity if it avoids
| any concession to current needs, but very few of us buy wheels, chassis
| and motor, instead opting for value-added services such as seats and a
| roof. E2e is just rhetorical "argumentum ad absurdum" wrapped up like
| some engineering mystique, no?
No, because what is being bought is a large set of distributed
computations. In one of those distributed computations, the
goal is to render a web page on the screen in front of you.
Some of the computation is done locally; some is done within
the network (decrementing TTLs, header-sanity checksumming,
doing forwarding table lookups, constructing the tables, ...);
some is done across the network (fetching the content).
The end2end argument claims that the most effective approach
is to move a maximal amount of work in this distributed computation
to the hosts on either side. In particular, the reliable transfer
of the data, free from errors caused by in-flight data corruption,
duplication or loss, ordering the data, handling mismatch
between different endiannesses or software implementations, and
so on, is -- according to the end2end argument -- best done in the
You can certainly buy a distributed computation which does a lot
more of this in the middle of the network. People still use X.25
| Coherent, rational discussion only please - all rants will be ignored.
Oh damn. That's tricky.