<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
<title></title>
</head>
<body bgcolor="#ffffff" text="#000000">
Hi Zama,<br>
<br>
Thanks for the detailed information. Yes, the TCP stack looks good. One
quick question: which FTP client application did you use in your
test, the previous one (where the connection issue was found) or a
new third-party one?<br>
<br>
If it is the previous one, then your FTP client should also work
well. If you still encounter the issue, something else must be
wrong.<br>
<br>
If your test was done with a new third-party client, repeat the
procedure with the default client you were using.<br>
<br>
Another thing you can check is the congestion window size during the
period from when the cable is disconnected to when the ESTABLISHED
connection is closed. The congestion window size is hard to measure
directly, but you can estimate it by analyzing the trace file for
that period. Use tcpdump to monitor a particular NIC port and analyze
the throughput of that particular FTP flow. The amount of data
transmitted over that period should be very small, because the
congestion window during a timeout is around 1 (in packets).<br>
<br>
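As a minimal sketch of that trace analysis (an illustration only,
assuming tcpdump's default text output with a leading timestamp and a
trailing "length N" field; the helper name summarize is hypothetical),
the following Python adds up the bytes sent and the gaps between
packets. The sample lines reuse timestamps and lengths from the
capture quoted below, with the ack, window and TCP options left out:<br>
<pre>
```python
import re
from datetime import datetime

# Sample lines in tcpdump's default text format. Timestamps, sequence
# numbers and lengths come from the capture quoted in this thread; the
# ack, window and TCP options fields are left out to keep lines short.
SAMPLE = """\
14:50:20.394916 IP edgebeauty-dr.43533 > shopfrobsoon-dr.commplex-link: Flags [.], seq 1798969993:1798971441, length 1448
14:50:21.258921 IP edgebeauty-dr.43533 > shopfrobsoon-dr.commplex-link: Flags [.], seq 1798969993:1798971441, length 1448
14:50:22.986922 IP edgebeauty-dr.43533 > shopfrobsoon-dr.commplex-link: Flags [.], seq 1798969993:1798971441, length 1448
14:50:26.442922 IP edgebeauty-dr.43533 > shopfrobsoon-dr.commplex-link: Flags [.], seq 1798969993:1798971441, length 1448
14:50:33.354923 IP edgebeauty-dr.43533 > shopfrobsoon-dr.commplex-link: Flags [.], seq 1798969993:1798971441, length 1448
"""

# Capture the leading timestamp and the trailing payload length.
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{6}) IP .* length (\d+)$")


def summarize(text):
    """Return (packets, total_bytes, duration_seconds, gaps) for a capture.

    Timestamps carry no date, so a capture that crosses midnight is not
    handled by this sketch.
    """
    stamps = []
    total_bytes = 0
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip blank lines and anything that is not a packet
        stamps.append(datetime.strptime(match.group(1), "%H:%M:%S.%f"))
        total_bytes += int(match.group(2))
    # Time between each packet and the next one, in seconds.
    gaps = [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
    duration = (stamps[-1] - stamps[0]).total_seconds() if stamps else 0.0
    return len(stamps), total_bytes, duration, gaps


if __name__ == "__main__":
    packets, total_bytes, duration, gaps = summarize(SAMPLE)
    print(packets, "packets,", total_bytes, "bytes in", round(duration, 3), "s")
    print("gaps between packets (s):", [round(g, 3) for g in gaps])
```
</pre>
On the five sample lines the gaps roughly double each time, which is
what retransmission backoff should look like, and only a few
kilobytes leave the host over about 13 seconds.<br>
<br>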
If everything above works normally, you should not see the issues
observed before. <br>
<br>
Best wishes,<br>
Yan<br>
<br>
On 2/24/2011 6:06 AM, Zama Ques wrote:
<blockquote cite="mid:564728.78836.qm@web137502.mail.in.yahoo.com"
type="cite">
<table border="0" cellpadding="0" cellspacing="0">
<tbody>
<tr>
<td style="font: inherit;" valign="top">Hi Yan,<br>
<br>
I tried testing with iperf today. <br>
<br>
I started the server on one host, connected to it with the
client from another host, and after some time disconnected
the cable on the server host.<br>
<br>
I also reduced tcp_keepalive to 1200 sec, so the timeout
should be about 32 minutes once the other two
keepalive-related kernel parameters (probes and
interval) are included.<br>
<br>
<br>
The following are my findings.<br>
<br>
I can see that the client terminates the ESTABLISHED
connection around 16 minutes after the server becomes
unreachable, that is, before the TCP keepalive timeout. <br>
<br>
It looks to me like these 16 minutes are somehow related to
the TCP retransmission timeout, which is probably determined
by the following 3 parameters and comes to around 18
minutes. <br>
<br>
$ cat /proc/sys/net/ipv4/tcp_retries1<br>
3<br>
$ cat /proc/sys/net/ipv4/tcp_retries2<br>
15<br>
$ cat /proc/sys/net/ipv4/tcp_fin_timeout <br>
60<br>
<br>
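A rough cross-check of that estimate, as a sketch only: assuming
the retransmission timer starts at 200 ms, doubles on each retry,
and is capped at 120 s (typical Linux defaults, not values measured
in this test), tcp_retries2 = 15 on its own already gives a
hypothetical timeout of roughly 15 minutes. The function name and
its parameters are illustrative:<br>
<pre>
```python
def hypothetical_timeout(retries, rto_min=0.2, rto_max=120.0):
    """Sum retries + 1 timeouts, each doubling from rto_min up to rto_max.

    rto_min and rto_max are assumed defaults (in seconds), not values
    read from the machines discussed in this thread.
    """
    total = 0.0
    rto = rto_min
    for _ in range(retries + 1):
        total += rto
        rto = min(rto * 2, rto_max)  # exponential backoff with a cap
    return total


if __name__ == "__main__":
    seconds = hypothetical_timeout(15)  # tcp_retries2 = 15, as shown above
    print(round(seconds, 1), "s, about", round(seconds / 60, 1), "minutes")
```
</pre>
Under those assumptions the sum is about 924.6 s, or 15.4 minutes,
which is close to the roughly 15 to 16 minutes seen in this test.<br>
<br>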
<br>
<span style="font-weight: bold; font-style: italic;">Is my
assumption correct here ?</span><br>
<br>
<br>
The following is the netstat output for the connection during my
experiment:<br>
<br>
$ for i in {1..1000} ; do netstat -atn | egrep "5001" ;
date ; sleep 60 ; done<br>
tcp 0 447432 10.66.X.Y:43533
10.66.A.B:5001 ESTABLISHED <br>
tcp 0 0 10.66.X.Y:43531
10.66.A.B:5001 TIME_WAIT <br>
Thu Feb 24 14:47:16 IST 2011<br>
tcp 0 3311576 10.66.X.Y:43533
10.66.A.B:5001 ESTABLISHED <br>
Thu Feb 24 14:48:16 IST 2011<br>
<span style="font-weight: bold;">(Network Cable removed
during this time from the server) </span><br>
tcp 0 3317368 10.66.X.Y:43533
10.66.A.B:5001 ESTABLISHED <br>
Thu Feb 24 14:49:16 IST 2011<br>
tcp 0 3021976 10.66.X.Y:43533
10.66.A.B:5001 ESTABLISHED <br>
Thu Feb 24 14:50:16 IST 2011 <br>
tcp 0 2511048 10.66.X.Y:43533
10.66.A.B:5001 ESTABLISHED <br>
Thu Feb 24 14:51:16 IST 2011<br>
tcp 0 2511048 10.66.X.Y:43533
10.66.A.B:5001 ESTABLISHED <br>
Thu Feb 24 14:52:16 IST 2011<br>
tcp 0 2511048 10.66.X.Y:43533
10.66.A.B:5001 ESTABLISHED <br>
Thu Feb 24 14:53:16 IST 2011<br>
tcp 0 2511048 10.66.X.Y:43533
10.66.A.B:5001 ESTABLISHED <br>
Thu Feb 24 14:54:16 IST 2011<br>
tcp 0 2511048 10.66.X.Y:43533
10.66.A.B:5001 ESTABLISHED <br>
Thu Feb 24 14:55:16 IST 2011<br>
tcp 0 2511048 10.66.X.Y:43533
10.66.A.B:5001 ESTABLISHED <br>
Thu Feb 24 14:56:16 IST 2011<br>
tcp 0 2511048 10.66.X.Y:43533
10.66.A.B:5001 ESTABLISHED <br>
Thu Feb 24 14:57:16 IST 2011<br>
tcp 0 2511048 10.66.X.Y:43533
10.66.A.B:5001 ESTABLISHED <br>
Thu Feb 24 14:58:16 IST 2011<br>
tcp 0 2511048 10.66.X.Y:43533
10.66.A.B:5001 ESTABLISHED <br>
Thu Feb 24 14:59:16 IST 2011<br>
tcp 0 2511048 10.66.X.Y:43533
10.66.A.B:5001 ESTABLISHED <br>
Thu Feb 24 15:00:16 IST 2011<br>
tcp 0 2511048 10.66.X.Y:43533
10.66.A.B:5001 ESTABLISHED <br>
Thu Feb 24 15:01:16 IST 2011<br>
tcp 0 2511048 10.66.X.Y:43533
10.66.A.B:5001 ESTABLISHED <br>
Thu Feb 24 15:02:16 IST 2011<br>
tcp 0 2511048 10.66.X.Y:43533
10.66.A.B:5001 ESTABLISHED <br>
Thu Feb 24 15:03:16 IST 2011<br>
tcp 0 2511048 10.66.X.Y:43533
10.66.A.B:5001 ESTABLISHED <br>
Thu Feb 24 15:04:17 IST 2011<br>
tcp 0 2511048 10.66.X.Y:43533
10.66.A.B:5001 ESTABLISHED <br>
..<span style="font-weight: bold;"> (the connection status
changed on the client around 15 minutes after the server
became unreachable) </span><br>
<br>
Thu Feb 24 15:05:17 IST 2011<br>
Thu Feb 24 15:06:17 IST 2011<br>
Thu Feb 24 15:07:17 IST 2011<br>
<br>
<br>
The following packet flow was captured with tcpdump on the
client around the time I removed the network cable. <br>
<br>
=====<br>
14:50:20.158794 IP edgebeauty-dr.43533 >
shopfrobsoon-dr.commplex-link: Flags [.], seq
1801285561:1801350721, ack 0, win 92, options [nop,nop,TS
val 172872553 ecr 177048109], length 65160<br>
14:50:20.164550 IP edgebeauty-dr.43533 >
shopfrobsoon-dr.commplex-link: Flags [.], seq
1801350721:1801415881, ack 0, win 92, options [nop,nop,TS
val 172872558 ecr 177048115], length 65160<br>
<br>
<span style="font-style: italic; font-weight: bold;">14:50:20.394916
IP edgebeauty-dr.43533 >
shopfrobsoon-dr.commplex-link: Flags [.], seq
1798969993:1798971441, ack 0, win 92, options
[nop,nop,TS val 172872992 ecr 177048116], length 1448</span><br
style="font-style: italic; font-weight: bold;">
<span style="font-style: italic; font-weight: bold;">14:50:21.258921
IP edgebeauty-dr.43533 >
shopfrobsoon-dr.commplex-link: Flags [.], seq
1798969993:1798971441, ack 0, win 92, options
[nop,nop,TS val 172873856 ecr 177048116], length 1448</span><br
style="font-style: italic; font-weight: bold;">
<span style="font-style: italic; font-weight: bold;">14:50:22.986922
IP edgebeauty-dr.43533 >
shopfrobsoon-dr.commplex-link: Flags [.], seq
1798969993:1798971441, ack 0, win 92, options
[nop,nop,TS val 172875584 ecr 177048116], length 1448</span><br
style="font-style: italic; font-weight: bold;">
<span style="font-style: italic; font-weight: bold;">14:50:26.442922
IP edgebeauty-dr.43533 >
shopfrobsoon-dr.commplex-link: Flags [.], seq
1798969993:1798971441, ack 0, win 92, options
[nop,nop,TS val 172879040 ecr 177048116], length 1448</span><br
style="font-style: italic; font-weight: bold;">
<span style="font-style: italic; font-weight: bold;">14:50:33.354923
IP edgebeauty-dr.43533 >
shopfrobsoon-dr.commplex-link: Flags [.], seq
1798969993:1798971441, ack 0, win 92, options
[nop,nop,TS val 172885952 ecr 177048116], length 1448</span><br
style="font-style: italic; font-weight: bold;">
<span style="font-style: italic; font-weight: bold;">14:50:47.178932
IP edgebeauty-dr.43533 >
shopfrobsoon-dr.commplex-link: Flags [.], seq
1798969993:1798971441, ack 0, win 92, options
[nop,nop,TS val 172899776 ecr 177048116], length 1448</span><br
style="font-style: italic; font-weight: bold;">
<span style="font-style: italic; font-weight: bold;">14:51:14.826929
IP edgebeauty-dr.43533 >
shopfrobsoon-dr.commplex-link: Flags [.], seq
1798969993:1798971441, ack 0, win 92, options
[nop,nop,TS val 172927424 ecr 177048116], length 1448</span><br
style="font-style: italic; font-weight: bold;">
<span style="font-style: italic; font-weight: bold;">14:52:10.122922
IP edgebeauty-dr.43533 >
shopfrobsoon-dr.commplex-link: Flags [.], seq
1798969993:1798971441, ack 0, win 92, options
[nop,nop,TS val 172982720 ecr 177048116], length 1448</span><br
style="font-style: italic; font-weight: bold;">
<span style="font-style: italic; font-weight: bold;">14:54:00.714934
IP edgebeauty-dr.43533 >
shopfrobsoon-dr.commplex-link: Flags [.], seq
1798969993:1798971441, ack 0, win 92, options
[nop,nop,TS val 173093312 ecr 177048116], length 1448</span><br
style="font-style: italic; font-weight: bold;">
<span style="font-style: italic; font-weight: bold;">14:56:00.714921
IP edgebeauty-dr.43533 >
shopfrobsoon-dr.commplex-link: Flags [.], seq
1798969993:1798971441, ack 0, win 92, options
[nop,nop,TS val 173213312 ecr 177048116], length 1448</span><br
style="font-style: italic; font-weight: bold;">
<br style="font-style: italic; font-weight: bold;">
<span style="font-style: italic; font-weight: bold;">14:58:00.714920
IP edgebeauty-dr.43533 >
shopfrobsoon-dr.commplex-link: Flags [.], seq
1798969993:1798971441, ack 0, win 92, options
[nop,nop,TS val 173333312 ecr 177048116], length 1448</span><br
style="font-style: italic; font-weight: bold;">
<span style="font-style: italic; font-weight: bold;">15:00:00.714921
IP edgebeauty-dr.43533 >
shopfrobsoon-dr.commplex-link: Flags [.], seq
1798969993:1798971441, ack 0, win 92, options
[nop,nop,TS val 173453312 ecr 177048116], length 1448</span><br
style="font-style: italic; font-weight: bold;">
<span style="font-style: italic; font-weight: bold;">15:02:00.714921
IP edgebeauty-dr.43533 >
shopfrobsoon-dr.commplex-link: Flags [.], seq
1798969993:1798971441, ack 0, win 92, options
[nop,nop,TS val 173573312 ecr 177048116], length 1448</span><br
style="font-style: italic; font-weight: bold;">
<span style="font-style: italic; font-weight: bold;">15:04:00.714936
IP edgebeauty-dr.43533 >
shopfrobsoon-dr.commplex-link: Flags [.], seq
1798969993:1798971441, ack 0, win 92, options
[nop,nop,TS val 173693312 ecr 177048116], length 1448</span><br
style="font-style: italic; font-weight: bold;">
<br>
<br>
Does my TCP stack look fine based on the experiments
above? <br>
<br>
<br>
Thanks<br>
Zaman<br style="font-weight: bold;">
<br style="font-weight: bold;">
<br style="font-weight: bold;">
<br style="font-weight: bold;">
<br>
<br>
<br>
--- On <b>Wed, 23/2/11, Yan Cai <i><a class="moz-txt-link-rfc2396E" href="mailto:ycai@ecs.umass.edu"><ycai@ecs.umass.edu></a></i></b>
wrote:<br>
<blockquote style="border-left: 2px solid rgb(16, 16,
255); margin-left: 5px; padding-left: 5px;"><br>
From: Yan Cai <a class="moz-txt-link-rfc2396E" href="mailto:ycai@ecs.umass.edu"><ycai@ecs.umass.edu></a><br>
Subject: Re: end2end-interest Digest, Vol 83, Issue 4<br>
To: "Zama Ques" <a class="moz-txt-link-rfc2396E" href="mailto:queszama@yahoo.in"><queszama@yahoo.in></a><br>
Date: Wednesday, 23 February, 2011, 2:04 PM<br>
<br>
<div id="yiv733957019"> Hi Zaman,<br>
<br>
I guess there might be some unknown FTP configuration
on the CLIENT side that causes this issue. You can isolate
the problem first. iperf can be used to test the
functionality of the TCP stack on your machine. If it
works as expected, then there is nothing wrong with the
TCP stack. Next, check the settings of the FTP client
(not the FTP server) to see if there is any specific
configuration that causes this problem. If that is hard
to do, my suggestion is to install a third-party
FTP client application and test with that. <br>
<br>
If neither of these works, you might have to trace the
traffic over the cable attached to the client machine
and determine what is going on.<br>
<br>
Best wishes,<br>
Yan<br>
<br>
On 2/23/2011 1:52 AM, Zama Ques wrote:
<blockquote type="cite">
<table border="0" cellpadding="0" cellspacing="0">
<tbody>
<tr>
<td style="font: inherit;" valign="top">Hi
Yan,<br>
<br>
Thanks for your suggestion. I am familiar
with iperf, but the issue for us is that this is
a prod network and it is advisable for me
not to pump data onto it. I will try
the experiment between two desktops
connected by a crossover cable. <br>
<br>
What I was trying earlier was that I started
FTP server on one end and connected to the
server from the client side. <br>
<br>
$ ftp 10.66.X.X<br>
Connected to 10.66.X.X<br>
220 (vsFTPd 2.2.2)<br>
Name (10.66.74.141:zama): anonymous<br>
331 Please specify the password.<br>
Password:<br>
230 Login successful.<br>
Remote system type is UNIX.<br>
Using binary mode to transfer files.<br>
<br>
<br>
After that I disconnected the network cable
from the server and monitored the
status of the connection on the client
side.<br>
The status of the connection looked like this
before and after disconnecting the network
cable. <br>
<br>
---<br>
$ for i in {1..1000} ; do netstat -at |
egrep "ftp" ; date ; sleep 60 ; done<br>
tcp 0 0 edgebeauty.c:50179
shopfrobsoon.c:ftp ESTABLISHED <br>
Wed Feb 23 11:47:53 IST 2011<br>
<br>
tcp 0 0 edgebeauty.c:50179
shopfrobsoon.c:ftp ESTABLISHED <br>
Wed Feb 23 11:48:53 IST 2011<br>
tcp 0 0 edgebeauty.c:50179
shopfrobsoon.c:ftp ESTABLISHED <br>
Wed Feb 23 11:49:53 IST 2011<br>
...<br>
...<br>
Wed Feb 23 12:14:03 IST 2011<br>
tcp 0 0 edgebeauty.c:50179
shopfrobsoon.c:ftp ESTABLISHED <br>
Wed Feb 23 12:15:03 IST 2011<br>
===<br>
<br>
As we can see, more than 25 minutes after
the server went down, the client
still keeps the connection
in the ESTABLISHED state. <br>
<br>
My understanding is that the client should
close the connection after the TCP retransmit
timeout happens. Or is my understanding
wrong? <br>
<br>
Please clarify . <br>
<br>
--Zaman<br>
<br>
<blockquote style="border-left: 2px solid
rgb(16, 16, 255); margin-left: 5px;
padding-left: 5px;">
<div class="yiv733957019plainMail"><br>
Message: 2<br>
Date: Tue, 22 Feb 2011 09:55:13 -0500<br>
From: Yan Cai <<a
moz-do-not-send="true" rel="nofollow">ycai@ecs.umass.edu</a>><br>
Subject: Re: [e2e] query on behaviour of
tcp_keepalive and tcp<br>
retransmit on Linux based systems<br>
To: <a moz-do-not-send="true"
rel="nofollow">end2end-interest@postel.org</a><br>
Message-ID: <<a
moz-do-not-send="true" rel="nofollow">4D63CE50.8050606@ecs.umass.edu</a>><br>
Content-Type: text/plain;
charset="iso-8859-1"<br>
<br>
Hi<br>
<br>
According to your description, the
expected behavior should be as follows.<br>
At the beginning senders at one side can
send data to the receivers at <br>
the other side, and the receivers can
receive data without any problem. <br>
When some of the receivers become
off-line, the affected senders should <br>
no longer receive positive
acknowledgments and therefore lower
their <br>
congestion windows (i.e., sending rate).
Since in your case the receiver <br>
is off forever, some senders should
further experience timeout events. <br>
After a few timeouts, the sender should
CLOSE this connection itself.<br>
<br>
As far as I know, the whole procedure
above should be automatically <br>
invoked in the sender side. This is how
TCP (sender) handles exceptions.<br>
<br>
My suggestion is that you run a simple
experiment on your side to see if <br>
TCP on your machine works that way.
The test can be done using iperf <br>
to send a long-lived TCP flow and
then taking the receiver off-line in the <br>
middle of the transmission. The
connection is expected to be closed very
<br>
soon after the receiver is off.<br>
<br>
Hope it helps.<br>
Yan<br>
On 2/22/2011 4:24 AM, Zama Ques wrote:<br>
> We need some clarifications on
TCP_keepalive . We are facing some <br>
> issues on our Prod servers related
to TCP functionality .<br>
><br>
> The issue is like this.<br>
><br>
> We have some machines at one end
sending data in real time to another <br>
> group of machines at the other end.
Now, due to some hardware issues <br>
> at the other end, some of the
machines become unresponsive or crash. <br>
> The client system which pumps data
never comes to know that the server <br>
> went unresponsive. The connection
remains in<br>
> ESTABLISHED state and the client
always tries to send data, thinking <br>
> that the connection is alive,
because of which we are seeing a backlog
on <br>
> the client side.<br>
><br>
> Our understanding is like this on
how TCP will handle the connection.<br>
><br>
><br>
> Q 1) Since the server went down ,
the client will try to the <br>
> retransmit the data until it times
out. What is the behavior of TCP <br>
> after the timeout? Need
clarification on<br>
> the following things.<br>
> a) Will the kernel close the
established connection after the <br>
> timeout? It looks like not in our case,
as we see the connection <br>
> still in ESTABLISHED state after
more<br>
> than 2 hours.<br>
> b) Are there any kernel parameters
which decide when the client <br>
> times out after retransmission
fails? What is the behavior of TCP <br>
> after the client retransmission
times out?<br>
><br>
><br>
> Q 2) There is something called
tcp_keepalive which, if implemented in <br>
> the kernel (by default it is there
and comes to around 2 hrs 2 <br>
> minutes, I think), makes the client
send some TCP probes after the <br>
> keepalive time interval, and if it
cannot reach the server, then the <br>
> established connection on the
client side will be closed by the kernel
<br>
> . This is my understanding. But I
can see that the connection still <br>
> remains in established after the
tcp_keepalive time . We waited for <br>
> around 2 hrs 30 minutes but the
connection remains in established <br>
> state only. Tried reducing the
keepalive time to be around 10 minutes <br>
> , but the connection remains in
ESTABLISHED state in client side .<br>
><br>
><br>
> Where I went wrong .Please clarify
my doubts raised above . What <br>
> should we do to resolve the problem
we are seeing above . Any help <br>
> will be highly appreciated as we
are going through a hard time to <br>
> resolve the issue .<br>
><br>
> Thanks in Advance<br>
><br>
><br>
<br>
</div>
</blockquote>
</td>
</tr>
</tbody>
</table>
<br>
</blockquote>
<br>
</div>
</blockquote>
</td>
</tr>
</tbody>
</table>
<br>
</blockquote>
<br>
</body>
</html>