How to prevent TCP connection timeout when FTP'ing large file?
Solution 1
For troubleshooting purposes, try downloading the same file via wget or curl. I suspect that PEra is correct: NOOP commands will prevent this, and possibly wget or curl would send them.
Solution 2
If you are going through NAT, chances are the NAT timers are disconnecting you. I see this from hotel rooms where I ssh into a machine and then leave the session idle for some time (as short as 5 minutes sometimes!).
# echo 60 > /proc/sys/net/ipv4/tcp_keepalive_time
# echo 60 > /proc/sys/net/ipv4/tcp_keepalive_intvl
# echo 20 > /proc/sys/net/ipv4/tcp_keepalive_probes
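Try those. Once a TCP connection that has keepalives enabled has been idle for one minute, the kernel starts sending keepalive probes on it, one per minute, and drops the connection only after 20 probes in a row go unanswered.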
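As a rough check of what those three values imply, the arithmetic can be sketched as below. The function name is made up for illustration, and the formula assumes the usual Linux behaviour: the first probe goes out after the idle time, then one probe per interval until the probe count is exhausted.

```python
def keepalive_timeline(idle_time, interval, probes):
    """Return (seconds of idleness before the first probe,
    approximate seconds of idleness before an unresponsive peer is dropped)."""
    first_probe = idle_time
    give_up = idle_time + interval * probes
    return first_probe, give_up


# Values taken from the three echo commands above.
first_probe, give_up = keepalive_timeline(idle_time=60, interval=60, probes=20)
print(f"first probe after {first_probe} s idle, dead peer dropped after about {give_up} s")
```

A NAT box or firewall only needs to see the first probe before its own idle timer expires, so tcp_keepalive_time is the value that matters most here.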
Note that the ftp client may not actually USE keepalives. It is something that the application must request. If that fails, perhaps installing another FTP client would work better. The NetBSD FTP client (lukemftp) may be available, and is the best command-line FTP client I've seen to date.
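For reference, this is roughly what "requesting" keepalives looks like from inside an application. It is a minimal sketch in Python, not what any particular FTP client does; the helper name and the default values are assumptions chosen to match the sysctl settings above.

```python
import socket


def enable_keepalive(sock, idle=60, interval=60, probes=20):
    """Ask the kernel to send keepalive probes on this socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket overrides below use the Linux option names; other
    # platforms may lack some of them, so each is set only if it exists.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", probes)):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
print("SO_KEEPALIVE:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
sock.close()
```

A client that never makes the SO_KEEPALIVE call gets no probes at all, whatever the /proc settings say.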
It's also possible the remote end is closing the connection due to inactivity. If it is, it has a rather broken idea of reality. If these TCP keepalive hacks above don't fix it, either the client will have to send some command periodically (NOOP, etc) or the administrators of the FTP server will have to change their end.
Solution 3
It could be any filtering device on the path between your VM and the FTP server. Most firewalls (including home routers) have a state table in which idle TCP sessions are reset after a certain timeout.
You could change the VM's NIC to bridged mode (instead of NAT) to rule out the host OS. Then make sure your FTP client sends NOOP commands periodically to keep the command channel open. Some firewalls close the data connection when they see that the command session has closed, regardless of whether the data connection is idle or carrying traffic.
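The "send NOOP periodically" idea boils down to a timer check performed while data blocks arrive. The sketch below shows only that timing logic, with stand-in callables and a fake clock; every name in it is hypothetical and it does not talk to a real FTP server.

```python
import time


def make_keepalive_callback(write_block, send_noop, interval=30.0, clock=time.monotonic):
    """Wrap a data callback so that send_noop() runs at most once per interval."""
    last_sent = clock()

    def on_block(block):
        nonlocal last_sent
        write_block(block)
        now = clock()
        if now - last_sent >= interval:
            send_noop()
            last_sent = now

    return on_block


# Demonstration with a fake clock and stand-in callables (no network involved).
fake_time = [0.0]
received = []
noops = []
callback = make_keepalive_callback(received.append, lambda: noops.append("NOOP"),
                                   interval=30.0, clock=lambda: fake_time[0])
for second in range(0, 100, 10):  # one block every 10 simulated seconds
    fake_time[0] = float(second)
    callback(b"x")
print(len(received), "blocks,", len(noops), "NOOPs")
```

In a real client, send_noop would write NOOP to the control connection, and any replies the server queues up would still have to be read once the transfer finishes. That wiring is deliberately left out here.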
HTH,
PEra
Solution 4
If you are doing this from command line, try enabling 'hash' ('binary' is another one I always turn on). This may generate enough traffic on the control port to keep it from timing out.
Solution 5
FTP uses two sockets - one for control, and one for data.
It's likely that the NAT state tables on the VM are timing out the control connection because of inactivity on that socket.
You may be able to get around this by enabling "Active FTP" on the VM system which hopefully will cause VMware to actively watch FTP sessions and keep the control socket alive so long as data is still flowing.
Comments
-
Kendall almost 2 years
I am not able to FTP (retrieve) a large file from the Internet to my Linux VM. It times out after a while.
The actual error is "Could not read reply from control connection -- timed out." This error occurs after a few minutes, after a good chunk of the file has already been transferred.
The setup is:
- FTP Client: ncftpget running in Linux on VMWare Player 3.0
- FTP Server: somebody else's machine out on the Internet, configuration unknown
- Guest OS: Ubuntu 8.10 Linux 32-bit, with vmxnet and vmware tools installed
- Host OS: Vista 64-bit
- Network: Linux VM connects to the Internet via Bridged NIC (also tried NAT)
- FTP Mode: PASV
I did find some forum postings mentioning a 2-minute timeout somewhere. But exactly where and how to fix it was not clear. Some troubleshooting steps already tried:
- I have switched from VMWare Player 3.0 to VirtualBox 3.0.x, but no luck.
- I also changed from NAT to Bridged virtual NICs, but no luck.
UPDATE Netstat on the Linux VM and the equivalent admin page on the DIR-655 router both show the connection is alive and well (tcp 'ESTABLISHED' status). Vista doesn't see the connection at all, which I guess is normal if connection state is managed only within the VM.
Here's the output from netsh interface tcp show global on Vista, in case it's useful:
C:\Users\alex>netsh interface tcp show global
Querying active state...

TCP Global Parameters
----------------------------------------------
Receive-Side Scaling State          : enabled
Chimney Offload State               : disabled
Receive Window Auto-Tuning Level    : highlyrestricted
Add-On Congestion Control Provider  : none
ECN Capability                      : disabled
RFC 1323 Timestamps                 : disabled
** The above autotuninglevel setting is the result of Windows Scaling heuristics
overriding any local/policy configuration.
-
Admin over 14 years
How is the Host OS connected to the internet? A router?
-
Admin over 14 years
Connection to the Internet is via a DIR-655 wireless router connected into a Zoom X5 DSL modem. In the whole setup, the Host computer is brand-new, the DIR-655 was recently reconfigured (using WPA2 and multiple zones), and both the modem and the VM are pre-existing (the VM was copied from an older computer).
-
Admin over 14 years
I'm able to monitor the connections on the DIR-655 and they show a countdown to the timeout of 7000+ seconds (almost 2 hours). Something else I did just now is to disable IPv6 on the wireless adapter.
-
Admin over 14 years
Doing some additional troubleshooting with 'netstat' and checking the DIR-655 also... I notice that while both Linux and the DIR-655 see an 'ESTABLISHED' connection, the connection is GONE from Vista's netstat! It's completely gone... Not even a TIME_WAIT.
-
Admin over 14 years
I'm also trying KeepAliveTime = 60 in HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
-
Admin over 14 years
Well, that didn't make any difference.
-
Kendall over 14 years
Unusual advice? Active FTP is much harder to get to work with NAT.
-
Alnitak over 14 years
Yes, possibly unusual, but it might be the only way to get VMware to actually co-operate with FTP.
-
Kendall over 14 years
I tried it just now (using ncftpget -E to force Active mode). The behaviour is unchanged... the transfer goes on for several minutes and then dies in the end with the same error.
-
Alnitak over 14 years
Oh well, it was worth a try. Some other method to keep the TCP connection active (e.g. the NOOP command) will be required.
-
Kendall over 14 years
I'm trying this right now. Meanwhile, the documentation is throwing me off a bit... at tldp.org/HOWTO/TCP-Keepalive-HOWTO/usingkeepalive.html, it says "Remember that keepalive support, even if configured in the kernel, is not the default behavior in Linux. Programs must request keepalive control for their sockets using the setsockopt interface." Does that mean I have to have a modified FTP client, in addition to doing the above? Anyhow, regardless of answer, I'll try it.
-
Kendall over 14 years
I have done 2 tests and both have succeeded after applying this change. I have to try a few more times to make sure it's not random chance.
-
Vineet Kasat over 14 years
I believe the documentation on that page is fairly old. Note that I don't actually RUN Linux, nor do I have a NAT that does this sort of timeout readily available. This is just a very common situation, and my laptop has it in /etc/sysctl.conf for this reason. That file is likely NetBSD specific.
-
Vineet Kasat over 14 years
BTW, if this works, it's either your local NAT "router" box (not your Windows-based NAT for your VM) or something on the remote end that was killing the control connection.
-
Vineet Kasat over 14 years
It's the client that is timing out. I'm not certain he has access to the server side; if he did, he could use something other than FTP and just avoid the mess.
-
Kendall over 14 years
Interesting results. Although it now works more often than not, I am STILL occasionally getting the same old error!
-
Kendall over 14 years
Are the hash marks sent by the server over the control connection? That is a clever workaround! I'll have to convert from ncftpget to regular ftp to try this.
-
Kendall over 14 years
Right on, Michael. If it were my server, it would be running sshd.
-
Kendall over 14 years
I just looked at the source for the GNU FTP Client (from Inetutils, ftp.gnu.org/gnu/inetutils). Unfortunately, it seems the hashes are printed locally by the client, so they don't do anything to keep the control connection from going idle :(
-
Vineet Kasat over 14 years
Vista is not making the connection, your VM is. The DIR-655 sees the connection since it is doing NAT, but your Vista machine is only barely involved in the whole mess. Vista may show entries in its firewall, but then again it may not. It depends on just when and how the packets make it through Vista from the VM. But I would not expect the connection to show up in Vista's netstat no matter what, as it does not terminate either end of the connection.
-
dim over 14 years
This may be a failure of the FTP client, then: it is not sending proper keep-alives on the command port. Try using different clients, maybe? And/or use Wireshark to see what's happening on the wire.
-
alaster over 14 years
The NOOP command is just that: it does nothing. It's sent periodically by the FTP client to prevent the command channel from timing out. Usually it's called something like "keepalive" in client software.
-
Tim Williscroft over 14 years
Clever, sneaky solution.
-
Josh over 14 years
+1, NOOP is the way to go. It's possible ncftpget doesn't send them.
-
Kendall over 14 years
Thanks for the great suggestion... Apparently curl has been specifically patched for this purpose (sourceforge.net/tracker/…) ... I'm going to try it now.
-
Kendall over 14 years
Ok. Tested successfully with 6 large files. This is the fix that finally did it. curl has a keepalive option which ncftpget lacks. I don't have enough time to back out my other changes and re-test, so I don't know if this fix works in isolation or only in conjunction with the /proc/sys/net/ipv4/tcp_keepalive_* stuff from Michael Graff.
-
Kendall over 14 years
It had never occurred to me that curl had ftp features or that it might be more complete than ncftpget (ncftpget has been around for 15-20 years?). So I'm going to approve this as best answer, since it fixed the problem, even though I don't know 100% if this is what did it by itself.
-
Kendall over 14 years
This "solution" is only wishful thinking... it doesn't work, and a quick review of the ftp source code shows why.
-
Kendall over 14 years
Thanks for the idea; another answerer suggested the actual tool (curl) that implements this.
-
Josh over 14 years
Glad that worked for you!