How to prevent TCP connection timeout when FTP'ing large file?

Solution 1

For troubleshooting purposes, try downloading the same file via wget or curl. I suspect PEra is correct that NOOP commands will prevent this, and wget or curl may well send them.

Solution 2

If you are going through NAT, chances are the NAT timers are disconnecting you. I see this from hotel rooms when I ssh into a machine and then leave the session idle for a while (sometimes as little as 5 minutes!).

# echo 60 > /proc/sys/net/ipv4/tcp_keepalive_time
# echo 60 > /proc/sys/net/ipv4/tcp_keepalive_intvl
# echo 20 > /proc/sys/net/ipv4/tcp_keepalive_probes

Try those. They cause a keepalive probe to be sent after 60 seconds of idle time, and every 60 seconds after that, on every TCP socket that has keepalives enabled.
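To see what the three values above add up to, here is a small Python sketch. The function names are made up for illustration, and the formula is only the usual rule of thumb: the idle time before the first probe, plus one interval per unanswered probe.

```python
from pathlib import Path

PROC_DIR = Path("/proc/sys/net/ipv4")  # Linux only


def read_keepalive_settings(proc_dir=PROC_DIR):
    """Return (time, intvl, probes) as ints, or None if they cannot be read."""
    try:
        return tuple(
            int((proc_dir / f"tcp_keepalive_{name}").read_text())
            for name in ("time", "intvl", "probes")
        )
    except (OSError, ValueError):
        return None


def seconds_until_drop(idle_time, interval, probes):
    """Approximate time before the kernel gives up on a peer that stays silent."""
    return idle_time + interval * probes


# With the values echoed above: 60 + 60 * 20 = 1260 seconds (21 minutes).
print(seconds_until_drop(60, 60, 20))
```

So with these settings a probe crosses the NAT box every minute, which is what keeps its state entry fresh, while a peer that has really gone away is only declared dead after about 21 minutes.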

Note that the ftp client may not actually USE keepalives. It is something that the application must request. If that fails, perhaps installing another FTP client would work better. The NetBSD FTP client (lukemftp) may be available, and is the best command-line FTP client I've seen to date.
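For reference, this is roughly what "requesting" keepalives looks like from the application side, sketched in Python. The helper name `enable_keepalive` is hypothetical. The `TCP_KEEP*` option names are the Linux ones, so the sketch skips any that the platform does not define.

```python
import socket


def enable_keepalive(sock, idle=60, interval=60, probes=20):
    """Turn on TCP keepalives for one socket and, where supported, set its timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket overrides of the tcp_keepalive_* sysctls; not every OS has them.
    for name, value in (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", probes),
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
sock.close()
```

With Python's ftplib, the control connection should be reachable as `ftp.sock` once connected, so `enable_keepalive(ftp.sock)` would be the place to try this. A client that never makes the SO_KEEPALIVE call is not affected by the /proc settings.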

It's also possible the remote end is closing the connection due to inactivity. If it is, it has a rather broken idea of reality. If these TCP keepalive hacks above don't fix it, either the client will have to send some command periodically (NOOP, etc) or the administrators of the FTP server will have to change their end.
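As a rough illustration of the "send NOOP periodically" option, here is a Python sketch built on ftplib's `transfercmd`, `putcmd` and `voidresp`. The function name, the 30-second interval and the injectable `clock` are all made up for the example. Servers differ in whether and when they answer a NOOP sent in the middle of a transfer, so treat this as a starting point rather than a guaranteed fix.

```python
import time


def retrieve_with_noop(ftp, remote_name, out_file, noop_interval=30.0,
                       clock=time.monotonic):
    """Download remote_name through an ftplib.FTP-like object, sending NOOP on
    the control connection every noop_interval seconds while data is read."""
    ftp.voidcmd("TYPE I")  # binary mode
    data_conn = ftp.transfercmd("RETR " + remote_name)
    noops_sent = 0
    last_noop = clock()
    try:
        while True:
            chunk = data_conn.recv(64 * 1024)
            if not chunk:
                break
            out_file.write(chunk)
            if clock() - last_noop >= noop_interval:
                ftp.putcmd("NOOP")  # send only; the reply is collected later
                noops_sent += 1
                last_noop = clock()
    finally:
        data_conn.close()
    # Expect one reply for the finished RETR plus one per NOOP. voidresp()
    # accepts any 2xx reply, so the order they arrive in does not matter.
    for _ in range(noops_sent + 1):
        ftp.voidresp()
    return noops_sent
```

Usage would look something like `ftp = ftplib.FTP(host); ftp.login(); retrieve_with_noop(ftp, "big.iso", open("big.iso", "wb"))`. If the server never answers the NOOPs, the final `voidresp()` calls will wait for replies that do not come, so set a timeout on the connection.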

Solution 3

It could be any filtering device on the path between your VM and the FTP server. Most firewalls (including home routers) keep a state table in which idle TCP sessions are reset after a certain timeout.

You could change the VM's NIC to bridged mode (instead of NAT) to rule out the host OS. Then make sure your FTP client sends NOOP commands periodically to keep the command channel open. Some firewalls close the data connection as soon as they see that the command session is closed, regardless of whether the data connection is idle or carrying traffic.

HTH,
PEra

Solution 4

If you are doing this from the command line, try enabling 'hash' ('binary' is another one I always turn on). This may generate enough traffic on the control port to keep it from timing out.

Solution 5

FTP uses two sockets - one for control, and one for data.

It's likely that the NAT state tables on the VM are timing out the control connection because of inactivity on that socket.

You may be able to get around this by enabling "Active FTP" on the VM system which hopefully will cause VMware to actively watch FTP sessions and keep the control socket alive so long as data is still flowing.

Kendall
Author by

Updated on September 17, 2022

Comments

  • Kendall
    Kendall almost 2 years

    I am not able to FTP (retrieve) a large file from the Internet to my Linux VM. It times out after a while.

    The actual error is "Could not read reply from control connection -- timed out." This error occurs after a few minutes, after a good chunk of the file has already been transferred.

    The setup is:

    FTP Client:  ncftpget running in Linux on VMWare Player 3.0
    FTP Server:  somebody else's machine out on the Internet, configuration unknown
    Guest OS:    Ubuntu 8.10 Linux 32-bit, with vmxnet and vmware tools installed.
    Host OS:     Vista 64-bit
    Network:     Linux VM connects to the Internet via Bridged NIC (also tried NAT)
    FTP Mode:    PASV
    
    I did find some forum postings mentioning a 2-minute timeout somewhere. But exactly where and how to fix it was not clear. Some troubleshooting steps already tried:

    • I have switched from VMWare Player 3.0 to VirtualBox 3.0.x, but no luck.
    • I also changed from NAT to Bridged virtual NICs, but no luck

    UPDATE Netstat on the Linux VM and the equivalent admin page on the DIR-655 router both show the connection is alive and well (tcp 'ESTABLISHED' status). Vista doesn't see the connection at all, which I guess is normal if connection state is managed only within the VM.

    Here's the output from netsh interface tcp show global on Vista, in case it's useful:

    C:\Users\alex>netsh interface tcp show global
    Querying active state...
    
    TCP Global Parameters
    ----------------------------------------------
    Receive-Side Scaling State          : enabled
    Chimney Offload State               : disabled
    Receive Window Auto-Tuning Level    : highlyrestricted
    Add-On Congestion Control Provider  : none
    ECN Capability                      : disabled
    RFC 1323 Timestamps                 : disabled
    ** The above autotuninglevel setting is the result of Windows Scaling heuristics
    overriding any local/policy configuration.
    
    • Admin
      Admin over 14 years
      How is the Host OS connected to the internet? A router?
    • Admin
      Admin over 14 years
      Connection to the Internet is via a DIR-655 wireless router connected into a Zoom X5 DSL modem. In the whole setup, the Host computer is brand-new, the DIR-655 was recently reconfigured (using WPA2 and multiple zones), and both the modem and the VM are pre-existing (the VM was copied from an older computer).
    • Admin
      Admin over 14 years
      I'm able to monitor the connections on the DIR-655 and they show a countdown to the timeout of 7000+ seconds (over 2 hours). Something else I did just now is to disable IPv6 on the wireless adapter.
    • Admin
      Admin over 14 years
      Doing some additional troubleshooting with 'netstat' and checking the DIR-655 also... I notice that while both Linux and the DIR-655 see an 'ESTABLISHED' connection, the connection is GONE from Vista's netstat! It's completely gone... Not even a TIME_WAIT.
    • Admin
      Admin over 14 years
      I'm also trying KeepAliveTime = 60 in HKEY_LOCAL_MACHINE \SYSTEM \CurrentControlSet \Services \Tcpip \Parameters
    • Admin
      Admin over 14 years
      Well, that didn't make any difference.
  • Kendall
    Kendall over 14 years
    Unusual advice? Active FTP is much harder to get to work with NAT.
  • Alnitak
    Alnitak over 14 years
    Yes, possibly unusual, but it might be the only way to get VMware to actually co-operate with FTP.
  • Kendall
    Kendall over 14 years
    I tried it just now (using ncftpget -E to force Active mode). The behaviour is unchanged... the transfer goes on for several minutes and then dies in the end with the same error.
  • Alnitak
    Alnitak over 14 years
    oh well, it was worth a try. Some other method to keep the TCP connection active (e.g. the NOOP command) will be required.
  • Kendall
    Kendall over 14 years
    I'm trying this right now. Meanwhile, the documentation is throwing me off a bit... at tldp.org/HOWTO/TCP-Keepalive-HOWTO/usingkeepalive.html , it says "Remember that keepalive support, even if configured in the kernel, is not the default behavior in Linux. Programs must request keepalive control for their sockets using the setsockopt interface." Does that mean I have to have a modified FTP client, in addition to doing the above? Anyhow, regardless of answer, I'll try it.
  • Kendall
    Kendall over 14 years
    I have done 2 tests and all have succeeded after applying this change. I have to try a few more times to make sure it's not random chance.
  • Vineet Kasat
    Vineet Kasat over 14 years
    I believe the documentation on that page is fairly old. Note that I don't actually RUN linux, nor do I have a NAT that does this sort of timeout readily available. This is just a very common situation, and my laptop has it in /etc/sysctl.conf for this reason. That file is likely NetBSD specific.
  • Vineet Kasat
    Vineet Kasat over 14 years
    BTW, if this works, it's either your local NAT "router" box (not your windows-based NAT for your VM) or something on the remote end that was killing the control connection.
  • Vineet Kasat
    Vineet Kasat over 14 years
    It's the client that is timing out. I'm not certain he has access to the server side; if he did, he could use something other than FTP and just avoid the mess.
  • Kendall
    Kendall over 14 years
    Interesting results. Although it now works more often than not, I am STILL occasionally getting the same old error!
  • Kendall
    Kendall over 14 years
    Are the hash marks sent by the server over the control connection? That is a clever workaround! I'll have to convert from ncftpget to regular ftp to try this.
  • Kendall
    Kendall over 14 years
    Right on, Michael. If it were my server, it would be running sshd.
  • Kendall
    Kendall over 14 years
    I just looked at the source for the GNU FTP Client (from Inetutils ftp.gnu.org/gnu/inetutils). Unfortunately, it seems the hashes are printed locally by the client, so they don't do anything to keep the control connection from going idle :(
  • Kendall
    Kendall over 14 years
    Doing some additional troubleshooting with 'netstat' and checking the DIR-655 also... I notice that while both Linux and the DIR-655 see an 'ESTABLISHED' connection, the connection is GONE from Vista's netstat! It's completely gone... Not even a TIME_WAIT.
  • Vineet Kasat
    Vineet Kasat over 14 years
    Vista is not making the connection, your VM is. The DIR-655 sees the connection since it is doing NAT, but your Vista machine is only barely involved in the whole mess. Vista may show entries in its firewall, but then again it may not. It depends on just when and how the packets make it through Vista from the VM. But, I would not expect the connection to show up in Vista's netstat no matter what, as it does not terminate either end of the connection.
  • dim
    dim over 14 years
    This may then be a failure of the FTP client to send proper keep-alives on the command port. Maybe try a different client, and/or use Wireshark to see what's happening on the wire.
  • alaster
    alaster over 14 years
    The NOOP command is just that: it does nothing. It's sent periodically by the FTP client to prevent the command channel from timing out. Most client software calls this feature something like "keepalive".
  • Tim Williscroft
    Tim Williscroft over 14 years
    clever, sneaky solution
  • Josh
    Josh over 14 years
    +1, NOOP is the way to go. It's possible ncftpget doesn't do them.
  • Kendall
    Kendall over 14 years
    Thanks for the great suggestion... Apparently curl has been specifically patched for this purpose (sourceforge.net/tracker/…) ... I'm going to try it now.
  • Kendall
    Kendall over 14 years
    Ok. Tested successfully with 6 large files. This is the fix that finally did it. curl has a keepalive option which ncftpget lacks. I don't have enough time to back-out my other changes and re-test, so I don't know if this fix works in isolation or only in conjunction with the /proc/sys/net/ipv4/tcp_keepalive_* stuff from Michael Graff.
  • Kendall
    Kendall over 14 years
    It had never occurred to me that curl had ftp features or that it might be more complete than ncftpget (ncftpget has been around for 15-20 years?). So I'm going to approve this as best answer, since it fixed the problem, even though I don't know 100% if this is what did it by itself.
  • Kendall
    Kendall over 14 years
    this "solution" is only wishful thinking... it doesn't work, and a quick review of the ftp source code shows why.
  • Kendall
    Kendall over 14 years
    Thanks for the idea; another answerer suggested the actual tool (curl) that implements this.
  • Josh
    Josh over 14 years
    Glad that worked for you!