SSH file transfers hang after ~700KB
Solution 1
NOTE (TLDR): As salva said, the quick fix is to update to Net::SFTP::Foreign 1.70_08 or newer. There was a bug preventing the library from using the right default queue_size
in Windows. Read on for my original solution.
I was able to eliminate stalled transfers by reducing the Perl Net::SFTP::Foreign module's queue_size
parameter from the default 32 to 23. This is a network performance-related variable and I don't perceive a speed difference when it's reduced. For production, I'm going to dial it back to 16 for safety.
queue_size => $size
read and write requests are pipelined in order to maximize transfer throughput. This option allows to set the maximum number of requests that can be concurrently waiting for a server response.
The queue_size option applies to Net::SFTP::Foreign's new
, get
, put
, rget
, and rput
methods.
If you're using Net::SFTP::Foreign, you can tack on the queue_size
parameter at the end of your constructor or the individual method calls.
my $sftp = Net::SFTP::Foreign->new( $host, queue_size => 16 );
# -- OR --
$sftp->put( $local_file, $remote_file, queue_size => 16 );
I'm using the Net::SFTP::Foreign::Compat wrapper for a Net::SFTP-like API. The wrapper lets you set default parameters to use with the underlying Net::SFTP::Foreign method calls.
use Net::SFTP::Foreign::Compat;
# Note that the Compat CPAN documentation is INCORRECT
# The DEFAULTS hash is under Compat, not Foreign
$Net::SFTP::Foreign::Compat::DEFAULTS{ new } = [ queue_size => 16 ];
# -- OR --
$Net::SFTP::Foreign::Compat::DEFAULTS{ put } = [ queue_size => 16 ];
my $sftp = Net::SFTP::Foreign::Compat->new( $host, %options );
Related thought: The maximum working queue_size
of 23 and the default block_size
of 32KB correlated with the number of bytes sent before the transfer stalled. 23 * 32KB is a little over 700KB. I can't explain why I needed to limit the queue_size
on this deployment, but it solved my problem.
Solution 2
This feels like an MTU issue to me. I've seen similar problems with scping files when there is a path MTU mismatch. TCP buffers on the host can fill up, which is why you send more than a few packets, but once they fill up your client then stalls - long after the MTU issue itself actually had an impact.
To verify this you could look at a tool like scamper. Or you could try manually clamping your MTU to some lower point - say 1280 bytes - and seeing if that fixes it. The chances are good that there's only a slight mismatch, eg because of a PPPoE link in the path that reduces the MTU to 1492.
Solution 3
perhaps you can play with the block_size
parameter? I've experienced some weirdness related to file uploads over SSH/SSL and somehow adjusting the block / chunk size helped. I believe the default is 32 * 1024
, so perhaps lowering it to 16, 8, 4 and trying might make some difference. Might be worth a shot.
Solution 4
The default queue size on Windows was set to 4, but this default was not being picked up due to a bug on the module.
It has been corrected (and BTW, the Windows default queue size changed to 16). You can now download Net::SFTP::Foreign 1.70_08.
Next time you find a problem like this, please, report it!
Related videos on Youtube
Secure Shel
Updated on September 18, 2022Comments
-
Secure Shel over 1 year
OVERVIEW
I'm scripting SFTP file transfers on a Windows 7 box in the field. When I transfer zip files larger than a megabyte or so, the transfer (and SSH) stops unexpectedly after about 700KB. The SSH process seems to hang but it doesn't start wildly consuming CPU or memory and Windows doesn't say it's stopped responding. The server eventually gives up and times out, but the client never returns (despite setting ServerAliveCountMax and ServerAliveInterval). The server logs show the transfer start followed by the timeout and disconnect after a long time.
DETAILS
Re-running the script back to back to back, the same number of bytes are transferred before the hang each time. Walking away and coming back, then running the script again, the byte count will vary by a few tens of kilobytes.
The command-line sftp.exe client works properly with default options; those larger zip files transfer successfully. I can't test scp.exe because the SSH user account doesn't have shell access. I believe the problem persisted across an OS wipe and re-image from XP to 7 so I suspect a network problem, but I can't fathom what kind of problem would cause this behavior.
I'm scripting with Perl and the Net::SFTP::Foreign module. Here are the relevant bits of code extracted from my test case script (SSH output log attached below):
my %connectOpts = ( user => $username ,timeout => $timeout ,warn => sub {} ,more => [ "-o PreferredAuthentications=publickey", "-o ServerAliveCountMax=6", "-o ServerAliveInterval=" . int($timeout/6), "-vvvv" ] ); my $sftp = Net::SFTP::Foreign::Compat->new( $host, %connectOpts ); foreach my $local_file ( @local_files ) { $sftp->do_stat( 'upload.tmp' ) && print( "Overwriting upload.tmp\n" ); $sftp->put( $local_file, 'upload.tmp' ); $sftp->do_rename( 'upload.tmp', basename( $local_file )); }
I use this code and very similar configuration on dozens of other field PCs for the same purpose. This is the first one I've seen with this problem. I'm looking for a root cause or diagnostic advice, not a workaround outside this ecosystem. There are countless ways to script this, but I need the granularity, feedback, and modularity of the Net::SFTP::Foreign module for logging and error reporting. Perl is the only scripting language available.
Thanks. This has been incredibly frustrating.
Configuration:
- Windows 7
- ActivePerl 5.8.9
- Net::SFTP::Foreign v1.69
- OpenSSH 5.3p1 (Windows client from CopSSH distro)
- OpenVPN 2.2-rc (Windows connection back to corporate network)
- Tectia Server 6.2.1.168 (Linux SSH server)
Client output (-vvvv) until I manually terminated ssh.exe after it went sideways:
OpenSSH_5.3p1, OpenSSL 0.9.8k 25 Mar 2009 debug1: Reading configuration data /etc/ssh_config debug2: ssh_connect: needpriv 0 debug1: Connecting to sftp.server.com [xx.xx.xx.xx] port 22. debug1: Connection established. debug1: identity file /cygdrive/c/program files/copssh/.ssh/identity type -1 debug3: Not a RSA1 key file /cygdrive/c/program files/copssh/.ssh/id_rsa. debug2: key_type_from_name: unknown key type '-----BEGIN' debug3: key_read: missing keytype debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug2: key_type_from_name: unknown key type '-----END' debug3: key_read: missing keytype debug1: identity file /cygdrive/c/program files/copssh/.ssh/id_rsa type 1 debug1: identity file /cygdrive/c/program files/copssh/.ssh/id_dsa type -1 debug1: Remote protocol version 2.0, remote software version 6.2.1.168 SSH Tectia Server debug1: no match: 6.2.1.168 SSH Tectia Server debug1: Enabling compatibility mode for protocol 2.0 debug1: Local version string SSH-2.0-OpenSSH_5.3 debug2: fd 3 setting O_NONBLOCK debug1: SSH2_MSG_KEXINIT sent debug3: Wrote 792 bytes for a total of 813 debug1: SSH2_MSG_KEXINIT received debug2: kex_parse_kexinit: diffie-hellman-group-exchange-sha256,diffie-hellman-g roup-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1 debug2: kex_parse_kexinit: ssh-rsa,ssh-dss debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour12 8,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,rij [email protected] debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour12 8,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,rij [email protected] debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,[email protected],hmac-ripemd160 ,[email protected],hmac-sha1-96,hmac-md5-96 debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,[email protected],hmac-ripemd160 ,[email protected],hmac-sha1-96,hmac-md5-96 debug2: kex_parse_kexinit: none,[email protected],zlib debug2: kex_parse_kexinit: none,[email protected],zlib debug2: kex_parse_kexinit: debug2: kex_parse_kexinit: debug2: kex_parse_kexinit: first_kex_follows 0 debug2: kex_parse_kexinit: reserved 0 debug2: kex_parse_kexinit: diffie-hellman-group1-sha1,diffie-hellman-group14-sha 1,[email protected],diffie-hellman-group-exchange-sha1,diffi e-hellman-group-exchange-sha256 debug2: kex_parse_kexinit: ssh-rsa,[email protected] debug2: kex_parse_kexinit: aes256-cbc,aes192-cbc,aes128-cbc,3des-cbc debug2: kex_parse_kexinit: aes256-cbc,aes192-cbc,aes128-cbc,3des-cbc debug2: kex_parse_kexinit: hmac-sha1 debug2: kex_parse_kexinit: hmac-sha1 debug2: kex_parse_kexinit: none,zlib debug2: kex_parse_kexinit: none,zlib debug2: kex_parse_kexinit: debug2: kex_parse_kexinit: debug2: kex_parse_kexinit: first_kex_follows 0 debug2: kex_parse_kexinit: reserved 0 debug2: mac_setup: found hmac-sha1 debug1: kex: server->client aes128-cbc hmac-sha1 none debug2: mac_setup: found hmac-sha1 debug1: kex: client->server aes128-cbc hmac-sha1 none debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<2048<8192) sent debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP debug3: Wrote 24 bytes for a total of 837 debug2: dh_gen_key: priv key bits set: 160/320 debug2: bits set: 1016/2048 debug1: SSH2_MSG_KEX_DH_GEX_INIT sent debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY debug3: Wrote 272 bytes for a total of 1109 debug3: check_host_in_hostfile: filename /cygdrive/c/program files/copssh/.ssh/known_hosts debug3: check_host_in_hostfile: match line 2 debug3: check_host_in_hostfile: filename /cygdrive/c/program files/copssh/.ssh/known_hosts debug3: check_host_in_hostfile: match line 1 debug1: Host 'sftp.server.com' is known and matches the RSA host key. debug1: Found key in /cygdrive/c/program files/copssh/.ssh/known_hosts:2 debug2: bits set: 995/2048 debug1: ssh_rsa_verify: signature correct debug2: kex_derive_keys debug2: set_newkeys: mode 1 debug1: SSH2_MSG_NEWKEYS sent debug1: expecting SSH2_MSG_NEWKEYS debug3: Wrote 16 bytes for a total of 1125 debug2: set_newkeys: mode 0 debug1: SSH2_MSG_NEWKEYS received debug1: SSH2_MSG_SERVICE_REQUEST sent debug3: Wrote 52 bytes for a total of 1177 debug3: Received SSH2_MSG_IGNORE debug3: Received SSH2_MSG_IGNORE debug2: service_accept: ssh-userauth debug1: SSH2_MSG_SERVICE_ACCEPT received debug2: key: /cygdrive/c/program files/copssh/.ssh/identity (0x0) debug2: key: /cygdrive/c/program files/copssh/.ssh/id_rsa (0xd4b720) debug2: key: /cygdrive/c/program files/copssh/.ssh/id_dsa (0x0) debug3: Wrote 84 bytes for a total of 1261 debug3: Received SSH2_MSG_IGNORE debug1: Authentications that can continue: publickey,keyboard-interactive debug3: start over, passed a different list publickey,keyboard-interactive debug3: preferred publickey debug3: authmethod_lookup publickey debug3: remaining preferred: debug3: authmethod_is_enabled publickey debug1: Next authentication method: publickey debug1: Trying private key: /cygdrive/c/program files/copssh/.ssh/identity debug3: no such identity: /cygdrive/c/program files/copssh/.ssh/identity debug1: Offering public key: /cygdrive/c/program files/copssh/.ssh/id_rsa debug3: send_pubkey_test debug2: we sent a publickey packet, wait for reply debug3: Wrote 372 bytes for a total of 1633 debug3: Received SSH2_MSG_IGNORE debug1: Server accepts key: pkalg ssh-rsa blen 277 debug2: input_userauth_pk_ok: fp (...) debug3: sign_and_send_pubkey debug1: read PEM private key done: type RSA debug3: Wrote 644 bytes for a total of 2277 debug3: Received SSH2_MSG_IGNORE debug1: Authentication succeeded (publickey). debug2: fd 4 setting O_NONBLOCK debug2: fd 5 setting O_NONBLOCK debug1: channel 0: new [client-session] debug3: ssh_session2_open: channel_new: 0 debug2: channel 0: send open debug1: Entering interactive session. debug3: Wrote 68 bytes for a total of 2345 debug3: Received SSH2_MSG_IGNORE debug2: callback start debug2: client_session2_setup: id 0 debug1: Sending subsystem: sftp debug2: channel 0: request subsystem confirm 1 debug2: fd 3 setting TCP_NODELAY debug2: callback done debug2: channel 0: open confirm rwindow 65536 rmax 32768 debug3: Wrote 68 bytes for a total of 2413 debug3: Wrote 52 bytes for a total of 2465 debug3: Received SSH2_MSG_IGNORE debug2: channel_input_status_confirm: type 99 id 0 debug2: subsystem request accepted on channel 0 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 68 bytes for a total of 2533 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 84 bytes for a total of 2617 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 68 bytes for a total of 2685 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 16436 bytes for a total of 19121 debug3: Wrote 4148 bytes for a total of 23269 debug3: Wrote 12372 bytes for a total of 35641 debug3: Wrote 16436 bytes for a total of 52077 debug3: Wrote 4148 bytes for a total of 56225 debug3: Wrote 12196 bytes for a total of 68421 debug3: Received SSH2_MSG_IGNORE debug2: channel 0: rcvd adjust 32899 debug3: Wrote 212 bytes for a total of 68633 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 16436 bytes for a total of 85069 debug3: Wrote 4148 bytes for a total of 89217 debug3: Received SSH2_MSG_IGNORE debug2: channel 0: rcvd adjust 98337 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 12372 bytes for a total of 101589 debug3: Wrote 16436 bytes for a total of 118025 debug3: Wrote 4148 bytes for a total of 122173 debug3: Wrote 12372 bytes for a total of 134545 debug3: Wrote 16436 bytes for a total of 150981 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 4148 bytes for a total of 155129 debug3: Wrote 12372 bytes for a total of 167501 debug3: Received SSH2_MSG_IGNORE debug2: channel 0: rcvd adjust 196674 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 16436 bytes for a total of 183937 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 4148 bytes for a total of 188085 debug3: Wrote 12372 bytes for a total of 200457 debug3: Wrote 16436 bytes for a total of 216893 debug3: Wrote 4148 bytes for a total of 221041 debug3: Wrote 12372 bytes for a total of 233413 debug3: Wrote 16436 bytes for a total of 249849 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 4148 bytes for a total of 253997 debug3: Wrote 12372 bytes for a total of 266369 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 16436 bytes for a total of 282805 debug3: Wrote 4148 bytes for a total of 286953 debug3: Wrote 12372 bytes for a total of 299325 debug3: Received SSH2_MSG_IGNORE debug2: channel 0: rcvd adjust 131204 debug3: Wrote 16436 bytes for a total of 315761 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 4148 bytes for a total of 319909 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 12372 bytes for a total of 332281 debug3: Wrote 16436 bytes for a total of 348717 debug3: Wrote 4148 bytes for a total of 352865 debug3: Wrote 12372 bytes for a total of 365237 debug3: Wrote 16436 bytes for a total of 381673 debug3: Wrote 4148 bytes for a total of 385821 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 12372 bytes for a total of 398193 debug3: Wrote 16436 bytes for a total of 414629 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 4148 bytes for a total of 418777 debug3: Wrote 12372 bytes for a total of 431149 debug3: Wrote 16436 bytes for a total of 447585 debug3: Received SSH2_MSG_IGNORE debug2: channel 0: rcvd adjust 393348 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 4148 bytes for a total of 451733 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 12372 bytes for a total of 464105 debug3: Wrote 16436 bytes for a total of 480541 debug3: Wrote 4148 bytes for a total of 484689 debug3: Wrote 12372 bytes for a total of 497061 debug3: Wrote 16436 bytes for a total of 513497 debug3: Wrote 4148 bytes for a total of 517645 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 12372 bytes for a total of 530017 debug3: Wrote 16436 bytes for a total of 546453 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 4148 bytes for a total of 550601 debug3: Wrote 12372 bytes for a total of 562973 debug3: Wrote 16436 bytes for a total of 579409 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 4148 bytes for a total of 583557 debug3: Wrote 12372 bytes for a total of 595929 debug3: Wrote 16436 bytes for a total of 612365 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 4148 bytes for a total of 616513 debug3: Wrote 12372 bytes for a total of 628885 debug3: Wrote 16436 bytes for a total of 645321 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 4148 bytes for a total of 649469 debug3: Wrote 12372 bytes for a total of 661841 debug3: Wrote 16436 bytes for a total of 678277 debug3: Wrote 4148 bytes for a total of 682425 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 12372 bytes for a total of 694797 debug3: Wrote 16436 bytes for a total of 711233 debug3: Received SSH2_MSG_IGNORE debug2: channel 0: rcvd adjust 738120 debug3: Received SSH2_MSG_IGNORE debug3: Wrote 4148 bytes for a total of 715381 debug3: Received SSH2_MSG_IGNORE
UPDATE 1
Per @Daniel Lawson's advice, I tried using netsh to clamp the VPN interface's MTU to 1280 to avoid filling the client's TCP buffers in case of an MTU mismatch along the path to the server. The transfer still stalled at about the same point. I also tried disconnecting the VPN and clamping the physical interface's MTU to 1280 with the same failed result.-
Gregory MOUSSAT about 12 yearsYou don't say anything about running your transfert manually from the command line, with sftp and the desired options. Is it okay, or do you have the same problem ?
-
Secure Shel about 12 yearsActually, I did note that "the command-line sftp.exe client works properly." I've just now added a bit more explanation; sftp.exe file transfers succeed with default options. I've also tried Net::SFTP::Foreign without the keepalive options and got the same failure as before.
-
Secure Shel about 12 yearsGreat suggestion. I played with ping -f -l and it looks like the 1500 MTU holds up from client to server. I used the netsh shell to clamp the VPN network interface MTU to 1280, but experienced the same stall. I also tried running the transfer over the physical interface with the MTU clamped to 1280 and the VPN disconnected--no luck there, either.
-
Secure Shel about 12 yearsThanks, this was really close. I varied the Net::SFTP::Foreign::put() block_size parameter (default 32KB), but choosing smaller blocks translated into less data sent before the transfer stalled. There's a related queue_size parameter that controls the "maximum number of requests that can be concurrently waiting for a server response". It starts at 32. By cutting that down to 23, I finally got transfers up to 5MB to succeed. They fail with queue_size 24, so I'll probably back off to 16 for safety. Still not sure why this installation needs this adjustment, but it's working.
-
Yoav Aner about 12 yearsGreat to hear. I should have mentioned the queue_size too, but wasn't sure what it did exactly and didn't want to overwhelm with too many options that might or might not work. This upload chunking tweaks are also a mystery to me. I think it's something rather deep within OpenSSL/OpenSSH, but as long as there are workarounds we can live with it.
-
Secure Shel about 12 yearsHey, nifty. Thanks for the pointer! Since we use Net::SFTP::Foreign on so many boxes without issue (though often v0.61), I just assumed the remote network was at fault here. Never suspected a bug in the library.