CIFS randomly losing connection to Windows share


I found an interesting related post, "cifs mounted folder keeps disconnecting (ubuntu server)", describing a similar problem (same error, Samba shares).

The relevant tidbit here, for following the rest of the answer, is that CIFS mounts use the SMBv1.0 protocol by default, as can be verified by issuing the mount command and paying attention to the vers=1.0 field.

$ mount
//10.2.1.2/XX/ZZ/YY on /mnt/mount_point type cifs (rw,relatime,vers=1.0,cache=strict,username=someusername,domain=XXX,uid=1001,forceuid,gid=1001,forcegid,addr=10.2.1.2,file_mode=0770,dir_mode=0770,nounix,serverino,mapposix,rsize=61440,wsize=65536,echo_interval=60,actimeo=1)
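The check above can be scripted when there are several shares to inspect. The following is only a minimal sketch: cifs_vers is a hypothetical helper name, and it assumes the "source on mountpoint type fstype (options)" layout shown above, with no spaces in the paths.

```shell
#!/bin/sh
# cifs_vers (hypothetical helper): read mount(8)-style lines on stdin and
# print "<mount point> vers=<dialect>" for every cifs entry.
# Assumes the layout: SOURCE on MOUNTPOINT type FSTYPE (OPTIONS)
cifs_vers() {
    awk '$5 == "cifs" {
        opts = $6
        gsub(/[()]/, "", opts)            # strip the surrounding parentheses
        n = split(opts, o, ",")           # one array element per mount option
        v = "unknown"
        for (i = 1; i <= n; i++)
            if (o[i] ~ /^vers=/) v = substr(o[i], 6)
        print $3, "vers=" v
    }'
}

# Usage: mount | cifs_vers
```

Piping the output of mount through the function lists each CIFS mount point next to the dialect currently in use, which makes a leftover vers=1.0 easy to spot.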

I also found, on Stack Overflow, the post "Mount CIFS Host is down":

This could be also because of protocol mismatch. In 2017 Microsoft patched Windows Servers and advised to disable the SMB1 protocol.

From now on, mount.cifs might have problems with protocol negotiation.

The error displayed is "Host is down." but when you do debug with:

smbclient -L <server_ip> -U <username> -d 256

you will get the error:

protocol negotiation failed: NT_STATUS_CONNECTION_RESET

The post mentions that Windows patches to the protocol (in response to WannaCry and others) have changed its behaviour or, more exactly, that some people have disabled SMBv1/CIFS requests altogether. Similar problems have been appearing on the Windows side and, given the timing, I suspect this problem is related.

We have not disabled v1 CIFS on this specific server, AFAIK (and testing confirms this); however, the MS bulletins suggest the default SMBv1 behaviour was (slightly) changed.

I ended up following the general idea suggested in the mentioned Samba question. From man mount.cifs:

vers=

    SMB protocol version. Allowed values are:
    • 1.0 - The classic CIFS/SMBv1 protocol. This is the default.

    • 2.0 - The SMBv2.002 protocol. This was initially introduced in Windows Vista Service Pack 1, and Windows Server 2008. Note that the initial release version of Windows Vista spoke a slightly different dialect (2.000) that is not supported.

    • 2.1 - The SMBv2.1 protocol that was introduced in Microsoft Windows 7 and Windows Server 2008R2.

    • 3.0 - The SMBv3.0 protocol that was introduced in Microsoft Windows 8 and Windows Server 2012.

    Note too that while this option governs the protocol version used, not all features of each version are available.

--verbose

    Print additional debugging information for the mount. Note that this parameter must be specified before the -o. For example:
    mount -t cifs //server/share /mnt --verbose -o user=username

As the manual shows, for recent Windows versions (Windows 8 and later) using at least vers=2.0 may make more sense; the alternative command-line syntax with the --verbose option is also useful for debugging any complication that may arise.
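The version list in the man page excerpt can be restated as a small lookup. This is a sketch only: suggest_vers is a hypothetical helper, and it knows just the releases named in the excerpt, returning the dialect that was introduced with each one. It deliberately prints nothing for anything else rather than falling back to 1.0.

```shell
#!/bin/sh
# suggest_vers (hypothetical helper): map a Windows release named in the
# mount.cifs man page excerpt to the SMB dialect introduced with it.
# Prints nothing and returns 1 for releases the excerpt does not mention.
suggest_vers() {
    case "$1" in
        "Windows 8"|"Windows Server 2012")          echo 3.0 ;;
        "Windows 7"|"Windows Server 2008 R2")       echo 2.1 ;;
        "Windows Vista SP1"|"Windows Server 2008")  echo 2.0 ;;
        *)                                          return 1 ;;
    esac
}

# Usage: suggest_vers "Windows Server 2008 R2"
```

The value it prints is what would go after vers= in the mount options; newer servers generally also accept the older dialects, so treat the result as a starting point to test, not a guarantee.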

Since the Windows server I am mounting from in this question runs Windows Server 2008 R2, I put in /etc/fstab:

//10.2.1.2/XX/ZZ/YY /mnt/mount_point cifs credentials=/root/.smbcredentials,iocharset=utf8,file_mode=0770,dir_mode=0770,uid=1001,gid=1001,vers=2.1 0 0
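With more than one CIFS entry in /etc/fstab, the same change can be previewed with a filter instead of editing each line by hand. The sketch below is an assumption-laden illustration: add_cifs_vers is a hypothetical helper, it only writes to standard output, and it rewrites the whitespace between fields of the lines it changes to single spaces.

```shell
#!/bin/sh
# add_cifs_vers VERSION (hypothetical helper): read fstab-style lines on
# stdin and append vers=VERSION to the options (4th field) of cifs entries
# that do not set vers= already. It never touches /etc/fstab itself.
add_cifs_vers() {
    awk -v ver="$1" '
        # Comments, short lines and non-cifs entries pass through untouched.
        /^[ \t]*#/ || NF < 4 || $3 != "cifs" { print; next }
        # Append only when no vers= option is present in the option list.
        index("," $4, ",vers=") == 0 { $4 = $4 ",vers=" ver }
        { print }
    '
}

# Usage (review the result before replacing the real file):
#   add_cifs_vers 2.1 < /etc/fstab > /tmp/fstab.new
```

Writing to a separate file first keeps a mistake in the filter from leaving the machine with a broken /etc/fstab.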

I then remounted the share for the option to take effect:

sudo mount -o remount /mnt/mount_point

Now we run mount again to confirm the negotiated protocol:

$ mount
//10.2.1.2/XX/ZZ/YY on /mnt/mount_point type cifs (rw,relatime,vers=2.1,cache=strict,username=someusername,domain=XXX,uid=1001,forceuid,gid=1001,forcegid,addr=10.2.1.2,file_mode=0770,dir_mode=0770,nounix,serverino,mapposix,rsize=61440,wsize=65536,echo_interval=60,actimeo=1)

And we can indeed confirm that we successfully changed the SMB protocol version being used.

See also MS Developer Network - [MS-SMB2]: Versioning and Capability Negotiation - 1.7 Versioning and Capability Negotiation

It should also be noted that CIFS v1.0, besides being obsolete, is extremely inefficient and insecure compared to newer versions of the protocol.

From MS blogs - Stop using SMB1

SMB1 isn’t modern or efficient
When you use SMB1, you lose key performance and productivity optimizations for end users.

  • Larger reads and writes (2.02+) – more efficient use of faster networks or higher latency WANs. Large MTU support.
  • Peer caching of folder and file properties (2.02+) – clients keep local copies of folders and files via BranchCache
  • Durable handles (2.02, 2.1) – allow for connection to transparently reconnect to the server if there is a temporary disconnection
  • Client oplock leasing model (2.02+) – limits the data transferred between the client and server, improving performance on high-latency networks and increasing SMB server scalability
  • Multichannel & SMB Direct (3.0+) – aggregation of network bandwidth and fault tolerance if multiple paths are available between client and server, plus usage of modern ultra-high throughput RDMA infrastructure
  • Directory Leasing (3.0+) – Improves application response times in branch offices through caching

Interestingly enough, this last article suggests that a mount is more likely to recover after a disconnection (durable handles) when using protocol version 2.02 or later, so I would stress again: do not continue using CIFS v1.0. (For example, while echo_interval=60 does keep a 1.0 connection alive, if there is a network glitch or some other server interruption, the mount won't recover by itself without manual intervention.)
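To notice a mount that has stopped answering before users complain, a periodic check can help. The following is a rough sketch under stated assumptions: mount_responds is a hypothetical helper built on coreutils timeout and stat, and the 5-second default is arbitrary. A process blocked on a hung CIFS mount may not be killable right away, so the timeout is best effort.

```shell
#!/bin/sh
# mount_responds DIR [SECONDS] (hypothetical helper): succeed if DIR can be
# stat'ed within the timeout (default 5 seconds), fail otherwise.
# Relies on timeout(1) and stat(1) from coreutils.
mount_responds() {
    timeout "${2:-5}" stat -- "$1" >/dev/null 2>&1
}

# Example check, e.g. from cron (the path is the mount point used above):
#   mount_responds /mnt/mount_point || logger "CIFS mount /mnt/mount_point is not responding"
```

The helper only reports; what to do on failure (log, alert, or attempt a remount) is left to the caller.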

As a last piece of advice, avoid doing sudo mount -a, and start doing:

sudo mount -o remount -a

See also my question "CIFS mounting multiple copies of the same share on the same mount point".


Author: Rui F Ribeiro

Updated on September 18, 2022

Comments

  • Rui F Ribeiro almost 2 years

    I have had a couple of directories from a Windows share mounted remotely on a Debian Jessie machine for a few months.

    In the last few weeks I have kept getting complaints of random disconnects from the mount point, and have had to run

    sudo mount -a
    

    to regain the mount connectivity, a couple of times (the server is used once or twice a week).

    That is, the mounts often do not recover after some period without being used.

    The Windows administrator also told me the Windows server has not been rebooted for a while.

    Today, coincidentally, when running mount -a again, it only worked on the second try; the first try gave the following error:

    sudo mount -a
    mount error(104): Connection reset by peer
    Refer to the mount.cifs(8) manual page (e.g. man mount.cifs)
    mount error(112): Host is down
    Refer to the mount.cifs(8) manual page (e.g. man mount.cifs)
    

    The directories are mounted from /etc/fstab as such:

    //10.2.1.2/XX/ZZ/YY /mnt/mount_point cifs credentials=/root/.smbcredentials,iocharset=utf8,file_mode=0770,dir_mode=0770,uid=1001,gid=1001 0 0

    When doing a mount command, you can also see the option echo_interval is activated by default at 60 seconds.

    $ mount
    //10.2.1.2/XX/ZZ/YY on /mnt/mount_point type cifs (rw,relatime,vers=1.0,cache=strict,username=someusername,domain=XXX,uid=1001,forceuid,gid=1001,forcegid,addr=10.2.1.2,file_mode=0770,dir_mode=0770,nounix,serverino,mapposix,rsize=61440,wsize=65536,echo_interval=60,actimeo=1)

    What to do?

    • Pieter over 2 years
      echo_interval=60: here 60 is the echo interval in seconds, not minutes (see the man page).
    • Rui F Ribeiro over 2 years
      @Pieter Thanks, edited.
  • Shan about 3 years
    I was able to mount successfully, but the mount is no longer available after a while. If I try df -h, the command completes in approx. 15-20 seconds and then the mount point is not in the list. If I try to ls /mount_point I get a "Host is down." message. No clues though. Once I get this behavior I am not able to remount; I need to restart the machine to mount again.
  • Rui F Ribeiro about 3 years
    @Shan There are a lot of variables in the equation: server-side issues, networking issues, firewalling issues. These posts are breadcrumbs; they won't automagically solve all complex problems, you have to follow the path of debugging them. While the vocation of the group is not debugging complex network situations for askers per se, if there are specific Unix doubts, try to describe the issues well; it would be better to open a new question.
  • Shan about 3 years
    Agreed, but I'm clueless with no logs or traces. Anyway, I have limited access to the shared folder in Windows, so I asked the team to check for any useful information. Thanks a lot :)
  • Rui F Ribeiro about 3 years
    @Shan My non-technical opinion: my job for a decade has been supporting, and being the interface on the Linux/networking side for, people like you. You would benefit from enlisting the help of more experienced sysadmin/network people in your organization, or from outside consulting.