Authentication very slow (both with ssh and su)
Solution 1
Found the issue (thanks this post https://stackoverflow.com/questions/8413975/10-second-delay-between-login-and-shell-prompt)
The issue is in /etc/profile.d/zzzz-vamilocale.sh file, which is trying to read something from vm properties and stuck in that. removing this files solves the issue.
My way to debug this issue: 1. login as problem user 2. invoke "bash --login --verbose" 3. Find out line, which stops the executions 4. Find this line in one file from /etc/profile.d/
Solution 2
As a general way to investigate a slow authentication, check /etc/pam.conf
and /etc/pam.d/su
(and /etc/pam.d/sshd
etc.) to see what kind of authentication the login services perform. Check the system logs to see if anything is logged (look for log entries dating from the time of the authentication).
In your case, it's the kernel logs that reveal the problem. The “page allocation failure” messages indicate that your system is out of virtual memory. Either kill some programs or increase the swap space.
Solution 3
As you get the slow down after login, technically, you need to check what can happen between login and the first shell prompt.
A custom setup of the shell could do a lot of creative things that can fail in miserable ways under some special conditions -
like at login before actually entering interactive mode,
or while DISPLAY is not yet defined.
In this case, we look for things blocking and timing out, or doing slow things -
like the cartoon-esque automounting all home directories,
of all present and past employees.
Except your own, of course.
That was mounted already.
Sounds bad?
No. It's easy to find, and funny, for most.
The time after login, up until before showing the shell, is used by any maintenance tasks, these will slow it down. But maintenance tasks run only sporadically. And, of course, they are not run all together at once, like each week on first login after Monday 8:00 - because that would be slow, right?
There happen lots of unexpected things
The point illustrated is that there may happen a lot after login, before showing the shell.
One class of problems that I have seen to be slow at login on Ubuntu is generating the message of the day - motd.
This is the text shown before the first shell prompt on login.
It may be empty, or just some simple greeting.
Take a look at ls -l /etc/update-motd.d
$ ls -l /etc/update-motd.d
total 28
-rwxr-xr-x 1 root root 1220 Jul 3 2012 00-header
-rwxr-xr-x 1 root root 1358 Jul 3 2012 10-help-text
-rwxr-xr-x 1 root root 149 Mar 19 2012 90-updates-available
-rwxr-xr-x 1 root root 147 Aug 8 2012 91-release-upgrade
-rwxr-xr-x 1 root root 142 Mar 19 2012 98-fsck-at-reboot
-rwxr-xr-x 1 root root 144 Mar 19 2012 98-reboot-required
-rwxr-xr-x 1 root root 1158 Jul 3 2012 99-footer
Especially 90-updates-available
seems to be very slow from time to time in this example - so check what you find there, and possibly disable some for testing by moving them away to /etc/update-motd.d.disabled
.
Related videos on Youtube
techedemic
Updated on September 18, 2022Comments
-
techedemic over 1 year
I'm having a problem on one of our production servers where the login attempts become progressively slower as time passes. After roughly 5 days it becomes so slow that some critical processes/cron entries fail to fire correctly.
Server information:
Hardware: Dell R720 24 GB Memory 2 x Intel Xeon E502620 v2 Processors (24 cores total, including HyperThreading) 8 x 300GB 10K SAS Drives
OS: Red Hat Enterprise Linux 6.5
I experienced issues logging in via SSH and started along a long road of investigating 'red herrings'. Eventually, I noticed that even doing the following took very long:
[someuser#hostname] su - Password:
When running the the 'su -' command as above, it shouldn't involve SSH in any form, as I am just trying to authenticate on the box itself, right?
This has happened 3 weeks in a row on the same box and it's the first time I noticed (and tested I might add) that the local only logins are also slow.
When I log in via the console, it looks as follows:
hostname login: user Password: # I enter the password and hit [enter]
A long time passes and then...
[user@hostname ~] $
When I log in via SSH, it looks as follows (verbosity on and doing a localhost login):
[user@hostname ~]$ ssh -v root@localhost OpenSSH_5.3p1, OpenSSL 1.0.0-fips 29 Mar 2010 debug1: Reading configuration data /etc/ssh/ssh_config debug1: Applying options for * debug1: Connecting to localhost [127.0.0.1] port 22. debug1: Connection established. debug1: identity file /usr/local/user/.ssh/identity type -1 debug1: identity file /usr/local/user/.ssh/identity-cert type -1 debug1: identity file /usr/local/user/.ssh/id_rsa type -1 debug1: identity file /usr/local/user/.ssh/id_rsa-cert type -1 debug1: identity file /usr/local/user/.ssh/id_dsa type -1 debug1: identity file /usr/local/user/.ssh/id_dsa-cert type -1 debug1: Remote protocol version 2.0, remote software version OpenSSH_5.3 debug1: match: OpenSSH_5.3 pat OpenSSH* debug1: Enabling compatibility mode for protocol 2.0 debug1: Local version string SSH-2.0-OpenSSH_5.3 debug1: SSH2_MSG_KEXINIT sent debug1: SSH2_MSG_KEXINIT received debug1: kex: server->client aes128-ctr hmac-md5 none debug1: kex: client->server aes128-ctr hmac-md5 none debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sent debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP debug1: SSH2_MSG_KEX_DH_GEX_INIT sent debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY The authenticity of host 'localhost (127.0.0.1)' can't be established. RSA key fingerprint is 1d:50:5e:a3:e4:63:d6:1d:d8:2c:85:07:95:81:c8:b6. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added 'localhost' (RSA) to the list of known hosts. debug1: ssh_rsa_verify: signature correct debug1: SSH2_MSG_NEWKEYS sent debug1: expecting SSH2_MSG_NEWKEYS debug1: SSH2_MSG_NEWKEYS received debug1: SSH2_MSG_SERVICE_REQUEST sent debug1: SSH2_MSG_SERVICE_ACCEPT received debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic,password debug1: Next authentication method: gssapi-keyex debug1: No valid Key exchange context debug1: Next authentication method: gssapi-with-mic debug1: Unspecified GSS failure. Minor code may provide more information Credentials cache file '/tmp/krb5cc_501' not found debug1: Unspecified GSS failure. Minor code may provide more information Credentials cache file '/tmp/krb5cc_501' not found debug1: Unspecified GSS failure. Minor code may provide more information debug1: Unspecified GSS failure. Minor code may provide more information Credentials cache file '/tmp/krb5cc_501' not found debug1: Next authentication method: publickey debug1: Trying private key: /usr/local/user/.ssh/identity debug1: Trying private key: /usr/local/user/.ssh/id_rsa debug1: Trying private key: /usr/local/user/.ssh/id_dsa debug1: Next authentication method: password root@localhost's password: debug1: Authentication succeeded (password). debug1: channel 0: new [client-session] debug1: Requesting [email protected] debug1: Entering interactive session.
At this point it gets stuck, for a long while again, and eventually logs in successfully.
Any pointers will be appreciated. This is driving me up the walls.
The following also appears in the dmesg output. It repeats for various process names (not only 'cifsd')
cifsd: page allocation failure. order:5, mode:0x20 Pid: 12913, comm: cifsd Not tainted 2.6.32-431.el6.x86_64 #1 Call Trace: [<ffffffff8112f9e7>] ? __alloc_pages_nodemask+0x757/0x8d0 [<ffffffff8116e482>] ? kmem_getpages+0x62/0x170 [<ffffffff8116f09a>] ? fallback_alloc+0x1ba/0x270 [<ffffffff8116eaef>] ? cache_grow+0x2cf/0x320 [<ffffffff8116ee19>] ? ____cache_alloc_node+0x99/0x160 [<ffffffff8116ffe0>] ? kmem_cache_alloc_node_trace+0x90/0x200 [<ffffffff811701fd>] ? __kmalloc_node+0x4d/0x60 [<ffffffff8144feca>] ? __alloc_skb+0x7a/0x180 [<ffffffff81450fe0>] ? skb_copy+0x40/0xb0 [<ffffffffa014f57c>] ? tg3_start_xmit+0xa8c/0xd80 [tg3] [<ffffffff81460354>] ? dev_hard_start_xmit+0x224/0x480 [<ffffffff8147bd0a>] ? sch_direct_xmit+0x15a/0x1c0 [<ffffffff81460858>] ? dev_queue_xmit+0x228/0x320 [<ffffffff8149a0d8>] ? ip_finish_output+0x148/0x310 [<ffffffff8149a358>] ? ip_output+0xb8/0xc0 [<ffffffff8105a924>] ? find_busiest_group+0x244/0x9f0 [<ffffffff81499655>] ? ip_local_out+0x25/0x30 [<ffffffff81499b30>] ? ip_queue_xmit+0x190/0x420 [<ffffffff8112ff2f>] ? free_hot_page+0x2f/0x60 [<ffffffff814aee3e>] ? tcp_transmit_skb+0x40e/0x7b0 [<ffffffff814b1380>] ? tcp_write_xmit+0x230/0xa90 [<ffffffff814b1f00>] ? __tcp_push_pending_frames+0x30/0xe0 [<ffffffff814a9663>] ? tcp_data_snd_check+0x33/0x100 [<ffffffff814ad261>] ? tcp_rcv_established+0x381/0x7f0 [<ffffffff8152873a>] ? schedule_timeout+0x19a/0x2e0 [<ffffffff814b5643>] ? tcp_v4_do_rcv+0x2e3/0x490 [<ffffffff814a130a>] ? tcp_prequeue_process+0x7a/0xa0 [<ffffffff814a4a2c>] ? tcp_recvmsg+0xacc/0xe80 [<ffffffff814c58ca>] ? inet_recvmsg+0x5a/0x90 [<ffffffff8105a625>] ? select_idle_sibling+0x95/0x150 [<ffffffff81449ab3>] ? sock_recvmsg+0x133/0x160 [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40 [<ffffffff81059216>] ? enqueue_task+0x66/0x80 [<ffffffff8105571d>] ? check_preempt_curr+0x6d/0x90 [<ffffffff81065c5e>] ? try_to_wake_up+0x24e/0x3e0 [<ffffffff81065e02>] ? default_wake_function+0x12/0x20 [<ffffffff8109b2b6>] ? autoremove_wake_function+0x16/0x40 [<ffffffff81449b24>] ? kernel_recvmsg+0x44/0x60 [<ffffffffa01fd7c9>] ? cifs_readv_from_socket+0x1a9/0x260 [cifs] [<ffffffffa020b11d>] ? cifs_add_credits+0x5d/0x70 [cifs] [<ffffffffa01fd8a7>] ? cifs_read_from_socket+0x27/0x30 [cifs] [<ffffffffa01fda03>] ? cifs_demultiplex_thread+0x153/0xe10 [cifs] [<ffffffff81065e02>] ? default_wake_function+0x12/0x20 [<ffffffffa01fd8b0>] ? cifs_demultiplex_thread+0x0/0xe10 [cifs] [<ffffffff8109aef6>] ? kthread+0x96/0xa0 [<ffffffff8100c20a>] ? child_rip+0xa/0x20 [<ffffffff8109ae60>] ? kthread+0x0/0xa0 [<ffffffff8100c200>] ? child_rip+0x0/0x20
-
Admin almost 10 yearsThe call trace looks like you have a memory issue. This can make the entire system slow, especially when referred to logins, that as the process(es) is not constantly alive.
-
Admin almost 10 yearsCan you post your /etc/ssh/sshd_config file. I thinks you have to disable the kerberos authentication part. Try to comment out # Kerberos options and # GSSAPI options sections
-
Admin almost 10 yearsIs this problem present after a fresh system reboot? Is the problem present even if you use
bash --login
? Is your home directory in a network filesystem? -
Admin almost 10 years@pqnet No the logins are fast after a reboot. I cannot answer your second question, as I restarted the server and it is fast now. This is a standalone machine with all storage locally attached.
-
-
techedemic almost 10 yearsThanks. We found that to be the problem, but I am curious why a machine with significant specs would run out of resources so quickly. The machine is not swapping (much). I read somewhere that modifying the following at runtime might fix the problem (or obviously permanently in the sysctl file):
/proc/sys/vm/min_free_kbytes = 512000
/proc/sys/vm/zone_reclaim_mode = 512000
I haven't tried this but the moment the problem re-appears we will have a look at it. -
Valath over 9 yearsThank you! this has been vexing me for quite some time now.
-
petertc about 7 yearsIn this way I found the problem is
__sdkman_secure_curl_with_timeouts "$VERSION_URL"
in my machine that slow the login procedure. -
Jeff Schaller about 5 yearsIt's be great if you could connect SELinux in any way to the symptoms in the question.