How can I improve nscd's cache hit rate?


Solution 1

What part of your webserver is even doing DNS lookups? Most webserver configurations explicitly disable reverse DNS lookup of each incoming user, for speed (because DNS is slow in general).

As Patrick notes, nscd is doing the right thing by respecting the TTL values on the records. Yes, you could override them (unbound lets you do this easily: just modify cache-min-ttl in the server section, though its documentation warns against raising it beyond 1 hour for the same reasons). However, your queries are probably mostly rDNS, which tends to have longer TTLs in general.

Additionally, your maximum number of cached values is so low that it suggests you're hardly getting any traffic.

If you do care about where your users repeat from that often, I'd suggest logging it outside nscd, and not worrying about it anymore.
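One way to track repeat clients outside nscd is to count them straight from the webserver's access log. The following is only a minimal sketch: it assumes the client IP is the first whitespace-separated field of each log line (as in the common log format), and the function name `repeat_clients` and the sample lines are made up for illustration.

```python
from collections import Counter

def repeat_clients(log_lines, min_requests=2):
    """Return {client_ip: request_count} for clients seen at least min_requests times.

    Assumption: the client address is the first whitespace-separated field.
    """
    counts = Counter(
        line.split(maxsplit=1)[0] for line in log_lines if line.strip()
    )
    return {ip: n for ip, n in counts.items() if n >= min_requests}

# Hypothetical sample lines (documentation-range addresses, not real traffic).
SAMPLE_LOG = [
    '203.0.113.5 - - [09/Dec/2013:10:00:00 +0000] "GET / HTTP/1.1" 200 512',
    '198.51.100.7 - - [09/Dec/2013:10:00:05 +0000] "GET /a HTTP/1.1" 200 128',
    '203.0.113.5 - - [09/Dec/2013:10:01:00 +0000] "GET /b HTTP/1.1" 200 256',
]

print(repeat_clients(SAMPLE_LOG))  # {'203.0.113.5': 2}
```

Run over a real log, this shows how often clients actually repeat, independent of anything nscd reports.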

Edit (2013/12/09): nscd -g hosts stats from dev.gentoo.org (no blocks in comments):

nscd configuration:
 4h  8m 43s  server runtime
hosts cache:
        yes  cache is enabled
         no  cache is persistent
         no  cache is shared
        422  suggested size
    1108744  total data pool size
     966632  used data pool size
        600  seconds time to live for positive entries
         20  seconds time to live for negative entries
      67878  cache hits on positive entries
       2479  cache hits on negative entries
       9464  cache misses on positive entries
       4276  cache misses on negative entries
         83% cache hit rate
       6951  current number of cached values
       7641  maximum number of cached values
         33  maximum chain length searched
          1  number of delays on rdlock
          0  number of delays on wrlock
          0  memory allocations failed
        yes  check /etc/hosts for changes
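The hit rate in these statistics can be reproduced from the four hit/miss counters. The sketch below assumes the percentage is total hits divided by total lookups, rounded down; that assumption matches the 83% shown above, but it is not taken from nscd's source. The function names are illustrative.

```python
import re

# Matches lines such as "   67878  cache hits on positive entries".
_COUNTER = re.compile(
    r"^\s*(\d+)\s+cache (hits|misses) on (positive|negative) entries\s*$"
)

def parse_nscd_counters(text):
    """Extract the hit/miss counters from one cache section of `nscd -g` output."""
    counters = {}
    for line in text.splitlines():
        m = _COUNTER.match(line)
        if m:
            counters[(m.group(2), m.group(3))] = int(m.group(1))
    return counters

def hit_rate_percent(counters):
    """Hits as an integer percentage of all lookups (assumed to round down)."""
    hits = counters[("hits", "positive")] + counters[("hits", "negative")]
    misses = counters[("misses", "positive")] + counters[("misses", "negative")]
    total = hits + misses
    return 0 if total == 0 else hits * 100 // total

SAMPLE = """\
       67878  cache hits on positive entries
        2479  cache hits on negative entries
        9464  cache misses on positive entries
        4276  cache misses on negative entries
"""

print(hit_rate_percent(parse_nscd_counters(SAMPLE)))  # 83
```

Note that negative hits and misses count toward the rate too, so a large number of negative misses drags the percentage down even when positive entries are cached well.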

Solution 2

This parameter:
yes cache is shared

allows applications to search nscd's cache directly through a shared memory mapping, and hits served that way are not counted in the statistics. This is the expected and most efficient behavior.

Set it to no and you will see the reported hit rate rise dramatically, but lookups will be somewhat slower.

See: http://alpacapowered.wordpress.com/2013/03/08/nscd-dns-caching-and-postfix/comment-page-1/#comment-1374

Solution 3

It may be a bit off-topic but instead of using nscd you can switch to sssd (which I consider its successor).

I'm using it on SUSE Linux Enterprise Server 11.3 (fully supported) and I'm glad that I made the switch. It has many more, and finer-grained, configuration options than nscd, and its capabilities go far beyond what nscd can achieve.

At least I guess it is worth a look: https://fedorahosted.org/sssd/

Solution 4

nscd is respecting the upstream TTL values.

If the DNS server for google.com says the TTL for the A record of google.com is 10 seconds, and you have a positive-time-to-live of 36000, the record will still expire in 10 seconds.
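A toy simulation shows why a short upstream TTL defeats a large positive-time-to-live. It assumes a simplified model (not nscd's actual implementation) in which an entry expires a fixed number of seconds after it is fetched; the function name and lookup times are invented for illustration.

```python
def simulate_hits(lookup_times, ttl):
    """Count cache hits for one name looked up at the given times (in seconds).

    Simplified model: a miss fetches the record and caches it for `ttl` seconds.
    """
    hits = 0
    expires_at = None
    for t in sorted(lookup_times):
        if expires_at is not None and t < expires_at:
            hits += 1              # entry still valid: served from cache
        else:
            expires_at = t + ttl   # miss: fetch again and restart the clock
    return hits

# The same client name looked up every 30 seconds.
lookups = [0, 30, 60, 90, 120]

print(simulate_hits(lookups, ttl=10))     # 0
print(simulate_hits(lookups, ttl=36000))  # 4
```

With a 10-second effective TTL every lookup is a miss, whatever value is configured in nscd.conf.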


Author: Bratchley

Updated on September 18, 2022

Comments

  • Bratchley
    Bratchley almost 2 years

    My Goal: Let nscd maintain a fairly large DNS cache in excess memory since I have it available.

    Description:

    I have a webserver that has a broadly dispersed but high-repeat user base. It has plenty of memory, so I thought I'd improve response time by caching lookups, but according to nscd -g I'm only at a 6% cache hit rate (meaning nscd is most likely adding more latency, by saving to the cache or searching it for an entry it will never find, than it saves by avoiding trips to the network):

    hosts cache:
    
                yes  cache is enabled
                yes  cache is persistent
                yes  cache is shared
                211  suggested size
             216064  total data pool size
               2328  used data pool size
              36000  seconds time to live for positive entries
                 20  seconds time to live for negative entries
               4455  cache hits on positive entries
                  0  cache hits on negative entries
              17357  cache misses on positive entries
              42348  cache misses on negative entries
                  6% cache hit rate
                 17  current number of cached values
                 40  maximum number of cached values
                  3  maximum chain length searched
                  0  number of delays on rdlock
                  0  number of delays on wrlock
                  0  memory allocations failed
                yes  check /etc/hosts for changes
    

    Probably a large contributor to the 6% hit rate is the fact that apparently it's only cached 17 entries. Doing a strings /var/db/nscd/hosts shows that the host cache entries it has created are mostly for machines on our internal network. It's good to have these cached since the daily re-publish of the website is likely sped up but my goal is to speed up end user experience without making any real configuration changes.

    This is the relevant segment of nscd.conf:

        threads                 10
        server-user             nscd
        debug-level             0
        paranoia                no
        [.....snip......]
        enable-cache            hosts           yes
        positive-time-to-live   hosts           36000
        negative-time-to-live   hosts           20
        suggested-size          hosts           10657
        check-files             hosts           yes
        persistent              hosts           yes
        shared                  hosts           yes
        max-db-size             hosts           33554432
    

    Basically, I need help understanding how my host cache can be so small even though I've set the positive TTL's on the host cache to be incredibly high. I'm sure it's the small number of actual cached entries that is causing the hit rate to be so low.

    Since the hit rate is 6% but my positive TTL is fairly large, I'm assuming my current workload is performing DNS host lookups that just aren't being saved. I have no idea why these aren't being saved, nor what to check next. I had expected a fairly large DNS cache by now.

    Even if the hit rate stayed small (i.e., clients weren't repeating as often as I thought), I'd still expect those DNS lookups to be cached, but judging by the "current number of cached values", that doesn't appear to be happening either.

    • Nils
      Nils over 10 years
      Is there any reason why you want to lookup clients in DNS?
  • Bratchley
    Bratchley over 10 years
    Is there a way to override this behavior?
  • Totor
    Totor over 10 years
    Here is an ugly way: you could filter the incoming DNS answer packet with iptables, send it to a userspace program (NFQUEUE target) which will then counterfeit it to change the TTL.
  • phemmer
    phemmer over 10 years
    I would not recommend this even if it were possible. One scenario: When servers are brought down for maintenance, they are removed from DNS. The admins will then wait for the DNS records to expire before shutting the server itself down. By overriding the TTL you'll be sending traffic to a server that could be shut down.
  • Bratchley
    Bratchley over 10 years
    The DNS lookups are being done so that upper management can see DNS names in their reports (generated from the log files). I suppose I could set up BIND to do this, but my question is how to manage nscd since that's more generally useful especially for stuff like user ID's and groups.
  • Bratchley
    Bratchley over 10 years
    Also, I realize the maximum number of cached values is low, but I still think that a TTL of 10 hours, even with a hit rate of 6%, should result in a cache larger than 17 host values. If there's a way to get nscd to hold onto the records longer, that's preferable and would minimize the impact of the reverse DNS.
  • robbat2
    robbat2 over 10 years
    @JoelDavis just using BIND/unbound isn't going to increase the hit rate directly. I do see a related problem for you if you're doing the rDNS lookup later from logfiles: there is no guarantee that the rDNS points to the same entry now that it did when the event happened.
  • Bratchley
    Bratchley over 10 years
    This is on RHEL5 but when we upgrade the server (probably to RHEL7) I can look into it. It's interesting for a lot of reasons. Can SSSD cache UID's in general or does it just do authentication and DNS caching?
  • Bratchley
    Bratchley over 10 years
    Marking this as the answer since I think this is as close to an answer as the question makes possible.
  • Bratchley
    Bratchley over 10 years
    Checked in #sssd on FreeNode and one of the devs said that they still recommend using something else for DNS caching and mentioned the unbound and nscd options. They did say, though, that it is designed to cache user and group information.
  • robbat2
    robbat2 over 10 years
    There is also unscd out there, but it runs into the same root problem.