DNS SERVFAIL and Incorrect Flag only via TCP: Broken DNS Servers?

Yes, it's poor configuration and/or implementation - there's no reason for an authoritative server to return root referrals in an otherwise valid response.

Furthermore, I'm seeing other errors that simply shouldn't happen from those two Worldnic servers:

  • Sometimes it gives the right answer, but with a SERVFAIL error code and without the AA bit set.

  • UDP replies are always truncated at 512 bytes, even with EDNS0 (RFC 2671) specified. This means that DNSSEC won't work with this name server, since signed responses routinely exceed 512 bytes.

  • The problem isn't just the ADDITIONAL section; the root name servers are also being put in the AUTHORITY section of an authoritative (AA bit set) answer.
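
To make the EDNS0 point concrete, here is a minimal sketch of how a client advertises a larger UDP payload size: it appends an OPT pseudo-record (type 41) to the query, and the record's CLASS field carries the size. The function names, the 4096-byte default, and the reuse of the transaction ID are illustrative assumptions, not taken from dig's implementation. A server that honors EDNS0 may then reply over UDP with more than 512 bytes, which is what these servers appear not to do.

```python
import struct


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, qtype=1, txid=0x27EF, udp_size=4096, dnssec_ok=False):
    """Build a non-recursive DNS query carrying an EDNS0 OPT record.

    udp_size and dnssec_ok are illustrative parameters (assumptions).
    """
    # Header: ID, flags (all zero, so RD=0), QDCOUNT=1, ANCOUNT=0,
    # NSCOUNT=0, ARCOUNT=1 (the OPT record lives in the additional section).
    header = struct.pack("!HHHHHH", txid, 0x0000, 1, 0, 0, 1)
    # Question: name, QTYPE, QCLASS=IN (1).
    question = encode_name(name) + struct.pack("!HH", qtype, 1)
    # OPT pseudo-record: root name, TYPE=41, CLASS=advertised UDP size,
    # TTL=extended rcode/version/flags (DO bit is 0x8000), RDLENGTH=0.
    ttl = 0x00008000 if dnssec_ok else 0
    opt = b"\x00" + struct.pack("!HHIH", 41, udp_size, ttl, 0)
    return header + question + opt


if __name__ == "__main__":
    query = build_query("www.example.org", dnssec_ok=True)
    print(len(query), query.hex())
```

Sending these bytes over UDP to the name server and comparing the reply length against 512 would be one way to reproduce the truncation behaviour described above.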

Author: Rob Olmos

I enjoy programming, security, and stuff.

Updated on September 17, 2022

Comments

  • Rob Olmos
    Rob Olmos almost 2 years

    Is it poor configuration to return the root name servers in the additional section for a CNAME lookup that points to another domain? Specifically, I'm seeing this with a CNAME hosted by Network Solutions that points to a different domain and TLD.

    I ask because all these extra records push the response past the UDP size limit, forcing the query to be retried over TCP.

    dig www.unitedstatesartists.org +trace

    A name server response:

    example.org. 86400  IN      NS      ns15.worldnic.com.
    example.org. 86400  IN      NS      ns16.worldnic.com.
    ;; Received 95 bytes from 199.249.120.1#53(b2.org.afilias-nst.org) in 79 ms
    
    ;; Warning: Message parser reports malformed message packet.
    ;; Truncated, retrying in TCP mode.
    www.example.org. 7200 IN    CNAME   load-01-123.us-west-1.elb.amazonaws.com.
    .  518400  IN      NS      a.root-servers.net.
    .  518400  IN      NS      b.root-servers.net.
    .  518400  IN      NS      c.root-servers.net.
    .  518400  IN      NS      d.root-servers.net.
    .  518400  IN      NS      e.root-servers.net.
    .  518400  IN      NS      f.root-servers.net.
    .  518400  IN      NS      g.root-servers.net.
    .  518400  IN      NS      h.root-servers.net.
    .  518400  IN      NS      i.root-servers.net.
    .  518400  IN      NS      j.root-servers.net.
    .  518400  IN      NS      k.root-servers.net.
    .  518400  IN      NS      l.root-servers.net.
    .  518400  IN      NS      m.root-servers.net.
    ;; Received 526 bytes from 205.178.190.8#53(ns15.worldnic.com) in 173 ms
    

    Whether the additional records are returned appears to be random. Sometimes, even when they are not returned, the response is still truncated and dig retries in TCP mode.

    example.org. 86400  IN      NS      ns15.worldnic.com.
    example.org. 86400  IN      NS      ns16.worldnic.com.
    ;; Received 95 bytes from 199.19.56.1#53(a0.org.afilias-nst.info) in 82 ms
    
    ;; Warning: Message parser reports malformed message packet.
    ;; Truncated, retrying in TCP mode.
    www.example.org. 7200 IN    CNAME   load-01-123.us-west-1.elb.amazonaws.com.
    ;; Received 107 bytes from 205.178.190.8#53(ns15.worldnic.com) in 164 ms
    

    Update 2010-12-08

    With more testing I found:

    • Network Solutions responds with SERVFAIL (server failure) to a recursive query (dig's default unless +trace is used), yet still includes the correct answer.
    • Setting dig's +norecurse usually works, but not always: sometimes a SERVFAIL is still returned. Details on a possible cause follow below.
    • Network Solutions' inclusion of the root servers in the authority and additional sections causes the UDP truncation, so TCP is required to complete the query.

    Overview of the following capture:

    • Non-recursive request for the record from ns15
    • ns15 answer includes root servers in auth and additional and marks reply as truncated
    • Non-recursive request is retried in TCP due to truncated UDP
    • Similar answer from ns15 using TCP except "recursion desired" is incorrectly set and "server failure" code is also set
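
The UDP-then-TCP fallback in this sequence can be sketched as follows. This is a minimal illustration of the client-side decision, not dig's actual code: the client reads the TC bit from the 12-byte header of the UDP response and, if it is set, resends the same query over TCP with a two-byte length prefix (the "Length" field shown in the TCP packets). The function names are hypothetical.

```python
import struct

TC_BIT = 0x0200  # "Truncated" flag in the 16-bit DNS header flags word


def is_truncated(response: bytes) -> bool:
    """Return True if the DNS response header has the TC bit set."""
    if len(response) < 12:
        raise ValueError("DNS message shorter than the 12-byte header")
    # The flags word is bytes 2-3, right after the transaction ID.
    (flags,) = struct.unpack("!H", response[2:4])
    return bool(flags & TC_BIT)


def frame_for_tcp(message: bytes) -> bytes:
    """Prefix a DNS message with its 2-byte big-endian length for TCP."""
    return struct.pack("!H", len(message)) + message


def choose_transport(udp_response: bytes) -> str:
    """Decide whether the query must be retried over TCP."""
    return "tcp" if is_truncated(udp_response) else "udp"


if __name__ == "__main__":
    # Header only, using the flags word from the first (UDP) answer.
    udp_answer = struct.pack("!HHHHHH", 0x27EF, 0x8600, 1, 1, 0, 0)
    print(choose_transport(udp_answer))
```

With the flags word 0x8600 from the first answer, the TC bit is set, so the retry over TCP is the expected client behaviour; the problem is what the server sends back on that second attempt.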

    We've already created a ticket with them, but we'll see if it goes anywhere. The DNS packets from tshark for the exchange outlined above follow:

    First question (via UDP):

    Domain Name System (query)
        Transaction ID: 0x27ef
        Flags: 0x0000 (Standard query)
            0... .... .... .... = Response: Message is a query
            .000 0... .... .... = Opcode: Standard query (0)
            .... ..0. .... .... = Truncated: Message is not truncated
            .... ...0 .... .... = Recursion desired: Don't do query recursively
            .... .... .0.. .... = Z: reserved (0)
            .... .... ...0 .... = Non-authenticated data OK: Non-authenticated data is unacceptable
    

    First answer (via UDP):

    Domain Name System (response)
        [Request In: 1]
        [Time: 0.078623000 seconds]
        Transaction ID: 0x27ef
        Flags: 0x8600 (Standard query response, No error)
            1... .... .... .... = Response: Message is a response
            .000 0... .... .... = Opcode: Standard query (0)
            .... .1.. .... .... = Authoritative: Server is an authority for domain
            .... ..1. .... .... = Truncated: Message is truncated
            .... ...0 .... .... = Recursion desired: Don't do query recursively
            .... .... 0... .... = Recursion available: Server can't do recursive queries
            .... .... .0.. .... = Z: reserved (0)
            .... .... ..0. .... = Answer authenticated: Answer/authority portion was not authenticated by the server
            .... .... .... 0000 = Reply code: No error (0)
    

    Second question (via TCP):

    Domain Name System (query)
        Length: 56
        Transaction ID: 0xbc37
        Flags: 0x0000 (Standard query)
            0... .... .... .... = Response: Message is a query
            .000 0... .... .... = Opcode: Standard query (0)
            .... ..0. .... .... = Truncated: Message is not truncated
            .... ...0 .... .... = Recursion desired: Don't do query recursively
            .... .... .0.. .... = Z: reserved (0)
            .... .... ...0 .... = Non-authenticated data OK: Non-authenticated data is unacceptable
    

    Second answer (via TCP, notice "recursion desired"):

    Domain Name System (response)
        [Request In: 6]
        [Time: 0.147357000 seconds]
        Length: 107
        Transaction ID: 0xbc37
        Flags: 0x8102 (Standard query response, Server failure)
            1... .... .... .... = Response: Message is a response
            .000 0... .... .... = Opcode: Standard query (0)
            .... .0.. .... .... = Authoritative: Server is not an authority for domain
            .... ..0. .... .... = Truncated: Message is not truncated
            .... ...1 .... .... = Recursion desired: Do query recursively
            .... .... 0... .... = Recursion available: Server can't do recursive queries
            .... .... .0.. .... = Z: reserved (0)
            .... .... ..0. .... = Answer authenticated: Answer/authority portion was not authenticated by the server
            .... .... .... 0010 = Reply code: Server failure (2)
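
As a cross-check on the captures above, the two response flag words (0x8600 and 0x8102) can be decoded by hand. The sketch below is a small helper written for illustration; the function and dictionary names are assumptions. It uses the header bit layout from RFC 1035 (plus the AD and CD bits) and also compares the RD bit between query and response, since a response is expected to copy RD from the query.

```python
RCODES = {
    0: "NOERROR",
    1: "FORMERR",
    2: "SERVFAIL",
    3: "NXDOMAIN",
    4: "NOTIMP",
    5: "REFUSED",
}


def decode_flags(flags: int) -> dict:
    """Split the 16-bit DNS header flags word into its named fields."""
    return {
        "qr": bool(flags & 0x8000),      # 1 = response
        "opcode": (flags >> 11) & 0xF,   # 0 = standard query
        "aa": bool(flags & 0x0400),      # authoritative answer
        "tc": bool(flags & 0x0200),      # truncated
        "rd": bool(flags & 0x0100),      # recursion desired
        "ra": bool(flags & 0x0080),      # recursion available
        "ad": bool(flags & 0x0020),      # authenticated data
        "cd": bool(flags & 0x0010),      # checking disabled
        "rcode": RCODES.get(flags & 0xF, str(flags & 0xF)),
    }


def rd_mismatch(query_flags: int, response_flags: int) -> bool:
    """True if the response did not copy the RD bit from the query."""
    return (query_flags & 0x0100) != (response_flags & 0x0100)


if __name__ == "__main__":
    for label, value in (("UDP answer", 0x8600), ("TCP answer", 0x8102)):
        print(label, hex(value), decode_flags(value))
    print("RD mismatch on TCP:", rd_mismatch(0x0000, 0x8102))
```

Decoded this way, the UDP answer is authoritative and truncated with no error, while the TCP answer to an identical non-recursive query drops AA, sets RD, and returns SERVFAIL.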
    
    • Alnitak
      Alnitak over 13 years
      please show the lookup details. It's unusual, but not necessarily poor configuration.
    • BestPractices
      BestPractices over 11 years
      I realize you posted this question about 2 years ago, but did you ever get a resolution to your problem? I'm having the same issue with Network Solutions.
    • Rob Olmos
      Rob Olmos over 11 years
      @BestPractices Sorry for the late reply, and unfortunately no, there was no resolution. We decided instead to run our own load balancer that doesn't need a CNAME or apex redirect, since we couldn't move the DNS to Route 53.
    • Efren
      Efren over 6 years
      Please mark the answer if it answers the question.
  • Rob Olmos
    Rob Olmos over 13 years
    Thanks. Let me investigate this further and try bringing these issues to their attention (I doubt they'll care).
  • Rob Olmos
    Rob Olmos over 13 years
    I did some more testing and detailed my findings in the updated question.