Distributed Transaction failure with Windows Server 2008

After getting Microsoft involved and analyzing many network traces, we finally tracked down the issue. The application server was part of a Network Load Balancing cluster, and there is a flaw in how the IPv6 implementation on Windows Server 2008 R2 interacts with the Network Load Balancing component.

Since the servers had publicly routable IPv4 addresses, the IPv6 stack automatically created a "6to4" address. This is a special IPv6 address that corresponds to the machine's publicly routable IPv4 address. It did this for both the machine's own address and the shared cluster address. The flaw is that it then registered both 6to4 addresses in DNS under its own name. This is different from how the IPv4 stack works on the same machine: with IPv4, the cluster IP address is not registered in DNS.
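To make the mapping concrete, here is a minimal Python sketch of how a 6to4 address is derived from an IPv4 address: the 2002::/16 prefix is followed by the 32 bits of the IPv4 address, giving a /48 prefix. The host form 2002:WWXX:YYZZ::WWXX:YYZZ is my understanding of what Windows assigns, so treat it as an assumption. The function names and the 192.0.2.x addresses are hypothetical and used for illustration only.

```python
import ipaddress

SIX_TO_FOUR = ipaddress.IPv6Network("2002::/16")


def six_to_four_prefix(ipv4: str) -> ipaddress.IPv6Network:
    """Return the 2002:WWXX:YYZZ::/48 prefix derived from an IPv4 address."""
    v4 = int(ipaddress.IPv4Address(ipv4))
    # 16 bits of 0x2002, then the 32-bit IPv4 address, then 80 zero bits.
    return ipaddress.IPv6Network(((0x2002 << 112) | (v4 << 80), 48))


def windows_style_host_address(ipv4: str) -> ipaddress.IPv6Address:
    """Assumed host form: the IPv4 address repeated in the low 32 bits."""
    v4 = int(ipaddress.IPv4Address(ipv4))
    prefix = int(six_to_four_prefix(ipv4).network_address)
    return ipaddress.IPv6Address(prefix | v4)


# Hypothetical machine and cluster addresses (documentation range).
for label, v4 in [("machine", "192.0.2.2"), ("cluster", "192.0.2.100")]:
    addr = windows_style_host_address(v4)
    # IPv6Address.sixtofour recovers the embedded IPv4 address.
    print(label, v4, "->", addr, "embeds", addr.sixtofour)
```

Both derived addresses fall inside 2002::/16, which is why the machine address and the cluster address each end up with a distinct 6to4 counterpart.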

The result is that when the application server connected to the new database server and the database server tried to do the reverse bind to the application server, it would see that there were IPv6 addresses for the application server and attempt to connect using one of those addresses. But because it used the 6to4 address that corresponded to the cluster IP address, another server in the cluster would receive the connection and, since DTC on that server was not expecting a reverse bind, it failed.
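One way to check for this condition is to resolve the application server's name and look for addresses in the 6to4 range (2002::/16). The following Python sketch is not part of the original diagnosis, which relied on network traces; the helper names are hypothetical.

```python
import ipaddress
import socket

SIX_TO_FOUR = ipaddress.IPv6Network("2002::/16")


def six_to_four_addresses(addresses):
    """Return the entries of `addresses` (strings) that are 6to4 addresses."""
    found = []
    for text in addresses:
        # Drop any zone index such as "%12" before parsing.
        addr = ipaddress.ip_address(text.split("%")[0])
        if addr.version == 6 and addr in SIX_TO_FOUR:
            found.append(addr)
    return found


def resolve_all(hostname):
    """Resolve a host name to every address the local resolver returns."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})
```

Calling `six_to_four_addresses(resolve_all("web-02.mydomain.com"))` from the database server would list any 6to4 addresses registered under the application server's name. If more than one comes back on a cluster member, one of them may correspond to the shared cluster address.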

The existing database server, being Windows Server 2003 R2, did not use IPv6 and so did not run into the problem.

The solution was to disable automatic 6to4 address generation. You can do this through group policy or by using the following command line:

netsh interface 6to4 set state disabled

To set it back, you would run the following command:

netsh interface 6to4 set state default

To see the current settings, run the following command. On Windows Server 2008 R2/Windows 7 and later, it will also indicate whether the current setting is due to Group Policy.

netsh interface 6to4 show state

Scott
Author by

Scott

Updated on September 18, 2022

Comments

  • Scott
    Scott almost 2 years

    I'm having a problem with a new server setup. Distributed Transactions started by the application server to the new server are failing, but they work fine with an existing database server. I need help determining the cause of the problem.

    For various reasons, the new server is not running the latest versions of either Windows or SQL Server.

    Setup

    APPLICATION SERVER

    • OS: Windows Server 2008 R2
    • NetBIOS Name: WEB-02
    • Configured to talk to multiple database servers, some local, some remote.
    • DCOM ports restricted to a range of 5000-5020 for talking through firewall to remote servers.
    • Windows firewall enabled
    • DTC Properties
      • Network DTC Access checked
      • Allow Remote Clients, Allow Remote Administration unchecked
      • Transaction manager Communication
        • Allow Inbound, Allow Outbound checked
        • No Authentication Required
      • Enable XA Transactions unchecked
      • Enable SNA LU 6.2 Transactions checked

    NEW DATABASE SERVER

    • OS: Windows Server 2008
    • NetBIOS Name: DB-06
    • SQL Server 2005
    • No restrictions on DCOM ports
    • Windows firewall disabled
    • DTC Properties
      • Network DTC Access checked
      • Allow Remote Clients unchecked
      • Allow Remote Administration checked
      • Transaction manager Communication
        • Allow Inbound, Allow Outbound checked
        • No Authentication Required
      • Enable XA Transactions unchecked
      • "Enable SNA LU 6.2 Transactions" does not exist

    EXISTING DATABASE SERVER

    • OS: Windows Server 2003 R2
    • NetBIOS Name: DB-04
    • SQL Server 2005
    • No restrictions on DCOM ports
    • Windows firewall disabled
    • DTC Properties
      • Network DTC Access checked
      • Allow Remote Clients unchecked
      • Allow Remote Administration checked
      • Transaction manager Communication
        • Allow Inbound, Allow Outbound checked
        • No Authentication Required
      • Enable XA Transactions unchecked
      • "Enable SNA LU 6.2 Transactions" does not exist

    All three servers are part of the same domain and are on the same subnet. Only an Ethernet switch sits between them: no router, hardware firewall, or security device.

    Problem

    An ASP.NET application runs on the application server and works correctly when performing a transaction against the existing database server (DB-04). When performing the same steps against the new database server (DB-06), it fails and reports the error message: Communication with the underlying transaction manager has failed.

    Troubleshooting Steps

    We've seen this error before with this application, and it normally means that the Distributed Transaction Coordinator is not configured correctly or a firewall is interfering. In the past, I have used DTCPing to troubleshoot and correct any errors.

    This time, however, although DTCPing is failing, I am not able to determine the cause of the problem, as both the existing and new database servers appear to be configured the same, except for OS version.

    The following is from the DTCPing log file when running a test from the application server (WEB-02) to the new database server (DB-06). Note that I have changed the IP addresses and DNS names.

    From log file on application server

    10-14, 16:08:11.346-->Error(0x424) at clutil.cpp @256
    10-14, 16:08:11.346-->-->OpenCluster
    10-14, 16:08:11.346-->-->1060(The specified service does not exist as an installed service.)
    ++++++++++++++++++++++++++++++++++++++++++++++
         DTCping 1.9 Report for WEB-02  
    ++++++++++++++++++++++++++++++++++++++++++++++
    Firewall Port Settings:
        Port:5000-5020
    RPC server is ready
    ++++++++++++Validating Remote Computer Name++++++++++++
    10-14, 16:08:22.796-->Start DTC connection test
    Name Resolution:
        DB-06-->1.1.1.6-->s6.mydomain.com
    10-14, 16:08:22.812-->Start RPC test (WEB-02-->DB-06)
    RPC test failed
    

    From log file on new database server

    10-14, 16:07:46.128-->Error(0x424) at clutil.cpp @256
    10-14, 16:07:46.128-->-->OpenCluster
    10-14, 16:07:46.129-->-->1060(The specified service does not exist as an installed service.)
    ++++++++++++++++++++++++++++++++++++++++++++++
         DTCping 1.9 Report for DB-06  
    ++++++++++++++++++++++++++++++++++++++++++++++
    RPC server is ready
    10-14, 16:08:22.785-->RPC server:DB-06 received following information:
        Network Name: DB-06
        Source  Port: 56535
        Partner LOG: WEB-022872.log
        Partner CID: 1ACD8780-9446-4E94-869D-6F1BDF787BBF
    

    After clicking PING on the database server, the following is added to the log file. In the output window, there is a pause between invoking the RPC method and it failing, so it fails after a timeout.

    ++++++++++++Validating Remote Computer Name++++++++++++
    10-14, 16:13:18.924-->Start DTC connection test
    Name Resolution:
        Web-02-->1.1.1.2-->web-02.mydomain.com
    10-14, 16:13:18.933-->Start RPC test (DB-06-->Web-02)
    Problem:fail to invoke remote RPC method
    Error(0x6D9) at dtcping.cpp @303
    -->RPC pinging exception
    -->1753(There are no more endpoints available from the endpoint mapper.)
    RPC test failed
    

    Despite the message (see the section "ERROR MESSAGE 4 - There are no more endpoints from the endpoint mapper" in Troubleshooting MSDTC issues with the DTCPing tool), there are in fact more endpoints available to the mapper. I have run netstat -an on the application server (the one with restricted ports) and it is using only 10 of the 20 ports available.
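The port check can be scripted rather than counted by eye. This is a minimal Python sketch that assumes the usual Windows `netstat -an` column layout (protocol, local address, foreign address, state); the function name and the sample text are made up for illustration and are not output from the servers described here.

```python
def ports_in_use(netstat_output, low=5000, high=5020):
    """Return the local TCP ports within [low, high] found in netstat text."""
    used = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0].upper() != "TCP":
            continue
        # Local address is the second column, e.g. 0.0.0.0:5000 or [::]:5000.
        port_text = fields[1].rsplit(":", 1)[-1]
        if port_text.isdigit() and low <= int(port_text) <= high:
            used.add(int(port_text))
    return used


# Made-up sample in the assumed netstat -an layout.
SAMPLE = """\
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING
  TCP    0.0.0.0:5000           0.0.0.0:0              LISTENING
  TCP    [::]:5000              [::]:0                 LISTENING
  TCP    192.0.2.2:5003         192.0.2.4:1433         ESTABLISHED
  UDP    0.0.0.0:5005           *:*
"""

print(sorted(ports_in_use(SAMPLE)))
```

The IPv4 and IPv6 listeners on the same port are counted once, since what matters is how many ports of the restricted DCOM range are taken.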

    • LCJ
      LCJ over 9 years
      Were you able to find out the root cause?
    • Scott
      Scott over 9 years
      @Lijo Yes, we did find the root cause, though it was a bit esoteric. I've posted it as the answer to the question.