What might prevent IKE handshake success in building an IPSEC tunnel?

With a little assistance from Cisco I did some deeper analysis of what was happening, and figured out the things that I needed to be checking for. The useful things that Cisco told me:

  • debug crypto isakmp 5 gives enough detail to see whether problems are occurring with ISAKMP traffic
  • clear crypto isakmp sa clears out any stale security associations.
  • clear crypto isakmp {client_ip_address} can be used on the HQ ASA to clear out a specific security association (you don't necessarily want to clear all your security associations if it is only one device that is having trouble!)
  • packet captures at both ends are really useful to figure out what is going on

Reading up a little on the IPSEC suite, and on ISAKMP more specifically, showed that the following need to be allowed through any firewalls in the path:

  • ISAKMP traffic on UDP port 500
  • ISAKMP traffic using NAT Traversal (NAT-T) on UDP port 4500
  • ESP traffic (IP Protocol 50)
  • AH traffic (IP Protocol 51)

It seems a lot of people out there don't realise the important difference between IP protocols and TCP/UDP ports.
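
To make that distinction concrete, here is a minimal Python sketch (not part of the original answer) that labels a raw IPv4 packet by the traffic types listed above. ESP and AH are identified purely by the protocol field in the IP header, while ISAKMP is only identifiable by first seeing IP protocol 17 (UDP) and then reading the ports from the UDP header. The function names and the `build_ipv4` helper are hypothetical, the helper leaves the header checksum at zero, and non-first fragments (which carry no UDP header) are not handled.

```python
import struct

# IP protocol numbers and UDP ports for the traffic types listed above.
IP_PROTO_UDP = 17
IP_PROTO_ESP = 50
IP_PROTO_AH = 51
UDP_PORT_ISAKMP = 500
UDP_PORT_NAT_T = 4500


def classify_ipsec_packet(packet: bytes) -> str:
    """Label a raw IPv4 packet as esp, ah, isakmp, isakmp-nat-t or other."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return "other"
    header_len = (packet[0] & 0x0F) * 4  # IHL is counted in 32-bit words
    protocol = packet[9]                 # IP protocol field: no ports involved
    if protocol == IP_PROTO_ESP:
        return "esp"
    if protocol == IP_PROTO_AH:
        return "ah"
    if protocol == IP_PROTO_UDP and len(packet) >= header_len + 4:
        # Ports only exist inside the UDP header, which follows the IP header.
        src_port, dst_port = struct.unpack_from("!HH", packet, header_len)
        ports = {src_port, dst_port}
        if UDP_PORT_ISAKMP in ports:
            return "isakmp"
        if UDP_PORT_NAT_T in ports:
            return "isakmp-nat-t"
    return "other"


def build_ipv4(protocol: int, payload: bytes = b"") -> bytes:
    """Hypothetical helper: minimal 20-byte IPv4 header plus payload."""
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + len(payload),  # version/IHL, TOS, total length
        0, 0,                        # identification, flags/fragment offset
        64, protocol, 0,             # TTL, protocol, checksum (left as 0)
        bytes([192, 0, 2, 1]), bytes([198, 51, 100, 1]),
    )
    return header + payload


if __name__ == "__main__":
    udp_500 = struct.pack("!HHHH", 500, 500, 8, 0)
    print(classify_ipsec_packet(build_ipv4(IP_PROTO_ESP)))
    print(classify_ipsec_packet(build_ipv4(IP_PROTO_UDP, udp_500)))
```

A firewall rule that only says "allow port 50" would never match the first case, because ESP packets have no port at all.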

The following packet captures focussed on the above types of traffic. These were set up on both the remote and HQ ASAs:

object service isakmp-nat-t 
    service udp destination eq 4500 
    description 4500
object-group service ISAKMP-Services
    description Traffic required for ISAKMP
    service-object esp 
    service-object ah 
    service-object object isakmp-nat-t 
    service-object udp destination eq isakmp
access-list ISAKMP extended permit object-group ISAKMP-Services host {hq_ip_address} host {remote_ip_address}
access-list ISAKMP extended permit object-group ISAKMP-Services host {remote_ip_address} host {hq_ip_address}
capture ISAKMP access-list ISAKMP interface outside

You can then download the captures from each device at https://{device_ip_address}/capture/ISAKMP/pcap and analyse them in Wireshark.

My packet captures showed that the ISAKMP traffic outlined above was getting fragmented. Since those packets are encrypted, once they are fragmented it is hard to put them back together, and things break.
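
For anyone wanting to check for fragmentation outside Wireshark, the following is a rough Python sketch (an illustration, not what was used in the original analysis) of how fragmentation shows up in the IPv4 header: a packet is a fragment if the More Fragments flag is set or the fragment offset is non-zero. The function name `fragment_info` is hypothetical, and the input is assumed to be a raw IPv4 packet starting at the IP header.

```python
import struct

# Bit layout of the 16-bit flags/fragment-offset field in the IPv4 header.
DONT_FRAGMENT = 0x4000   # DF flag
MORE_FRAGMENTS = 0x2000  # MF flag
OFFSET_MASK = 0x1FFF     # fragment offset, in units of 8 bytes


def fragment_info(packet: bytes) -> dict:
    """Report the fragmentation-related fields of a raw IPv4 packet."""
    if len(packet) < 20:
        raise ValueError("too short to contain an IPv4 header")
    # The flags and fragment offset share bytes 6-7 of the header.
    flags_and_offset = struct.unpack_from("!H", packet, 6)[0]
    offset_bytes = (flags_and_offset & OFFSET_MASK) * 8
    more_fragments = bool(flags_and_offset & MORE_FRAGMENTS)
    return {
        "dont_fragment": bool(flags_and_offset & DONT_FRAGMENT),
        "more_fragments": more_fragments,
        "offset_bytes": offset_bytes,
        # First and middle fragments have MF set; the last has offset > 0.
        "is_fragment": more_fragments or offset_bytes > 0,
    }
```

In Wireshark itself, the display filter `ip.flags.mf == 1 or ip.frag_offset > 0` selects the same packets.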

Giving this information to the ISP meant they could do their own focussed checking, and resulted in them making some changes to a firewall. Turns out the ISP was blocking all ICMP traffic on their edge router, which meant that Path MTU Discovery was broken, resulting in fragmented ISAKMP packets. Once they stopped blanket blocking ICMP the VPN came up (and I expect all their customers started getting better service in general).
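
The reason blocked ICMP matters can be sketched in a few lines of Python. This is a simplified model of standard IPv4 forwarding and Path MTU Discovery, not a description of the ISP's actual equipment, and the function names are hypothetical. A router that receives a packet larger than the next hop's MTU either fragments it or, if the Don't Fragment bit is set, drops it and sends back ICMP type 3 code 4 ("fragmentation needed"). If that ICMP message is filtered on the way back, the sender never learns that it should send smaller packets.

```python
ICMP_DEST_UNREACHABLE = 3  # ICMP type
ICMP_FRAG_NEEDED = 4       # ICMP code: fragmentation needed and DF set


def router_decision(packet_len: int, dont_fragment: bool, next_hop_mtu: int) -> str:
    """Simplified model of what a router does with an IPv4 packet."""
    if packet_len <= next_hop_mtu:
        return "forward"
    if dont_fragment:
        # Packet is dropped; the router reports the problem via ICMP 3/4.
        return "drop-and-send-icmp"
    return "fragment"


def sender_learns_path_mtu(decision: str, icmp_blocked: bool) -> bool:
    """Path MTU Discovery only works if the ICMP 3/4 message gets back."""
    return decision == "drop-and-send-icmp" and not icmp_blocked


if __name__ == "__main__":
    decision = router_decision(packet_len=1500, dont_fragment=True, next_hop_mtu=1400)
    print(decision, sender_learns_path_mtu(decision, icmp_blocked=True))
```

With `icmp_blocked=True` the sender keeps retransmitting oversized packets, which is consistent with a tunnel that never finishes negotiating.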

Author: dunxd

I'm currently freelance specialising in international connectivity and infrastructure working with clients in the humanitarian space. If your organisation struggles to work effectively because of limited internet options in far flung locations, maybe I can help. Until 2017 I worked at a large international development charity in London, as International Operations Manager. I managed a team of Regional ICT Service Managers, based in developing world countries, who kept the users happy through fixing problems, setting up great connectivity and generally making sure users could do their day jobs. I think I did a good job as a manager - some of my team went on to great things! I previously worked at the same place as International Network Systems Analyst. I looked after a bunch of ICT systems in offices in the developing world, as well as looking after systems in our HQ. I gained a lot of knowledge in that job, and the techy side competes with the people stuff in the new role, hence I still hang out here a lot. I'm passionate about the use of ICT in developing countries, both in terms of dealing with the inherent problems for ICT in those places, and using ICT as a tool for development.

Updated on September 18, 2022

Comments

  • dunxd
    dunxd almost 2 years

    We use Cisco ASA for our IPSEC VPNs, using the EZVPN method. From time to time we encounter problems where an ISP has made a change to their network and our VPN stops working. Nine times out of ten the ISP denies that their change could have stopped this working - I suspect because they don't understand exactly what might have caused the problem. Rather than just bashing heads with them I want to try and point them in a direction that might get a speedier resolution.

    In my current incident, I can ssh onto the external interface of the ASA and do a little poking around:

     sh crypto isakmp sa
    
       Active SA: 1
        Rekey SA: 0 (A tunnel will report 1 Active and 1 Rekey SA during rekey)
    Total IKE SA: 1
    
    1   IKE Peer: {Public IP address of London ASA}
        Type    : user            Role    : initiator
        Rekey   : no              State   : AM_TM_INIT_XAUTH_V6C
    

    At the other end of the link I see the following:

    Active SA: 26
    <snip>
    25  IKE Peer: {public IP address of Port-Au-Prince-ASA}
        Type    : user            Role    : responder
        Rekey   : no              State   : AM_TM_INIT_MODECFG_V6H
    

    I can't find any documentation for what AM_TM_INIT_XAUTH_V6C or AM_TM_INIT_MODECFG_V6H mean, but I'm pretty sure they indicate that the IKE handshake has failed for some reason.

    Can anyone suggest any likely things that might be preventing IKE from succeeding, or specific details of what AM_TM_INIT_XAUTH_V6C means?

    Update: We connected the ASA at the site of a customer of another ISP. The VPN connection came up immediately. This confirms that the problem is not configuration related. The ISP is now accepting responsibility and investigating further.

    Update: The connection suddenly came back online last week. I have notified the ISP to see if they changed anything, but not heard back yet. Frustratingly I am now seeing a similar issue on another site. I found a Cisco doc on the effects of fragmentation on VPN. I am starting to think that this may be the cause of the issues I am seeing.

    • dunxd
      dunxd about 13 years
      I've got a bunch of output from debug crypto isakmp 255 - too much to drop in here. Can anyone give me any pointers for what in that might be relevant for troubleshooting? I can then add it to the question.
    • Tom O'Connor
      Tom O'Connor about 13 years
      Pastebin it, then link to the pastebin ;)
    • dunxd
      dunxd about 13 years
      Is anyone seriously going to trawl through a debug dump? That would be kind. However, how much of the info in there would I need to redact before I pastebin it? I would prefer it if someone kind enough to look through would specify what sort of things they would be looking for, and I could paste that specifically.
    • Tom O'Connor
      Tom O'Connor about 13 years
      I'm at pains to say this, after having re-read the question, but I'm not sure I'd want to deal with an ISP who don't know the effect of their changes on their own damn network. Find an ISP who don't block ESP, IKE or IPSEC, then use them for the VPN connection.
    • ravi yarlagadda
      ravi yarlagadda about 13 years
      Do you have some less-verbose logs that would be easier to redact? Even having the informational-level logs should give a pretty good idea of where a tunnel build is failing.
    • dunxd
      dunxd about 13 years
      @Tom wouldn't that be nice, but the site is in Haiti and I work for an NGO with limited budget, so I don't have that kind of luxury.
  • dunxd
    dunxd almost 13 years
    We tested the ASA on a connection going through another ISP. It worked. The ISP has admitted the problem is likely at their end.