Regular expression for validating a DNS label (host name)

32,293

Solution 1

^(?![0-9]+$)(?!-)[a-zA-Z0-9-]{,63}(?<!-)$

I used the following testbed written in Python to verify that it works correctly:

tests = [
    ('01010', False),
    ('abc', True),
    ('A0c', True),
    ('A0c-', False),
    ('-A0c', False),
    ('A-0c', True),
    ('o123456701234567012345670123456701234567012345670123456701234567', False),
    ('o12345670123456701234567012345670123456701234567012345670123456', True),
    ('', True),
    ('a', True),
    ('0--0', True),
]

import re
regex = re.compile('^(?![0-9]+$)(?!-)[a-zA-Z0-9-]{,63}(?<!-)$')
for (s, expected) in tests:
    is_match = regex.match(s) is not None
    print(is_match == expected)

Solution 2

JavaScript regex based on Mark's answer:

pattern = /^(?![0-9]+$)(?!.*-$)(?!-)[a-zA-Z0-9-]{1,63}$/;

Solution 3

In Ruby, ^ and $ match at the start and end of every line, not just of the whole string, so tools such as Rails warn against using them for validation. This is Mark's answer with the safe start-of-string and end-of-string anchors:

\A(?![0-9]+$)(?!-)[a-zA-Z0-9-]{,63}(?<!-)\z

Solution 4

It is worth noting that DNS labels and hostname components have slightly different rules. Most notably: '_' is not legal in any component of a hostname, but is a standard part of labels used for things like SRV records.

A more readable and portable approach is to require a string to match both of these POSIX EREs:

^([[:alnum:]][[:alnum:]-]{0,61}[[:alnum:]]|[[:alpha:]])$
^.*[^[:digit:]].*$

Those should be easy to use in any standards-compliant ERE implementation. Perl-style lookaround, as in the Python example, is widely available, but it does not behave exactly the same in every engine that appears to support it. Ouch.

It is possible in principle to combine those two lines into a single ERE, but it would be long and unwieldy. The first line handles every rule except the ban on all-digit labels; the second rejects those.
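
For readers who want to try the two-pattern idea without a POSIX ERE engine, here is a minimal sketch in Python. Python's re module does not support POSIX bracket classes such as [[:alnum:]], so the classes are spelled out as ASCII ranges; that translation and the names SHAPE, NON_DIGIT and is_valid_label are assumptions made for this example, not part of the answer above.

```python
import re

# Shape and length: 1 to 63 characters, alphanumeric at both ends,
# hyphens only in the interior. A single character must be a letter.
SHAPE = re.compile(r"(?:[A-Za-z0-9][A-Za-z0-9-]{0,61}[A-Za-z0-9]|[A-Za-z])")
# At least one character that is not a digit (rejects all-numeric labels).
NON_DIGIT = re.compile(r"[^0-9]")


def is_valid_label(label):
    """Return True only if the label satisfies both patterns."""
    return (SHAPE.fullmatch(label) is not None
            and NON_DIGIT.search(label) is not None)


for candidate in ["abc", "A-0c", "01010", "-A0c", "A0c-", "", "7"]:
    print(repr(candidate), is_valid_label(candidate))
```

Keeping the two checks separate mirrors the split described above: the first pattern never needs to know about the all-digits rule, and the second is a one-character search.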

Solution 5

A revised regex based on comments here and my own reading of RFCs 1035 & 1123:

Ruby: \A(?!-)[a-zA-Z0-9-]{1,63}(?<!-)\z (tests below)

Python: ^(?!-)[a-zA-Z0-9-]{1,63}(?<!-)$ (not tested by me)

JavaScript: pattern = /^(?!-)(?!.*-$)[a-zA-Z0-9-]{1,63}$/; (based on Tom Lime's answer, not tested by me)

Tests:

tests = [
  ['01010', true],
  ['abc', true],
  ['A0c', true],
  ['A0c-', false],
  ['-A0c', false],
  ['A-0c', true],
  ['o123456701234567012345670123456701234567012345670123456701234567', false],
  ['o12345670123456701234567012345670123456701234567012345670123456', true],
  ['', false],
  ['a', true],
  ['0--0', true],
  ["A0c\nA0c", false]
]

regex = /\A(?!-)[a-zA-Z0-9-]{1,63}(?<!-)\z/
tests.each do |label, expected|
  is_match = !!(regex =~ label)
  puts is_match == expected
end
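
The Python variant listed above is marked as untested. The following is a minimal sketch that runs the same cases against it. It uses fullmatch() rather than the ^ and $ anchors, because in Python $ also matches just before a trailing newline, so the anchored form would accept 'abc\n'. The names LABEL, CASES and is_label are chosen here for illustration only.

```python
import re

# Body of the Python pattern above without ^ and $; fullmatch() anchors both
# ends and, unlike $, does not accept a trailing newline.
LABEL = re.compile(r"(?!-)[a-zA-Z0-9-]{1,63}(?<!-)")

long_label = "o1234567" + "01234567" * 7  # 64 characters, as in the tests above

CASES = [
    ("01010", True),
    ("abc", True),
    ("A0c", True),
    ("A0c-", False),
    ("-A0c", False),
    ("A-0c", True),
    (long_label, False),
    (long_label[:-1], True),  # 63 characters
    ("", False),
    ("a", True),
    ("0--0", True),
    ("A0c\nA0c", False),
    ("abc\n", False),  # extra case: trailing newline
]


def is_label(s):
    """Return True if the whole string is a valid label."""
    return LABEL.fullmatch(s) is not None


for s, expected in CASES:
    print(repr(s), is_label(s) == expected)
```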

Notes:

  1. Thanks to Mark Byers for the original code fragment
  2. solidsnack points out that RFC 1123 allows all-numeric labels (https://www.rfc-editor.org/rfc/rfc1123#page-13)
  3. RFC 1035 does not allow zero-length labels (https://www.rfc-editor.org/rfc/rfc1035): <label> ::= <letter> [ [ <ldh-str> ] <let-dig> ]
  4. I've added a test specifically for Ruby that ensures a new line is not embedded in the label. This is thanks to notes by ssorallen.
  5. This code is available here: https://github.com/Xenapto/domain-label-validation - I'm happy to accept pull requests if you want to update it.
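
The grammar quoted in note 3 translates almost mechanically into a pattern. The sketch below is one possible reading, in Python: RFC1035_LABEL follows the grammar literally (the first character must be a letter), and RFC1123_LABEL applies the RFC 1123 relaxation that lets the first character be a digit. Both names and the helper matches() are made up for this example, and the 63-character limit is checked separately to keep the patterns readable.

```python
import re

# <label> ::= <letter> [ [ <ldh-str> ] <let-dig> ]
RFC1035_LABEL = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?")
# RFC 1123 relaxes the first character to a letter or a digit.
RFC1123_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def matches(pattern, label):
    """True if the whole label fits the pattern and the 63-character limit."""
    return len(label) <= 63 and pattern.fullmatch(label) is not None


for label in ["abc", "01010", "A0c-", ""]:
    print(repr(label),
          matches(RFC1035_LABEL, label),
          matches(RFC1123_LABEL, label))
```

The only case where the two patterns disagree is a label that starts with a digit, which is exactly the point raised in note 2.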
Author: Rwahyudi

Updated on August 01, 2022

Comments

  • Rwahyudi
    Rwahyudi almost 2 years

    I would like to validate a hostname using only a regular expression.

    Host Names (or 'labels' in DNS jargon) were traditionally defined by RFC 952 and RFC 1123 and may be composed of the following valid characters.

    • A to Z ; upper case characters
    • a to z ; lower case characters
    • 0 to 9 ; numeric characters 0 to 9
    • - ; dash

    The rules say:

    • A host name (label) can start or end with a letter or a number
    • A host name (label) MUST NOT start or end with a '-' (dash)
    • A host name (label) MUST NOT consist of all numeric values
    • A host name (label) can be up to 63 characters

    How would you write a regular expression to validate a hostname?

  • Rwahyudi
    Rwahyudi over 14 years
    Spot on, Mark, just what I'm after!
  • Tom Lime
    Tom Lime over 11 years
    Mark, thanks, you saved me time, and I will save someone else's time by reposting your regex adapted for JavaScript.
  • Ross Allen
    Ross Allen over 10 years
    Use \A and \z in place of ^ and $, respectively, in Ruby since Ruby regular expressions are multi-line by default: \A(?![0-9]+$)(?!-)[a-zA-Z0-9-]{,63}(?<!-)\z.
  • KingxMe
    KingxMe over 10 years
    It is actually okay for a label (part of a domain name) to be all numeric. However, for the whole domain name to be all numeric is in practice disallowed, since TLDs are not all numeric, and it is expected that one can distinguish syntactically between IPs and domain names. tools.ietf.org/html/rfc1123#page-13
  • Dominic Sayers
    Dominic Sayers over 10 years
    01010 is a valid label (RFC 1123). The empty string is an invalid label (RFC 1035)
  • Corey Ballou
    Corey Ballou over 8 years
    Read my answer below which requires the addition of underscores to your regex.
  • Patrick Mevzek
    Patrick Mevzek over 5 years
    @CoreyBallou No, underscores are not allowed in hostnames. They are only allowed in domain names, so it all depends on the resource record. _whatever CNAME elsewhere is valid (because the owner of a CNAME is a domain name, not a hostname), but _whatever IN A 192.0.2.42 is not valid, because the owner of an A record is a hostname and not a domain name.
  • peedee
    peedee over 2 years
    I find your first regex matching more than I want. May I suggest following improvement? ^[[:alnum:]][[:alnum:]\-]{0,61}[[:alnum:]]$|^[[:alnum:]]$
  • Bill Cole
    Bill Cole over 2 years
    Thanks! I have made a slightly different improvement that makes the meaning more human-obvious. Note that I've left the second alternative pattern as 'alpha' instead of 'alnum' because all-digit labels are not legal.