Regular expression for validating a DNS label (host name)
Solution 1
^(?![0-9]+$)(?!-)[a-zA-Z0-9-]{,63}(?<!-)$
I used the following testbed written in Python to verify that it works correctly:
tests = [
('01010', False),
('abc', True),
('A0c', True),
('A0c-', False),
('-A0c', False),
('A-0c', True),
('o123456701234567012345670123456701234567012345670123456701234567', False),
('o12345670123456701234567012345670123456701234567012345670123456', True),
('', True),
('a', True),
('0--0', True),
]
import re
regex = re.compile(r'^(?![0-9]+$)(?!-)[a-zA-Z0-9-]{,63}(?<!-)$')
for (s, expected) in tests:
    # Each line printed should be True if the regex agrees with the expectation
    is_match = regex.match(s) is not None
    print(is_match == expected)
Solution 2
JavaScript regex based on Mark's answer (without the g flag, which makes repeated test() calls on the same pattern object stateful):
pattern = /^(?![0-9]+$)(?!.*-$)(?!-)[a-zA-Z0-9-]{1,63}$/;
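The JavaScript pattern swaps the trailing lookbehind (?<!-) for a lookahead (?!.*-$), presumably because lookbehind only arrived in JavaScript with ES2018. A minimal Python sketch that compares the two forms on a handful of labels follows. The names LOOKBEHIND, LOOKAHEAD, samples and agree are illustrative, and both patterns use {1,63} so that only the hyphen check differs.

```python
import re

# Lookbehind form, as in Solution 1 (with {1,63} so only the hyphen check differs).
LOOKBEHIND = re.compile(r'^(?![0-9]+$)(?!-)[a-zA-Z0-9-]{1,63}(?<!-)$')
# Lookahead-only form, as in the JavaScript pattern above.
LOOKAHEAD = re.compile(r'^(?![0-9]+$)(?!.*-$)(?!-)[a-zA-Z0-9-]{1,63}$')

# Illustrative samples, not an exhaustive proof of equivalence.
samples = ['abc', 'A0c-', '-A0c', 'A-0c', '01010', '', 'a', '0--0', 'a' * 63, 'a' * 64]


def agree(s):
    """Hypothetical helper: True if both patterns give the same verdict for s."""
    return (LOOKBEHIND.match(s) is None) == (LOOKAHEAD.match(s) is None)


for s in samples:
    print(repr(s), agree(s))
```

This only shows agreement on the listed samples; it says nothing about inputs containing newlines or other characters outside the label alphabet.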
Solution 3
Ruby regular expressions are multiline by default, and so something like Rails warns against using ^ and $. This is Mark's answer with safe start- and end-of-string characters:
\A(?![0-9]+$)(?!-)[a-zA-Z0-9-]{,63}(?<!-)\z
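Python has a milder version of the same trap: without re.MULTILINE, $ still matches just before a newline at the very end of the string, while \Z matches only at the true end. A small sketch of the difference, assuming Mark's pattern is otherwise unchanged (DOLLAR and STRICT are illustrative names):

```python
import re

# Mark's pattern with ^/$ anchors.
DOLLAR = re.compile(r'^(?![0-9]+$)(?!-)[a-zA-Z0-9-]{,63}(?<!-)$')
# Same pattern with \Z, which in Python matches only at the end of the string.
STRICT = re.compile(r'^(?![0-9]+\Z)(?!-)[a-zA-Z0-9-]{,63}(?<!-)\Z')

label = 'abc\n'  # a label with a trailing newline sneaked in
print(DOLLAR.match(label) is not None)  # accepted by the $ version
print(STRICT.match(label) is not None)  # rejected by the \Z version
```

Using re.fullmatch() instead of explicit anchors is another way to get the strict behaviour.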
Solution 4
It is worth noting that DNS labels and hostname components have slightly different rules. Most notably: '_' is not legal in any component of a hostname, but is a standard part of labels used for things like SRV records.
A more readable and portable approach is to require a string to match both of these POSIX EREs:
^([[:alnum:]][[:alnum:]-]{0,61}[[:alnum:]]|[[:alpha:]])$
^.*[^[:digit:]].*$
Those should be easy to use in any standards-compliant ERE implementation. Perl-style lookaround assertions, as used in the Python example, are widely available, but they do not behave exactly the same in every engine that appears to support them. Ouch.
It is possible in principle to make a single ERE of those two lines, but it would be long and unwieldy. The first line handles all of the rules other than the ban on all-digits, the second kills those.
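As a rough illustration of the two-pattern approach, here is a Python sketch. Python's re module does not support POSIX classes such as [[:alnum:]], so the patterns are translated to explicit ASCII ranges, which assumes the C locale meaning of those classes. The names SHAPE, NON_DIGIT and is_hostname_label are hypothetical.

```python
import re

# ASCII translation of the first ERE: length, allowed characters, no edge hyphens.
SHAPE = re.compile(r'[A-Za-z0-9][A-Za-z0-9-]{0,61}[A-Za-z0-9]|[A-Za-z]')
# ASCII translation of the second ERE: at least one non-digit somewhere.
NON_DIGIT = re.compile(r'[^0-9]')


def is_hostname_label(label):
    """Hypothetical helper: True only if both patterns accept the label."""
    # fullmatch() plays the role of the ^...$ anchors in the first ERE.
    return (SHAPE.fullmatch(label) is not None
            and NON_DIGIT.search(label) is not None)


for candidate in ['A-0c', '01010', 'A0c-', '-A0c', 'a', '']:
    print(repr(candidate), is_hostname_label(candidate))
```

Keeping the two checks separate mirrors the split described above: the first pattern carries every rule except the all-digits ban, and the second handles only that ban.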
Solution 5
A revised regex based on comments here and my own reading of RFCs 1035 & 1123:
Ruby: \A(?!-)[a-zA-Z0-9-]{1,63}(?<!-)\z (tests below)
Python: ^(?!-)[a-zA-Z0-9-]{1,63}(?<!-)$ (not tested by me)
JavaScript: pattern = /^(?!-)(?!.*-$)[a-zA-Z0-9-]{1,63}$/; (based on Tom Lime's answer, not tested by me; the (?!.*-$) lookahead stands in for the lookbehind, so a trailing dash is still rejected)
Tests:
tests = [
['01010', true],
['abc', true],
['A0c', true],
['A0c-', false],
['-A0c', false],
['A-0c', true],
['o123456701234567012345670123456701234567012345670123456701234567', false],
['o12345670123456701234567012345670123456701234567012345670123456', true],
['', false],
['a', true],
['0--0', true],
["A0c\nA0c", false]
]
regex = /\A(?!-)[a-zA-Z0-9-]{1,63}(?<!-)\z/
tests.each do |label, expected|
  is_match = !!(regex =~ label)
  puts is_match == expected
end
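Since the Python variant is listed as untested, here is a hedged port of the Ruby tests above. It assumes re.fullmatch() is an acceptable replacement for the ^ and $ anchors (it also rejects a trailing newline), and it builds the 63- and 64-character cases with 'a' * n instead of the literal strings. LABEL and results are illustrative names.

```python
import re

# Same character rules as the Ruby pattern; fullmatch() supplies the anchoring.
LABEL = re.compile(r'(?!-)[a-zA-Z0-9-]{1,63}(?<!-)')

tests = [
    ('01010', True),       # all-numeric labels are allowed (RFC 1123)
    ('abc', True),
    ('A0c', True),
    ('A0c-', False),       # trailing dash
    ('-A0c', False),       # leading dash
    ('A-0c', True),
    ('a' * 64, False),     # one character too long
    ('a' * 63, True),      # maximum length
    ('', False),           # empty labels are not allowed (RFC 1035)
    ('a', True),
    ('0--0', True),
    ('A0c\nA0c', False),   # embedded newline
    ('A0c\n', False),      # trailing newline
]

results = [(LABEL.fullmatch(label) is not None) == expected
           for label, expected in tests]
for ok in results:
    print(ok)
```

Every printed line should be True if the pattern agrees with the expectations.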
Notes:
- Thanks to Mark Byers for the original code fragment
- solidsnack points out that RFC 1123 allows all-numeric labels (https://www.rfc-editor.org/rfc/rfc1123#page-13)
- RFC 1035 does not allow zero-length labels (https://www.rfc-editor.org/rfc/rfc1035):
<label> ::= <letter> [ [ <ldh-str> ] <let-dig> ]
- I've added a test specifically for Ruby that ensures a newline is not embedded in the label. This is thanks to notes by ssorallen.
- This code is available here: https://github.com/Xenapto/domain-label-validation - I'm happy to accept pull requests if you want to update it.
Rwahyudi
Updated on August 01, 2022
Comments
-
Rwahyudi almost 2 years
I would like to validate a hostname using only a regular expression.
Host Names (or 'labels' in DNS jargon) were traditionally defined by RFC 952 and RFC 1123 and may be composed of the following valid characters.
- A to Z ; upper case characters
- a to z ; lower case characters
- 0 to 9 ; numeric characters 0 to 9
- - ; dash
The rules say:
- A host name (label) can start or end with a letter or a number
- A host name (label) MUST NOT start or end with a '-' (dash)
- A host name (label) MUST NOT consist of all numeric values
- A host name (label) can be up to 63 characters
How would you write a regular expression to validate a hostname?
-
Rwahyudi over 14 years
Spot on Mark - just what I'm after!!
-
Tom Lime over 11 years
Mark, thanks, you saved my time, and I will save someone else's time by reposting your regex adapted for JavaScript.
-
Ross Allen over 10 years
Use \A and \z in place of ^ and $, respectively, in Ruby since Ruby regular expressions are multi-line by default: \A(?![0-9]+$)(?!-)[a-zA-Z0-9-]{,63}(?<!-)\z
-
KingxMe over 10 years
It is actually okay for a label (part of a domain name) to be all numeric. However, for the whole domain name to be all numeric is in practice disallowed, since TLDs are not all numeric, and it is expected that one can distinguish syntactically between IPs and domain names. tools.ietf.org/html/rfc1123#page-13
-
Dominic Sayers over 10 years
01010 is a valid label (RFC 1123). The empty string is an invalid label (RFC 1035).
-
Corey Ballou over 8 years
Read my answer below which requires the addition of underscores to your regex.
-
Patrick Mevzek over 5 years
@CoreyBallou No, underscores are not allowed in hostnames. They are only allowed in domain names, so it all depends on the resource record. _whatever CNAME elsewhere is valid (because the owner of a CNAME is a domain name, not a hostname), but _whatever IN A 192.0.2.42 is not valid because the owner of an A record is a hostname and not a domain name.
-
peedee over 2 years
I find your first regex matching more than I want. May I suggest the following improvement?
^[[:alnum:]][[:alnum:]\-]{0,61}[[:alnum:]]$|^[[:alnum:]]$
-
Bill Cole over 2 years
Thanks! I have made a slightly different improvement that makes the meaning more human-obvious. Note that I've left the second alternative pattern as 'alpha' instead of 'alnum' because all-digit labels are not legal.