How does /etc/hosts match work?
Solution 1
Under Linux, /etc/hosts is parsed line by line, from the top, until a match is found. This is also how it was handled in the 1983 release of 4.2BSD, which was the first general release of the modern Unix IP stack.
It is important to remember that at the time of 4.2BSD, the entire list of Internet-connected hosts was kept at a centralized Network Information Center and numbered only about 325. A connected host would retrieve the list of every host on the Internet from the NIC, and a linear search through a few hundred lines was good enough. It was about this time that it became clear this wasn't going to scale well, so the Domain Name System was proposed. Thereafter, if you had more than a few hundred lines in /etc/hosts, you were "doing it wrong".
Note too that /etc/hosts is processed by libc, a user-space library. The kernel has no idea that anything exists except a sockaddr. So this answer applies only to the handling of /etc/hosts and specifically ignores overarching name resolver systems, which vary widely in their caching behaviors and time complexities.
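The linear, first-match scan described above can be sketched in shell. This is an illustration only, not libc's actual code: hosts_lookup is a hypothetical helper, the sample names and addresses are made up, and the case-insensitive comparison is an assumption of this sketch.

```shell
#!/bin/bash
# Hypothetical helper: first-match linear scan over a hosts-format file.
# Usage: hosts_lookup FILE NAME
hosts_lookup() {
    awk -v name="$2" '
        { sub(/#.*/, "") }             # drop comments
        NF < 2 { next }                # skip blank or address-only lines
        {
            # Field 1 is the address; fields 2..NF are the name and its aliases.
            for (i = 2; i <= NF; i++)
                if (tolower($i) == tolower(name)) { print $1; exit }
        }
    ' "$1"
}

# Sample data in a temporary file, so the real /etc/hosts is not touched.
sample_hosts=$(mktemp)
cat > "$sample_hosts" <<'EOF'
127.0.0.1   localhost
# a comment line
192.0.2.10  web.example.test web
192.0.2.20  web.example.test   # same name again: never reached
EOF

hosts_lookup "$sample_hosts" web.example.test   # prints 192.0.2.10
rm -f "$sample_hosts"
```

Because the scan stops at the first hit, the cost grows with the position of the matching line, which is why a very long hosts file is slow for names near the bottom or absent altogether.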
Solution 2
How does /etc/hosts match work?
It depends on /etc/nsswitch.conf and /etc/host.conf, which decide whether /etc/hosts is consulted at all.
As @msw said, /etc/hosts is processed by libc, a user-space library. libc reads nsswitch.conf and looks at the hosts entry, for example "hosts: files dns". Here files means read /etc/hosts, and dns means query the name servers listed in /etc/resolv.conf.
Now suppose you open http://www.google.com in Firefox:
- Firefox first asks the local resolver (libresolv.so) to resolve the host name.
- The resolver checks the order of files and dns in nsswitch.conf.
- In the default case it searches files first, that is, /etc/hosts.
- If the name is not matched there, it turns to /etc/resolv.conf and looks for nameserver entries. If none is configured, the resolver sends the DNS query to localhost on port 53.
- If a nameserver is defined, say nameserver 8.8.8.8, the query is sent to that server, much as if you had run dig google.com @8.8.8.8. The answer then comes from the public DNS server.
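The two configuration reads in the steps above can be sketched in shell. This is a rough illustration, not how libc parses these files: hosts_sources and resolv_nameservers are hypothetical helpers, and the sample file contents are made up for the example.

```shell
#!/bin/bash
# Hypothetical helper: print the lookup sources from the "hosts:" line of an
# nsswitch.conf-format file, one per line, in the order they would be tried.
hosts_sources() {
    awk '
        { sub(/#.*/, ""); gsub(/\[[^]]*\]/, "") }  # drop comments and [STATUS=action] items
        $1 == "hosts:" { for (i = 2; i <= NF; i++) print $i; exit }
    ' "$1"
}

# Hypothetical helper: print the name servers from a resolv.conf-format file.
# Falls back to 127.0.0.1 when none is configured, as described above.
resolv_nameservers() {
    awk '
        $1 == "nameserver" && NF >= 2 { print $2; found = 1 }
        END { if (!found) print "127.0.0.1" }
    ' "$1"
}

# Sample files, so the real configuration is not touched.
sample_nss=$(mktemp)
sample_resolv=$(mktemp)
printf 'passwd: files\nhosts:  files dns\n' > "$sample_nss"
printf '# public resolver\nnameserver 8.8.8.8\n' > "$sample_resolv"

hosts_sources "$sample_nss"          # prints "files" then "dns"
resolv_nameservers "$sample_resolv"  # prints 8.8.8.8
rm -f "$sample_nss" "$sample_resolv"
```

To see what the configured order actually returns on a given machine, getent hosts followed by a host name performs the lookup through the same name service switch.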
EDIT 1: Users can also maintain their own alias file through the HOSTALIASES environment variable; the resolver checks this file first, before reading /etc/hosts. Each line maps a short alias to a host name, not to an IP address.
Example:
echo "fb www.fb.com" >> ~/my_hosts
echo "export HOSTALIASES=~/my_hosts" >> ~/.bashrc
source ~/.bashrc
theseankelly
Updated on September 18, 2022
Comments
-
theseankelly almost 2 years
I'm baffled. I've got a linux backup script I used when I used to use linux as my main OS. Now that I've moved to windows, I want to keep using it under cygwin. I've ported it over, but am seeing a peculiar issue.
As you'll see in the code below, basically I create a folder structure /device/backups/machinename/. I've generalized it so it determines machine name via hostname. If the folder structure doesn't exist, it creates it. What I'm seeing is that the script is generally working, but it occasionally likes to create a duplicate machine name folder with an odd square character after it. This character shows as a question mark in cygwin. So, I'd see: /device/backups/machinename and /device/backups/machinename? at the same time. Cygwin seems to get confused between these two folders, sometimes backing up to the first and sometimes backing up to the second. It also doesn't create this folder consistently, but if I let things run every day for a week, it'll show up.
Also note it's designed to run on a per folder basis, folder names passed in as arguments. Keeps a week of archives in the format FolderName.0.tar.gz, FolderName.1.tar.gz, etc.
I'm going to try to work around it by hard coding the machine name for now, but I'm really interested in figuring out what the problem is. Here's my script's source:
#!/bin/bash
#Backup Docs
for FOLDER in $@
do
    # Location of folder to be backed up
    FOLDERLOCATION="/home/sean"
    # Mount point of the backup device
    DEVICE="/cygdrive/f"
    # Hostname of the machine being backed up
    HOSTNAME=`hostname`
    # NOTE: when I originally posted this question, the above line read:
    # HOSTNAME=`txtmsgbreakup`
    # which doesn't make any sense. I failed at changing the hard-coded solution back
    # to the original command that produces the string "txtmsgbreakup" (my system's name)
    BACKUPFOLDER="$FOLDERLOCATION/$FOLDER/"
    BACKUPDEST="$DEVICE/backups/$HOSTNAME/$FOLDER/"
    # Check to see if device is mounted
    if [ -d $DEVICE ]
    then
        # Create directory if necessary
        if [ ! -d $BACKUPDEST ]; then
            mkdir -p $BACKUPDEST
        fi
        # Capture before time for logging
        before=$(date +%s)
        # First, tar up the old into file named after day of week
        DOW=`date +%w`
        FILENAME="$DEVICE/backups/$HOSTNAME/$FOLDER.$(( ($DOW+6)%7 )).tar.gz"
        if [ -e $FILENAME ]; then
            rm $FILENAME
        fi
        tar -czPf $FILENAME $BACKUPDEST
        # Now perform the backup
        rsync -a --del --ignore-errors $BACKUPFOLDER $BACKUPDEST
        after=$(date +%s)
        # Calculate how long the backup took
        elapsed_seconds=$(($after-$before))
        es=$((elapsed_seconds % 60))
        em=$(( (elapsed_seconds / 60) % 60 ))
        eh=$((elapsed_seconds / 3600 ))
        # Write it all to the system log
        echo "$(date) - $BACKUPFOLDER backed up. Elapsed time: $(printf '%02d:%02d:%02d' $eh $em $es)" >> /var/log/backup
    else
        #External is not mounted if this branch is executed, log this.
        echo "$(date) - $BACKUPFOLDER not backed up: Backup device not mounted." >> /var/log/backup
    fi
done
-
theseankelly over 12 years
I should add: Backup device is an external HDD formatted NTFS.
-
Gordon Davisson over 12 years
ls -b should also show what the weird character is. Since you're on Windows, I'd guess it's a carriage return (\r) since Windows tends to use \r\n to terminate lines. If you've edited the script with a Windows editor, you'll need to purge the \r's from it. Another possibility is that txtmsgbreakup produces Windows-style output, in which case you'll need to clean its output before using it.
-
Keith Thompson over 12 years
ls | cat -A should be enough to show you what the funny character is. ls usually renders non-printable characters as ?, but if its output is redirected it shows them literally; cat -A translates an ASCII CR character to ^M (to pick a decidedly non-random example).
-
theseankelly over 12 years
Made a stupid mistake changing the script back to its original form. I had solved the problem by hardcoding HOSTNAME="txtmsgbreakup" rather than HOSTNAME=`hostname`. I changed it back to HOSTNAME=`txtmsgbreakup` which is invalid. The script never ran like that, it was only an error in what I posted, so that's not the problem.
-
theseankelly over 12 years
I also suspected a CR, and I'll take a look to see what the actual character is soon. However, even if it IS a CR, why would it only happen on rare occasion?
-
shellter over 12 years
Is this part of a startup script, where the device is still mounting when the script starts? (Not likely, but just an idea.) Running ls -d ${BACKUPFOLDER}* | od --format=m may help by showing what the odd-ball character is. Good luck.
-
theseankelly over 12 years
Yeah, I'm a moron. I had changed my script to hardcode "txtmsgbreakup" as you suggest, but changed it back to the original for this example. I changed the quotes to ticks, but forgot to change the string to "hostname". So that line is actually `hostname` and not `txtmsgbreakup`, which evaluates to "txtmsgbreakup" on my particular machine. Sorry. I'll edit the original question too.
-
user almost 11 years
+1 for going to the source (literally) and the history lesson. Would upvote this twice if I could.