How does /etc/hosts match work?

Solution 1

Under Linux, /etc/hosts is scanned line by line until the first match when resolving a name. This is also how it was handled in the 1983 release of 4.2BSD, which was the first general release of the modern Unix IP stack.

It is important to remember that at the time of 4.2BSD, the entire list of Internet-connected hosts was kept at a centralized Network Information Center and numbered only about 325. A connected host would retrieve the list of every host on the Internet from the NIC, and a linear search through a few hundred lines was good enough. It was about this time that it became clear this wasn't going to scale well, so the Domain Name System was proposed. Thereafter, if you had more than a few hundred lines in /etc/hosts, you were "doing it wrong".

Note too that /etc/hosts is processed by libc, a user-space library. The kernel has no idea that anything exists except a sockaddr. So this answer applies only to the handling of /etc/hosts and specifically ignores overarching name resolver systems which vary widely in their caching behaviors and time complexities.
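The scan-until-first-match behavior described above can be sketched with a small shell function. This is only an illustration, not the libc code: lookup_host is a made-up name, the case-insensitive comparison is an assumption about how names are matched, and the optional second argument exists purely so the function can be tried on a test file.

```shell
# lookup_host NAME [FILE]
# Print the address from the first line of FILE (default /etc/hosts) whose
# host name or alias fields equal NAME, then stop reading.
# Hypothetical helper for illustration; not part of any library.
lookup_host() {
    local name=$1 file=${2:-/etc/hosts}
    awk -v name="$name" '
        { sub(/#.*/, "") }          # strip comments
        NF < 2 { next }             # skip blank lines and lines without a name
        {
            # Field 1 is the address, fields 2..NF are the name and aliases.
            for (i = 2; i <= NF; i++) {
                if (tolower($i) == tolower(name)) {   # assumed case-insensitive
                    print $1
                    found = 1
                    exit            # linear search ends at the first match
                }
            }
        }
        END { exit !found }         # status 0 on a match, 1 otherwise
    ' "$file"
}
```

Calling lookup_host localhost prints the first address listed for localhost, if any. Later lines that name the same host are never reached, which is why the order of entries matters.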

Solution 2

How does /etc/hosts match work?

Whether /etc/hosts is consulted at all depends on /etc/nsswitch.conf and /etc/host.conf.

As @msw said, name resolution is handled by libc, a user-space library. It reads nsswitch.conf and looks at the "hosts:" entry, for example "hosts: files dns". Here "files" means read /etc/hosts, and "dns" means query the name servers configured in /etc/resolv.conf.
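To see which sources a machine is configured to use, and in which order, the "hosts:" line can be pulled out of nsswitch.conf. The function below is a rough sketch: hosts_sources is a hypothetical name, and it simply discards bracketed action items such as [NOTFOUND=return] instead of interpreting them.

```shell
# hosts_sources [FILE]
# Print, one per line and in order, the lookup sources named on the
# "hosts:" line of FILE (default /etc/nsswitch.conf).
# Hypothetical helper for illustration only.
hosts_sources() {
    local file=${1:-/etc/nsswitch.conf}
    awk '
        { sub(/#.*/, "") }              # strip comments
        $1 == "hosts:" {
            gsub(/\[[^]]*\]/, "")       # drop action items like [NOTFOUND=return]
            for (i = 2; i <= NF; i++) print $i
            exit
        }
    ' "$file"
}
```

With "hosts: files dns" the function prints files and then dns, which is the order the resolver would try them.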

Now suppose you open http://www.google.com in Firefox:

  1. Firefox first asks the local resolver (libresolv.so) to resolve www.google.com.
  2. The resolver checks the order of "files" and "dns" in nsswitch.conf.
  3. In the default case it searches "files" first, meaning /etc/hosts.
  4. If www.google.com is not matched there, it consults /etc/resolv.conf and checks for nameserver entries. If none are configured, the resolver sends the DNS query to localhost on port 53.
  5. If a nameserver is defined, say nameserver 8.8.8.8, the query is sent to it, much like running dig google.com @8.8.8.8. The answer then comes from the public DNS.
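Steps 4 and 5 can be sketched as a function that lists the name servers a query would go to, falling back to the local machine when none are configured. The name nameservers and the optional file argument are assumptions made for this example; the real resolver also honors other resolv.conf settings that are ignored here.

```shell
# nameservers [FILE]
# Print the addresses from the "nameserver" lines of FILE (default
# /etc/resolv.conf). If there are none, print 127.0.0.1, mirroring the
# fallback to a name server on the local machine.
# Hypothetical helper for illustration only.
nameservers() {
    local file=${1:-/etc/resolv.conf}
    awk '
        $1 == "nameserver" && NF >= 2 { print $2; n++ }
        END { if (!n) print "127.0.0.1" }
    ' "$file"
}
```

To watch the whole chain on a real system, getent hosts www.google.com follows the order configured in nsswitch.conf, whereas dig talks to a DNS server directly and never reads /etc/hosts.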

EDIT 1: Users can also maintain their own alias file through the HOSTALIASES environment variable. Each line of that file maps an alias to a host name, and the file is checked first, before /etc/hosts is read.

Example:

echo "fb  www.fb.com" >> ~/my_hosts
echo "export HOSTALIASES=~/my_hosts" >> ~/.bashrc
source ~/.bashrc 

Author: theseankelly

Updated on September 18, 2022

Comments

  • theseankelly
    theseankelly almost 2 years

    I'm baffled. I've got a linux backup script I used when I used to use linux as my main OS. Now that I've moved to windows, I want to keep using it under cygwin. I've ported it over, but am seeing a peculiar issue.

    As you'll see in the code below, basically I create a folder structure /device/backups/machinename/. I've generalized it so it determines machine name via hostname. If the folder structure doesn't exist, it creates it. What I'm seeing is that the script is generally working, but it occasionally likes to create a duplicate machine name folder with an odd square character after it. This character shows as a question mark in cygwin. So, I'd see: /device/backups/machinename and /device/backups/machinename? at the same time. Cygwin seems to get confused between these two folders, sometimes backing up to the first and sometimes backing up to the second. It also doesn't create this folder consistently, but if I let things run every day for a week, it'll show up.

    Also note it's designed to run on a per folder basis, folder names passed in as arguments. Keeps a week of archives in the format FolderName.0.tar.gz, FolderName.1.tar.gz, etc.

    I'm going to try to work around it by hard coding the machine name for now, but I'm really interested in figuring out what the problem is. Here's my script's source:

    #!/bin/bash
    #Backup Docs 
    
    for FOLDER in $@
    do
    
    # Location of folder to be backed up
    FOLDERLOCATION="/home/sean"
    # Mount point of the backup device
    DEVICE="/cygdrive/f"
    # Hostname of the machine being backed up
    HOSTNAME=`hostname`
          # NOTE: when I originally posted this question, the above line read:
          # HOSTNAME=`txtmsgbreakup`
          # which doesn't make any sense.  I failed at changing the hard-coded solution back 
          # to the original command that produces the string "txtmsgbreakup" (my system's name)
    
    BACKUPFOLDER="$FOLDERLOCATION/$FOLDER/"
    BACKUPDEST="$DEVICE/backups/$HOSTNAME/$FOLDER/"
    
    # Check to see if device is mounted
    if [ -d $DEVICE ]
    then
        # Create directory if necessary
        if [ ! -d $BACKUPDEST ]; then
            mkdir -p $BACKUPDEST
        fi
    
        # Capture before time for logging
        before=$(date +%s)
    
        # First, tar up the old into file named after day of week
        DOW=`date +%w`
        FILENAME="$DEVICE/backups/$HOSTNAME/$FOLDER.$(( ($DOW+6)%7 )).tar.gz"
        if [ -e $FILENAME ]; then
            rm $FILENAME
        fi
        tar -czPf $FILENAME $BACKUPDEST
    
        # Now perform the backup
        rsync -a --del --ignore-errors $BACKUPFOLDER $BACKUPDEST
        after=$(date +%s)
    
        # Calculate how long the backup took
        elapsed_seconds=$(($after-$before))
        es=$((elapsed_seconds % 60))
        em=$(( (elapsed_seconds / 60) % 60 ))
        eh=$((elapsed_seconds / 3600 ))
    
        # Write it all to the system log
        echo "$(date) - $BACKUPFOLDER backed up.  Elapsed time:  $(printf '%02d:%02d:%02d' $eh $em $es)" >> /var/log/backup
    
    else
        #External is not mounted if this branch is executed, log this.  
        echo "$(date) - $BACKUPFOLDER not backed up:  Backup device not mounted." >> /var/log/backup
    fi
    
    done
    
    • theseankelly
      theseankelly over 12 years
      I should add: Backup device is an external HDD formatted NTFS.
    • Gordon Davisson
      Gordon Davisson over 12 years
      ls -b should also show what the weird character is. Since you're on Windows, I'd guess it's a carriage return (\r) since Windows tends to use \r\n to terminate lines. If you've edited the script with a Windows editor, you'll need to purge the \r's from it. Another possibility is that txtmsgbreakup produces Windows-style output, in which case you'll need to clean its output before using it.
    • Keith Thompson
      Keith Thompson over 12 years
      ls | cat -A should be enough to show you what the funny character is. ls usually renders non-printable characters as ?, but if its output is redirected it shows them literally; cat -A translates an ASCII CR character to ^M (to pick a decidedly non-random example).
    • theseankelly
      theseankelly over 12 years
      Made a stupid mistake changing the script back to its original form. I had solved the problem by hardcoding HOSTNAME="txtmsgbreakup" rather than HOSTNAME=`hostname`. I changed it back to HOSTNAME=`txtmsgbreakup`, which is invalid. The script never ran like that; it was only an error in what I posted, so that's not the problem.
    • theseankelly
      theseankelly over 12 years
      I also suspected a CR, and I'll take a look to see what the actual character is soon. However, even if it IS a CR, why would it only happen on rare occasion?
  • shellter
    shellter over 12 years
    Is this part of a startup script, so that the device is still mounting when the script starts? (Not likely, but just an idea.) Running ls -d ${BACKUPFOLDER}* | od -c may help by showing what the odd-ball character is. Good luck.
  • theseankelly
    theseankelly over 12 years
    Yeah, I'm a moron. I had changed my script to hardcode "txtmsgbreakup" as you suggest, but changed it back to the original for this example. I changed the quotes to ticks, but forgot to change the string to "hostname". So that line is actually hostname and not txtmsgbreakup, which evaluates to "txtmsgbreakup" on my particular machine. Sorry. I'll edit the original question too.
  • user
    user almost 11 years
    +1 for going to the source (literally) and the history lesson. Would upvote this twice if I could.