How does Windows determine/handle the DOS short name of any given file?

Solution 1

The short filename is created with the file. The algorithm works like this (usually, but see moocha's reply):

counter = 1
extension = first_3_characters(extension_of(filename))
stripped_filename = strip_dots(strip_non_ascii_characters(filename))
shortfn = first_6_characters(stripped_filename)
while (file_exists(shortfn + "~" + counter + "." + extension)) {
    increment counter by 1
    if counter gained a digit, shorten shortfn by 1 character
    /* e.g. if shortf~9.txt is taken, try short~10.txt next */
}
short_name = shortfn + "~" + counter + "." + extension

This means that once the file is created, it will keep its short name until it's deleted.

As soon as the file is deleted, the short name may be used again.
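The pseudocode above can be sketched in Python roughly as follows. This is only an illustration of the scheme described in this answer, not the actual driver logic: the names `strip_name`, `short_name`, and `taken` are made up for the example, the uppercase output is assumed from what DIR /x prints, and a real file system may deviate (see the other answers).

```python
import os


def strip_name(text):
    """Drop dots, spaces, and non-ASCII characters, then uppercase."""
    return "".join(ch for ch in text if ch.isascii() and ch not in ". ").upper()


def short_name(long_name, taken):
    """Return the first free 8.3-style name, given the names already taken."""
    stem, ext = os.path.splitext(long_name)
    base = strip_name(stem)
    ext = strip_name(ext)[:3]  # the extension is cut to 3 characters
    counter = 1
    while True:
        suffix = f"~{counter}"
        # The base shrinks as the counter gains digits, so base + suffix
        # never exceeds 8 characters.
        candidate = base[:8 - len(suffix)] + suffix
        if ext:
            candidate += "." + ext
        if candidate not in taken:
            return candidate
        counter += 1


# Hypothetical directory state: the set of short names currently in use.
taken = set()
for name in ("alongfilename1.txt", "alongfilename3.txt", "alongfilename2.txt"):
    taken.add(short_name(name, taken))

taken.discard("ALONGF~1.TXT")  # deleting a file frees its short name
reused = short_name("alongfilename4.txt", taken)
```

Under these assumptions the three files get ~1, ~2, ~3 in creation order, and the next file created after the deletion picks up the freed ~1 slot.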

If you move the file somewhere else, it may get a new short name. For example, suppose you move c:\somefilewithlongname.txt ("c:\somefi~1.txt") to d:\stuff\somefilewithlongname.txt. If d:\stuff already contains somefileelse.txt ("d:\stuff\somefi~1.txt"), the short name of the moved file will be somefi~2.txt. It seems that the short name is only persistent within a given directory on a given machine.

So: the short filenames will be generated by the filesystem, usually by the method outlined above. It is better to assume that short filenames are not persistent, as c:\longfi~1.txt on one machine might be "c:\longfilename.txt", whereas on another it might be "c:\longfish_story.txt"; also, when a file is deleted, the short name is immediately available again.
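Since the short name cannot be predicted reliably, one option is to ask the file system for it at run time instead of computing it. The sketch below calls the Win32 function GetShortPathNameW through ctypes; the wrapper name `get_short_path` and the choice to return None on non-Windows platforms are assumptions made for this example. On a volume where short names are not generated, the call may simply hand back the long path.

```python
import sys


def get_short_path(path):
    """Return the short (8.3) form of an existing path, or None off Windows."""
    if sys.platform != "win32":
        return None  # short names are a Windows file system feature

    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    get_short = kernel32.GetShortPathNameW
    get_short.argtypes = [wintypes.LPCWSTR, wintypes.LPWSTR, wintypes.DWORD]
    get_short.restype = wintypes.DWORD

    size = 260
    while True:
        buf = ctypes.create_unicode_buffer(size)
        needed = get_short(path, buf, size)
        if needed == 0:
            # The call failed, e.g. because the path does not exist.
            raise ctypes.WinError(ctypes.get_last_error())
        if needed < size:
            return buf.value
        size = needed + 1  # buffer was too small, retry with more room
```

An installer custom action could call something like `get_short_path(r"c:\longfilename.txt")` on the target machine rather than shipping a hard-coded short name.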

Solution 2

If I were you, I would never rely on any version of any file system driver (be it Microsoft's or another OS's) to be consistent about the algorithm it uses to generate short file names. The exact behavior of the Microsoft Fastfat and NTFS drivers is not "officially" documented (except as somewhat high-level overviews) and is therefore not part of the API contract. What works today might not work tomorrow if you update the driver.

In addition, there is absolutely no requirement that short names contain tilde characters - see for example this post by Raymond Chen.

There's a treasure trove of info to be found about this topic in the MSDN blogs.

Also, do not rely on the sole presence of alphanumerical characters. Look at the Linux VFAT driver which says, for example, that any combination of uppercase letters, digits, and the following characters is valid: $ % ' ` - @ { } ~ ! # ( ) & _ ^. NTFS will operate in compatibility mode with that...
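To make the point concrete, here is a small checker built only from the character list quoted above. It is a sketch: the name `is_valid_83` and the 8-plus-3 length limits are assumptions for the example, and it does not cover reserved device names or whatever else a particular driver enforces.

```python
import string

# Characters listed above as valid, in addition to uppercase letters and digits.
VALID_CHARS = set(string.ascii_uppercase + string.digits + "$%'`-@{}~!#()&_^")


def is_valid_83(name):
    """Check the shape BASE[.EXT] with 1-8 base and 1-3 extension characters."""
    base, dot, ext = name.partition(".")
    if not 1 <= len(base) <= 8:
        return False
    if dot and not 1 <= len(ext) <= 3:
        return False
    return all(ch in VALID_CHARS for ch in base + ext)
```

Note that `is_valid_83("A&B(1)!.{X}")` is true even though the name contains no tilde and few alphanumeric characters, so code that detects short names by looking for "~1" will miss names like this.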

Solution 3

I believe MS-DOS stores the association between the long and the short name in a per-directory file.

It does not depend on the date/time.

If you move your files into a new directory, the numbering is reset and the algorithm mentioned by Piskvor is applied again.

In the new directory (after a move), you will get:

ALONGF~1.TXT alongfilename1.txt
ALONGF~2.TXT alongfilename2.txt
ALONGF~3.TXT alongfilename3.txt

even though alongfilename2.txt was originally created third.
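The renumbering after a move can be simulated with a toy model in which each directory keeps its own table of short names. Everything here (`Directory`, `add`, `short_of`, `move`) is hypothetical and assumes plain ASCII names with an extension; it only mirrors the behavior described in this thread.

```python
class Directory:
    """Toy model of one directory's short-name table."""

    def __init__(self):
        self.entries = {}  # short name -> long name

    def add(self, long_name):
        """Assign the first free ~N short name within this directory."""
        stem, _, ext = long_name.upper().rpartition(".")
        counter = 1
        while True:
            suffix = f"~{counter}"
            short = f"{stem[:8 - len(suffix)]}{suffix}.{ext[:3]}"
            if short not in self.entries:
                self.entries[short] = long_name
                return short
            counter += 1

    def short_of(self, long_name):
        return next(s for s, n in self.entries.items() if n == long_name)


def move(long_name, src, dst):
    """Remove the entry from src and let dst assign a fresh short name."""
    del src.entries[src.short_of(long_name)]
    return dst.add(long_name)


src, dst = Directory(), Directory()
for name in ("alongfilename1.txt", "alongfilename3.txt", "alongfilename2.txt"):
    src.add(name)
before = src.short_of("alongfilename2.txt")  # created third

for name in ("alongfilename1.txt", "alongfilename2.txt", "alongfilename3.txt"):
    move(name, src, dst)
after = dst.short_of("alongfilename2.txt")  # moved second
```

In this model the short name depends only on the order in which names arrive in a directory, which is why the same file ends up with a different number after the move.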

Author by

Matt Refghi

Updated on June 05, 2022

Comments

  • Matt Refghi almost 2 years

    I have a folder with these files:

    alongfilename1.txt <--- created first
    alongfilename3.txt <--- created second
    

    When I run DIR /x in command prompt, I see these short names assigned:

    ALONGF~1.TXT alongfilename1.txt
    ALONGF~2.TXT alongfilename3.txt
    

    Now, if I add another file:

    alongfilename1.txt 
    alongfilename2.txt <--- created third
    alongfilename3.txt
    

    I see this:

    ALONGF~1.TXT alongfilename1.txt
    ALONGF~3.TXT alongfilename2.txt
    ALONGF~2.TXT alongfilename3.txt
    

    Fine. It seems to be assigning the "~#" according to the date/time that I created the file. Is this correct?

    Now, if I delete "alongfilename1.txt", the other two files keep their short names.

    ALONGF~3.TXT alongfilename2.txt
    ALONGF~2.TXT alongfilename3.txt
    

    When will that ID (in this case, ~1) be released for use in another short name? Will it ever?

    Also, is it possible that a file on my machine has a short name of X, whereas the same file has a short name of Y on another machine? I'm particularly concerned for installations whose custom actions utilize DOS short names.

    Thanks, guys.

  • Cem Kalyoncu over 14 years
    Also spaces are stripped and extension is shortened to 3 chars
  • littlebroccoli almost 13 years
    Yes, and volumes provided by Samba will not use predictable short names.