Unicode filename to python subprocess.call()


Solution 1

I found a workaround. It's a bit messy, but it works.

subprocess.call passes the text to the terminal in its own encoding, which may or may not be the one the terminal expects. To keep the code portable, you need to determine the machine's encoding at runtime.

The following

import subprocess
import sys
notepad = u'C://Notepad.exe'
subprocess.call([notepad.encode(sys.getfilesystemencoding())])

looks up the filesystem encoding at runtime and encodes the argument with it before handing it to subprocess.call.
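As a rough sketch of the approach above for Python 2 (which this thread appears to target), the encoding step can be wrapped in a small helper. The name encode_for_subprocess is hypothetical and not part of any library. It also checks that the encoded bytes decode back to the original text, because the question reports that mbcs can silently turn nèw.txt into new.txt.

```python
import sys


def encode_for_subprocess(args, encoding=None):
    """Encode text arguments to bytes (hypothetical helper, not a library API).

    Defaults to the filesystem encoding reported by the interpreter.
    """
    encoding = encoding or sys.getfilesystemencoding()
    text_type = type(u"")  # unicode on Python 2, str on Python 3
    encoded = []
    for arg in args:
        if isinstance(arg, text_type):
            data = arg.encode(encoding)  # may raise UnicodeEncodeError
            # Some codecs substitute look-alike characters instead of
            # failing, so confirm that nothing was lost on the way.
            if data.decode(encoding) != arg:
                raise ValueError("%r is not representable in %s" % (arg, encoding))
            encoded.append(data)
        else:
            # Already a byte string: pass it through unchanged.
            encoded.append(arg)
    return encoded
```

A call such as subprocess.call(encode_for_subprocess([notepad, path])) would then either receive correctly encoded bytes or fail with ValueError (UnicodeEncodeError is a subclass of it). A failure means the filesystem encoding cannot represent the name, so encoding alone will not help.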

As a side note, I have also found that if you compose a string with the current directory, using

os.getcwd()

Python (or the OS, I don't know which) will mess up directories with accented characters. To prevent this, I have found the following to work:

os.getcwd().decode(sys.getfilesystemencoding())

Which is very similar to the solution above.
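A minimal sketch of that decoding step, written so it also runs where os.getcwd() already returns text. The name unicode_cwd is made up for illustration. On Python 2, os.getcwdu() is a built-in alternative that returns a unicode object directly.

```python
import os
import sys


def unicode_cwd():
    """Return the current directory as text (hypothetical helper)."""
    cwd = os.getcwd()
    if isinstance(cwd, bytes):
        # Python 2: os.getcwd() returns a byte string, so decode it with
        # the filesystem encoding before joining it with unicode paths.
        cwd = cwd.decode(sys.getfilesystemencoding())
    return cwd
```

Joining the result with a unicode filename, for example os.path.join(unicode_cwd(), u'nèw.txt'), then avoids the implicit ASCII decode that mixing byte strings and unicode triggers on Python 2.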

Hope it helps.

Solution 2

If your file exists, you can use its short filename (also known as the 8.3 name). This name is defined for existing files and should cause no trouble for non-Unicode-aware programs when passed as an argument.

One way to obtain one (needs Pywin32 to be installed):

import win32api
short_path = win32api.GetShortPathName(unicode_path)

Alternatively, you can also use ctypes:

import ctypes
import ctypes.wintypes

ctypes.windll.kernel32.GetShortPathNameW.argtypes = [
    ctypes.wintypes.LPCWSTR, # lpszLongPath
    ctypes.wintypes.LPWSTR, # lpszShortPath
    ctypes.wintypes.DWORD # cchBuffer
]
ctypes.windll.kernel32.GetShortPathNameW.restype = ctypes.wintypes.DWORD

buf = ctypes.create_unicode_buffer(1024) # adjust buffer size, if necessary
ctypes.windll.kernel32.GetShortPathNameW(unicode_path, buf, len(buf))

short_path = buf.value
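The ctypes variant can be wrapped in a function that also handles the two cases the snippet above leaves open: a buffer that is too small and a failed call. This is only a sketch. The name get_short_path and the initial buffer size of 260 are assumptions made for illustration. It relies on the documented behaviour of GetShortPathNameW, which returns 0 on failure and returns the required buffer size when the buffer passed in is too small.

```python
import ctypes
import sys


def get_short_path(long_path):
    """Return the 8.3 short path for an existing file (hypothetical helper)."""
    if sys.platform != "win32":
        raise OSError("GetShortPathNameW is only available on Windows")
    from ctypes import wintypes  # imported here so the module loads elsewhere

    func = ctypes.windll.kernel32.GetShortPathNameW
    func.argtypes = [wintypes.LPCWSTR, wintypes.LPWSTR, wintypes.DWORD]
    func.restype = wintypes.DWORD

    size = 260  # assumed starting size; grown below if it is too small
    while True:
        buf = ctypes.create_unicode_buffer(size)
        needed = func(long_path, buf, size)
        if needed == 0:
            # The call failed, for example because the file does not exist.
            raise ctypes.WinError()
        if needed > size:
            # Buffer too small: retry with the size the API asked for.
            size = needed
            continue
        return buf.value
```

With the paths from the question, the call might look like subprocess.call([u'c:\\windows\\notepad.exe', get_short_path(u'c:\\temp\\nèw.txt')]). Passing the arguments as a list also avoids joining the program name and the filename into one string.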

Solution 3

It appears that to make this work, the subprocess code would have to be modified to use the wide-character version of CreateProcess (CreateProcessW). PEP 277 discusses the same change made for the file object: http://www.python.org/dev/peps/pep-0277/ Perhaps you could research the Windows C calls and propose a similar change for subprocess.

Author by

otrov

Updated on June 13, 2022

Comments

  • otrov
    otrov about 2 years

    I'm trying to run subprocess.call() with a Unicode filename. Here is a simplified version of the problem:

    n = u'c:\\windows\\notepad.exe '
    f = u'c:\\temp\\nèw.txt'
    
    subprocess.call(n + f)
    

    which raises the well-known error:

    UnicodeEncodeError: 'ascii' codec can't encode character u'\xe8'

    Encoding to utf-8 produces the wrong filename, and mbcs passes the filename as new.txt, without the accent.

    I can't read any more on this confusing subject; I keep going in circles. I have found answers here to many different problems in the past, so I thought I'd join and ask for help myself.

    Thanks