起源:
所下载视频,有音视频分离者,需要合并起来,采用python之subprocess.Popen()调用ffmpeg实现。python版本为2.7.13,而音视频文件路径,有unicode字符者,合并失败。
此问题由来已久,终于不忍受,用尽工夫寻其机现,终于寻得蛛丝蚂迹,完成其修复。
其原因为:python 2.7.x中subprocess.Popen()函数,最终调用了kernel32.dll中的CreateProcess函数Ansi版本CreateProcessA,传非Ansi参数给它会被它拒绝,而触发异常。
测试代码如下:
# encoding: utf-8 from __future__ import unicode_literals import subprocess file_path = r'D:Percy Faith [CA US] by chkjns ♫ 175 songsv.txt' args = u'notepad "%s"' % file_path try: p = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE) stdout, stderr = p.communicate() if p.returncode != 0: stderr = stderr.decode('utf-8', 'replace') print stderr except Exception as e: print e.message
其异常为UnicodeEncodeError,表现字串为:'ascii' codec can't encode character u'u266b' in position 42: ordinal not in range(128)
1、Python 2.x先天缺陷
此问题百度不到有效答案,于是上stackoverlow,一路摸去,找到了python官方对此bug描述,其链接如下:
Issue 19264: subprocess.Popen doesn't support unicode on Windows - Python tracker
七嘴八舌众说纷纭,但总算大致捋清原委,即上述其调用了Ansi版本的CreateProcess所致。循此原因,是否替换为Unicode函数CreateProcessW即可?
最后一条回复:
msg289664 - (view) Author: Valentin LAB (Valentin LAB) Date: 2017-03-15 10:39
给出了折中方案,其方案即如所思,做函数层替换。
2、win_subprocess.py
Valentin LAB这哥们,就在github上开放了源码,解决此问题。其核心为win_subprocess.py,内容如下:
(其代码中有用os,他却忘记import,贴代码时已做修正)
## issue: https://bugs.python.org/issue19264 import ctypes import subprocess import _subprocess import os from ctypes import byref, windll, c_char_p, c_wchar_p, c_void_p, Structure, sizeof, c_wchar, WinError from ctypes.wintypes import BYTE, WORD, LPWSTR, BOOL, DWORD, LPVOID, HANDLE ## ## Types ## CREATE_UNICODE_ENVIRONMENT = 0x00000400 LPCTSTR = c_char_p LPTSTR = c_wchar_p LPSECURITY_ATTRIBUTES = c_void_p LPBYTE = ctypes.POINTER(BYTE) class STARTUPINFOW(Structure): _fields_ = [ ("cb", DWORD), ("lpReserved", LPWSTR), ("lpDesktop", LPWSTR), ("lpTitle", LPWSTR), ("dwX", DWORD), ("dwY", DWORD), ("dwXSize", DWORD), ("dwYSize", DWORD), ("dwXCountChars", DWORD), ("dwYCountChars", DWORD), ("dwFillAtrribute", DWORD), ("dwFlags", DWORD), ("wShowWindow", WORD), ("cbReserved2", WORD), ("lpReserved2", LPBYTE), ("hStdInput", HANDLE), ("hStdOutput", HANDLE), ("hStdError", HANDLE), ] LPSTARTUPINFOW = ctypes.POINTER(STARTUPINFOW) class PROCESS_INFORMATION(Structure): _fields_ = [ ("hProcess", HANDLE), ("hThread", HANDLE), ("dwProcessId", DWORD), ("dwThreadId", DWORD), ] LPPROCESS_INFORMATION = ctypes.POINTER(PROCESS_INFORMATION) class DUMMY_HANDLE(ctypes.c_void_p): def __init__(self, *a, **kw): super(DUMMY_HANDLE, self).__init__(*a, **kw) self.closed = False def Close(self): if not self.closed: windll.kernel32.CloseHandle(self) self.closed = True def __int__(self): return self.value CreateProcessW = windll.kernel32.CreateProcessW CreateProcessW.argtypes = [ LPCTSTR, LPTSTR, LPSECURITY_ATTRIBUTES, LPSECURITY_ATTRIBUTES, BOOL, DWORD, LPVOID, LPCTSTR, LPSTARTUPINFOW, LPPROCESS_INFORMATION, ] CreateProcessW.restype = BOOL ## ## Patched functions/classes ## def CreateProcess(executable, args, _p_attr, _t_attr, inherit_handles, creation_flags, env, cwd, startup_info): """Create a process supporting unicode executable and args for win32 Python implementation of CreateProcess using CreateProcessW for Win32 """ si = STARTUPINFOW( dwFlags=startup_info.dwFlags, wShowWindow=startup_info.wShowWindow, cb=sizeof(STARTUPINFOW), ## XXXvlab: not sure of the casting here to ints. hStdInput=int(startup_info.hStdInput), hStdOutput=int(startup_info.hStdOutput), hStdError=int(startup_info.hStdError), ) wenv = None if env is not None: ## LPCWSTR seems to be c_wchar_p, so let's say CWSTR is c_wchar env = (unicode("").join([ unicode("%s=%s ") % (k, v) for k, v in env.items()])) + unicode("