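Given a string, you sometimes want to keep only the meaningful content, such as Chinese characters, letters, and digits. To strip special symbols, or to split the string on a set of delimiters, you can use re.split. For example: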
=============================== RESTART: Shell ===============================
>>> s = '''Python 3.6.1 (v3.6.1:69c0db5, Mar 21 2017, 18:41:36) [MSC v.1900 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.'''
>>> s
'Python 3.6.1 (v3.6.1:69c0db5, Mar 21 2017, 18:41:36) [MSC v.1900 64 bit (AMD64)] on win32\nType "copyright", "credits" or "license()" for more information.'
>>> # Now split s, dropping the extra characters and keeping only the useful ones: digits and letters.
>>>
>>> import re
>>> x = re.split(r'[.(:,[)" ]', s)  # use the special symbols and the space as split delimiters
>>> x
['Python', '3', '6', '1', '', 'v3', '6', '1', '69c0db5', '', 'Mar', '21', '2017', '', '18', '41', '36', '', '', 'MSC', 'v', '1900', '64', 'bit', '', 'AMD64', ']', 'on', 'win32\nType', '', 'copyright', '', '', '', 'credits', '', 'or', '', 'license', '', '', '', 'for', 'more', 'information', '']
>>>
>>> words = [i for i in x if i]
>>> words
['Python', '3', '6', '1', 'v3', '6', '1', '69c0db5', 'Mar', '21', '2017', '18', '41', '36', 'MSC', 'v', '1900', '64', 'bit', 'AMD64', ']', 'on', 'win32\nType', 'copyright', 'credits', 'or', 'license', 'for', 'more', 'information']
>>>
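Note that the filtered list above still contains a stray ']' and the element 'win32\nType', because neither ']' nor the newline is in the character class. One possible way around this, shown here only as a sketch, is to describe what to keep instead of what to remove: split on runs of anything that is not an ASCII letter or digit, or collect the alphanumeric runs directly with re.findall. The helper names split_tokens and find_tokens are made up for this illustration.

```python
import re

s = ('Python 3.6.1 (v3.6.1:69c0db5, Mar 21 2017, 18:41:36) '
     '[MSC v.1900 64 bit (AMD64)] on win32\n'
     'Type "copyright", "credits" or "license()" for more information.')


def split_tokens(text):
    """Split on runs of non-alphanumeric characters and drop empty strings.

    Hypothetical helper for this example, not part of the re module.
    """
    return [t for t in re.split(r'[^0-9A-Za-z]+', text) if t]


def find_tokens(text):
    """Collect runs of ASCII letters and digits directly (also hypothetical)."""
    return re.findall(r'[0-9A-Za-z]+', text)


words = split_tokens(s)
print(words)
```

Both helpers should give the same list for this input, with ']' gone and 'win32' and 'Type' as separate elements. The class [0-9A-Za-z] covers ASCII only; to keep Chinese characters as well, \w could be used instead, since it matches Unicode word characters for str patterns in Python 3, though it also matches the underscore.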
Use the S.join() method to concatenate strings:
>>> # string concatenation
>>>
>>> help(str.join)
Help on method_descriptor:

join(...)
    S.join(iterable) -> str

    Return a string which is the concatenation of the strings in the
    iterable.  The separator between elements is S.

>>> l = list(range(1,9))
>>>
>>> s = "".join([str(i) for i in l])
>>> s
'12345678'
>>> s = "".join(str(i) for i in l)
>>> s
'12345678'
>>>
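As the help text says, the string that join is called on becomes the separator between elements, and every element must already be a string. The short sketch below illustrates both points; the variable names are chosen only for this example.

```python
parts = ['Python', '3', '6', '1']

# The string join() is called on is placed between the elements.
dotted = ".".join(parts[1:])   # version digits joined with dots
spaced = " ".join(parts)       # all parts joined with single spaces

# join() accepts only strings; passing integers raises TypeError.
numbers = list(range(1, 9))
try:
    "".join(numbers)
    join_error = None
except TypeError as exc:
    join_error = exc

# Converting each element with str first makes the call work.
digits = "".join(map(str, numbers))

print(dotted, spaced, digits, sep="\n")
```

This is why the transcript above wraps each element in str(i) before joining: the list holds integers, not strings.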