Python data types: strings
Data type: string (str)
Characteristics: an ordered, immutable, iterable data type
Official help documentation
- class str(object)
- | str(object='') -> str
- | str(bytes_or_buffer[, encoding[, errors]]) -> str
- |
- | Create a new string object from the given object. If encoding or
- | errors is specified, then the object must expose a data buffer
- | that will be decoded using the given encoding and error handler.
- | Otherwise, returns the result of object.__str__() (if defined)
- | or repr(object).
- | encoding defaults to sys.getdefaultencoding().
- | errors defaults to 'strict'.
- |
- | Methods defined here:
- | capitalize(...)
- | S.capitalize() -> str
- |
- | Return a capitalized version of S, i.e. make the first character
- | have upper case and the rest lower case.
- |
- | casefold(...)
- | S.casefold() -> str
- |
- | Return a version of S suitable for caseless comparisons.
- |
- | center(...)
- | S.center(width[, fillchar]) -> str
- |
- | Return S centered in a string of length width. Padding is
- | done using the specified fill character (default is a space)
- |
- | count(...)
- | S.count(sub[, start[, end]]) -> int
- |
- | Return the number of non-overlapping occurrences of substring sub in
- | string S[start:end]. Optional arguments start and end are
- | interpreted as in slice notation.
- |
- | encode(...)
- | S.encode(encoding='utf-8', errors='strict') -> bytes
- |
- | Encode S using the codec registered for encoding. Default encoding
- | is 'utf-8'. errors may be given to set a different error
- | handling scheme. Default is 'strict' meaning that encoding errors raise
- | a UnicodeEncodeError. Other possible values are 'ignore', 'replace' and
- | 'xmlcharrefreplace' as well as any other name registered with
- | codecs.register_error that can handle UnicodeEncodeErrors.
- |
- | endswith(...)
- | S.endswith(suffix[, start[, end]]) -> bool
- | Return True if S ends with the specified suffix, False otherwise.
- | With optional start, test S beginning at that position.
- | With optional end, stop comparing S at that position.
- | suffix can also be a tuple of strings to try.
- |
- | expandtabs(...)
- | S.expandtabs(tabsize=8) -> str
- |
- | Return a copy of S where all tab characters are expanded using spaces.
- | If tabsize is not given, a tab size of 8 characters is assumed.
- |
- | find(...)
- | S.find(sub[, start[, end]]) -> int
- | Return the lowest index in S where substring sub is found,
- | such that sub is contained within S[start:end]. Optional
- | arguments start and end are interpreted as in slice notation.
- |
- | Return -1 on failure.
- |
- | format(...)
- | S.format(*args, **kwargs) -> str
- | Return a formatted version of S, using substitutions from args and kwargs.
- | The substitutions are identified by braces ('{' and '}').
- |
- | format_map(...)
- | S.format_map(mapping) -> str
- | Return a formatted version of S, using substitutions from mapping.
- | The substitutions are identified by braces ('{' and '}').
- |
- | index(...)
- | S.index(sub[, start[, end]]) -> int
- | Return the lowest index in S where substring sub is found,
- | such that sub is contained within S[start:end]. Optional
- | arguments start and end are interpreted as in slice notation.
- |
- | Raises ValueError when the substring is not found.
- |
- | isalnum(...)
- | S.isalnum() -> bool
- | Return True if all characters in S are alphanumeric
- | and there is at least one character in S, False otherwise.
- |
- | isalpha(...)
- | S.isalpha() -> bool
- | Return True if all characters in S are alphabetic
- | and there is at least one character in S, False otherwise.
- |
- | isdecimal(...)
- | S.isdecimal() -> bool
- | Return True if there are only decimal characters in S,
- | False otherwise.
- |
- | isdigit(...)
- | S.isdigit() -> bool
- | Return True if all characters in S are digits
- | and there is at least one character in S, False otherwise.
- |
- | isidentifier(...)
- | S.isidentifier() -> bool
- | Return True if S is a valid identifier according
- | to the language definition.
- |
- | Use keyword.iskeyword() to test for reserved identifiers
- | such as "def" and "class".
- |
- | islower(...)
- | S.islower() -> bool
- | Return True if all cased characters in S are lowercase and there is
- | at least one cased character in S, False otherwise.
- |
- | isnumeric(...)
- | S.isnumeric() -> bool
- | Return True if there are only numeric characters in S,
- | False otherwise.
- |
- | isprintable(...)
- | S.isprintable() -> bool
- | Return True if all characters in S are considered
- | printable in repr() or S is empty, False otherwise.
- |
- | isspace(...)
- | S.isspace() -> bool
- | Return True if all characters in S are whitespace
- | and there is at least one character in S, False otherwise.
- |
- | istitle(...)
- | S.istitle() -> bool
- | Return True if S is a titlecased string and there is at least one
- | character in S, i.e. upper- and titlecase characters may only
- | follow uncased characters and lowercase characters only cased ones.
- | Return False otherwise.
- |
- | isupper(...)
- | S.isupper() -> bool
- |
- | Return True if all cased characters in S are uppercase and there is
- | at least one cased character in S, False otherwise.
- |
- | join(...)
- | S.join(iterable) -> str
- | Return a string which is the concatenation of the strings in the
- | iterable. The separator between elements is S.
- |
- | ljust(...)
- | S.ljust(width[, fillchar]) -> str
- | Return S left-justified in a Unicode string of length width. Padding is
- | done using the specified fill character (default is a space).
- |
- | lower(...)
- | S.lower() -> str
- | Return a copy of the string S converted to lowercase.
- |
- | lstrip(...)
- | S.lstrip([chars]) -> str
- | Return a copy of the string S with leading whitespace removed.
- | If chars is given and not None, remove characters in chars instead.
- |
- | partition(...)
- | S.partition(sep) -> (head, sep, tail)
- | Search for the separator sep in S, and return the part before it,
- | the separator itself, and the part after it. If the separator is not
- | found, return S and two empty strings.
- |
- | replace(...)
- | S.replace(old, new[, count]) -> str
- | Return a copy of S with all occurrences of substring
- | old replaced by new. If the optional argument count is
- | given, only the first count occurrences are replaced.
- |
- | rfind(...)
- | S.rfind(sub[, start[, end]]) -> int
- | Return the highest index in S where substring sub is found,
- | such that sub is contained within S[start:end]. Optional
- | arguments start and end are interpreted as in slice notation.
- | Return -1 on failure.
- |
- | rindex(...)
- | S.rindex(sub[, start[, end]]) -> int
- | Return the highest index in S where substring sub is found,
- | such that sub is contained within S[start:end]. Optional
- | arguments start and end are interpreted as in slice notation.
- |
- | Raises ValueError when the substring is not found.
- |
- | rjust(...)
- | S.rjust(width[, fillchar]) -> str
- | Return S right-justified in a string of length width. Padding is
- | done using the specified fill character (default is a space).
- |
- | rpartition(...)
- | S.rpartition(sep) -> (head, sep, tail)
- | Search for the separator sep in S, starting at the end of S, and return
- | the part before it, the separator itself, and the part after it. If the
- | separator is not found, return two empty strings and S.
- |
- | rsplit(...)
- | S.rsplit(sep=None, maxsplit=-1) -> list of strings
- | Return a list of the words in S, using sep as the
- | delimiter string, starting at the end of the string and
- | working to the front. If maxsplit is given, at most maxsplit
- | splits are done. If sep is not specified, any whitespace string
- | is a separator.
- |
- | rstrip(...)
- | S.rstrip([chars]) -> str
- | Return a copy of the string S with trailing whitespace removed.
- | If chars is given and not None, remove characters in chars instead.
- |
- | split(...)
- | S.split(sep=None, maxsplit=-1) -> list of strings
- | Return a list of the words in S, using sep as the
- | delimiter string. If maxsplit is given, at most maxsplit
- | splits are done. If sep is not specified or is None, any
- | whitespace string is a separator and empty strings are
- | removed from the result.
- |
- | splitlines(...)
- | S.splitlines([keepends]) -> list of strings
- | Return a list of the lines in S, breaking at line boundaries.
- | Line breaks are not included in the resulting list unless keepends
- | is given and true.
- |
- | startswith(...)
- | S.startswith(prefix[, start[, end]]) -> bool
- | Return True if S starts with the specified prefix, False otherwise.
- | With optional start, test S beginning at that position.
- | With optional end, stop comparing S at that position.
- | prefix can also be a tuple of strings to try.
- |
- | strip(...)
- | S.strip([chars]) -> str
- | Return a copy of the string S with leading and trailing
- | whitespace removed.
- | If chars is given and not None, remove characters in chars instead.
- |
- | swapcase(...)
- | S.swapcase() -> str
- | Return a copy of S with uppercase characters converted to lowercase
- | and vice versa.
- |
- | title(...)
- | S.title() -> str
- | Return a titlecased version of S, i.e. words start with title case
- | characters, all remaining cased characters have lower case.
- |
- | translate(...)
- | S.translate(table) -> str
- | Return a copy of the string S in which each character has been mapped
- | through the given translation table. The table must implement
- | lookup/indexing via __getitem__, for instance a dictionary or list,
- | mapping Unicode ordinals to Unicode ordinals, strings, or None. If
- | this operation raises LookupError, the character is left untouched.
- | Characters mapped to None are deleted.
- |
- | upper(...)
- | S.upper() -> str
- | Return a copy of S converted to uppercase.
- |
- | zfill(...)
- | S.zfill(width) -> str
- | Pad a numeric string S with zeros on the left, to fill a field
- | of the specified width. The string S is never truncated.
- |
- | ----------------------------------------------------------------------
- | Static methods defined here:
- |
- | maketrans(x, y=None, z=None, /)
- | Return a translation table usable for str.translate().
- |
- | If there is only one argument, it must be a dictionary mapping Unicode
- | ordinals (integers) or characters to Unicode ordinals, strings or None.
- | Character keys will be then converted to ordinals.
- | If there are two arguments, they must be strings of equal length, and
- | in the resulting dictionary, each character in x will be mapped to the
- | character at the same position in y. If there is a third argument, it
- | must be a string, whose characters will be mapped to None in the result.
String methods
44 methods in total
['capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
Creating strings
class str(object)
| str(object='') -> str
| str(bytes_or_buffer[, encoding[, errors]]) -> str
Creating a string with the built-in str() constructor
>>> str(123)
'123'
>>> str([12,34])
'[12, 34]'
A pair of single or double quotes creates a string; a pair of triple quotes creates a multi-line string
>>> a = 'china'
>>> a
'china'
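The note above mentions triple quotes but shows no example. A minimal sketch of the three literal forms (the variable names are only illustrative):

```python
# Triple quotes let a literal span several lines;
# the line break becomes part of the string.
poem = """first line
second line"""

# Single and double quotes are interchangeable; picking the other kind
# avoids escaping a quote character inside the text.
single = 'it is "quoted"'
double = "it's fine"

print(poem.count('\n'))   # number of line breaks stored in the string
print(single, double)
```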
Case conversion
All lowercase: str.lower(); all uppercase: str.upper()
>>> a = 'ChiNa'
>>> a.lower()
'china'
>>> a
'ChiNa'
>>> a.upper()
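'CHINA'
str.casefold(): converts the string to a lowercase form intended for caseless comparison. It is similar to lower(), which also handles non-ASCII letters, but casefold() is more aggressive: for example, it folds the German 'ß' to 'ss', whereas lower() leaves it unchanged.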
>>> 'B,b,cdEfg'.casefold()
'b,b,cdefg'
>>> 'ß'.casefold()
'ss'
>>> 'ß'.lower()
'ß'
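A minimal sketch of where the difference matters, assuming the goal is a case-insensitive comparison. The helper name caseless_equal is hypothetical, not part of str:

```python
def caseless_equal(left: str, right: str) -> bool:
    """Compare two strings while ignoring case (hypothetical helper)."""
    return left.casefold() == right.casefold()


# casefold() folds 'ß' to 'ss', so the two spellings compare equal.
print(caseless_equal('Straße', 'STRASSE'))

# lower() keeps 'ß', so the same comparison fails.
print('Straße'.lower() == 'STRASSE'.lower())
```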
s.swapcase(): swaps the case of every character in the string
>>> s = 'AbCdEFghijK'
>>> s.swapcase()
'aBcDefGHIJk'
s.capitalize(): uppercases the first character of the string and lowercases the rest
>>> s = 'james hsiao'
>>> s.capitalize()
'James hsiao'
s.title(): capitalizes the first letter of every word in the string
>>> s = 'james hsiao'
>>> s.title()
'James Hsiao'
isXXX tests: each returns a boolean
['isalnum', 'isalpha', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper']
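str.istitle(): returns True if every word in the string starts with an uppercase letter followed only by lowercase letters
str.isupper(): returns True if every cased character in the string is uppercase
str.islower(): returns True if every cased character in the string is lowercase
These test for lowercase, uppercase, and title case. The string must contain at least one cased character; otherwise the result is False.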
>>> print('Aa Bc'.istitle())
True
>>> print('Aa_Bc'.istitle())
True
>>> print('Aa bc'.istitle())
False
>>> print('Aa_bc'.istitle())
False
>>> print('A234A'.isupper())
True
>>> print('Aa'.isupper())
False
>>> print('a34'.islower())
True
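str.isdecimal(): returns True if the string is non-empty and contains only decimal characters (digits that can form base-10 numbers, such as the Arabic digits 0-9)
str.isnumeric(): returns True if the string is non-empty and contains only numeric characters (this includes Chinese numerals; bytes objects do not have this method)
str.isdigit(): returns True if all characters in the string are digits and there is at least one character (bytes objects also provide an isdigit() method)
>>> a = b'123'
>>> a.isdigit() # bytes also has isdigit()
True
>>> a.isdecimal() # raises an error
Traceback (most recent call last):
File "<pyshell#14>", line 1, in <module>
a.isdecimal()
AttributeError: 'bytes' object has no attribute 'isdecimal'
>>> a.isnumeric() # raises an error
Traceback (most recent call last):
File "<pyshell#20>", line 1, in <module>
a.isnumeric()
AttributeError: 'bytes' object has no attribute 'isnumeric'
>>> '一二三123'.isnumeric() # Chinese numerals count as numeric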
True
>>> 'IIII'.isnumeric()
False
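The three tests differ only on characters outside 0-9. A small sketch that puts them side by side; the classify helper and the sample characters are chosen here for illustration:

```python
def classify(ch: str):
    """Return (isdecimal, isdigit, isnumeric) for one character (illustrative helper)."""
    return ch.isdecimal(), ch.isdigit(), ch.isnumeric()


# '5' is an ordinary digit, '²' a superscript digit,
# '½' a fraction, and '一' the Chinese numeral one.
for ch in ['5', '²', '½', '一']:
    print(repr(ch), classify(ch))
```

In this sample each test accepts everything the previous one accepts and a little more: isdecimal() is the strictest and isnumeric() the most permissive.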
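str.isalpha(): returns True if all characters in the string are letters (Chinese characters count as letters) and there is at least one character; otherwise returns False.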
>>> '你好ni'.isalpha()
True
>>> '你好ni234'.isalpha()
False
>>> '你好ni&%$'.isalpha()
False
str.isalnum(): returns True if the string has at least one character and all characters are letters or digits; otherwise returns False.
>>> 'asdfghjkhl'.isalnum()
True
>>> 'asdfg.. hjkhl'.isalnum()
False
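str.isspace(): tests whether the string consists only of whitespace characters (spaces, tabs, newlines, and so on)
str.isprintable(): tests whether every character is printable (tabs and newlines are not printable, but a space is)
str.isidentifier(): tests whether the string is a valid identifier
Testing for whitespace. A string with no characters at all does not count as whitespace: empty is not the same as blank.
>>> print(' '.isspace())
True
>>> print('\t'.isspace())
True
>>> print('\n'.isspace())
True
>>> print(''.isspace())
False
>>> print('Aa BC'.isspace())
False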
----------
Testing for printable characters.
>>> print('\n'.isprintable())
False
>>> print('\t'.isprintable())
False
>>> print('acd'.isprintable())
True
>>> print(' '.isprintable())
True
>>> print(''.isprintable())
True
----------
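Testing whether a string is a valid identifier.
An identifier must start with a letter or an underscore and may contain only letters, digits, and underscores.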
>>> print('abc'.isidentifier())
True
>>> print('2abc'.isidentifier())
False
>>> print('abc2'.isidentifier())
True
>>> print('_abc2'.isidentifier())
True
>>> print('_abc_2'.isidentifier())
True
>>> print('_Abc_2'.isidentifier())
True
>>> print('Abc_2'.isidentifier())
True
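As the help text above notes, isidentifier() also returns True for reserved words such as "def" and "class". A minimal sketch that combines it with keyword.iskeyword(); the helper name is_usable_name is hypothetical:

```python
import keyword


def is_usable_name(name: str) -> bool:
    """True if name is a valid identifier and not a reserved word (hypothetical helper)."""
    return name.isidentifier() and not keyword.iskeyword(name)


print('class'.isidentifier())      # a keyword still passes isidentifier()
print(is_usable_name('class'))     # but is rejected here
print(is_usable_name('_abc_2'))
```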
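Padding
str.center(width[, fillchar]): centers the string in a field of length width, padded with the single character fillchar (a space by default). If width is less than or equal to the length of the string, the original string is returned.
>>> 'winner'.center(20)
'       winner       '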
>>> 'winner'.center(20,'*#')
Traceback (most recent call last):
File "<pyshell#48>", line 1, in <module>
'winner'.center(20,'*#')
TypeError: The fill character must be exactly one character long
>>> 'winner'.center(20,'=')
'=======winner======='
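str.ljust(width[, fillchar]): left-justifies the string in a field of length width, padded with the single character fillchar (a space by default). If width is less than or equal to the length of the string, the original string is returned.
>>> 'winner'.ljust(20)
'winner              '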
>>> 'winner'.ljust(20,'*#')
Traceback (most recent call last):
File "<pyshell#51>", line 1, in <module>
'winner'.ljust(20,'*#')
TypeError: The fill character must be exactly one character long
>>> 'winner'.ljust(20,'=')
'winner=============='
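str.rjust(width[, fillchar]): right-justifies the string in a field of length width, padded with the single character fillchar (a space by default). If width is less than or equal to the length of the string, the original string is returned.
>>> 'winner'.rjust(20)
'              winner'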
>>> 'winner'.rjust(20,'*#')
Traceback (most recent call last):
File "<pyshell#54>", line 1, in <module>
'winner'.rjust(20,'*#')
TypeError: The fill character must be exactly one character long
>>> 'winner'.rjust(20,'=')
'==============winner'
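str.zfill(width): pads the string S with '0' on the left until its length is width. If S starts with a sign (+ or -), the zeros are inserted after the sign, and the sign counts toward the length.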
>>> print('abc'.zfill(5))
00abc
>>> print('-abc'.zfill(5))
-0abc
>>> print('+abc'.zfill(5))
+0abc
>>> print('42'.zfill(5))
00042
>>> print('-42'.zfill(5))
-0042
>>> print('+42'.zfill(5))
+0042
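A common use of the padding methods is lining up columns of text. The sketch below assumes a tiny made-up data set; format_row and its width parameters are illustrative names, not a standard API:

```python
rows = [('apple', 3), ('kiwi', 12), ('banana', 105)]


def format_row(name: str, qty: int, name_width: int = 8, qty_width: int = 5) -> str:
    """Left-justify the name with dots and right-justify the quantity."""
    return name.ljust(name_width, '.') + str(qty).rjust(qty_width)


print('total'.center(13, '-'))   # header centered over the 13-column table
for name, qty in rows:
    print(format_row(name, qty))

# zfill() is handy for fixed-width numeric codes.
print(str(7).zfill(3))
```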
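Trimming: strip, lstrip, and rstrip
str.strip([chars]): removes the characters in chars from both ends. If chars is omitted or None, whitespace (spaces, tabs, newlines) is removed.
str.lstrip([chars]): removes the characters in chars from the left end. If chars is omitted or None, whitespace (spaces, tabs, newlines) is removed.
str.rstrip([chars]): removes the characters in chars from the right end. If chars is omitted or None, whitespace (spaces, tabs, newlines) is removed.
1. Removing a single character or whitespace.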
>>> ' spacious '.lstrip()
'spacious '
>>> ' spacious '.rstrip()
' spacious'
>>> 'spacious '.lstrip('s')
'pacious '
>>> 'spacious'.rstrip('s')
'spaciou'
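2. Removing any of a set of characters. chars is treated as a set of characters, not as a prefix or suffix: characters are stripped until one that is not in the set is reached.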
>>> print('www.example.com'.lstrip('cmowz.'))
example.com
>>> print('wwwz.example.com'.lstrip('cmowz.'))
example.com
>>> print('wwaw.example.com'.lstrip('cmowz.'))
aw.example.com
>>> print('www.example.com'.strip('cmowz.'))
example
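Because chars is a set of characters, lstrip() can remove more than an intended prefix. A minimal sketch of the pitfall and one way around it; drop_prefix is a hypothetical helper built only on startswith() and slicing:

```python
def drop_prefix(text: str, prefix: str) -> str:
    """Remove prefix once if text starts with it (hypothetical helper)."""
    if prefix and text.startswith(prefix):
        return text[len(prefix):]
    return text


# lstrip('www.') strips every leading 'w' and '.', including the 'w' of 'wow'.
print('www.wow.com'.lstrip('www.'))

# drop_prefix removes exactly the prefix 'www.' and nothing more.
print(drop_prefix('www.wow.com', 'www.'))
```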
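Substring search
str.count(sub[, start[, end]]): returns the number of non-overlapping occurrences of the substring sub. Optional start and end limit the search range; indexes start at 0 and end is exclusive.
>>> print('xyabxyxy'.count('xy'))
3
# 2, because the search starts at index 1, i.e. at 'y', so the range searched is 'yabxyxy'
>>> print('xyabxyxy'.count('xy',1))
2
# 1, because end is exclusive, so the range searched is 'yabxyx'
>>> print('xyabxyxy'.count('xy',1,7))
1
# 2, because the range searched is 'yabxyxy'
>>> print('xyabxyxy'.count('xy',1,8))
2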
str.endswith(suffix[, start[, end]]): checks whether the string ends with suffix and returns True or False. suffix may be a tuple. Optional start and end set the search boundaries.
str.startswith(prefix[, start[, end]]): checks whether the string starts with prefix and returns True or False. prefix may be a tuple. Optional start and end set the search boundaries.
1. When suffix is an ordinary string.
>>> print('abcxyz'.endswith('xyz'))
True
# False, because the range searched is 'yz'
>>> print('abcxyz'.endswith('xyz',4))
False
# False, because the range searched is 'abcxy'
>>> print('abcxyz'.endswith('xyz',0,5))
False
>>> print('abcxyz'.endswith('xyz',0,6))
True
----------
2. When suffix is a tuple, True is returned as soon as any element of the tuple satisfies endswith.
# 'xyz' in the tuple matches
>>> print('abcxyz'.endswith(('ab','xyz')))
True
# neither 'ab' nor 'xy' in the tuple matches
>>> print('abcxyz'.endswith(('ab','xy')))
False
# 'z' in the tuple matches
>>> print('abcxyz'.endswith(('ab','xy','z')))
True
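The tuple form is convenient for filtering names by several extensions at once. A small sketch with made-up file names; IMAGE_SUFFIXES and image_files are illustrative names:

```python
IMAGE_SUFFIXES = ('.png', '.jpg', '.gif')


def image_files(names):
    """Keep the names that end with any of the image suffixes, ignoring case."""
    return [name for name in names if name.lower().endswith(IMAGE_SUFFIXES)]


print(image_files(['a.PNG', 'notes.txt', 'b.jpg', 'png']))
```

Note that the argument must be a tuple; passing a list raises TypeError.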
str.find(sub[, start[, end]])
str.rfind(sub[, start[, end]])
str.index(sub[, start[, end]])
str.rindex(sub[, start[, end]])
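find() searches the string S for the substring sub. If found, it returns the index of the first occurrence; otherwise it returns -1. Optional start and end set the search range.
index() works like find(); the only difference is that it raises ValueError when the substring is not found.
rfind() returns the index of the rightmost occurrence. If the substring occurs only once or not at all, it is equivalent to find().
rindex() relates to index() in the same way.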
>>> print('abcxyzXY'.find('xy'))
3
>>> print('abcxyzXY'.find('Xy'))
-1
>>> print('abcxyzXY'.find('xy',4))
-1
>>> print('xyzabcabc'.find('bc'))
4
>>> print('xyzabcabc'.rfind('bc'))
7
>>> print('xyzabcabc'.rindex('bcd'))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: substring not found
----------
The in operator tests whether the string S contains the substring sub. It returns a boolean rather than an index.
>>> 'xy' in 'abxycd'
True
>>> 'xyz' in 'abxycd'
False
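find() reports only the first match, but its start argument makes it easy to collect every match. A minimal sketch; find_all is a hypothetical helper that returns non-overlapping positions, the same occurrences that count() counts:

```python
def find_all(text: str, sub: str):
    """Return the start index of every non-overlapping occurrence of sub."""
    if not sub:
        raise ValueError('sub must not be empty')
    positions = []
    start = text.find(sub)
    while start != -1:
        positions.append(start)
        # Resume the search just past the match that was found.
        start = text.find(sub, start + len(sub))
    return positions


print(find_all('xyabxyxy', 'xy'))
print(find_all('abcxyzXY', 'Xy'))
```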
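Replacement
str.replace(old, new[, count]): returns a copy of the string with the substring old replaced by new. If count is given, only the first count occurrences are replaced. If old is not found, nothing is replaced and a string equal to S is returned (CPython typically returns S itself rather than a new object, but that is an implementation detail).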
>>> print('abcxyzoxy'.replace('xy','XY'))
abcXYzoXY
>>> print('abcxyzoxy'.replace('xy','XY',1))
abcXYzoxy
>>> print('abcxyzoxy'.replace('mn','XY',1))
abcxyzoxy
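str.expandtabs(N): replaces each \t in the string S with a number of spaces. N defaults to 8.
Note that expandtabs(8) does not simply replace each \t with 8 spaces; it pads to the next tab stop. For example, 'xyz\tab'.expandtabs() replaces \t with 5 spaces, because "xyz" already occupies 3 columns.
Also, it does not replace line breaks (\n or \r).
>>> '01\t012\t0123\t01234'.expandtabs(4)
'01  012 0123    01234'
>>> '01\t012\t0123\t01234'.expandtabs(8)
'01      012     0123    01234'
>>> '01\t012\t0123\t01234'.expandtabs(7)
'01     012    0123   01234'
>>> print('012\t0123\n01234'.expandtabs(7))
012    0123
01234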
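The column logic can be made explicit with a hand-written version. This is only a sketch of the idea under the rules described above (pad to the next multiple of the tab size, reset the column at a line break); expand_tabs is a hypothetical function, and the built-in expandtabs() remains the thing to use in practice:

```python
def expand_tabs(text: str, tabsize: int = 8) -> str:
    """Illustrative re-implementation of the tab-stop rule."""
    out = []
    column = 0
    for ch in text:
        if ch == '\t':
            if tabsize > 0:
                # Pad up to the next column that is a multiple of tabsize.
                pad = tabsize - column % tabsize
                out.append(' ' * pad)
                column += pad
        elif ch in '\n\r':
            # Line breaks are copied unchanged and restart the column count.
            out.append(ch)
            column = 0
        else:
            out.append(ch)
            column += 1
    return ''.join(out)


sample = '01\t012\t0123\t01234'
for size in (4, 7, 8):
    print(repr(expand_tabs(sample, size)), expand_tabs(sample, size) == sample.expandtabs(size))
```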
str.translate(table)
static str.maketrans(x[, y[, z]])
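str.maketrans() builds a table that maps characters one to one, and translate(table) then maps every character of the string S through it.
If you know Linux, you know the tr command; translate() does much the same thing.
For example, suppose we want to lightly "encrypt" "I love fairy" by replacing some of its characters with digits, so that the result is not immediately readable.
>>> in_str='abcxyz'
>>> out_str='123456'
# maketrans() builds the mapping table
>>> map_table=str.maketrans(in_str,out_str)
# translate() applies the mapping
>>> my_love='I love fairy'
>>> result=my_love.translate(map_table)
>>> print(result)
I love f1ir5
Note: when maketrans(x[, y[, z]]) is called with two arguments, x and y must both be strings of equal length.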
----------
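If maketrans(x[, y[, z]]) is given a third argument z, every character in z is mapped to None, that is, deleted.
For example, delete 'a' and 'y' (a deletion in z takes precedence over the mapping given by x and y).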
>>> in_str='abcxyz'
>>> out_str='123456'
>>> map_table=str.maketrans(in_str,out_str,'ay')
>>> my_love='I love fairy'
>>> result=my_love.translate(map_table)
>>> print(result)
I love fir
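Two further forms of maketrans() follow from the help text above: empty x and y with only a deletion set, and a single dictionary argument. A minimal sketch; the sample sentences and variable names are illustrative:

```python
import string

# Empty x and y plus a third argument: nothing is remapped,
# every ASCII punctuation character is deleted.
strip_punct = str.maketrans('', '', string.punctuation)
print('Hello, world! (ok)'.translate(strip_punct))

# One dictionary argument: keys are single characters,
# values are replacement strings or None (delete).
table = str.maketrans({'a': '4', 'e': '3', 'y': None})
print('I love fairy'.translate(table))
```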
Splitting
str.partition(sep)
str.rpartition(sep)
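1. Searches the string for the substring sep, splits the string at sep, and returns a 3-tuple: the part to the left of sep is the first element, sep itself is the second, and the part to the right of sep is the third.
2. partition(sep) splits at the first sep from the left; rpartition(sep) splits at the first sep from the right.
3. If sep is not found, two of the three elements are empty strings: the last two for partition(), the first two for rpartition().
# When sep occurs only once, both give the same result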
>>> print('abcxyzopq'.partition('xy'))
('abc', 'xy', 'zopq')
>>> print('abcxyzopq'.rpartition('xy'))
('abc', 'xy', 'zopq')
# When sep occurs several times, they split at the first sep from the left and from the right, respectively
>>> print('abcxyzxyopq'.partition('xy'))
('abc', 'xy', 'zxyopq')
>>> print('abcxyzxyopq'.rpartition('xy'))
('abcxyz', 'xy', 'opq')
# sep not found
>>> print('abcxyzxyopq'.partition('xyc'))
('abcxyzxyopq', '', '')
>>> print('abcxyzxyopq'.rpartition('xyc'))
('', '', 'abcxyzxyopq')
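Because partition() always returns three elements, it suits "key = value" style lines, where the value may itself contain the separator. A minimal sketch; parse_setting and the sample lines are hypothetical:

```python
def parse_setting(line: str):
    """Split 'key = value' at the first '='; return None if there is no '='."""
    key, sep, value = line.partition('=')
    if not sep:          # sep is '' when the separator was not found
        return None
    return key.strip(), value.strip()


print(parse_setting('path = /usr/bin=x'))
print(parse_setting('no separator here'))

# rpartition() splits at the last separator, e.g. to take a final extension.
print('archive.tar.gz'.rpartition('.'))
```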
str.split(sep=None, maxsplit=-1)
str.rsplit(sep=None, maxsplit=-1)
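str.splitlines(keepends=False)
All of these split a string and return a list.
1. split() splits S on sep. maxsplit limits the number of splits; if it is omitted or -1, the string is scanned from left to right and split at every occurrence of sep. If sep is omitted or None, a different algorithm is used: runs of consecutive whitespace count as a single separator, and leading and trailing whitespace is ignored.
2. rsplit() is the same as split(), except that it scans from right to left.
3. splitlines() is dedicated to splitting at line breaks. It looks a bit like split('\n') or split('\r\n'), but there are differences, explained below.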
When sep is a single character
>>> '1,2,3'.split(',')
['1', '2', '3']
Only one split is made
>>> '1,2,3'.split(',',1)
['1', '2,3']
When sep is several characters
>>> '<hello><><world>'.split('<>')
['<hello>', '<world>']
# When sep is not given
>>> '1 2 3'.split()
['1', '2', '3']
>>> '1 2 3'.split(maxsplit=1)
['1', '2 3']
>>> '   1   2   3   '.split()
['1', '2', '3']
>>> '   1   2   3  \n'.split()
['1', '2', '3']
When sep is given explicitly as a space, tab, or newline
>>> ' 1  2  3  \n'.split(' ')
['', '1', '', '2', '', '3', '', '\n']
>>> ' 1  2  3  \n'.split('\t')
[' 1  2  3  \n']
>>> ' 1 2\n3 \n'.split('\n')
[' 1 2', '3 ', ''] # note the final '' in the list
>>> ''.split('\n')
['']
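rsplit() has no example above. With maxsplit it differs from split() only in which end the splits are taken from. A small sketch with made-up inputs:

```python
path = 'usr/local/lib/python'

# One split from the left keeps the tail intact ...
print(path.split('/', 1))

# ... one split from the right keeps the head intact.
print(path.rsplit('/', 1))

# With sep omitted, maxsplit leaves the remainder (spaces included) as the last item.
record = 'alice 30 New York City'
print(record.split(maxsplit=2))
```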
----------
splitlines() recognizes a variety of line boundaries; the common ones are \n, \r, and \r\n. If keepends is True, the line breaks are kept.
>>> 'ab c\n\nde fg\rkl\r\n'.splitlines()
['ab c', '', 'de fg', 'kl']
>>> 'ab c\n\nde fg\rkl\r\n'.splitlines(keepends=True)
['ab c\n', '\n', 'de fg\r', 'kl\r\n']
----------
Comparing split() with splitlines():
split
>>> ''.split('\n')
[''] # no line break to split at, so the single empty string is returned
>>> 'One line\n'.split('\n')
['One line', '']
splitlines
>>> "".splitlines()
[] # an empty string yields an empty list
>>> 'Two lines\n'.splitlines()
['Two lines']
join
str.join(iterable)
Joins the strings in an iterable, using str as the separator. Every item in the iterable must be a string; otherwise an error is raised.
The iterable can be a string, list, tuple, dict, or set.
String
>>> L='python'
>>> '_'.join(L)
'p_y_t_h_o_n'
Tuple
>>> L1=('1','2','3')
>>> '_'.join(L1)
'1_2_3'
Set. Note that sets are unordered, so the order of the result may vary.
>>> L2={'p','y','t','h','o','n'}
>>> '_'.join(L2)
'n_o_p_h_y_t'
List
>>> L2=['py','th','o','n']
>>> '_'.join(L2)
'py_th_o_n'
Dict (joins all the keys)
>>> L3={'name':"malongshuai",'gender':'male','from':'China','age':18}
>>> '_'.join(L3)
'name_gender_from_age'
The items being iterated over must be strings; they cannot include numbers or other types.
>>> L1=(1,2,3)
>>> '_'.join(L1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: sequence item 0: expected str instance, int found
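The usual way around this TypeError is to convert each item to a string first. A minimal sketch; the sample tuple and dict are illustrative:

```python
numbers = (1, 2, 3)

# map(str, ...) converts every item before join() sees it.
print('_'.join(map(str, numbers)))

# A generator expression does the same and allows custom formatting.
person = {'name': 'malongshuai', 'age': 18}
print(', '.join(f'{key}={value}' for key, value in person.items()))
```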
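Indexing and slicing strings
Indexes increase from 0 in the forward direction and decrease from -1 in the reverse direction. s[index] returns the character at that index.
The slice s[M:N:K] returns the corresponding substring. When K is negative, M must be greater than N for the result to be non-empty.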
>>> s = 'ABCDEFGHIJK'
>>> s[2]
'C'
>>> s = 'ABCDEFGHIJK'
>>> s[1:5:2] # slicing with a step
'BD'
>>> s = 'ABCDEFGHIJK'
>>> s[5:1:-2]
'FD'
>>> s[1:5:-2] # K is negative but M < N, so the result is empty
''
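A few slicing idioms that follow from these rules, plus a check of the immutability mentioned at the top of these notes. This is a sketch; is_palindrome is a hypothetical helper:

```python
s = 'ABCDEFGHIJK'

print(s[:3])     # first three characters
print(s[-3:])    # last three characters
print(s[::-1])   # omitted M and N with K=-1 walk the whole string backwards


def is_palindrome(text: str) -> bool:
    """True if text reads the same in both directions."""
    return text == text[::-1]


print(is_palindrome('level'), is_palindrome('china'))

# Strings are immutable: assigning to an index raises TypeError.
try:
    s[0] = 'x'
except TypeError as exc:
    print('TypeError:', exc)
```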
Looping over a string
>>> for i in s:
    print(i)
A
B
C
D
E
F
G
H
I
J
K
>>> for index,i in enumerate(s): # get each index and the character at that index
    print(index,i)
0 A
1 B
2 C
3 D
4 E
5 F
6 G
7 H
8 I
9 J
10 K