zoukankan      html  css  js  c++  java
  • Python内置数据结构之字符串

    1、字符串定义和特点

      定义:

        一个个字符组成的有序的序列,是字符的集合。如: 'sadasd121da'

        使用单引号、双引号、三引号引住的字符序列。

      特点:

        1、和数字一样都是字面常量。

        2、不可变的,有序,连续空间,可迭代

        3、Python3起,字符串就是Unicode类型

    2、字符串的定义,初始化:

      举例:

    s = 'strint'
    s = "string"
    s = '''string'''
    s = 'hello 
    world'
    s = r'hello 
    world' # 此时的
     就是普通字符
    s = 'C:win
    t' 
        ---->      
    C:win
    t  
    
    s = R"C:win
    t" 
        ---->
    C:win
    t
    
    s = 'C:win\nt'
        ---->
    C:win
    t
    
    s = """ select * from user """

    3、字符串元素的访问--下标

      —字符串支持使用索引访问

        sql = "select * from user where name = 'tom'"

        sql[4]  ------>  'c'

        sql[4] = 'o'  因为字符串不可变,所以此操作抛异常   

      —有序的字符集合,字符序列,所以可以用for循环迭代。

        for c in sql:

          print(c) 

          print(type(c))   #  <class 'str'>

      —可迭代

        lst = list(sql) #  ['s', 'd', 'a']

    4、字符串join链接****** 

      'string'.join(iterable)  ------>  返回 str

     1 sql = 'welcome to  beijing'
     2 s = sql.join(['1','1','1']) # s = sql.join(('#','1','1'))
     3 print(s)
     4 
     5 ------->
     6 '1welcome to  beijing1welcome to  beijing1'
     7 
     8 
     9 s = ''.join(sql)
    10 print(s)
    11 
    12 ------->
    13 'welcome to  beijing'
    14 
    15 s = '-'.join(sql)
    16 print(s)
    17 
    18 ------->
    19 'w-e-l-c-o-m-e- -t-o- - -b-e-i-j-i-n-g'
    20 
    21 
    22 lst = ['1','2','3']
    23 print('""'.join(lst)) # 1""2""3
    24 print(' '.join(lst)) # 1 2 3
    25 lst2 = ['1',[1,2],'3'] # [1,2]是列表,不是字符串, '[1,2]'
    26 # print(' '.join(lst2))
    27 lst3 = ['1','[1,2]','3']
    28 print(' '.join(lst3)) # 1 [1,2] 3

    5、字符串 + 链接

      + -----> 返回新的str(类似列表,元组)

    6、比较 +  和 join 连接效率

     1 # 结论:join的效率都低于 +
     2 import datetime
     3 
     4 start = datetime.datetime.now()
     5 for i in range(1000000):
     6     s = ''.join(['pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab','pythontab'])
     7 delat = (datetime.datetime.now()- start).total_seconds()
     8 print('--',delat)
     9 
    10 start = datetime.datetime.now()
    11 for i in range(1000000):
    12     s = 'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'+'pythontab'
    13 delat = (datetime.datetime.now()- start).total_seconds()
    14 print(delat)
    15 
    16 website= '%s%s%s' % ('python', 'tab', '.com')


    -- 0.159004
    0.0624
     1 # 链接次数少也是一样的结果!
     2 
     3 import datetime
     4 
     5 start = datetime.datetime.now()
     6 for i in range(1000000):
     7     s = ''.join(['pythontab','pythontab','pythontab'])
     8 delat = (datetime.datetime.now()- start).total_seconds()
     9 print('--',delat)
    10 
    11 start = datetime.datetime.now()
    12 for i in range(1000000):
    13     s = 'pythontab'+'pythontab'+'pythontab'
    14 delat = (datetime.datetime.now()- start).total_seconds()
    15 print(delat)
    16 
    17 website= '%s%s%s' % ('python', 'tab', '.com')
    18 
    19 
    20 
    21 
    22 -- 0.228405
    23 0.078

    7、字符串分隔

      分隔字符串的方法分为2类:

        split系:

          将字符串按照分隔符分割成若干个字符串,并返回列表。

        partition 系

          将字符串按照分隔符分割成两段,返回这2段和分隔符的元组

      -——split

        帮助文档:str.split(sep=None, maxsplit=-1)  ------> 返回一个字符串列表

          从左往右依次分割

          sep 指定分隔符,缺省情况下空白字符串作为分隔符

          maxsplit 指定分割次数,-1 表示遍历整个字符串

     1                                       split 
     2 
     3 s1 = "Tom is a super student."
     4 #s1.split() # ['Tom', 'is', 'a', 'super', 'student.'] 默认分隔符为 空白字符串
     5 s2 = "Tom      is a super student."
     6 s2.split() # ['Tom', 'is', 'a', 'super', 'student.']
     7 s2.split('s') # ['Tom      i', ' a ', 'uper ', 'tudent.']
     8 s2.split('super') # ['Tom      is a ', ' student.']
     9 s2.split('super ')# ['Tom      is a ', 'student.']
    10 s2.split(' ') # ['Tom', '', '', '', '', '', 'is', 'a', 'super', 'student.']
    11 s3 = "Tom is 	a super student."
    12 s3.split(' ')# ['Tom', 'is', 'a', 'super', 'student.']s3
    13 s3.split(' ',maxsplit=2) # ['Tom', 'is', 'a super student.']
    14 s3.split('	',maxsplit=2) # ['Tom is ', 'a super student.']
    15 
    16 s4 = "Tom is 		a   super student."
    17 s4.split() # ['Tom', 'is', 'a', 'super', 'student.'] 注意空白字符和空格的区别
    18 s4.split('	',maxsplit=-1) # ['Tom is ', '', 'a   super student.']
    19 s4.split('	',maxsplit=1) # ['Tom is ', '	a   super student.']
     1                       rsplit
     2 
     3 s1 = "Tom  is a 			super 	student."
     4  
     5 s1.rsplit() # ['Tom', 'is', 'a', 'super', 'student.']
     6 s1.rsplit('s') # ['Tom i', ' a ', 'uper ', 'tudent.']
     7 s1.rsplit('super') # ['Tom is a ', ' student.']
     8 s1.rsplit(' ') # ['Tom', 'is', 'a', 'super', 'student.']
     9 s1.rsplit(' ',maxsplit=2) # ['Tom is a', 'super', 'student.']
    10 s1.rsplit('	',maxsplit=2) # ['Tom is a 		', 'super ', 'student.']

      ——splitlines([keepends]) ----> 字符串列表

          按照行来切分字符串

          keepends 指定的是保留分隔符

          行分隔符包括 , ,  等(默认的),可以自定义的行尾分隔符,当然使用此方法就不能默认了,必须指明    

    1                       splitlines
    2 
    3 s1 = "I'm a student.
    You're a teacher"
    4 s1.splitlines() # ["I'm a student.", "You're a teacher"]
    5 s1.splitlines(True) # ["I'm a student.
    ", "You're a teacher"]
    6 
    7 s1 = "I'm a student.
    
    
    
    You're a teacher" 
    8 s1.splitlines() # ["I'm a student.", '', '', '', "You're a teacher"]
    9 s1.splitlines(True) # ["I'm a student.
    ", '
    ', '
    ', '
    ', "You're a teacher"]

     

      ——partition(sep) ---->  元组 (head,sep,tail)

          从左往右,遇到分隔符就把字符串分割成两部分,返回头,分隔符,尾三部分的三元组,如果没有找到分隔符,就返回头,2个空元素的三元组。

          sep 分割字符串,必须指定

     1                     partition  rpartition
     2 
     3 s1 = "I'm a student.s
    You're a teacher"
     4 s1.partition('s') # ("I'm a ", 's', "tudent.s
    You're a teacher")
     5 s1.rpartition('s')# ("I'm a student.", 's', "
    You're a teacher")
     6         # rpartition 从右往左 
     7 s1.partition('stu') # ("I'm a ", 'stu', "dent.s
    You're a teacher")
     8 s1.partition(' ') #("I'm", ' ', "a student.s
    You're a teacher")
     9 s1.partition('_') # ("I'm a student.s
    You're a teacher", '', '')
    10 
    11 s = '[1,2],[3],[4,5]'
    12 s.partition('3') # ('[1,2],[', '3', '],[4,5]')

     8、字符串大小写

      upper():全大写 ,返回新字符串

      lower():全小写,返回新字符串

      swapcase():交换大小写,返回新字符串

     1 In [18]: s = 'dsadasd'
     2 
     3 In [19]: s.upper()
     4 Out[19]: 'DSADASD'
     5 
     6 In [20]: s
     7 Out[20]: 'dsadasd'
     8 
     9 In [21]: s = 'dsaD'
    10 
    11 In [22]: s.lower()
    12 Out[22]: 'dsad'
    13 
    14 In [23]: s.swapcase()
    15 Out[23]: 'DSAd'
    测试

    9、字符串排版:

      title() ----> 返回str,标题的每个单词都大写

      capitalize() -----> 返回 str ,首个单词大写

      center(width [, fillchar] ) ----> 返回str  width 打印宽度, fillchar 填充的字符

      zfill(width) -----> str width 是宽度

      ljust(width [,fillchar])  --->  str   左对齐

      rjust

    10、字符串修改*

      —replace(old,new[,count]) -----> str 

        字符串中找到匹配替换为新子串,返回新字符串

        count 表示替换几次,不指定就是全部替换

     1 IIn [1]: a ='www magedu com'
     2 
     3 In [2]: a.replace('ww','p')
     4 Out[2]: 'pw magedu com'
     5 
     6 In [3]: a.replace('www','p')
     7 Out[3]: 'p magedu com'
     8 
     9 In [4]: a.replace('w','p')
    10 Out[4]: 'ppp magedu com'
    11 
    12 In [5]: a.replace('w','p',2)
    13 Out[5]: 'ppw magedu com'
    14 
    15 In [6]: a.replace('w','python',2)
    16 Out[6]: 'pythonpythonw magedu com'
    17 
    18 In [7]: a.replace('www','python',2)
    19 Out[7]: 'python magedu com'
    20 
    21 In [8]: a.replace('www','python',5)
    22 Out[8]: 'python magedu com'
    test

      —strip([char]) -----> str

        从字符串两端 去除指定得到字符集chars中的所有字符

        如果插头上没有志宁,去除两端的空白字符。

    In [10]: b = '  
    
    very very very 	'
    
    In [11]: b
    Out[11]: '  
    
    very very very 	'
    
    In [12]: b.strip()# 默认空白字符
    Out[12]: 'very very very'
    
    In [13]: b.strip(' 	')
    Out[13]: '
    
    very very very'
    
    In [14]: b.strip(' 	
    ')
    Out[14]: '
    
    very very very'
    
    In [15]: b.strip(' 	
    ')
    Out[15]: '
    
    very very very'
    
    In [16]: b.strip(' 	
    y')
    Out[16]: '
    
    very very ver'
    
    In [17]: b.strip(' 	
    very')
    Out[17]: '
    
    '
    
    In [18]: b.strip('	
    very')
    Out[18]: '  
    
    very very very '
    
    
    
    In [20]: c = 'abc abc ab bc'
    
    In [21]: c.strip('abc')
    Out[21]: ' abc ab '
    test-strip
    1 In [22]: a
    2 Out[22]: 'www magedu com'
    3 
    4 In [23]: a.lstrip('w')
    5 Out[23]: ' magedu com'
    6 
    7 In [24]: a.lstrip('m')
    8 Out[24]: 'www magedu com'
    lstrip-test

    11、字符串查找*

      —find(sub [,start [, end]]) -----> int

        在指定的区间,从左到右,查找子串sub,找到返回索引,找不到返回 -1

      — rfind (sub [,start [, end]]) -----> int

        在指定能够的区间,从右往左,查找子串sub,找到返回正索引,找不到返回-1

     1 In [30]: a
     2 Out[30]: 'www magedu com'
     3 
     4 In [31]: s
     5 Out[31]: 'I am very very very sorry '
     6 
     7 In [32]: a.find('ww')
     8 Out[32]: 0
     9 
    10 In [33]: a.find('w')
    11 Out[33]: 0
    12 
    13 In [34]: a.find('m')
    14 Out[34]: 4
    15 
    16 In [35]: a.find('m',5,-1)
    17 Out[35]: -1
    18 
    19 In [36]: a.find('m',5)
    20 Out[36]: 13
    21 
    22 In [38]: a.rfind('w')
    23 Out[38]: 2
    24 
    25 In [39]: a.rfind('ww')
    26 Out[39]: 1
    27 
    28 In [40]: a.find('ss')
    29 Out[40]: -1
    30 
    31 In [41]: a.find('m',4,10000) # end可以无限写,但是没有意义
    32 Out[41]: 4
    find - test
     1 In [54]: s
     2 Out[54]: 'I am very very very sorry'
     3 
     4 In [55]: s.find('very',1,-1)
     5 Out[55]: 5
     6 
     7 In [56]: s.find('very',-5,-1)
     8 Out[56]: -1
     9 
    10 In [57]: s.find('very',-10,-1)
    11 Out[57]: 15
    12 
    13 In [58]: s.find('very',-1,-10)
    14 Out[58]: -1
    15 
    16 In [59]: s.rfind('very',-1,-10)
    17 Out[59]: -1
    18 
    19 总结:不论 rfind 还是 find,start都应该在end 的左侧,比如:5就在-1 的左侧,-1不在 -10 的左侧。rfind只是在区间内,找最右侧先匹配到的项。
    关于start和end的大小关系

      —index(sub [,start[,end]]) ----> int  ,但是找不到会抛异常,得写异常处理,所以一般很少用,选用find 

      **** 时间复杂度:

          find 和 index 都是翻找,随着数据量增大,效率下降,都是O(n)

      —len(string)跟列表,元组一样,会提供

      —count(str [,start [,end]]) ----> int   时间复杂度是O(n)

          在指定区间,从左往右,统计子串sub出现的次数

     1 In [61]: s
     2 Out[61]: 'I am very very very sorry'
     3 
     4 In [62]: s.count('very')
     5 Out[62]: 3
     6 
     7 In [63]: s.count('very',10,-1)
     8 Out[63]: 2
     9 
    10 In [64]: s.count('very',-10,-1)
    11 Out[64]: 1
    12 
    13 In [65]: s.count('very',-1,-10)
    14 Out[65]: 0
    count - test

    12、字符串判断*

       —startswith (suffix [,start [,end]]) -----> bool    ------> 可以用find 替换的

          a.find('www', 0, len('www')) == 0  ------> True or False  ------>好处是不抛异常

       —endwith ([prefix [ ,start [, end]])----> bool

     1 In [68]: s
     2 Out[68]: 'I am very very very sorry'
     3 
     4 In [69]: s.startswith('s')
     5 Out[69]: False
     6 
     7 In [70]: s.startswith('w')
     8 Out[70]: False
     9 
    10 In [71]: s.startswith('I')
    11 Out[71]: True
    test

      — is 系列

        isalnum() ----> bool 

        isalpha()

        isdecimal();是否只包含十进制数字

        isdigit(): 是否全部是数字

        isidentifiter() :是不是字母和下划线开头,其他都是字母,数字,下划线

        islower() # 遍历,还不如都转大写或小写

        isupper()

        isspace()

    13、字符串格式化:

        —字符串的格式化是一种能够凭借字符串输出样式的手段,更灵活。

          join 拼接,被拼接对象是可迭代对象

          + 拼接,但是需要将非字符串转化为字符串

        — printf - style formatting 来源于c语言的ptinf函数

            格式要求:

              占位符: 使用 % 和格式字符组成,例如 %s %d等。

              fornat % valuse ,格式字符串和被格式的值之间使用%分隔

              values 只能是一个对象,或是一个和格式字符串占位符数目相等的元组,或一个字典。

    1 https://www.cnblogs.com/post/readauth?url=/JerryZao/p/9417465.html
     1 In [90]: '%.2f'% 1.23333
     2 Out[90]: '1.23'
     3 
     4 In [82]: 'i an %d' % 9
     5 Out[82]: 'i an 9'
     6 
     7 In [83]: 'i an %20d' % 9
     8 Out[83]: 'i an                    9'
     9 
    10 In [84]: 'i an %-20d' % 9
    11 Out[84]: 'i an 9                   '
    12 
    13 In [85]: '%3.2f%%, 0x%x, 0X%02X' %(89.32234324,10,12)
    14 Out[85]: '89.32%, 0xa, 0X0C'
    test 
    1 IIn [91]: '%#x' % 10
    2 Out[91]: '0xa'
    3 
    4 In [92]: '%#X' % 10
    5 Out[92]: '0XA'
    6 
    7 In [93]: 'ox%X' % 10
    8 Out[93]: 'oxA'
    test

     

       

      — 使用format 的格式化******

           format 函数格式字符串语法----Python鼓励使用

             '{} {xxx}'.format(*args, **kwargs) ------> str

              args 是位置参数 ,是一个元组

              kwargs 是关键字参数,是一个字典

              花括号表示占位符

              {} 表示按照顺序匹配位置参数,{n} 表示取位置参数索引为n 的值

              {xxx} 表示在关键字参数中搜索名称一致的

              {{}},表示打印花括号

    对齐:
    
    In [97]: '{0}*{1}={2:<2}'.format(3,2,2*3)
    Out[97]: '3*2=6 '
    
    In [98]: '{0}*{1}={2:<02}'.format(3,2,2*3)
    Out[98]: '3*2=60'
    
    In [99]: '{0}*{1}={2:>2}'.format(3,2,2*3)
    Out[99]: '3*2= 6'
    
    In [100]: '{0}*{1}={2:>02}'.format(3,2,2*3)
    Out[100]: '3*2=06'
    
    In [101]: '{0}*{1}={2:0>2}'.format(3,2,2*3)
    Out[101]: '3*2=06'
    
    In [102]: '{0}*{1}={2:0<2}'.format(3,2,2*3)
    Out[102]: '3*2=60'
    
    
    In [132]: '{:*^30}'.format('*')
    Out[132]: '******************************'
    In [124]: '{:<#3}'.format(6)
    Out[124]: '6  '
    
    In [125]: '{:#<3}'.format(6)
    Out[125]: '6##'
    
    In [126]: '{:^3}'.format(6)
    Out[126]: ' 6 '
    
    进制:
    
    In [104]: 'int:{0:d}'.format(42)
    Out[104]: 'int:42'
    
    In [105]: 'int:{0:d},hex{0:x},oct{0:o}'.format(42)
    Out[105]: 'int:42,hex2a,oct52'
    
    In [106]: 'int:{0:d},hex{0:#x},oct{0:#o}'.format(42)
    Out[106]: 'int:42,hex0x2a,oct0o52'
    
    In [107]: 'int:{0:d},hex{0:#x},oct{0:#o},bin{0:b}'.format(42)
    Out[107]: 'int:42,hex0x2a,oct0o52,bin101010'
    
    In [108]: 'int:{0:d},hex{0:#x},oct{0:#o},bin{0:#b}'.format(42)
    Out[108]: 'int:42,hex0x2a,oct0o52,bin0b101010'
    
    In [134]: '{:02X}'.format(123)
    Out[134]: '7B'
    
    
    浮点数:
    
    In [111]: print('{:g}'.format(3**0.5)) #通用精度
    1.73205
    
    In [112]: print('{:f}'.format(3**0.5))
    1.732051
    
    In [113]: print('{:}'.format(3**0.5))
    1.7320508075688772
    
    In [114]: print('{:10f}'.format(3**0.5)) # 右对齐,10个位置
      1.732051
    
    In [115]: print('{:2}'.format(102.231)) # 加小数位的宽度
    102.231
    
    In [116]: print('{:.2}'.format(102.231))
    1e+02
    
    In [117]: print('{:.2}'.format(3**0.5)) # 小数点后两位
    1.7
    
    In [118]: print('{:.2f}'.format(3**0.5))
    1.73
    
    In [119]: print('{:3.2f}'.format(3**0.5))
    1.73
    
    In [120]: print('{:3.1f}'.format(3**0.5))
    1.7
    
    In [121]: print('{:3.1f}'.format(0.2745))
    0.3
    
    In [122]: print('{:3.3f}'.format(0.2745))
    0.275
    
    In [123]: print('{:3.3%}'.format(0.2745))
    27.450%
    
    字符串format格式化test
    test

      其他使用:https://www.cnblogs.com/post/readauth?url=/JerryZao/p/9417465.html

    In [22]: aOut[22]: 'www magedu com'
    In [23]: a.lstrip('w')Out[23]: ' magedu com'
    In [24]: a.lstrip('m')Out[24]: 'www magedu com'

    为什么要坚持,想一想当初!
  • 相关阅读:
    四则运算2
    四则运算1
    学习进度条
    Mat数据结构
    Markdown语法
    C++继承和派生
    C++智能指针
    clion配置ROS
    视觉十四讲:第六讲_g2o图优化
    视觉十四讲:第六讲_ceres非线性优化
  • 原文地址:https://www.cnblogs.com/JerryZao/p/9445746.html
Copyright © 2011-2022 走看看