Python 3: Functions, Operators, and Common Methods of the Data Types

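A Basic Introduction to Python Functions

What is a function?
A function is a block of code that other code can call directly. It is also called a subroutine or, when it belongs to an object, a method.
- Reusable
- Functions can call one another
Purpose of functions
- Code reuse
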
How do you define a function in Python?

def foo(arg):
	return "Hello " + str(arg)

result = foo("Final")  # "Final" is passed as arg, equivalent to foo(arg="Final")
print(result)

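Parts of a function
- Parameter list
  - Required (positional) parameters
    Arguments must be passed in the declared order
  - Keyword arguments
    Arguments passed by name can be given in any order, e.g. foo(arg_2=2, arg=1)
```
def foo(arg=None, arg_2=None):
	return arg, arg_2
```
  - Default parameters
```
def foo(arg='Final', arg_2=None):
	return "Hello " + str(arg)

result = foo('x')   # with an argument the result is "Hello x"; with no argument, arg falls back to the default 'Final'
print(result)
```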
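  - Variable-length parameters
    Used heavily in decorators
    Accept any number of arguments.
    - *
      Collects the extra positional arguments into a tuple
    - **
      Collects the extra keyword arguments into a dict
```
def foo(*args, **kwargs):
	print(args)
	print(kwargs)
	return None

result = foo("Final", "1", _class="c", _class_2="2")
print(result)
```
- Function body
  The logic the function executes
- Return value
  None by default when there is no return statement
```
return None
```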
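
Since variable-length parameters are described above as common in decorators, here is a minimal sketch of that pattern. The names `log_calls` and `greet` are made up for illustration and are not part of the original article.

```python
import functools


def log_calls(func):
    """Hypothetical decorator: records the arguments of every call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # args is a tuple of positional arguments, kwargs a dict of keyword arguments
        wrapper.calls.append((args, kwargs))
        return func(*args, **kwargs)  # unpack both again when forwarding the call
    wrapper.calls = []
    return wrapper


@log_calls
def greet(name, punctuation="!"):
    return "Hello " + str(name) + punctuation


print(greet("Final"))                   # Hello Final!
print(greet("Final", punctuation="?"))  # Hello Final?
print(greet.calls)
```

Because the wrapper accepts `*args, **kwargs`, it can wrap a function with any signature without knowing its parameters in advance.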

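Python Operators

Arithmetic operators
- Addition +
- Subtraction -
- Multiplication *
- Division / (always returns a float)
- Floor division //
- Remainder (modulo) %
- x to the power y: x ** y
- Square root (there is no dedicated operator; math.sqrt(x) also works)
  x ** (1/2)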
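
A few concrete values may make the arithmetic operators listed above easier to tell apart, in particular `/` versus `//`. The numbers are arbitrary illustrations.

```python
import math

print(7 / 2)     # 3.5   true division always yields a float
print(7 // 2)    # 3     floor division
print(-7 // 2)   # -4    floors toward negative infinity, not toward zero
print(7 % 2)     # 1
print(2 ** 10)   # 1024
print(9 ** (1 / 2), math.sqrt(9))  # 3.0 3.0
```
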
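Assignment
Assign with =
```
a = 1
```
Comparison
Comparison operators compare the values of two objects (for now, think of the value as what gets printed)
- <
- <=
- >
- >=
- == equal to
- != not equal to
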
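Identity comparison
Compares whether two variables refer to the same object (in CPython, the same memory address, as reported by id())

is
is not

For str and int values, CPython may reuse one cached object (small integers and identifier-like string literals are interned), so equal values can share an address. Strings containing non-ASCII text such as Chinese are not interned this way. This is an implementation detail, so compare values with == rather than is.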

a="test_1"
b="test_1"
id(a)
Out[6]: 2135370818480
id(b)
Out[7]: 2135370818480
a is b
Out[8]: True

a = '你好'
b = '你好'
id(a)
Out[12]: 2135370954384
id(b)
Out[13]: 2135370954768
a is b
Out[11]: False
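
The difference between identity and equality is easier to see with mutable objects, where no caching is involved. A small sketch with illustrative variable names:

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal contents, but a separate list object
c = a           # a second name for the same object

print(a == b)          # True:  same value
print(a is b)          # False: different objects
print(a is c)          # True:  same object
print(id(a) == id(c))  # True

value = None
print(value is None)   # True: identity is the idiomatic test for None
```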

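Membership operators
Test whether an element is contained in a sequence

a=[1,2,3]
1 in a
Out[15]: True
b =[1,2]
b in a    # False: in looks for b as a single element, not as a sub-list
Out[17]: False

not in is the negation of in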

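Boolean operators
Evaluate whether an expression is True or False
- and
  Returns True only when both sides are True
- or
  Returns True when at least one side is True
  - Short-circuiting
    expression A or expression B
    When expression A is True, expression B is never evaluated
    (and short-circuits the same way when expression A is False)
Logical negation
not
Bitwise operators work on the binary representation of integers
- ~ bitwise NOT
- ^ bitwise XOR
- << left shift
- >> right shift
- & bitwise AND
- | bitwise OR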
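Operator precedence

The table lists operators from lowest to highest precedence. The bitwise operators are omitted.

Operator	Description
or	Boolean OR
and	Boolean AND
not	Boolean NOT
in, not in, is, is not, <, !=, ...	Comparisons, membership tests, identity tests
+, -	Addition and subtraction
*, /, //, %	Multiplication, division, floor division, remainder
+x, -x	Unary plus and minus
**	Exponentiation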

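Custom precedence
When you are unsure of the precedence, use parentheses to make the order explicit. This helps readability and avoids subtle bugs.
- With ()
  (not b and c) or (d and e)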
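
The following lines show a few cases where the default precedence may differ from a first guess, next to the parenthesized form. The expressions are illustrative only.

```python
print(2 + 3 * 4)      # 14: * binds tighter than +
print((2 + 3) * 4)    # 20
print(-2 ** 2)        # -4: ** binds tighter than unary minus
print((-2) ** 2)      # 4
print(True or False and False)     # True:  and is evaluated before or
print((True or False) and False)   # False
print(not 1 == 2)     # True: parsed as not (1 == 2)
```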

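Strings, Byte Sequences, and Encoding/Decoding

Strings (character sequences) and byte sequences

Character
- For historical reasons "character" has not always meant a Unicode character, but that is the definition to use going forward: a Python 3 str is a sequence of Unicode characters
Byte
The binary representation of a character after it has been encoded
Code point
The number that identifies a character; what the computer displays is the character that belongs to the code point

  '你好'.encode("unicode_escape").decode()  # convert the Chinese characters to their code points
  Out[2]: '\\u4f60\\u597d'
  '\u4f60\u597d'  # a literal written as code points prints as the characters
  Out[3]: '你好'
- In the Unicode standard a code point is written as 4 to 6 hexadecimal digits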

Encoding and decoding

Character sequence (str) -> byte sequence (bytes): encoding (encode)

len('\u4f60\u597d') # length in characters
Out[6]: 2
b="你好".encode("utf-8")
b
Out[13]: b'\xe4\xbd\xa0\xe5\xa5\xbd'

Byte sequence (bytes) -> character sequence (str): decoding (decode)
```
b.decode("utf-8")
Out[14]: '你好'
```
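
To make the two directions concrete, this sketch encodes the same text with two codecs and decodes it again. It relies only on the built-in `utf-8` and `gbk` codecs already used in this article.

```python
text = "你好"

utf8_bytes = text.encode("utf-8")   # str -> bytes
gbk_bytes = text.encode("gbk")

print(len(text))         # 2 characters
print(len(utf8_bytes))   # 6 bytes: 3 per character in UTF-8
print(len(gbk_bytes))    # 4 bytes: 2 per character in GBK

# Decoding with the codec that produced the bytes restores the text
print(utf8_bytes.decode("utf-8") == text)  # True
print(gbk_bytes.decode("gbk") == text)     # True
```

The character count stays the same while the byte count depends on the encoding, which is why the codec has to be known when decoding.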

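Encoding errors

Garbled text (mojibake) and mixed encodings are usually human error: bytes written with one encoding are decoded with another

Detecting the encoding
The encoding cannot be read off a byte sequence with certainty; detection tools estimate it statistically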

b = "你".encode("utf-8") + "好".encode("gbk")
b
Out[5]: b'\xe4\xbd\xa0\xba\xc3'

b.decode("utf-8")
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3437, in run_code
	exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-6-a863a95176d0>", line 1, in <module>
	b.decode("utf-8")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xba in position 3: invalid start byte
b.decode("gbk")
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3437, in run_code
	exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-7-59dd6af7107c>", line 1, in <module>
	b.decode("gbk")
UnicodeDecodeError: 'gbk' codec can't decode byte 0xc3 in position 4: incomplete multibyte sequence

Install chardet
pip install chardet

Import chardet
import chardet

chardet.detect(b)
Out[11]: {'encoding': 'ISO-8859-1', 'confidence': 0.73, 'language': ''}  # confidence is the estimated reliability; around 0.9 is usually quite accurate
chardet.detect("你".encode("utf-8"))  # the guess is wrong here, so treat the result as a hint only
Out[13]: {'encoding': 'ISO-8859-1', 'confidence': 0.73, 'language': ''}
chardet.detect("你好".encode("utf-8"))
Out[14]: {'encoding': 'utf-8', 'confidence': 0.7525, 'language': ''}

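Handling garbled text and mixed encodings

Ignore the undecodable bytes

  b_2="你好".encode("utf-8") + "啊".encode("gbk")
  b_2.decode("utf-8",errors='ignore')
  Out[18]: '你好'

Replace them with the replacement character U+FFFD (�), a special character that marks where the bytes could not be decoded
```
  b_2.decode("utf-8",errors='replace')
  Out[21]: '你好��'
```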
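
The three error-handling modes can be compared side by side. This is a small sketch using the same mixed bytes as above; the variable name `mixed` is only illustrative.

```python
mixed = "你好".encode("utf-8") + "啊".encode("gbk")   # UTF-8 bytes followed by GBK bytes

try:
    mixed.decode("utf-8")            # errors="strict" is the default and raises
except UnicodeDecodeError as exc:
    print("strict failed at byte", exc.start)

ignored = mixed.decode("utf-8", errors="ignore")     # bad bytes are dropped silently
replaced = mixed.decode("utf-8", errors="replace")   # each bad byte becomes U+FFFD

print(ignored)
print(replaced)
print(replaced.count("\ufffd"))      # 2: one marker per undecodable byte here
```

`replace` is often the safer of the two lenient modes because the markers show that data was lost.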

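CRUD Operations on Strings

dir("") lists the methods available on a string

Create

Strings are an immutable data type: "modifying" one creates a new object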

  a="a"
  id(a)
  Out[23]: 1809422010288
  a=a+"b"
  id(a)
  Out[25]: 1809520718768

# a+="b" is shorthand for a = a + "b"
  a="a"
  a+="b"
  a
  Out[28]: 'ab'

Retrieve

Get a character by index
In most programming languages, including Python, indexes start at 0
```
  a="hello,world"
  a[1]
  Out[30]: 'e'
```

find and index (get the index of the target substring)

  a.find("e")
  Out[31]: 1
  a.find("l")
  Out[32]: 2
  a.find("!")  # find returns -1 when there is no match
  Out[39]: -1

  a.index("e")
  Out[33]: 1
  a.index("d")
  Out[34]: 10
  a.index("l")
  Out[35]: 2
  a.index("!")  # index raises ValueError when there is no match
  Traceback (most recent call last):
    File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3437, in run_code
  	exec(code_obj, self.user_global_ns, self.user_ns)
    File "<ipython-input-37-75d531dbc77e>", line 1, in <module>
  	a.index("!")
  ValueError: substring not found

startswith and endswith match from the start / end of the string

  f="2021-11-11-xxxx"
  f.startswith("2021-11-11")
  Out[41]: True
  f.startswith("2021-11-11x")
  Out[42]: False
  
  f="xxx.png"
  f.startswith("png")
  Out[44]: False
  f.endswith("png")
  Out[45]: True
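
A short sketch of how the lookup methods above are typically combined. It uses only standard `str` methods; the sample strings mirror the ones in the transcripts.

```python
a = "hello,world"

print(a.find("l"), a.find("l", 3))   # 2 3: the optional second argument is the start position
print(a.rfind("l"))                   # 9: search from the right
print("!" in a)                       # False: use in when only presence matters

try:
    a.index("!")
except ValueError:
    print("index raised ValueError")

f = "xxx.png"
print(f.endswith((".png", ".jpg")))   # True: a tuple checks several suffixes at once
```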

Update

replace returns a new string with every occurrence replaced

  a="hello werld, hello werld"
  a.replace("wer","wor")
  Out[48]: 'hello world, hello world'

split (split into a list)

  a="<<python>>, <<java>>, <<c++>>"
  a.split(",")
  Out[50]: ['<<python>>', ' <<java>>', ' <<c++>>']

join (concatenate)

  b = a.split(",")
  ",".join(b)
  Out[53]: '<<python>>, <<java>>, <<c++>>'

Delete

  a ="    hello, world  "
  a
  Out[55]: '    hello, world  '
  a.strip()   # remove leading and trailing whitespace
  Out[56]: 'hello, world'

lstrip and rstrip remove whitespace from the left or the right end only

  a.lstrip()
  Out[60]: 'hello, world  '
  a.rstrip()
  Out[61]: '    hello, world'
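
split, strip, join and replace are often chained. The sketch below cleans up the leading spaces that split left behind in the earlier example; the list names are illustrative.

```python
a = "<<python>>, <<java>>, <<c++>>"

# split on the comma, then strip the space left at the start of each part
parts = [part.strip() for part in a.split(",")]
print(parts)              # ['<<python>>', '<<java>>', '<<c++>>']

# strip also accepts the set of characters to remove from both ends
names = [part.strip("<>") for part in parts]
print(names)              # ['python', 'java', 'c++']

print(" | ".join(names))  # python | java | c++

# the optional third argument of replace limits the number of replacements
print("hello werld, hello werld".replace("wer", "wor", 1))  # hello world, hello werld
```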

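String Output and Input

Saving to a file

# open() opens a file; in write mode a missing file is created, but a missing directory in the path raises an error
# arguments: file name, mode (r = read, w = write, a = append), encoding
output=open("output.txt","w",encoding="utf-8")
context="hello, world"
# write the text to the file
output.write(context)
# close the file handle
output.close()

Reading a file

  infile = open("output.txt", "r", encoding="utf-8")  # renamed from input to avoid shadowing the built-in
  # read the whole file
  content = infile.read()
  print(content)
  # the file position is now at the end, so a second read returns an empty string
  content_2 = infile.read()
  print(content_2)
  # close the file handle
  infile.close()

Appending to a file

  # mode "a" appends to the end of the file, creating it if it does not exist
  output = open("output.txt", "a", encoding="utf-8")
  context = "\nhello, python"
  # write the text to the file
  output.write(context)
  # close the file handle
  output.close()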
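
The same write, append and read steps can be written with the `with` statement, which closes the file even if an error occurs. This is a sketch rather than the article's own code: it writes into a temporary directory (an assumption made so the example leaves no files behind) and uses `seek(0)` to rewind the file position.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "output.txt")   # hypothetical file location

    with open(path, "w", encoding="utf-8") as output:
        output.write("hello, world")
    with open(path, "a", encoding="utf-8") as output:
        output.write("\nhello, python")

    with open(path, "r", encoding="utf-8") as infile:
        first = infile.read()     # reads to the end of the file
        second = infile.read()    # nothing left: empty string
        infile.seek(0)            # move the position back to the start
        lines = infile.readlines()

print(repr(first))    # 'hello, world\nhello, python'
print(repr(second))   # ''
print(lines)          # ['hello, world\n', 'hello, python']
```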

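Formatted String Output

format

In the default order of the arguments

  a="ping"
  b="pong"
  print("play pingpong: {}, {}".format(a, b))
  play pingpong: ping, pong

By argument index, which lets a value be repeated

  print("play pingpong: {0}, {1}, {0},  {1}".format(a, b))
  play pingpong: ping, pong, ping,  pong

By keyword argument

  print("play pingpong: {a}, {b}, {a}, {b}".format(a='ping', b='pong'))
  play pingpong: ping, pong, ping, pong

f-strings are the recommended approach because they are more readable, but they require Python 3.6 or later
```
  print(f"play pingpong: {a}, {b}")
```

Formatting decimals

  # .2 keeps two digits after the decimal point
  # f formats the value as a fixed-point float
  print("{:.2f}".format(3.141592))
  3.14

% formatting (the older style)

  "playing %s %s" % ("ping","pong")
  Out[3]: 'playing ping pong'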
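
The part after the colon in `{:.2f}` is a format specification, and the same syntax works in f-strings. A few commonly used forms, shown with `repr` so the padding is visible (the values are arbitrary):

```python
pi = 3.141592
name = "ping"

print(repr(f"{pi:.2f}"))      # '3.14'
print(repr(f"{pi:8.3f}"))     # '   3.142'   width 8, numbers right-align by default
print(repr(f"{name:<8}"))     # 'ping    '   left-aligned in 8 columns
print(repr(f"{name:>8}"))     # '    ping'   right-aligned
print(repr(f"{1234567:,}"))   # '1,234,567'  thousands separator
print(repr(f"{0.256:.1%}"))   # '25.6%'      percentage
print(repr(f"{255:#x} {5:03d}"))  # '0xff 005'
```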

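Variables and References

Understanding variables and references

Put simply, a variable is a name that points to an object

Put simply, a reference is another name that points to the same object (the same memory address)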

  a=1
  b=a
  id(a)
  Out[6]: 140703519418160
  id(b)
  Out[7]: 140703519418160
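
With an immutable int, sharing an object has no visible effect. The consequences of references show up with mutable objects, as in this small sketch (variable names are illustrative):

```python
a = [1, 2]
b = a            # b refers to the same list object as a
b.append(3)
print(a)         # [1, 2, 3]: the change is visible through both names
print(a is b)    # True

c = a.copy()     # a shallow copy is a new list object
c.append(4)
print(a, c)      # [1, 2, 3] [1, 2, 3, 4]

b = [0]          # rebinding b does not touch the list that a refers to
print(a)         # [1, 2, 3]
```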

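CRUD Operations on the Basic Data Structures

List
The elements stored in a list are references

Create

append adds an element to the end

  l=[]
  id(l)
  Out[9]: 1184852266944 # the memory address does not change after append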
  l.append("a")
  l
  Out[11]: ['a']
  id(l)
  Out[12]: 1184852266944

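+ and +=

+ concatenates two lists and returns a new list. += on a list extends it in place (its id is unchanged), whereas += on an immutable str binds the name to a new object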

  l1=['a']
  l2=['b']
  l3=l1+l2
  id(l1)
  Out[19]: 1184852612864
  id(l2)
  Out[20]: 1184852093440
  id(l3)
  Out[21]: 1184852614400
  l1
  Out[22]: ['a']
  l2
  Out[23]: ['b']
  l3
  Out[24]: ['a', 'b']

  a='a'
  id(a)
  Out[28]: 1184775915440
  a+='b'
  a
  Out[30]: 'ab'
  id(a)
  Out[31]: 1184869746800

  l=['a']
  id(l)
  Out[33]: 1184852815424
  l+=['b']
  id(l)
  Out[35]: 1184852815424
  l
  Out[36]: ['a', 'b']

* and *=

Repeating a list copies the references, so every slot points to the same object

  a='a'
  id(a)
  Out[38]: 1184775915440
  l=[a]*10
  id(l[0])
  Out[40]: 1184775915440
  id(l[9])
  Out[41]: 1184775915440
  l
  Out[42]: ['a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a']
  
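  After the assignment, a refers to a new object (its memory address has changed); the list still holds references to the old 'a' object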
  a='b'
  id(a)
  Out[44]: 1184775821808
  l
  Out[45]: ['a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a']
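
Because `*` repeats references, repeating a list that contains a mutable object can be surprising. A minimal sketch of the effect and of one common way around it (the names `row` and `grid` are illustrative):

```python
row = [0, 0]
grid = [row] * 3           # three references to the same inner list
grid[0][0] = 1
print(grid)                # [[1, 0], [1, 0], [1, 0]]
print(grid[0] is grid[2])  # True

# Build each row separately to get independent inner lists
grid = [[0, 0] for _ in range(3)]
grid[0][0] = 1
print(grid)                # [[1, 0], [0, 0], [0, 0]]
```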

insert
Inserts an element at the given position

  l=["a"]
  l.insert(0,'b')
  l
  Out[48]: ['b', 'a']
  
  l.insert(10,'z')    # an index past the end appends the element
  l
  Out[50]: ['b', 'a', 'z']

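Retrieve

Indexing: every sequence type supports access by index

Slicing

your_list[start:end:step]

Take a range (end is exclusive)
your_list[start:end]

Take the last element
your_list[-1]
As a slice end, -1 stops one element short of the end; use len(your_list) or omit the end to include the last element
your_list[start:len(your_list)]

Step
your_list[1:10:2] takes the elements at the odd indexes from 1 up to, but not including, 10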

  l=list(range(20))
  l
  Out[56]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
  l[0:10]
  Out[57]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
  l[11:-1]
  Out[62]: [11, 12, 13, 14, 15, 16, 17, 18]
  l[11:len(l)]
  Out[63]: [11, 12, 13, 14, 15, 16, 17, 18, 19]
  l[-1]
  Out[64]: 19
  
  l[1:10:2]
  Out[65]: [1, 3, 5, 7, 9]
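
Some further slice forms that follow from the start:end:step rule, sketched on a shorter list. Note the contrast in the last lines between slicing and plain indexing past the end.

```python
l = list(range(10))

print(l[:3])     # [0, 1, 2]
print(l[-3:])    # [7, 8, 9]
print(l[7:-1])   # [7, 8]: a -1 end excludes the last element
print(l[7:])     # [7, 8, 9]
print(l[::2])    # [0, 2, 4, 6, 8]
print(l[::-1])   # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
print(l[8:100])  # [8, 9]: slices are clipped to the list, no IndexError

try:
    l[len(l)]    # plain indexing past the end does raise
except IndexError:
    print("IndexError")
```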

index

l=['a','b','c']
l
Out[68]: ['a', 'b', 'c']
l.index('b') # returns the index of the element
Out[70]: 1

Insert an element before 'b': look up its index first, then insert
l.insert(l.index('b'),'test')
l
Out[72]: ['a', 'test', 'b', 'c']

Update

Assignment by index

your_list[index] = new_value

  l
  Out[73]: ['a', 'test', 'b', 'c']
  id(l)
  Out[74]: 1184869994688
  l[0]="a_1"
  l[l.index('test')]="a_2"
  l
  Out[78]: ['a_1', 'a_2', 'b', 'c']
  id (l)
  Out[79]: 1184869994688

Assignment by slice

 your_list[start:end:step] = iterable of new values
 
  l
  Out[80]: ['a_1', 'a_2', 'b', 'c']
  l[0:2]
  Out[81]: ['a_1', 'a_2']
  l[0:2]="a"
  l
  Out[83]: ['a', 'b', 'c']
  l[0:2]
  Out[84]: ['a', 'b']
  l[0:2]=1
  Traceback (most recent call last):
    File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3437, in run_code
  	exec(code_obj, self.user_global_ns, self.user_ns)
    File "<ipython-input-85-323c8d88b34e>", line 1, in <module>
  	l[0:2]=1
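  TypeError: can only assign an iterable # a slice is a sequence, so only an iterable can be assigned to it; a number is a single value, while a string is a sequence of characters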
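
Slice assignment replaces the selected range with the items of the iterable, and the two do not need to have the same length. A short sketch of the cases that tend to surprise:

```python
l = ['a_1', 'a_2', 'b', 'c']

l[0:2] = ['a']          # two elements replaced by one: the list shrinks
print(l)                # ['a', 'b', 'c']

l[0:1] = "xyz"          # a string is iterated character by character
print(l)                # ['x', 'y', 'z', 'b', 'c']

l[1:1] = [1, 2]         # an empty slice inserts without removing anything
print(l)                # ['x', 1, 2, 'y', 'z', 'b', 'c']

l[-2:] = []             # assigning an empty list deletes the slice
print(l)                # ['x', 1, 2, 'y', 'z']
```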

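Delete

pop() removes and returns the last element, so a list works well as a stack; for a first-in, first-out message queue, collections.deque with popleft() is the usual choice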

  l
  Out[86]: ['a', 'b', 'c']
  x=l.pop()
  x
  Out[88]: 'c'
  l
  Out[89]: ['a', 'b']

clear() removes all elements without changing the list's memory address; assigning a new list to the name does change the address

  l
  Out[90]: ['a', 'b']
  id(l)
  Out[91]: 1184869994688
  l.clear()
  id(l)
  Out[93]: 1184869994688
  l=[]
  id(l)
  Out[95]: 1184870031424

Sorting

sort sorts the list in place

l = [1, 4, 5, 2, 6]
l
Out[97]: [1, 4, 5, 2, 6]
l.sort()
l
Out[99]: [1, 2, 4, 5, 6]

sorted does not modify the original; it returns a new sorted list

  l = [1, 4, 5, 2, 6]
  l2=sorted(l)
  l
  Out[104]: [1, 4, 5, 2, 6]
  l2
  Out[105]: [1, 2, 4, 5, 6]
  id(l)
  Out[106]: 1184870031424
  id(l2)
  Out[107]: 1184869984448

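reversed returns an iterator over the elements in reverse order (wrap it in list() to get a new list); reverse() reverses the list in place

  l=[1,4,5,2,6]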
  list(reversed(l))
  Out[115]: [6, 2, 5, 4, 1]

  l=[1,4,5,2,6]
  l.reverse()
  l
  Out[111]: [6, 2, 5, 4, 1]
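
Both sort and sorted accept `key` and `reverse` arguments, which the examples above do not show. A small sketch with made-up data:

```python
words = ["pear", "fig", "banana", "kiwi"]

print(sorted(words))                 # ['banana', 'fig', 'kiwi', 'pear']
print(sorted(words, key=len))        # ['fig', 'pear', 'kiwi', 'banana']  ties keep their original order
print(sorted(words, reverse=True))   # ['pear', 'kiwi', 'fig', 'banana']

pairs = [("b", 2), ("a", 3), ("c", 1)]
pairs.sort(key=lambda pair: pair[1])  # in place, by the second field
print(pairs)                          # [('c', 1), ('b', 2), ('a', 3)]

result = list(words)                  # copy so the original order is kept
print(result.sort())                  # None: sort() returns nothing
```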

tuple: an immutable list
- Create: none (elements cannot be added after construction)
- Retrieve
  - Indexing
  - index
  - Slicing
```
  t=(1,2,3)
  t.index(1)
  Out[118]: 0
  t[0:1]
  Out[119]: (1,)
```
- Update: none
- Delete: none

dict: CRUD on Dictionaries

Create

Assignment by key

update merges another dictionary into this one

  d={}
  id(d)
  Out[122]: 1184870022336
  d['a']=1
  d2={"b":2,"c":3}
  d.update(d2)
  d
  Out[126]: {'a': 1, 'b': 2, 'c': 3}
  id(d)
  Out[127]: 1184870022336

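setdefault sets a default value only when the key is missing: an existing key keeps its value, a missing key is added with the default. Either way, the value now stored under the key is returned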

  d
  Out[128]: {'a': 1, 'b': 2, 'c': 3}
  d.setdefault('a',0)
  Out[129]: 1
  d
  Out[130]: {'a': 1, 'b': 2, 'c': 3}
  d.setdefault('d',0)
  Out[132]: 0
  d
  Out[133]: {'a': 1, 'b': 2, 'c': 3, 'd': 0}
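
A typical use of setdefault is grouping, where the default is an empty list that gets filled afterwards. A sketch with made-up words:

```python
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

groups = {}
for word in words:
    # create the empty list the first time a letter is seen, then append to it
    groups.setdefault(word[0], []).append(word)

print(groups)
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```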

Retrieve

Access by key

get: access by key raises KeyError when the key is missing, whereas get returns None instead and accepts an optional default value

  d
  Out[133]: {'a': 1, 'b': 2, 'c': 3, 'd': 0}

  d['a']
  Out[137]: 1
  d['e']
  Traceback (most recent call last):
    File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3437, in run_code
  	exec(code_obj, self.user_global_ns, self.user_ns)
    File "<ipython-input-138-06ba894b5834>", line 1, in <module>
  	d['e']
  KeyError: 'e'

  d.get('f')  # returns None, so the prompt displays nothing
  d.get('f',0)
  Out[141]: 0
  d
  Out[142]: {'a': 1, 'b': 2, 'c': 3, 'd': 0}

keys() returns all the keys

  d
  Out[142]: {'a': 1, 'b': 2, 'c': 3, 'd': 0}
  d.keys()
  Out[143]: dict_keys(['a', 'b', 'c', 'd'])
  
  The result is a view, not a list; convert it with list() before indexing
  d.keys()[0]
  Traceback (most recent call last):
    File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3437, in run_code
  	exec(code_obj, self.user_global_ns, self.user_ns)
    File "<ipython-input-144-8ea809e8b798>", line 1, in <module>
  	d.keys()[0]
  TypeError: 'dict_keys' object is not subscriptable
  list(d.keys())
  Out[147]: ['a', 'b', 'c', 'd']

values() returns all the values

  list(d.values())
  Out[151]: [1, 2, 3, 0]
  d
  Out[152]: {'a': 1, 'b': 2, 'c': 3, 'd': 0}
  d.values()
  Out[153]: dict_values([1, 2, 3, 0])
  list(d.values())
  Out[154]: [1, 2, 3, 0]

items() returns all the key-value pairs

  d
  Out[155]: {'a': 1, 'b': 2, 'c': 3, 'd': 0}
  d.items()
  Out[156]: dict_items([('a', 1), ('b', 2), ('c', 3), ('d', 0)])
  list(d.items())
  Out[157]: [('a', 1), ('b', 2), ('c', 3), ('d', 0)]
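
In practice, items() is mostly used in a for loop and get() for counting. The sketch below also shows that the objects returned by keys(), values() and items() are live views of the dictionary.

```python
d = {'a': 1, 'b': 2, 'c': 3, 'd': 0}

for key, value in d.items():      # unpack each (key, value) pair
    print(key, value)

counts = {}
for ch in "hello":
    counts[ch] = counts.get(ch, 0) + 1   # a missing key counts as 0
print(counts)                     # {'h': 1, 'e': 1, 'l': 2, 'o': 1}

keys = d.keys()                   # a live view, not a snapshot
d['e'] = 5
print(list(keys))                 # ['a', 'b', 'c', 'd', 'e']
```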

Update

Assignment by key

  d
  Out[158]: {'a': 1, 'b': 2, 'c': 3, 'd': 0}
  d['a']=11
  d
  Out[160]: {'a': 11, 'b': 2, 'c': 3, 'd': 0}

update

  d.update({"b":22,"c":33})
  d
  Out[163]: {'a': 11, 'b': 22, 'c': 33, 'd': 0}

Delete

pop(key) removes the key and returns its value

  d
  Out[164]: {'a': 11, 'b': 22, 'c': 33, 'd': 0}
  d.pop('a')
  Out[165]: 11
  d
  Out[166]: {'b': 22, 'c': 33, 'd': 0}

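popitem() removes and returns a (key, value) pair; since Python 3.7 this is the most recently inserted item (earlier versions returned an arbitrary one)

  d
  Out[166]: {'b': 22, 'c': 33, 'd': 0}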
  d.popitem()
  Out[167]: ('d', 0)
  d
  Out[168]: {'b': 22, 'c': 33}

clear() removes all items

d.clear()
d
Out[172]: {}

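set: set() creates an unordered collection of unique elements. Sets support membership tests, remove duplicates, and can compute intersections, differences, unions, and so on

Create

add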

  s=set()
  s.add("a")
  s
  Out[175]: {'a'}

Retrieve

The in operator: filtering and membership tests

  s
  Out[179]: {'a'}
  'a' in s
  Out[181]: True

Update

update

  s.update({"b","c"})
  s
  Out[184]: {'a', 'b', 'c'}

union returns a new set and leaves s unchanged

  s
  Out[185]: {'a', 'b', 'c'}
  s2={"d","e"}
  s.union(s2)
  Out[187]: {'a', 'b', 'c', 'd', 'e'}

  s.union({"e","f"})
  Out[192]: {'a', 'b', 'c', 'e', 'f'}
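
The set operations mentioned in the introduction (intersection, difference, union) also exist as operators. The results are passed through sorted() here only to get a predictable display order, since sets are unordered.

```python
s = {"a", "b", "c"}
s2 = {"b", "c", "d"}

print(sorted(s | s2))   # ['a', 'b', 'c', 'd']  union
print(sorted(s & s2))   # ['b', 'c']            intersection
print(sorted(s - s2))   # ['a']                 difference
print(sorted(s ^ s2))   # ['a', 'd']            symmetric difference
print(s <= {"a", "b", "c", "d"})   # True: subset test

# Removing duplicates from a list
print(sorted(set([3, 1, 3, 2, 1])))   # [1, 2, 3]
```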

Delete

remove and discard
discard does not raise when the element is missing, whereas remove does

s
Out[193]: {'a', 'b', 'c'}
s.remove("a")
s
Out[195]: {'b', 'c'}
s.discard("e")
s.remove("e")  # raises KeyError when the element does not exist
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3437, in run_code
	exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-197-ae0c369fede2>", line 1, in <module>
	s.remove("e")
KeyError: 'e'

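pop() removes and returns an arbitrary element
Because a set is unordered, which element is removed is not predictable

Exercise

Work through the CRUD operations on the four basic data structures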

Original article: https://www.cnblogs.com/final233/p/15751889.html