Development environment
- OS: Ubuntu 18.04
- System locale: $LANG = en_US.UTF-8
- Python interpreter: Python 3.6.7
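The "garbled" output
When json.dumps() converts a dict to JSON, Chinese characters are written as their Unicode escape sequences. If the data is sent to a third party, the receiver sees the escaped form too, which is awkward. (Interfaces should use English where possible, but the license plate numbers, violation addresses, and similar fields from the relevant departments are all in Chinese...)
Demo: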
import json

param = {
    "code": "0",
    "message": "中文内容"
}
param1 = json.dumps(param, ensure_ascii=False)  # keep non-ASCII characters as-is
param2 = json.dumps(param)                      # default: ensure_ascii=True
print("param1:", param1)
print("param2:", param2)
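Output:
param1: {"code": "0", "message": "中文内容"}
param2: {"code": "0", "message": "\u4e2d\u6587\u5185\u5bb9"}
This is not actually garbled text. It is the \uXXXX escape form, where XXXX is the hexadecimal Unicode code point of each character.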
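To make the point concrete, here is a minimal sketch showing that the escaped and unescaped strings are two spellings of the same JSON document. The variable names escaped and readable are introduced here for illustration only.

```python
import json

param = {"code": "0", "message": "中文内容"}

escaped = json.dumps(param)                       # ensure_ascii=True (the default)
readable = json.dumps(param, ensure_ascii=False)  # keeps the Chinese characters

# The two results differ as text...
print(escaped == readable)  # False

# ...but both parse back to the same Python object.
print(json.loads(escaped) == json.loads(readable) == param)  # True
```

So a receiver that parses the body with a JSON library gets the Chinese text either way. The escaped form is mainly a problem when someone reads the raw payload or log directly.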
Analysis
The difference comes from the ensure_ascii=False argument. Here is the relevant part of the json.dumps() source:
def dumps(obj, *, skipkeys=False, ensure_ascii=True, check_circular=True,
        allow_nan=True, cls=None, indent=None, separators=None,
        default=None, sort_keys=False, **kw):
    ...
    # In short: when ensure_ascii is false, the return value may contain non-ASCII characters
    If ``ensure_ascii`` is false, then the return value can contain non-ASCII
    characters if they appear in strings contained in ``obj``. Otherwise, all
    such characters are escaped in JSON strings.
    ...
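One practical consequence of that docstring, sketched below under the assumption that the payload will eventually be turned into bytes: the default output can always be encoded as ASCII, while the ensure_ascii=False output needs an encoding such as UTF-8. The names escaped_bytes, readable_bytes, and ascii_ok are illustrative.

```python
import json

param = {"code": "0", "message": "中文内容"}

escaped = json.dumps(param)
readable = json.dumps(param, ensure_ascii=False)

# The default output contains only ASCII characters, so this succeeds.
escaped_bytes = escaped.encode("ascii")

# The ensure_ascii=False output cannot be encoded as ASCII...
try:
    readable.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# ...so it has to be encoded with something like UTF-8 before sending.
readable_bytes = readable.encode("utf-8")

print(ascii_ok)
print(len(escaped_bytes), len(readable_bytes))
```

This is why ensure_ascii=False is usually paired with an explicit .encode("utf-8") when the string is written to a socket or file.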
Demo: sending Chinese data with requests
# encoding: utf-8
import json
import time

import requests


def http_post(url, param=None):
    # Serialize with ensure_ascii=False so Chinese characters are sent as-is,
    # then encode explicitly as UTF-8. Send no body when param is None.
    data = None
    if param is not None:
        data = json.dumps(param, ensure_ascii=False).encode("utf-8")
    headers = {
        "content-type": "application/json",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) "
                      "Chrome/71.0.3578.98 Safari/537.36"
    }
    with requests.Session() as session:
        # Try up to 3 times; only request-level errors are retried.
        for i in range(3):
            try:
                resp = session.post(url, headers=headers, data=data, verify=False, timeout=10)
            except requests.RequestException as e:
                print(e)
                time.sleep(1)
                continue
            if resp.status_code < 300:
                print(resp)
                return
            # A non-2xx response is reported once and not retried.
            print('status code %s' % resp.status_code)
            break


if __name__ == '__main__':
    param = {
        "code": "0",
        "message": "中文内容"
    }
    param1 = json.dumps(param, ensure_ascii=False)
    param2 = json.dumps(param)
    print("param1:", param1)
    print("param2:", param2)
    http_post("http://10.10.19.199:8000/client/instesv/illegallogic", param=param)
Receiver side:
body content:[{"code": "0", "message": "中文内容"}]
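The receiver's code is not shown in this post, so the following is only a sketch of what the receiving side might do with the raw request body. parse_json_body is a hypothetical helper, and it assumes the body is UTF-8 encoded JSON as sent by the demo above.

```python
import json


def parse_json_body(raw: bytes) -> dict:
    """Hypothetical receiver-side helper: decode a UTF-8 JSON request body."""
    return json.loads(raw.decode("utf-8"))


param = {"code": "0", "message": "中文内容"}

# What goes over the wire with and without ensure_ascii=False.
body_readable = json.dumps(param, ensure_ascii=False).encode("utf-8")
body_escaped = json.dumps(param).encode("utf-8")

# Both bodies parse to the same dict on the receiving side.
print(parse_json_body(body_readable) == parse_json_body(body_escaped))

# The raw bytes differ: UTF-8 byte sequences versus \uXXXX escapes.
print(body_readable)
print(body_escaped)
```

If the receiver only logs the raw body, as in the "body content" line above, the ensure_ascii=False payload is the one that shows readable Chinese.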