  • Reading and writing files in Python, and the encoding of response bodies from common HTTP clients

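    In [9]-[12], conclusion: in Python 2, a unicode object cannot be written to a file directly, because the implicit conversion uses the ASCII codec; encode it first. Any encoding works as long as it can encode the text, e.g. GBK or UTF-8. (A plain str literal, as in In [10], is already a byte string and is written as-is.)
    In [13], conclusion: the chosen encoding must be able to encode the target characters (in the example, "喆" is not in GB2312, so encoding fails).
    In [14], conclusion: when writing through codecs.open with the encoding passed as an argument, there is no need to encode the string by hand; the encoding must still be able to encode the target characters (GBK can encode "喆").
    In [15], conclusion: reading with open returns str; reading with codecs.open returns unicode.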
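
The point about choosing an encoding that covers the target characters can be checked without touching a file. The following is a minimal sketch written for Python 3 (where `str` plays the role of Python 2's `unicode`); the helper name `can_encode` is hypothetical and not part of the original post.

```python
# Python 3 sketch: which codecs can encode the sample text from the post?
text = "我爱李喆"


def can_encode(s, encoding):
    """Return True if every character of s can be encoded with `encoding`."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Try the codecs that appear in the post, plus plain ASCII.
results = {enc: can_encode(text, enc) for enc in ("ascii", "gb2312", "gbk", "utf-8")}
for enc, ok in results.items():
    print(enc, ok)
```

GB2312 fails only because of "喆"; GBK is a superset that includes it, which is why In [14] succeeds where In [13] does not.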

    Other notes:
    Type of the HTTP response body: urllib and urllib2 return str; requests returns unicode via response.text (response.content is the raw str).
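
When a client hands back raw bytes, as urllib and urllib2 do, the body has to be decoded with the charset the server declared. Below is a minimal sketch for Python 3 that needs no network access: the function name `decode_body`, the sample header value, and the UTF-8 fallback are all assumptions made for illustration, not something the original post specifies.

```python
from email.message import Message


def decode_body(raw_body, content_type, fallback="utf-8"):
    """Decode a raw response body using the charset from a Content-Type value.

    `fallback` is an assumed default for headers that carry no charset.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset() or fallback
    # errors="replace" keeps decoding from raising on stray bytes.
    return raw_body.decode(charset, errors="replace")


# Simulated response: a GBK-encoded body and its (hypothetical) header value.
raw = "我爱李喆".encode("gbk")
print(decode_body(raw, "text/html; charset=GBK"))
```

A library that returns text directly is doing an equivalent decoding step internally, so a wrong or missing charset shows up as garbled text rather than as an exception.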

    In [9]: %paste
    t = u"我爱中国"
    with open('file.txt', 'w') as f:
            f.write(t)
    
    ## -- End pasted text --
    ---------------------------------------------------------------------------
    UnicodeEncodeError                        Traceback (most recent call last)
    <ipython-input-8-97316ac14ece> in <module>()
          2 with open('file.txt', 'w') as f:
    ----> 4         f.write(t)
    
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)
    
    In [10]: %paste
    t = "我爱中国"
    with open('file.txt', 'w') as f:
            f.write(t)
    
    ## -- End pasted text --
    
    In [11]: %paste
    t = u"我爱中国"
    with open('file.txt', 'w') as f:
            f.write(t.encode('utf8'))
    ## -- End pasted text --
    
    In [12]: %paste
    t = u"我爱中国"
    with open('file.txt', 'w') as f:
            f.write(t.encode('gbk'))
    ## -- End pasted text --
    
    In [13]: %paste
    t = u"我爱李喆"
    with open('file.txt', 'w') as f:
            f.write(t.encode('gb2312'))
    ## -- End pasted text --
    ---------------------------------------------------------------------------
    UnicodeEncodeError                        Traceback (most recent call last)
    <ipython-input-13-53c795397701> in <module>()
          1 t = u"我爱李喆"
          2 with open('file.txt', 'w') as f:
    ----> 3         f.write(t.encode('gb2312'))
    
    UnicodeEncodeError: 'gb2312' codec can't encode character u'\u5586' in position 3: illegal multibyte sequence
    
    In [14]: %paste
    import codecs
    t = u"我爱李喆"
    with codecs.open('file.txt', 'w', "gbk") as f:
            f.write(t)
    ## -- End pasted text --
    
    In [15]: %paste
    with open('file.txt', 'r') as f:
            t = f.read()
            print type(t)
    
    import codecs
    with codecs.open('file.txt', 'r', "gbk") as f:
            t = f.read()
            print type(t)
    ## -- End pasted text --
    <type 'str'>
    <type 'unicode'>
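
The write and read steps from In [14] and In [15] can be combined into one round trip. This is a sketch for Python 3, where the two return types are `bytes` and `str` rather than `str` and `unicode`; the temporary directory is an assumption added so that the example does not overwrite a real file.txt.

```python
import codecs
import os
import tempfile

text = "我爱李喆"
path = os.path.join(tempfile.mkdtemp(), "file.txt")

# Write: codecs.open encodes to GBK on the fly, so no manual encode() call.
with codecs.open(path, "w", "gbk") as f:
    f.write(text)

# Read in binary mode: the raw GBK bytes come back undecoded.
with open(path, "rb") as f:
    raw = f.read()

# Read through codecs.open: the bytes are decoded back to text.
with codecs.open(path, "r", "gbk") as f:
    decoded = f.read()

print(type(raw), type(decoded))
```

Reading with a different codec than the one used for writing would either raise a decoding error or silently produce the wrong characters, so the encoding has to be recorded or agreed on somewhere.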
    
    This article was originally published at http://www.cnblogs.com/qijj; please keep this notice when reposting.
  • Original URL: https://www.cnblogs.com/qijj/p/6384451.html