  • [Engineering Tips]

    #-*-coding:gbk-*-
    #########################################################################
    #   Copyright (C) 2015 All rights reserved.
    #
    #   File name: getnltksinica.py
    #   Author: 刘禹 finallyly liuyusi0121@sogou-inc.com(ext 3209)
    #   Created: 2015-10-28
    #   Description:
    #
    #   Notes:
    #
    #########################################################################
    #!/usr/bin/python
    # please add your code here!
    import sys
    # Python 2 only: site.py deletes sys.setdefaultencoding at startup,
    # so the module has to be reloaded to get it back.
    reload(sys)
    sys.setdefaultencoding('utf8')
    import nltk
    from nltk.corpus import sinica_treebank
    # Frequency distribution over every word in the Sinica Treebank corpus.
    sinica_fd=nltk.FreqDist(sinica_treebank.words())
    # Number of distinct words.
    print len(sinica_fd)
    # Write each distinct word, separated by spaces.
    for m in sinica_fd:
        sys.stdout.write("%s "%m)

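    The data here is UTF-8 encoded. Printing it directly to the terminal works, but redirecting the output raises an encoding error: the system's default encoding is GBK, and once stdout is redirected Python 2 falls back to its own default encoding instead of the terminal's. Adding the two lines reload(sys) and sys.setdefaultencoding('utf8') makes the error go away.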
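    An alternative that avoids changing the interpreter-wide default is to encode explicitly at the point of output. The following is only a minimal sketch under that assumption, not the original script's approach: write_words is a hypothetical helper introduced for illustration, and the sample words stand in for the keys of sinica_fd.

```python
# -*- coding: utf-8 -*-
import io


def write_words(words, stream, encoding="utf-8"):
    """Write each word plus a trailing space to a binary stream.

    Hypothetical helper (not in the original script): the text is
    encoded explicitly, so the result does not depend on the
    interpreter's default encoding or on whether stdout is redirected.
    """
    for word in words:
        stream.write((u"%s " % word).encode(encoding))


# Demo with an in-memory byte buffer standing in for redirected output.
buf = io.BytesIO()
write_words([u"中文", u"語料"], buf)
```

    To use it with the script above, the stream would be sys.stdout itself on Python 2, or sys.stdout.buffer on Python 3, where sys.stdout only accepts text.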

  • Original post: https://www.cnblogs.com/finallyliuyu/p/4916508.html