  • Python 3: Getting the Pinyin Initials of Chinese Characters

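    Today I needed to get the pinyin initials of Chinese characters. Nearly every script I found online was written for Python 2, and the few I tried did not work, so I had to adapt an existing script.

    The following line from the Python 2 script does not run under Python 3: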

    asc = ord(str1[0]) * 256 + ord(str1[1]) - 65536

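    After some digging and checking, I finally solved the problem. The reasoning is as follows: in Python 3, encode('gbk') returns a bytes object, and indexing a bytes object yields an int rather than a one-character string, so the ord() calls raise a TypeError and have to be dropped.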
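
    The difference can be checked directly. The snippet below is only a minimal sketch (the variable names gbk_bytes, first and ord_failed are made up for illustration): it encodes one character to GBK, shows that indexing the result gives an int, and confirms that the arithmetic works once ord() is removed.

```python
# Sketch only: shows why the Python 2 expression fails under Python 3.
gbk_bytes = '非'.encode('gbk')   # bytes object; a Chinese character takes two bytes

first = gbk_bytes[0]             # Python 3: indexing bytes yields an int
print(type(first))

try:
    ord(first)                   # Python 2 habit: ord() on a value that is already an int
    ord_failed = False
except TypeError:
    ord_failed = True

# Without ord(), the same arithmetic works directly on the ints.
asc = gbk_bytes[0] * 256 + gbk_bytes[1] - 65536
print(ord_failed, asc)
```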


    That's all there is to it. Here is a version that works on Python 3:
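    def single_get_first(unicode1):
        # GBK encodes ASCII as 1 byte and a Chinese character as 2 bytes.
        try:
            str1 = unicode1.encode('gbk')
        except UnicodeEncodeError:
            return ''  # not representable in GBK, so no initial
        if len(str1) < 2:
            return unicode1  # ASCII: return the original str, not bytes
        else:
            # Indexing bytes yields ints in Python 3, so no ord() is needed.
            asc = str1[0] * 256 + str1[1] - 65536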
            if asc >= -20319 and asc <= -20284:
                return 'a'
            if asc >= -20283 and asc <= -19776:
                return 'b'
            if asc >= -19775 and asc <= -19219:
                return 'c'
            if asc >= -19218 and asc <= -18711:
                return 'd'
            if asc >= -18710 and asc <= -18527:
                return 'e'
            if asc >= -18526 and asc <= -18240:
                return 'f'
            if asc >= -18239 and asc <= -17923:
                return 'g'
            if asc >= -17922 and asc <= -17418:
                return 'h'
            if asc >= -17417 and asc <= -16475:
                return 'j'
            if asc >= -16474 and asc <= -16213:
                return 'k'
            if asc >= -16212 and asc <= -15641:
                return 'l'
            if asc >= -15640 and asc <= -15166:
                return 'm'
            if asc >= -15165 and asc <= -14923:
                return 'n'
            if asc >= -14922 and asc <= -14915:
                return 'o'
            if asc >= -14914 and asc <= -14631:
                return 'p'
            if asc >= -14630 and asc <= -14150:
                return 'q'
            if asc >= -14149 and asc <= -14091:
                return 'r'
            if asc >= -14090 and asc <= -13119:
                return 's'
            if asc >= -13118 and asc <= -12839:
                return 't'
            if asc >= -12838 and asc <= -12557:
                return 'w'
            if asc >= -12556 and asc <= -11848:
                return 'x'
            if asc >= -11847 and asc <= -11056:
                return 'y'
            if asc >= -11055 and asc <= -10247:
                return 'z'
            return ''
        
    def getPinyin(string):
        if string is None:
            return None
        # Map every character to its initial and join the results.
        return ''.join(single_get_first(ch) for ch in string)
    
    if __name__=='__main__':
        print(getPinyin('非场'))
    
    
    
    
     
  • Original article: https://www.cnblogs.com/mayibanjiah/p/6007473.html