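  • Two ways to get the pinyin of Chinese characters in C# (Chinese-to-pinyin conversion)

    This article presents two ways to convert Chinese characters to pinyin in C#: using the Microsoft language pack, and a hand-coded implementation.

    When this article was first published it covered only one approach, the Microsoft language pack. That approach handles polyphonic characters (characters with more than one reading) poorly, and a few characters even come out with odd errors, so a second, hand-coded approach has been added.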

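    Method 1: Using the Microsoft language pack

    To help developers convert between languages for internationalization, Microsoft provides the Microsoft Visual Studio International Pack. It bundles libraries for Chinese, Japanese, Korean, English and other languages, with methods for conversion, getting pinyin, counting characters, and even getting stroke counts. [This approach does not handle polyphonic characters well, but it is simple: reference the library and you are done. It suits cases where only a few strings need processing or where polyphonic accuracy does not matter.]

    The example here takes Chinese characters as input and produces their pinyin. It has two functions: getting the full pinyin and getting the pinyin initials.

    First, download the Microsoft Visual Studio International Pack from the Microsoft website. The two downloads are: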

    Microsoft Visual Studio International Pack 1.0 SR1
    Microsoft Visual Studio International Feature Pack 2.0

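    The downloaded files are “vsintlpack1.zip” and “Vsintlpack2.msi”. Double-click “Vsintlpack2.msi” to install it. Any install path works, but remember it, because you will reference it shortly.

    After installing “Vsintlpack2.msi”, unzip “vsintlpack1.zip”. It contains seven language packs, for example “CHSPinYinConv.msi” for Chinese-to-pinyin conversion and “CHTCHSConv.msi” for converting between Simplified and Traditional Chinese.

    The one used here is “CHSPinYinConv.msi”. After installing it, open Visual Studio, create a new WinForm project, and lay out the form as shown in the figure above.

    First, add a reference to the library that was just installed: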

    “D:\Program Files (x86)\Microsoft Visual Studio International Pack\Simplified Chinese Pin-Yin Conversion Library\ChnCharInfo.dll”

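    The default location is drive C; here it was installed on drive D. Then add the using directive:

    using Microsoft.International.Converters.PinYinConverter; // pinyin conversion types

    Create the method that gets the pinyin: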

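    /// <summary>
    /// Converts Chinese characters to pinyin.
    /// </summary>
    /// <param name="str">Chinese text</param>
    /// <returns>Full pinyin</returns>
    public static string GetPinyin(string str)
    {
        string r = string.Empty;
        foreach (char obj in str)
        {
            try
            {
                ChineseChar chineseChar = new ChineseChar(obj);
                // Only the first listed reading is used, which is why
                // polyphonic characters can come out wrong.
                string t = chineseChar.Pinyins[0].ToString();
                // Drop the last character of the reading (the tone digit).
                r += t.Substring(0, t.Length - 1);
            }
            catch
            {
                // Not a character the library recognizes: keep it unchanged.
                r += obj.ToString();
            }
        }
        return r;
    }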
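    The `Substring(0, t.Length - 1)` call above assumes every reading ends in exactly one tone digit (for example "GU3"). A slightly more defensive variant is sketched below; `PinyinText` and `StripTone` are hypothetical names introduced for illustration and are not part of the language pack.

```csharp
using System;

public static class PinyinText
{
    // Removes one trailing ASCII tone digit if present, e.g. "GU3" -> "GU".
    // Input without a trailing digit is returned unchanged.
    public static string StripTone(string pinyinWithTone)
    {
        if (string.IsNullOrEmpty(pinyinWithTone)) return string.Empty;
        char last = pinyinWithTone[pinyinWithTone.Length - 1];
        bool endsWithDigit = last >= '0' && last <= '9';
        return endsWithDigit
            ? pinyinWithTone.Substring(0, pinyinWithTone.Length - 1)
            : pinyinWithTone;
    }
}
```

    With this helper, a reading that happens to lack a tone digit would not lose its final letter.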

    Create the method that gets the pinyin initials:

    /// <summary>
    /// Converts Chinese characters to their pinyin initials.
    /// </summary>
    /// <param name="str">Chinese text</param>
    /// <returns>Initials</returns>
    public static string GetFirstPinyin(string str)
    {
        string r = string.Empty;
        foreach (char obj in str)
        {
            try
            {
                ChineseChar chineseChar = new ChineseChar(obj);
                string t = chineseChar.Pinyins[0].ToString();
                // Keep only the first letter of the reading.
                r += t.Substring(0, 1);
            }
            catch
            {
                // Not a character the library recognizes: keep it unchanged.
                r += obj.ToString();
            }
        }
        return r;
    }

    Then call the method above in the click event of the “Convert to pinyin” button:

    // Convert Chinese to pinyin
    private void btn_One_Click(object sender, EventArgs e)
    {
        string source = this.txt_ChineseCharacter_One.Text.Trim();  // the input text
        string result = GetPinyin(source);  // get the pinyin
        this.txt_Pinyin_One.Text = result;
    }

    And in the click event of the “Convert to initials” button:

    // Convert to initials
    private void btn_Two_Click(object sender, EventArgs e)
    {
        string source = this.txt_ChineseCharacter_One.Text.Trim();  // the input text
        string result = GetFirstPinyin(source);  // get the pinyin initials
        this.txt_Pinyin_One.Text = result;
    }

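    At this point the work is 80% done. Run the program and click “Convert to pinyin”, and you will see output that is entirely uppercase with no separators.

    That is not the “Gu Ying” style of result shown at the start. The difference is that the earlier result was post-processed a little when getting the pinyin: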

    // Convert Chinese to pinyin
    private void btn_One_Click(object sender, EventArgs e)
    {
        string source = this.txt_ChineseCharacter_One.Text.Trim();  // the input text
        string result = string.Empty;   // conversion result
        string temp = string.Empty;     // scratch variable used in the loop
        foreach (char item in source)   // visit each input character
        {
            temp = GetPinyin(item.ToString());  // convert one character
            // Capitalize the first letter, lowercase the rest, and append a space
            result += (String.Format("{0}{1} ", temp.Substring(0, 1).ToUpper(), temp.Substring(1).ToLower()));
        }
        //string result = GetPinyin(source);  // previous version: convert the whole string at once
        this.txt_Pinyin_One.Text = result;
    }
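    The formatting step in the handler above leaves a trailing space and uses the culture-sensitive `ToUpper`/`ToLower`. As a minimal sketch of the same idea without those two side effects, the helper below joins already-converted syllables; `PinyinFormat`, `Capitalize` and `JoinSyllables` are hypothetical names, not part of the original code.

```csharp
using System;
using System.Collections.Generic;

public static class PinyinFormat
{
    // "GU" -> "Gu"; empty or null input becomes an empty string.
    public static string Capitalize(string syllable)
    {
        if (string.IsNullOrEmpty(syllable)) return string.Empty;
        return char.ToUpperInvariant(syllable[0])
            + syllable.Substring(1).ToLowerInvariant();
    }

    // Capitalizes each syllable and joins them with single spaces,
    // so no trailing space is produced.
    public static string JoinSyllables(IEnumerable<string> syllables)
    {
        var parts = new List<string>();
        foreach (string s in syllables)
        {
            parts.Add(Capitalize(s));
        }
        return string.Join(" ", parts);
    }
}
```

    For example, `PinyinFormat.JoinSyllables(new[] { "GU", "YING" })` returns "Gu Ying".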

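    OK, that completes this feature. The other language packs work in a similar way; searching Baidu for “Microsoft Visual Studio International Pack usage” turns up examples of conversion between the various languages.

    Method 2: Hand-coded implementation

    This approach is not difficult either. Put simply, it takes each character's GB2312 code value (its two encoded bytes combined into one number), defines a matching array of pinyin syllables, and looks the syllable up by range.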
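    To make the code value concrete: the lookup works on a signed number built from the two bytes of a character in GB2312, computed as high * 256 + low - 65536. The sketch below shows only that arithmetic; `Gb2312Code` and `ToSignedCode` are hypothetical names used for illustration.

```csharp
using System;

public static class Gb2312Code
{
    // Combines the two GB2312 bytes of a character into the signed
    // value that the lookup table is keyed on.
    public static int ToSignedCode(byte high, byte low)
    {
        return high * 256 + low - 65536;
    }
}
```

    For instance, “啊” is encoded as the bytes 0xB0 0xA1 in GB2312, and `Gb2312Code.ToSignedCode(0xB0, 0xA1)` gives -20319, the first entry of the boundary array.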

    First define the array of boundary codes:

    // Boundary codes: the first code value of each pinyin syllable's range
    private static int[] getValue = new int[]
        {
            -20319,-20317,-20304,-20295,-20292,-20283,-20265,-20257,-20242,-20230,-20051,-20036,
            -20032,-20026,-20002,-19990,-19986,-19982,-19976,-19805,-19784,-19775,-19774,-19763,
            -19756,-19751,-19746,-19741,-19739,-19728,-19725,-19715,-19540,-19531,-19525,-19515,
            -19500,-19484,-19479,-19467,-19289,-19288,-19281,-19275,-19270,-19263,-19261,-19249,
            -19243,-19242,-19238,-19235,-19227,-19224,-19218,-19212,-19038,-19023,-19018,-19006,
            -19003,-18996,-18977,-18961,-18952,-18783,-18774,-18773,-18763,-18756,-18741,-18735,
            -18731,-18722,-18710,-18697,-18696,-18526,-18518,-18501,-18490,-18478,-18463,-18448,
            -18447,-18446,-18239,-18237,-18231,-18220,-18211,-18201,-18184,-18183,-18181,-18012,
            -17997,-17988,-17970,-17964,-17961,-17950,-17947,-17931,-17928,-17922,-17759,-17752,
            -17733,-17730,-17721,-17703,-17701,-17697,-17692,-17683,-17676,-17496,-17487,-17482,
            -17468,-17454,-17433,-17427,-17417,-17202,-17185,-16983,-16970,-16942,-16915,-16733,
            -16708,-16706,-16689,-16664,-16657,-16647,-16474,-16470,-16465,-16459,-16452,-16448,
            -16433,-16429,-16427,-16423,-16419,-16412,-16407,-16403,-16401,-16393,-16220,-16216,
            -16212,-16205,-16202,-16187,-16180,-16171,-16169,-16158,-16155,-15959,-15958,-15944,
            -15933,-15920,-15915,-15903,-15889,-15878,-15707,-15701,-15681,-15667,-15661,-15659,
            -15652,-15640,-15631,-15625,-15454,-15448,-15436,-15435,-15419,-15416,-15408,-15394,
            -15385,-15377,-15375,-15369,-15363,-15362,-15183,-15180,-15165,-15158,-15153,-15150,
            -15149,-15144,-15143,-15141,-15140,-15139,-15128,-15121,-15119,-15117,-15110,-15109,
            -14941,-14937,-14933,-14930,-14929,-14928,-14926,-14922,-14921,-14914,-14908,-14902,
            -14894,-14889,-14882,-14873,-14871,-14857,-14678,-14674,-14670,-14668,-14663,-14654,
            -14645,-14630,-14594,-14429,-14407,-14399,-14384,-14379,-14368,-14355,-14353,-14345,
            -14170,-14159,-14151,-14149,-14145,-14140,-14137,-14135,-14125,-14123,-14122,-14112,
            -14109,-14099,-14097,-14094,-14092,-14090,-14087,-14083,-13917,-13914,-13910,-13907,
            -13906,-13905,-13896,-13894,-13878,-13870,-13859,-13847,-13831,-13658,-13611,-13601,
            -13406,-13404,-13400,-13398,-13395,-13391,-13387,-13383,-13367,-13359,-13356,-13343,
            -13340,-13329,-13326,-13318,-13147,-13138,-13120,-13107,-13096,-13095,-13091,-13076,
            -13068,-13063,-13060,-12888,-12875,-12871,-12860,-12858,-12852,-12849,-12838,-12831,
            -12829,-12812,-12802,-12607,-12597,-12594,-12585,-12556,-12359,-12346,-12320,-12300,
            -12120,-12099,-12089,-12074,-12067,-12058,-12039,-11867,-11861,-11847,-11831,-11798,
            -11781,-11604,-11589,-11536,-11358,-11340,-11339,-11324,-11303,-11097,-11077,-11067,
            -11055,-11052,-11045,-11041,-11038,-11024,-11020,-11019,-11018,-11014,-10838,-10832,
            -10815,-10800,-10790,-10780,-10764,-10587,-10544,-10533,-10519,-10331,-10329,-10328,
            -10322,-10315,-10309,-10307,-10296,-10281,-10274,-10270,-10262,-10260,-10256,-10254
        };

    Then define the array of pinyin syllables:

    // Pinyin syllables, index-aligned with getValue
    private static string[] getName = new string[]
        {
            "A","Ai","An","Ang","Ao","Ba","Bai","Ban","Bang","Bao","Bei","Ben",
            "Beng","Bi","Bian","Biao","Bie","Bin","Bing","Bo","Bu","Ca","Cai","Can",
            "Cang","Cao","Ce","Ceng","Cha","Chai","Chan","Chang","Chao","Che","Chen","Cheng",
            "Chi","Chong","Chou","Chu","Chuai","Chuan","Chuang","Chui","Chun","Chuo","Ci","Cong",
            "Cou","Cu","Cuan","Cui","Cun","Cuo","Da","Dai","Dan","Dang","Dao","De",
            "Deng","Di","Dian","Diao","Die","Ding","Diu","Dong","Dou","Du","Duan","Dui",
            "Dun","Duo","E","En","Er","Fa","Fan","Fang","Fei","Fen","Feng","Fo",
            "Fou","Fu","Ga","Gai","Gan","Gang","Gao","Ge","Gei","Gen","Geng","Gong",
            "Gou","Gu","Gua","Guai","Guan","Guang","Gui","Gun","Guo","Ha","Hai","Han",
            "Hang","Hao","He","Hei","Hen","Heng","Hong","Hou","Hu","Hua","Huai","Huan",
            "Huang","Hui","Hun","Huo","Ji","Jia","Jian","Jiang","Jiao","Jie","Jin","Jing",
            "Jiong","Jiu","Ju","Juan","Jue","Jun","Ka","Kai","Kan","Kang","Kao","Ke",
            "Ken","Keng","Kong","Kou","Ku","Kua","Kuai","Kuan","Kuang","Kui","Kun","Kuo",
            "La","Lai","Lan","Lang","Lao","Le","Lei","Leng","Li","Lia","Lian","Liang",
            "Liao","Lie","Lin","Ling","Liu","Long","Lou","Lu","Lv","Luan","Lue","Lun",
            "Luo","Ma","Mai","Man","Mang","Mao","Me","Mei","Men","Meng","Mi","Mian",
            "Miao","Mie","Min","Ming","Miu","Mo","Mou","Mu","Na","Nai","Nan","Nang",
            "Nao","Ne","Nei","Nen","Neng","Ni","Nian","Niang","Niao","Nie","Nin","Ning",
            "Niu","Nong","Nu","Nv","Nuan","Nue","Nuo","O","Ou","Pa","Pai","Pan",
            "Pang","Pao","Pei","Pen","Peng","Pi","Pian","Piao","Pie","Pin","Ping","Po",
            "Pu","Qi","Qia","Qian","Qiang","Qiao","Qie","Qin","Qing","Qiong","Qiu","Qu",
            "Quan","Que","Qun","Ran","Rang","Rao","Re","Ren","Reng","Ri","Rong","Rou",
            "Ru","Ruan","Rui","Run","Ruo","Sa","Sai","San","Sang","Sao","Se","Sen",
            "Seng","Sha","Shai","Shan","Shang","Shao","She","Shen","Sheng","Shi","Shou","Shu",
            "Shua","Shuai","Shuan","Shuang","Shui","Shun","Shuo","Si","Song","Sou","Su","Suan",
            "Sui","Sun","Suo","Ta","Tai","Tan","Tang","Tao","Te","Teng","Ti","Tian",
            "Tiao","Tie","Ting","Tong","Tou","Tu","Tuan","Tui","Tun","Tuo","Wa","Wai",
            "Wan","Wang","Wei","Wen","Weng","Wo","Wu","Xi","Xia","Xian","Xiang","Xiao",
            "Xie","Xin","Xing","Xiong","Xiu","Xu","Xuan","Xue","Xun","Ya","Yan","Yang",
            "Yao","Ye","Yi","Yin","Ying","Yo","Yong","You","Yu","Yuan","Yue","Yun",
            "Za","Zai","Zan","Zang","Zao","Ze","Zei","Zen","Zeng","Zha","Zhai","Zhan",
            "Zhang","Zhao","Zhe","Zhen","Zheng","Zhi","Zhong","Zhou","Zhu","Zhua","Zhuai","Zhuan",
            "Zhuang","Zhui","Zhun","Zhuo","Zi","Zong","Zou","Zu","Zuan","Zui","Zun","Zuo"
        };

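    Then define the method that converts a string:

    /// <summary>Converts Chinese characters to full pinyin.</summary>
    /// <param name="Chstr">Chinese text</param>
    /// <returns>The converted pinyin string</returns>
    // Requires: using System.Text; using System.Text.RegularExpressions;
    public string StrConvertToPinyin(string Chstr)
    {
        Regex reg = new Regex("^[\u4e00-\u9fa5]$"); // matches a single Chinese character
        // The tables hold GB2312 codes, so encode with GB2312 explicitly instead of
        // Encoding.Default, which depends on the machine's settings.
        // On .NET Core / .NET 5+ this code page must first be registered
        // through CodePagesEncodingProvider.
        Encoding gb2312 = Encoding.GetEncoding("GB2312");
        byte[] arr = new byte[2];
        string pystr = "";
        int asc = 0, M1 = 0, M2 = 0;
        char[] mChar = Chstr.ToCharArray(); // the input as a character array
        for (int j = 0; j < mChar.Length; j++)
        {
            // The character is Chinese
            if (reg.IsMatch(mChar[j].ToString()))
            {
                arr = gb2312.GetBytes(mChar[j].ToString());
                if (arr.Length < 2)
                {
                    // Not representable in GB2312: keep the character unchanged.
                    pystr += mChar[j];
                    continue;
                }
                M1 = arr[0];
                M2 = arr[1];
                asc = M1 * 256 + M2 - 65536;
                switch (asc)
                {
                    // Individually handled characters that lie outside the table's range
                    case -9254:
                        pystr += "Zhen"; break;
                    case -8985:
                        pystr += "Qian"; break;
                    case -5463:
                        pystr += "Jia"; break;
                    case -8274:
                        pystr += "Ge"; break;
                    case -5448:
                        pystr += "Ga"; break;
                    case -5447:
                        pystr += "La"; break;
                    case -4649:
                        pystr += "Chen"; break;
                    case -5436:
                        pystr += "Mao"; break;
                    case -5213:
                        pystr += "Mao"; break;
                    case -3597:
                        pystr += "Die"; break;
                    case -5659:
                        pystr += "Tian"; break;
                    default:
                        if (asc < -20319 || asc > -10247)
                        {
                            // Outside the range the table covers; without this check
                            // every later character would be reported as "Zuo".
                            pystr += mChar[j];
                            break;
                        }
                        // Scan from the end for the last boundary that is <= asc
                        for (int i = (getValue.Length - 1); i >= 0; i--)
                        {
                            if (getValue[i] <= asc)
                            {
                                pystr += getName[i]; // the syllable for that range
                                break;
                            }
                        }
                        break;
                }
            }
            else    // Not a Chinese character: copy it through unchanged
            {
                pystr += mChar[j].ToString();
            }
        }
        return pystr; // the resulting pinyin
    }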
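    The lookup in the conversion method scans the boundary array from the end until it finds the last boundary not greater than the code. Because the boundaries are sorted in ascending order, the same "largest boundary <= code" search can also be done with `Array.BinarySearch`. The sketch below uses a hypothetical three-entry excerpt of the two tables (`Bounds`, `Names`) so that it stands alone; `PinyinLookup` and `Find` are illustrative names.

```csharp
using System;

public static class PinyinLookup
{
    // Hypothetical three-entry excerpt of the boundary and syllable tables.
    private static readonly int[] Bounds = { -20319, -20317, -20304 };
    private static readonly string[] Names = { "A", "Ai", "An" };

    // Returns the syllable whose range contains code,
    // or null if code is below the first boundary.
    public static string Find(int code)
    {
        int idx = Array.BinarySearch(Bounds, code);
        if (idx < 0)
        {
            // Not an exact hit: ~idx is the insertion point, so step back
            // one position to the largest boundary that is <= code.
            idx = ~idx - 1;
        }
        return idx >= 0 ? Names[idx] : null;
    }
}
```

    With the full tables an upper-bound check is still needed, since any code above the last boundary would otherwise map to the last syllable.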

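    This method does not recognize polyphonic characters perfectly either, but because it is hand-written it can be controlled by hand. For example, for “家长” the desired result is “Jia Zhang”, but the generated result is “Jia Chang”.

    Phrases that contain polyphonic characters can be handled separately: define a table of polyphonic characters together with the phrases they appear in, and during conversion, when a polyphonic character is encountered, look up the matching reading in its phrases.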
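    One possible shape for such a phrase table is sketched below. It is an illustration under assumptions rather than the author's implementation: `PhrasePinyin`, `Overrides` and `Convert` are hypothetical names, the two table entries are sample data, and the per-character converter is passed in as a delegate so the sketch does not depend on either method above.

```csharp
using System;
using System.Collections.Generic;

public static class PhrasePinyin
{
    // Hypothetical override table: phrases whose reading differs
    // from the per-character default.
    private static readonly Dictionary<string, string> Overrides =
        new Dictionary<string, string>
        {
            { "家长", "Jia Zhang" },
            { "银行", "Yin Hang" },
        };

    // perChar is any per-character converter (for example the table lookup).
    public static string Convert(string text, Func<char, string> perChar,
                                 int maxPhraseLength = 4)
    {
        var parts = new List<string>();
        int i = 0;
        while (i < text.Length)
        {
            bool matched = false;
            // Try the longest candidate phrase first so longer entries win.
            int longest = Math.Min(maxPhraseLength, text.Length - i);
            for (int len = longest; len >= 2 && !matched; len--)
            {
                string reading;
                if (Overrides.TryGetValue(text.Substring(i, len), out reading))
                {
                    parts.Add(reading);
                    i += len;
                    matched = true;
                }
            }
            if (!matched)
            {
                // No phrase starts here: fall back to the per-character reading.
                parts.Add(perChar(text[i]));
                i++;
            }
        }
        return string.Join(" ", parts);
    }
}
```

    Checking phrases before falling back to single characters means the default table never sees a polyphonic character that is covered by a known phrase.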

    Reposted from: http://www.jb51.net/article/59441.htm

  • Original post: https://www.cnblogs.com/love201314/p/4962971.html