  • Java: counting the Chinese characters, English letters, digits, spaces, and special characters in a string

    Introduction

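          Characters can be classified by the range they occupy in the Unicode code chart: digits fall between '0' and '9', and English letters between 'a' and 'z' or 'A' and 'Z'. Chinese is usually detected in Java the same way, because Chinese ideographs occupy the range 0x4e00--0x9fbb (in current Unicode versions the CJK Unified Ideographs block extends to 0x9fff). A plain range check is not very precise, however: some Chinese punctuation marks lie outside that range and are misclassified. The program below therefore uses Character.UnicodeBlock instead. The code is as follows: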
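          Before the full listing, a minimal sketch may make the difference concrete. It compares a plain range check with a Character.UnicodeBlock lookup for three sample characters. The class name RangeVsBlockDemo and both helper methods are hypothetical names introduced only for this illustration, and the block check here is deliberately narrower than the one in the article's listing.

```java
class RangeVsBlockDemo {

    // Naive check: only the ideograph range quoted in the text (assumed bounds 0x4E00..0x9FBB).
    static boolean inIdeographRange(char ch) {
        return ch >= 0x4E00 && ch <= 0x9FBB;
    }

    // Block-based check (illustrative subset): ideographs plus CJK punctuation.
    static boolean inCjkBlock(char ch) {
        Character.UnicodeBlock ub = Character.UnicodeBlock.of(ch);
        return ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS
            || ub == Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION;
    }

    public static void main(String[] args) {
        // U+4E2D is an ideograph, U+3002 is the ideographic full stop, U+0061 is 'a'.
        char[] samples = {'\u4E2D', '\u3002', 'a'};
        for (char ch : samples) {
            System.out.println(String.format("U+%04X range=%b block=%b",
                    (int) ch, inIdeographRange(ch), inCjkBlock(ch)));
        }
        // Prints:
        // U+4E2D range=true block=true
        // U+3002 range=false block=true
        // U+0061 range=false block=false
    }
}
```

          The ideographic full stop (U+3002) is the case the range check misses: it sits below 0x4e00, yet the block lookup still recognizes it as CJK punctuation.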

    package cn.csrc.base.count;

    public class CountCharacter {

      public static void main(String[] args) {

        String str ="我爱你abcd123中国 #!";
        CountCharacter countCharacter = new CountCharacter();
        countCharacter.count(str);
      }

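      /** Number of Chinese characters */
      private int chCharacter = 0;

      /** Number of English letters */
      private int enCharacter = 0;

      /** Number of spaces */
      private int spaceCharacter = 0;

      /** Number of digits */
      private int numberCharacter = 0;

      /** Number of other characters */
      private int otherCharacter = 0;

      // Collects the Chinese characters
      private StringBuilder sb1 = new StringBuilder();

      // Collects the English letters
      private StringBuilder sb2 = new StringBuilder();

      // Collects the digits
      private StringBuilder sb3 = new StringBuilder();

      // Collects the special characters
      private StringBuilder sb4 = new StringBuilder();

      /**
       * Counts the Chinese characters, English letters, digits, spaces, and other characters in a string.
       * @param str the string to analyze
       */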
      public void count(String str) {
        // Check for null first: calling a method on a null reference throws a NullPointerException
        if (str == null || str.isEmpty()) {
          System.out.println("The string is empty");
          return;
        }
        for (int i = 0; i < str.length(); i++) {
          char tmp = str.charAt(i);
          if ((tmp >= 'A' && tmp <= 'Z') || (tmp >= 'a' && tmp <= 'z')) {
            enCharacter++;
            sb2.append(tmp).append(' ');
          } else if (tmp >= '0' && tmp <= '9') {
            numberCharacter++;
            sb3.append(tmp).append(' ');
          } else if (tmp == ' ') {
            spaceCharacter++;
          } else if (isChinese(tmp)) {
            chCharacter++;
            sb1.append(tmp).append(' ');
          } else {
            otherCharacter++;
            sb4.append(tmp).append(' ');
          }
        }
        System.out.println("String: " + str);
        System.out.println("Chinese characters: " + chCharacter + " (" + sb1 + ")");
        System.out.println("English letters: " + enCharacter + " (" + sb2 + ")");
        System.out.println("Digits: " + numberCharacter + " (" + sb3 + ")");
        System.out.println("Spaces: " + spaceCharacter);
        System.out.println("Other characters: " + otherCharacter + " (" + sb4 + ")");
      }

      /**
       * Checks whether a character is Chinese.
       * @param ch the character to check
       * @return true if the character is Chinese, false otherwise
       */
      private boolean isChinese(char ch) {
        // Look up the Unicode block this character belongs to
        Character.UnicodeBlock ub = Character.UnicodeBlock.of(ch);
        // GENERAL_PUNCTUATION covers the Chinese quotation mark “
        // CJK_SYMBOLS_AND_PUNCTUATION covers the Chinese full stop 。
        // Note: Extension B lies outside the BMP, so a single char never falls in that block
        if (ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS
            || ub == Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS
            || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A
            || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B
            || ub == Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION
            || ub == Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS
            || ub == Character.UnicodeBlock.GENERAL_PUNCTUATION) {
          System.out.println(ch + " is Chinese");
          //sb1.append(ch+" ");
          return true;
        }
        return false;
      }
    }
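
    One limitation of the listing above is that charAt returns UTF-16 code units, so an ideograph outside the Basic Multilingual Plane (for example one from CJK Extension B) is seen as two surrogate chars and is counted under "other". The following is a minimal sketch of a code-point-based variant, assuming Java 8 or later for String.codePoints(). The class name CodePointCounter, the count method's return type, and the bucket names are hypothetical choices made for this illustration. It also swaps in Character.UnicodeScript.HAN for the block comparison, so unlike isChinese it counts only ideographs and treats Chinese punctuation as "other".

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CodePointCounter {

    // Classifies each Unicode code point (not each UTF-16 char) into one of five buckets.
    static Map<String, Integer> count(String str) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String key : new String[] {"chinese", "english", "digit", "space", "other"}) {
            counts.put(key, 0);
        }
        if (str == null || str.isEmpty()) {
            return counts;
        }
        str.codePoints().forEach(cp -> {
            String key;
            if ((cp >= 'A' && cp <= 'Z') || (cp >= 'a' && cp <= 'z')) {
                key = "english";
            } else if (cp >= '0' && cp <= '9') {
                key = "digit";
            } else if (cp == ' ') {
                key = "space";
            } else if (Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN) {
                // HAN covers ideographs in every plane, including Extension B
                key = "chinese";
            } else {
                key = "other";
            }
            counts.merge(key, 1, Integer::sum);
        });
        return counts;
    }

    public static void main(String[] args) {
        // The article's sample string written with Unicode escapes, followed by
        // U+20000 (an Extension B ideograph) encoded as the surrogate pair \uD840\uDC00.
        String sample = "\u6211\u7231\u4F60abcd123\u4E2D\u56FD #!\uD840\uDC00";
        System.out.println(count(sample));
        // Prints: {chinese=6, english=4, digit=3, space=1, other=2}
    }
}
```

    Returning a map instead of printing inside the method keeps the counting logic separate from the output, which also makes it easier to check the counts in a test.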

      The result is as follows: [screenshot of the console output]

  • Original article: https://www.cnblogs.com/zhaosq/p/11014746.html