  • Detecting the encoding of a file or string in Java

    1. Using juniversalchardet:

    http://code.google.com/p/juniversalchardet/

    Official example:

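    import java.io.FileInputStream;
    import java.io.IOException;
    
    import org.mozilla.universalchardet.UniversalDetector;
    
    public class TestDetector
    {
      public static void main(String[] args) throws IOException
      {
        byte[] buf = new byte[4096];
    
        // (1) Create the detector; null means no listener is registered.
        UniversalDetector detector = new UniversalDetector(null);
    
        // (2) Feed data until the detector is done or the file ends.
        //     try-with-resources closes the stream in either case.
        try (FileInputStream fis = new FileInputStream("test.txt")) {
          int nread;
          while ((nread = fis.read(buf)) > 0 && !detector.isDone()) {
            detector.handleData(buf, 0, nread);
          }
        }
    
        // (3) Tell the detector that no more data is coming.
        detector.dataEnd();
    
        // (4) Read the result; null means no encoding could be determined.
        String encoding = detector.getDetectedCharset();
        if (encoding != null) {
          System.out.println("Detected encoding = " + encoding);
        } else {
          System.out.println("No encoding detected.");
        }
    
        // (5) Reset the detector before reusing it.
        detector.reset();
      }
    }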

    Sample code from another author:

    // Guesses the encoding of an in-memory byte array.
    // Falls back to UTF-8 when the detector cannot decide.
    public static String guessEncoding(byte[] bytes) {
        final String DEFAULT_ENCODING = "UTF-8";
        org.mozilla.universalchardet.UniversalDetector detector =
            new org.mozilla.universalchardet.UniversalDetector(null);
        // Hand over the whole array in one call, then signal end of data.
        detector.handleData(bytes, 0, bytes.length);
        detector.dataEnd();
        String encoding = detector.getDetectedCharset();
        detector.reset();
        if (encoding == null) {
            encoding = DEFAULT_ENCODING;
        }
        return encoding;
    }
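
    Once a charset name has been guessed, the bytes still have to be decoded with it. The following is a minimal sketch of that step using only the JDK, not part of juniversalchardet: the class DecodeWithGuess and the helpers toCharset and decode are hypothetical names introduced for illustration. It assumes the name passed in comes from something like guessEncoding above, and it falls back to UTF-8 when the name is null or the JVM does not support it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeWithGuess {
    // Hypothetical helper: maps a charset name (for example the value returned
    // by guessEncoding) to a Charset, falling back to UTF-8 when the name is
    // null, syntactically illegal, or not supported by this JVM.
    static Charset toCharset(String name) {
        if (name == null) {
            return StandardCharsets.UTF_8;
        }
        try {
            return Charset.forName(name);
        } catch (IllegalArgumentException e) {
            // Covers both IllegalCharsetNameException and UnsupportedCharsetException.
            return StandardCharsets.UTF_8;
        }
    }

    // Hypothetical helper: decodes the bytes with the guessed charset.
    // Bytes that are invalid in that charset are replaced, not rejected.
    static String decode(byte[] bytes, String guessedName) {
        return new String(bytes, toCharset(guessedName));
    }

    public static void main(String[] args) {
        // "caf" followed by 0xE9, which is e-acute in ISO-8859-1.
        byte[] latin1 = {0x63, 0x61, 0x66, (byte) 0xE9};
        System.out.println(decode(latin1, "ISO-8859-1"));
        // An unknown name falls back to UTF-8.
        System.out.println(toCharset("NOT-A-CHARSET").name());
    }
}
```

    The fallback matters because a detector may report a charset that a given JVM does not ship, in which case Charset.forName would otherwise throw.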
  • Original article: https://www.cnblogs.com/welhzh/p/3599864.html