  • Detecting the encoding of a file or string in Java

    1. Using juniversalchardet:

    http://code.google.com/p/juniversalchardet/

    Official example:

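    import java.io.FileInputStream;
    import java.io.IOException;
    import org.mozilla.universalchardet.UniversalDetector;
    
    public class TestDetector
    {
      public static void main(String[] args) throws IOException
      {
        byte[] buf = new byte[4096];
    
        // (1) Construct the detector; null means no listener is registered.
        UniversalDetector detector = new UniversalDetector(null);
    
        // (2) Feed data until the detector is confident or the file ends.
        //     try-with-resources closes the stream in either case.
        try (FileInputStream fis = new FileInputStream("test.txt")) {
          int nread;
          while ((nread = fis.read(buf)) > 0 && !detector.isDone()) {
            detector.handleData(buf, 0, nread);
          }
        }
        // (3) Tell the detector that no more data is coming.
        detector.dataEnd();
    
        // (4) Read the result; null means no encoding could be determined.
        String encoding = detector.getDetectedCharset();
        if (encoding != null) {
          System.out.println("Detected encoding = " + encoding);
        } else {
          System.out.println("No encoding detected.");
        }
    
        // (5) Reset the detector before reusing it.
        detector.reset();
      }
    }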

    Example code from another author:

    public static String guessEncoding(byte[] bytes) {
        String DEFAULT_ENCODING = "UTF-8";
        org.mozilla.universalchardet.UniversalDetector detector =
            new org.mozilla.universalchardet.UniversalDetector(null);
        detector.handleData(bytes, 0, bytes.length);
        detector.dataEnd();
        String encoding = detector.getDetectedCharset();
        detector.reset();
        if (encoding == null) {
            encoding = DEFAULT_ENCODING;
        }
        return encoding;
    }
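
    Once an encoding name has been guessed, it still has to be turned into a Charset and used to decode the bytes. The following is a minimal sketch of that step using only the JDK; it is not part of juniversalchardet. The class name EncodingUtil and its methods are hypothetical, and falling back to UTF-8 simply mirrors the DEFAULT_ENCODING choice above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
public class EncodingUtil {

    // Map a detected name (possibly null, or unknown to this JVM) to a Charset,
    // falling back to UTF-8 as an assumed default.
    public static Charset toCharset(String name) {
        if (name == null) {
            return StandardCharsets.UTF_8;
        }
        try {
            return Charset.forName(name);
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names.
            return StandardCharsets.UTF_8;
        }
    }

    // Strictly decode the bytes; returns false if the charset rejects them.
    public static boolean decodesCleanly(byte[] bytes, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Decode the bytes using the detected name, or the fallback if it is unusable.
    public static String decode(byte[] bytes, String detectedName) {
        return new String(bytes, toCharset(detectedName));
    }

    public static void main(String[] args) {
        byte[] utf8 = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodesCleanly(utf8, StandardCharsets.UTF_8));
        System.out.println(decode(utf8, "UTF-8"));
    }
}
```

    A caller could pass the result of guessEncoding(bytes) as detectedName. The strict check in decodesCleanly is a cheap sanity test for a guess, not a detector in its own right: single-byte encodings such as ISO-8859-1 accept any byte sequence.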
  • Original article: https://www.cnblogs.com/welhzh/p/3599864.html