  • BytesWritable length problem (extra trailing characters)

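    While using BytesWritable to merge small files, I noticed that the data read back was longer than the original content: a few extra blank characters appeared at the end.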

    Test code

    @Test
    public void test() {
        String str = "aaa";
    
        BytesWritable v = new BytesWritable();
        v.set(str.getBytes(), 0, str.getBytes().length);
    
        // getBytes() returns the whole backing array, including unused capacity
        System.out.println("*" + new String(v.getBytes()) + "*");
    }

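    Result: one extra character is printed after "aaa". It looks like a space, but it is a zero byte (0x00) from the unused part of the backing array.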

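    The BytesWritable source shows why: set() may grow the backing array beyond the data length (to 1.5 times the requested size), and the actual content length is kept in the size field. For "aaa", 3 bytes are requested, so the array grows to 3 * 3 / 2 = 4 bytes.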

    public void set(byte[] newData, int offset, int length) {
        setSize(0);
        setSize(length);
        System.arraycopy(newData, offset, bytes, 0, size);
    }
    public void setSize(int size) {
        if (size > getCapacity()) {
            // Avoid overflowing the int too early by casting to a long.
            long newSize = Math.min(Integer.MAX_VALUE, (3L * size) / 2L);
            setCapacity((int) newSize);
        }
        this.size = size;
    }
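
    To see the growth rule in isolation, here is a minimal sketch that runs without Hadoop. GrowableBytes is a hypothetical stand-in written for this illustration, not the real class; it only mirrors the set()/setSize() logic quoted above and assumes the buffer starts out empty.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical stand-in that mimics the buffer growth shown above; not the Hadoop class. */
public class GrowableBytes {
    private byte[] bytes = new byte[0]; // backing array; its length is the capacity
    private int size = 0;               // number of valid bytes

    public int getCapacity() { return bytes.length; }
    public int getLength() { return size; }
    public byte[] getBytes() { return bytes; }

    public void setSize(int newSize) {
        if (newSize > getCapacity()) {
            // Grow to 1.5x the requested size, computed as a long to avoid int overflow.
            long grown = Math.min(Integer.MAX_VALUE, (3L * newSize) / 2L);
            bytes = Arrays.copyOf(bytes, (int) grown); // new slots are zero-filled
        }
        this.size = newSize;
    }

    public void set(byte[] newData, int offset, int length) {
        setSize(0);
        setSize(length);
        System.arraycopy(newData, offset, bytes, 0, size);
    }

    public static void main(String[] args) {
        GrowableBytes v = new GrowableBytes();
        byte[] data = "aaa".getBytes(StandardCharsets.UTF_8);
        v.set(data, 0, data.length);
        System.out.println("length   = " + v.getLength());   // length   = 3
        System.out.println("capacity = " + v.getCapacity()); // capacity = 4
    }
}
```

    The fourth byte is never written, so it keeps Java's default value of 0, which is the extra character seen in the test output.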

    Since the valid length is known, pass it when converting the bytes to a String.

    @Test
    public void test() {
        String str = "aaa";
    
        BytesWritable v = new BytesWritable();
        v.set(str.getBytes(), 0, str.getBytes().length);
    
        // getSize() is deprecated; use getLength()
        System.out.println("*" + new String(v.getBytes(), 0, v.getLength()) + "*");
    }
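
    When a byte[] is needed instead of a String (for example, to write a merged file's content back out), the same idea applies: copy only the first getLength() bytes. The sketch below uses a hand-built array in place of a real BytesWritable, so ValidRangeDemo, trim, and the sample values are assumptions made for illustration. BytesWritable also provides copyBytes(), which returns a copy that is exactly as long as the data.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustration only: simulates a padded backing array without depending on Hadoop. */
public class ValidRangeDemo {
    /** Returns a copy holding only the first `length` bytes of the backing array. */
    public static byte[] trim(byte[] backing, int length) {
        return Arrays.copyOf(backing, length);
    }

    public static void main(String[] args) {
        // Stand-in for v.getBytes() after set("aaa"): 3 valid bytes plus 1 padding byte.
        byte[] backing = {'a', 'a', 'a', 0};
        int length = 3; // stand-in for v.getLength()

        String whole = new String(backing, StandardCharsets.UTF_8);
        String valid = new String(backing, 0, length, StandardCharsets.UTF_8);

        System.out.println(whole.length());               // 4
        System.out.println(valid.length());               // 3
        System.out.println(trim(backing, length).length); // 3
    }
}
```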


    http://hadoop.apache.org/docs/r2.9.2/api/org/apache/hadoop/io/BytesWritable.html

  • Original article: https://www.cnblogs.com/jhxxb/p/10795875.html