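While merging small files with BytesWritable, I noticed that the data I read back was longer than the original content: extra blank characters appeared at the end.
Test code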
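import org.apache.hadoop.io.BytesWritable;
import org.junit.Test;

@Test
public void test() {
    String str = "aaa";
    BytesWritable v = new BytesWritable();
    v.set(str.getBytes(), 0, str.getBytes().length);
    // getBytes() returns the whole backing array, including unused capacity
    System.out.println("*" + new String(v.getBytes()) + "*");
}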
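In the output there is one extra blank character before the closing asterisk. It is not really a space: it is an unused zero byte (0x00) from the backing array, which prints as a blank.
Looking at the BytesWritable source explains it. set() grows the backing array to more than the requested length before copying the data in, so the array can be larger than the content. The actual content length is kept in the size field.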
public void set(byte[] newData, int offset, int length) {
    setSize(0);
    setSize(length);
    System.arraycopy(newData, offset, bytes, 0, size);
}

public void setSize(int size) {
    if (size > getCapacity()) {
        // Avoid overflowing the int too early by casting to a long.
        long newSize = Math.min(Integer.MAX_VALUE, (3L * size) / 2L);
        setCapacity((int) newSize);
    }
    this.size = size;
}
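To see where the single extra byte comes from, the growth rule in setSize() can be reproduced on its own. This is only a sketch: CapacitySketch and grownCapacity are hypothetical names, nothing here calls Hadoop, and it assumes a fresh BytesWritable whose capacity starts at 0.

```java
// Standalone sketch (no Hadoop dependency) of the growth arithmetic in setSize().
// CapacitySketch and grownCapacity are illustrative names, not part of Hadoop.
class CapacitySketch {

    // Same arithmetic as setSize(): 1.5x the requested size, capped at Integer.MAX_VALUE.
    static int grownCapacity(int size) {
        long newSize = Math.min(Integer.MAX_VALUE, (3L * size) / 2L);
        return (int) newSize;
    }

    public static void main(String[] args) {
        for (int size : new int[] {1, 3, 10, 1024}) {
            int capacity = grownCapacity(size);
            // padding = bytes in the backing array that do not belong to the content
            System.out.println("size=" + size
                    + " capacity=" + capacity
                    + " padding=" + (capacity - size));
        }
    }
}
```

For the 3-byte string "aaa" this gives a capacity of 4, which matches the one extra character seen in the test. The padding grows with the content, so larger values carry more trailing bytes.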
Since the valid length is known, pass it explicitly when converting the bytes to a String.
@Test
public void test() {
    String str = "aaa";
    BytesWritable v = new BytesWritable();
    v.set(str.getBytes(), 0, str.getBytes().length);
    // getSize() is deprecated; use getLength() instead
    System.out.println("*" + new String(v.getBytes(), 0, v.getLength()) + "*");
}
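The same idea applies to any over-allocated buffer, and it can be tried without Hadoop on the classpath. The sketch below simulates the backing array with a plain byte[]. TrimSketch, decode and trim are hypothetical helper names, and using UTF-8 explicitly is my assumption (the test above relies on the platform default charset).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Standalone sketch: a plain byte[] stands in for the BytesWritable backing array.
// TrimSketch, decode and trim are illustrative names, not Hadoop APIs.
class TrimSketch {

    // Decode only the first `length` bytes and ignore the unused capacity.
    static String decode(byte[] buffer, int length) {
        return new String(buffer, 0, length, StandardCharsets.UTF_8);
    }

    // Copy out exactly the valid bytes when a right-sized array is needed.
    static byte[] trim(byte[] buffer, int length) {
        return Arrays.copyOf(buffer, length);
    }

    public static void main(String[] args) {
        byte[] content = "aaa".getBytes(StandardCharsets.UTF_8);
        // Simulated backing array: 3 valid bytes plus 1 zero byte of padding.
        byte[] padded = Arrays.copyOf(content, 4);

        System.out.println("*" + decode(padded, content.length) + "*"); // prints *aaa*
        System.out.println(trim(padded, content.length).length);        // prints 3
    }
}
```

When a right-sized byte[] is needed rather than a String, for example to write the file content back out, BytesWritable also offers copyBytes(), which the API documentation linked below describes as returning a copy that is exactly the length of the data.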
http://hadoop.apache.org/docs/r2.9.2/api/org/apache/hadoop/io/BytesWritable.html