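My original goal was to write a regex-based method that decodes Unicode escape sequences in a string. I first planned to do the replacement by hand with a StringBuilder, but then I noticed that the JDK 7 documentation for Matcher.appendReplacement includes the example below, so there is already a method dedicated to this kind of replacement.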
Pattern p = Pattern.compile("cat");
Matcher m = p.matcher("one cat two cats in the yard");
StringBuffer sb = new StringBuffer();
while (m.find()) {
    m.appendReplacement(sb, "dog");
}
m.appendTail(sb);
System.out.println(sb.toString());
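One detail of appendReplacement that matters later is that the replacement argument is not a plain literal: `$1`, `$2`, and so on refer to capture groups of the current match. The following is a minimal sketch of that behavior; the class name `GroupReferenceDemo` and the method `bracketNumbers` are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupReferenceDemo {
    // Wraps every run of digits in square brackets, using $1 to refer to capture group 1.
    static String bracketNumbers(String text) {
        Matcher m = Pattern.compile("(\\d+)").matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Appends the text before the match, then the replacement with $1 expanded.
            m.appendReplacement(sb, "[$1]");
        }
        // Appends whatever follows the last match.
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(bracketNumbers("room 12, floor 3")); // prints "room [12], floor [3]"
    }
}
```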
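That said, as far as I remember StringBuilder should perform better than the StringBuffer this API requires, because StringBuffer's methods are synchronized. So I wrote a simple test to compare the two implementations (test environment: JDK 7):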
import static org.junit.Assert.assertEquals;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.junit.Before;
import org.junit.Test;

// Runs before each test: checks that both implementations are correct and warms them up.
@Before
public void before() {
    for (int i = 0; i < 100; i++) {
        assertEquals(excepted, unicode2StringWithStringBuffer(input));
        assertEquals(excepted, unicode2StringWithStringBuilder(input));
    }
}

// Times COUNT calls of the StringBuilder version, in milliseconds.
@Test
public void testUnicode2StringWithStringBuilder() {
    long start = System.currentTimeMillis();
    for (int i = 0; i < COUNT; i++) {
        unicode2StringWithStringBuilder(input);
    }
    System.out.println(String.format("v1 StringBuilder %s takes: %s", COUNT, (System.currentTimeMillis() - start)));
}

// Times COUNT calls of the StringBuffer version, in milliseconds.
@Test
public void testUnicode2StringWithStringBuffer() {
    long start = System.currentTimeMillis();
    for (int i = 0; i < COUNT; i++) {
        unicode2StringWithStringBuffer(input);
    }
    System.out.println(String.format("v2 StringBuffer %s takes: %s", COUNT, (System.currentTimeMillis() - start)));
}
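private static final int COUNT = 10000000;
// Expected decoded text; in English: "Request failed, invalid parameter: [action]"
private static final String excepted = "请求失败，参数错误：[action]";
// The input holds literal backslash-u escapes, so every backslash is doubled in Java source.
private static final String input = "\\u8bf7\\u6c42\\u5931\\u8d25\\uff0c\\u53c2\\u6570\\u9519\\u8bef：[action]";
// Matches a literal backslash, the letter u, and exactly four hexadecimal digits.
private static final Pattern patternUnicode = Pattern.compile("\\\\u([0-9a-fA-F]{4})");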
// Version 1: copies the input into a StringBuilder and replaces each escape in place.
private static String unicode2StringWithStringBuilder(final String unicode) {
    if (unicode != null) {
        try {
            Matcher matcher = patternUnicode.matcher(unicode);
            StringBuilder stringBuilder = new StringBuilder(unicode);
            int offset = 0; // index shift caused by replacing a match with a string of different length
            while (matcher.find()) {
                String current = matcher.group();
                String code = matcher.group(1);
                String ch = String.valueOf((char) Integer.parseInt(code, 16));
                stringBuilder.replace(matcher.start() + offset, matcher.end() + offset, ch);
                offset += 1 - current.length(); // 1 is the length of ch
            }
            return stringBuilder.toString();
        } catch (Exception e) {
            e.printStackTrace();
            return unicode;
        }
    } else {
        return unicode;
    }
}
// Version 2: lets Matcher.appendReplacement and appendTail build the result in a StringBuffer.
private static String unicode2StringWithStringBuffer(final String unicode) {
    if (unicode != null) {
        try {
            Matcher matcher = patternUnicode.matcher(unicode);
            StringBuffer sb = new StringBuffer();
            while (matcher.find()) {
                // Caveat: a decoded '$' or backslash is treated as special in the replacement string.
                matcher.appendReplacement(sb, String.valueOf((char) Integer.parseInt(matcher.group(1), 16)));
            }
            matcher.appendTail(sb);
            return sb.toString();
        } catch (Exception e) {
            e.printStackTrace();
            return unicode;
        }
    } else {
        return unicode;
    }
}
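One thing to watch in the appendReplacement version: the replacement string gives `$` and backslash a special meaning, so an escape that decodes to either character (for example the code point 0024, a dollar sign) makes appendReplacement throw, and the method above then falls back to returning its input unchanged. A possible fix, sketched here under the assumption that the decoded character should always be inserted literally, is to wrap it with Matcher.quoteReplacement. The class name `QuotedReplacementDemo` and the method `decode` are illustrative, not part of the test code above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuotedReplacementDemo {
    // A literal backslash, the letter u, and four hex digits (captured as group 1).
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    static String decode(String text) {
        Matcher m = ESCAPE.matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String ch = String.valueOf((char) Integer.parseInt(m.group(1), 16));
            // quoteReplacement makes '$' and backslash literal in the replacement string.
            m.appendReplacement(sb, Matcher.quoteReplacement(ch));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // The input contains an escape for the dollar sign, followed by the digit 5.
        System.out.println(decode("price: \\u00245")); // prints "price: $5"
    }
}
```

On Java 9 and later, Matcher also offers appendReplacement(StringBuilder, String) and appendTail(StringBuilder) overloads, so the same loop can be written without StringBuffer; that was not an option on the JDK 7 used for the test here.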
The results for 1 million and 10 million iterations (times in milliseconds) were:
v2 StringBuffer 1000000 takes: 1815
v1 StringBuilder 1000000 takes: 1364
v2 StringBuffer 10000000 takes: 14107
v1 StringBuilder 10000000 takes: 13316
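I am not sure how rigorous this test is, but in these runs StringBuilder was indeed somewhat faster: roughly 25% at 1 million iterations (1364 ms vs. 1815 ms) and roughly 6% at 10 million (13316 ms vs. 14107 ms).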
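On the question of rigor, a timing loop like the one above is sensitive to JIT warm-up, test execution order, and the JVM discarding results that are never used. A dedicated harness such as JMH is the usual recommendation for this kind of measurement. Short of that, a hand-rolled loop can at least warm up first, use System.nanoTime, consume every result, and repeat the measurement several times. The sketch below shows that shape only; `TimingSketch`, `measureNanos`, and the stand-in workload are hypothetical, and the iteration counts are arbitrary.

```java
import java.util.function.Supplier;

public class TimingSketch {
    // Accumulates results so the JIT is less likely to discard the measured work as dead code.
    static long sink;

    // Runs the task `warmup` times untimed, then `iterations` times timed; returns elapsed nanoseconds.
    static long measureNanos(Supplier<String> task, int warmup, int iterations) {
        for (int i = 0; i < warmup; i++) {
            sink += task.get().length();
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += task.get().length();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Stand-in workload; swap in the two decoding methods being compared.
        Supplier<String> task = () -> new StringBuilder("abc").reverse().toString();
        // Several rounds make run-to-run variation visible instead of hiding it in one number.
        for (int round = 1; round <= 3; round++) {
            long nanos = measureNanos(task, 10_000, 100_000);
            System.out.println("round " + round + ": " + nanos / 1_000_000.0 + " ms");
        }
    }
}
```

Running each candidate for several rounds in alternating order helps show whether a difference of a few percent is larger than the noise between rounds.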