Why Lua's string.format is slower than ".."


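Many people intuitively assume that the ".." operator is slower than string.format. That is a misconception: for joining strings, ".." is actually much faster than string.format. Here are the experimental results first.
10,000,000 iterations, 11-character base string
Joining 2 base strings per iteration: string.format 9 s; ".." 3 s
Joining 3 base strings per iteration: string.format 12 s; ".." 3 s
1,000,000 iterations, 11-character base string
Joining 10 base strings per iteration: string.format 4 s; ".." 1 s
10,000,000 iterations, 59-character base string
Joining 2 base strings per iteration: string.format 16 s; ".." 6 s
Joining 3 base strings per iteration: string.format 23 s; ".." 7 s
1,000,000 iterations, 59-character base string
Joining 10 base strings per iteration: string.format 7 s; ".." 2 s
In short, ".." is roughly 3 to 4 times faster than string.format in these runs. The test code follows.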
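local Beg = os.time()
--local str = "hello world"  -- 11-character base string
local str = "hello world hello world hello world hello world hello world"  -- 59-character base string
-- The loops are kept inside a block comment; move one loop out at a time to measure it.
-- Timing notes in each loop: the first figure is "..", the second is string.format;
-- the first row is for the 11-character string, the second for the 59-character string.
--[[
for i = 1, 10000000 do
    local res = str .. str
    --local res = string.format("%s%s", str, str)
    --3s 9s
    --6s 16s
end
for i = 1, 10000000 do
    --local res = str .. str .. str
    local res = string.format("%s%s%s", str, str, str)
    --3s 12s
    --7s 23s
end
for i = 1, 1000000 do
    local res = str .. str .. str .. str .. str .. str .. str .. str .. str .. str
    --local res = string.format("%s%s%s%s%s%s%s%s%s%s", str, str, str, str, str, str, str, str, str, str)
    --1s 4s
    --2s 7s
end
]]
print(os.time() - Beg)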
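The script above measures with os.time(), which only has one-second resolution. As a rough sketch of a finer-grained variant, the snippet below uses os.clock() (CPU time) instead. The helper name bench and the reduced iteration count are my own assumptions, not part of the original experiment, and wrapping each operation in a closure adds the same call overhead to both sides, so only the relative difference is meaningful.

```lua
-- Hypothetical helper: run fn n times and report the CPU time used.
local function bench(label, n, fn)
    local start = os.clock()
    for _ = 1, n do
        fn()
    end
    local elapsed = os.clock() - start
    print(string.format("%-14s %.3f s", label, elapsed))
    return elapsed
end

local str = "hello world"
local n = 1000000  -- smaller than the article's 10,000,000 to keep the run short

local t_concat = bench("..", n, function()
    return str .. str
end)
local t_format = bench("string.format", n, function()
    return string.format("%s%s", str, str)
end)
```

The absolute numbers depend on the machine and the Lua build, so no expected output is given here.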
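Why? The source code answers it. ".." is compiled to OP_CONCAT, which ends up calling luaV_concat. string.format is registered as a function of the string library, so calling it triggers a C function call that ends up in str_format. Here is the source: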
void luaV_concat (lua_State *L, int total, int last) {
  do {
    StkId top = L->base + last + 1;
    int n = 2; /* number of elements handled in this pass (at least 2) */
    if (!(ttisstring(top-2) || ttisnumber(top-2)) || !tostring(L, top-1)) {
      if (!call_binTM(L, top-2, top-1, top-2, TM_CONCAT))
        luaG_concaterror(L, top-2, top-1);
    } else if (tsvalue(top-1)->len == 0) /* second op is empty? */
      (void)tostring(L, top - 2); /* result is first op (as string) */
    else {
      /* at least two string values; get as many as possible */
      size_t tl = tsvalue(top-1)->len;
      char *buffer;
      int i;
      /* collect total length */
      for (n = 1; n < total && tostring(L, top-n-1); n++) {
        size_t l = tsvalue(top-n-1)->len;
        if (l >= MAX_SIZET - tl) luaG_runerror(L, "string length overflow");
        tl += l;
      }
      buffer = luaZ_openspace(L, &G(L)->buff, tl);
      tl = 0;
      for (i=n; i>0; i--) { /* concat all strings */
        size_t l = tsvalue(top-i)->len;
        memcpy(buffer+tl, svalue(top-i), l);
        tl += l;
      }
      setsvalue2s(L, top-n, luaS_newlstr(L, buffer, tl));
    }
    total -= n-1; /* got `n' strings to create 1 new */
    last -= n-1;
  } while (total > 1); /* repeat until only 1 result left */
}
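First, the implementation of "..". To begin with, ".." is not a function call, just an operator. Its approach is as direct as it gets: find all the strings to be joined and memcpy them into a new string. The misconception that ".." is slow probably comes from assuming that a .. b .. c first runs tmp = a .. b and then tmp = tmp .. c, so that a long chain on one line would make the memcpy cost grow quadratically. The Lua source shows otherwise: it does not join just two strings at a time, it collects all the strings to be joined and joins them in one go. Consider the following example.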
str = "a" .. "b"
The Lua bytecode for the statement above is:
1       [1]     LOADK           0 -2    ; "a"
2       [1]     LOADK           1 -3    ; "b"
3       [1]     CONCAT          0 0 1
4       [1]     SETGLOBAL       0 -1    ; str
5       [1]     RETURN          0 1
str = "a" .. "b" .. "c"
The Lua bytecode for the statement above is:
1       [1]     LOADK           0 -2    ; "a"
2       [1]     LOADK           1 -3    ; "b"
3       [1]     LOADK           2 -4    ; "c"
4       [1]     CONCAT          0 0 2
5       [1]     SETGLOBAL       0 -1    ; str
6       [1]     RETURN          0 1
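First, no matter how many strings ".." joins, CONCAT executes only once. Second, the number of strings to join is encoded in the CONCAT operands: the second and third operands name the first and last registers, so the count is C - B + 1. This, too, can be seen from the source: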
case OP_CONCAT: {
  int b = GETARG_B(i);
  int c = GETARG_C(i);
  Protect(luaV_concat(L, c-b+1, c); luaC_checkGC(L));
  setobjs2s(L, RA(i), base+b);
  continue;
}
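Of OP_CONCAT's operands, the first (A) is the register that receives the result, the second (B) is the stack position of the first string, and the third (C) is the stack position of the last string, so C - B + 1 strings are joined in total. The strings to be joined are laid out consecutively on the stack.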
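To make the operand layout concrete, here is a toy model in Lua, not the VM's actual code: a table stands in for the register file, and the hypothetical function op_concat mimics what "CONCAT A B C" does with registers B through C.

```lua
-- Toy model of "CONCAT A B C". The registers are a Lua table indexed from 0.
-- op_concat is a hypothetical name used only for this illustration.
local function op_concat(regs, a, b, c)
    local count = c - b + 1          -- same count that is passed to luaV_concat
    local parts = {}
    for r = b, c do                  -- operands sit in consecutive registers
        parts[#parts + 1] = regs[r]
    end
    regs[a] = table.concat(parts)    -- the result ends up in register A
    return count
end

local regs = { [0] = "a", [1] = "b", [2] = "c" }
local count = op_concat(regs, 0, 0, 2)   -- mirrors "CONCAT 0 0 2" from the listing
print(count, regs[0])                    -- 3 and "abc"
```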
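In other words, Lua's implementation is not as naive as one might imagine. The string-joining algorithm is O(n) in the total length of the strings, and I for one cannot think of a faster one.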
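The quadratic behaviour that people worry about does exist, but only when the result is accumulated across separate statements, typically in a loop, because each iteration then runs its own CONCAT over the whole string built so far. The sketch below contrasts that with a single expression and with the standard table.concat; the variable names are illustrative only.

```lua
local pieces = {}
for i = 1, 1000 do
    pieces[i] = "hello world"
end

-- Accumulating in a loop: every iteration executes a separate CONCAT and
-- copies everything built so far, so total copying grows quadratically.
local acc = ""
for i = 1, #pieces do
    acc = acc .. pieces[i]
end

-- table.concat joins all pieces in a single library call.
local joined = table.concat(pieces)

-- A single expression compiles to one CONCAT, whatever the operand count.
local a, b, c = pieces[1], pieces[2], pieces[3]
local three = a .. b .. c

print(#acc, #joined, #three)   -- 11000, 11000 and 33
```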
Now the source of string.format:
static int str_format (lua_State *L) {
  int arg = 1;
  size_t sfl;
  const char *strfrmt = luaL_checklstring(L, arg, &sfl);
  const char *strfrmt_end = strfrmt+sfl;
  luaL_Buffer b;
  luaL_buffinit(L, &b);
  while (strfrmt < strfrmt_end) {
    if (*strfrmt != L_ESC)
      luaL_addchar(&b, *strfrmt++);
    else if (*++strfrmt == L_ESC)
      luaL_addchar(&b, *strfrmt++); /* %% */
    else { /* format item */
      char form[MAX_FORMAT]; /* to store the format (`%...') */
      char buff[MAX_ITEM]; /* to store the formatted item */
      arg++;
      strfrmt = scanformat(L, strfrmt, form);
      switch (*strfrmt++) {
        case 'c': {
          sprintf(buff, form, (int)luaL_checknumber(L, arg));
          break;
        }
        case 'd': case 'i': {
          addintlen(form);
          sprintf(buff, form, (LUA_INTFRM_T)luaL_checknumber(L, arg));
          break;
        }
        case 'o': case 'u': case 'x': case 'X': {
          addintlen(form);
          sprintf(buff, form, (unsigned LUA_INTFRM_T)luaL_checknumber(L, arg));
          break;
        }
        case 'e': case 'E': case 'f':
        case 'g': case 'G': {
          sprintf(buff, form, (double)luaL_checknumber(L, arg));
          break;
        }
        case 'q': {
          addquoted(L, &b, arg);
          continue; /* skip the 'addsize' at the end */
        }
        case 's': {
          size_t l;
          const char *s = luaL_checklstring(L, arg, &l);
          if (!strchr(form, '.') && l >= 100) {
            /* no precision and string is too long to be formatted;
               keep original string */
            lua_pushvalue(L, arg);
            luaL_addvalue(&b);
            continue; /* skip the `addsize' at the end */
          }
          else {
            sprintf(buff, form, s);
            break;
          }
        }
        default: { /* also treat cases `pnLlh' */
          return luaL_error(L, "invalid option " LUA_QL("%%%c") " to "
                               LUA_QL("format"), *(strfrmt - 1));
        }
      }
      luaL_addlstring(&b, buff, strlen(buff));
    }
  }
  luaL_pushresult(&b);
  return 1;
}
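string.format is itself a function call, so it carries function-call overhead, but let us set that aside for now. Even taken on its own, string.format is considerably more complex than "..". The algorithm works like this:
The loop walks the format string strfrmt. Literal characters are appended to the result buffer b one at a time; when a % is found, the conversion specification that follows it is handled as one format item. For example, if strfrmt is "a%sb%s", the first item covers "a%s" and the second covers "b%s".
For each format item, scanformat copies the specification (for example "%s") into form. sprintf then uses form and the current argument on the stack to produce the formatted text in buff, and at the end of the item buff is appended to the final result b.
For %s there is an optimization: if no precision is given and the string is at least 100 characters long, the intermediate result is not stored in buff; the string is appended directly to the final result b.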
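The steps above can be sketched in Lua as a heavily simplified model. This is an illustration under my own assumptions, not the real implementation: mini_format is a hypothetical name, it handles only %s and %%, and a Lua table stands in for the luaL_Buffer b, whereas the real code works on C buffers and goes through sprintf.

```lua
-- Hypothetical, simplified model of the str_format loop (only %s and %%).
local function mini_format(fmt, ...)
    local out = {}                        -- plays the role of the result buffer b
    local arg = 0
    local i = 1
    while i <= #fmt do
        local ch = fmt:sub(i, i)
        if ch ~= "%" then
            out[#out + 1] = ch            -- literal character, copied one at a time
            i = i + 1
        else
            local nxt = fmt:sub(i + 1, i + 1)
            if nxt == "%" then
                out[#out + 1] = "%"       -- "%%" yields a single "%"
            elseif nxt == "s" then
                arg = arg + 1             -- each format item consumes one argument
                out[#out + 1] = tostring((select(arg, ...)))
            else
                error("invalid option '%" .. nxt .. "' to 'mini_format'")
            end
            i = i + 2
        end
    end
    return table.concat(out)
end

print(mini_format("a%sb%s", "X", "Y"))    -- aXbY
print(mini_format("100%%"))               -- 100%
```

Even in this reduced form, the formatter has to inspect every character of the format string before it can copy any argument, which is work that ".." never does.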
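From this we can see that, compared with "..", when string.format is used purely to join strings and the pieces are short, its extra cost lies mainly in scanning and copying the format string, copying the intermediate string, and some function-call overhead.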
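Taking the run with 10,000,000 iterations, an 11-character base string, and 2 base strings joined per iteration as the baseline, I modified the Lua source and repeated the experiment.
First, with stock Lua, string.format takes 9 seconds and ".." takes 3 seconds in this run.
When I removed the condition if (!strchr(form, '.') && l >= 100), so that the optimized path is always taken, string.format took 5 seconds, nearly twice as fast. That makes sense: without the optimization, every character of the final result is first stored in the intermediate buffer buff, so it is naturally about twice as slow.
Then, to simulate string.format's function-call overhead, I wrapped ".." in a function:
local function concat(str1, str2)
    return str1 .. str2
end
With concat in place of "..", the ".." version took 4 seconds, already close to string.format.
To simulate scanning and copying the format string, I changed concat once more and appended string.format's format string as well (given how luaV_concat works, a string added after ".." is simply memcpy'd onto the end of the result). The modified concat:
local function concat(str1, str2)
    return str1 .. str2 .. "%s%s"
end
With this concat in place of "..", the ".." version took 5 seconds, the same as string.format. The complete code follows.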
//lstrlib.c
if (true/*!strchr(form, '.') && l >= 100*/) {
--test.lua
local Beg = os.time()
local str = "hello world"
local function concat(str1, str2)
    return str1 .. str2 .. "%s%s"
end
for i = 1, 10000000 do
    --local res = str .. str
    local res = concat(str, str)
    --local res = string.format("%s%s", str, str)
end
print(os.time() - Beg)
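This experiment is not very rigorous, but it still makes the point. When string.format is used only to join strings, its main extra cost compared with ".." is copying the intermediate string; the function call itself accounts for part of it, and scanning and copying the format string accounts for another part, which depends on the length of the format string.
So, as long as readability does not suffer, ".." is definitely the better choice over string.format for joining strings.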
http://yulinlu.blog.163.com/blog/static/58815698201231502544486/

Original article: https://www.cnblogs.com/byfei/p/6389909.html