Basic facts
Before running any code, Lua translates (precompiles) the source into an internal format. This format is a sequence of instructions for a virtual machine, similar to machine code for a real CPU. This internal format is then interpreted by C code that is essentially a while loop with a large switch inside, one case for each instruction.
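The "while loop with a large switch" can be pictured with a toy interpreter. This is only a sketch written in Lua itself, not Lua's actual C implementation; the instruction layout and the opcode names LOADK and RETURN are assumptions made for illustration (ADD is borrowed from the text).

```lua
-- Toy register-based interpreter: NOT Lua's real VM, only an illustration.
-- Each instruction is a table {op, a, b, c}; the layout is hypothetical.
local function run(code, constants)
  local reg = {}   -- "registers" of this toy machine
  local pc = 1     -- program counter
  while true do    -- the interpreter loop
    local ins = code[pc]
    pc = pc + 1
    local op, a, b, c = ins[1], ins[2], ins[3], ins[4]
    -- The if/elseif chain plays the role of the C switch statement.
    if op == "LOADK" then        -- reg[a] = constants[b]
      reg[a] = constants[b]
    elseif op == "ADD" then      -- reg[a] = reg[b] + reg[c]
      reg[a] = reg[b] + reg[c]
    elseif op == "RETURN" then   -- stop and return reg[a]
      return reg[a]
    else
      error("unknown opcode: " .. tostring(op))
    end
  end
end

-- Program: a = 10; b = 32; a = a + b; return a
local program = {
  {"LOADK", 0, 1},
  {"LOADK", 1, 2},
  {"ADD", 0, 0, 1},
  {"RETURN", 0},
}
print(run(program, {10, 32}))  --> 42
```

Each pass through the loop fetches one instruction and dispatches on its opcode, which is the overall shape the paragraph above describes.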
Perhaps you have already read somewhere that, since version 5.0, Lua uses a register-based virtual machine. The "registers" of this virtual machine do not correspond to real registers in the CPU, because this correspondence would not be portable and would be quite limited in the number of registers available. Instead, Lua uses a stack (implemented as an array plus some indices) to accommodate its registers. Each active function has an activation record, which is a stack slice wherein the function stores its registers. So, each function has its own registers (see footnote 2). Each function may use up to 250 registers, because each instruction has only 8 bits to refer to a register.
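One way to picture "a stack slice per active function" is the model below. It is a simplified sketch, not how the Lua core is actually written: the helper new_frame and its fields base and top are hypothetical names, and only the limit of 250 registers comes from the text.

```lua
-- Illustrative model only: one shared array acts as the stack, and each
-- call gets a "base" index; register r of that call lives at stack[base + r].
local stack = {}

-- Hypothetical helper: builds accessors for one activation record.
local function new_frame(base, nregs)
  assert(nregs <= 250, "too many registers for one function")
  local frame = {base = base, top = base + nregs}
  function frame.get(r) return stack[base + r] end
  function frame.set(r, v) stack[base + r] = v end
  return frame
end

local caller = new_frame(1, 3)           -- registers 0..2 live in stack[1..3]
local callee = new_frame(caller.top, 2)  -- registers 0..1 live in stack[4..5]

caller.set(0, "caller r0")
callee.set(0, "callee r0")

print(caller.get(0))  --> caller r0
print(callee.get(0))  --> callee r0
```

Both frames have a "register 0", yet they never collide, because each one indexes a different slice of the same array.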
Given that large number of registers, the Lua precompiler is able to store all local variables in registers. The result is that access to local variables is very fast in Lua. For instance, if a and b are local variables, a Lua statement like a = a + b generates one single instruction: ADD 0 0 1 (assuming that a and b are in registers 0 and 1, respectively). For comparison, if both a and b were globals, the code for that addition would be like this:
GETGLOBAL 0 0 ; a
GETGLOBAL 1 1 ; b
ADD 0 0 1
SETGLOBAL 0 0 ; a
So, it is easy to justify one of the most important rules to improve the performance of Lua programs: use locals!
If you need to squeeze performance out of your program, there are several places where you can use locals besides the obvious ones. For instance, if you call a function within a long loop, you can first assign the function to a local variable. The code
for i = 1, 1000000 do
  local x = math.sin(i)
end
runs 30% slower than this one:
local sin = math.sin
for i = 1, 1000000 do
  local x = sin(i)
end
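To check a figure like this on your own machine, a small timing harness is enough. The sketch below assumes os.clock is precise enough for the comparison; time_it is a hypothetical helper name, and the measured ratio will vary with the Lua version and the hardware.

```lua
-- Minimal timing helper built on os.clock (CPU time in seconds).
local function time_it(f)
  local start = os.clock()
  f()
  return os.clock() - start
end

local N = 1000000

local t_global = time_it(function ()
  for i = 1, N do
    local x = math.sin(i)  -- looks up math, then sin, on every iteration
  end
end)

local t_local = time_it(function ()
  local sin = math.sin     -- looked up once, then kept in a local
  for i = 1, N do
    local x = sin(i)
  end
end)

print(string.format("math.sin: %.3f s, local sin: %.3f s", t_global, t_local))
```

Running the script a few times and comparing the two numbers gives a rough idea of the gain; no particular ratio is guaranteed.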
-----------------------
2 This is similar to the register windows found in some CPUs.
Access to external locals (that is, variables that are local to an enclosing function) is not as fast as access to local variables, but it is still faster than access to globals. Consider the next fragment:
function foo (x)
  for i = 1, 1000000 do
    x = x + math.sin(i)
  end
  return x
end
print(foo(10))
We can optimize it by declaring sin once, outside function foo:
local sin = math.sin
function foo (x)
  for i = 1, 1000000 do
    x = x + sin(i)
  end
  return x
end
print(foo(10))
This second version runs 30% faster than the original one.
Although the Lua compiler is quite efficient when compared with compilers for other languages, compilation is a heavy task. So, you should avoid compiling code in your program (e.g., function loadstring) whenever possible. Unless you must run code that is really dynamic, such as code entered by an end user, you seldom need to compile dynamic code.
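When dynamic code really is unavoidable, one common way to limit the cost is to compile the text once and reuse the resulting function. The sketch below assumes a hypothetical piece of user-supplied source in user_code; it also falls back to load because loadstring is not available in Lua versions after 5.1.

```lua
-- loadstring exists in Lua 5.1; later versions provide load instead.
local compile = loadstring or load

-- Hypothetical user-supplied source text.
local user_code = "local x = ...; return x * 2"

-- Compile once, outside the loop...
local f = assert(compile(user_code))

-- ...then call the compiled function as many times as needed.
local sum = 0
for i = 1, 1000 do
  sum = sum + f(i)
end
print(sum)  --> 1001000
```

The expensive step (compilation) happens a single time, while the loop only pays for ordinary function calls.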
As an example, consider the next code, which creates a table with functions to return constant values from 1 to 100000:
local lim = 100000
local a = {}
for i = 1, lim do
  a[i] = loadstring(string.format("return %d", i))
end
print(a[10]()) --> 10
This code runs in 1.4 seconds.
With closures, we avoid the dynamic compilation. The next code creates the same 100000 functions in 1/10 of the time (0.14 seconds):
function fk (k)
  return function () return k end
end

local lim = 100000
local a = {}
for i = 1, lim do a[i] = fk(i) end
print(a[10]()) --> 10
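The two approaches can be compared side by side with a sketch like the following. The helper time_it is a hypothetical name, and the fallback to load is an assumption added so the script also runs on Lua versions that no longer provide loadstring; the absolute times depend on the machine and will not match the figures quoted above.

```lua
local compile = loadstring or load  -- loadstring in Lua 5.1, load afterwards
local lim = 100000

-- Runs f and returns the elapsed CPU time plus f's result.
local function time_it(f)
  local start = os.clock()
  local result = f()
  return os.clock() - start, result
end

-- Version 1: compile one chunk per function.
local t_compile, a = time_it(function ()
  local t = {}
  for i = 1, lim do
    t[i] = compile(string.format("return %d", i))
  end
  return t
end)

-- Version 2: create one closure per function.
local function fk(k)
  return function () return k end
end

local t_closure, b = time_it(function ()
  local t = {}
  for i = 1, lim do t[i] = fk(i) end
  return t
end)

assert(a[10]() == b[10]())  -- both tables hold equivalent functions
print(string.format("compile: %.2f s, closures: %.2f s", t_compile, t_closure))
```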