performance problem

  • From: Laurent Deniau <Laurent.Deniau@xxxxxxx>
  • To: "luajit@xxxxxxxxxxxxx" <luajit@xxxxxxxxxxxxx>
  • Date: Fri, 15 May 2015 17:59:21 +0000

Hello,

I am looking for help with unexpectedly bad performance when building lazy
expressions with LuaJIT. It seems that the traces are always aborted, even
though the use case is very simple and fully in Lua (see the loop at the
bottom), so I guess there is something simple that I am missing. The real case
involves much more code, including C function calls and cdata allocations.

At the end of this post I have copied the code of a toy example that
reproduces the problem; it takes about 10 sec to run the 1e7 iterations
(latest LuaJIT 2.1.0). I would expect this code to be x10-x100 faster if fully
JIT-compiled, with allocations sunk and optimised (i.e. 1-2 sec for 1e8
iterations). Any advice to improve the performance of this code is more than
welcome (and mandatory for us to continue with lazy expressions). Thanks.

Best,
Laurent.
--
Laurent Deniau http://cern.ch/mad
Accelerators Beam Physics mad@xxxxxxx
CERN, CH-1211 Geneva 23 Tel: +41 (0) 22 767 4647


Small use case for motivation:
-------------------------

function M.track(m, e)
local r, pz = m:same() -- r and m are like {x, y, s}
pz = 1/sqrt(1 + 2*m.ps/e.b + m.ps^2 - m.px^2 - m.py^2)
r.x = m.x + e.L*m.px*pz
r.y = m.y + e.L*m.py*pz
r.s = m.s + e.L*(1/e.b + m.ps)*pz - (1-e.T)*e.LD/e.b
return r:eval() -- replace expressions by their evaluation before returning
end

Note that {x, y, s} in m and r can be a mix of scalars and polynomials, and if
they are only scalars, the lazy expressions completely disappear. So the
critical case is the mix of scalars and small polynomials (implemented in C),
where building the expression is about x10-x20 slower than evaluating it,
depending on the expression.
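
The point about scalars can be illustrated with a minimal sketch. The `lazy` metatable, the `new_node` helper and the `var` constructor are hypothetical names used only for this illustration, not part of the real code. With plain numbers, Lua's arithmetic runs directly and no expression object is created; as soon as one operand is not a number, the metamethod is called and a node is allocated.

```lua
-- Hypothetical minimal lazy type, for illustration only.
local lazy = {}
lazy.__index = lazy

-- Count node allocations so the effect is visible.
local allocations = 0

local function new_node(op, lhs, rhs)
  allocations = allocations + 1
  return setmetatable({op = op, lhs = lhs, rhs = rhs}, lazy)
end

lazy.__add = function(a, b) return new_node('+', a, b) end
lazy.__mul = function(a, b) return new_node('*', a, b) end

-- Hypothetical constructor for a non-scalar operand (not counted as a node).
local function var(name)
  return setmetatable({op = 'var', name = name}, lazy)
end

-- Scalars only: ordinary arithmetic, no node is built.
local s = 2 * 1.5 + 3 * 2.5

-- Mixed: every operator with a lazy operand builds one node,
-- while the purely scalar sub-expression 3 * 2.5 is still computed directly.
local p = var('p')
local e = 2 * p + 3 * 2.5

print(type(s), type(e), allocations)
```
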

Toy implementation of lazy expressions:
----------------------------------
Overloads only '+' and '*' for precedence, and overrides 'sqrt' for precedence
and arity.
The loop of 1e7 iterations, which only builds a simple expression, is at the
bottom.
Uncommenting the propagation of the orders during the construction of the
nodes adds a slowdown of x2.5.

-- preliminaries

local sqrt = function(a)
return type(a) == "number" and math.sqrt(a) or a:sqrt()
end

local order = function(a)
if type(a) == "number" then
return 0, 0
else
return a:order()
end
end

local print = function(a, n)
if type(a) == "number" then
n = n or 0
for i=0,n do io.write(' ') end
io.write('n=', a, '\n')
else
a:print(n)
end
end

-- nodes --
local node

node = {
new = function(self, op, lo, hi, lhs, rhs)
return setmetatable({op=op, lo=lo, hi=hi, lhs=lhs, rhs=rhs}, node)
end,

__add = function(a, b)
-- local alo, ahi = order(a)
-- local blo, bhi = order(b)
-- return node:new('+', math.min(alo,blo), math.max(ahi,bhi), a, b)
return node:new('+', -1, -1, a, b)
end,

__mul = function(a, b)
-- local alo, ahi = order(a)
-- local blo, bhi = order(b)
-- return node:new('*', alo+blo, ahi+bhi, a, b)
return node:new('*', -1, -1, a, b)
end,

sqrt = function(a)
return node:new('sqrt', a.lo, a.hi, a)
end,

order = function(a)
return a.lo, a.hi
end,

eval = function(a)
return a
end,

print = function(a, n)
n = n or 0
for i=0,n do io.write(' ') end
io.write(a.op, ': lo=', a.lo, ', hi=', a.hi, '\n')
if a.lhs ~= nil then print(a.lhs, n+2) end
if a.rhs ~= nil then print(a.rhs, n+2) end
end,
}
node.__index = node

-- leaf --
local leaf

leaf = {
new = function(self, s, x, lo, hi, mo)
return setmetatable({s=s, x=x, lo=lo, hi=hi, mo=mo}, leaf)
end,

__add = function(a, b)
local alo, ahi = order(a)
local blo, bhi = order(b)
return node:new('+', math.min(alo,blo), math.max(ahi,bhi), a, b)
end,

__mul = function(a, b)
local alo, ahi = order(a)
local blo, bhi = order(b)
return node:new('*', alo+blo, ahi+bhi, a, b)
end,

sqrt = function(a)
return node:new('sqrt', 0, a.mo, a)
end,

order = function(a)
return a.lo, a.hi
end,

eval = function(a, b)
a.lo, a.hi = order(b)
return a
end,

print = function(a, n)
n = n or 0
for i=0,n do io.write(' ') end
io.write(a.s, ': x=', a.x, ', lo=', a.lo, ', hi=', a.hi, ', mo=', a.mo, '\n')
end,
}
leaf.__index = leaf

-- use case
local a, b, c, e = leaf:new('a', 1.5, 0, 1, 4), leaf:new('b', 2.5, 1, 2, 4),
leaf:new('c', 3.5, 2, 3, 4)

for i=1,10000000 do
e = 2*a + 3*b*b + sqrt(c)
end

print(e)
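
For reference, one way to time the loop and to see which traces abort is sketched below. `time_loop` is a hypothetical helper, not part of the toy code. `jit.v` is the verbose module bundled with LuaJIT (the same output as running `luajit -jv`), and the `pcall` guard is only there so that the snippet also runs where that module is not available. Note that calling the body through a closure may itself change what the JIT compiles, so the numbers are only indicative.

```lua
-- Optional: enable LuaJIT's verbose mode, which reports trace starts and
-- aborts on stderr. Guarded so the snippet also runs under plain Lua.
local ok, v = pcall(require, "jit.v")
if ok and type(v) == "table" and v.on then v.on() end

-- Hypothetical helper: call body(i) n times, return the CPU time in seconds.
local function time_loop(n, body)
  local t0 = os.clock()
  for i = 1, n do body(i) end
  return os.clock() - t0
end

-- Stand-in body; in the toy example it would build the expression instead,
-- e.g. function() e = 2*a + 3*b*b + sqrt(c) end
local sum = 0
local dt = time_loop(1000, function(i) sum = sum + i end)

io.write(string.format("%d iterations in %.3f s\n", 1000, dt))
```
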

