• 博客访问： 953143
• 博文数量： 33
• 博客积分： 803
• 博客等级： 军士长
• 技术积分： 1755
• 用 户 组： 普通用户
• 注册时间： 2010-03-05 18:58

《Python科学计算》的作者

2016年（1）

2014年（2）

2013年（3）

2012年（27）

2012-11-17 15:23:06

```import numpy as np

N = 1000
data = np.random.rand(N, 10, 10)
dm = np.zeros(N)
for i in xrange(N):
dm[i] = np.linalg.det(data[i])
```

## NumPy中的det()代码

```def slogdet(a):
a = asarray(a)
_assertRank2(a)
_assertSquareness(a)
t, result_t = _commonType(a)
a = _fastCopyAndTranspose(t, a)
a = _to_native_byte_order(a)
n = a.shape[0]
if isComplexType(t):
lapack_routine = lapack_lite.zgetrf
else:
lapack_routine = lapack_lite.dgetrf
pivots = zeros((n,), fortran_int)
results = lapack_routine(n, n, a, n, pivots, 0)
info = results['info']
if (info < 0):
raise TypeError, "Illegal input to Fortran routine"
elif (info > 0):
return (t(0.0), _realType(t)(-Inf))
sign = 1. - 2. * (add.reduce(pivots != arange(1, n + 1)) % 2)
d = diagonal(a)
absd = absolute(d)
sign *= multiply.reduce(d / absd)
log(absd, absd)
return sign, logdet

def det(a):
sign, logdet = slogdet(a)
return sign * exp(logdet)
```

## Profiling

```%%prun
import numpy as np

N = 5000
data = np.random.rand(N, 10, 10)
dm = np.zeros(N)
for i in xrange(N):
dm[i] = np.linalg.det(data[i])
```

```165004 function calls in 1.581 seconds

Ordered by: internal time

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
5000    0.551    0.000    1.432    0.000 linalg.py:1560(slogdet)
15000    0.130    0.000    0.130    0.000 {method 'reduce' of 'numpy.ufunc' objects}
5000    0.078    0.000    1.510    0.000 linalg.py:1642(det)
5000    0.068    0.000    0.068    0.000 {numpy.linalg.lapack_lite.dgetrf}
5000    0.068    0.000    0.068    0.000 {numpy.core.multiarray._fastCopyAndTranspose}
5000    0.060    0.000    0.130    0.000 linalg.py:99(_commonType)
5000    0.052    0.000    0.052    0.000 {method 'diagonal' of 'numpy.ndarray' objects}
10000    0.051    0.000    0.097    0.000 numeric.py:167(asarray)
5000    0.047    0.000    0.123    0.000 linalg.py:139(_fastCopyAndTranspose)
10000    0.046    0.000    0.046    0.000 {numpy.core.multiarray.array}
1    0.040    0.040    1.581    1.581 :2()
5000    0.039    0.000    0.057    0.000 linalg.py:127(_to_native_byte_order)
5000    0.038    0.000    0.142    0.000 fromnumeric.py:902(diagonal)
10000    0.038    0.000    0.058    0.000 linalg.py:71(isComplexType)
5000    0.034    0.000    0.056    0.000 linalg.py:157(_assertSquareness)
5000    0.034    0.000    0.034    0.000 {numpy.core.multiarray.arange}
15000    0.034    0.000    0.034    0.000 {issubclass}
1    0.031    0.031    0.031    0.031 {method 'rand' of 'mtrand.RandomState' objects}
5000    0.031    0.000    0.040    0.000 linalg.py:151(_assertRank2)
5001    0.025    0.000    0.025    0.000 {numpy.core.multiarray.zeros}
15000    0.025    0.000    0.025    0.000 {len}
5000    0.020    0.000    0.029    0.000 linalg.py:84(_realType)
5000    0.012    0.000    0.012    0.000 {max}
5000    0.010    0.000    0.010    0.000 {method 'append' of 'list' objects}
5000    0.010    0.000    0.010    0.000 {min}
5000    0.009    0.000    0.009    0.000 {method 'get' of 'dict' objects}
1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
```

## 编写高速运算的代码

• 删除对输入矩阵的检测，直接将其当作是满足条件的双精度浮点数组。
• 无须对输入矩阵进行转置，因为转置矩阵的行列式与原始矩阵的行列式相同。
• 不对输入矩阵进行复制，`lapack_lite.dgetrf()`的计算结果将覆盖输入矩阵。如果需要保留原始矩阵，可以在调用`dets_fast()`之前复制数据。
• 创建一个形状为MxN的整数数组pivots，用来保存每次调用`lapack_lite.dgetrf()`时所得到的pivot数组。
• 将取对象线元素的`diagonal(a)`转换为使用高级下标存取:`idx = np.arange(n); d = a[:, idx, idx]`
• 采用NumPy的矢量运算，省略Python级别的循环。
```import numpy as np
from numpy.core import intc
from numpy.linalg import lapack_lite

def dets_fast(a):
m = a.shape[0]
n = a.shape[1]
lapack_routine = lapack_lite.dgetrf
pivots = np.zeros((m, n), intc)
flags = np.arange(1, n + 1).reshape(1, -1)
for i in xrange(m):
tmp = a[i]
lapack_routine(n, n, tmp, n, pivots[i], 0)
sign = 1. - 2. * (np.add.reduce(pivots != flags, axis=1) % 2)
idx = np.arange(n)
d = a[:, idx, idx]
absd = np.absolute(d)
sign *= np.multiply.reduce(d / absd, axis=1)
np.log(absd, absd)
return sign * np.exp(logdet)
```

```import numpy as np
from numpy.core import intc
from numpy.linalg import lapack_lite

def dets(a):
length = a.shape[0]
dm = np.zeros(length)
for i in xrange(length):
dm[i] = np.linalg.det(M[i])
return dm
```

```N = 1000
M = np.random.rand(N*10*10).reshape(N, 10, 10)
print np.allclose(dets(M), dets_fast(M.copy()))
```
```True
```

```%timeit dets(M)
%timeit dets_fast(M.copy())
```
```1 loops, best of 3: 173 ms per loop
100 loops, best of 3: 14.1 ms per loop```