gary_uk n00b
Joined: 18 Nov 2014 Posts: 10
|
Posted: Tue Nov 18, 2014 1:42 pm Post subject: Installing numpy with multiprocessing |
|
|
Hardware: Xeon E5-2640v3 X2
Software: Gentoo_AMD64
I was hoping to install Numpy and its LAPACK/BLAS dependencies in a way that makes the most of the many CPU cores. According to Scipy's website ( http://wiki.scipy.org/ParallelProgramming ), this can be done by compiling with OpenMP turned on. I must admit I'm a little concerned about adding `-fopenmp' to the compiler flags in /etc/portage/make.conf and applying a general update using emerge. Since I'm new to Gentoo, I was wondering about the following:
1. Is there a recommended way in Gentoo to install Numpy with OpenMP?
2. If not, is there a way I can tell Portage to apply specific compiler flags only to Numpy and its dependencies?
3. If not, is my best solution compiling Numpy and its dependencies manually without Portage?
Thanks. |
khayyam Watchman
Joined: 07 Jun 2012 Posts: 6227 Location: Room 101
|
Posted: Thu Dec 04, 2014 5:17 pm Post subject: |
|
|
gary ...
I'm not sure about this specific use case but you might try the following:
Create an 'openmp.conf' in /etc/portage/env with whatever env you need to apply (i.e., CFLAGS changes)
/etc/portage/env/openmp.conf
Code: | CFLAGS="${CFLAGS} -fopenmp" |
In /etc/portage/package.env/ create a file providing the information as to which packages should use this env ...
/etc/portage/package.env/openmp.env
Code: | dev-python/numpy openmp.conf
<category>/<package> openmp.conf |
This env should be applied to these packages on re-merge.
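One way to check afterwards that the flag really was applied is to look at what Portage recorded for the installed package. The following is only a minimal sketch, assuming the installed-package database lives in the default /var/db/pkg and keeps a CFLAGS file per installed version; the helper name `recorded_cflags` is made up for illustration.

```python
import glob
import os


def recorded_cflags(package, vdb_root='/var/db/pkg'):
    """Return {installed version dir: recorded CFLAGS} for a category/name atom.

    Assumption: Portage stores the CFLAGS used at build time in
    <vdb_root>/<category>/<name>-<version>/CFLAGS.
    """
    pattern = os.path.join(vdb_root, package + '-[0-9]*', 'CFLAGS')
    found = {}
    for path in sorted(glob.glob(pattern)):
        version_dir = os.path.basename(os.path.dirname(path))
        with open(path) as handle:
            found[version_dir] = handle.read().strip()
    return found


if __name__ == '__main__':
    # Report, for each installed numpy version, whether -fopenmp was recorded.
    for version, flags in recorded_cflags('dev-python/numpy').items():
        status = 'has' if '-fopenmp' in flags.split() else 'lacks'
        print('{0}: {1} -fopenmp ({2})'.format(version, status, flags))
```

If nothing is printed, no matching package was found under the assumed database path.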
HTH & best ... khay |
gary_uk n00b
Joined: 18 Nov 2014 Posts: 10
|
Posted: Thu Jan 29, 2015 11:18 am Post subject: |
|
|
khayyam wrote: |
I'm not sure about this specific use case but you might try the following:
|
Thanks for the advice, but my best solution so far has turned out to be a little more long-winded. In order to make optimal use of all the cores, I needed a few things:
1. OpenMP (a la UNIX's pthread)
2. Intel's mkl.
3. Intel's icc.
I couldn't get these to install on Gentoo, apparently because Gentoo isn't LSB-compliant. I therefore installed an LSB package on an RPM-based distro, installed the Intel binaries and copied the /opt/intel/ directory into Gentoo's /opt/. In order to make sure icc was available in Gentoo's environment, I created a shell script /etc/profile.d/intel along the lines of...
Code: |
source /opt/intel/.../compilervars.sh intel64
export LD_LIBRARY_PATH=/opt/intel/.../mkl/lib/intel64:/opt/intel/.../lib/intel64:$LD_LIBRARY_PATH
|
...so I could compile with icc as any user. I then downloaded the latest stable numpy tar.gz source and made the following changes to the configuration files:
intelccompiler.py:
Code: |
#self.cc_exe = 'icc -m64 -fPIC'
self.cc_exe = 'icc -O3 -g -fPIC -fp-model strict -fomit-frame-pointer -openmp -xhost'
|
intel.py:
Code: |
#return ['-i8 -xhost -openmp -fp-model strict']
return ['-xhost -openmp -fp-model strict -fPIC']
|
site.cfg:
Code: |
[mkl]
library_dirs = /opt/intel/.../mkl/lib/intel64
include_dirs = /opt/intel/.../mkl/include
mkl_libs = mkl_rt
lapack_libs = mkl_lapack95_lp64
|
Finally, I installed numpy using the following line:
Code: |
sh-4.2# python setup.py config --compiler=intelem build_clib --compiler=intelem build_ext --compiler=intelem install
|
(... actually I used `python2...' because most of my code is still written for Python 2.7.X).
I agree the process is a lot more long-winded than `emerge -qan numpy'. I'm not familiar enough with Gentoo to know whether Portage could have performed an installation with this kind of customisation. It's also quite irritating that, when I attempt to install numpy-dependent packages, Portage complains that numpy is not installed, forcing me to resort to distutils every time rather than using emerge. However, a comparison of the performance when running the following program is instructive:
Code: |
import numpy as np
import time
n = 5      # repetitions per K; the reported time is the average
N = 6000
M = 10000
k_list = [64, 80, 96, 104, 112, 120, 128, 144, 160, 176, 192, 200, 208, 224, 240, 256, 384]
def get_gflops(M, N, K):
    # Operation count for the (M, N) x (N, K) product, in units of 10**9.
    # The figures listed below correspond to 10**9 rather than 1024**3.
    return M*N*(2.0*K-1.0) / 1e9
#np.show_config()
for K in k_list:
    # Random double-precision operands in C order
    a = np.array(np.random.random((M, N)), dtype=np.double, order='C', copy=False)
    b = np.array(np.random.random((N, K)), dtype=np.double, order='C', copy=False)
    A = np.matrix(a, dtype=np.double, copy=False)
    B = np.matrix(b, dtype=np.double, copy=False)
    # Time n multiplications and average
    start = time.time()
    for i in range(n):
        C = np.dot(A, B)
    end = time.time()
    tm = (end-start) / float(n)
    # Columns: K, seconds per multiplication, GFLOPS
    print ('{0:4}, {1:9.7}, {2:9.7}'.format(K, tm, get_gflops(M, N, K) / tm))
|
Running the program under open-source BLAS/LAPACK (columns: K, seconds per multiplication, GFLOPS):
Code: |
64, 7.206665, 1.057355
80, 9.500084, 1.004202
96, 10.81931, 1.059217
104, 12.05694, 1.030112
112, 12.7766, 1.047227
120, 13.97688, 1.02598
128, 14.33727, 1.067149
144, 16.51003, 1.043002
160, 18.15292, 1.054376
176, 20.34829, 1.034977
192, 21.69134, 1.059409
200, 23.98138, 0.9982743
208, 24.3797, 1.021341
224, 25.53424, 1.050354
240, 28.10184, 1.022709
|
Running the program under Intel MKL/pthread:
Code: |
64, 0.05577588, 136.6182
80, 0.09417605, 101.2996
96, 0.08080792, 141.8178
104, 0.06589293, 188.4876
112, 0.145041, 92.24978
120, 0.1274409, 112.5227
128, 0.1084719, 141.0504
144, 0.1280391, 134.4901
160, 0.124629, 153.5758
176, 0.1102171, 191.0774
192, 0.105535, 217.7476
200, 0.2072258, 115.5262
208, 0.2185948, 113.9094
224, 0.2123821, 126.2818
240, 0.1290472, 222.7093
|
The results speak for themselves: a speed increase of roughly 100-200X (from about 88X at K=112 to about 218X at K=240). Bear in mind that matrix multiplication is easily parallelised, whereas simple element-by-element arithmetic would not see such speed increases. Presently I'm looking to overcome the limitations imposed by the GIL using `Joblib', which provides convenient pipelining. I've experimented with it and have seen improvements in the 5-10X range, but I'm sure I've only scratched the surface.
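The per-K speedup can be worked out directly from the two listings above. This is just a small sketch: the timings are copied from those listings, and the names `t_open`, `t_mkl` and `speedups` are chosen here for illustration.

```python
# Seconds per multiplication, copied from the two result listings
# (open-source BLAS/LAPACK first, Intel MKL second).
k_values = [64, 80, 96, 104, 112, 120, 128, 144, 160, 176, 192, 200, 208, 224, 240]
t_open = [7.206665, 9.500084, 10.81931, 12.05694, 12.7766, 13.97688, 14.33727,
          16.51003, 18.15292, 20.34829, 21.69134, 23.98138, 24.3797, 25.53424,
          28.10184]
t_mkl = [0.05577588, 0.09417605, 0.08080792, 0.06589293, 0.145041, 0.1274409,
         0.1084719, 0.1280391, 0.124629, 0.1102171, 0.105535, 0.2072258,
         0.2185948, 0.2123821, 0.1290472]


def speedups(slow, fast):
    """Element-wise ratio of the slow timings to the fast timings."""
    return [s / f for s, f in zip(slow, fast)]


ratios = speedups(t_open, t_mkl)
for k, ratio in zip(k_values, ratios):
    print('{0:4d}  {1:7.1f}x'.format(k, ratio))
print('min {0:.1f}x, max {1:.1f}x'.format(min(ratios), max(ratios)))
```

Because both runs do the same work for a given K, the ratio of the times equals the ratio of the GFLOPS columns.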
In the meantime, I hope this feedback is useful! |