2008-12-22 20:35:52

 
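Download location:

Free registration is required before downloading.

The latest version is GotoBLAS-1.14.tar.gz, released on 2007.03.21, which fully supports the Core 2 Duo series.

Installation of the new version is very user-friendly and differs slightly from the usual configure approach for source code on Linux.

For x86 and x86_64 CPUs under Linux, a quick-install script is provided that detects your SMP setup and compiler type. The supported compilers are tried in this order: PathScale, PGI, Intel, gfortran, g95, g77.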

Step 1: Extract the archive

tar -zxvf GotoBLAS-1.14.tar.gz

 

Step 2: Install GotoBLAS. The 32-bit and 64-bit installs are as follows

For a 32-bit install: ./quickbuild.32bit

For a 64-bit install: ./quickbuild.64bit
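As a rough sketch of Step 2, the choice between the two scripts can be made from the machine name reported by `uname -m`. The function name `pick_quickbuild` is hypothetical, and the mapping assumes you want a native build (a 32-bit build on a 64-bit machine would need the other script chosen by hand).

```shell
#!/bin/sh
# Map a machine name (as printed by `uname -m`) to the quickbuild
# script named in Step 2. pick_quickbuild is a hypothetical helper.
pick_quickbuild() {
    case "$1" in
        x86_64|amd64) echo "./quickbuild.64bit" ;;
        i?86|x86)     echo "./quickbuild.32bit" ;;
        *)            echo "unsupported machine: $1" >&2 ; return 1 ;;
    esac
}

# Print the script that would be run on this machine; other CPUs need
# the manual route through getarch.c and Makefile.rule.
pick_quickbuild "$(uname -m)" || echo "edit getarch.c and Makefile.rule instead"
```

Run it from inside the extracted GotoBLAS directory and execute the script it prints.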

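After installation, the libraries are built inside the directory you just extracted.

Taking Core 2 Duo as an example, three main files are produced: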

libgoto.a

libgoto_core2p-r1.14.a  (named automatically according to your CPU type)

libgoto_core2p-r1.14.so

 

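For special machines, edit getarch.c and Makefile.rule, uncomment the parameters that match your machine, and recompile to produce the library. For the main installation procedure, read Quickinstallat.txt carefully.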

------------------------------------------------------

Note: (the message printed after installation is as follows)

Done. This library is compiled with following conditions.

Binary  ... 64bit
Fortran ... INTEL
SMP     ... Enabled. You have to link library with -lpthread option.
-------------------------------------------------------
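The summary above says an SMP build must be linked with -lpthread. A minimal sketch of picking that flag up automatically, assuming the summary text has been saved or piped in exactly as shown (`extra_link_flags` is a hypothetical helper, not part of GotoBLAS):

```shell
#!/bin/sh
# Hypothetical helper: read the build summary on stdin and print the
# extra link flag it asks for (prints nothing when SMP is not enabled).
extra_link_flags() {
    if grep -q '^SMP.*Enabled'; then
        echo "-lpthread"
    fi
}

# The summary text copied from the note above.
summary='Binary  ... 64bit
Fortran ... INTEL
SMP     ... Enabled. You have to link library with -lpthread option.'

printf '%s\n' "$summary" | extra_link_flags
```

The printed flag goes at the end of the link line, after the GotoBLAS library itself.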

 

 

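The whole point of installing GotoBLAS is to make vasp run faster. How should the Makefile be modified? The steps are as follows

Step 1: Put one of the three libraries obtained above into vasp.4.lib, or create a separate directory for it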

/vasp/src/vasp.4.lib/libgoto_core2p-r1.14.so

 

Step 2: Modify the BLAS path in the Makefile. First comment out all of the existing BLAS paths with the # symbol, then add the new BLAS path

BLAS= ../vasp.4.lib/libgoto_core2p-r1.14.so
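The edit in Step 2 can be sketched with sed. This assumes each existing BLAS path sits on a single line (no backslash continuations) and that the makefile uses BLAS through ordinary `=` variables, so a definition appended at the end still takes effect. The helper name `switch_blas` and the sample path `/opt/libs/libblas.a` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: comment out every active BLAS= line in the given
# makefile, then append the new BLAS path.
switch_blas() {
    makefile=$1
    newblas=$2
    tmp=$(mktemp) || return 1
    # Prefix active "BLAS=" / "BLAS =" assignments with '#'.
    sed 's/^\([[:space:]]*BLAS[[:space:]]*=\)/#\1/' "$makefile" > "$tmp" || return 1
    printf 'BLAS= %s\n' "$newblas" >> "$tmp"
    # Write back in place so the file keeps its original permissions.
    cat "$tmp" > "$makefile" && rm -f "$tmp"
}

# Demonstration on a throwaway file rather than a real vasp makefile.
demo=$(mktemp)
printf 'BLAS= /opt/libs/libblas.a\nLIB = $(BLAS)\n' > "$demo"
switch_blas "$demo" ../vasp.4.lib/libgoto_core2p-r1.14.so
cat "$demo"
```

Keep a backup copy of the real makefile before trying this on it, and check the result by eye.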

 

See the related article on the quantum chemistry website:

/Experience/CommonSoftwares/VASP/CompileInstallation/200512/27.html

 

Step 3: Rebuild vasp with the modified Makefile

 

 

By 阿達仔, National Cheng Kung University


2. If you compile the BLAS libraries with threading turned on but the number of threads set to 1, you gain between 33 and 50% speed when doing larger calculations.
The User Configuration part of Makefile.rule:


#
# Beginning of user configuration
#

# This library's version
REVISION = -r1.26

# Which C compiler do you prefer? Default is gcc.
C_COMPILER = GNU
# C_COMPILER = INTEL
# C_COMPILER = PGI

# Now you don't need Fortran compiler to build library.
# If you don't specify Fortran Compiler, GNU g77 compatible
# interface will be used.
# F_COMPILER = G77
# F_COMPILER = G95
# F_COMPILER = GFORTRAN
F_COMPILER = INTEL
# F_COMPILER = PGI
# F_COMPILER = PATHSCALE
# F_COMPILER = IBM
# F_COMPILER = COMPAQ
# F_COMPILER = SUN
# F_COMPILER = F2C

# If you need 64bit binary; some architecture can accept both 32bit and
# 64bit binary(X86_64, SPARC, Power/PowerPC or WINDOWS).
BINARY64 = 1

# If you want to build threaded BLAS
SMP = 1

# You can define maximum number of threads. Basically it should be
# less than actual number of cores. If you don't specify one, it's
# automatically detected by script.
MAX_THREADS = 1

# If you want to use legacy threaded Level 3 implementation.
# Some architecture prefer this algorithm, but it's rare.
# USE_SIMPLE_THREADED_LEVEL3 = 1

# If you want to use GotoBLAS with accelerator like Cell or GPGPU
# This is experimental and currently won't work well.
# USE_ACCERELATOR = 1

# Define accelerator type (won't work)
# USE_CELL_SPU = 1

# Threads are still working for a while after finishing BLAS operation
# to reduce thread activate/deactivate overhead. You can determine
# time out to improve performance. This number should be from 4 to 30
# which corresponds to (1 << n) cycles. For example, if you set to 26,
# thread will be running for (1 << 26) cycles(about 25ms on 3.0GHz
# system). Also you can control this number by GOTO_THREAD_TIMEOUT
# CCOMMON_OPT += -DTHREAD_TIMEOUT=26

# If you need cross compiling
# (you have to set architecture manually in getarch.c!)
# Example : HOST ... G5 OSX, TARGET = CORE2 OSX
# CROSS_SUFFIX = i686-apple-darwin8-
# CROSS_VERSION = -4.0.1
# CROSS_BINUTILS =

# If you need Special memory management;
# Using HugeTLB file system(Linux / AIX / Solaris)
# HUGETLB_ALLOCATION = 1

# Using bigphysarea memory instead of normal allocation to get
# physically contiguous memory.
# BIGPHYSAREA_ALLOCATION = 1

# To get maximum performance with minimum impact to the system,
# mixing memory allocation may be worth to try. In this case,
# you have to define one of ALLOC_HUGETLB or BIGPHYSAREA_ALLOCATION.
# Another allocation will be done by mmap or static allocation.
# (Not implemented yet)
# MIXED_MEMORY_ALLOCATION = 1

# Using static allocation instead of dynamic allocation
# You can't use it with ALLOC_HUGETLB
# STATIC_ALLOCATION = 1

# If you want to use CPU affinity
# CCOMMON_OPT += -DUSE_CPU_AFFINITY

# If you want to use memory affinity (NUMA)
# You can't use it with ALLOC_STATIC
# NUMA_AFFINITY = 1

# If you want to use interleaved memory allocation.
# Default is local allocation(it only works with NUMA_AFFINITY).
# CCOMMON_OPT += -DINTERLEAVED_MAPPING

# If you want to drive whole 64bit region by BLAS. Not all Fortran
# compiler supports this. It's safe to keep comment it out if you
# are not sure.
# INTERFACE64 = 1

# If you have special compiler to run script to determine architecture.
GETARCH_CC +=
GETARCH_FLAGS +=
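Because most lines in this listing are commented out, it can help to print only the settings that are actually in effect before building, for instance to confirm that SMP = 1 and MAX_THREADS = 1 are both active. A minimal sketch, assuming the file follows the `NAME = value` layout shown above (`active_settings` is a hypothetical helper, and the sample file is a short excerpt of the listing):

```shell
#!/bin/sh
# Hypothetical helper: print the uncommented assignments that carry a
# value, skipping comments and empty "NAME +=" lines.
active_settings() {
    grep -E '^[A-Z_0-9]+[[:space:]]*\+?=[[:space:]]*[^[:space:]]' "$1"
}

# A short excerpt of the listing above, written to a throwaway file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# F_COMPILER = GFORTRAN
F_COMPILER = INTEL
BINARY64 = 1
SMP = 1
MAX_THREADS = 1
# USE_SIMPLE_THREADED_LEVEL3 = 1
GETARCH_CC +=
EOF

active_settings "$sample"
```

Pointing the function at the real Makefile.rule instead of the sample gives the full set of active options.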
