Category: LINUX
2008-05-04 20:25:53
How do I set up and execute a "Fluent" parallel batch job?
The Fluent jobs that you will want to run on the HPCVL machines are likely to be quite large. To take advantage of the parallel architecture of the Sun Fire machines, Fluent offers several options for executing the solver in a parallel environment, i.e. on several CPUs simultaneously. Currently, the default option for such runs is vendor MPI, i.e. Fluent uses the Sun native version of the "Message Passing Interface" for inter-process communication.
To take advantage of the parallel capabilities of Fluent, you have to call the program with a series of command-line options that specify the details of your parallel run. Here is a short overview:
Parallel jobs should only be run in batch mode using Grid Engine. In fact, the use of Grid Engine is mandatory on the Sun Fire cluster for all production jobs. To submit a parallel job to Grid Engine, the command line is placed in a submit script, for which we have an example. The example has to be adapted by replacing all items enclosed in {} with the applicable values. The number of processors appears only once in this script, after #$ -pe fluent.pe, which is where you tell Grid Engine how many processors to allocate to the run. The internal environment variable $NSLOTS is automatically set to this value and can then be used in the fluent command line. The script must source a setup file, which resides in /opt/fluent/Fluent.Inc/ and is called setup.sh for the 32-bit version. This sets various environment variables and enables Fluent to interact properly with Grid Engine. The file is readable, so take a look if you are interested. If you are using the 64-bit version of Fluent, alter the batch script to source /opt/fluent/Fluent.Inc/setup_64bit.sh instead.
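The example script referred to above is not reproduced here, so the following is only a rough sketch of what such a submit script might contain. The helper name make_submit_script, its arguments, the solver version 3d, the input file name, and the Fluent options -g and -i are assumptions for illustration. The #$ -pe fluent.pe line, $NSLOTS, and the two setup file paths are taken from the text. The #$ -S and #$ -cwd lines are ordinary Grid Engine directives added as an assumption.

```shell
#!/bin/sh
# Sketch only: prints a Grid Engine submit script for a parallel Fluent run.
# make_submit_script and its arguments are hypothetical, not an HPCVL tool.
# Usage: make_submit_script NPROCS VERSION INPUT [BITS]
make_submit_script() {
    nprocs="$1"       # processors requested from Grid Engine
    version="$2"      # Fluent solver version, e.g. 3d (assumed)
    input="$3"        # Fluent input file (assumed name)
    bits="${4:-32}"   # 32 or 64, selects the setup file

    # Setup file paths are the ones named in the text.
    setup=/opt/fluent/Fluent.Inc/setup.sh
    if [ "$bits" = "64" ]; then
        setup=/opt/fluent/Fluent.Inc/setup_64bit.sh
    fi

    # \$ keeps the dollar sign literal, so $NSLOTS is filled in by
    # Grid Engine at run time rather than by this shell.
    cat <<EOF
#!/bin/bash
#\$ -S /bin/bash
#\$ -cwd
#\$ -pe fluent.pe ${nprocs}
. ${setup}
fluent ${version} -t\$NSLOTS -g -i ${input}
EOF
}
```

For instance, make_submit_script 8 3d myrun.flin 64 > fluent_job.sh would write a script requesting 8 processors for the 64-bit version. The processor count is written only on the #$ -pe line, and the fluent command line picks it up through $NSLOTS, as described above.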
In the submit script, the parallel environment fluent.pe is for Fluent jobs only, and is used to keep track of the available licenses. The licensing situation can also be checked interactively by typing:
flulic
Grid Engine is able to interact with the license manager of Fluent (FlexLM) to check whether sufficient licenses are available for a run. This keeps the scheduler from starting a job because enough processors are available, only for the job to be stopped again because there are not enough licenses. Grid Engine keeps an internal counter of available "license slots", which is updated frequently. Every time Grid Engine attempts to schedule a Fluent job and is prevented from doing so because not enough licenses are available, it "requeues" the job. Since each requeue triggers an email if notification was requested, we recommend removing the email notification line(s) (e.g. #$ -m be) from the script.
All processes are allocated within a single node. This makes communication more efficient and avoids problems with the control by Grid Engine. The effect is that, while still using MPI, Fluent employs a so-called shared-memory layer for communication. The disadvantage is that it takes longer until the required resources (dedicated processors) become available, i.e. you spend more time in the Grid Engine waiting queue.
Once the script has been adapted, it can be submitted to Grid Engine by
qsub batch_file_name
from sfnode0 (which is the Grid Engine submit host). Note that the job will appear as a parallel job in Grid Engine's qstat or qmon. Note also that submitting a parallel job in this way only pays off for large systems that use many CPU cycles, since the overhead for assigning processes, preparing nodes, and communication between them is considerable.
There is an easier way to do this: we supply a small Perl script called FluentSubmit. Simply type
FluentSubmit
and answer the questions. The script expects a Fluent input file with the "file extension" .flin to be present and will do everything else automatically. This is meant for simple Fluent job submissions. More complex job submissions are better done manually.
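For manual submissions, the requirements described above can be checked before calling qsub. The following is a minimal sketch, assuming a hypothetical helper named check_submit_script (it is not one of the supplied tools). It verifies that the script requests the fluent.pe parallel environment with a numeric processor count and that no #$ -m notification line is left in, then prints the qsub command rather than running it.

```shell
#!/bin/sh
# Sketch only: sanity-check a submit script before handing it to qsub.
# check_submit_script is a hypothetical helper, not an HPCVL tool.
check_submit_script() {
    script="$1"
    if [ ! -r "$script" ]; then
        echo "cannot read $script" >&2
        return 1
    fi
    # The fluent.pe parallel environment must be requested with a number;
    # an unfilled {placeholder} fails this check.
    if ! grep -Eq '^#\$ -pe fluent\.pe [0-9]+' "$script"; then
        echo "missing '#\$ -pe fluent.pe N' line" >&2
        return 1
    fi
    # Mail notification lines cause an e-mail on every requeue.
    if grep -Eq '^#\$ -m ' "$script"; then
        echo "remove the '#\$ -m' line to avoid requeue e-mails" >&2
        return 1
    fi
    # Dry run: print the command instead of executing it.
    echo "qsub $script"
}
```

Printing the command keeps the sketch harmless on machines without Grid Engine. On the submit host, the printed line can be run as is.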