OpenMP for Fortran
- OpenMP Directive
- Syntax of OpenMP compiler directive for Fortran:
    !$OMP DirectiveName Optional_CLAUSES...
       ...
       ... Program statements between the !$OMP lines
       ... are executed in parallel by all threads
       ...
    !$OMP END DirectiveName
- Program statements between the two !$OMP lines are executed by multiple threads
-
- Setting the level of parallelism in OpenMP programs
- The number of threads that will be created to execute parallel sections in an OpenMP program is controlled by the environment variable OMP_NUM_THREADS
- To set this environment variable use:
    export OMP_NUM_THREADS=...
Example:
    export OMP_NUM_THREADS=8
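To check that the setting took effect, a program can ask the OpenMP runtime how many threads it may use. The following is a minimal sketch, assuming the standard OpenMP function OMP_GET_MAX_THREADS; the program name ShowThreads is chosen only for illustration.

```fortran
PROGRAM ShowThreads
   IMPLICIT NONE
   INTEGER, EXTERNAL :: OMP_GET_MAX_THREADS

   ! Upper bound on the team size of the next PARALLEL region.
   ! It normally reflects OMP_NUM_THREADS when that variable is set.
   print *, "Threads available: ", OMP_GET_MAX_THREADS()
END PROGRAM ShowThreads
```

Running it after `export OMP_NUM_THREADS=8` and again after `export OMP_NUM_THREADS=2` should report a different value each time, provided the program was compiled with the OpenMP flags.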
-
- Compiling OpenMP programs
- Fortran
- Compile:
f90 -O -c -xopenmp -stackvar Prog.f90
- Link:
f90 -O -o Executable -xopenmp -stackvar Prog1.o Prog2.o ....
- Introductory Example
- Parallel "Hello World" OpenMP program:
    PROGRAM Main
    !$OMP PARALLEL
       print *, "Hello World !"
    !$OMP END PARALLEL
    END
- Example Program: (Demo above code)
- Prog file (OpenMP Hello World): click here
- Compile with:
- f90 -O -xopenmp -stackvar openMP01.f90
- Run with:
- export OMP_NUM_THREADS=8
- a.out
Make sure you do it on compute.
You will see "Hello World !" printed eight times. (Remove the two !$OMP lines and you get one line.)
- Defining shared and private (non-shared) variables in parallel section
- Recall:
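- There are no block scopes in Fortran (before the BLOCK construct of Fortran 2008), so a variable cannot be declared inside a parallel section
OpenMP for Fortran uses the clauses PRIVATE and SHARED to define private (non-shared) and shared variables.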
- Defining shared and private variables in a PARALLEL section
- A variable is by default shared among all threads
- A private variable in a PARALLEL section must be specified using the clause PRIVATE
- Fortran example of SHARED variable:
    PROGRAM Main
       IMPLICIT NONE
       integer :: N        ! Shared
       N = 1001
       print *, "Before parallel section: N = ", N
    !$OMP PARALLEL
       N = N + 1
       print *, "Inside parallel section: N = ", N
    !$OMP END PARALLEL
       print *, "After parallel section: N = ", N
    END
- Example Program: (Demo above code)
- Prog file: (Shared variable in OpenMP) --- click here
- Compile with:
- f90 -O -xopenmp -stackvar openMP02a.f90
- Run a few times with:
- export OMP_NUM_THREADS=8
- a.out
The value of N at the end is not always 1009 (1001 plus one increment per thread); it can be less. All threads update the shared N without synchronization, so some increments are lost. This is a race condition.
-
- Fortran example of NON-SHARED (private) variable:
    PROGRAM Main
       IMPLICIT NONE
       integer :: N        ! Private inside the parallel section (see PRIVATE clause)
       N = 1001
       print *, "Before parallel section: N = ", N
    !$OMP PARALLEL PRIVATE(N)
       N = N + 1           ! Updates the thread's own copy of N
       print *, "Inside parallel section: N = ", N
    !$OMP END PARALLEL
       print *, "After parallel section: N = ", N
    END
- Example Program: (Demo above code)
- Prog file: (Private variable in OpenMP) --- click here
- Compile with:
- f90 -O -xopenmp -stackvar openMP02b.f90
- Run a few times with:
- export OMP_NUM_THREADS=8
- a.out
- Output:
    Before parallel section: N = 1001
    Inside parallel section: N = 1
    Inside parallel section: N = 1
    Inside parallel section: N = 1
    Inside parallel section: N = 1
    Inside parallel section: N = 1
    Inside parallel section: N = 1
    Inside parallel section: N = 1
    Inside parallel section: N = 1
    After parallel section: N = 1001
Each thread has its own copy of the variable N.
This copy is different from the "program" variable N defined in the main program. The private copy is not initialized from the program variable: its starting value is undefined (it happened to be 0 in this run), and the program variable still holds 1001 after the parallel section.
-
- OpenMP Support functions
- Most useful support functions in OpenMP:
    Function Name                               Effect
    SUBROUTINE omp_set_num_threads(nthreads)    Set size of thread team (invoked with CALL)
    INTEGER omp_get_num_threads()               Return size of thread team
    INTEGER omp_get_max_threads()               Return max size of thread team (typically equal to the number of processors)
    INTEGER omp_get_thread_num()                Return thread ID of the thread that calls this function
    INTEGER omp_get_num_procs()                 Return number of processors
    LOGICAL omp_in_parallel()                   Return .TRUE. if currently in a PARALLEL segment
- Here is a simple OMP program in Fortran:
    PROGRAM Main
       IMPLICIT NONE
       INTEGER :: nthreads, myid
       INTEGER, EXTERNAL :: OMP_GET_THREAD_NUM, OMP_GET_NUM_THREADS

    !$OMP PARALLEL private(nthreads, myid)
       myid = OMP_GET_THREAD_NUM()
       print *, "Hello I am thread ", myid

       if (myid == 0) then
          nthreads = OMP_GET_NUM_THREADS()
          print *, "Number of threads = ", nthreads
       end if
    !$OMP END PARALLEL
    END
- Example Program: (OpenMP Fortran program) --- click here
- Compile using the following command:
- f90 -O -xopenmp -stackvar hello.f90
- Run with:
- export OMP_NUM_THREADS=8
- a.out
- Output:
    Hello I am thread 7
    Hello I am thread 5
    Hello I am thread 1
    Hello I am thread 0
    Hello I am thread 2
    Number of threads = 8
    Hello I am thread 4
    Hello I am thread 3
    Hello I am thread 6
-
- Caveat with Fortran
- Recall:
- Array indices in Fortran by default start with 1 (ONE)
- Observed from "Hello" program:
- Thread IDs start with 0 (ZERO)
- Caveat:
- Use ThreadID+1 as index to an array in Fortran !!!
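The caveat above can be sketched with a small program in which every thread stores its own ID in "its" array element. This is an illustrative sketch only: the program name ThreadSlots, the array slot, and the bound MAXT = 64 are assumptions made for the example.

```fortran
PROGRAM ThreadSlots
   IMPLICIT NONE
   INTEGER, EXTERNAL :: OMP_GET_THREAD_NUM, OMP_GET_NUM_THREADS
   INTEGER, PARAMETER :: MAXT = 64      ! assumed upper bound on the team size
   INTEGER :: slot(MAXT)
   INTEGER :: id, nthreads, i

   slot = -1                            ! mark every slot as unused
   nthreads = 1

!$OMP PARALLEL PRIVATE(id)
   id = OMP_GET_THREAD_NUM()            ! 0, 1, ..., team size - 1
   ! Only thread 0 writes the shared team size, so there is no race
   IF (id == 0) nthreads = OMP_GET_NUM_THREADS()
   ! Thread 0 writes slot(1), thread 1 writes slot(2), and so on
   IF (id < MAXT) slot(id+1) = id
!$OMP END PARALLEL

   DO i = 1, MIN(nthreads, MAXT)
      print *, "slot(", i, ") was written by thread ", slot(i)
   END DO
END PROGRAM ThreadSlots
```

Indexing with id instead of id+1 would make thread 0 write to slot(0), which lies outside the array bounds.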
- Example OpenMP Program: Find minimum in an array
- A sequential program in C++ can be found here: ( click here )
- We will write this program using OpenMP in Fortran
- Parallel Find Min program in Fortran:
    PROGRAM Min
       IMPLICIT NONE

       INTEGER, PARAMETER :: MAX = 10000000
       DOUBLE PRECISION, DIMENSION(MAX) :: x
       DOUBLE PRECISION, DIMENSION(10) :: my_min   ! One slot per thread (at most 10 threads)
       DOUBLE PRECISION :: rmin

       INTEGER :: num_threads, team_size
       INTEGER :: i, n
       INTEGER :: id, start, stop

       ! ===========================================================
       ! Declare the OpenMP functions
       ! ===========================================================
       INTEGER, EXTERNAL :: OMP_GET_THREAD_NUM, OMP_GET_NUM_THREADS

       CALL RANDOM_NUMBER(x)       ! Fill x with test data

       ! ===================================
       ! Parallel section: Find local minima
       ! ===================================
    !$OMP PARALLEL PRIVATE(i, id, start, stop, num_threads, n)
       num_threads = omp_get_num_threads()
       n = MAX/num_threads
       id = omp_get_thread_num()

       ! ----------------------------------
       ! Thread 0 saves the team size in a shared variable
       ! (private variables are undefined after the parallel section)
       ! ----------------------------------
       IF ( id == 0 ) team_size = num_threads

       ! ----------------------------------
       ! Find my own starting index
       ! ----------------------------------
       start = id * n + 1          !! Arrays start at 1

       ! ----------------------------------
       ! Find my own stopping index
       ! ----------------------------------
       if ( id /= (num_threads-1) ) then
          stop = start + n - 1
       else
          stop = MAX
       end if

       ! ----------------------------------
       ! Find my own min
       ! ----------------------------------
       my_min(id+1) = x(start)
       DO i = start+1, stop
          IF ( x(i) < my_min(id+1) ) THEN
             my_min(id+1) = x(i)
          END IF
       END DO
    !$OMP END PARALLEL

       ! ===================================
       ! Find min over the local minima
       ! ===================================
       rmin = my_min(1)
       DO i = 2, team_size
          IF ( my_min(i) < rmin ) THEN
             rmin = my_min(i)
          END IF
       END DO

       print *, "min = ", rmin
    END PROGRAM
- Example Program: (Demo above code)
- Prog file: click here
- Compile with:
- f90 -O -xopenmp -stackvar min-mt1.f90
- Run with:
- export OMP_NUM_THREADS=8
- a.out
- Mutual exclusion synchronization Primitives
- Mutual exclusion is achieved in OpenMP for Fortran using the following directive:
    !$OMP CRITICAL
       ... statements are guaranteed to be executed
       ... by ONE thread at any one time
    !$OMP END CRITICAL
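As a small illustration of the CRITICAL directive, the sketch below lets every thread increment a shared counter 1000 times. The program name CriticalCount and the variable names are assumptions made for the example.

```fortran
PROGRAM CriticalCount
   IMPLICIT NONE
   INTEGER, EXTERNAL :: OMP_GET_NUM_THREADS, OMP_GET_THREAD_NUM
   INTEGER :: counter, nthreads, k

   counter = 0
   nthreads = 1

!$OMP PARALLEL PRIVATE(k)
   ! Only thread 0 records the team size, so this write is not a race
   IF (OMP_GET_THREAD_NUM() == 0) nthreads = OMP_GET_NUM_THREADS()

   DO k = 1, 1000
!$OMP CRITICAL
      counter = counter + 1      ! one thread at a time updates counter
!$OMP END CRITICAL
   END DO
!$OMP END PARALLEL

   print *, "counter  = ", counter
   print *, "expected = ", 1000 * nthreads
END PROGRAM CriticalCount
```

With the CRITICAL lines in place the two printed values should agree on every run; with them removed, the counter can come out smaller because concurrent updates are lost.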
-
- Example OpenMP program with synchronization: compute Pi
- Example:
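    PROGRAM Compute_PI
       IMPLICIT NONE
       INTEGER, EXTERNAL :: OMP_GET_THREAD_NUM, OMP_GET_NUM_THREADS

       INTEGER N, i
       INTEGER id, num_threads
       DOUBLE PRECISION w, x
       DOUBLE PRECISION pi, mypi

       N = 50000000                !! Number of intervals
       w = 1.0d0/N                 !! width of each interval
       pi = 0.0d0                  !! Shared accumulator

    !$OMP PARALLEL PRIVATE(i, id, num_threads, x, mypi)
       num_threads = omp_get_num_threads()
       id = omp_get_thread_num()

       mypi = 0.0d0
       DO i = id, N-1, num_threads !! Thread id takes i = id, id+num_threads, ...
          x = w * (i + 0.5d0)
          mypi = mypi + w*f(x)
       END DO

    !$OMP CRITICAL
       pi = pi + mypi              !! One thread at a time adds its partial sum
    !$OMP END CRITICAL
    !$OMP END PARALLEL

       PRINT *, "Pi = ", pi

    CONTAINS
       !! f is not shown in the listing; this assumes the usual
       !! integrand 4/(1+x^2), whose integral over [0,1] is Pi
       DOUBLE PRECISION FUNCTION f(a)
          DOUBLE PRECISION, INTENT(IN) :: a
          f = 4.0d0 / (1.0d0 + a*a)
       END FUNCTION f
    END PROGRAM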
- Example Program: (OpenMP compute Pi) --- click here
- Compile with:
- f90 -O -xopenmp -stackvar openMP_compute_pi2.f90
- Run a few times with:
- export OMP_NUM_THREADS=8
- a.out
-
- Parallel For Loop in OpenMP
The division of labor for a loop (splitting its iterations among the threads) can be done in OpenMP through a special Parallel LOOP construct.
- A Parallel LOOP construct must appear within a Parallel region of the program; outside one, the loop is executed by a single thread.
- The syntax of a Parallel LOOP construct in Fortran is:
    !$OMP DO
       DO index = ....
          ....            ! Division of labor is taken care of
                          ! by the Fortran compiler
       END DO
    !$OMP END DO
- The meaning of this Parallel LOOP construct is to distribute the iterations of the do-loop among the threads.
Each iteration of the loop is executed exactly once, by exactly one of the threads.
The loop variable used in the Parallel LOOP construct is by default PRIVATE (other variables are still by default SHARED).
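A minimal sketch of the construct, assuming an array of squares as the work to be divided (the program name ParallelDoDemo and the array a are chosen only for illustration):

```fortran
PROGRAM ParallelDoDemo
   IMPLICIT NONE
   INTEGER, PARAMETER :: N = 1000
   INTEGER :: a(N)
   INTEGER :: i

!$OMP PARALLEL
!$OMP DO
   DO i = 1, N                 ! the loop variable i is private automatically
      a(i) = i * i             ! each iteration writes a different element
   END DO
!$OMP END DO
!$OMP END PARALLEL

   print *, "a(1) = ", a(1), "  a(N) = ", a(N)
   print *, "sum  = ", SUM(a)
END PROGRAM ParallelDoDemo
```

Because each iteration writes only its own element a(i), no CRITICAL section is needed here, and the result does not depend on the number of threads.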
- Example: compute Pi with parallel DO loop
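    PROGRAM Compute_PI
       IMPLICIT NONE

       INTEGER N, i
       DOUBLE PRECISION w, x
       DOUBLE PRECISION pi, mypi

       N = 50000000                !! Number of intervals
       w = 1.0d0/N                 !! width of each interval
       pi = 0.0d0                  !! Shared accumulator

    !$OMP PARALLEL PRIVATE(x, mypi)
       mypi = 0.0d0

    !$OMP DO
       DO i = 0, N-1               !! Parallel Loop
          x = w * (i + 0.5d0)
          mypi = mypi + w*f(x)
       END DO
    !$OMP END DO

    !$OMP CRITICAL
       pi = pi + mypi              !! One thread at a time adds its partial sum
    !$OMP END CRITICAL
    !$OMP END PARALLEL

       PRINT *, "Pi = ", pi

    CONTAINS
       !! f is not shown in the listing; this assumes the usual
       !! integrand 4/(1+x^2), whose integral over [0,1] is Pi
       DOUBLE PRECISION FUNCTION f(a)
          DOUBLE PRECISION, INTENT(IN) :: a
          f = 4.0d0 / (1.0d0 + a*a)
       END FUNCTION f
    END PROGRAM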
- Example Program: (OpenMP compute Pi) --- click here
- Compile with:
- f90 -O -xopenmp -stackvar openMP_compute_pi3.f90
- Run with:
- export OMP_NUM_THREADS=8
- a.out
-
- Final Notes
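- The stack size of each thread can be controlled by setting another environment variable (STACKSIZE is compiler-specific; the standard OpenMP variable is OMP_STACKSIZE):
    export STACKSIZE=nBytes        (in csh: setenv STACKSIZE nBytes)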
- For more information on OpenMP, see: http://www.openmp.org