c - Nested loop in OpenMP
I need to run a short outer loop and a long inner loop. I would like to parallelize the latter, but not the former. The reason is that there is an array that is updated after the inner loop has run. The code I am using is the following:
    #pragma omp parallel{
    for(j=0;j<3;j++){
      s=0;
      #pragma omp for reduction(+:s)
      for(i=0;i<10000;i++)
        s+=1;
      a[j]=s;
    }
    }
This hangs. The following works fine, but I'd rather avoid the overhead of starting a new parallel region on every outer iteration, since this code is already preceded by another parallel region.
    for(j=0;j<3;j++){
      s=0;
      #pragma omp parallel for reduction(+:s)
      for(i=0;i<10000;i++)
        s+=1;
      a[j]=s;
    }
What is the correct (and fastest) way of doing this?
The following example should work as expected:
    #include <iostream>

    using namespace std;

    int main(){

      int s;
      int a[3];

    #pragma omp parallel
      { // Note that I moved the curly bracket here
        for(int j = 0; j < 3; j++) {
    #pragma omp single
          s = 0; // one thread resets s; implicit barrier after single
    #pragma omp for reduction(+:s)
          for(int i=0;i<10000;i++) {
            s+=1;
          } // Implicit barrier here
    #pragma omp single
          a[j]=s; // This statement needs synchronization
        } // end of outer loop
      } // end of parallel region

      for(int jj = 0; jj < 3; jj++)
        cout << a[jj] << endl;
      return 0;
    }
An example of compilation and execution is:
    > g++ --version
    g++ (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
    Copyright (C) 2011 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    > g++ -fopenmp -Wall main.cpp
    > export OMP_NUM_THREADS=169
    > ./a.out
    10000
    10000
    10000
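The example above goes through three implicit barriers per outer iteration (one after each `single` and one after the `for`). If that matters, one possible variation is to fold the store and the reset into one `single` block placed after the reduction. This is only a sketch and not part of the original answer: the helper name `count_per_outer` and its parameters are made up for illustration, and it assumes the shared accumulator `s` is zero before the parallel region starts. Whether it is measurably faster depends on the compiler and runtime, so it would need to be timed.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not from the original post): for each of `outer`
// iterations, count `inner` increments using a single parallel region.
std::vector<int> count_per_outer(int outer, int inner) {
    assert(outer >= 0 && inner >= 0);
    std::vector<int> a(outer, 0);
    int s = 0;  // shared accumulator; must be zero before the first reduction

    #pragma omp parallel
    {
        // Every thread runs the outer loop; only the inner loop is work-shared.
        for (int j = 0; j < outer; ++j) {
            #pragma omp for reduction(+:s)
            for (int i = 0; i < inner; ++i) {
                s += 1;
            }
            // Implicit barrier: s now holds the combined sum.

            #pragma omp single
            {
                a[j] = s;  // store the result of this outer iteration
                s = 0;     // reset for the next one
            }
            // Implicit barrier at the end of single: the reset is visible
            // to all threads before the next reduction starts.
        }
    }
    return a;
}
```

Calling `count_per_outer(3, 10000)` should give three entries, each equal to 10000, matching the output shown above. Built without `-fopenmp`, the pragmas are ignored and the function runs serially with the same result.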