I have a large program that I am trying to speed up with OpenMP constructs. When I do not set the number of threads with the OMP_SET_NUM_THREADS library routine, 5 threads are created (according to Resource Monitor in Windows 7) and CPU usage is around 100%. However, when I explicitly set the number of threads, the requested number of threads is indeed created, but CPU usage hovers around 25% regardless of how many threads I ask for. I cannot reproduce this behavior with a simple test code. Has anybody run into this problem? Is this likely a compiler bug, or do I need to set some environment variables to get the OMP_SET_NUM_THREADS routine to work properly?
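To make the question concrete, here is a minimal sketch of the kind of check that could be dropped into the program. It is written in C purely for illustration (the actual program may be in another language), and the helper name count_team_threads is hypothetical. It requests a thread count through omp_set_num_threads and reports how many threads actually form the team in a parallel region, which may differ from the number of threads Resource Monitor shows as created.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper, not taken from the original program.
 * Requests `requested` threads, opens a parallel region, and returns
 * the size of the team that actually executed it.
 * When compiled without OpenMP support it simply returns 1. */
int count_team_threads(int requested)
{
    int team_size = 1;

    assert(requested > 0);

#ifdef _OPENMP
    /* Affects parallel regions that follow and have no num_threads clause. */
    omp_set_num_threads(requested);

    #pragma omp parallel
    {
        /* One thread records the team size; the others wait at the
         * implicit barrier at the end of the single construct. */
        #pragma omp single
        team_size = omp_get_num_threads();
    }
#else
    (void)requested; /* OpenMP disabled: everything runs on one thread. */
#endif

    return team_size;
}
```

Calling count_team_threads(4) just before the slow parallel section, and printing the result, would show whether the team really has the requested size at that point or whether the work is ending up on fewer threads than were created.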
Any help will be greatly appreciated.
Thanks,
Jon