Similarity Forum
General Category => General => Topic started by: ektorbarajas on March 14, 2013, 23:10:37
-
Hi.
I noticed that when running an OpenCL benchmark, the results are expressed as a speed in ms, for example:
CPU Speed (1 Core): 281ms
OpenCL Device Speed: 78ms
According to this, OpenCL should be enabled, since OpenCL is faster than 1 CPU core,
BUT the online help says the opposite:
"Don’t turn on OpenCL acceleration if the results of OpenCL benchmark are lower than processor’s benchmark, because this will increase overhead costs due to increased expenses for data transfer to the OpenCL device. "
So what is the correct statement?
Thanks
-
Also, I'd appreciate some clarification on this:
For the parameter "Work Group Size", are higher numbers better? So if a device supports 1024, is 1024 better than 64?
Thanks
-
The benchmark doesn't count transfer time; it only shows pure math power. In your situation the OpenCL device is more powerful than your CPU, so you can enable it.
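To put those numbers in perspective, here is a minimal Python sketch of the speedup arithmetic. The `transfer_ms` parameter is a hypothetical stand-in for data-transfer time, which the benchmark does not measure; the 50 ms used below is made up purely for illustration.

```python
def speedup(cpu_ms: float, opencl_ms: float, transfer_ms: float = 0.0) -> float:
    """Return how many times faster the OpenCL device is than one CPU core.

    transfer_ms is a hypothetical extra cost for moving data to the device;
    the benchmark itself does not report it.
    """
    total_ms = opencl_ms + transfer_ms
    if total_ms <= 0:
        raise ValueError("total OpenCL time must be positive")
    return cpu_ms / total_ms


# Benchmark figures from the first post: pure math power only.
print(round(speedup(281, 78), 2))
# Same figures with an assumed (made-up) 50 ms transfer cost.
print(round(speedup(281, 78, transfer_ms=50), 2))
```

The gain shrinks once transfer time is added, which is why the benchmark alone is only an upper bound on the real benefit.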
Work size has no simple best value. Try different values and benchmark each one; some video cards perform best at 128, some at 64.
-
So the rule of thumb is that if the benchmark reports an OpenCL time lower than the time for 1 CPU core, then OpenCL should be used?
For example:
CPU Speed (1 Core): 281ms
OpenCL Device Speed: 78ms
means that there is a real benefit in enabling OpenCL
While
CPU Speed (1 Core): 581ms
OpenCL Device Speed: 754ms
means that OpenCL MUST NOT BE enabled?
Then indeed the help statement is wrong:
"Don’t turn on OpenCL acceleration if the results of OpenCL benchmark are lower than processor’s benchmark, because this will increase overhead costs due to increased expenses for data transfer to the OpenCL device. "
Regards
-
Yes, it's not fully correct: "lower" in this context means higher time results. We will correct it in the next releases.
-
Hi. It has been a long time :)
Just wanted to know if this minor issue has been corrected in the beta 1.9.0?
Regards
-
Thanks for the reminder, we fixed the help text:
Don’t turn on OpenCL acceleration if the results of OpenCL benchmark are longer in ms. than processor’s benchmark, because this will increase overhead costs due to increased expenses for data transfer to the OpenCL device.
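Read as a rule, the corrected text boils down to comparing the two benchmark times. A minimal sketch in Python, using the figures quoted earlier in this thread (the function name is only for illustration, it is not part of Similarity):

```python
def should_enable_opencl(cpu_ms: float, opencl_ms: float) -> bool:
    """Enable OpenCL only when its benchmark time is shorter than the CPU's.

    Both values are benchmark times in milliseconds, so smaller means faster.
    """
    return opencl_ms < cpu_ms


print(should_enable_opencl(281, 78))   # OpenCL is faster here
print(should_enable_opencl(581, 754))  # OpenCL is slower here
```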
-
Great!!!
Kind regards
-
Which would be better in this situation:
Work size= 64
CPU= 600ms
GPU= 74ms
Work size= 512
CPU= 700ms
GPU= 150ms
With work size 512 the GPU is still much faster than the CPU. So would 512 be faster than 64 in this case?
-
Lower values are better.
Use work size 64. Work size is how many separate "threads/processes" run simultaneously; it isn't directly related to performance, and different devices have different optimum values. 512 doesn't mean better than 64.
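As a rough sketch of trying several work sizes and keeping the fastest, the Python snippet below picks the work size with the shortest GPU time. The dictionary just holds the two results quoted above; `best_work_size` is a made-up helper name, not something Similarity provides.

```python
def best_work_size(gpu_ms_by_work_size: dict[int, float]) -> int:
    """Return the work size whose GPU benchmark time (ms) is the shortest."""
    if not gpu_ms_by_work_size:
        raise ValueError("no benchmark results given")
    return min(gpu_ms_by_work_size, key=gpu_ms_by_work_size.get)


# GPU times from the post above: work size -> milliseconds.
results = {64: 74, 512: 150}
print(best_work_size(results))
```

Other sizes, such as 128 or 256, can be added to the dictionary after benchmarking them on the same device.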