Understanding the usage of OpenCL in OpenCV (Mat/UMat objects)
In code, using a Mat object always runs on the CPU and using a UMat object always runs on the GPU, irrespective of the call ocl::setUseOpenCL(true/false);
I'm sorry, because I'm not sure if this is a question or a statement... in either case it's partially true. In 3.0, for the UMat, if no OpenCL-enabled GPU is available then OpenCV just runs everything on the CPU. If you specifically ask for a Mat, you get it on the CPU. And in your case you have directed both to run on each of your GPUs/CPU by selecting each specifically (more on "choosing a CPU" below)... read this:
Few design choices support the new architecture:
A unified abstraction cv::UMat that enables the same APIs to be implemented using CPU or OpenCL code, without a requirement to call OpenCL accelerated version explicitly. These functions use an OpenCL-enabled GPU if exists in the system, and automatically switch to CPU operation otherwise.
The UMat abstraction enables functions to be called asynchronously. Unlike the cv::Mat of the OpenCV version 2.x, access to the underlying data for the cv::UMat is performed through a method of the class, and not through its data member. Such an approach enables the implementation to explicitly wait for GPU completion only when CPU code absolutely needs the result.
The UMat implementation makes use of CPU-GPU shared physical memory available on Intel SoCs, including allocations that come from pointers passed into OpenCV.
I think there also might be a misunderstanding about "using OpenCL". When you use a UMat, you are specifically trying to use the GPU. And, I'll plead some ignorance here, but as a result I believe that CV is using some of the CL library to make that happen automatically. As an aside, in 2.X we had cv::ocl to do this specifically/manually, so be careful if you are using that 2.X legacy code in 3.X. There are reasons to do it, but they are not always straightforward. But, back on topic, when you say "with OpenCL UMat", you are potentially being redundant. The CL code you have in your snippet is basically finding out what equipment is installed, how many devices there are, what their names are, and choosing which to use... I'd have to dig through the way it is instantiated, but perhaps when you make it a UMat it automatically sets OpenCL to true? That would definitely support the data you presented. You could probably test that idea by checking what cv::ocl::useOpenCL() returns after you call ocl::setUseOpenCL(false) and then use a UMat.
Finally, I'm guessing your CPU has a built-in GPU. So it is running parallel processing with OpenCL and not paying a time penalty to travel to the separate/dedicated GPU and back, hence your perceived performance increase over the GPUs (since it is not technically the CPU running it)... only when you are specifically using the Mat is the CPU alone being used.
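When comparing timings like these, it helps to measure every variant the same way. Below is a small sketch in plain C++ of an averaging timer; the name averageMillis and its parameters are hypothetical and not part of OpenCV.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper: average wall-clock milliseconds per call of `work`.
// `warmup` calls run first and are not timed, so one-off setup costs
// are kept out of the average.
double averageMillis(const std::function<void()>& work,
                     int iterations, int warmup = 1)
{
    for (int i = 0; i < warmup; ++i)
        work();

    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i)
        work();
    auto stop = std::chrono::steady_clock::now();

    std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / iterations;
}
```

With OpenCV, one could pass a lambda that calls cv::cvtColor on a UMat and then cv::ocl::finish(), so that any queued OpenCL work is included in the measurement. Whether that changes the numbers is something to measure rather than assume.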
Your last question, I'm not sure about... this is my speculation: the OpenCL architecture exists on the GPU; when you install CV with CL you are installing the link between the two libraries and the associated header files. I'm not sure which DLL files you need to make that magic happen.
Suraksha Ajith
Updated on June 05, 2022
Comments
-
Suraksha Ajith almost 2 years
I ran the code below to check for a performance difference between GPU and CPU usage. I am calculating the average time for the cv::cvtColor() function. I make four function calls:
- just_mat() (without using OpenCL, for a Mat object)
- just_umat() (without using OpenCL, for a UMat object)
- opencl_mat() (using OpenCL, for a Mat object)
- opencl_umat() (using OpenCL, for a UMat object)
for both CPU and GPU.
I did not find a huge performance difference between GPU and CPU usage.

    #include <iostream>
    #include <string>
    #include <vector>
    #include <opencv2/core.hpp>
    #include <opencv2/core/ocl.hpp>

    using namespace std;

    // Each function checks the average time taken by cvtColor().
    int just_mat(string loc);    // without using OpenCL
    int just_umat(string loc);   // without using OpenCL
    int opencl_mat(string loc);  // ocl::setUseOpenCL(true);
    int opencl_umat(string loc); // ocl::setUseOpenCL(true);

    int main(int argc, char* argv[])
    {
        if (argc < 2)
            return 1; // loc is taken from the first command-line argument
        string loc = argv[1];

        just_mat(loc);  // Calling function without OpenCL
        just_umat(loc); // Calling function without OpenCL

        cv::ocl::Context context;
        std::vector<cv::ocl::PlatformInfo> platforms;
        cv::ocl::getPlatfomsInfo(platforms);

        for (size_t i = 0; i < platforms.size(); i++)
        {
            // Access to platform
            const cv::ocl::PlatformInfo* platform = &platforms[i];

            // Platform name
            cout << "Platform Name: " << platform->name().c_str() << "\n" << endl;

            // Access devices within the platform
            cv::ocl::Device current_device;
            for (int j = 0; j < platform->deviceNumber(); j++)
            {
                platform->getDevice(current_device, j);
                int deviceType = current_device.type();
                cout << "Device name: " << current_device.name() << endl;
                if (deviceType == cv::ocl::Device::TYPE_CPU)
                    cout << context.ndevices() << " CPU devices are detected." << endl;
                if (deviceType == cv::ocl::Device::TYPE_GPU)
                    cout << context.ndevices() << " GPU devices are detected." << endl;
                cout << "===============================================" << endl << endl;

                switch (deviceType)
                {
                case cv::ocl::Device::TYPE_CPU:
                    cout << "CPU device\n";
                    if (context.create(deviceType))
                        opencl_mat(loc); // With OpenCL Mat
                    break;
                case cv::ocl::Device::TYPE_GPU:
                    cout << "GPU device\n";
                    if (context.create(deviceType))
                        opencl_umat(loc); // With OpenCL UMat
                    break;
                }
                cin.ignore(1);
            }
        }
        return 0;
    }
The output (in milliseconds) for the above code is:
| GPU name | With OpenCL, Mat | With OpenCL, UMat |
|----------|------------------|-------------------|
| Carrizo  | 7.69052          | 0.247069          |
| Island   | 7.12455          | 0.233345          |

| CPU | With OpenCL, Mat | With OpenCL, UMat |
|-----|------------------|-------------------|
| AMD | 6.76169          | 0.231103          |

| CPU | Without OpenCL, Mat | Without OpenCL, UMat |
|-----|---------------------|----------------------|
| AMD | 7.15959             | 0.246138             |

In code, using a Mat object always runs on the CPU and using a UMat object always runs on the GPU, irrespective of the call ocl::setUseOpenCL(true/false);
Can anybody explain the reason for the variation in the output times?
One more question: I didn't use any OpenCL-specific .dll with the .exe file, and yet the GPU was used without any error. While building OpenCV with CMake I checked WITH_OPENCL. Did this build all the required OpenCL functions into opencv_world310.dll?