How to efficiently display OpenCV video in Qt?

c++ multithreading qt opencv

27,373

Using QImage::scanLine forces a deep copy, so at the minimum, you should use constScanLine, or, better yet, change the slot's signature to:

void widget::set_image(const QImage & image);

Of course, your problem then becomes something else: the QImage instance points to the data of a frame that lives in another thread, and can (and will) change at any moment.

There is a solution for that: one needs to use fresh frames allocated on the heap, and the frame needs to be captured within QImage. QScopedPointer is used to prevent memory leaks until the QImage takes ownership of the frame.

static void matDeleter(void* mat) { delete static_cast<cv::Mat*>(mat); }

class capture {
   Q_OBJECT
   bool m_enable;
   ...
public:
   Q_SIGNAL void image_ready(const QImage &);
   ...
};

void capture::start_process() {
  m_enable = true;
  while(m_enable) {
    QScopedPointer<cv::Mat> frame(new cv::Mat);
    if (!m_video_handle->read(*frame)) {
      break;
    }
    cv::cvtColor(*frame, *frame, CV_BGR2RGB);

    // Here the image instance takes ownership of the frame.
    const QImage image(frame->data, frame->cols, frame->rows, frame->step,
                       QImage::Format_RGB888, matDeleter, frame.take());       
    emit image_ready(image);
    cv::waitKey(30);
  }
}

Of course, since Qt provides native message dispatch and a Qt event loop by default in a QThread, it's a simple matter to use QObject for the capture process. Below is a complete, tested example.

The capture, conversion and viewer all run in their own threads. Since cv::Mat is an implicitly shared class with atomic, thread-safe access, it's used as such.

The converter has an option of not processing stale frames - useful if conversion is only done for display purposes.

The viewer runs in the gui thread and correctly drops stale frames. There's never a reason for the viewer to deal with stale frames.

If you were to collect data to save to disk, you should run the capture thread at high priority. You should also inspect OpenCV apis to see if there's a way of dumping the native camera data to disk.

To speed up conversion, you could use the gpu-accelerated classes in OpenCV.

The example below makes sure that in none of the memory is reallocated unless necessary for a copy: the Capture class maintains its own frame buffer that is reused for each subsequent frame, so does the Converter, and so does the ImageViewer.

There are two deep copies of image data made (besides whatever happens internally in cv::VideoCatprure::read):

The copy to the Converter's QImage.
The copy to ImageViewer's QImage.

Both copies are needed to assure decoupling between the threads and prevent data reallocation due to the need to detach a cv::Mat or QImage that has the reference count higher than 1. On modern architectures, memory copies are very fast.

Since all image buffers stay in the same memory locations, their performance is optimal - they stay paged in and cached.

The AddressTracker is used to track memory reallocations for debugging purposes.

// https://github.com/KubaO/stackoverflown/tree/master/questions/opencv-21246766
#include <QtWidgets>
#include <algorithm>
#include <opencv2/opencv.hpp>

Q_DECLARE_METATYPE(cv::Mat)

struct AddressTracker {
   const void *address = {};
   int reallocs = 0;
   void track(const cv::Mat &m) { track(m.data); }
   void track(const QImage &img) { track(img.bits()); }
   void track(const void *data) {
      if (data && data != address) {
         address = data;
         reallocs ++;
      }
   }
};

The Capture class fills the internal frame buffer with the captured frame. It notifies of a frame change. The frame is the user property of the class.

class Capture : public QObject {
   Q_OBJECT
   Q_PROPERTY(cv::Mat frame READ frame NOTIFY frameReady USER true)
   cv::Mat m_frame;
   QBasicTimer m_timer;
   QScopedPointer<cv::VideoCapture> m_videoCapture;
   AddressTracker m_track;
public:
   Capture(QObject *parent = {}) : QObject(parent) {}
   ~Capture() { qDebug() << __FUNCTION__ << "reallocations" << m_track.reallocs; }
   Q_SIGNAL void started();
   Q_SLOT void start(int cam = {}) {
      if (!m_videoCapture)
         m_videoCapture.reset(new cv::VideoCapture(cam));
      if (m_videoCapture->isOpened()) {
         m_timer.start(0, this);
         emit started();
      }
   }
   Q_SLOT void stop() { m_timer.stop(); }
   Q_SIGNAL void frameReady(const cv::Mat &);
   cv::Mat frame() const { return m_frame; }
private:
   void timerEvent(QTimerEvent * ev) {
      if (ev->timerId() != m_timer.timerId()) return;
      if (!m_videoCapture->read(m_frame)) { // Blocks until a new frame is ready
         m_timer.stop();
         return;
      }
      m_track.track(m_frame);
      emit frameReady(m_frame);
   }
};

The Converter class converts the incoming frame to a scaled-down QImage user property. It notifies of the image update. The image is retained to prevent memory reallocations. The processAll property selects whether all frames will be converted, or only the most recent one should more than one get queued up.

class Converter : public QObject {
   Q_OBJECT
   Q_PROPERTY(QImage image READ image NOTIFY imageReady USER true)
   Q_PROPERTY(bool processAll READ processAll WRITE setProcessAll)
   QBasicTimer m_timer;
   cv::Mat m_frame;
   QImage m_image;
   bool m_processAll = true;
   AddressTracker m_track;
   void queue(const cv::Mat &frame) {
      if (!m_frame.empty()) qDebug() << "Converter dropped frame!";
      m_frame = frame;
      if (! m_timer.isActive()) m_timer.start(0, this);
   }
   void process(const cv::Mat &frame) {
      Q_ASSERT(frame.type() == CV_8UC3);
      int w = frame.cols / 3.0, h = frame.rows / 3.0;
      if (m_image.size() != QSize{w,h})
         m_image = QImage(w, h, QImage::Format_RGB888);
      cv::Mat mat(h, w, CV_8UC3, m_image.bits(), m_image.bytesPerLine());
      cv::resize(frame, mat, mat.size(), 0, 0, cv::INTER_AREA);
      cv::cvtColor(mat, mat, CV_BGR2RGB);
      emit imageReady(m_image);
   }
   void timerEvent(QTimerEvent *ev) {
      if (ev->timerId() != m_timer.timerId()) return;
      process(m_frame);
      m_frame.release();
      m_track.track(m_frame);
      m_timer.stop();
   }
public:
   explicit Converter(QObject * parent = nullptr) : QObject(parent) {}
   ~Converter() { qDebug() << __FUNCTION__ << "reallocations" << m_track.reallocs; }
   bool processAll() const { return m_processAll; }
   void setProcessAll(bool all) { m_processAll = all; }
   Q_SIGNAL void imageReady(const QImage &);
   QImage image() const { return m_image; }
   Q_SLOT void processFrame(const cv::Mat &frame) {
      if (m_processAll) process(frame); else queue(frame);
   }
};

The ImageViewer widget is the equivalent of a QLabel storing a pixmap. The image is the user property of the viewer. The incoming image is deep-copied into the user property, to prevent memory reallocations.

class ImageViewer : public QWidget {
   Q_OBJECT
   Q_PROPERTY(QImage image READ image WRITE setImage USER true)
   bool painted = true;
   QImage m_img;
   AddressTracker m_track;
   void paintEvent(QPaintEvent *) {
      QPainter p(this);
      if (!m_img.isNull()) {
         setAttribute(Qt::WA_OpaquePaintEvent);
         p.drawImage(0, 0, m_img);
         painted = true;
      }
   }
public:
   ImageViewer(QWidget * parent = nullptr) : QWidget(parent) {}
   ~ImageViewer() { qDebug() << __FUNCTION__ << "reallocations" << m_track.reallocs; }
   Q_SLOT void setImage(const QImage &img) {
      if (!painted) qDebug() << "Viewer dropped frame!";
      if (m_img.size() == img.size() && m_img.format() == img.format()
          && m_img.bytesPerLine() == img.bytesPerLine())
         std::copy_n(img.bits(), img.sizeInBytes(), m_img.bits());
      else
         m_img = img.copy();
      painted = false;
      if (m_img.size() != size()) setFixedSize(m_img.size());
      m_track.track(m_img);
      update();
   }
   QImage image() const { return m_img; }
};

The demonstration instantiates the classes described above and runs the capture and conversion in dedicated threads.

class Thread final : public QThread { public: ~Thread() { quit(); wait(); } };

int main(int argc, char *argv[])
{
   qRegisterMetaType<cv::Mat>();
   QApplication app(argc, argv);
   ImageViewer view;
   Capture capture;
   Converter converter;
   Thread captureThread, converterThread;
   // Everything runs at the same priority as the gui, so it won't supply useless frames.
   converter.setProcessAll(false);
   captureThread.start();
   converterThread.start();
   capture.moveToThread(&captureThread);
   converter.moveToThread(&converterThread);
   QObject::connect(&capture, &Capture::frameReady, &converter, &Converter::processFrame);
   QObject::connect(&converter, &Converter::imageReady, &view, &ImageViewer::setImage);
   view.show();
   QObject::connect(&capture, &Capture::started, [](){ qDebug() << "Capture started."; });
   QMetaObject::invokeMethod(&capture, "start");
   return app.exec();
}

#include "main.moc"

This concludes the complete example. Note: The previous revision of this answer unnecessarily reallocated the image buffers.

27,373

Murat Şeker

C++ Enthusiast

Updated on October 18, 2020

Comments

Murat Şeker over 3 years
I'm capturing multiple streams from ip cameras with the help of OpenCV. When i try to display these stream from an OpenCV window(cv::namedWindow(...)), it works without any problem (i have tried up to 4 streams so far).

The problem arises when i try to show these streams inside a Qt widget. Since the capturing is done in another thread, i have to use the signal slot mechanism in order to update the QWidget(which is in main thread).

Basically, i emit the newly captured frame from the capture thread and a slot in the GUI thread catches it. When i open 4 streams, i can not display the videos smoothly like before.

Here is the emitter :
```
void capture::start_process() {
    m_enable = true;
    cv::Mat frame;

    while(m_enable) {
        if (!m_video_handle->read(frame)) {
            break;
        }
        cv::cvtColor(frame, frame,CV_BGR2RGB);

        qDebug() << "FRAME : " << frame.data;

        emit image_ready(QImage(frame.data, frame.cols, frame.rows, frame.step, QImage::Format_RGB888));
        cv::waitKey(30);
    }
}
```
This is my slot :
```
void widget::set_image(QImage image) {
    img = image;
    qDebug() << "PARAMETER IMAGE: " << image.scanLine(0);
    qDebug() << "MEMBER IMAGE: " << img.scanLine(0);
}
```
The problem seems like the overhead of copying QImages continuously. Although QImage uses implicit sharing, when i compare the data pointers of images via qDebug() messages, i see different addresses.

1- Is there any way to embed OpenCV window directly into QWidget ?

2- What is the most efficient way to handle displaying multiple videos? For example, how video management systems show up to 32 cameras in the same time ?

3- What must be the way to go ?
- Micka over 10 years
  
  did you try to ignore QImage and use the data arrays directly as textures to render in a QGLWidget? That should be faster, I guess. Professional solutions might use specialized hardware, but that's just a guess, too.
madduci about 10 years

interesting solution. How would you modify it, in case you have more than one camera connected and running on the same system?
Kuba hasn't forgotten Monica about 10 years

@blackibiza The main method can be trivially modified to start multiple captures, converters and image viewers - the Capture::start method takes the camera number as an argument. The object-per-thread approach is a bit limiting, performance-wise. The threads are rather heavyweight creatures. This solution would be perhaps better off using QtConcurrent::run as that approximates the ultimate performance of GCD. This code really should be refactored to be GCD-like and I might do that one day.
Lennart Rolland almost 9 years

Saved the day bigtime for me. But now I am struggling with OpenCV being unstalbe when opening camera sources, so I am actually looking for a way to feed video from Qt into OpenCV instead hehe
Kuba hasn't forgotten Monica almost 9 years

@LennartRolland "struggling with OpenCV being unstable" What do you mean by "unstable"? If it crashes, then use the debugger to figure out why, fix it, and submit an OpenCV patch. If it doesn't crash, how else is it unstable?
Lennart Rolland almost 9 years

@KubaOber: In a perfect world that is what I would do however I have my own agenda and limited time. And it does not crash it simply can't open video sources reliably. I read that this is a known problem with the best workaround to supply video yourself (which is what I am doing with Qt now)
Vyacheslav Napadovsky over 7 years

There's an issue with this sample/ If you comment out this line converter.setProcessAll(false); and class Converter is slower then OpenCV gives you frames, you'll get out of memory pretty fast, because class Capture will eat up all memory pushing frames to Converter that can't process them in time.
Kuba hasn't forgotten Monica over 7 years

@slavanap That's why that line is there. You're not supposed to remove it :)
goulashsoup almost 6 years

@KubaOber In the ImageViewer's member function setImage you copy the image which costs time and memory. Isn't that something you wanted to avoid?
goulashsoup almost 6 years

@KubaOber Also in reference to the first short example: Why is it necessary to use a QScopedPointer when you allready use the QImage constructor which takes a _cleanupFunction_, in this case matDeleter ? To my understanding QImage would handle the destructrion of the cv::Mat pointer alone.
Kuba hasn't forgotten Monica almost 6 years

The scoped pointer is for exception safety only. It ceases to be relevant once it gives up the ownership of its data to QImage constructor. It was just an example.The copying is cheaper than reallocating memory. Objectively cheaper, BTW — you can benchmark it. Otherwise, the old frame will be retained by QImage and when the camera has a new frame, it will have to store it somewhere else, so it will allocate anew. That’s very slow since you are writing to cold caches, potentially pages pages out, etc.
Kuba hasn't forgotten Monica almost 6 years

If some thread synchronization was added so that only a single frame instance was used, then it would be possible to avoid the copy. But again: in practice, that copy has a very low cost. To avoid it, some sacrifices need to be made that couple the capture, converter and display threads. The converter has to copy because it changes the image, and changing the image is always done by a copy that perhaps computes. Even the RGB to BGR in-place conversion copies data (has to read it all and write it all).