OpenCV & Python Multithreading - Seeking within a VideoCapture Object

python multithreading opencv

15,408

Solution 1

Based on the comments on the original question I've done some testing and thought it worth sharing the (interesting) results. Big savings potential for anyone using OpenCV's VideoCapture.set(CAP_PROP_POS_MSEC) or VideoCapture.set(CAP_PROP_POS_FRAMES).

I've done some profiling comparing three options:

1. GET FRAMES BY SEEKING TO TIME:

frames = {}
def get_all_frames_by_ms(time):
    while True:
        video_capture.set(cv2.CAP_PROP_POS_MSEC, time)
        capture_success, frames[time] = video_capture.read()
        if not capture_success:
            break
        time += 1000

2. GET FRAMES BY SEEKING TO FRAME NUMBER:

frames = {}
def get_all_frames_by_frame(time):
    while True:
        # Note my test video is 12.333 FPS, and time is in milliseconds
        video_capture.set(cv2.CAP_PROP_POS_FRAMES, int(time/1000*12.333))
        capture_success, frames[time] = video_capture.read()
        if not capture_success:
            break
        time += 1000

3. GET FRAMES BY GRABBING ALL, BUT RETRIEVING ONLY ONES I WANT:

frames = {}
def get_all_frames_in_order():
    prev_time = -1
    while True:
        grabbed = video_capture.grab()
        if grabbed:
            time_s = video_capture.get(cv2.CAP_PROP_POS_MSEC) / 1000
            if int(time_s) > int(prev_time):
                # Only retrieve and save the first frame in each new second
                # (retrieve() returns a (success, frame) tuple)
                _, frames[int(time_s)] = video_capture.retrieve()
            prev_time = time_s
        else:
            break

Running through those three approaches, the timings (from three runs of each) are as follows:

1. Seeking to time:         33.78s 29.65s 29.24s
2. Seeking to frame number: 31.95s 29.16s 28.35s
3. Grabbing all in order:   11.81s 10.76s 11.73s

In each case it's saving 100 frames at 1-second intervals into a dictionary, where each frame is a 3072x1728 image, from a .mp4 video file. All on a 2015 MacBook Pro with a 2.9 GHz Intel Core i5 and 8GB RAM.
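For anyone wanting to repeat this kind of comparison, here is a minimal timing sketch. The helper name `time_runs` is made up for illustration and is not the harness used for the numbers above; it simply wraps each call in `time.perf_counter()`.

```python
import time

def time_runs(func, runs=3):
    """Call func() `runs` times and return the wall-clock seconds taken by each call."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations
```

Each run should start from a freshly opened (or rewound) capture, otherwise a sequential reader will begin its second run already at the end of the file.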

Conclusions so far: if you only need some of the frames from a video, it is well worth running through all frames in order with grab() and calling retrieve() only for the ones you're interested in, as an alternative to seeking and calling read() (which grabs and retrieves in one go). That gave me an almost 3x speedup.
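The grab-everything, retrieve-selectively idea can be packaged as a small reusable function. This is only a sketch: `sample_frames` and its parameters are hypothetical names, and `capture` is assumed to behave like a `cv2.VideoCapture` (it only needs `grab()`, `retrieve()` and `get()`). The default `pos_msec_prop=0` is the numeric value of `cv2.CAP_PROP_POS_MSEC`, used so the sketch does not have to import cv2 itself.

```python
def sample_frames(capture, interval_ms=1000, pos_msec_prop=0):
    """Return {timestamp_ms: frame} holding the first frame of each interval.

    Every frame is grabbed in order, but retrieve() is only called for the
    first frame that falls into each new interval.
    """
    frames = {}
    last_bucket = -1
    while capture.grab():
        # Which interval does the frame we just grabbed belong to?
        bucket = int(capture.get(pos_msec_prop) // interval_ms)
        if bucket > last_bucket:
            ok, frame = capture.retrieve()
            if ok:
                frames[bucket * interval_ms] = frame
                last_bucket = bucket
    return frames
```

With OpenCV this would be called as `sample_frames(cv2.VideoCapture(path))`, passing `cv2.CAP_PROP_POS_MSEC` explicitly if you prefer not to rely on the default.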

I've also re-looked at multi-threading on this basis. I've got two test functions - one that gets the frames, and another that processes them once they're available:

from time import sleep

frames = {}

def get_all_frames_in_order():
    prev_time = -1
    while True:
        grabbed = video_capture.grab()
        if grabbed:
            time_s = video_capture.get(cv2.CAP_PROP_POS_MSEC) / 1000
            if int(time_s) > int(prev_time):
                # Only retrieve and save the first frame in each new second
                # retrieve() returns a (success, frame) tuple
                _, frames[int(time_s)] = video_capture.retrieve()
            prev_time = time_s
        else:
            break

def process_all_frames_as_available(processing_time):
    # frames is keyed by whole seconds, so step in seconds as well
    time_increment = 1
    prev_time = 0
    while True:
        this_time = prev_time + time_increment
        if this_time in frames and prev_time in frames:
            # Dummy processing loop - just sleeps for specified time
            sleep(processing_time)
            prev_time += time_increment
            # video_duration is the length of the video in seconds;
            # stop once there is no later frame left to pair with
            if prev_time + time_increment >= video_duration:
                break
        else:
            # If the frames aren't ready yet, wait a short time before trying again
            sleep(0.02)

For this testing, I then called them either one after the other (sequentially, single-threaded), or with the following multi-threaded code:

from threading import Thread

get_frames_thread = Thread(target=get_all_frames_in_order)
get_frames_thread.start()
process_frames_thread = Thread(target=process_all_frames_as_available, args=(0.02,))
process_frames_thread.start()
get_frames_thread.join()
process_frames_thread.join()

Based on that, I'm now happy that multi-threading is working effectively and saving a significant amount of time. I generated timings for the two functions above separately, and then together in both single-threaded and multi-threaded modes. The results are below (the "Process time" heading of each group is the time in seconds that the 'processing' of each frame takes, which in this case is just a dummy delay):

get_all_frames_in_order - 2.99s

Process time = 0.02s per frame:
process_all_frames_as_available - 0.97s
single-threaded - 3.99s
multi-threaded - 3.28s

Process time = 0.1s per frame:
process_all_frames_as_available - 4.31s
single-threaded - 7.35s
multi-threaded - 4.46s

Process time = 0.2s per frame:
process_all_frames_as_available - 8.52s
single-threaded - 11.58s
multi-threaded - 8.62s

As you can hopefully see, the multi-threading results are very good. Running both functions in parallel takes only about 0.1-0.3s longer than the slower of the two functions running entirely on its own.
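As a variation on the two-thread setup above, the shared dictionary and the `sleep(0.02)` polling can be swapped for a `queue.Queue`, which blocks until a frame is ready. This is a different hand-off mechanism from the one that was timed, shown only as a sketch; `run_pipeline`, `frame_source` and `process` are hypothetical names, and `frame_source` stands in for whatever yields `(timestamp, frame)` pairs.

```python
import queue
import threading

_DONE = object()  # sentinel placed on the queue when the producer has finished

def run_pipeline(frame_source, process, max_pending=8):
    """Consume frames on the calling thread while a worker thread produces them."""
    pending = queue.Queue(maxsize=max_pending)

    def produce():
        try:
            for item in frame_source:
                pending.put(item)  # blocks while the queue is full
        finally:
            pending.put(_DONE)  # always signal completion, even after an error

    producer = threading.Thread(target=produce, daemon=True)
    producer.start()

    results = []
    while True:
        item = pending.get()  # blocks until the producer supplies something
        if item is _DONE:
            break
        timestamp, frame = item
        results.append(process(timestamp, frame))
    producer.join()
    return results
```

The bounded queue also caps how many decoded frames sit in memory at once, which matters with frames as large as 3072x1728.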

Hope that helps someone!

Solution 2

Coincidentally, I've worked on a similar problem, and I have created a python library (more of a thin wrapper) for reading videos. The library is called mydia.

The library does not use OpenCV. It uses FFmpeg as the backend for reading and processing videos.

mydia supports custom frame selection, frame resizing, grayscale conversion and much more. See the mydia documentation for details.

So, if you want to select N frames per second (where N = 1 in your case), the following code would do it:

import numpy as np
from mydia import Videos

video_path = "path/to/video"

def select_frames(total_frames, num_frames, fps, *args):
    """This function will return the indices of the frames to be captured"""
    N = 1
    t = np.arange(total_frames)
    f = np.arange(fps)  # positions within each one-second block of frames
    mask = np.resize(f, total_frames)

    return t[mask < N][:num_frames].tolist()

# Let's assume that the duration of your video is 120 seconds
# and you want 1 frame for each second 
# (therefore, setting `num_frames` to 120)
reader = Videos(num_frames=120, mode=select_frames)

video = reader.read(video_path)  # A video tensor/array

The best part is that internally, only those frames that are required are read, and therefore the process is much faster (which is what I believe you are looking for).
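To see which indices a selector that aims for N frames per second would pick, here is a library-independent sketch in plain Python. It does not use mydia; `first_n_per_second` is a hypothetical helper, and rounding the frame rate to a whole number is an assumption made to keep the example simple.

```python
def first_n_per_second(total_frames, fps, n=1):
    """Return the indices of the first `n` frames in each block of round(fps) frames."""
    block = max(1, round(fps))  # frames per one-second block, rounded for non-integer rates
    return [i for i in range(total_frames) if i % block < n]
```

For example, `first_n_per_second(50, 12.333)` treats every 12 frames as one second. With a non-integer rate the rounded block size drifts slowly away from real time, so long videos may need timestamps instead of index arithmetic.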

Installing mydia is simple; the steps are covered in the project's documentation.

This might have a slight learning curve, but I believe that it is exactly what you are looking for.

Moreover, if you have multiple videos, you could use multiple workers for reading them in parallel. For instance:

from mydia import Videos

path = "path/to/video"
reader = Videos()
video = reader.read(path, workers=4)

Depending on your CPU, this could give you a significant speed-up.

Hope this helps !!

Author by

DaveWalker

Long history with lots of languages, including most recently Python, Swift, C++ and BASH - but also things like Matlab, PHP, HTML / CSS / etc. I manage programmes and projects which involve code, but most of my own coding is now done at home for interest and awareness, rather than being a core part of my work.

Updated on July 19, 2022

Comments

DaveWalker almost 2 years
I've been working on a python application which uses OpenCV to read frames from a video and create a composite of the "activity", i.e. the things that have changed from one frame to the next. To do that, I only really want to check one frame per second or so.

For a long time I've been using the following code (simplified, with some error checking, classes, etc removed for brevity) to get the video object and the first frame:
```
video_capture = cv2.VideoCapture(video_fullpath)
this_frame = get_frame(0)

def get_frame(time):
    video_capture.set(cv2.CAP_PROP_POS_MSEC, time)
    capture_success, this_frame = video_capture.read()
    return this_frame
```
The process of getting subsequent frames, using the set() and read() calls in get_frame above, is really slow. On a 2015 MacBook Pro it takes 0.3-0.4s to get each frame (at 1-second intervals in the video, which is a ~100MB .mp4 video file). By comparison, the rest of my operations, which compare each frame to its predecessor, are very quick - typically less than 0.01s.

I've therefore been looking at multi-threading, but I'm struggling.

I can get multi-threading working on a "lookahead" basis, i.e. whilst I'm processing one frame I can be getting the next one. And once I'm done processing the previous frame, I'll wait for the "lookahead" operation to finish before continuing. I do that with the following code:
```
while True:
    this_frame, next_frame_thread = get_frame_async(prev_frame.time + time_increment)
    << do processing of this_frame ... >>
    next_frame_thread.join()

def get_frame_async(time):
    if time not in frames:
        frames[time] = get_frame(time)
    next_frame_thread = Thread(target=get_frame, args=(time,))
    next_frame_thread.start()
    return frames[time], next_frame_thread
```
The above seems to be working, but because the seeking operation is so slow compared to everything else it doesn't actually save much time - in fact it's difficult to see any benefit at all.

I then wondered whether I could be getting multiple frames in parallel. However, whenever I try I get a range of errors, mostly related to async_lock (e.g. Assertion fctx->async_lock failed at libavcodec/pthread_frame.c:155). I wonder whether this is simply that an OpenCV VideoCapture object can't seek to multiple places at once... which would seem reasonable. But if that's true, is there any way to speed this operation up significantly?

I've been using a few different sources, including this one https://nrsyed.com/2018/07/05/multithreading-with-opencv-python-to-improve-video-processing-performance/ which shows huge speed-ups, but I'm struggling with why I'm getting these errors around async_lock. Is it just the seek operation? I can't find any examples of multithreading whilst seeking around the video - just examples of people reading all frames sequentially.

Any tips or guidance on where / which parts are most likely to benefit from multithreading (or another approach) would be most welcome. This is my first attempt at multithreading, so completely accept I might have missed something obvious! Based on this page (https://www.toptal.com/python/beginners-guide-to-concurrency-and-parallelism-in-python), I was a bit overwhelmed by the range of different options available.

Thanks!
Gazihan Alankus over 4 years

I think you have a msec-sec bug in your code. You put seconds into frames and you increment this_time = prev_time + 1000 and use this_time in frames.