Getting video properties with Python without calling external software


OK, after investigating this myself because I needed it too, it looks like it can be done with hachoir. Here's a code snippet that can give you all the metadata hachoir can read:

import os
import re
from hachoir.parser import createParser
from hachoir.metadata import extractMetadata

def get_video_metadata(path):
    """
        Given a path, returns a dictionary of the video's metadata, as parsed by hachoir.
        Keys vary by exact filetype, but for an MP4 file on my machine,
        I get the following keys (inside of "Common" subdict):
            "Duration", "Image width", "Image height", "Creation date",
            "Last modification", "MIME type", "Endianness"

        Dict is nested - common keys are inside of a subdict "Common",
        which will always exist, but some keys *may* be inside of
        video/audio specific stream subdicts, named "Video Stream #1"
        or "Audio Stream #1", etc. Not all formats result in this
        separation.

        :param path: str path to video file
        :return: dict of video metadata
    """

    if not os.path.exists(path):
        raise ValueError("Provided path to video ({}) does not exist".format(path))

    parser = createParser(path)
    if not parser:
        raise RuntimeError("Unable to get metadata from video file")

    with parser:
        metadata = extractMetadata(parser)

        if not metadata:
            raise RuntimeError("Unable to get metadata from video file")

    metadata_dict = {}
    line_matcher = re.compile(r"-\s(?P<key>.+):\s(?P<value>.+)")
    group_key = None  # group_key stores which group we're currently in for nesting subkeys
    for line in metadata.exportPlaintext():  # this is what hachoir offers for dumping readable information
        parts = line_matcher.match(line)  # try to parse the line as a "- key: value" entry
        if not parts:  # not all lines have metadata - at least one is a header
            if line == "Metadata:":  # if it's the generic header, set it to "Common" to match items with multiple streams, so there's always a Common key
                group_key = "Common"
            else:
                group_key = line[:-1]  # strip off the trailing colon of the group header and set it to be the current group we add other keys into
            metadata_dict[group_key] = {}  # initialize the group
            continue

        if group_key:  # if we're inside of a group, then nest this key inside it
            metadata_dict[group_key][parts.group("key")] = parts.group("value")
        else:  # otherwise, put it in the root of the dict
            metadata_dict[parts.group("key")] = parts.group("value")

    return metadata_dict

This seems to return good results for me right now and requires no extra installs. The keys seem to vary a decent amount by video and type of video, so you'll need to do some checking and not just assume any particular key is there. This code is written for Python 3 and is using hachoir3 and adapted from hachoir3 documentation - I haven't investigated if it works for hachoir for Python 2.
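
Since the same key may sit under "Common" or under a stream-specific group, a small lookup helper can save repeated looping over the groups. This is only a minimal sketch, not part of hachoir: `find_metadata_value` is a hypothetical helper name, and the sample dict merely mimics the nested shape returned by get_video_metadata above.

```python
def find_metadata_value(metadata_dict, key, default=None):
    """Return the first value stored under `key`, checking the root of
    the dict first and then each group (sub-dict) in insertion order."""
    value = metadata_dict.get(key)
    if value is not None and not isinstance(value, dict):
        return value
    for group in metadata_dict.values():
        if isinstance(group, dict) and key in group:
            return group[key]
    return default


# Illustrative data only: mimics the nested shape, not real hachoir output.
sample = {
    "Common": {"Duration": "32 sec 64 ms", "MIME type": "video/mp4"},
    "Video Stream #1": {"Image width": "1920 pixels", "Image height": "1080 pixels"},
}

print(find_metadata_value(sample, "Image width"))      # found inside a stream group
print(find_metadata_value(sample, "Bit rate", "n/a"))  # missing key falls back to the default
```

If a key occurs in several groups, this returns the first one it finds, so it may not be the right choice when the audio and video streams both report the same key.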

In case it's useful, I also have the following for turning the text-based duration values into seconds:

def length(duration_value):

    # get the individual time components; every component is optional (eg: "ms" may be missing)
    # "hour"/"hours" is accepted alongside "hrs" in case the unit text differs between hachoir versions
    time_split = re.match(r"(?P<hours>\d+\s(?:hrs|hours?))?\s*(?P<minutes>\d+\smin)?\s*(?P<seconds>\d+\ssec)?\s*(?P<ms>\d+\sms)?", duration_value)

    fields_and_multipliers = {  # multipliers to convert each value to seconds
        "hours": 3600,
        "minutes": 60,
        "seconds": 1,
        "ms": 0.001
    }

    total_time = 0
    for group in fields_and_multipliers:  # iterate through each portion of time, multiply until it's in seconds and add to total
        if time_split.group(group) is not None:  # not all groups will be defined for all videos (eg: "hrs" may be missing)
            total_time += float(time_split.group(group).split(" ")[0]) * fields_and_multipliers[group]  # get the number from the match and multiply it to make seconds

    return total_time
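
Other values also come back as text with a unit attached, for example "1080 pixels" for the image height. Here is a minimal sketch for pulling out the leading number; `leading_number` is a hypothetical helper, and it assumes the value starts with a plain integer or decimal number.

```python
import re

_NUMBER_AT_START = re.compile(r"\s*(-?\d+(?:\.\d+)?)")


def leading_number(value):
    """Return the number at the start of a text value as int or float,
    or None if the text does not start with a number."""
    match = _NUMBER_AT_START.match(value)
    if match is None:
        return None
    text = match.group(1)
    return float(text) if "." in text else int(text)


print(leading_number("1080 pixels"))  # 1080
print(leading_number("100.0%"))       # 100.0
print(leading_number("video/mp4"))    # None
```

The unit itself is discarded, so the caller still has to know what the number means.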
Author by

ullix

Updated on June 03, 2022

Comments

  • ullix
    ullix almost 2 years

    [Update:] Yes, it is possible, now some 20 months later. See Update3 below! [/update]

    Is that really impossible? All I could find were variants of calling FFmpeg (or other software). My current solution is shown below, but what I really would like to get for portability is a Python-only solution that doesn't require users to install additional software.

    After all, I can easily play videos using PyQt's Phonon, yet I can't get simple things like the dimensions or duration of the video?

    My solution uses ffmpy (http://ffmpy.readthedocs.io/en/latest/ffmpy.html ) which is a wrapper for FFmpeg and FFprobe (http://trac.ffmpeg.org/wiki/FFprobeTips). Smoother than other offerings, yet it still requires an additional FFmpeg installation.

        import ffmpy, subprocess, json
        ffprobe = ffmpy.FFprobe(global_options="-loglevel quiet -sexagesimal -of json -show_entries stream=width,height,duration -show_entries format=duration -select_streams v:0", inputs={"myvideo.mp4": None})
        print("ffprobe.cmd:", ffprobe.cmd)  # printout the resulting ffprobe shell command
        stdout, stderr = ffprobe.run(stderr=subprocess.PIPE, stdout=subprocess.PIPE)
        # std* is byte sequence, but json in Python 3.5.2 requires str
        ff0string = str(stdout,'utf-8')
    
        ffinfo = json.loads(ff0string)
        print(json.dumps(ffinfo, indent=4)) # pretty print
    
        print("Video Dimensions: {}x{}".format(ffinfo["streams"][0]["width"], ffinfo["streams"][0]["height"]))
        print("Streams Duration:", ffinfo["streams"][0]["duration"])
        print("Format Duration: ", ffinfo["format"]["duration"])
    

    Results in output:

        ffprobe.cmd: ffprobe -loglevel quiet -sexagesimal -of json -show_entries stream=width,height,duration -show_entries format=duration -select_streams v:0 -i myvideo.mp4
        {
            "streams": [
                {
                    "duration": "0:00:32.033333",
                    "width": 1920,
                    "height": 1080
                }
            ],
            "programs": [],
            "format": {
                "duration": "0:00:32.064000"
            }
        }
        Video Dimensions: 1920x1080
        Streams Duration: 0:00:32.033333
        Format Duration:  0:00:32.064000
    

    UPDATE after several days of experimentation: The hachoir solution as proposed by Nick below does work, but it will give you a lot of headaches, as the hachoir responses are too unpredictable. Not my choice.

    With opencv coding couldn't be any easier:

    import cv2
    vid = cv2.VideoCapture( picfilename)
    height = vid.get(cv2.CAP_PROP_FRAME_HEIGHT) # always 0 in Linux python3
    width  = vid.get(cv2.CAP_PROP_FRAME_WIDTH)  # always 0 in Linux python3
    print ("opencv: height:{} width:{}".format( height, width))
    

    The problem is that it works well on Python2 but not on Py3. Quote: "IMPORTANT NOTE: MacOS and Linux packages do not support video related functionality (not compiled with FFmpeg)" (https://pypi.python.org/pypi/opencv-python).

    On top of this, it seems that opencv needs the binary packages of FFmpeg to be present at runtime (https://docs.opencv.org/3.3.1/d0/da7/videoio_overview.html).

    Well, if I need an installation of FFmpeg anyway, I can stick to my original ffmpy example shown above :-/

    Thanks for the help.

    UPDATE2: master_q (see below) proposed MediaInfo. While this failed to work on my Linux system (see my comments), the alternative of using pymediainfo, a py wrapper to MediaInfo, did work. It is simple to use, but it takes 4 times longer than my initial ffprobe approach to obtain duration, width and height, and still needs external software, i.e. MediaInfo:

    from pymediainfo import MediaInfo
    media_info = MediaInfo.parse("myvideofile")
    for track in media_info.tracks:
        if track.track_type == 'Video':
            print("duration (millisec):", track.duration)
            print("width, height:", track.width, track.height)
    

    UPDATE3: OpenCV is finally available for Python3, and is claimed to run on Linux, Win, and Mac! It makes it really easy, and I verified that external software - in particular ffmpeg - is NOT needed!

    First install OpenCV via Pip:

    pip install opencv-python
    

    Run in Python:

    import cv2
    cv2video = cv2.VideoCapture( videofilename)
    height = cv2video.get(cv2.CAP_PROP_FRAME_HEIGHT)
    width  = cv2video.get(cv2.CAP_PROP_FRAME_WIDTH) 
    print ("Video Dimension: height:{} width:{}".format( height, width))
    
    framecount = cv2video.get(cv2.CAP_PROP_FRAME_COUNT ) 
    frames_per_sec = cv2video.get(cv2.CAP_PROP_FPS)
    print("Video duration (sec):", framecount / frames_per_sec)
    
    # equally easy to get this info from images
    cv2image = cv2.imread(imagefilename, flags=cv2.IMREAD_COLOR  )
    height, width, channel  = cv2image.shape
    print ("Image Dimension: height:{} width:{}".format( height, width))
    

    I also needed the first frame of a video as an image, and used ffmpeg for this to save the image in the file system. This also is easier with OpenCV:

    hasFrames, cv2image = cv2video.read()   # reads 1st frame
    cv2.imwrite("myfilename.png", cv2image) # extension defines image type
    

    But even better, as I need the image only in memory for use in the PyQt5 toolkit, I can directly read the cv2 image into a Qt image:

    bytesPerLine = 3 * width
    # my_qt_image = QImage(cv2image, width, height, bytesPerLine, QImage.Format_RGB888) # may give false colors!
    my_qt_image = QImage(cv2image.data, width, height, bytesPerLine, QImage.Format_RGB888).rgbSwapped() # correct colors on my systems
    

    As OpenCV is a huge program, I was concerned about timing. It turned out that OpenCV was never behind the alternatives. It takes some 100 ms to read a slide; everything else combined never takes more than 10 ms.

    I tested this successfully on Ubuntu Mate 16.04, 18.04, and 19.04, and on two different installations of Windows 10 Pro. (I did not have a Mac available.) I am really delighted about OpenCV!

    You can see it in action in my SlideSorter program, which lets you sort images and videos, preserve the sort order, and present them as a slideshow. Available here: https://sourceforge.net/projects/slidesorter/

  • ullix
    ullix over 6 years
    Well, that looks like we are getting there. But look at that RegEx jungle and all the caveats! A little bit of json might do wonders.
  • ullix
    ullix over 6 years
    I found opencv could give the answer as a 3-liner, were it not for the problem that "MacOS and Linux packages do not support video related functionality (not compiled with FFmpeg)."
  • Nick
    Nick over 6 years
    Could you clarify your comment about using JSON? The text returned from hachoir lacks structure and the dict this function returns is effectively a cleaner representation similar to data loaded from JSON. I initially implemented this with splits instead of regexes, but as I encountered more edge cases, it made more sense to use regexes
  • ullix
    ullix over 6 years
    My json comments were directed to the hachoir folks, and not to your regexes. I am wondering what the value of "- Image height: 1080 pixels", or of "- Comment: User volume: 100.0%" is in programs? Wouldn't I rather need the numbers themselves, like 1080 or 100% (as numbers, not as strings)? It takes additional effort to extract the numbers.
  • ullix
    ullix over 6 years
    Your code could be made more 'jsonic' to me by replacing return metadata_dict with return json.dumps(metadata_dict, indent=4) - works just as well for pretty printing as for processing. It still leaves the need for extracting the numbers, which I did with if ... elif ...else statements within your for loop. Not nice but a workaround to the hachoir limits.
  • ullix
    ullix over 6 years
    After some more fiddling I am underwhelmed with the hachoir stuff. Depending on the video, keys like width, height and duration may be under completely different headers, which makes a dictionary – be it json compliant or not – not very useful. In addition, some keys come in duplicates, even within the same header, resulting in overwriting of the first one. Eventually I went to string search functions and string splitting to extract width, height and duration. Not sure that is reliable? (Note: your def length fails when no "ms" is present; a missing "?" at the end of the match string?)
  • Nick
    Nick over 6 years
    Thanks for the heads up on the length failure. I hadn't encountered any that were missing ms yet - I'll have to correct that. I definitely agree that the approach is limited - it works for my needs right now, but sounds like it doesn't for yours. Definitely not for more complex use cases because it'll require repeated looping through the main keys to check for values.
  • ullix
    ullix over 6 years
    Thanks. In Ubuntu Linux Mate 16.04 you had to install python-mediainfodll (Py2) or python3-mediainfodll (Py3), and then import it as MediaInfoDLL (Py2) or MediaInfoDLL3 (Py3). Will give it a try.
  • ullix
    ullix over 6 years
    Quite odd: while the import can be done, and the MI.Open is ok, in Py2 each "value" is empty, and in Py3 the MI.Get command always produces an error. Tried with an mp4 and a mov file.
  • master_q
    master_q over 6 years
    Did not play with it on Linux yet, will soon though. On Windows I have imported MediaInfoDLL.py (MediaInfoDLL3.py is exactly the same, not needed) and I have Mediainfo.DLL included in working directory.
  • ullix
    ullix over 6 years
    On (Ubuntu) Linux you must import MediaInfoDLL on Py2 and MediaInfoDLL3 on Py3, or you will get an ImportError. But it does not work, even though MediaInfo is installed.
  • ullix
    ullix over 6 years
    But using pymediainfo ("from pymediainfo import MediaInfo"), which I understand is a py wrapper to MediaInfo, does work. It shows that MediaInfo is accessible. However, for my needs of getting duration, width and height it does take >4 times longer than the ffprobe approach shown in the initial post. And still needs the external installation of MediaInfo