Thread count option in FFmpeg for FASTEST conversion to h264?

68,205

Solution 1

I have found that ffmpeg's threads do not do a good job of utilizing all the cores, and the hyper-threads do not get used at all. One solution I could come up with is to run 3 to 4 ffmpeg processes in parallel; see https://superuser.com/questions/538164/how-many-instances-of-ffmpeg-commands-can-i-run-in-parallel/547340#547340. This approach ends up using all the cores fully and is faster than the single-input, multiple-outputs approach in a single command.
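A minimal sketch of the parallel-process idea, assuming Python is available to drive ffmpeg. The helper names (`build_command`, `convert_all`), the `.mp4` output naming, and the default of 4 processes are my own illustrative choices, not something from the linked answer; only the ffmpeg flags `-y`, `-i`, `-c:v libx264`, and `-threads` are real options.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_command(src, dst, threads=0):
    """Return an ffmpeg argument list that encodes src to H.264 (libx264)."""
    return [
        "ffmpeg", "-y",            # overwrite the output without prompting
        "-i", str(src),            # input file
        "-c:v", "libx264",         # H.264 video encoder
        "-threads", str(threads),  # 0 lets ffmpeg pick the thread count
        str(dst),
    ]


def convert_all(sources, out_dir, max_procs=4, run=subprocess.run):
    """Encode each source in its own ffmpeg process, max_procs at a time.

    Returns the process return codes in the same order as sources.
    The run parameter is a hypothetical hook so the runner can be swapped out.
    """
    out_dir = Path(out_dir)
    commands = [
        build_command(src, out_dir / (Path(src).stem + ".mp4"))
        for src in sources
    ]
    # Python threads are enough here: each worker only waits on an
    # external ffmpeg process, which does the actual CPU work.
    with ThreadPoolExecutor(max_workers=max_procs) as pool:
        return [result.returncode for result in pool.map(run, commands)]
```

Calling `convert_all(["a.mov", "b.avi", "c.mkv"], "out", max_procs=3)` would start up to three encodes at once. Whether 3 or 4 processes is the right number depends on the machine, so it is worth timing both.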

Solution 2

I have experimented thoroughly with -threads 0, 6, 12, and 24, and it makes no difference in frame rate, overall processing time, or CPU utilization. Note that my system has 12 physical cores. Generally ffmpeg seems to do a good job of using the available processing power without -threads being specified: my 12 cores stay at about 98-99% utilization for the duration, as seen in top/system monitor.

I wish there were a magic bullet, but for now there is no other way to speed things up, as ffmpeg is currently optimized very well in my opinion. The only alternative is simply to get more computing power or to do distributed processing.

*Note: all my tests used ffmpeg version 3.3.1.
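If you want to repeat this kind of comparison on your own machine, something like the following sketch could do it. It is not the script I used; the function names and the `run` hook are hypothetical, and it measures only wall-clock time for one encode per thread count, so repeat runs are advisable before drawing conclusions.

```python
import subprocess
import time


def time_encode(src, dst, threads, run=subprocess.run):
    """Run one libx264 encode and return its wall-clock duration in seconds."""
    cmd = ["ffmpeg", "-y", "-i", src, "-c:v", "libx264",
           "-threads", str(threads), dst]
    start = time.perf_counter()
    run(cmd, check=True)  # check=True raises if ffmpeg exits with an error
    return time.perf_counter() - start


def benchmark(src, dst, thread_counts=(0, 6, 12, 24), run=subprocess.run):
    """Map each thread count to the elapsed time of one encode of src."""
    return {n: time_encode(src, dst, n, run=run) for n in thread_counts}


def fastest(timings):
    """Return the thread count with the smallest elapsed time."""
    return min(timings, key=timings.get)
```

For example, `fastest(benchmark("input.mov", "output.mp4"))` would return whichever of the four settings finished first. Use a source clip long enough that startup time does not dominate.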

Solution 3

If your 'dual-core' has hyperthreading, then 2x the number of cores would probably be correct. There's unlikely to be any gain from going beyond the number of virtual cores (including hyperthreading), though internal issues in FFmpeg might occasionally make a higher count faster.
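As a rough sketch of that rule of thumb, the values worth trying can be derived from the logical CPU count. `candidate_thread_counts` is a made-up helper name, and treating half the logical count as the physical core count is an assumption that only holds with 2-way hyperthreading enabled.

```python
import os


def candidate_thread_counts(logical_cpus=None):
    """Return thread counts worth timing: auto (0), half, and all logical CPUs."""
    if logical_cpus is None:
        # os.cpu_count() reports logical CPUs and may return None.
        logical_cpus = os.cpu_count() or 1
    # Assumption: with 2-way hyperthreading, half the logical CPUs
    # roughly equals the number of physical cores.
    half = max(1, logical_cpus // 2)
    return sorted({0, half, logical_cpus})
```

On a hyperthreaded dual-core (4 logical CPUs) this gives 0, 2, and 4, which covers the default, the physical cores, and the virtual cores.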

Author by

S B

Updated on July 26, 2020

Comments

  • S B
    S B almost 4 years

    I need to maximize speed while converting videos to h264 using FFmpeg:

    • Any input format of source videos
    • User's machine can have any number of cores
    • Power and memory consumption are non-issues

    Of course, there are a whole bunch of options that can be tweaked, but this question is specifically about choosing the best -threads <count> option. I am trying to find an ideal thread count as a function of:

    • no. of cores
    • input video format
    • h264-friendly values maybe?
    • anything else missed above?

    I am aware the default -threads 0 follows a one-thread-per-core approach, which is supposed to be optimal. But I am not sure whether this is time- or space-optimized. Also, in certain test cases, I've seen more threads (say 4 threads on my dual-core test machine) finish quicker than the default.

    Any other direction, say configure options w.r.t. threads, worth pursuing?