Which resize algorithm to choose for videos?


Solution 1

TL;DR

When sampling down: Use Lanczos or Spline filtering.

When sampling up: Use Bicubic or Lanczos filtering.

These are based on material I've read over the years, and from what I've seen used in the industry. The recommendations may vary depending on content type and application area.
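To give a feel for what Lanczos resampling actually computes, here's a minimal sketch of its kernel in Python. This is illustrative only: a real resampler evaluates this kernel at every source sample within the window and normalizes the weights, and `a = 3` is just the common "Lanczos3" choice.

```python
import math

def lanczos_kernel(x: float, a: int = 3) -> float:
    """Lanczos windowed-sinc kernel with window size `a` (a=3 is the
    common "Lanczos3" default). Returns the weight for a source sample
    at distance x from the point being interpolated."""
    if x == 0.0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    # sinc(x) windowed by sinc(x / a), written out explicitly
    return a * math.sin(px) * math.sin(px / a) / (px * px)

# The kernel is 1 at the center and (numerically) 0 at integer offsets:
print(lanczos_kernel(0.0))
print(round(lanczos_kernel(1.0), 6))
```

The negative lobes of this kernel are what give Lanczos its slight sharpening effect compared to bicubic or bilinear filtering.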

Why does it matter?

It could be argued that the choice of resizing filter doesn't matter that much when you downscale a video. It has a bigger impact on quality when upscaling, because you need to generate data where there was none in the first place.

These filters all have only a marginal impact on file size. You therefore shouldn't worry about huge differences there.

As always when encoding video, the result heavily depends on the source material. You can't always predict the outcome, so just see what works best for you.

Different algorithms

As an example, here's bicubic vs. bilinear interpolation:

     [Image: side-by-side comparison of bicubic vs. bilinear interpolation]

See how bicubic interpolation results in smoother edges? That's a very general statement, but you can find an overview of image scaling algorithms here.

  • Bilinear interpolation uses a 2x2 environment of a pixel and takes a distance-weighted average of these pixels to interpolate the new value. It's not the best algorithm, but rather fast.

  • Bicubic interpolation uses a 4x4 environment of a pixel, weighting the innermost pixels higher, and takes a weighted average to interpolate the new value. It's, as far as I can tell, the most popular.

  • Area averaging uses a mapping of source and destination pixels, averaging the source pixels weighted by the fraction of each that the destination pixel covers. According to this page, it should produce better results when downsampling.

  • Spline interpolation uses higher-order piecewise polynomials, and sinc interpolation uses the sinc function; both are more expensive to compute than bicubic interpolation. I don't think the overall increase in processing time is worth using them.

  • Lanczos resampling involves a sinc filter as well. It is more computationally expensive but usually described as very high quality and can be used for up- and downsampling.

  • hqx as well as 2xSaI filters are used for pixel-art scaling (e.g. game emulators). I don't think there's a good reason for using them in video.
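To make the bilinear case above concrete, here's a minimal grayscale sketch in Python. This is illustrative only: real implementations work per color channel and handle image borders more carefully than the simple clamping used here.

```python
def bilinear_sample(img, x: float, y: float) -> float:
    """Sample `img` (a list of rows of grayscale values) at fractional
    coordinates (x, y) using its 2x2 neighborhood: a weighted average
    where closer pixels contribute more."""
    x0, y0 = int(x), int(y)
    # Clamp the neighbors at the image border
    x1 = min(x0 + 1, len(img[0]) - 1)
    y1 = min(y0 + 1, len(img) - 1)
    fx, fy = x - x0, y - y0
    # Interpolate horizontally on both rows, then vertically
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

img = [[0, 100],
       [100, 200]]
print(bilinear_sample(img, 0.5, 0.5))  # 100.0, the average of all four
```

At the exact center of a 2x2 block, all four pixels get equal weight, which is where the "average of these pixels" intuition comes from.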

Jeff Atwood's comparison

It turns out Jeff Atwood did a comparison of image interpolation algorithms. His rule of thumb was to use bicubic interpolation for downsampling and bilinear interpolation when upsampling. That said, this is not what is typically recommended for video encoding – and some commenters have raised doubts about Atwood's expertise in the field.

However, he also mentioned that …

Reducing images is a completely safe and rational operation. You're simply reducing precision and resolution by discarding information. Make the image as small as you want, and you have complete fidelity -- within the bounds of the number of pixels you've allowed. You'll get good results no matter which algorithm you pick. (Well, unless you pick the naïve Pixel Resize or Nearest Neighbor algorithms.)

Other examples

Here are some more examples of image interpolation algorithms, including the ones I mentioned above.

I also found documents (scene rules) from the video encoding scene that explicitly ban bicubic filtering for downsampling. Instead, they endorse Lanczos, Spline, or "Blackman" resampling.

Solution 2

I found a good image that documents some of this.

[Image: comparison chart of resize algorithms]

Full size version here.

In general you want a mild sharpening effect when making a larger image into a smaller one, and a mild blurring effect when making a smaller image into a larger one. The MadVR filter set defaults to Lanczos for upscaling and bicubic for downscaling.

Solution 3

You are converting 3x3 original pixels to 2x2 target pixels.

If you want to keep sharp lines, choose Lanczos or another filter that uses more surrounding pixels, so that fine detail (like fur or reflections) isn't blurred.

Otherwise, area averaging or similar (also bilinear/trilinear) will suffice.
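For the simple case where the scaling factor is an integer (unlike the 3:2 ratio above, where destination pixels straddle source pixels), area averaging reduces to plain block averaging. A minimal Python sketch:

```python
def downscale_area_average(img, factor: int):
    """Downscale `img` (a list of rows of grayscale values) by an
    integer factor, averaging each factor x factor block of source
    pixels. This is the special case of area averaging where every
    destination pixel covers whole source pixels."""
    h, w = len(img), len(img[0])
    out = []
    for by in range(0, h, factor):
        row = []
        for bx in range(0, w, factor):
            block = [img[y][x]
                     for y in range(by, by + factor)
                     for x in range(bx, bx + factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

img = [[0, 0, 100, 100],
       [0, 0, 100, 100],
       [200, 200, 50, 50],
       [200, 200, 50, 50]]
print(downscale_area_average(img, 2))  # [[0.0, 100.0], [200.0, 50.0]]
```

For non-integer ratios like 3:2, a real implementation additionally weights the border source pixels by how much of each one the destination pixel covers.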



Author: Grumpy ol' Bear

Updated on September 18, 2022

Comments

  • Grumpy ol' Bear
    Grumpy ol' Bear almost 2 years

I'm using VirtualDub for encoding with those settings: [Image: VirtualDub resize algorithm options]

    However I record my stuff in 1920x1080 and resize it down to 1280x720. Now the question: which algorithm should I choose when making a balanced quality vs. file-size decision?

I always went with Lanczos because that's what was pre-configured, but those descriptions don't really help me answer my question.

  • Psycogeek
    Psycogeek over 12 years
    I used to always use "precise bicubic A=100". On a re-install of the updated program, Lanczos was the default, and many people liked it, so I left it that way for a long time. Eventually I got around to watching the later Lanczos-encoded material and thought it wasn't as good, so with the next set of encodes I switched back to bicubic. I was also crunching the compression; I think Lanczos might have seemed better if I hadn't been trying to reduce the total data size so much.
  • Alex
    Alex almost 9 years
    FWIW I wouldn't consider Jeff Atwood an expert in image processing, and in that article he doesn't examine anything other than bilinear, nearest neighbour or (one particular variant of) bicubic. Most people would agree his recommendation to use bilinear when enlarging is a bad one.
  • slhck
    slhck almost 9 years
    @thomasrutter Thanks. I agree with you—back when I wrote this, I probably didn't know as much about image processing as I do now. I guess I'll remove the reference to that article and find some other source.
  • Admin
    Admin about 2 years
    Hey, it's been 3 years since your last edit, do you know if these scene rules have changed somehow? I use FFmpeg to downscale videos and I think I do not have any access to Blackman algorithm. I do have access to LanczosX (where X is a number set by scale/zscale filter's parameter), Spline16 and Spline32 (by using zscale filter) and some unknown Spline (by using scale filter). Do you know what's the order of preference here? Which algorithm should I use as FFmpeg's best one to downscale videos? Is there any difference in algorithm to use when downscaling "real" and animated videos?
  • Admin
    Admin about 2 years
    @Lex Scene rules seem to recommend spline 64 and 32. My recommendations are based on what folks in the video encoding and testing domain are using — and here, Lanczos is preferred. Most people aren't pixel peeping though, and there does not seem to be a strong interest in animated videos … Perhaps you could do a quick visual test and compare with svt.github.io/vivict?
  • Admin
    Admin about 2 years
    Thank you so much for the answer! I thought these rules would describe the best options, since they scale a lot and should have plenty of experience. Anyway, it seems that Spline16 is completely out of the options, but is it possible to get Spline64 in FFmpeg? Do you know what SplineX is used in the scale filter? It is named just spline, without any number. I think both scale and zscale use Lanczos3 by default; how does it compare to any other LanczosX, like Lanczos4? What LanczosX do you recommend? PS. I really like that site you linked; is there any portable version of it?
  • Admin
    Admin about 2 years
    @Lex There's github.com/svt/vivict (Node.js-based) and github.com/svt/vivictpp (C++-based) which both do the same thing. As for recommendations on filters, this is where my experience is limited (Bicubic/bilinear/default Lanczos has been fine for my work). I'd probably just run my own tests and look at the output. As for the ffmpeg implementation, I would have to check the source code. Same for Lanczos. Perhaps you could ask on the ffmpeg-devel mailing list or on video.stackexchange.com for details?
  • Admin
    Admin about 2 years
    Thank you for these links. Vivict++ mentions that "building on Windows or macOS should be possible" - do you know any place where I can find any already compiled ("ready to use") version for Windows? I will try to follow your advice and ask my question about spline on video.stackexchange.com, thanks!
  • Admin
    Admin about 2 years
    @Lex There is no prebuilt version, as the developers are using Linux. I found a similar tool here that seems to have Windows binaries though: github.com/pixop/video-compare
  • Admin
    Admin about 2 years
    That looks like something I was looking for, thank you. :D I have one more question: let's say I have two very similar videos, and I know that one of them has some bad artifacts for a few seconds, but I do not know the exact time of this fault. Do you know any fast way to compare such videos to find out which one is fine and which one is faulty? Is there any app that could compare two videos with a set difference threshold and detect at which point in the videos the difference is "too big"?
  • Admin
    Admin about 2 years
    @Lex Yes, for instance github.com/slhck/ffmpeg-quality-metrics (my own tool) can be used to print per-frame quality scores. You would use the known good video as a reference and the one with the artifacts as the distorted one. If the PSNR or SSIM metric suddenly drops below a certain threshold, that'd be where the artifacts appear. There are GUI variants of such tools. For Windows, github.com/fifonik/FFMetrics seems to be useful. If you have any further questions, feel free to send an email.
  • Admin
    Admin about 2 years
    Thank you for your comments - they were really helpful. :)