Image downscaling algorithm

c++ c algorithm image image-recognition

26,398

Solution 1

The general approach is to filter the input to generate a smaller size, and threshold to convert to monochrome. The easiest filter to implement is a simple average, and it often produces OK results. The Sinc filter is theoretically the best but it's impractical to implement and has ringing artifacts which are often undesirable. Many other filters are available, such as Lanczos or Tent (which is the generalized form of Bilinear).

Here's a version of an average filter combined with thresholding. Assuming picture4 is the input with pixel values of 0 or 1, and the output is picture3 in the same format. I also assumed that x is the least significant dimension which is opposite to the usual mathematical notation, and opposite to the coordinates in your question.

int thumbwidth = 15;
int thumbheight = 15;
double xscale = (thumbwidth+0.0) / width;
double yscale = (thumbheight+0.0) / height;
double threshold = 0.5 / (xscale * yscale);
double yend = 0.0;
for (int f = 0; f < thumbheight; f++) // y on output
{
    double ystart = yend;
    yend = (f + 1) / yscale;
    if (yend >= height) yend = height - 0.000001;
    double xend = 0.0;
    for (int g = 0; g < thumbwidth; g++) // x on output
    {
        double xstart = xend;
        xend = (g + 1) / xscale;
        if (xend >= width) xend = width - 0.000001;
        double sum = 0.0;
        for (int y = (int)ystart; y <= (int)yend; ++y)
        {
            double yportion = 1.0;
            if (y == (int)ystart) yportion -= ystart - y;
            if (y == (int)yend) yportion -= y+1 - yend;
            for (int x = (int)xstart; x <= (int)xend; ++x)
            {
                double xportion = 1.0;
                if (x == (int)xstart) xportion -= xstart - x;
                if (x == (int)xend) xportion -= x+1 - xend;
                sum += picture4[y][x] * yportion * xportion;
            }
        }
        picture3[f][g] = (sum > threshold) ? 1 : 0;
    }
}

I've now tested this code. Here's the input 200x200 image, followed by a nearest-neighbor reduction to 15x15 (done in Paint Shop Pro), followed by the results of this code. I'll leave you to decide which is more faithful to the original; the difference would be much more obvious if the original had some fine detail.

original nearest neighbor average+threshold

Solution 2

Since you're fine with using a library, you could look into the imagemagick C++ bindings.

You could also output the image in a simple format like a pbm, and then call the imagemagick command to resize it:

system("convert input.pbm -resize 10x10 -compress none output.pbm");

Sample output file (note: you don't need to use a new line for each row):

P1
20 20
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 0 0 0 0 0
0 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

The output file:

P1
10 10
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 1 1 0 1 1 0 
0 0 0 0 1 0 0 1 1 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 1 1 0 1 1 0 0 0 0 0 0 1 1 1 1 
1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0

Solution 3

I've found an implementation of a bilinear interpolaton. C code.

Assuming that:

a - a primary array (which we need to stretch/compress) pointer.

oldw - primary width

oldh - primary height

b - a secondary array (which we get after compressing/stretching) pointer

neww - secondary width

newh - seconday height

#include <stdio.h>
#include <math.h>
#include <sys/types.h>

void resample(void *a, void *b, int oldw, int oldh, int neww,  int newh)
{
int i;
int j;
int l;
int c;
float t;
float u;
float tmp;
float d1, d2, d3, d4;
u_int p1, p2, p3, p4; /* nearby pixels */
u_char red, green, blue;

for (i = 0; i < newh; i++) {
    for (j = 0; j < neww; j++) {

        tmp = (float) (i) / (float) (newh - 1) * (oldh - 1);
        l = (int) floor(tmp);
        if (l < 0) {
            l = 0;
        } else {
            if (l >= oldh - 1) {
                l = oldh - 2;
            }
        }

        u = tmp - l;
        tmp = (float) (j) / (float) (neww - 1) * (oldw - 1);
        c = (int) floor(tmp);
        if (c < 0) {
            c = 0;
        } else {
            if (c >= oldw - 1) {
                c = oldw - 2;
            }
        }
        t = tmp - c;

        /* coefficients */
        d1 = (1 - t) * (1 - u);
        d2 = t * (1 - u);
        d3 = t * u;
        d4 = (1 - t) * u;

        /* nearby pixels: a[i][j] */
        p1 = *((u_int*)a + (l * oldw) + c);
        p2 = *((u_int*)a + (l * oldw) + c + 1);
        p3 = *((u_int*)a + ((l + 1)* oldw) + c + 1);
        p4 = *((u_int*)a + ((l + 1)* oldw) + c);

        /* color components */
        blue = (u_char)p1 * d1 + (u_char)p2 * d2 + (u_char)p3 * d3 + (u_char)p4 * d4;
        green = (u_char)(p1 >> 8) * d1 + (u_char)(p2 >> 8) * d2 + (u_char)(p3 >> 8) * d3 + (u_char)(p4 >> 8) * d4;
        red = (u_char)(p1 >> 16) * d1 + (u_char)(p2 >> 16) * d2 + (u_char)(p3 >> 16) * d3 + (u_char)(p4 >> 16) * d4;

        /* new pixel R G B  */
        *((u_int*)b + (i * neww) + j) = (red << 16) | (green << 8) | (blue);       
    }
}
}

Hope it will be useful for other users. But nevertheless I still doubth whether it will work in my situation (when not stratching, but compressing an array). Any ideas?

Solution 4

To properly downscale an image, you should divide your image up into square blocks of pixels and then use something like Bilinear Interpolation in order to find the right color of the pixel that should replace the NxN block of pixels you're doing the interpolation on.

Since I'm not so good at the math involved, I'm not going to try give you an example of how the code would like. Sorry :(

Solution 5

I think, you need Interpolation. There are a lot of algorithms, for example you can use Bilinear interpolation

View more solutions

26,398

Author by

user1131662

Updated on March 27, 2020

Comments

user1131662 about 4 years
Could you help me find the right algorithm for image resizing? I have an image of a number. The maximum size is 200x200, I need to get an image with size 15x15 or even less. The image is monochrome (black and white) and the result should be the same. That's the info about my task.

I've already tried one algorithm, here it is
```
// xscale, yscale - decrease/increase rate
for (int f = 0; f<=49; f++)
            {
                    for (int g = 0; g<=49; g++)//49+1 - final size
                    {
                            xpos = (int)f * xscale;
                            ypos = (int)g * yscale;
                            picture3[f][g]=picture4[xpos][ypos];
                    }
            }
```
But it won't work with the decrease of an image, which is my prior target. Could you help me find an algorithm, which could solve that problem (quality mustn't be perfect, the speed doesn't even matter). Some information about it would be perfect too considering the fact I'm a newbie. Of course, a short piece of c/c++ code (or a library) will be perfect too.

Edit: I've found an algorithm. Will it be suitable for compressing from 200 to 20?
Mark Ransom about 12 years

This approach will lead to significant aliasing distortion. It's also the equivalent to the code in the question.
bames53 about 12 years

Interpolation would be used when increasing the size of the image, because you need to create extra pixels between pixels of known colors. To get a smaller image the process is called 'filtering'. One way to do it is compute which pixels in the original image are covered by a pixel in the new image, and set the new pixel's value to be the average of the covered pixels.
user1131662 about 12 years

Oh, again this Wiki link. Unfortunately my math is not great too. So I'd be very grateful for "non-mathematical" explanation.
anatolyg about 12 years

@user1131662 Ignore the interpolation; just do the first part with the square blocks of pixels.
Mark Ransom about 12 years

You can delete it if you want to.
user1131662 about 12 years

Could you give me some example of the command-line utility inside the source code please?
user1131662 about 12 years

I do the treshold before interpolation. So it's not right? Maybe I should describe the input picture better for you to decide which filter will be better? I'll look up some info on these filters, but hope you'll help with an implementation of an Average filter.
user1131662 about 12 years

So is it suitable for compressing, not stretching?
Mark Ransom about 12 years

@user1131662, your question described the input image as monochrome which led me to believe that thresholding before resizing wasn't possible.
user1131662 about 12 years

Sorry seems I need to edit my question. The whole story is that I grab a picture. Then I treshold it using a medium treshold algorithm. Now I have a two-dimensional array with numbers 0(for white background dots) and 1(for black ones - the number itself). Then I would like to resize the picture and then - to recognize. That's all.
user1131662 about 12 years

Nice, the result you've shown is okay for me. Hope the decrease from 200 won't be a huge difference. So I need to download this library. Then I include it in my project. Then I do the things you've shown, right? And could you tell me more about this strange file format?
strcat about 12 years

It's the netpbm format. It's just very simple and easy to deal with manually. If you use the imagemagick API, then you can just do it all in C++ without an intermediate image format (I haven't used it before though).
user1131662 about 12 years

Okay, thatks foe explanation. Will try it out later.
Mark Ransom about 12 years

Interpolation (bilinear or any other) is the wrong answer. It might work for minor size adjustments, but at the scale you're talking about 200:15 it won't be any better than just tossing out the extra pixels.
user1131662 about 12 years

Thanks, I'll test this code as soon as possible. On the first glimpse I see some questions which are necessary to understand. Nevertheless thank you for this one very much.
mousomer over 6 years

I second that. I'm working with images all the time, and I can tell you - there's nothing as bad as linear interpolation on images. Don't.
mousomer over 6 years

It's actually better that linear interpolation. It's the fastest route and isn't as bad as most people think. Yes, you're better off with fancy algorithms like Lanczos. But simple subsampling is OK.