2D coordinate normalization

algorithm language-agnostic math coordinate-transformation

14,653

Solution 1

I am posting my response as an answer because I do not have enough points to make a comment.

My interpretation of the question: How do we normalize the coordinates of a set of points in 2 dimensional space?

A normalization operation involves a "shift and scale" operation. In case of 1 dimensional space this is fairly easy and intuitive (as pointed out by @Mizipzor).

normalizedX=(originalX-minX)/(maxX-minX)

In this case we are first shifing the value by a distance of minX and then scaling it by the range which is given by (maxX-minX). The shift operation ensures that the minimum moves to 0 and the scale operation squashes the distribution such that the distribution has an upper limit of 1

In case of 2d , simply dividing by the largest dimension is not enought. Why?

Consider the simplified case with just 2 points as shown below. The maximum value of any dimension is the Y value of point B and this 10000.

Coordinates of normalized A=>5000/10000,8000/10000 ,i.e 0.5,0.8
Coordinates of normalized A=>7000/10000,10000/10000 ,i.e 0.7,1.0

The X and Y values are all with 0 and 1. However, the distribution of the normalized values is far from uniform. The minimum value is just 0.5. Ideally this should be closer to 0.

Preferred approach for normalizing 2d coordinates

To get a more even distribution we should do a "shift" operation around the minimum of all X values and minimum of all Y values. This could be done around the mean of X and mean of Y as well. Considering the above example,

the minimum of all X is 5000
the minimum of all Y is 8000

Step 1 - Shift operation

A=>(5000-5000,8000-8000), i.e (0,0)
B=>(7000-5000,10000-8000), i.e. (2000,2000)

Step 2 - Scale operation

To scale down the values we need some maximum. We could use the diagonal AB whose length is 2000

A=>(0/2000,0/2000),     i.e. (0,0)
B=>(2000/2000,2000/2000)i.e. (1,1)

What happens when there are more than 2 points? The approach remains similar. We find the coordinates of the smallest bounding box which fits all the points.

We find the minimum value of X (MinX) and minimum value of Y (MinY) from all the points and do a shift operation. This changes the origin to the lower left corner of the bounding box.
We find the maximum value of X (MaxX) and maximum value of Y (MaxY) from all the points.
We calculate the length of the diagonal connecting (MinX,MinY) and (MaxX,MaxY) and use this value to do a scale operation.

length of diagonal=sqrt((maxX-minX)*(maxX-minX) + (maxY-minY)*(maxY-minY))   

normalized X = (originalX - minX)/(length of diagonal) 
normalized Y = (originalY - minY)/(length of diagonal)

How does this logic change if we have more than 2 dimensions?

The concept remains the same. - We find the minimum value in each of the dimensions (X,Y,Z) - We find the maximum value in each of the dimensions (X,Y,Z) - Compute the length of the diagonal as a scaling factor - Use the minimum values to shift the origin.

length of diagonal=sqrt((maxX-minX)*(maxX-minX)+(maxY-minY)*(maxY-minY)+(maxZ-minZ)*(maxZ-minZ))

normalized X = (originalX - minX)/(length of diagonal) 
normalized Y = (originalY - minY)/(length of diagonal)
normalized Z = (originalZ - minZ)/(length of diagonal)

Solution 2

It seems you want each vector (1D, 2D or ND) to have length <= 1.
If that's the only requirement, you can just divide each vector by the length of the longest one.

double max = maximum (|vector| for each vector in 'data');
foreach (Vector v : data) {
    li.add(v / max);
}

That will make the longest vector in result list to have length 1.

But this won't be equivalent of your current code for 1-dimensional case, as you can't find minimum or maximum in a set of points on the plane. Thus, no delta.

Solution 3

Simple idea: Find out which dimension is bigger and normalize in this dimension. The second dimension can be computed by using the ratio. This way the ratio is kept and your values are between 0 and 1.

14,653

Author by

Mizipzor

A crazy coder in a passionate hunt for greater wisdom. I take great interest in anything involving math and algorithms. Especially path finding, artificial life, cellular automata and emergent behavior.

Updated on June 04, 2022

Comments

Mizipzor almost 2 years
I need to implement a function which normalizes coordinates. I define normalize as (please suggest a better term if Im wrong):

Mapping entries of a data set from their natural range to values between 0 and 1.

Now this was easy in one dimension:
```
    static List<float> Normalize(float[] nums)
    {
        float max = Max(nums);
        float min = Min(nums);
        float delta = max - min;

        List<float> li = new List<float>();
        foreach (float i in nums)
        {
            li.Add((i - min) / delta);
        }
        return li;
    }
```
I need a 2D version as well and that one has to keep the aspect ratio intact. But Im having some troubles figuring out the math.

Although the code posted is in C# the answers need not to be.

Thanks in advance. :)
Mizipzor over 13 years

You mean I should calculate delta for both dimensions and use the higher number during the normalization of both?
kasten over 13 years

Yep that does the same and is even simpler :).
Mizipzor over 13 years

Hm. Maybe I made a bad call wanting the lowest value at 0 and the highest at 1. I assume this is usually the prefered way of normalizing coordinates?
Nikita Rybak over 13 years

@mizipzor You can choose a vector to be zero and subtract it from every vector in input. Say, you have (0, 1), (1, 0) and (2, 2) and get (0, 0), (1, -1), (2, 1) by moving first vector to zero.
Nikita Rybak over 13 years

@mizipzor But my point is: there're so many ways of 'moving' set of vectors into a circle of radius one, that "the best method" simply doesn't exist.
Nikita Rybak over 13 years

@mizipzor I think I've misread your comment. Yes, you can't compare multidimensional vectors to scalar numbers (like zero).
ysap over 13 years

@Nikita - I think the OP has to clarify if he wants the longest vector to be normalized to 1 (thus all vectors are bounded in a unit circle), or does he want the longest vector component to be bound to 1 (thus all vectors are bounded in the unit square). Additionally, there is a difference if the normalized coordinates span from -1 to 1 or from 0 to 1.