Calculate area of polygon given (x,y) coordinates

94,987

Solution 1

Implementation of Shoelace formula could be done in Numpy. Assuming these vertices:

import numpy as np
x = np.arange(0,1,0.001)
y = np.sqrt(1-x**2)

We can redefine the function in numpy to find the area:

def PolyArea(x,y):
    return 0.5*np.abs(np.dot(x,np.roll(y,1))-np.dot(y,np.roll(x,1)))

And getting results:

print PolyArea(x,y)
# 0.26353377782163534

Avoiding for loop makes this function ~50X faster than PolygonArea:

%timeit PolyArea(x,y)
# 10000 loops, best of 3: 42 µs per loop
%timeit PolygonArea(zip(x,y))
# 100 loops, best of 3: 2.09 ms per loop.

Timing is done in Jupyter notebook.

Solution 2

The most optimized solution that covers all possible cases, would be to use a geometry package, like shapely, scikit-geometry or pygeos. All of them use C++ geometry packages under the hood. The first one is easy to install via pip:

pip install shapely

and simple to use:

from shapely.geometry import Polygon
pgon = Polygon(zip(x, y)) # Assuming the OP's x,y coordinates

print(pgon.area)

To build it from scratch or understand how the underlying algorithm works, check the shoelace formula:

# e.g. corners = [(2.0, 1.0), (4.0, 5.0), (7.0, 8.0)]
def Area(corners):
    n = len(corners) # of corners
    area = 0.0
    for i in range(n):
        j = (i + 1) % n
        area += corners[i][0] * corners[j][1]
        area -= corners[j][0] * corners[i][1]
    area = abs(area) / 2.0
    return area

Since this works for simple polygons:

  • If you have a polygon with holes : Calculate the area of the outer ring and subtrack the areas of the inner rings

  • If you have self-intersecting rings : You have to decompose them into simple sectors

Solution 3

By analysis of Mahdi's answer, I concluded that the majority of time was spent doing np.roll(). By removing the need of the roll, and still using numpy, I got the execution time down to 4-5µs per loop compared to Mahdi's 41µs (for comparison Mahdi's function took an average of 37µs on my machine).

def polygon_area(x,y):
    correction = x[-1] * y[0] - y[-1]* x[0]
    main_area = np.dot(x[:-1], y[1:]) - np.dot(y[:-1], x[1:])
    return 0.5*np.abs(main_area + correction)

By calculating the correctional term, and then slicing the arrays, there is no need to roll or create a new array.

Benchmarks:

10000 iterations
PolyArea(x,y): 37.075µs per loop
polygon_area(x,y): 4.665µs per loop

Timing was done using the time module and time.clock()

Solution 4

maxb's answer gives good performance but can easily lead to loss of precision when coordinate values or the number of points are large. This can be mitigated with a simple coordinate shift:

def polygon_area(x,y):
    # coordinate shift
    x_ = x - x.mean()
    y_ = y - y.mean()
    # everything else is the same as maxb's code
    correction = x_[-1] * y_[0] - y_[-1]* x_[0]
    main_area = np.dot(x_[:-1], y_[1:]) - np.dot(y_[:-1], x_[1:])
    return 0.5*np.abs(main_area + correction)

For example, a common geographic reference system is UTM, which might have (x,y) coordinates of (488685.984, 7133035.984). The product of those two values is 3485814708748.448. You can see that this single product is already at the edge of precision (it has the same number of decimal places as the inputs). Adding just a few of these products, let alone thousands, will result in loss of precision.

A simple way to mitigate this is to shift the polygon from large positive coordinates to something closer to (0,0), for example by subtracting the centroid as in the code above. This helps in two ways:

  1. It eliminates a factor of x.mean() * y.mean() from each product
  2. It produces a mix of positive and negative values within each dot product, which will largely cancel.

The coordinate shift does not alter the total area, it just makes the calculation more numerically stable.

Solution 5

It's faster to use shapely.geometry.Polygon rather than to calculate yourself.

from shapely.geometry import Polygon
import numpy as np
def PolyArea(x,y):
    return 0.5*np.abs(np.dot(x,np.roll(y,1))-np.dot(y,np.roll(x,1)))
coords = np.random.rand(6, 2)
x, y = coords[:, 0], coords[:, 1]

With those codes, and do %timeit

%timeit PolyArea(x,y)
46.4 µs ± 2.24 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit Polygon(coords).area
20.2 µs ± 414 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Share:
94,987
pbreach
Author by

pbreach

Updated on July 05, 2022

Comments

  • pbreach
    pbreach almost 2 years

    I have a set of points and would like to know if there is a function (for the sake of convenience and probably speed) that can calculate the area enclosed by a set of points.

    for example:

    x = np.arange(0,1,0.001)
    y = np.sqrt(1-x**2)
    
    points = zip(x,y)
    

    given points the area should be approximately equal to (pi-2)/4. Maybe there is something from scipy, matplotlib, numpy, shapely, etc. to do this? I won't be encountering any negative values for either the x or y coordinates... and they will be polygons without any defined function.

    EDIT:

    points will most likely not be in any specified order (clockwise or counterclockwise) and may be quite complex as they are a set of utm coordinates from a shapefile under a set of boundaries

  • pbreach
    pbreach almost 10 years
    Mine might be very complex polygons. The points are utm coordinates selected from a shapefile under a set of boundaries
  • Mark Dickinson
    Mark Dickinson almost 10 years
    @user2593236: So long as your polygon boundary doesn't cross itself (which is what "simple" means in this context), you should be fine.
  • Nikos Athanasiou
    Nikos Athanasiou almost 10 years
    @user2593236 Simple means concave or convex without holes or self intersections.
  • pbreach
    pbreach almost 10 years
    Oh okay I understand now. Good explanation.
  • user989762
    user989762 almost 9 years
    Great solution. I'm not sure why, but the "top" answer by @Nikos Athanasiou does not work when some of the coordinates are negative. Also another solution listed here had that issue. Your solution is the only one that worked. Just check with xxx = np.array([[-100,0],[100,0],[100,150],[-100,150],[-100,0]])
  • Mahdi
    Mahdi almost 9 years
    @user989762: But I am getting the same answer using both methods!
  • pbreach
    pbreach over 8 years
    Interesting. Seems like this would be fast and easy to jit compile with numba. Do you have a reference for this?
  • DrSocket
    DrSocket over 8 years
    # Given the number of sides, n, and the length of each side, s, the polygon's area is # 1/4 n s2 / tan( pi/n) Interactive Python (Rice University, Coursera) again here: Area of a Polygon (academia.edu/5179705/Exercise_1_How_to_design_programs) I did the function from that...
  • pbreach
    pbreach over 8 years
    This is for a regular polygon which is a special but very limited case of this problem. All sides must be the same length (which would need to be calculated as well). If you explained what n and s are then maybe it would be more apparent...
  • diegopso
    diegopso about 7 years
    I tried with very simple coordinates [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)] and it gave 0.0 area. Are there any limitations that you know? Also tried to shift it out of the origin, getting the same result.
  • emredog
    emredog almost 6 years
    rookie mistake: not providing the points in an ordered (clockwise/counter clockwise) manner would yield faulty results.
  • Eskapp
    Eskapp over 5 years
    I get a difference between this approach and the one of Mahdi when I define x and y such as x_{n+1} = x_1 and x_0 = x_n, as well as y_{n+1} = y_1 and y_0 = y_n as required to apply the shoelace formula (see en.wikipedia.org/wiki/Shoelace_formula#Definition) The difference is slight because the points are the vertices are so close to each other but exists and may be magnified when working with polygons with longer sides.
  • maxb
    maxb over 5 years
    Of course there are floating point errors, as with any implementation. Could you provide a full example of the difference? If you need more precision, you could use arbitrary precision arithmetics.
  • Eskapp
    Eskapp over 5 years
    My bad, I was confused about the correction term and thought that some difference I could observe could come from there while tracking a bug in my code. Seems to work perfectly after many more tests comparing different implementations for computing the area of polygons. Your solution has the speed advantage as well as being easy to read!
  • maxb
    maxb over 5 years
    @Eskapp glad to hear everything is working correctly!
  • Pithikos
    Pithikos about 5 years
    @diegopso seems that it works only if the points are in a series of drawing. So it will work for [(0, 0), (0, 1), (1, 1), (1, 0)]
  • Edip Ahmet
    Edip Ahmet almost 5 years
    It gave wrong result when I tried more complex area with 15 vertices.
  • Admin
    Admin almost 5 years
    can you please provide the coordinates?
  • Edip Ahmet
    Edip Ahmet almost 5 years
    Sorry it is my fault. I tested your code a few times and compared the results with CAD software, I tested coords=[(1141.784,893.124), (1521.933,893.124), (1521.933,999.127), (1989.809,999.127), (1989.809,622.633), (2125.054,622.633), (2125.054,326.556), (1372.067,326.556), (1372.067,-60.903), (1872.84,-60.903), (1872.84,52.41), (2015.396,52.41), (2015.396,-455.673), (1090.611,-455.673), (1086.955,436.214), (1141.784,893.124)] Yesterday I got wrong result maybe I missed something, today it works great like PolygonArea function.
  • Edip Ahmet
    Edip Ahmet almost 5 years
    I think I comment it by mistake, maybe I tried another function here yesterday.
  • Admin
    Admin almost 5 years
    Glad I could help
  • pstatix
    pstatix over 4 years
    Can you explain how the dot product is used instead of cross products?
  • pstatix
    pstatix over 4 years
    Can you explain how you used the dot product instead of the cross product as the shoelace forumla states?
  • Mahdi
    Mahdi over 4 years
    @pstatix: Indeed the shoelace formula can be written in terms of the exterior product but you can expand the product, and you'll see there are two types of terms: positive terms and negative terms. If you separate them into two terms, you'd see they are the product of x and y then you can write those x's and y's as two vectors with a dot product between them. Look at the proof for a triangle section here: en.wikipedia.org/wiki/Shoelace_formula
  • maxb
    maxb over 4 years
    @pstatix if you look at the Wikipedia article for the Shoelace formula, it can be visualized as a shifted dot product. I didn't come up with the formula myself, but I did realize that the pattern of calculation used directly matched using the dot product (or rather two dot products), with one vector in each product shifted around. For more info I'd just read the article, the only thing I did for this answer was to improve the performance of the algorithm.
  • Jiadong
    Jiadong over 3 years
    numpy is quite standard, but shapely is abit faster
  • HumbleBee
    HumbleBee over 3 years
    The only solution that offered the correct result! Kudos! See my answer for an slightly modified version that takes a list of tuples.
  • Eugene W.
    Eugene W. over 3 years
    You did not mention here that it works for integer arrays
  • crazjo
    crazjo almost 3 years
    This actually seems the fastest, at least for the simple polygons I tested