sp::over() for point in polygon analysis
Solution 1
You should not supply a function. You are aggregating the attribute values of your points over the geometry of the polygon, (i.e. the number returned is the mean
of the attribute of the points that fall within the polygon). In addition you have your x
and y
the wrong way round for what you want to do. Should be...
over( pnts , ind_adm , fn = NULL)
Solution 2
Found this concise and intuitive syntax for over:
pnts[ind_adm,]
from this Intro document
Solution 3
You can use point.in.poly
fom spatialEco
package. It "intersects point and polygon feature classes and adds polygon attributes to points".
library(spatialEco)
new_shape <- point.in.poly(pnts, ind_adm)
Solution 4
You could also use the st_intersection
function from the sf
package:
Load the library
library(sf)
Create a simple feature geometry (polygon) from your polygon
ind_adm <- st_as_sf(ind_adm)
Create a simple feature geometry (point) from your points of interest
(24047 is the EPSG code for India)
pnts <- st_as_sf(pnts) %>% st_set_crs(., 24047)
Keep only the points inside the polygon
kept_points <- st_intersection(ind_adm, pnts)
DotPi
Updated on December 31, 2020Comments
-
DotPi over 3 years
I have a shapefile named "ind_adm" and a SpatialPointsDataFrame called "pnts". The "pnts" contains points generated at random, and some of the points overlap with the polygon. See picture below.
Now, I want do do a point in polygon analysis, i.e. I want to find out which points lie inside the gray polygon representing the boundary of India. For this I am using the over() function in the sp library.
pt.in.poly <- sp::over(ind_adm, pnts, fn = mean) #do the join
However, the output I am getting is
>pt.in.poly values 0 6.019467
I should actually get the index of the points that are "in" the polygon.
Where am I going wrong?
-
DotPi over 10 yearsThanks Simon. The problem was with the order of x and y. The function specifies how the objects are aggregated, hence "mean" works fine. I guess a good way to remember the order of x and y in the future is by remembering that the analysis is called "Point in Polygon", hence the point goes first, followed by the polygon.
-
asado23 about 8 yearsYou have to make sure that your spatialpolygon (in this case ind_adm) does not have missing values in some of the columns, otherwise it will show an error when you use point.in.poly.
-
jjunju almost 7 yearsThis answer worked best for me. In my case pnts was a SpatialPointsDataFrame and ind_adm a SpatialPolygonsDataFrame.
-
Martin Boros almost 6 yearsThank you so much! This deserves 1000 upvotes, very intuitive and works perfectly for subsetting location data by shapefiles, especially when working with census data and revenue and so forth
-
Huanfa Chen over 5 yearsGreat solution. Tip: make sure that pnts and ind_adm use the same CRS.