Geopandas: how to read a csv and convert to a geopandas dataframe with polygons?

22,591

Solution 1

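GeoPandas cannot use the geometry column as it stands because, after pd.read_csv, it contains WKT strings rather than shapely geometry objects. That is what the "Input must be valid geometry objects" error is reporting. You could try two approaches.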

Number 2: Try applying the shapely wkt.loads function on your column before converting your dataframe to a geodataframe.

import geopandas as gpd
from shapely import wkt

# Parse each WKT string into a shapely geometry object
df['geometry'] = df['geometry'].apply(wkt.loads)
gdf = gpd.GeoDataFrame(df, crs='epsg:4326')

Number 2 is the approach that commenters report working; note the warning on Number 1 below. Good luck!
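
As a minimal end-to-end sketch of the Number 2 approach: the in-memory CSV and its two toy shapes below are made up for illustration and stand in for 'myFile.csv'. Passing geometry='geometry' explicitly is an addition here, not part of the original answer.

```python
import io

import geopandas as gpd
import pandas as pd
from shapely import wkt

# Hypothetical stand-in for 'myFile.csv': two rows whose geometry is WKT text.
csv_text = io.StringIO(
    'BoroName,geometry\n'
    'A,"POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))"\n'
    'B,"MULTIPOLYGON (((2 2, 3 2, 3 3, 2 3, 2 2)))"\n'
)
df = pd.read_csv(csv_text)

# After read_csv the geometry column holds plain strings, not geometries.
print(type(df.loc[0, 'geometry']))

# Parse each WKT string into a shapely geometry object.
df['geometry'] = df['geometry'].apply(wkt.loads)

# Build the GeoDataFrame, naming the geometry column and the CRS explicitly.
gdf = gpd.GeoDataFrame(df, geometry='geometry', crs='EPSG:4326')

print(gdf.geom_type.tolist())
```

Newer GeoPandas releases also provide gpd.GeoSeries.from_wkt, which parses a whole column of WKT strings in one call and may be a convenient alternative to apply(wkt.loads).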


Do not use Number 1: it crashes the Spyder and Jupyter kernels for some people.

Number 1: Try loading the csv directly with geopandas

gdf = gpd.read_file('myFile.csv')
gdf.crs = 'epsg:4326'

Solution 2

You could also try this:

gdf = gpd.GeoDataFrame(
    df, geometry=gpd.points_from_xy(df.longitude, df.latitude)
)

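This converts longitude/latitude columns to point geometries. It only applies if your dataframe has such columns; the dataframe in the question has a WKT geometry column instead, so this does not help there.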
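
A small sketch of this variant, assuming a table that really does have coordinate columns. The names and values below are hypothetical and do not come from the question's data. The point to note is the argument order: x is longitude and y is latitude.

```python
import geopandas as gpd
import pandas as pd

# Hypothetical table with coordinate columns (not present in the question's data).
df = pd.DataFrame(
    {
        'name': ['a', 'b'],
        'longitude': [-73.97, -73.80],
        'latitude': [40.63, 40.77],
    }
)

# x is longitude and y is latitude; swapping them silently misplaces the points.
points = gpd.points_from_xy(x=df['longitude'], y=df['latitude'])
gdf = gpd.GeoDataFrame(df, geometry=points, crs='EPSG:4326')

first = gdf.geometry.iloc[0]
print(first.x, first.y)
```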

Author by

emax

Updated on March 05, 2022

Comments

  • emax
    emax about 2 years

    I read a .csv file as a dataframe that looks like the following:

    import pandas as pd
    df = pd.read_csv('myFile.csv')
    df.head()
        BoroName    geometry
    0   Brooklyn    MULTIPOLYGON (((-73.97604935657381 40.63127590...
    1   Queens      MULTIPOLYGON (((-73.80379022888098 40.77561011...
    2   Queens      MULTIPOLYGON (((-73.8610972440186 40.763664477...
    3   Queens      MULTIPOLYGON (((-73.75725671509139 40.71813860...
    4   Manhattan   MULTIPOLYGON (((-73.94607828674226 40.82126321...
    

    I want to convert it to a geopandas dataframe.

    import geopandas as gpd
    crs = {'init': 'epsg:4326'}
    gdf = gpd.GeoDataFrame(df, crs=crs).set_geometry('geometry')
    

    but I get the following error

    TypeError: Input must be valid geometry objects: MULTIPOLYGON (((-73.97604935657381 40.631275905646774, -73.97716511994669 40.63074665412933,....
    
  • shadow_dev
    shadow_dev almost 4 years
    But where within the dataframe presented are the columns latitude and longitude?
  • Sam Murphy
    Sam Murphy about 3 years
    Number 2 works for me. Number 1 kills my Jupyter notebook kernel for some reason.
  • Stelios K.
    Stelios K. almost 3 years
    Number 1 also crashes the Spyder kernel.
  • mins
    mins over 2 years
    The correct order when using geographic coordinates with points_from_xy is points_from_xy(longitudes, latitudes).
  • eeny
    eeny over 2 years
    Number 2 works great