How to Get Polygon Centroids in GeoPandas
Problem statement
A common GIS task is converting polygon features into point features based on their centers. In GeoPandas, this usually means calculating a centroid for each polygon in a GeoDataFrame.
This is useful when you need to:
- label polygons with point locations
- create point layers from area features
- prepare polygon-derived points for joins, exports, or analysis
- generate point outputs from polygon layers for automation workflows
The practical questions are usually:
- which GeoPandas method to use
- what type of result it returns
- whether the centroid is reliable in the current CRS
- why some centroids appear outside polygon boundaries
Quick answer
To get polygon centroids in GeoPandas, use the .centroid attribute on the geometry column:
import geopandas as gpd
gdf = gpd.read_file("data/parcels.shp")
projected_gdf = gdf.to_crs("EPSG:3857")
projected_gdf["centroid"] = projected_gdf.geometry.centroid
This returns a GeoSeries of Point geometries, one for each polygon.
In most GIS workflows, calculate centroids in a projected CRS, not a geographic CRS like EPSG:4326, then convert the result back if needed.
Step-by-step solution
Load polygon data into a GeoDataFrame
Read a shapefile or GeoJSON
Start by loading polygon data from a real GIS file.
import geopandas as gpd
gdf = gpd.read_file("data/parcels.shp")
# or
# gdf = gpd.read_file("data/parcels.geojson")
print(gdf.head())
Check geometry type and CRS
Before calculating centroids, confirm that the geometry column contains polygon data and that the CRS is set correctly.
print("CRS:", gdf.crs)
print("Geometry types:")
print(gdf.geometry.geom_type.value_counts())
Example output:
CRS: EPSG:4326
Geometry types:
Polygon 120
MultiPolygon 15
If the CRS is geographic, for example EPSG:4326, reproject to a suitable projected CRS before calculating centroids when you need spatially reliable results.
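The same check can be done programmatically. The following is a minimal sketch that assumes a small in-memory layer (the toy geometries and `parcel_id` values are made up for illustration) and relies on the `is_geographic` property of the pyproj CRS object that GeoPandas exposes as `gdf.crs`.

```python
import geopandas as gpd
from shapely.geometry import MultiPolygon, Polygon

# Toy data standing in for a real parcel layer (hypothetical values).
square = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
two_parts = MultiPolygon([
    Polygon([(2, 0), (3, 0), (3, 1), (2, 1)]),
    Polygon([(4, 0), (5, 0), (5, 1), (4, 1)]),
])
gdf = gpd.GeoDataFrame(
    {"parcel_id": [1, 2]}, geometry=[square, two_parts], crs="EPSG:4326"
)

# gdf.crs is a pyproj CRS object; is_geographic is True for lon/lat systems.
needs_reprojection = gdf.crs is not None and gdf.crs.is_geographic

# Confirm that only polygonal geometry types are present.
geom_types = set(gdf.geometry.geom_type.unique())
only_polygons = geom_types <= {"Polygon", "MultiPolygon"}

print("Needs reprojection:", needs_reprojection)
print("Only polygons:", only_polygons)
```

A check like this is handy at the top of an automated workflow, where nobody is reading the printed CRS by eye.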
Calculate polygon centroids
Reproject before calculating centroids
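For accurate centroid placement, use a projected CRS. Web Mercator (EPSG:3857) is shown here for simplicity, but its distortion grows with distance from the equator, so a local projected CRS such as the appropriate UTM zone is usually better for analysis.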
projected_gdf = gdf.to_crs("EPSG:3857")
projected_gdf["centroid"] = projected_gdf.geometry.centroid
print(projected_gdf[["centroid"]].head())
This creates a new column containing Shapely Point geometries.
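If you do not know which local projected CRS to use, one option is to let GeoPandas suggest a UTM zone with `estimate_utm_crs()`. The sketch below assumes a single hypothetical parcel near San Francisco; UTM is only a reasonable choice when the layer fits comfortably within one zone.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# Hypothetical parcel near San Francisco, in lon/lat (EPSG:4326).
parcel = Polygon([(-122.45, 37.75), (-122.40, 37.75),
                  (-122.40, 37.80), (-122.45, 37.80)])
gdf = gpd.GeoDataFrame({"parcel_id": [1]}, geometry=[parcel], crs="EPSG:4326")

# Ask GeoPandas for a UTM zone that fits the layer's extent.
utm_crs = gdf.estimate_utm_crs()

# Reproject, then compute centroids in metre-based coordinates.
projected_gdf = gdf.to_crs(utm_crs)
projected_gdf["centroid"] = projected_gdf.geometry.centroid

print(utm_crs.to_epsg())
print(projected_gdf["centroid"].iloc[0].geom_type)
```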
Inspect one centroid result
first_centroid = projected_gdf["centroid"].iloc[0]
print(first_centroid)
print(first_centroid.geom_type)
Keep the original polygon geometry
In most workflows, keep centroids in a separate column so you do not overwrite the polygon geometry too early.
projected_gdf["centroid"] = projected_gdf.geometry.centroid
Create a centroid point layer from polygons
Make centroid geometry the active geometry
If you want to work with centroid points as the active geometry, use set_geometry().
centroid_gdf = projected_gdf.set_geometry("centroid").copy()
print(centroid_gdf.geometry.head())
Now centroid_gdf behaves like a point layer.
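To see what `set_geometry()` changes and what it leaves alone, here is a small self-contained sketch. The two rectangles and their coordinates are illustrative only.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# Toy polygons in a projected CRS (coordinates are illustrative metres).
gdf = gpd.GeoDataFrame(
    {"parcel_id": [1, 2]},
    geometry=[
        Polygon([(0, 0), (10, 0), (10, 10), (0, 10)]),
        Polygon([(20, 0), (40, 0), (40, 10), (20, 10)]),
    ],
    crs="EPSG:3857",
)
gdf["centroid"] = gdf.geometry.centroid

# set_geometry returns a new GeoDataFrame; gdf itself is unchanged.
centroid_gdf = gdf.set_geometry("centroid")

print(gdf.geometry.name)            # active geometry of the original
print(centroid_gdf.geometry.name)   # active geometry of the point layer
print(centroid_gdf.geom_type.tolist())
```

The polygon column is still present in `centroid_gdf`; only the active geometry has changed.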
Replace geometry in a copied GeoDataFrame
Another common pattern is to create a new GeoDataFrame and replace the geometry column.
centroid_gdf = projected_gdf.copy()
centroid_gdf["geometry"] = projected_gdf.geometry.centroid
This leaves projected_gdf and its polygon geometry unchanged while giving you a point-based copy.
Keep selected attributes only
If you only want a few fields plus centroid geometry:
centroid_gdf = projected_gdf[["parcel_id", "owner_name"]].copy()
centroid_gdf = centroid_gdf.join(projected_gdf.geometry.centroid.rename("geometry"))
centroid_gdf = gpd.GeoDataFrame(centroid_gdf, geometry="geometry", crs=projected_gdf.crs)
Convert centroid output back to the original CRS
If your source data was in EPSG:4326, or a different output CRS is required, convert the centroid layer back after calculating centroids in projected coordinates.
projected_gdf["centroid"] = projected_gdf.geometry.centroid
centroid_gdf = projected_gdf.drop(columns=projected_gdf.geometry.name).set_geometry("centroid")
centroid_gdf = centroid_gdf.to_crs(gdf.crs)
A simple end-to-end pattern looks like this:
import geopandas as gpd
gdf = gpd.read_file("data/parcels.shp")
projected = gdf.to_crs("EPSG:3857")
centroids = projected.copy()
centroids["geometry"] = projected.geometry.centroid
centroids = centroids.to_crs(gdf.crs)
print(centroids.head())
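Because the pattern above depends on a file on disk, the following sketch repeats it on an in-memory one-degree square (a made-up geometry) so the round trip can be checked directly: the output CRS matches the input, and the centroid lands inside the source polygon.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# Hypothetical one-degree square in lon/lat.
gdf = gpd.GeoDataFrame(
    {"parcel_id": [1]},
    geometry=[Polygon([(10, 50), (11, 50), (11, 51), (10, 51)])],
    crs="EPSG:4326",
)

# Project, take centroids, then convert the points back to the source CRS.
projected = gdf.to_crs("EPSG:3857")
centroids = projected.copy()
centroids["geometry"] = projected.geometry.centroid
centroids = centroids.to_crs(gdf.crs)

point = centroids.geometry.iloc[0]
same_crs = centroids.crs == gdf.crs
inside = point.within(gdf.geometry.iloc[0])
print(same_crs, inside, round(point.x, 4), round(point.y, 4))
```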
Export centroid points
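Once the centroid layer is ready, export it like any other GeoDataFrame. Most vector file formats store a single geometry column, so make sure the layer you write contains only the centroid geometry.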
centroid_gdf.to_file("output/parcels_centroids.shp")
centroid_gdf.to_file("output/parcels_centroids.geojson", driver="GeoJSON")
This is a standard way to create a point layer from polygons in Python.
Check whether centroids fall outside polygons
Why this happens
A centroid is the geometric center of a polygon, but it is not guaranteed to fall inside the polygon. This is common with:
- concave polygons
- polygons with holes
- irregular multipart features
Use representative points when you need an interior point
If you need a point guaranteed to lie inside the polygon, use .representative_point() instead.
projected_gdf["rep_point"] = projected_gdf.geometry.representative_point()
Compare centroid and representative point
projected_gdf["centroid"] = projected_gdf.geometry.centroid
projected_gdf["rep_point"] = projected_gdf.geometry.representative_point()
print(projected_gdf[["centroid", "rep_point"]].head())
Use centroids when you need the geometric center. Use representative points when you need a point inside the polygon for labeling or display.
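The difference is easiest to see on a concave shape. The sketch below uses a made-up U-shaped polygon whose centroid falls in the notch, outside the polygon, while the representative point stays inside.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# U-shaped polygon: a 10 x 10 square with a notch cut out of the top.
u_shape = Polygon([(0, 0), (10, 0), (10, 10), (8, 10),
                   (8, 2), (2, 2), (2, 10), (0, 10)])
gdf = gpd.GeoDataFrame({"parcel_id": [1]}, geometry=[u_shape], crs="EPSG:3857")

gdf["centroid"] = gdf.geometry.centroid
gdf["rep_point"] = gdf.geometry.representative_point()

# Element-wise tests: does each polygon contain its own point?
gdf["centroid_inside"] = gdf.geometry.contains(gdf["centroid"])
gdf["rep_point_inside"] = gdf.geometry.contains(gdf["rep_point"])

print(gdf[["centroid_inside", "rep_point_inside"]])
```

The same `contains` test can be run on a real layer to count how many centroids fall outside their polygons before deciding which point type to use.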
Explanation
The centroid property in GeoPandas comes from the underlying Shapely geometry operation. When you run:
projected_gdf.geometry.centroid
GeoPandas returns a GeoSeries of Point geometries, one for each polygon or multipolygon.
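To illustrate the relationship, this short sketch compares Shapely's per-geometry `centroid` with the vectorized GeoSeries version on two made-up rectangles. It is an illustration of the equivalence, not a description of GeoPandas internals.

```python
import geopandas as gpd
from shapely.geometry import Polygon

polygons = [
    Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),
    Polygon([(10, 0), (14, 0), (14, 2), (10, 2)]),
]

# Shapely: one geometry at a time.
shapely_centroids = [poly.centroid for poly in polygons]

# GeoPandas: the same operation, vectorized over a GeoSeries.
series_centroids = gpd.GeoSeries(polygons).centroid

for single, vectorized in zip(shapely_centroids, series_centroids):
    print(single.equals(vectorized), vectorized.x, vectorized.y)
```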
You can use that result in three common ways:
- store it in a new column
- make it the active geometry with set_geometry()
- replace the geometry column in a copied GeoDataFrame to create a point layer
The main practical issue is CRS choice. If you calculate centroids in a geographic CRS such as EPSG:4326, longitude and latitude are treated as planar x and y values, so the result is a center in degree space rather than in projected coordinates. Recent versions of GeoPandas emit a UserWarning when you do this. For many GIS tasks, that is not the right basis for a center point calculation. Reproject first when spatial accuracy matters.
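The effect of CRS choice can be seen with a deliberately large shape. The sketch below uses a hypothetical triangle spanning 60 degrees of latitude and compares the centroid computed directly in EPSG:4326 with one computed in EPSG:3857 and converted back. Neither value is presented as the correct center; the point is only that the answer depends on the CRS used for the calculation.

```python
import warnings

import geopandas as gpd
from shapely.geometry import Polygon

# Hypothetical large triangle spanning 60 degrees of latitude.
triangle = Polygon([(0, 0), (20, 0), (10, 60)])
gdf = gpd.GeoDataFrame({"name": ["triangle"]}, geometry=[triangle], crs="EPSG:4326")

# Centroid computed directly on lon/lat values; GeoPandas may warn here.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    geographic_centroid = gdf.geometry.centroid.iloc[0]

# Centroid computed in EPSG:3857, then converted back to lon/lat.
projected_centroid = (
    gdf.to_crs("EPSG:3857").geometry.centroid.to_crs("EPSG:4326").iloc[0]
)

print("Warnings captured:", len(caught))
print("Centroid latitude in EPSG:4326:", round(geographic_centroid.y, 2))
print("Centroid latitude via EPSG:3857:", round(projected_centroid.y, 2))
```

For small features such as parcels the gap is far smaller, which is why the choice matters most for large or high-latitude polygons.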
The difference between centroid and representative point is also important:
- Centroid: geometric center, may fall outside the polygon
- Representative point: guaranteed to lie inside the polygon, but usually not the geometric center
Typical GIS uses for polygon centroids include:
- creating parcel label points
- generating center points for export
- preparing polygon-derived points for spatial joins
- simplifying polygon layers for downstream automation
Edge cases and notes
- Geographic CRS: Avoid calculating centroids directly in EPSG:4326 unless approximate output is acceptable.
- Projected CRS choice: EPSG:3857 is convenient for examples, but a local projected CRS is usually better for measurement-based work.
- Multipart polygons: A multipolygon gets one centroid for the full geometry, not one centroid per part.
- Invalid geometries: Invalid polygons can produce unreliable results. Check geometry validity first:
print(gdf.geometry.is_valid.value_counts())
For simple repair cases, you may see workflows like:
gdf["geometry"] = gdf.buffer(0)
Validate the repaired output before using it in production.
- Empty or null geometries: Missing geometries produce missing centroid results.
gdf = gdf[gdf.geometry.notna() & ~gdf.geometry.is_empty]
- Performance: Centroid calculation is vectorized and usually fast, but large datasets still benefit from limiting unnecessary columns and copies.
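Two of these notes, multipart polygons and empty or null geometries, can be sketched together. The example below assumes a toy layer with one multipolygon, one null geometry, and one empty polygon. It filters the unusable rows, then compares the single centroid of the multipolygon with per-part centroids obtained by calling `explode()` first.

```python
import geopandas as gpd
from shapely.geometry import MultiPolygon, Polygon

left = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
right = Polygon([(3, 0), (4, 0), (4, 1), (3, 1)])

gdf = gpd.GeoDataFrame(
    {"parcel_id": [1, 2, 3]},
    geometry=[MultiPolygon([left, right]), None, Polygon()],
    crs="EPSG:3857",
)

# Drop null and empty geometries before calculating centroids.
clean = gdf[gdf.geometry.notna() & ~gdf.geometry.is_empty]

# One centroid for the whole multipolygon.
whole = clean.geometry.centroid

# One centroid per part: split the multipolygon into single polygons first.
parts = clean.explode(ignore_index=True)
per_part = parts.geometry.centroid

print(len(clean), len(whole), len(per_part))
print([(p.x, p.y) for p in per_part])
```

Note that the whole-geometry centroid here sits in the gap between the two parts, which is another case where a centroid falls outside its polygon.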
Internal links
For a broader concept, see How to Work with Geometry Columns in GeoPandas.
For related tasks, see How to Read a Shapefile in Python with GeoPandas and How to Reproject Spatial Data in Python (GeoPandas).
If your output looks wrong, check Why GeoPandas Centroid Results Look Wrong.
FAQ
How do I get the centroid of a polygon in GeoPandas?
Use the .centroid attribute on the geometry column:
projected_gdf["centroid"] = projected_gdf.geometry.centroid
This creates one point for each polygon or multipolygon.
Does GeoPandas return one centroid per polygon?
Yes. GeoPandas returns one Point geometry for each polygon or multipolygon, unless the geometry is null or empty.
Why is my centroid outside the polygon?
That can be normal. Concave polygons, polygons with holes, and multipart shapes can have centroids outside their boundaries. If you need an interior point, use:
projected_gdf["rep_point"] = projected_gdf.geometry.representative_point()
Should I reproject before calculating centroids?
Usually yes. If your data is in a geographic CRS such as EPSG:4326, reproject to a suitable projected CRS before calculating centroids.
What is the difference between centroid and representative point?
A centroid is the geometric center of the polygon. A representative point is guaranteed to lie inside the polygon. They are useful for different GIS tasks.