How to Count Points in Polygons with GeoPandas
Problem statement
A common GIS task is to count how many point features fall inside each polygon. Typical examples include:
- counting addresses inside parcels
- counting stores inside districts
- counting incident locations inside census tracts
- counting GPS observations inside management zones
The expected output is usually a polygon layer with a new count field, such as point_count, where each polygon stores the number of matching points.
In GeoPandas, the standard workflow is:
- load the point and polygon layers
- make sure both layers use the same CRS
- spatially match points to polygons
- aggregate the matches by polygon
- join the counts back to the polygon layer
Quick answer
To count points in polygons with GeoPandas:
- use
geopandas.sjoin()to match point features to polygons - group the joined records by a polygon ID
- count matched points per polygon
- merge the counts back into the polygon GeoDataFrame
Both layers must use the same CRS before the spatial join, and the polygon layer should have a stable unique ID such as polygon_id.
import geopandas as gpd
points = gpd.read_file("data/addresses.gpkg")
polygons = gpd.read_file("data/districts.gpkg")
if points.crs != polygons.crs:
points = points.to_crs(polygons.crs)
polygons = polygons.reset_index(drop=True)
polygons["polygon_id"] = polygons.index
joined = gpd.sjoin(
points,
polygons[["polygon_id", "geometry"]],
how="left",
predicate="within"
)
counts = (
joined.dropna(subset=["polygon_id"])
.groupby("polygon_id")
.size()
.reset_index(name="point_count")
)
result = polygons.merge(counts, on="polygon_id", how="left")
result["point_count"] = result["point_count"].fillna(0).astype(int)
Step-by-step solution
Load the point and polygon layers
Read both datasets with gpd.read_file(). GeoPandas supports shapefiles, GeoJSON, and GeoPackage files.
import geopandas as gpd
points = gpd.read_file("data/addresses.shp")
polygons = gpd.read_file("data/census_tracts.geojson")
print(points.head())
print(polygons.head())
print(points.columns)
print(polygons.columns)
Check that:
- the point layer contains point geometries
- the polygon layer contains polygon geometries
- the polygon layer has a usable identifier field, or you can create one
Check and align coordinate reference systems
Point-in-polygon operations require both layers to use the same CRS.
print("Points CRS:", points.crs)
print("Polygons CRS:", polygons.crs)
if points.crs != polygons.crs:
points = points.to_crs(polygons.crs)
If the CRS does not match, the spatial join may return wrong results or no matches at all.
For example, if points are in EPSG:4326 and polygons are in EPSG:3857, they will not line up correctly until one layer is reprojected.
Make sure polygons have a unique ID
You should group counts by a stable unique field, not by a name field that may repeat.
If the polygon layer does not already have a unique ID, create one:
if "polygon_id" not in polygons.columns:
polygons = polygons.reset_index(drop=True)
polygons["polygon_id"] = polygons.index + 1
Using district_name or tract_name alone can fail if multiple polygons share the same name.
Run a spatial join to match points to polygons
For most point counting workflows, it is easiest to join points to polygons. Each matched point gets the polygon ID it falls inside.
joined = gpd.sjoin(
points,
polygons[["polygon_id", "geometry"]],
how="left",
predicate="within"
)
print(joined.head())
This keeps all point records and adds polygon attributes where a match exists.
within is a good default when your rule is to count points strictly inside polygons. If points on boundaries should also count, use a different predicate such as intersects.
Count points per polygon
Now group the joined point records by polygon ID and count the number of matched rows.
counts = (
joined.dropna(subset=["polygon_id"])
.groupby("polygon_id")
.size()
.reset_index(name="point_count")
)
print(counts.head())
This produces a table like:
| polygon_id | point_count |
|---|---|
| 1 | 12 |
| 2 | 5 |
Polygons with no matching points will not appear in this table yet.
Join the counts back to the polygons
Merge the aggregated count table back into the polygon GeoDataFrame.
result = polygons.merge(counts, on="polygon_id", how="left")
result["point_count"] = result["point_count"].fillna(0).astype(int)
print(result[["polygon_id", "point_count"]].head())
Using a left merge from the polygon layer ensures that polygons with no points are kept in the final output.
Save the result
Write the final polygons with counts to a new file.
result.to_file("output/census_tracts_point_counts.gpkg", driver="GPKG")
You can also export to shapefile or GeoJSON:
result.to_file("output/census_tracts_point_counts.shp")
result.to_file("output/census_tracts_point_counts.geojson", driver="GeoJSON")
GeoPackage is often the safer choice because it handles field names and data types better than shapefiles.
Code examples
Example 1: Basic point count in polygons
import geopandas as gpd
points = gpd.read_file("data/stores.gpkg")
polygons = gpd.read_file("data/districts.gpkg")
if points.crs != polygons.crs:
points = points.to_crs(polygons.crs)
if "polygon_id" not in polygons.columns:
polygons = polygons.reset_index(drop=True)
polygons["polygon_id"] = polygons.index + 1
joined = gpd.sjoin(
points,
polygons[["polygon_id", "geometry"]],
how="left",
predicate="within"
)
counts = (
joined.dropna(subset=["polygon_id"])
.groupby("polygon_id")
.size()
.reset_index(name="point_count")
)
result = polygons.merge(counts, on="polygon_id", how="left")
result["point_count"] = result["point_count"].fillna(0).astype(int)
Example 2: Count points using intersects instead of within
joined = gpd.sjoin(
points,
polygons[["polygon_id", "geometry"]],
how="left",
predicate="intersects"
)
Use this when you want to count points inside polygons as well as points that fall exactly on polygon boundaries. This can change results compared to within.
Example 3: Count unique point IDs instead of all joined rows
counts = (
joined.dropna(subset=["polygon_id"])
.groupby("polygon_id")["point_id"]
.nunique()
.reset_index(name="point_count")
)
Use this if your point layer may contain duplicate records and you only want to count unique features.
Example 4: Export counted polygons to file
result.to_file("output/district_store_counts.gpkg", driver="GPKG")
This output can be opened directly in QGIS for styling, labeling, or further analysis.
Explanation
gpd.sjoin() is used because counting points in polygons starts with a spatial relationship: which points fall inside which polygons. The spatial join attaches polygon attributes to matching point features.
The count is calculated after the join because the join creates the row-level matches first. Once each point has a polygon ID, standard pandas grouping can count how many points belong to each polygon.
That is why the workflow is:
- spatial join first
- aggregation second
There is also a difference between:
- counting joined rows with
.size() - counting unique point IDs with
.nunique()
If every point feature should count once, .size() is usually correct. If your point data may contain duplicates and you want unique features only, count a point ID field instead.
Predicate choice matters:
- use
withinwhen boundary points should not count - use
intersectswhen boundary points should count too
For large datasets, performance depends on:
- number of points and polygons
- geometry complexity
- spatial index availability
GeoPandas uses a spatial index when available, which is much faster than checking every point against every polygon manually.
Edge cases and notes
Points on polygon boundaries
Boundary cases depend on the predicate:
withinusually excludes points exactly on the boundaryintersectsincludes points inside polygons and points touching polygon boundaries
Choose the predicate that matches your counting rule.
Polygons with no points
If you only look at the grouped count table, polygons with no matches are missing. To keep them, always merge the counts back to the full polygon layer with how="left" and fill null values with zero.
Duplicate or multipart polygons
If multiple polygons share the same polygon_id, their counts will be combined. Make sure the ID is truly unique.
Multipart polygons are supported. A point inside any part of the multipart feature will count toward that polygon.
Invalid geometries
Broken polygon geometries can cause unexpected join results. If results look wrong, check validity:
invalid = polygons[~polygons.is_valid]
print(invalid)
If invalid features exist, repair them before running the join.
CRS issues
A mismatched CRS is one of the most common reasons for empty results. Always compare points.crs and polygons.crs before running sjoin().
Internal links
If you want the broader pattern behind this workflow, see spatial joins in GeoPandas.
For related tasks, see:
- how to join attributes by location in GeoPandas
- how to reproject spatial data in Python with GeoPandas
- how to export GeoJSON in Python with GeoPandas
If your join returns no matches, check why GeoPandas spatial join is returning empty results.
FAQ
How do I count points in polygons with GeoPandas?
Use geopandas.sjoin() to match points to polygons, then group by a polygon ID and count the matched records. Merge the counts back into the polygon layer.
What is the difference between within and intersects when counting points?
within counts points strictly inside polygons. intersects counts points inside polygons and points that touch polygon boundaries. Boundary points can change your totals.
Why are some polygons missing from my output after counting points?
They are usually missing because only matched polygons appear in the grouped count table. Merge the counts back to the full polygon layer with a left join and fill missing values with zero.
Why does GeoPandas return zero or empty matches for all polygons?
The most common causes are:
- CRS mismatch
- invalid geometries
- wrong spatial predicate
- points and polygons not actually overlapping
How do I keep polygons that contain no points?
Use a left merge from the polygon GeoDataFrame to the count table:
result = polygons.merge(counts, on="polygon_id", how="left")
result["point_count"] = result["point_count"].fillna(0).astype(int)
This keeps all polygons and assigns 0 where no points were found.