How to Simplify Geometry in Python with GeoPandas and Shapely
Problem statement
Vector datasets often contain far more vertices than you need for a given task. This is common with administrative boundaries, coastlines, parcel outlines, and road centerlines. Extra detail causes practical GIS problems:
- larger shapefiles, GeoJSON, or GeoPackage files
- slower plotting and web map rendering
- slower spatial processing
- unnecessary detail at small map scales
Geometry simplification reduces vertex count while keeping the overall shape usable.
This is useful when you need to:
- prepare data for a web map
- create a lighter preview layer
- reduce file size
- clean detailed boundaries for a smaller-scale map
This page shows how to simplify line and polygon geometries in Python using GeoPandas and Shapely. Points are out of scope: a point is a single coordinate, so there are no vertices to remove.
Quick answer
To simplify geometry in GeoPandas, read your data, simplify the geometry column, and save the result:
import geopandas as gpd
gdf = gpd.read_file("admin_boundaries.shp")
# Reproject to a CRS with meter units if needed
gdf = gdf.to_crs("EPSG:3857")
gdf["geometry"] = gdf.geometry.simplify(tolerance=100, preserve_topology=True)
gdf.to_file("admin_boundaries_simplified.shp")
Key points:
- tolerance controls how much detail is removed
- preserve_topology=True helps avoid broken polygons
- a projected CRS is usually better than EPSG:4326 because tolerance uses CRS units
- EPSG:3857 is acceptable for many display and web mapping workflows, but a local projected CRS is usually better for more precise distance-based tolerances
Step-by-step solution
Load vector data into a GeoDataFrame
Start by reading your input file with GeoPandas. This works for shapefiles, GeoJSON, and GeoPackage layers.
import geopandas as gpd
gdf = gpd.read_file("data/admin_boundaries.shp")
print(gdf.head())
print(gdf.geometry.geom_type.value_counts())
print(gdf.crs)
Checking geometry types matters before simplification. This page focuses on:
- Polygon
- MultiPolygon
- LineString
- MultiLineString
If your dataset contains mixed geometry types, inspect it before applying one workflow to everything.
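One way to handle a mixed layer is to split it by geometry type before simplifying. The sketch below builds a small in-memory GeoDataFrame for illustration; the feature names and the `to_simplify` and `left_alone` variables are hypothetical and not part of any dataset used on this page.

```python
import geopandas as gpd
from shapely.geometry import LineString, Point, Polygon

# Hypothetical mixed layer, built in memory for illustration only
mixed = gpd.GeoDataFrame(
    {"name": ["well", "road", "parcel"]},
    geometry=[
        Point(0, 0),
        LineString([(0, 0), (10, 0), (20, 5)]),
        Polygon([(0, 0), (10, 0), (10, 10), (0, 10)]),
    ],
    crs="EPSG:3857",
)

# Geometry types covered by this page
supported = ["Polygon", "MultiPolygon", "LineString", "MultiLineString"]
mask = mixed.geometry.geom_type.isin(supported)

to_simplify = mixed[mask].copy()   # lines and polygons
left_alone = mixed[~mask].copy()   # points and anything else

print(to_simplify["name"].tolist())
print(left_alone["name"].tolist())
```

After simplifying `to_simplify`, the two parts can be combined again with `pandas.concat` if a single layer is needed.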
Check the coordinate reference system before simplifying
Simplification tolerance uses the units of the current CRS.
If your data is in EPSG:4326, the units are degrees. A tolerance like 100 would mean 100 degrees, which is not meaningful for most GIS work. Even smaller degree-based tolerances are hard to choose consistently.
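To see why degree-based tolerances are hard to choose, it can help to measure what a fixed step in degrees covers on the ground. The sketch below uses `pyproj.Geod` (pyproj is installed alongside GeoPandas) to measure an east-west step of 0.001 degrees at two latitudes. The step size, the latitudes, and the helper name `east_west_meters` are arbitrary choices for illustration.

```python
from pyproj import Geod

geod = Geod(ellps="WGS84")
step_deg = 0.001  # illustrative step in degrees of longitude


def east_west_meters(lat):
    """Ground length in meters of step_deg of longitude at a given latitude."""
    # inv() returns forward azimuth, back azimuth, and distance in meters
    _, _, dist = geod.inv(0.0, lat, step_deg, lat)
    return dist


at_equator = east_west_meters(0.0)
at_60_north = east_west_meters(60.0)

print(f"0.001 degrees at the equator: {at_equator:.1f} m")
print(f"0.001 degrees at 60 N:        {at_60_north:.1f} m")
```

The same numeric tolerance therefore removes a different amount of real-world detail depending on where the features are.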
Use a projected CRS with meter or foot units instead.
print(gdf.crs)
if gdf.crs is None:
    raise ValueError("Input data has no CRS. Set the CRS before simplifying.")
# Example: reproject to Web Mercator for web mapping or display workflows
gdf_projected = gdf.to_crs("EPSG:3857")
print(gdf_projected.crs)
For local or regional analysis, a local projected CRS is usually better than EPSG:3857.
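If you do not already know a suitable local CRS, one possible starting point is `estimate_utm_crs()`, which GeoPandas provides to suggest a UTM zone from the data extent. The sketch below uses a made-up line near Berlin in longitude and latitude. UTM is only one option, and a national grid may fit your area better.

```python
import geopandas as gpd
from shapely.geometry import LineString

# Hypothetical sample line in lon/lat, for illustration only
gdf = gpd.GeoDataFrame(
    {"name": ["sample"]},
    geometry=[LineString([(13.3, 52.5), (13.4, 52.52), (13.5, 52.5)])],
    crs="EPSG:4326",
)

# Ask GeoPandas for a UTM zone that matches the data extent
utm_crs = gdf.estimate_utm_crs()
gdf_local = gdf.to_crs(utm_crs)

print("Suggested CRS:", utm_crs.to_epsg())
print("Projected:", gdf_local.crs.is_projected)
print("Axis unit:", gdf_local.crs.axis_info[0].unit_name)
```

Checking `is_projected` and the axis unit before simplifying is a quick way to confirm that the tolerance will be interpreted in meters rather than degrees.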
Simplify geometries with GeoPandas
GeoPandas provides .simplify() on the geometry column. Under the hood, this uses Shapely geometry methods.
tolerance = 100  # CRS units; EPSG:3857 meters overstate ground distance away from the equator
gdf_simplified = gdf_projected.copy()
gdf_simplified["geometry"] = gdf_projected.geometry.simplify(
    tolerance=tolerance,
    preserve_topology=True
)
Important parameters:
- tolerance: maximum allowed deviation in CRS units
- preserve_topology=True: tries to avoid invalid output, especially for polygons
If you want a more aggressive result and can tolerate shape changes, you can set preserve_topology=False, but verify the output carefully.
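To get a feel for the difference, the sketch below simplifies a deliberately thin polygon with both settings and prints basic checks. The polygon and the tolerance are invented for illustration, and the exact output of `preserve_topology=False` can vary with the installed GEOS version, so treat the printed values as something to inspect rather than a fixed result.

```python
from shapely.geometry import Polygon

# Hypothetical sliver: 10 units long, 0.1 units wide
sliver = Polygon([(0, 0), (10, 0), (10, 0.1), (0, 0.1)])
tolerance = 1.0  # much larger than the sliver's width

safe = sliver.simplify(tolerance, preserve_topology=True)
fast = sliver.simplify(tolerance, preserve_topology=False)

for label, geom in [("preserve_topology=True ", safe),
                    ("preserve_topology=False", fast)]:
    print(label, "| empty:", geom.is_empty,
          "| valid:", geom.is_valid,
          "| area:", geom.area)
```

With a tolerance this large relative to the feature, the non-preserving variant may collapse the polygon, which is why checking for empty and invalid geometries afterwards matters.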
Compare original and simplified geometry
The number of rows should stay the same unless you later filter collapsed or empty geometries.
print("Original features:", len(gdf_projected))
print("Simplified features:", len(gdf_simplified))
You can also compare the visual result:
import matplotlib.pyplot as plt
ax = gdf_projected.plot(figsize=(8, 8), color="none", edgecolor="gray", linewidth=0.5)
gdf_simplified.plot(ax=ax, color="none", edgecolor="red", linewidth=0.8)
plt.show()
To estimate how much complexity was reduced, count vertices for line and polygon geometries.
def vertex_count(geom):
    if geom is None or geom.is_empty:
        return 0
    if geom.geom_type == "LineString":
        return len(geom.coords)
    if geom.geom_type == "Polygon":
        return len(geom.exterior.coords) + sum(len(ring.coords) for ring in geom.interiors)
    if geom.geom_type.startswith("Multi"):
        return sum(vertex_count(part) for part in geom.geoms)
    return 0
gdf_projected["vertex_count"] = gdf_projected.geometry.apply(vertex_count)
gdf_simplified["vertex_count"] = gdf_simplified.geometry.apply(vertex_count)
print("Original total vertices:", gdf_projected["vertex_count"].sum())
print("Simplified total vertices:", gdf_simplified["vertex_count"].sum())
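If Shapely 2.0 or later is installed, a vectorized alternative to a hand-written counter is `shapely.get_num_coordinates`. The sketch below applies it to two made-up geometries; this assumes Shapely 2.x, and the function is not available in Shapely 1.x.

```python
import shapely
from shapely.geometry import LineString, Polygon

# Hypothetical geometries for illustration
geoms = [
    LineString([(0, 0), (1, 0.1), (2, 0)]),
    Polygon([(0, 0), (4, 0), (4, 4), (0, 4)]),
]

# Returns one coordinate count per geometry (requires Shapely 2.0+)
counts = shapely.get_num_coordinates(geoms)
print(counts.tolist())
```

Polygon rings repeat their first coordinate at the end, so a square is counted as five coordinates, which matches what `len(geom.exterior.coords)` reports.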
Save the simplified output
Write the result to a new file instead of overwriting the original immediately.
gdf_simplified.to_file("output/admin_boundaries_simplified.gpkg", driver="GPKG")
You can also save to other formats:
gdf_simplified.to_file("output/admin_boundaries_simplified.shp")
gdf_simplified.to_file("output/admin_boundaries_simplified.geojson", driver="GeoJSON")
Code examples
Example 1: Simplify polygon boundaries in a shapefile
This example reads administrative boundaries, reprojects to meters, simplifies polygons, and saves a new file.
import geopandas as gpd
gdf = gpd.read_file("data/admin_boundaries.shp")
# Reproject from geographic CRS to metric CRS
gdf = gdf.to_crs("EPSG:3857")
# Simplify polygon boundaries
gdf["geometry"] = gdf.geometry.simplify(
    tolerance=250,
    preserve_topology=True
)
gdf.to_file("output/admin_boundaries_simplified.shp")
This is a common workflow when you want smaller polygon datasets for regional maps.
Example 2: Simplify line geometries for faster map display
This example reduces detail in a roads layer for preview or web display.
import geopandas as gpd
import matplotlib.pyplot as plt
roads = gpd.read_file("data/roads.geojson")
roads = roads.to_crs("EPSG:3857")
roads_simple = roads.copy()
roads_simple["geometry"] = roads.geometry.simplify(
    tolerance=20,
    preserve_topology=True
)
fig, axes = plt.subplots(1, 2, figsize=(12, 6))
roads.plot(ax=axes[0], color="black", linewidth=0.5)
axes[0].set_title("Original roads")
roads_simple.plot(ax=axes[1], color="blue", linewidth=0.5)
axes[1].set_title("Simplified roads")
plt.tight_layout()
plt.show()
This is useful when detailed road centerlines are too heavy for quick rendering.
Example 3: Use Shapely directly on a single geometry
If you are working with one geometry object instead of a full GeoDataFrame, use Shapely directly.
from shapely.geometry import LineString
line = LineString([
    (0, 0), (1, 0.1), (2, -0.1), (3, 0.05), (4, 0), (5, 0)
])
simple_topology = line.simplify(0.5, preserve_topology=True)
simple_no_topology = line.simplify(0.5, preserve_topology=False)
print("Original:", list(line.coords))
print("With topology preservation:", list(simple_topology.coords))
print("Without topology preservation:", list(simple_no_topology.coords))
For polygons, the same .simplify() method applies:
from shapely.geometry import Polygon
polygon = Polygon([
    (0, 0), (1, 0.1), (2, 0), (2, 2), (1, 2.1), (0, 2), (0, 0)
])
simplified_polygon = polygon.simplify(0.3, preserve_topology=True)
print(simplified_polygon.wkt)
Explanation
How tolerance affects the result
Tolerance controls how aggressively vertices are removed.
- small tolerance: minor detail removed
- medium tolerance: clearer reduction in complexity
- large tolerance: shape may become distorted or collapse
There is no universal correct value. The right tolerance depends on:
- map scale
- dataset detail
- CRS units
- whether you are simplifying roads, coastlines, or administrative polygons
Test a few values and compare results visually.
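Alongside a visual check, a quick numeric sweep can show how vertex count responds to tolerance. The sketch below uses a synthetic wavy line rather than real data, and the tolerance values are arbitrary. With your own layer you would run the same loop over a projected GeoDataFrame.

```python
import math
from shapely.geometry import LineString

# Synthetic wavy line: 201 vertices, 1000 units long, amplitude 10
line = LineString([(x, 10 * math.sin(x / 20)) for x in range(0, 1001, 5)])
original_count = len(line.coords)

results = {}
for tolerance in [0.5, 2, 5, 25]:
    simplified = line.simplify(tolerance, preserve_topology=True)
    results[tolerance] = len(simplified.coords)
    print(f"tolerance={tolerance:>4}: {results[tolerance]} of {original_count} vertices kept")
```

A tolerance larger than the size of the bends you care about will flatten them, so the useful range usually sits well below the smallest feature you want to keep.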
GeoPandas vs Shapely for simplification
Use GeoPandas when you want to simplify an entire dataset stored in a GeoDataFrame.
Use Shapely when you want to simplify:
- a single polygon or linestring
- geometry objects inside a custom script
- geometry values before building a GeoDataFrame
GeoPandas simplify operations are based on Shapely geometry methods.
What preserve_topology changes
preserve_topology=True is usually the safer setting for polygons. It helps reduce problems such as invalid geometry or unexpected self-intersections.
The tradeoff is that topology-preserving simplification may keep more detail than non-topology-preserving simplification.
For line data, the difference may be less noticeable, but it is still worth testing.
preserve_topology=True helps preserve validity for each geometry, but it does not preserve shared boundaries across multiple features. If you simplify adjacent polygons independently, shared edges may no longer match exactly.
Edge cases and notes
Simplifying data in EPSG:4326
Avoid simplifying directly in EPSG:4326 unless you have a very specific reason. Tolerance values in degrees are difficult to interpret, and they are inconsistent across locations because the ground length of a degree of longitude shrinks toward the poles.
A better workflow is:
- read the data
- reproject to a projected CRS
- simplify
- optionally reproject back
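The workflow above can be sketched as a round trip. This example uses a made-up line in longitude and latitude and reprojects to EPSG:32633 (UTM zone 33N), chosen only because it fits the illustrative coordinates. The 50 meter tolerance is likewise an assumption.

```python
import geopandas as gpd
from shapely.geometry import LineString

# Hypothetical input in lon/lat; in practice this would come from gpd.read_file()
original = gpd.GeoDataFrame(
    {"name": ["sample road"]},
    geometry=[LineString([(13.30, 52.50), (13.35, 52.5002),
                          (13.40, 52.50), (13.45, 52.52)])],
    crs="EPSG:4326",
)
source_crs = original.crs

# Reproject to a metric CRS that suits the sample coordinates
projected = original.to_crs("EPSG:32633")

# Simplify with a tolerance expressed in meters
projected["geometry"] = projected.geometry.simplify(tolerance=50, preserve_topology=True)

# Optionally return to the original CRS for delivery
result = projected.to_crs(source_crs)

print(result.crs)
print(len(original.geometry.iloc[0].coords), "->", len(result.geometry.iloc[0].coords))
```

Reprojecting back is only needed when the consumer of the data expects the original CRS, for example GeoJSON for a web map.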
Invalid geometries after processing
Simplification can expose existing geometry problems or create issues if tolerance is too aggressive.
Check validity if output looks wrong:
invalid = gdf_simplified[~gdf_simplified.geometry.is_valid]
print(invalid)
If many features are invalid before simplification, fix those first.
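One possible repair step is `make_valid` from `shapely.validation`, available in Shapely 1.8 and later. The sketch below applies it to a made-up self-intersecting "bowtie" polygon. Whether the repaired shape is what you want depends on the data, so review the result before simplifying.

```python
from shapely.geometry import Polygon
from shapely.validation import explain_validity, make_valid

# Hypothetical self-intersecting polygon (edges cross at (1, 1))
bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])

print("Valid before repair:", bowtie.is_valid)
print("Reason:", explain_validity(bowtie))

# make_valid returns a valid geometry that covers the same area
repaired = make_valid(bowtie)
print("Valid after repair:", repaired.is_valid)
print("Repaired type:", repaired.geom_type)
```

Recent GeoPandas releases also expose a `make_valid()` method on the geometry column, which applies the same kind of repair to a whole layer.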
Shared polygon boundaries may stop matching
If you simplify polygon features one by one, neighboring boundaries can drift apart. This matters for administrative areas, parcels, and any layer where adjacent polygons are supposed to share the same edge.
If exact shared boundaries matter, inspect the result carefully and use a topology-aware workflow outside simple per-feature .simplify() operations.
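As a rough illustration of what a topology-aware workflow can look like, the sketch below simplifies shared edges once instead of once per polygon: it nodes all boundaries together, merges them into arcs, simplifies each arc, and rebuilds polygons with `polygonize`. This is a minimal sketch on two made-up polygons, not a general solution. It does not carry attributes over, and with large tolerances simplified arcs could cross each other.

```python
from shapely.geometry import Polygon
from shapely.ops import linemerge, polygonize, unary_union

# Two hypothetical neighbors that share a wiggly edge near x = 2
wiggle = [(2.05, 0.5), (1.95, 1.0), (2.05, 1.5)]
left = Polygon([(0, 0), (2, 0), *wiggle, (2, 2), (0, 2)])
right = Polygon([(2, 0), (4, 0), (4, 2), (2, 2), *reversed(wiggle)])

# 1. Node all boundaries together so the shared edge exists only once
noded = unary_union([left.boundary, right.boundary])

# 2. Merge the pieces into arcs that run from junction to junction
merged = linemerge(noded)
arcs = list(merged.geoms) if merged.geom_type == "MultiLineString" else [merged]

# 3. Simplify each arc once; arc endpoints are kept, so junctions stay put
simplified_arcs = [arc.simplify(0.1, preserve_topology=True) for arc in arcs]

# 4. Rebuild polygons from the simplified arcs
rebuilt = list(polygonize(simplified_arcs))

shared = rebuilt[0].intersection(rebuilt[1])
print(len(rebuilt), "polygons rebuilt")
print("Shared edge:", shared.geom_type, "length", shared.length)
```

In a real layer you would still need to re-attach attributes to the rebuilt polygons, for example by matching each one to the original feature that contains a point inside it.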
Very small features may collapse
Small polygons, narrow corridors, or short line segments may become empty or lose important shape detail if tolerance is too large.
Inspect output for:
- empty geometries
- missing small islands or enclaves
- oversimplified road bends
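Empty results can be flagged programmatically before you inspect the rest by eye. The sketch below uses two made-up polygons and an intentionally aggressive setting (`preserve_topology=False` with a tolerance far larger than the small feature) so that a collapse is likely. The names and sizes are assumptions for illustration.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# Hypothetical layer: one large polygon and one very small one
gdf = gpd.GeoDataFrame(
    {"name": ["mainland", "tiny island"]},
    geometry=[
        Polygon([(0, 0), (5000, 0), (5000, 5000), (0, 5000)]),
        Polygon([(6000, 0), (6010, 0), (6010, 10), (6000, 10)]),
    ],
    crs="EPSG:3857",
)

simplified = gdf.copy()
simplified["geometry"] = gdf.geometry.simplify(tolerance=100, preserve_topology=False)

# Flag features whose geometry is missing or empty after simplification
collapsed = simplified.geometry.isna() | simplified.geometry.is_empty
print("Collapsed features:", simplified.loc[collapsed, "name"].tolist())

# Keep only features that still have a usable geometry
kept = simplified[~collapsed].copy()
print("Kept features:", kept["name"].tolist())
```

Whether to drop collapsed features, keep the original geometry for them, or lower the tolerance is a judgment call that depends on how the layer will be used.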
Attribute data stays the same
Simplification changes geometry only. Your attribute columns remain unchanged unless you explicitly modify them.
Internal links
If you are not sure why CRS units matter here, read projected vs geographic CRS in GeoPandas.
If you need to change CRS before simplifying, see How to Reproject Spatial Data in Python (GeoPandas).
If you are starting from a file workflow, see How to Read a Shapefile in Python with GeoPandas.
If you need to export the result for web use, see How to Export GeoJSON in Python with GeoPandas.
If the output looks distorted or far too aggressive, check why GeoPandas simplify gives unexpected results.
FAQ
How do I choose the right tolerance for geometry simplification in GeoPandas?
Start with a small value in projected units, then test larger values. For a meter-based CRS, try values like 5, 20, 100, or 250 depending on the dataset scale. Always compare the result visually.
Should I simplify geometries before or after reprojecting?
Usually after reprojecting. Simplification tolerance depends on CRS units, so a projected CRS in meters is easier to work with than geographic coordinates in degrees.
What does preserve_topology=True do in Shapely and GeoPandas?
It makes simplification safer by trying to avoid invalid geometry and major per-feature topology problems. This is especially important for polygons. It does not guarantee that shared boundaries between separate polygon features will still match exactly.
Can simplification remove small features or make polygons invalid?
Yes. If tolerance is too large, small polygons may collapse and narrow shapes may disappear. Even with topology preservation, you should validate and inspect the output.