How to Snap and Align Geometries to Fix Slivers and Gaps in Python
Adjacent polygons that should share an edge often don't. After an overlay, a union, or a reprojection you end up with a layer that looks fine at full extent but, when you zoom in, is riddled with hairline overlaps and gaps. These artefacts break dissolves, inflate feature counts, and make area statistics wrong. This guide shows you how to find and fix them with shapely and geopandas.
Problem statement
You are dealing with this article if you see any of these symptoms:
- Thin sliver polygons appear after an
overlay()orunion— long, skinny shapes with almost no area that sit along the boundary between two real features. - Gaps open up between neighbours that are supposed to be coincident — small slivers of empty space where two polygons fail to touch.
- Vertices that almost-but-not-quite match — two boundaries that should be identical differ by a fraction of a millimetre, so the geometries neither snap together nor dissolve cleanly.
- A
dissolve()that should merge two adjacent polygons into one leaves a visible internal boundary or a tiny hole. - Feature counts jump after an intersection because every shared edge spawns one or two extra micro-polygons.
Quick answer
Reduce coordinate precision so near-identical vertices collapse onto a shared grid, then drop the tiny sliver polygons that survive by filtering on area. Always do this in a projected, metric CRS so your grid_size and area thresholds are in metres.
import geopandas as gpd
import shapely
gdf = gpd.read_file("parcels.gpkg").to_crs(3857) # metric CRS
# 1. Align all vertices to a 0.001 m (1 mm) grid
gdf["geometry"] = shapely.set_precision(gdf.geometry.values, grid_size=0.001)
# 2. Repair anything precision reduction broke
gdf["geometry"] = gdf.geometry.make_valid()
# 3. Drop sliver polygons below an area threshold (e.g. 1 m^2)
gdf = gdf[gdf.geometry.area >= 1.0].reset_index(drop=True)
Step-by-step solution
Identify slivers by area and thinness
A sliver is small and thin. Filtering on area alone catches some, but a long thin sliver can have a deceptively large area. Combine area with a thinness ratio (area / perimeter^2): compact shapes approach 1 / (4*pi) (~0.0796), slivers approach zero.
import geopandas as gpd
import numpy as np
gdf = gpd.read_file("overlay_result.gpkg").to_crs(3857)
area = gdf.geometry.area
perim = gdf.geometry.length
thinness = area / (perim ** 2)
# Candidate slivers: tiny area OR very thin
slivers = gdf[(area < 5.0) | (thinness < 0.01)]
print(f"{len(slivers)} candidate slivers out of {len(gdf)} features")
Align vertices with set_precision
shapely.set_precision snaps every coordinate to a fixed grid. When two boundaries differ only by floating-point noise, rounding both to the same grid makes them identical, so shared edges line up exactly.
import geopandas as gpd
import shapely
gdf = gpd.read_file("parcels.gpkg").to_crs(3857)
# Snap all coordinates to a 1 mm grid
gdf["geometry"] = shapely.set_precision(gdf.geometry.values, grid_size=0.001)
# Precision reduction can produce invalid rings — fix them
gdf["geometry"] = gdf.geometry.make_valid()
Snap one layer to another with shapely.snap
When you want vertices of one layer to land exactly on the vertices/edges of a reference layer, use shapely.snap(geom, reference, tolerance). It moves vertices of geom that fall within tolerance of the reference onto it. This is ideal for aligning a roughly-digitised layer to an authoritative boundary.
import geopandas as gpd
import shapely
draft = gpd.read_file("draft_boundaries.gpkg").to_crs(3857)
authoritative = gpd.read_file("admin_boundary.gpkg").to_crs(3857)
ref = authoritative.geometry.union_all() # single reference geometry
draft["geometry"] = draft.geometry.apply(
lambda g: shapely.snap(g, ref, tolerance=0.5) # snap within 0.5 m
)
draft["geometry"] = draft.geometry.make_valid()
Remove sliver polygons after overlay
Overlays are the classic sliver factory. After an intersection or union, filter the result on area (and optionally thinness) to drop the artefacts. Use explode() first so multi-part features are evaluated part by part.
import geopandas as gpd
a = gpd.read_file("layer_a.gpkg").to_crs(3857)
b = gpd.read_file("layer_b.gpkg").to_crs(3857)
result = gpd.overlay(a, b, how="intersection")
result = result.explode(index_parts=False).reset_index(drop=True)
# Drop parts smaller than 1 m^2
result = result[result.geometry.area >= 1.0].reset_index(drop=True)
Close gaps between adjacent polygons
Gaps are the inverse of overlaps: empty slivers where neighbours fail to touch. The most robust fix is to align everything to a grid with set_precision, then dissolve and re-split, or use a small positive-then-negative buffer to bridge sub-tolerance gaps.
import geopandas as gpd
import shapely
gdf = gpd.read_file("districts.gpkg").to_crs(3857)
# Align to grid so coincident edges become identical
gdf["geometry"] = shapely.set_precision(gdf.geometry.values, grid_size=0.01)
# Bridge tiny gaps: buffer out then back in (closing operation)
tol = 0.05 # 5 cm
gdf["geometry"] = gdf.geometry.buffer(tol, join_style="mitre").buffer(
-tol, join_style="mitre"
)
gdf["geometry"] = gdf.geometry.make_valid()
Verify the result
Always confirm the cleanup worked: check there are no remaining tiny parts, that every geometry is valid, and that the total area is roughly unchanged (you should only have removed noise).
import geopandas as gpd
cleaned = gpd.read_file("cleaned.gpkg").to_crs(3857)
print("All valid:", cleaned.is_valid.all())
print("Min area:", cleaned.geometry.area.min())
print("Empty geoms:", (cleaned.geometry.is_empty).sum())
print("Total area km^2:", cleaned.geometry.area.sum() / 1e6)
Code examples
Example 1: A reusable sliver detector
import geopandas as gpd
def flag_slivers(gdf, min_area=5.0, min_thinness=0.01):
"""Return a boolean Series: True where the feature is a likely sliver."""
area = gdf.geometry.area
perim = gdf.geometry.length
thinness = area / (perim ** 2)
return (area < min_area) | (thinness < min_thinness)
gdf = gpd.read_file("overlay_result.gpkg").to_crs(3857)
gdf["is_sliver"] = flag_slivers(gdf)
print(gdf["is_sliver"].value_counts())
Example 2: Snap an entire layer to itself to merge near-duplicate edges
import geopandas as gpd
import shapely
gdf = gpd.read_file("parcels.gpkg").to_crs(3857)
# Build one reference from all geometries, then snap each back to it
ref = gdf.geometry.union_all()
gdf["geometry"] = gdf.geometry.apply(
lambda g: shapely.snap(g, ref, tolerance=0.1)
)
gdf["geometry"] = gdf.geometry.make_valid()
Example 3: Precision reduction with a structured grid keep mode
import geopandas as gpd
import shapely
gdf = gpd.read_file("parcels.gpkg").to_crs(3857)
# "valid_output" (default) returns valid geometry; "keep_collapsed"
# preserves collapsed parts as lower-dimension geometries
gdf["geometry"] = shapely.set_precision(
gdf.geometry.values,
grid_size=0.001,
mode="valid_output",
)
print("Invalid after:", (~gdf.is_valid).sum())
Example 4: Absorb slivers into the largest neighbour instead of deleting
import geopandas as gpd
gdf = gpd.read_file("overlay_result.gpkg").to_crs(3857)
gdf = gdf.explode(index_parts=False).reset_index(drop=True)
is_sliver = gdf.geometry.area < 1.0
keep = gdf[~is_sliver].copy()
slivers = gdf[is_sliver]
# Join each sliver to the keeper it shares the longest border with
for idx, sliver in slivers.iterrows():
touching = keep[keep.geometry.touches(sliver.geometry)]
if touching.empty:
continue
shared = touching.geometry.intersection(sliver.geometry).length
target = shared.idxmax()
keep.loc[target, "geometry"] = keep.loc[target, "geometry"].union(
sliver.geometry
)
keep = keep.reset_index(drop=True)
keep["geometry"] = keep.geometry.make_valid()
Example 5: Build a clean coverage with coverage_union
import geopandas as gpd
import shapely
# If your polygons already form a near-coverage (edge-matched, no overlaps),
# coverage_union dissolves them far faster than a generic union and
# preserves the shared internal edges.
gdf = gpd.read_file("districts.gpkg").to_crs(3857)
gdf["geometry"] = shapely.set_precision(gdf.geometry.values, grid_size=0.001)
geoms = shapely.GeometryCollection(gdf.geometry.tolist())
dissolved = shapely.coverage_union(geoms)
print(dissolved.geom_type)
Explanation
Why near-misses happen. Coordinates are stored as double-precision floats. The moment you reproject, round-trip through a file format, or run a constructive operation like a buffer or overlay, the maths introduces tiny errors. Two vertices that should be identical end up at, say, (529384.1200000001, 180233.4399999998) and (529384.12, 180233.44). To you they are the same point; to GEOS they are two distinct points, so the edges between them don't coincide and a sliver appears.
Snapping vs precision reduction. These solve overlapping but different problems:
shapely.snap(geom, reference, tolerance)moves vertices ofgeomtoward a specific reference geometry when they fall withintolerance. It is directional and relational — use it to make one layer conform to another. It does not enforce a global grid, and it can leave self-intersections, so re-validate.shapely.set_precision(geom, grid_size)rounds every coordinate to a fixed grid with no reference geometry. It is global and absolute — use it to make coincident edges across a whole dataset become bit-for-bit identical. It can collapse parts smaller than the grid, which is why it returns valid output by default.
The thinness ratio. For any polygon, area / perimeter^2 is a scale-invariant shape descriptor. A circle maximises it at 1 / (4*pi) ≈ 0.0796; a square sits at 0.0625. A sliver — long, with a huge perimeter relative to its area — drives the ratio toward zero. That makes it a better sliver test than raw area, because it catches long thin artefacts that a pure area filter misses, while ignoring small-but-compact features that are genuinely real.
Edge cases or notes
You must be in a projected CRS for metric tolerances
grid_size, tolerance, area thresholds, and buffer distances are all in the units of your CRS. In EPSG:4326 those units are degrees, so a tolerance=0.5 means half a degree (tens of kilometres). Always to_crs() to a projected, metric system (a local UTM zone, a national grid, or EPSG:3857 as a rough default) before applying any distance- or area-based fix.
A tolerance that is too large merges real features
Snapping and gap-closing are blunt instruments. If your tolerance or grid_size exceeds the size of genuine small features, you will weld neighbours together, delete real narrow polygons, or pull boundaries onto the wrong edge. Start small (millimetres to centimetres), inspect the result, and increase only if needed.
Snapping can create invalid geometry
Both shapely.snap and set_precision can produce self-intersections or zero-area rings as vertices move. Always follow them with make_valid() (GeoPandas: gdf.geometry.make_valid()) and check is_valid.all() before continuing. Skipping this step causes failures further down the pipeline that are hard to trace back.
set_precision can collapse tiny parts
If a polygon part is smaller than grid_size, precision reduction may shrink it to nothing. With the default mode="valid_output" these collapsed parts disappear; with mode="keep_collapsed" they survive as lower-dimensional geometry (a line or point), which can pollute a polygon layer. Choose deliberately and filter out non-polygon types afterwards if needed.
Order of operations matters
Reproject first, then reduce precision / snap, then make_valid, then filter slivers, and only then dissolve or run coverage_union. Filtering slivers before aligning vertices removes artefacts you could have fixed; dissolving before aligning leaves internal boundaries that won't merge. Get the sequence right and each step does its job cleanly.
coverage_union needs a valid coverage
shapely.coverage_union is fast because it assumes a valid polygonal coverage: features that are edge-matched, non-overlapping, and gap-free. Feed it overlapping or misaligned polygons and you get wrong or undefined results. Align with set_precision first; if you are unsure the input is a clean coverage, validate it (shapely.coverage_is_valid) or fall back to union_all().
Always verify topology after
After any cleanup, re-check: no empty geometries, all valid, plausible total area, and the expected feature count. A silent precision change that deletes 200 real parcels looks identical to a successful run until someone audits the numbers. Make verification a fixed last step, not an afterthought.
Internal links
- How to Fix Invalid Geometries in Python (GeoPandas)
- How to Simplify Geometry in Python with GeoPandas and Shapely
- How to Find and Remove Duplicate Geometries in GeoPandas
- How to Fix CRS Mismatch in GeoPandas
- The Python GIS Data Cleaning Checklist
FAQ
What grid_size should I use for set_precision?
Pick a value smaller than the real-world precision of your data but large enough to absorb floating-point noise. For cadastral or survey data in metres, 0.001 (1 mm) is a safe default. Going coarser (e.g. 0.01 or 0.1) closes bigger gaps but risks distorting genuine detail.
What is the difference between snap and set_precision?
shapely.snap aligns one geometry's vertices to a reference geometry within a tolerance — it is relational and directional. shapely.set_precision rounds all coordinates to a fixed grid with no reference — it is global and absolute. Use snap to conform one layer to another; use set_precision to make a whole dataset's coincident edges identical.
Why do I get slivers after gpd.overlay?
Overlay computes geometric intersections, and wherever two boundaries that should coincide differ by floating-point noise, the operation produces a thin polygon along that edge. Reduce precision on both inputs with set_precision before the overlay to make shared edges identical, then filter any remaining slivers by area.
How do I remove slivers without deleting real small features?
Combine an area threshold with a thinness ratio (area / perimeter^2). Real small features tend to be compact (higher ratio), while slivers are long and thin (ratio near zero). Filtering on both keeps genuine small parcels while dropping artefacts. To preserve area, absorb slivers into their largest neighbour instead of deleting them.
Do I have to reproject before snapping?
Yes, if you want tolerances in metres. In a geographic CRS like EPSG:4326, all distances are in degrees, so a tolerance meant to be a few centimetres becomes kilometres. Always to_crs() to a projected metric CRS first, do the cleanup, then reproject back if needed.
My geometries became invalid after snapping — what now?
That is expected: moving vertices can create self-intersections. Run gdf.geometry.make_valid() immediately after snap or set_precision, then confirm with gdf.is_valid.all(). If make_valid returns mixed geometry types, explode and keep only the polygon parts.
When should I use coverage_union instead of union_all?
Use shapely.coverage_union when your polygons already form a clean coverage — edge-matched, non-overlapping, gap-free — because it is much faster and preserves shared internal edges. Use union_all() (or dissolve()) for arbitrary, possibly overlapping geometries. Align with set_precision before either to ensure shared edges actually match.
Can I close gaps with a buffer trick?
Yes — buffering outward by a small distance and then inward by the same distance (a morphological closing) bridges sub-tolerance gaps. Use join_style="mitre" to keep corners sharp, and keep the distance small so you do not distort real geometry. For coincident boundaries, set_precision is usually cleaner because it does not change the shape of the polygons themselves.