How to Snap and Align Geometries to Fix Slivers and Gaps in Python

Adjacent polygons that should share an edge often don't. After an overlay, a union, or a reprojection you end up with a layer that looks fine at full extent but, when you zoom in, is riddled with hairline overlaps and gaps. These artefacts break dissolves, inflate feature counts, and make area statistics wrong. This guide shows you how to find and fix them with shapely and geopandas.

Problem statement

You are dealing with this article if you see any of these symptoms:

  • Thin sliver polygons appear after an overlay() or union — long, skinny shapes with almost no area that sit along the boundary between two real features.
  • Gaps open up between neighbours that are supposed to be coincident — small slivers of empty space where two polygons fail to touch.
  • Vertices that almost-but-not-quite match — two boundaries that should be identical differ by a fraction of a millimetre, so the geometries neither snap together nor dissolve cleanly.
  • A dissolve() that should merge two adjacent polygons into one leaves a visible internal boundary or a tiny hole.
  • Feature counts jump after an intersection because every shared edge spawns one or two extra micro-polygons.

Quick answer

Reduce coordinate precision so near-identical vertices collapse onto a shared grid, then drop the tiny sliver polygons that survive by filtering on area. Always do this in a projected, metric CRS so your grid_size and area thresholds are in metres.

A thin sliver between two misaligned polygons removed by snapping to a shared edge.
Aligning vertices to a shared grid turns near-miss boundaries into a clean shared edge.
import geopandas as gpd
import shapely

gdf = gpd.read_file("parcels.gpkg").to_crs(3857)  # metric CRS

# 1. Align all vertices to a 0.001 m (1 mm) grid
gdf["geometry"] = shapely.set_precision(gdf.geometry.values, grid_size=0.001)

# 2. Repair anything precision reduction broke
gdf["geometry"] = gdf.geometry.make_valid()

# 3. Drop sliver polygons below an area threshold (e.g. 1 m^2)
gdf = gdf[gdf.geometry.area >= 1.0].reset_index(drop=True)

Step-by-step solution

Identify slivers by area and thinness

A sliver is small and thin. Filtering on area alone catches some, but a long thin sliver can have a deceptively large area. Combine area with a thinness ratio (area / perimeter^2): compact shapes approach 1 / (4*pi) (~0.0796), slivers approach zero.

import geopandas as gpd
import numpy as np

gdf = gpd.read_file("overlay_result.gpkg").to_crs(3857)

area = gdf.geometry.area
perim = gdf.geometry.length
thinness = area / (perim ** 2)

# Candidate slivers: tiny area OR very thin
slivers = gdf[(area < 5.0) | (thinness < 0.01)]
print(f"{len(slivers)} candidate slivers out of {len(gdf)} features")

Align vertices with set_precision

shapely.set_precision snaps every coordinate to a fixed grid. When two boundaries differ only by floating-point noise, rounding both to the same grid makes them identical, so shared edges line up exactly.

import geopandas as gpd
import shapely

gdf = gpd.read_file("parcels.gpkg").to_crs(3857)

# Snap all coordinates to a 1 mm grid
gdf["geometry"] = shapely.set_precision(gdf.geometry.values, grid_size=0.001)

# Precision reduction can produce invalid rings — fix them
gdf["geometry"] = gdf.geometry.make_valid()

Snap one layer to another with shapely.snap

When you want vertices of one layer to land exactly on the vertices/edges of a reference layer, use shapely.snap(geom, reference, tolerance). It moves vertices of geom that fall within tolerance of the reference onto it. This is ideal for aligning a roughly-digitised layer to an authoritative boundary.

import geopandas as gpd
import shapely

draft = gpd.read_file("draft_boundaries.gpkg").to_crs(3857)
authoritative = gpd.read_file("admin_boundary.gpkg").to_crs(3857)

ref = authoritative.geometry.union_all()  # single reference geometry

draft["geometry"] = draft.geometry.apply(
    lambda g: shapely.snap(g, ref, tolerance=0.5)  # snap within 0.5 m
)
draft["geometry"] = draft.geometry.make_valid()

Remove sliver polygons after overlay

Overlays are the classic sliver factory. After an intersection or union, filter the result on area (and optionally thinness) to drop the artefacts. Use explode() first so multi-part features are evaluated part by part.

import geopandas as gpd

a = gpd.read_file("layer_a.gpkg").to_crs(3857)
b = gpd.read_file("layer_b.gpkg").to_crs(3857)

result = gpd.overlay(a, b, how="intersection")
result = result.explode(index_parts=False).reset_index(drop=True)

# Drop parts smaller than 1 m^2
result = result[result.geometry.area >= 1.0].reset_index(drop=True)

Close gaps between adjacent polygons

Gaps are the inverse of overlaps: empty slivers where neighbours fail to touch. The most robust fix is to align everything to a grid with set_precision, then dissolve and re-split, or use a small positive-then-negative buffer to bridge sub-tolerance gaps.

import geopandas as gpd
import shapely

gdf = gpd.read_file("districts.gpkg").to_crs(3857)

# Align to grid so coincident edges become identical
gdf["geometry"] = shapely.set_precision(gdf.geometry.values, grid_size=0.01)

# Bridge tiny gaps: buffer out then back in (closing operation)
tol = 0.05  # 5 cm
gdf["geometry"] = gdf.geometry.buffer(tol, join_style="mitre").buffer(
    -tol, join_style="mitre"
)
gdf["geometry"] = gdf.geometry.make_valid()

Verify the result

Always confirm the cleanup worked: check there are no remaining tiny parts, that every geometry is valid, and that the total area is roughly unchanged (you should only have removed noise).

import geopandas as gpd

cleaned = gpd.read_file("cleaned.gpkg").to_crs(3857)

print("All valid:", cleaned.is_valid.all())
print("Min area:", cleaned.geometry.area.min())
print("Empty geoms:", (cleaned.geometry.is_empty).sum())
print("Total area km^2:", cleaned.geometry.area.sum() / 1e6)

Code examples

Example 1: A reusable sliver detector

import geopandas as gpd

def flag_slivers(gdf, min_area=5.0, min_thinness=0.01):
    """Return a boolean Series: True where the feature is a likely sliver."""
    area = gdf.geometry.area
    perim = gdf.geometry.length
    thinness = area / (perim ** 2)
    return (area < min_area) | (thinness < min_thinness)

gdf = gpd.read_file("overlay_result.gpkg").to_crs(3857)
gdf["is_sliver"] = flag_slivers(gdf)
print(gdf["is_sliver"].value_counts())

Example 2: Snap an entire layer to itself to merge near-duplicate edges

import geopandas as gpd
import shapely

gdf = gpd.read_file("parcels.gpkg").to_crs(3857)

# Build one reference from all geometries, then snap each back to it
ref = gdf.geometry.union_all()
gdf["geometry"] = gdf.geometry.apply(
    lambda g: shapely.snap(g, ref, tolerance=0.1)
)
gdf["geometry"] = gdf.geometry.make_valid()

Example 3: Precision reduction with a structured grid keep mode

import geopandas as gpd
import shapely

gdf = gpd.read_file("parcels.gpkg").to_crs(3857)

# "valid_output" (default) returns valid geometry; "keep_collapsed"
# preserves collapsed parts as lower-dimension geometries
gdf["geometry"] = shapely.set_precision(
    gdf.geometry.values,
    grid_size=0.001,
    mode="valid_output",
)
print("Invalid after:", (~gdf.is_valid).sum())

Example 4: Absorb slivers into the largest neighbour instead of deleting

import geopandas as gpd

gdf = gpd.read_file("overlay_result.gpkg").to_crs(3857)
gdf = gdf.explode(index_parts=False).reset_index(drop=True)

is_sliver = gdf.geometry.area < 1.0
keep = gdf[~is_sliver].copy()
slivers = gdf[is_sliver]

# Join each sliver to the keeper it shares the longest border with
for idx, sliver in slivers.iterrows():
    touching = keep[keep.geometry.touches(sliver.geometry)]
    if touching.empty:
        continue
    shared = touching.geometry.intersection(sliver.geometry).length
    target = shared.idxmax()
    keep.loc[target, "geometry"] = keep.loc[target, "geometry"].union(
        sliver.geometry
    )

keep = keep.reset_index(drop=True)
keep["geometry"] = keep.geometry.make_valid()

Example 5: Build a clean coverage with coverage_union

import geopandas as gpd
import shapely

# If your polygons already form a near-coverage (edge-matched, no overlaps),
# coverage_union dissolves them far faster than a generic union and
# preserves the shared internal edges.
gdf = gpd.read_file("districts.gpkg").to_crs(3857)
gdf["geometry"] = shapely.set_precision(gdf.geometry.values, grid_size=0.001)

geoms = shapely.GeometryCollection(gdf.geometry.tolist())
dissolved = shapely.coverage_union(geoms)
print(dissolved.geom_type)

Explanation

Why near-misses happen. Coordinates are stored as double-precision floats. The moment you reproject, round-trip through a file format, or run a constructive operation like a buffer or overlay, the maths introduces tiny errors. Two vertices that should be identical end up at, say, (529384.1200000001, 180233.4399999998) and (529384.12, 180233.44). To you they are the same point; to GEOS they are two distinct points, so the edges between them don't coincide and a sliver appears.

Snapping vs precision reduction. These solve overlapping but different problems:

  • shapely.snap(geom, reference, tolerance) moves vertices of geom toward a specific reference geometry when they fall within tolerance. It is directional and relational — use it to make one layer conform to another. It does not enforce a global grid, and it can leave self-intersections, so re-validate.
  • shapely.set_precision(geom, grid_size) rounds every coordinate to a fixed grid with no reference geometry. It is global and absolute — use it to make coincident edges across a whole dataset become bit-for-bit identical. It can collapse parts smaller than the grid, which is why it returns valid output by default.

The thinness ratio. For any polygon, area / perimeter^2 is a scale-invariant shape descriptor. A circle maximises it at 1 / (4*pi) ≈ 0.0796; a square sits at 0.0625. A sliver — long, with a huge perimeter relative to its area — drives the ratio toward zero. That makes it a better sliver test than raw area, because it catches long thin artefacts that a pure area filter misses, while ignoring small-but-compact features that are genuinely real.

Edge cases or notes

You must be in a projected CRS for metric tolerances

grid_size, tolerance, area thresholds, and buffer distances are all in the units of your CRS. In EPSG:4326 those units are degrees, so a tolerance=0.5 means half a degree (tens of kilometres). Always to_crs() to a projected, metric system (a local UTM zone, a national grid, or EPSG:3857 as a rough default) before applying any distance- or area-based fix.

A tolerance that is too large merges real features

Snapping and gap-closing are blunt instruments. If your tolerance or grid_size exceeds the size of genuine small features, you will weld neighbours together, delete real narrow polygons, or pull boundaries onto the wrong edge. Start small (millimetres to centimetres), inspect the result, and increase only if needed.

Snapping can create invalid geometry

Both shapely.snap and set_precision can produce self-intersections or zero-area rings as vertices move. Always follow them with make_valid() (GeoPandas: gdf.geometry.make_valid()) and check is_valid.all() before continuing. Skipping this step causes failures further down the pipeline that are hard to trace back.

set_precision can collapse tiny parts

If a polygon part is smaller than grid_size, precision reduction may shrink it to nothing. With the default mode="valid_output" these collapsed parts disappear; with mode="keep_collapsed" they survive as lower-dimensional geometry (a line or point), which can pollute a polygon layer. Choose deliberately and filter out non-polygon types afterwards if needed.

Order of operations matters

Reproject first, then reduce precision / snap, then make_valid, then filter slivers, and only then dissolve or run coverage_union. Filtering slivers before aligning vertices removes artefacts you could have fixed; dissolving before aligning leaves internal boundaries that won't merge. Get the sequence right and each step does its job cleanly.

coverage_union needs a valid coverage

shapely.coverage_union is fast because it assumes a valid polygonal coverage: features that are edge-matched, non-overlapping, and gap-free. Feed it overlapping or misaligned polygons and you get wrong or undefined results. Align with set_precision first; if you are unsure the input is a clean coverage, validate it (shapely.coverage_is_valid) or fall back to union_all().

Always verify topology after

After any cleanup, re-check: no empty geometries, all valid, plausible total area, and the expected feature count. A silent precision change that deletes 200 real parcels looks identical to a successful run until someone audits the numbers. Make verification a fixed last step, not an afterthought.

FAQ

What grid_size should I use for set_precision?

Pick a value smaller than the real-world precision of your data but large enough to absorb floating-point noise. For cadastral or survey data in metres, 0.001 (1 mm) is a safe default. Going coarser (e.g. 0.01 or 0.1) closes bigger gaps but risks distorting genuine detail.

What is the difference between snap and set_precision?

shapely.snap aligns one geometry's vertices to a reference geometry within a tolerance — it is relational and directional. shapely.set_precision rounds all coordinates to a fixed grid with no reference — it is global and absolute. Use snap to conform one layer to another; use set_precision to make a whole dataset's coincident edges identical.

Why do I get slivers after gpd.overlay?

Overlay computes geometric intersections, and wherever two boundaries that should coincide differ by floating-point noise, the operation produces a thin polygon along that edge. Reduce precision on both inputs with set_precision before the overlay to make shared edges identical, then filter any remaining slivers by area.

How do I remove slivers without deleting real small features?

Combine an area threshold with a thinness ratio (area / perimeter^2). Real small features tend to be compact (higher ratio), while slivers are long and thin (ratio near zero). Filtering on both keeps genuine small parcels while dropping artefacts. To preserve area, absorb slivers into their largest neighbour instead of deleting them.

Do I have to reproject before snapping?

Yes, if you want tolerances in metres. In a geographic CRS like EPSG:4326, all distances are in degrees, so a tolerance meant to be a few centimetres becomes kilometres. Always to_crs() to a projected metric CRS first, do the cleanup, then reproject back if needed.

My geometries became invalid after snapping — what now?

That is expected: moving vertices can create self-intersections. Run gdf.geometry.make_valid() immediately after snap or set_precision, then confirm with gdf.is_valid.all(). If make_valid returns mixed geometry types, explode and keep only the polygon parts.

When should I use coverage_union instead of union_all?

Use shapely.coverage_union when your polygons already form a clean coverage — edge-matched, non-overlapping, gap-free — because it is much faster and preserves shared internal edges. Use union_all() (or dissolve()) for arbitrary, possibly overlapping geometries. Align with set_precision before either to ensure shared edges actually match.

Can I close gaps with a buffer trick?

Yes — buffering outward by a small distance and then inward by the same distance (a morphological closing) bridges sub-tolerance gaps. Use join_style="mitre" to keep corners sharp, and keep the distance small so you do not distort real geometry. For coincident boundaries, set_precision is usually cleaner because it does not change the shape of the polygons themselves.