GeoPandas "Columns Overlap but No Suffix Specified" Error: How to Fix It

Problem statement

A common GeoPandas task is combining two tables, such as joining attribute data to a GeoDataFrame, merging summary statistics onto regions, or attaching polygon attributes to points with a spatial join. These operations use merge(), join(), or sjoin().

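The problem is that join() raises an error when both frames share one or more non-key column names and no suffix is given. merge() raises the same error only when its suffixes are empty or None, because its default suffixes=("_x", "_y") otherwise renames the duplicates: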

ValueError: columns overlap but no suffix specified: Index(['name', 'value'], dtype='object')

It usually appears in one of these situations:

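  • gdf.merge(df, on="id", suffixes=("", "")) where both frames have a column like name and the suffixes are empty or None
  • left.join(right) where both frames share a column name and neither lsuffix nor rsuffix is set
  • gpd.sjoin(left, right, predicate="within") where sjoin adds an index_right column that collides with an existing one; sjoin() then raises a closely related ValueError whose wording depends on the GeoPandas version. Shared attribute names alone are renamed with the default "left" and "right" suffixes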
  • repeated joins that keep colliding because earlier suffixed columns were never cleaned up

The cause is always the same: pandas found duplicate column names in the two inputs and refused to guess how to disambiguate them. The fix is to tell it how, or to remove the overlap before joining.
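A minimal reproduction may help here. It uses small made-up frames in place of real data (the names left and right and their values are illustrative assumptions) and contrasts join(), whose suffixes default to empty strings, with merge(), whose suffixes default to ("_x", "_y"):

```python
import pandas as pd

# Hypothetical frames that share the non-key column "name".
left = pd.DataFrame({"id": [1, 2], "name": ["A", "B"]}).set_index("id")
right = pd.DataFrame(
    {"id": [1, 2], "name": ["a", "b"], "value": [10, 20]}
).set_index("id")

# join() defaults to lsuffix="" and rsuffix="", so the shared column raises.
try:
    left.join(right)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# merge() defaults to suffixes=("_x", "_y"), so the same overlap is renamed.
merged = left.reset_index().merge(right.reset_index(), on="id")
print(list(merged.columns))
```

If the error shows up in a merge() call, it is worth checking whether empty or None suffixes were passed somewhere upstream.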

Common causes:

  • both frames have identically named non-key columns
  • you left lsuffix/rsuffix unset on join(), or passed empty suffixes= to merge()
  • sjoin() added index_right and a previous index_right already exists
  • you carried extra columns you did not need into the join

Quick answer

If merge(), join(), or sjoin() raises columns overlap but no suffix specified:

  1. find the overlapping column names with .columns.intersection()
  2. pass suffixes=("_left", "_right") to merge() (or lsuffix=/rsuffix= to join())
  3. or rename the duplicate columns before joining with .rename()
  4. or drop the columns you do not need with .drop(columns=[...])
  5. for sjoin(), watch for the extra index_right column and drop or rename it

The most direct fix is supplying suffixes:

import geopandas as gpd
import pandas as pd

zones = gpd.read_file("data/zones.gpkg")          # has columns: id, name, geometry
stats = pd.read_csv("data/zone_stats.csv")        # has columns: id, name, value

merged = zones.merge(stats, on="id", suffixes=("_zone", "_stats"))
print(merged.columns)

The shared name column becomes name_zone and name_stats, and the join succeeds.

Choosing the fix

Flowchart: Choosing the fix for a columns-overlap error.

Step-by-step solution

Find which columns actually overlap

Before fixing anything, list the shared column names so you know what is colliding.

overlap = zones.columns.intersection(stats.columns)
print("Overlapping columns:", list(overlap))

This includes any key you are joining on. If you merge with on="id", the id column is consumed by the join and does not need a suffix. The columns that trigger the error are the non-key shared columns.
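To separate the key from the columns that actually need attention, one option is to subtract the key from the intersection. The sketch below uses small made-up frames as stand-ins for zones and stats; the variable names key and non_key_overlap are illustrative, not part of any API:

```python
import pandas as pd

# Hypothetical stand-ins for the zones and stats tables.
zones = pd.DataFrame({"id": [1, 2], "name": ["North", "South"]})
stats = pd.DataFrame({"id": [1, 2], "name": ["N", "S"], "value": [10, 20]})

key = "id"
overlap = zones.columns.intersection(stats.columns)

# Remove the join key: only the remaining shared names need a suffix,
# a rename, or a drop.
non_key_overlap = overlap.difference([key])
print("Need attention:", list(non_key_overlap))
```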

Add suffixes to merge()

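The simplest fix is to let pandas rename the colliding columns for you. merge() falls back to "_x" and "_y" when suffixes is omitted, so passing explicit suffixes mainly buys readable names and replaces any empty suffixes that triggered the error.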

merged = zones.merge(
    stats,
    on="id",
    suffixes=("_zone", "_stats"),
)
print(merged[["name_zone", "name_stats"]].head())

Choose suffixes that describe the source, such as ("_zone", "_stats"), so the result is readable later. The geometry column stays on the left GeoDataFrame and is preserved.

Use lsuffix and rsuffix for .join()

DataFrame.join() does not accept suffixes. It uses separate lsuffix and rsuffix arguments, and it joins on the index by default.

# both indexed by the same key
left = zones.set_index("id")
right = stats.set_index("id")

joined = left.join(right, lsuffix="_zone", rsuffix="_stats")
print(joined.columns)

If you leave both suffixes empty and the frames share a column, .join() raises the same overlap error. Provide at least one suffix.
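As a small illustration of the "at least one suffix" point, the sketch below sets only rsuffix, using made-up frames indexed by id. Under that assumption the left column keeps its original label and only the right-hand duplicate is renamed:

```python
import pandas as pd

# Hypothetical frames, both indexed by the same key.
left = pd.DataFrame(
    {"name": ["North", "South"]},
    index=pd.Index([1, 2], name="id"),
)
right = pd.DataFrame(
    {"name": ["N", "S"], "value": [10, 20]},
    index=pd.Index([1, 2], name="id"),
)

# Only the right-hand duplicate is renamed; the left "name" keeps its label.
joined = left.join(right, rsuffix="_stats")
print(list(joined.columns))
```

This can be handy when the left frame is the "main" table and downstream code already expects its column names.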

Rename duplicate columns before joining

If you want clean, explicit names instead of generated suffixes, rename the columns on one side first.

stats = stats.rename(columns={"name": "stats_name", "value": "stats_value"})

merged = zones.merge(stats, on="id")  # no suffixes needed now
print(merged.columns)

After renaming, there is no overlap, so merge() and join() work without suffix arguments.

Drop columns you do not need before joining

Often the overlapping column on one side is redundant. Dropping it removes the conflict and keeps the result narrow.

# keep only the key and the columns you actually want from stats
stats_slim = stats[["id", "value"]]

merged = zones.merge(stats_slim, on="id")
print(merged.columns)

Selecting only needed columns is usually cleaner than carrying everything and suffixing afterward.
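When the list of columns is long, the redundant ones can also be dropped programmatically rather than by hand. This is a sketch under the assumption that every shared non-key column on the stats side is redundant, which you should confirm for your own data; the frames and the name redundant are illustrative:

```python
import pandas as pd

# Hypothetical stand-ins for the zones and stats tables.
zones = pd.DataFrame({"id": [1, 2], "name": ["North", "South"]})
stats = pd.DataFrame({"id": [1, 2], "name": ["N", "S"], "value": [10, 20]})

key = "id"

# Shared names other than the key are treated as redundant on the stats side.
redundant = zones.columns.intersection(stats.columns).difference([key])
stats_slim = stats.drop(columns=redundant)

merged = zones.merge(stats_slim, on=key)
print(list(merged.columns))
```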

Handle the index_right column from sjoin()

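gpd.sjoin() attaches the right frame's index as a new column, named index_right by default. If your data already has a column called index_right, or if you run sjoin() on the output of an earlier sjoin(), the new column collides and sjoin() raises a ValueError (the exact message varies by GeoPandas version).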

points = gpd.read_file("data/points.gpkg")
zones = gpd.read_file("data/zones.gpkg")

joined = gpd.sjoin(points, zones, how="left", predicate="within")

# drop the index column sjoin added if you do not need it
joined = joined.drop(columns=["index_right"])
print(joined.columns)
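The repeated-join case can be sketched with tiny in-memory layers. Everything here is made up for illustration (the layers points, zones, and regions, and their columns), and the exact error text depends on the GeoPandas version, so the sketch only checks that a ValueError is raised:

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Tiny hypothetical layers; real data would come from gpd.read_file(...).
points = gpd.GeoDataFrame(
    {"pid": [1, 2]}, geometry=[Point(0.5, 0.5), Point(1.5, 0.5)]
)
zones = gpd.GeoDataFrame(
    {"zone": ["west", "east"]}, geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)]
)
regions = gpd.GeoDataFrame({"region": ["all"]}, geometry=[box(0, 0, 2, 1)])

first = gpd.sjoin(points, zones, how="left", predicate="within")

# A second sjoin on the result collides with the leftover index_right column.
try:
    gpd.sjoin(first, regions, how="left", predicate="within")
    collided = False
except ValueError:
    collided = True

# Dropping the leftover column first lets the second join run.
second = gpd.sjoin(
    first.drop(columns=["index_right"]),
    regions,
    how="left",
    predicate="within",
)
print(collided, list(second.columns))
```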

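If both layers share attribute names such as name, sjoin() renames them with lsuffix and rsuffix, which default to "left" and "right". The suffix is joined with an underscore, and the index column follows rsuffix too, so the call below should produce name_pt, name_zone, and index_zone: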

joined = gpd.sjoin(
    points,
    zones,
    how="left",
    predicate="within",
    lsuffix="pt",
    rsuffix="zone",
)
print(joined.columns)
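To check which names a given pair of suffixes produces, a quick experiment on in-memory data can help. The layers below are made up, and the sketch assumes an unnamed right index, in which case the index column is named after rsuffix:

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Tiny hypothetical layers that share the attribute column "name".
points = gpd.GeoDataFrame(
    {"name": ["p1", "p2"]}, geometry=[Point(0.5, 0.5), Point(1.5, 0.5)]
)
zones = gpd.GeoDataFrame(
    {"name": ["west", "east"]}, geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)]
)

joined = gpd.sjoin(
    points, zones, how="left", predicate="within", lsuffix="pt", rsuffix="zone"
)

# List every column whose name came from the suffix arguments.
suffixed = sorted(c for c in joined.columns if c.endswith(("_pt", "_zone")))
print(suffixed)
```

Note that the column to drop afterwards is then index_zone rather than index_right.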

Code examples

Example 1: merge a CSV onto a GeoDataFrame with suffixes

A typical workflow attaches tabular statistics to zone polygons that happen to share a name column.

import geopandas as gpd
import pandas as pd

zones = gpd.read_file("data/zones.gpkg")        # id, name, geometry
stats = pd.read_csv("data/zone_stats.csv")      # id, name, value

overlap = zones.columns.intersection(stats.columns)
print("Overlap:", list(overlap))

merged = zones.merge(stats, on="id", suffixes=("_zone", "_stats"))
print(merged.head())

Example 2: rename before merging for clean column names

When you control the column names, renaming is clearer than reading generated suffixes later.

import geopandas as gpd
import pandas as pd

zones = gpd.read_file("data/zones.gpkg")
stats = pd.read_csv("data/zone_stats.csv")

stats = stats.rename(columns={"name": "stat_name", "value": "population"})

merged = zones.merge(stats, on="id")
print(merged.columns)

Example 3: spatial join with shared columns and index_right

This handles both the attribute collision and the index column that sjoin() adds.

import geopandas as gpd

points = gpd.read_file("data/points.gpkg")      # id, name, geometry
zones = gpd.read_file("data/zones.gpkg")        # id, name, geometry

joined = gpd.sjoin(
    points,
    zones,
    how="left",
    predicate="within",
    lsuffix="pt",
    rsuffix="zone",
)

# remove the right layer's index column if it is not needed;
# with rsuffix="zone" it is named index_zone, not index_right
if "index_zone" in joined.columns:
    joined = joined.drop(columns=["index_zone"])

print(joined.columns)

Example 4: drop redundant columns before a join

Selecting only the columns you need avoids both the error and a wide, confusing result.

import geopandas as gpd
import pandas as pd

zones = gpd.read_file("data/zones.gpkg")
stats = pd.read_csv("data/zone_stats.csv")

stats_slim = stats[["id", "value"]]            # drop the duplicate name column

merged = zones.merge(stats_slim, on="id")
print(merged[["id", "name", "value"]].head())

Explanation

When you combine two tables, pandas places columns from both inputs side by side in the result. If both inputs contain a column with the same name, the result would have two columns called, for example, name. That is ambiguous: later code that asks for df["name"] would not know which one to return.

Rather than silently keep one and drop the other, or produce duplicated labels that break selection, pandas raises ValueError: columns overlap but no suffix specified and lists the colliding names. It is asking you to resolve the ambiguity.

You can resolve it in two ways. Either tell pandas how to rename the duplicates, with suffixes= on merge() or lsuffix/rsuffix on join(), or remove the overlap yourself by renaming or dropping columns before the join. The join key itself is not a problem, because merge(on=...) consumes it into a single column.

sjoin() adds one extra wrinkle: it appends the right frame's index as a column, named index_right by default or index_<rsuffix> when you pass a custom rsuffix. That column can collide with an existing column or with the index column left behind by a previous sjoin(), so a clean spatial workflow usually drops or renames it between steps.

Edge cases or notes

  • sjoin adds an index column by default: Even when no attribute columns overlap, sjoin() creates an index_right column when called with the default rsuffix. Drop it with .drop(columns=["index_right"]) if you do not need it, especially before a second join.
  • Repeated joins accumulate suffixes: Joining the same frames more than once can produce columns like value_left_left. Clean up or rename columns between joins instead of stacking suffixes.
  • .join vs .merge differences: .merge() joins on columns and uses suffixes=, defaulting to ("_x", "_y"). .join() joins on the index by default and uses lsuffix/rsuffix, both empty by default. They are not interchangeable argument-for-argument.
  • Only non-key columns trigger the error: The column named in on= is shared deliberately and is merged into one. The error lists the other shared columns.
  • Empty suffixes trigger the error in merge(): Passing suffixes=("", "") or suffixes=(None, None) raises the error when real duplicates exist, because pandas has nothing to rename them with. Use distinct suffixes or drop a column.
  • Geometry column on both sides: If both inputs are GeoDataFrames with a geometry column, merging them can create geometry_x/geometry_y. Keep geometry on one side only, or use sjoin() for spatial combination.
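The geometry note above can be sketched with two made-up GeoDataFrames. The frames and their values are illustrative assumptions; the point is only to show the split into geometry_x/geometry_y and one way to avoid it by taking attribute columns alone from the right side:

```python
import geopandas as gpd
from shapely.geometry import Point

# Two hypothetical GeoDataFrames that both carry a geometry column.
left = gpd.GeoDataFrame({"id": [1, 2]}, geometry=[Point(0, 0), Point(1, 1)])
right = gpd.GeoDataFrame(
    {"id": [1, 2], "value": [10, 20]}, geometry=[Point(0, 0), Point(1, 1)]
)

# Default merge suffixes split the geometry into geometry_x / geometry_y.
both = left.merge(right, on="id")
print(list(both.columns))

# Keeping only the attribute columns from the right side avoids the split.
right_attrs = right[["id", "value"]]
clean = left.merge(right_attrs, on="id")
print(list(clean.columns))
```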

FAQ

Why does merge() complain about overlapping columns but not the join key?

The join key named in on= is merged into a single shared column, so it is not ambiguous. The error lists only the other columns that appear in both frames with the same name.

What is the difference between suffixes and lsuffix/rsuffix?

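merge() takes a single suffixes tuple, which defaults to ("_x", "_y"), while join() takes separate lsuffix and rsuffix string arguments, which both default to an empty string. They do the same renaming but belong to different methods, and only join() fails on an overlap out of the box.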

How do I remove the index_right column after sjoin()?

Drop it with joined = joined.drop(columns=["index_right"]), or drop index_<rsuffix> if you passed a custom rsuffix. The column holds the matched index from the right layer, which you usually do not need once the join is done.

Should I add suffixes or drop the duplicate columns?

Drop or rename when one side's column is redundant, since that keeps the result narrow and readable. Use suffixes when you genuinely need both versions of the column in the output.

Does concat raise the same overlapping-columns error?

No. pd.concat() stacks frames and allows duplicate column labels, so it will not raise this error. The error comes from join() and from merge() with empty suffixes, which place columns side by side and need a way to keep the names unique.
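A short sketch of what concat does instead, using made-up frames. It shows why the duplicated labels it allows can still be awkward to work with later:

```python
import pandas as pd

# Hypothetical frames with identical column names.
a = pd.DataFrame({"id": [1, 2], "name": ["North", "South"]})
b = pd.DataFrame({"id": [1, 2], "name": ["N", "S"]})

# Side-by-side concat keeps both sets of labels, so each name appears twice.
wide = pd.concat([a, b], axis=1)
print(list(wide.columns))

# Selecting a duplicated label returns a DataFrame with both columns,
# not a single Series.
selected = wide["name"]
print(type(selected).__name__, selected.shape)
```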

How do I rename suffixed columns like name_left and name_right after the join?

Use .rename(columns={...}) once the join is done, for example merged.rename(columns={"name_left": "zone_name", "name_right": "stat_name"}). This gives the result readable labels instead of generated suffixes.

Why do I get geometry_x and geometry_y after merging two GeoDataFrames?

Both inputs carried a geometry column, so merge() suffixed them like any other duplicate. Keep geometry on one side only by selecting non-geometry columns from the other frame before merging, or use sjoin() for a spatial combination.

Can I keep the right layer's index from sjoin() instead of dropping index_right?

Yes. The index_right column holds the matched index from the right layer, so you can keep it and rename it to something meaningful like district_idx if you need to trace matches back. Drop it only when it is not useful downstream.