GeoPandas "Columns Overlap but No Suffix Specified" Error: How to Fix It
Problem statement
A common GeoPandas task is combining two tables, such as joining attribute data to a GeoDataFrame, merging summary statistics onto regions, or attaching polygon attributes to points with a spatial join. These operations use merge(), join(), or sjoin().
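The problem is that a join can fail when both frames share one or more non-key column names. join() raises the error below whenever no lsuffix or rsuffix is given; merge() falls back to _x and _y suffixes by default and raises it only when suffixes are disabled, for example with suffixes=(None, None):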
ValueError: columns overlap but no suffix specified: Index(['name', 'value'], dtype='object')
It usually appears in one of these situations:
- left.join(right) where both frames share a column name and no lsuffix or rsuffix is given
- gdf.merge(df, on="id", suffixes=(None, None)) where both frames have a column like name and the default suffixes were turned off
- gpd.sjoin(left, right, predicate="within") where sjoin adds an index_right column that collides with an existing one (shared attribute names are normally suffixed automatically)
- repeated joins that keep colliding because earlier suffixed columns were never cleaned up
The cause is always the same: pandas found duplicate column names in the two inputs and refused to guess how to disambiguate them. The fix is to tell it how, or to remove the overlap before joining.
Common causes:
- both frames have identically named non-key columns
- you left out lsuffix/rsuffix on join(), or passed empty suffixes= to merge()
- sjoin() added index_right and a previous index_right already exists
- you carried extra columns you did not need into the join
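The behaviour described above can be reproduced with two small in-memory tables. This is a minimal sketch, assuming a recent pandas release; the zones and stats frames are hypothetical stand-ins for the files used later in this article.

```python
import pandas as pd

# Hypothetical stand-ins for the zones and stats tables.
zones = pd.DataFrame({"id": [1, 2], "name": ["North", "South"]})
stats = pd.DataFrame({"id": [1, 2], "name": ["N", "S"], "value": [10, 20]})

# join() has empty default suffixes, so shared names raise immediately.
try:
    zones.join(stats)
    join_error = None
except ValueError as exc:
    join_error = str(exc)
print(join_error)

# merge() defaults to suffixes=("_x", "_y"), so the same overlap is renamed.
merged_default = zones.merge(stats, on="id")
print(list(merged_default.columns))

# Disabling the suffixes makes merge() raise the same error as join().
try:
    zones.merge(stats, on="id", suffixes=(None, None))
    merge_error = None
except ValueError as exc:
    merge_error = str(exc)
print(merge_error)
```

If the message shows up in your own code, checking which of these two paths you are on is usually the fastest way to pick a fix.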
Quick answer
If merge(), join(), or sjoin() raises columns overlap but no suffix specified:
- find the overlapping column names with .columns.intersection()
- pass suffixes=("_left", "_right") to merge() (or lsuffix=/rsuffix= to join())
- or rename the duplicate columns before joining with .rename()
- or drop the columns you do not need with .drop(columns=[...])
- for sjoin(), watch for the extra index_right column and drop or rename it
The most direct fix is supplying suffixes:
import geopandas as gpd
import pandas as pd
zones = gpd.read_file("data/zones.gpkg") # has columns: id, name, geometry
stats = pd.read_csv("data/zone_stats.csv") # has columns: id, name, value
merged = zones.merge(stats, on="id", suffixes=("_zone", "_stats"))
print(merged.columns)
The shared name column becomes name_zone and name_stats, and the join succeeds.
Choosing the fix
Step-by-step solution
Find which columns actually overlap
Before fixing anything, list the shared column names so you know what is colliding.
overlap = zones.columns.intersection(stats.columns)
print("Overlapping columns:", list(overlap))
This includes any key you are joining on. If you merge with on="id", the id column is consumed by the join and does not need a suffix. The columns that trigger the error are the non-key shared columns.
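To separate the join key from the columns that actually collide, the key can be subtracted from the intersection. A small sketch, assuming the key is called id and using hypothetical in-memory tables:

```python
import pandas as pd

# Hypothetical stand-ins for the zones and stats tables.
zones = pd.DataFrame({"id": [1, 2], "name": ["North", "South"]})
stats = pd.DataFrame({"id": [1, 2], "name": ["N", "S"], "value": [10, 20]})

key = "id"  # assumed join key

# Every column name present in both frames, key included.
shared = zones.columns.intersection(stats.columns)

# Only the shared names that are not the key need a suffix, rename, or drop.
colliding = shared.difference([key])

print("Shared:", list(shared))
print("Colliding non-key columns:", list(colliding))
```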
Add suffixes to merge()
The simplest fix is to let pandas rename the colliding columns for you.
merged = zones.merge(
    stats,
    on="id",
    suffixes=("_zone", "_stats"),
)
print(merged[["name_zone", "name_stats"]].head())
Choose suffixes that describe the source, such as ("_zone", "_stats"), so the result is readable later. The geometry column stays on the left GeoDataFrame and is preserved.
Use lsuffix and rsuffix for .join()
DataFrame.join() does not accept suffixes. It uses separate lsuffix and rsuffix arguments, and it joins on the index by default.
# both indexed by the same key
left = zones.set_index("id")
right = stats.set_index("id")
joined = left.join(right, lsuffix="_zone", rsuffix="_stats")
print(joined.columns)
If you leave both suffixes empty and the frames share a column, .join() raises the same overlap error. Provide at least one suffix.
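Supplying only one suffix is enough to break the tie. The sketch below, with hypothetical in-memory tables, passes rsuffix alone so the left-hand column keeps its original name:

```python
import pandas as pd

# Hypothetical tables, both indexed by the same key.
left = pd.DataFrame({"id": [1, 2], "name": ["North", "South"]}).set_index("id")
right = pd.DataFrame(
    {"id": [1, 2], "name": ["N", "S"], "value": [10, 20]}
).set_index("id")

# Only the right-hand duplicate is renamed; the left "name" stays as-is.
joined = left.join(right, rsuffix="_stats")
print(list(joined.columns))
```

This can be convenient when the left frame is the primary table and downstream code already expects its column names.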
Rename duplicate columns before joining
If you want clean, explicit names instead of generated suffixes, rename the columns on one side first.
stats = stats.rename(columns={"name": "stats_name", "value": "stats_value"})
merged = zones.merge(stats, on="id") # no suffixes needed now
print(merged.columns)
After renaming, there is no overlap, so merge() and join() work without suffix arguments.
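When many columns collide, the rename mapping can be built from the overlap instead of typed by hand. The helper below, prefix_overlap, is a hypothetical name introduced only for this sketch; it is not part of pandas or GeoPandas.

```python
import pandas as pd


def prefix_overlap(right, left, keys, prefix):
    """Hypothetical helper: prefix right-hand columns that also exist in left."""
    clash = right.columns.intersection(left.columns).difference(keys)
    return right.rename(columns={col: f"{prefix}{col}" for col in clash})


# Hypothetical stand-ins for the zones and stats tables.
zones = pd.DataFrame({"id": [1, 2], "name": ["North", "South"]})
stats = pd.DataFrame({"id": [1, 2], "name": ["N", "S"], "value": [10, 20]})

# Only "name" clashes, so only "name" is prefixed; "value" is left alone.
stats_renamed = prefix_overlap(stats, zones, keys=["id"], prefix="stats_")
merged = zones.merge(stats_renamed, on="id")
print(list(merged.columns))
```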
Drop columns you do not need before joining
Often the overlapping column on one side is redundant. Dropping it removes the conflict and keeps the result narrow.
# keep only the key and the columns you actually want from stats
stats_slim = stats[["id", "value"]]
merged = zones.merge(stats_slim, on="id")
print(merged.columns)
Selecting only needed columns is usually cleaner than carrying everything and suffixing afterward.
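The same idea can be applied without listing the columns to keep: compute the overlap and drop it from one side. A sketch with hypothetical tables, assuming the right-hand duplicates are the redundant ones:

```python
import pandas as pd

# Hypothetical stand-ins for the zones and stats tables.
zones = pd.DataFrame({"id": [1, 2], "name": ["North", "South"]})
stats = pd.DataFrame({"id": [1, 2], "name": ["N", "S"], "value": [10, 20]})

# Shared non-key columns on the right side are treated as redundant here.
redundant = stats.columns.intersection(zones.columns).difference(["id"])
stats_slim = stats.drop(columns=redundant)

# With the overlap gone, join() works without lsuffix or rsuffix.
joined = zones.set_index("id").join(stats_slim.set_index("id"))
print(list(joined.columns))
```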
Handle the index_right column from sjoin()
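gpd.sjoin() attaches the right frame's index as a new column, named index_right with the default rsuffix. If your data already has a column called index_right, or if you run sjoin() twice, the new column collides. Depending on the GeoPandas version, this collision may be reported as a ValueError that names index_right rather than the overlap message above.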
points = gpd.read_file("data/points.gpkg")
zones = gpd.read_file("data/zones.gpkg")
joined = gpd.sjoin(points, zones, how="left", predicate="within")
# drop the index column sjoin added if you do not need it
joined = joined.drop(columns=["index_right"])
print(joined.columns)
If both layers share attribute names such as name, sjoin() also supports suffixes through lsuffix and rsuffix. The values are appended after an underscore, so the call below produces name_pt and name_zone:
joined = gpd.sjoin(
    points,
    zones,
    how="left",
    predicate="within",
    lsuffix="pt",
    rsuffix="zone",
)
print(joined.columns)
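The spatial join above can be tried without any files. This is a sketch with made-up geometries, assuming a GeoPandas version that accepts the predicate argument; because the name of the right-index column depends on rsuffix and on the GeoPandas version, the cleanup step matches it by prefix instead of hard-coding it.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical layers that share a "name" attribute.
points = gpd.GeoDataFrame(
    {"name": ["a", "b"]},
    geometry=[Point(0.5, 0.5), Point(5, 5)],
)
zones = gpd.GeoDataFrame(
    {"name": ["Zone 1"]},
    geometry=[box(0, 0, 1, 1)],
)

joined = gpd.sjoin(
    points,
    zones,
    how="left",
    predicate="within",
    lsuffix="pt",
    rsuffix="zone",
)

# Remove whatever index_* column sjoin added for the right layer.
index_cols = [col for col in joined.columns if col.startswith("index_")]
joined = joined.drop(columns=index_cols)
print(list(joined.columns))
```

Point b lies outside the only zone, so with how="left" it is kept and its name_zone value is missing.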
Code examples
Example 1: merge a CSV onto a GeoDataFrame with suffixes
A typical workflow attaches tabular statistics to zone polygons that happen to share a name column.
import geopandas as gpd
import pandas as pd
zones = gpd.read_file("data/zones.gpkg") # id, name, geometry
stats = pd.read_csv("data/zone_stats.csv") # id, name, value
overlap = zones.columns.intersection(stats.columns)
print("Overlap:", list(overlap))
merged = zones.merge(stats, on="id", suffixes=("_zone", "_stats"))
print(merged.head())
Example 2: rename before merging for clean column names
When you control the column names, renaming is clearer than reading generated suffixes later.
import geopandas as gpd
import pandas as pd
zones = gpd.read_file("data/zones.gpkg")
stats = pd.read_csv("data/zone_stats.csv")
stats = stats.rename(columns={"name": "stat_name", "value": "population"})
merged = zones.merge(stats, on="id")
print(merged.columns)
Example 3: spatial join with shared columns and index_right
This handles both the attribute collision and the index column that sjoin() adds.
import geopandas as gpd
points = gpd.read_file("data/points.gpkg") # id, name, geometry
zones = gpd.read_file("data/zones.gpkg") # id, name, geometry
joined = gpd.sjoin(
    points,
    zones,
    how="left",
    predicate="within",
    lsuffix="pt",
    rsuffix="zone",
)
# sjoin() names the right-index column after rsuffix (index_zone here); remove it if it is not needed
joined = joined.drop(columns=["index_zone", "index_right"], errors="ignore")
print(joined.columns)
Example 4: drop redundant columns before a join
Selecting only the columns you need avoids both the error and a wide, confusing result.
import geopandas as gpd
import pandas as pd
zones = gpd.read_file("data/zones.gpkg")
stats = pd.read_csv("data/zone_stats.csv")
stats_slim = stats[["id", "value"]] # drop the duplicate name column
merged = zones.merge(stats_slim, on="id")
print(merged[["id", "name", "value"]].head())
Explanation
When you combine two tables, pandas places columns from both inputs side by side in the result. If both inputs contain a column with the same name, the result would have two columns called, for example, name. That is ambiguous: later code that asks for df["name"] would not know which one to return.
Rather than silently keep one and drop the other, or produce duplicated labels that break selection, pandas raises ValueError: columns overlap but no suffix specified and lists the colliding names. It is asking you to resolve the ambiguity.
You can resolve it in two ways. Either tell pandas how to rename the duplicates, with suffixes= on merge() or lsuffix/rsuffix on join(), or remove the overlap yourself by renaming or dropping columns before the join. The join key itself is not a problem, because merge(on=...) consumes it into a single column.
sjoin() adds one extra wrinkle: it appends the right frame's index as a column named index_right. That column can collide with an existing column or with the index column left behind by a previous sjoin(), so a clean spatial workflow usually drops or renames it between steps.
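The ambiguity that the error protects against is easy to see by forcing duplicate labels. The sketch below uses pd.concat with axis=1, which does allow them, on two hypothetical frames:

```python
import pandas as pd

# Hypothetical frames that each carry a "name" column.
left = pd.DataFrame({"name": ["North", "South"]})
right = pd.DataFrame({"name": ["N", "S"]})

# concat places the columns side by side and keeps both "name" labels.
side_by_side = pd.concat([left, right], axis=1)

# Selecting "name" now returns both columns as a DataFrame, not a Series.
selected = side_by_side["name"]
print(type(selected).__name__)
print(selected.shape)
```

Code written for a single name column would misbehave on this result, which is the situation the suffix requirement is meant to prevent.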
Edge cases or notes
- sjoin adds index_right: Even when no attribute columns overlap, sjoin() with how="left" or how="inner" and the default rsuffix creates an index_right column. Drop it with .drop(columns=["index_right"]) if you do not need it, especially before a second join.
- Repeated joins accumulate suffixes: Joining the same frames more than once can produce columns like value_left_left. Clean up or rename columns between joins instead of stacking suffixes.
- .join vs .merge differences: .merge() joins on columns and uses suffixes=. .join() joins on the index by default and uses lsuffix/rsuffix. They are not interchangeable argument-for-argument.
- Only non-key columns trigger the error: The column named in on= is shared deliberately and is merged into one. The error lists the other shared columns.
- Empty-string suffixes still collide: Passing suffixes=("", "") does not avoid the error if real duplicates exist; it just reproduces the ambiguity. Use distinct suffixes or drop a column.
- Geometry column on both sides: If both inputs are GeoDataFrames with a geometry column, merging them can create geometry_x/geometry_y. Keep geometry on one side only, or use sjoin() for spatial combination.
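The empty-suffix case can be checked directly. In this sketch, with hypothetical tables, two empty suffixes raise the error, while leaving only the left suffix empty is accepted and keeps the left-hand name unchanged:

```python
import pandas as pd

# Hypothetical stand-ins for the zones and stats tables.
zones = pd.DataFrame({"id": [1, 2], "name": ["North", "South"]})
stats = pd.DataFrame({"id": [1, 2], "name": ["N", "S"], "value": [10, 20]})

# Two empty suffixes cannot disambiguate anything, so pandas raises.
try:
    zones.merge(stats, on="id", suffixes=("", ""))
    empty_suffix_error = None
except ValueError as exc:
    empty_suffix_error = str(exc)
print(empty_suffix_error)

# One non-empty suffix is sufficient: only the right "name" is renamed.
one_sided = zones.merge(stats, on="id", suffixes=("", "_stats"))
print(list(one_sided.columns))
```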
FAQ
Why does merge() complain about overlapping columns but not the join key?
The join key named in on= is merged into a single shared column, so it is not ambiguous. The error lists only the other columns that appear in both frames with the same name.
What is the difference between suffixes and lsuffix/rsuffix?
merge() takes a single suffixes=("_left", "_right") tuple, while join() takes separate lsuffix and rsuffix string arguments. They do the same thing but belong to different methods.
How do I remove the index_right column after sjoin()?
Drop it with joined = joined.drop(columns=["index_right"]). It holds the matched index from the right layer, which you usually do not need once the join is done.
Should I add suffixes or drop the duplicate columns?
Drop or rename when one side's column is redundant, since that keeps the result narrow and readable. Use suffixes when you genuinely need both versions of the column in the output.
Does concat raise the same overlapping-columns error?
No. pd.concat() stacks frames and allows duplicate column labels, so it will not raise this error. The error is specific to merge(), join(), and sjoin(), which place columns side by side and require unique names.
How do I rename suffixed columns like name_left and name_right after the join?
Use .rename(columns={...}) once the join is done, for example merged.rename(columns={"name_left": "zone_name", "name_right": "stat_name"}). This gives the result readable labels instead of generated suffixes.
Why do I get geometry_x and geometry_y after merging two GeoDataFrames?
Both inputs carried a geometry column, so merge() suffixed them like any other duplicate. Keep geometry on one side only by selecting non-geometry columns from the other frame before merging, or use sjoin() for a spatial combination.
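Keeping geometry on one side can be sketched as follows. The two layers are made up for illustration, and the second frame's geometry is dropped before the merge so that only its attributes are attached:

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical layers that share an "id" key and both carry geometry.
zones = gpd.GeoDataFrame(
    {"id": [1, 2]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
)
centroids = gpd.GeoDataFrame(
    {"id": [1, 2], "label": ["A", "B"]},
    geometry=[Point(0.5, 0.5), Point(1.5, 0.5)],
)

# Strip the geometry column from the right side so nothing gets suffixed.
attrs = centroids.drop(columns=centroids.geometry.name)
merged = zones.merge(attrs, on="id")
print(list(merged.columns))
```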
Can I keep the right layer's index from sjoin() instead of dropping index_right?
Yes. The index_right column holds the matched index from the right layer, so you can keep it and rename it to something meaningful like district_idx if you need to trace matches back. Drop it only when it is not useful downstream.