How to Create a Choropleth Map in Python with GeoPandas
Problem statement
A common GIS task is to color polygon features by the values in an attribute column. The result is called a choropleth map, and GeoPandas can draw one directly from a GeoDataFrame.
Typical examples include:
- coloring census tracts by median income
- mapping districts by population density
- showing parcels by assessed value
- visualizing regions by unemployment rate
In practice, the main problems are usually not the plotting step itself. They are:
- loading a shapefile or GeoJSON correctly
- making sure the geometry is polygon or multipolygon
- choosing a numeric column to map
- dealing with missing values
- styling the output so the map is readable
If you need to create a choropleth map in Python from polygon GIS data, GeoPandas provides a direct workflow.
Quick answer
If your polygon layer already contains the numeric column you want to map, the shortest workflow is:
import geopandas as gpd
import matplotlib.pyplot as plt
gdf = gpd.read_file("data/districts.shp")
ax = gdf.plot(
    column="population_density",
    cmap="OrRd",
    legend=True,
    figsize=(10, 8),
    edgecolor="black",
    linewidth=0.3
)
ax.set_title("Population Density by District")
ax.set_axis_off()
plt.show()
This is enough when:
- the file loads correctly
- your target column is numeric
- you want a simple map quickly
You need more control when:
- values contain nulls
- the map needs classified ranges
- styling matters for reports or export
- the attribute column needs cleanup first
Step-by-step solution
Load polygon data into a GeoDataFrame
GeoPandas can read shapefiles, GeoJSON, and many other GIS formats through read_file().
import geopandas as gpd
gdf = gpd.read_file("data/admin_areas.geojson")
print(gdf.head())
print(gdf.geom_type.value_counts())
print(gdf.crs)
Check that the geometry type is Polygon or MultiPolygon. A choropleth map is designed for area features, not points.
If your layer contains mixed geometry types, filter to polygons before plotting:
gdf = gdf[gdf.geom_type.isin(["Polygon", "MultiPolygon"])].copy()
If you are working with a shapefile:
gdf = gpd.read_file("data/admin_areas.shp")
Check the attribute column you want to map
Before plotting, inspect the available fields and confirm the target column is numeric.
print(gdf.columns)
print(gdf["density"].dtype)
print(gdf["density"].describe())
If the values are stored as text, convert them with pandas:
import pandas as pd
gdf["density"] = pd.to_numeric(gdf["density"], errors="coerce")
Using errors="coerce" turns invalid values into NaN, which you can handle later.
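Before coercing, it can help to see which raw values would be lost. The sketch below uses a small, made-up pandas Series (the values are hypothetical, not taken from any dataset in this article) and lists the entries that pd.to_numeric() cannot parse:

```python
import pandas as pd

# Hypothetical raw attribute values as they might arrive from a text field.
raw = pd.Series(["120.5", "98", "n/a", "1,250", None], dtype="object")

# Unparseable strings (and existing nulls) become NaN instead of raising.
converted = pd.to_numeric(raw, errors="coerce")

# Entries that were present in the raw data but could not be parsed.
failed = raw[converted.isna() & raw.notna()]

print(converted.isna().sum(), "values are NaN after conversion")
print("Could not parse:", failed.tolist())
```

Values such as "1,250" usually need cleanup (for example, removing the thousands separator) rather than being silently dropped, so reviewing this list first is a reasonable habit.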
Create a basic choropleth map
The core GeoPandas choropleth workflow uses plot() with the column argument.
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(10, 8))
gdf.plot(
    column="density",
    ax=ax,
    legend=True
)
plt.show()
This colors each polygon according to the value in density.
Improve the map styling
A default plot works, but it is usually better to control color, borders, title, and axes.
fig, ax = plt.subplots(figsize=(10, 8))
gdf.plot(
    column="density",
    ax=ax,
    cmap="YlGnBu",
    legend=True,
    edgecolor="black",
    linewidth=0.2
)
ax.set_title("Population Density by Administrative Area", fontsize=14)
ax.set_axis_off()
plt.show()
Useful colormaps include:
- OrRd
- YlGnBu
- Blues
- viridis
- plasma
Handle missing values
By default, GeoPandas does not draw polygons whose value is NaN, so they show up as gaps in the map. Pass missing_kwds to give them an explicit missing-data style.
fig, ax = plt.subplots(figsize=(10, 8))
gdf.plot(
    column="density",
    ax=ax,
    cmap="YlGnBu",
    legend=True,
    edgecolor="black",
    linewidth=0.2,
    missing_kwds={
        "color": "lightgrey",
        "edgecolor": "red",
        "hatch": "///",
        "label": "Missing values"
    }
)
ax.set_title("Population Density by Administrative Area")
ax.set_axis_off()
plt.show()
You can also remove records with null values before plotting:
gdf_clean = gdf[gdf["density"].notna()].copy()
Classify values for clearer interpretation
A continuous scale is not always the best choice. If values are unevenly distributed, classified ranges are easier to read.
GeoPandas supports classification schemes through the scheme argument. This requires the optional mapclassify package.
Install it if needed:
pip install mapclassify
Then use a classification method such as quantiles:
fig, ax = plt.subplots(figsize=(10, 8))
gdf.plot(
    column="density",
    ax=ax,
    cmap="OrRd",
    legend=True,
    scheme="quantiles",
    k=5,
    edgecolor="black",
    linewidth=0.2
)
ax.set_title("Population Density by Quantile Class")
ax.set_axis_off()
plt.show()
Common schemes:
- equal_interval
- quantiles
- natural_breaks
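To see why the choice of scheme matters for skewed data, the sketch below bins a small, made-up set of density values in two ways using plain pandas. It only approximates the idea behind equal_interval and quantiles. GeoPandas delegates the real classification to mapclassify, whose break values may differ slightly.

```python
import pandas as pd

# Hypothetical, strongly skewed density values (one extreme outlier).
density = pd.Series([5, 8, 12, 15, 20, 26, 33, 41, 60, 950], dtype="float64")

# Equal-width bins: the outlier stretches the range, so most values share one class.
equal_width = pd.cut(density, bins=5, labels=False)

# Quantile bins: each class receives roughly the same number of features.
quantile = pd.qcut(density, q=5, labels=False)

print("equal width:", equal_width.value_counts().sort_index().to_dict())
print("quantiles:  ", quantile.value_counts().sort_index().to_dict())
```

With equal-width bins, almost every polygon would get the same color. Quantile bins spread the colors out, at the cost of class boundaries that are less intuitive to read.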
Save the choropleth map
To export the final map to an image:
from pathlib import Path
import matplotlib.pyplot as plt
Path("output").mkdir(parents=True, exist_ok=True)
fig, ax = plt.subplots(figsize=(10, 8))
gdf.plot(
    column="density",
    ax=ax,
    cmap="OrRd",
    legend=True,
    edgecolor="black",
    linewidth=0.2
)
ax.set_title("Population Density by Administrative Area")
ax.set_axis_off()
plt.savefig("output/population_density_map.png", dpi=300, bbox_inches="tight")
plt.close()
Use dpi=300 for reports or print output. bbox_inches="tight" helps remove unnecessary margins.
Code examples
Example 1: Basic choropleth from a shapefile
import geopandas as gpd
import matplotlib.pyplot as plt
gdf = gpd.read_file("data/districts.shp")
fig, ax = plt.subplots(figsize=(10, 8))
gdf.plot(
    column="pop_density",
    ax=ax,
    cmap="OrRd",
    legend=True
)
ax.set_title("Population Density by District")
ax.set_axis_off()
plt.show()
Example 2: Choropleth from GeoJSON with custom styling
import geopandas as gpd
import matplotlib.pyplot as plt
gdf = gpd.read_file("data/neighborhoods.geojson")
fig, ax = plt.subplots(figsize=(11, 8))
gdf.plot(
    column="median_income",
    ax=ax,
    cmap="YlGnBu",
    legend=True,
    edgecolor="white",
    linewidth=0.5,
    missing_kwds={"color": "lightgrey", "label": "No data"}
)
ax.set_title("Median Income by Neighborhood", fontsize=14)
ax.set_axis_off()
plt.show()
Example 3: Choropleth with classified value ranges
This example requires mapclassify to be installed.
import geopandas as gpd
import matplotlib.pyplot as plt
gdf = gpd.read_file("data/tracts.geojson")
fig, ax = plt.subplots(figsize=(10, 8))
gdf.plot(
    column="unemployment_rate",
    ax=ax,
    cmap="PuRd",
    legend=True,
    scheme="natural_breaks",
    k=5,
    edgecolor="black",
    linewidth=0.2
)
ax.set_title("Unemployment Rate by Census Tract")
ax.set_axis_off()
plt.show()
Example 4: Export the final map image
from pathlib import Path
import geopandas as gpd
import matplotlib.pyplot as plt
Path("output").mkdir(parents=True, exist_ok=True)
gdf = gpd.read_file("data/regions.shp")
fig, ax = plt.subplots(figsize=(12, 9))
gdf.plot(
    column="value_index",
    ax=ax,
    cmap="viridis",
    legend=True,
    edgecolor="black",
    linewidth=0.3
)
ax.set_title("Regional Value Index")
ax.set_axis_off()
plt.savefig("output/regional_value_index.png", dpi=300, bbox_inches="tight")
plt.close()
Explanation
When you pass column="density" to GeoDataFrame.plot(), GeoPandas reads the numeric values in that field and assigns colors to polygons based on those values. This is the core of a choropleth map in GeoPandas.
There are two common ways to symbolize values:
- Continuous color scale: each value is colored along a gradient
- Classified choropleth: values are grouped into ranges such as 5 classes
A continuous scale is simple and preserves detail. A classified map is often easier to interpret in reports because the legend shows clear value ranges.
Be careful about what values you map. Choropleths work best with standardized or comparable variables such as:
- population density
- rate per square kilometer
- percent unemployment
- median income
They are often misleading with raw totals like total population, because large polygons may dominate the visual impression. If area sizes vary a lot, use rates or densities instead of totals.
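As a minimal sketch of turning a raw total into a density, the example below builds two hypothetical districts in memory and divides population by polygon area. The column names, the population figures, and the choice of UTM zone 33N (EPSG:32633) are assumptions for illustration only.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical districts: a small, crowded one and a large, sparse one.
# Coordinates are in metres in a projected CRS (UTM zone 33N is an assumption).
gdf = gpd.GeoDataFrame(
    {
        "name": ["Small", "Large"],
        "population": [50_000, 80_000],
    },
    geometry=[box(0, 0, 2_000, 2_000), box(2_000, 0, 22_000, 20_000)],
    crs="EPSG:32633",
)

# Area is returned in CRS units (square metres here), so convert to km².
gdf["area_km2"] = gdf.geometry.area / 1_000_000
gdf["pop_density"] = gdf["population"] / gdf["area_km2"]

print(gdf[["name", "population", "area_km2", "pop_density"]])
```

The large district has the higher total but the much lower density, which is the pattern a map of raw totals would hide. Area is only meaningful in a projected CRS. In a geographic CRS such as EPSG:4326 the result would be in square degrees, so reproject with to_crs() first.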
Also note that the boundary geometry and the attribute values must already be in the same dataset. If your values are in a separate CSV or table, join them to the polygon layer before plotting.
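One way to do that join is an attribute merge on a shared key. The sketch below is illustrative: the district_id key, the density column, and the inline CSV text (standing in for a real file) are all hypothetical.

```python
import io

import geopandas as gpd
import pandas as pd
from shapely.geometry import box

# Hypothetical polygon layer keyed by a district_id column.
districts = gpd.GeoDataFrame(
    {"district_id": ["A", "B", "C"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(2, 0, 3, 1)],
    crs="EPSG:4326",
)

# Stand-in for reading a CSV file from disk; district C has no row.
csv_text = "district_id,density\nA,120.5\nB,98.0\n"
values = pd.read_csv(io.StringIO(csv_text), dtype={"district_id": str})

# Calling merge on the GeoDataFrame (left side) keeps the geometry column.
joined = districts.merge(values, on="district_id", how="left")

print(type(joined).__name__)
print(joined[["district_id", "density"]])
```

A left join keeps every polygon, so districts without a matching row end up with NaN and can be styled with missing_kwds. Reading the key as a string on both sides avoids silent mismatches such as "01" versus 1.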
Edge cases or notes
- Column stored as text: convert it with pd.to_numeric(..., errors="coerce") before plotting.
- Null values: use missing_kwds or filter null rows out.
- Mixed geometry types: filter to Polygon and MultiPolygon features before making the map.
- Invalid or empty geometries: these can cause plotting problems. Check with gdf.geometry.is_valid and gdf.geometry.is_empty.
- CRS issues: CRS usually does not prevent plotting by itself, but it matters if you are combining layers. Make sure all layers use the same CRS before overlaying or comparing them.
- Map looks wrong: if the file loads but plots incorrectly, inspect geometry validity and CRS, and confirm the layer is actually polygon data.
- Large datasets: plotting many polygons can be slow. Simplifying geometry or filtering to the area of interest may help.
- Point layers: choropleths are for polygons. For points, use proportional symbols, heatmaps, or point styling instead.
- Raw totals: avoid mapping raw totals when polygon sizes vary widely, because the result can be visually misleading.
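The geometry checks mentioned in the notes above can be sketched as follows. The three features are hypothetical and built in memory so the example runs without a data file. Dropping problem rows is only one option; repairing them may be preferable for real data.

```python
import geopandas as gpd
from shapely.geometry import Polygon, box

# Hypothetical layer: a valid square, a self-intersecting "bowtie", and an empty polygon.
gdf = gpd.GeoDataFrame(
    {"name": ["ok", "bowtie", "empty"]},
    geometry=[
        box(0, 0, 1, 1),
        Polygon([(0, 0), (1, 1), (1, 0), (0, 1)]),
        Polygon(),
    ],
)

# Flag rows that are invalid or empty.
problem = ~gdf.geometry.is_valid | gdf.geometry.is_empty
print("Problem features:", gdf.loc[problem, "name"].tolist())

# Keep only the features that are safe to plot.
gdf_ok = gdf[~problem].copy()
print("Remaining features:", len(gdf_ok))
```

Running a check like this right after read_file() makes it easier to tell a data problem from a styling problem when the map looks wrong.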
Internal links
If you need more background, read What Is a Choropleth Map in GIS.
For related GeoPandas tasks, see How to Read a Shapefile in Python with GeoPandas and How to Reproject Spatial Data in Python (GeoPandas).
If you need to prepare output data for other tools, see How to Export GeoJSON in Python with GeoPandas.
If your layer draws in the wrong place or looks distorted, check Why GeoPandas Plot Is Not Showing the Correct Map.
FAQ
How do I create a choropleth map in GeoPandas?
Load a polygon dataset with gpd.read_file(), then call gdf.plot(column="your_field", legend=True). Add cmap, edgecolor, and figsize for better output.
Does GeoPandas work with shapefiles and GeoJSON for choropleth maps?
Yes. GeoPandas supports both shapefiles and GeoJSON through read_file(), so you can use the same choropleth workflow with either format.
What column type should I use for a choropleth map?
Use a numeric column such as integer or float. If your values are strings, convert them to numeric before plotting.
How do I handle missing values in a GeoPandas choropleth?
Use missing_kwds to style polygons with null values, or remove null rows with gdf[gdf["field"].notna()].