EPSG Codes Explained: How to Choose the Right CRS in Python
Problem statement
In Python GIS work, you will often see CRS values like EPSG:4326, EPSG:3857, or a local projected system such as a UTM zone. The problem is not just understanding what the number means. The practical problem is choosing the right CRS for the job.
If you use the wrong CRS, common GIS tasks can produce misleading results:
- distance and area calculations can be wrong
- layers can appear in different places
- spatial joins and overlays can return wrong or empty results
- map output can look distorted or confusing
This is a common issue when reading shapefiles, GeoJSON, or data from web APIs in GeoPandas. Many users can load the data, but are not sure whether to keep the CRS, assign one, or reproject it.
Quick answer
An EPSG code is a standard identifier for a coordinate reference system (CRS).
The practical rule in Python GIS workflows is:
- keep the source CRS when the file's metadata is correct
- assign a CRS only when metadata is missing and you know what the source CRS should be
- use a projected CRS for area and distance analysis
- use a geographic CRS like EPSG:4326 mainly for storage, exchange, GPS, and many web/API workflows
In Python GIS, you will usually inspect CRS with GeoPandas using .crs, assign missing CRS metadata with .set_crs(), and transform coordinates with .to_crs().
Step-by-step solution
Step 1: Check the current CRS of your data
Always inspect the CRS first.
import geopandas as gpd
gdf = gpd.read_file("data/city_parks.geojson")
print(gdf.crs)
Possible results:
- EPSG:4326
- a full WKT CRS definition
- None if the file has no CRS metadata
This tells you whether the dataset already has a known coordinate reference system.
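Beyond printing the CRS, it can help to pull out a few of its properties. The following is a minimal sketch, assuming a recent GeoPandas where .crs is a pyproj CRS object. The helper name summarize_crs and the in-memory sample point are illustrative and not part of any library.

```python
import geopandas as gpd
from shapely.geometry import Point


def summarize_crs(gdf):
    """Describe a GeoDataFrame's CRS in a small dict (hypothetical helper)."""
    crs = gdf.crs
    if crs is None:
        return {"epsg": None, "kind": "missing", "unit": None}
    if crs.is_geographic:
        kind = "geographic"
    elif crs.is_projected:
        kind = "projected"
    else:
        kind = "other"
    # axis_info lists the coordinate axes; the first axis carries the unit name
    unit = crs.axis_info[0].unit_name if crs.axis_info else None
    # to_epsg() can return None when no EPSG code matches the definition
    return {"epsg": crs.to_epsg(), "kind": kind, "unit": unit}


# Illustrative in-memory data instead of a file on disk
unlabeled = gpd.GeoDataFrame(geometry=[Point(16.37, 48.21)])
labeled = gpd.GeoDataFrame(geometry=[Point(16.37, 48.21)], crs="EPSG:4326")

print(summarize_crs(unlabeled))
print(summarize_crs(labeled))
```

A "missing" result points to Step 2: decide whether a CRS has to be assigned before anything else happens.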
Step 2: Decide whether the CRS is correct or just missing
There are two different situations:
- The coordinates are already correct, but CRS metadata is missing
- The data must be transformed into another CRS
These are not the same.
If the coordinates are longitude and latitude values and you know the file should be WGS84, assign the CRS:
gdf = gdf.set_crs("EPSG:4326")
This only labels the data. It does not change coordinates.
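If the data is already labeled correctly but you need another CRS for analysis, reproject it. EPSG:32633 (WGS 84 / UTM zone 33N) is only the example used here; choose the projected CRS that covers your own data: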
gdf_projected = gdf.to_crs("EPSG:32633")
This changes the coordinate values.
A common mistake is assigning a new CRS with set_crs() when the data actually needs to_crs().
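The difference is easiest to see by printing coordinates before and after each call. This is a small sketch with one made-up longitude/latitude point built in memory; the point and the choice of EPSG:32633 as the target are assumptions for illustration only.

```python
import geopandas as gpd
from shapely.geometry import Point

# One illustrative lon/lat point with no CRS metadata attached
raw = gpd.GeoDataFrame({"name": ["sample"]}, geometry=[Point(16.37, 48.21)])
print(raw.crs)  # no CRS yet

# set_crs() only attaches a label; the numbers stay the same
labeled = raw.set_crs("EPSG:4326")
print(labeled.geometry.iloc[0])

# to_crs() computes new coordinates in the target CRS (metres for UTM)
projected = labeled.to_crs("EPSG:32633")
print(projected.geometry.iloc[0])
```

If the coordinates printed after set_crs() differ from the input, or the coordinates after to_crs() do not, something other than the intended call has happened.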
Step 3: Match the CRS to the task
Choose the CRS based on what you are doing.
Web maps and GPS data
Common choices are:
- EPSG:4326 for latitude/longitude data
- EPSG:3857 for web map display
Distance and area analysis
Use a projected CRS with linear units such as meters or feet. Good choices are often:
- local UTM zones
- official national projected systems
- regional projected CRS used by your organization
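When no official or organizational CRS is prescribed, GeoPandas can suggest a UTM zone from the data's extent through estimate_utm_crs(). The sketch below assumes a GeoPandas version that provides this method and uses two made-up points; treat the suggestion as a starting point, since data spanning several zones may need a different projected CRS.

```python
import geopandas as gpd
from shapely.geometry import Point

# Illustrative lon/lat points that both fall between 12°E and 18°E
points = gpd.GeoDataFrame(
    {"name": ["a", "b"]},
    geometry=[Point(16.37, 48.21), Point(15.44, 47.07)],
    crs="EPSG:4326",
)

# Ask GeoPandas for a UTM CRS that fits the extent of the data
utm_crs = points.estimate_utm_crs()
print(utm_crs.to_epsg())

# Reproject into the suggested CRS for metre-based work
points_utm = points.to_crs(utm_crs)
print(points_utm.crs.name)
```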
Multi-layer analysis
If you are doing overlays, joins, buffering, or measurement, put all layers into one shared projected CRS when possible.
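One way to enforce a shared CRS is a small helper that reprojects every layer and refuses layers without CRS metadata. This is a sketch under assumptions: the helper to_common_crs, the sample layers, and the target EPSG:32633 are all hypothetical, and the join uses the predicate keyword available in recent GeoPandas versions.

```python
import geopandas as gpd
from pyproj import CRS
from shapely.geometry import Point, box


def to_common_crs(target_crs, *layers):
    """Return the layers reprojected to target_crs (hypothetical helper)."""
    target = CRS.from_user_input(target_crs)
    aligned = []
    for layer in layers:
        if layer.crs is None:
            # Reprojecting unlabeled data is impossible; fail loudly instead
            raise ValueError("Layer has no CRS; assign one with set_crs() first")
        aligned.append(layer if layer.crs == target else layer.to_crs(target))
    return aligned


# Two illustrative layers that start out in different CRS
stops = gpd.GeoDataFrame(
    {"stop": ["s1"]}, geometry=[Point(16.37, 48.21)], crs="EPSG:4326"
)
zones = gpd.GeoDataFrame(
    {"zone": ["z1"]}, geometry=[box(16.3, 48.1, 16.5, 48.3)], crs="EPSG:4326"
).to_crs("EPSG:3857")

stops_m, zones_m = to_common_crs("EPSG:32633", stops, zones)
joined = gpd.sjoin(stops_m, zones_m, predicate="within")
print(joined[["stop", "zone"]])
```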
Step 4: Reproject data when needed
Reprojection is required when the current CRS is not suitable for the task.
Example: convert polygon data from WGS84 to a projected CRS before calculating area.
import geopandas as gpd
districts = gpd.read_file("data/districts.geojson")
print(districts.crs)
districts_m = districts.to_crs("EPSG:32633")
districts_m["area_sq_m"] = districts_m.area
print(districts_m[["area_sq_m"]].head())
If you calculate area directly in EPSG:4326, the results are in square degrees, which do not convert to square meters by any fixed factor. GeoPandas also emits a warning in this case.
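To make the difference concrete, the sketch below measures the same small rectangle in both CRS. The rectangle is made up (0.01 degrees on each side, placed inside UTM zone 33N), and the warning filter is there only because GeoPandas warns about area in a geographic CRS.

```python
import warnings

import geopandas as gpd
from shapely.geometry import box

# Illustrative 0.01° x 0.01° cell in lon/lat
cell = gpd.GeoDataFrame(geometry=[box(15.00, 48.00, 15.01, 48.01)], crs="EPSG:4326")

with warnings.catch_warnings():
    # Silence the geographic-CRS warning for this demonstration only
    warnings.simplefilter("ignore", UserWarning)
    area_deg2 = cell.area.iloc[0]

# Same geometry, measured after reprojection to a metre-based CRS
area_m2 = cell.to_crs("EPSG:32633").area.iloc[0]

print("square degrees:", area_deg2)
print("square metres:", round(area_m2))
```

The first number is just 0.01 times 0.01; only the second one describes a ground area.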
Step 5: Verify the result before analysis
After assigning or transforming CRS, check that the result makes sense.
import geopandas as gpd
roads = gpd.read_file("data/roads.shp")
parcels = gpd.read_file("data/parcels.shp")
print(roads.crs)
print(parcels.crs)
parcels = parcels.to_crs(roads.crs)
print(roads.total_bounds)
print(parcels.total_bounds)
Useful checks:
- do layers now overlap in the expected area?
- are the units meters or feet if you need measurement?
- do bounds look reasonable for the location?
- does a sample buffer or distance result look realistic?
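Some of these checks can be automated with plain Python on the values returned by total_bounds. The helpers below are hypothetical and the bounds are invented numbers; the longitude/latitude test is only a heuristic, because a small projected dataset close to its CRS origin would also pass it.

```python
def bounds_overlap(a, b):
    """True if two (minx, miny, maxx, maxy) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def looks_like_lonlat(bounds):
    """Heuristic: do all values fit inside the valid lon/lat range?"""
    minx, miny, maxx, maxy = bounds
    return -180 <= minx <= maxx <= 180 and -90 <= miny <= maxy <= 90


# Invented bounds: one layer in projected metres, one still in degrees
roads_bounds = (599000.0, 5339000.0, 604000.0, 5343000.0)
parcels_bounds = (16.30, 48.15, 16.45, 48.25)

print(bounds_overlap(roads_bounds, parcels_bounds))
print(looks_like_lonlat(roads_bounds), looks_like_lonlat(parcels_bounds))
```

Layers whose bounds do not overlap, or a layer that looks like degrees when meters were expected, are a signal to revisit Step 2 before running any analysis.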
Code examples
Example 1: Read a file and inspect its CRS
import geopandas as gpd
buildings = gpd.read_file("data/buildings.shp")
print("CRS:", buildings.crs)
This confirms whether CRS metadata exists.
Example 2: Assign a missing EPSG code to data
import geopandas as gpd
points = gpd.read_file("data/gps_points.geojson")
if points.crs is None:
    points = points.set_crs("EPSG:4326")
print(points.crs)
Use this only when the coordinates are already in that CRS and the metadata is missing.
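A related case is a file whose CRS label exists but is wrong. In that situation set_crs() refuses to replace the label unless allow_override=True is passed. The sketch below uses a deliberately mislabeled, made-up point to show both behaviors.

```python
import geopandas as gpd
from shapely.geometry import Point

# Lon/lat values that were (wrongly) labeled as Web Mercator
pts = gpd.GeoDataFrame(geometry=[Point(16.37, 48.21)], crs="EPSG:3857")

try:
    pts.set_crs("EPSG:4326")  # a different CRS is already set
    raised = False
except ValueError:
    raised = True
print("set_crs refused to overwrite:", raised)

# Explicitly replace the label; coordinates are left untouched
fixed = pts.set_crs("EPSG:4326", allow_override=True)
print(fixed.crs, fixed.geometry.iloc[0])
```

Overriding is appropriate only when you are sure the existing label is wrong; otherwise to_crs() is the call you want.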
Example 3: Reproject data to a projected CRS for analysis
import geopandas as gpd
points = gpd.read_file("data/gps_points.geojson")
if points.crs is None:
    points = points.set_crs("EPSG:4326")
points_utm = points.to_crs("EPSG:32633")
points_utm["buffer_500m"] = points_utm.buffer(500)
print(points_utm.crs)
This is the correct pattern when you need meter-based analysis.
Example 4: Confirm layer alignment after reprojection
import geopandas as gpd
rivers = gpd.read_file("data/rivers.shp")
catchments = gpd.read_file("data/catchments.geojson")
catchments_aligned = catchments.to_crs(rivers.crs)
print("Rivers bounds:", rivers.total_bounds)
print("Catchments bounds:", catchments_aligned.total_bounds)
If the bounds now fall in the same region and the map overlays correctly, the reprojection likely worked.
Explanation
An EPSG code is a short identifier for a full CRS definition. In Python libraries, you will usually see it written like this:
"EPSG:4326"
That code tells GeoPandas, Shapely-based workflows, and other GIS tools how coordinates relate to real places on Earth.
The most important practical distinction is between geographic and projected CRS:
- Geographic CRS uses angular units, usually degrees
- Projected CRS uses linear units, usually meters or feet
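This matters because many geometry operations depend on the CRS. If you measure distance or area in a geographic CRS, the results come out in degrees, and the ground distance covered by one degree of longitude shrinks toward the poles, so the numbers are usually not suitable for analysis.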
You will see a few EPSG codes often:
- EPSG:4326 — WGS84 latitude/longitude; common for GPS, GeoJSON, and APIs
- EPSG:3857 — Web Mercator; common for web maps and basemaps
- UTM or national projected EPSG codes — better for local analysis and measurement
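The codes listed above can be looked up directly with pyproj, the library GeoPandas uses for CRS handling. This is a short sketch; EPSG:32633 stands in for "a UTM zone" purely as an example.

```python
from pyproj import CRS

summary = {}
for code in ("EPSG:4326", "EPSG:3857", "EPSG:32633"):
    crs = CRS.from_user_input(code)
    kind = "geographic" if crs.is_geographic else "projected"
    # Record the official name, the CRS type, and the unit of the first axis
    summary[code] = (crs.name, kind, crs.axis_info[0].unit_name)

for code, (name, kind, unit) in summary.items():
    print(f"{code}: {name} | {kind} | {unit}")
```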
The workflow is simple:
- inspect the CRS with .crs
- decide whether the CRS is missing or whether transformation is needed
- choose the CRS based on the task
- verify alignment and units before analysis
The key distinction is:
- set_crs() assigns metadata
- to_crs() transforms coordinates
Everything else in the workflow follows from this. If your data is in WGS84 and you need accurate area or distance, reproject it to a projected CRS first. If your file has no CRS metadata but you know what it should be, assign the correct CRS without changing coordinates.
Edge cases and notes
Files without an EPSG code but with a valid CRS definition
Some files store CRS as WKT or PROJ text rather than a simple EPSG code. GeoPandas may still read this correctly. You do not always need a numeric EPSG value if the CRS definition is valid.
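As an illustration, pyproj can build a usable CRS from a PROJ string even when no EPSG code is given. The string below is an assumed example describing UTM zone 33N; whether to_epsg() then finds a matching code depends on how confidently pyproj can identify the definition, so it may return None.

```python
from pyproj import CRS

# A CRS described by parameters only, with no EPSG code attached
crs = CRS.from_user_input("+proj=utm +zone=33 +datum=WGS84 +units=m +no_defs")

print(crs.is_projected)             # the definition is usable as a projected CRS
print(crs.axis_info[0].unit_name)   # unit of the first axis
print(crs.to_epsg())                # an integer code, or None if no match is found
```

An object like this can be passed wherever GeoPandas accepts a CRS, for example to to_crs().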
Data from different sources may use similar coordinate values
Two datasets can appear similar and still use different CRS. Do not rely on visual similarity alone. Check .crs for every layer.
Not every projected CRS is suitable for every analysis
A projected CRS is not automatically the right one. Prefer local or official projected systems for better accuracy.
For example, if you are measuring parcel areas in one city, a local UTM zone or official local projected CRS is usually a better choice than a global web map CRS.
EPSG:3857 is useful for display, not precise measurement
EPSG:3857 is standard for web maps, but it stretches distances and areas more and more as you move away from the equator, so it is usually not the best choice for precise distance or area calculations.
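The size of the effect can be estimated with the standard Mercator scale factor, 1 / cos(latitude). The sketch below is an approximation on a sphere, not an exact EPSG:3857 computation, and the function name is hypothetical.

```python
import math


def web_mercator_scale(lat_deg):
    """Approximate length scale factor of a Mercator map at a given latitude."""
    return 1.0 / math.cos(math.radians(lat_deg))


for lat in (0, 30, 48, 60):
    k = web_mercator_scale(lat)
    # Lengths are stretched by k, areas by k squared
    print(f"latitude {lat}: lengths x{k:.2f}, areas x{k * k:.2f}")
```

At 60 degrees latitude, lengths measured in the projected coordinates come out about twice as long as on the ground, and areas about four times as large.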
Invalid geometries can affect results
Even with the correct CRS, invalid polygons or broken geometries can cause overlay or measurement issues. If results look wrong, also validate your geometry, not just the CRS.
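A quick validity check might look like the sketch below. It uses a deliberately self-intersecting "bow-tie" polygon as made-up data and repairs it with make_valid from shapely.validation, which is assumed to be available in your Shapely version. Repair can change the geometry type, so inspect the result rather than trusting it blindly.

```python
import geopandas as gpd
from shapely.geometry import Polygon
from shapely.validation import make_valid

# A self-intersecting ring: valid CRS, invalid geometry
bowtie = Polygon([(0, 0), (1, 1), (1, 0), (0, 1)])
gdf = gpd.GeoDataFrame({"id": [1]}, geometry=[bowtie], crs="EPSG:32633")

print(gdf.is_valid.tolist())

# Repair each geometry and keep the result as the active geometry column
repaired = gdf.set_geometry(gdf.geometry.apply(make_valid))
print(repaired.is_valid.tolist())
print(repaired.geometry.iloc[0].geom_type)
```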
Internal links
To understand the broader concept, see Coordinate Reference Systems (CRS) Explained for Python GIS.
For related tasks, read GeoPandas Basics: Working with Spatial Data in Python and Python for GIS: What It Is and When to Use It.
FAQ
What is the difference between an EPSG code and a CRS?
A CRS is the full coordinate reference system definition. An EPSG code is a standard identifier for one specific CRS.
When should I use EPSG:4326 in Python GIS?
Use EPSG:4326 for latitude/longitude data, GPS data, GeoJSON, and many API or storage workflows. It is usually not the best choice for distance or area analysis.
How do I know if I should set a CRS or reproject the data?
Use set_crs() when coordinates are already correct and only the metadata is missing. Use to_crs() when you need to transform the coordinates into another CRS.
Why are my distance or area calculations wrong after loading data in GeoPandas?
You are likely working in a geographic CRS such as EPSG:4326. Reproject to a suitable projected CRS before measuring.