Introduction to Rasterio: Reading Raster Data in Python
Problem statement
A common GIS task is opening a raster file in Python and checking what it contains before doing any analysis. For example, you may receive a GeoTIFF for elevation, land cover, or satellite imagery and need to confirm:
- the raster size
- how many bands it has
- its coordinate reference system
- its extent
- its nodata value
- the actual pixel values
This is usually the first step in a larger workflow such as clipping, reprojection, raster calculations, or batch processing. If you do not inspect the raster first, it is easy to use the wrong band, overlook a missing CRS, or read data into an array with a shape you did not expect.
This page is a practical Rasterio tutorial focused only on reading raster data and inspecting basic properties. It does not cover reprojection, masking, or raster algebra.
Quick answer
Use Rasterio to open the raster with a context manager, inspect its metadata, and read one band or all bands into NumPy arrays.
import rasterio
raster_path = "data/dem.tif"
with rasterio.open(raster_path) as src:
    print("Width:", src.width)
    print("Height:", src.height)
    print("Band count:", src.count)
    print("CRS:", src.crs)
    print("Bounds:", src.bounds)
    print("Transform:", src.transform)
    print("Dtype:", src.dtypes[0])
    print("NoData:", src.nodata)
    band1 = src.read(1)
    print("Band 1 shape:", band1.shape)
This is the standard way to open a raster with Rasterio and inspect a GeoTIFF in Python.
Step-by-step solution
Install Rasterio
In a working GIS Python environment, Rasterio is commonly installed with pip or conda.
pip install rasterio
or:
conda install -c conda-forge rasterio
If installation fails, handle that separately before continuing. Rasterio depends on compiled geospatial libraries, so environment issues are common.
Open a raster file with Rasterio
Use rasterio.open() to open a GeoTIFF or another supported raster format. A context manager is the safest option because it closes the file automatically.
import rasterio
raster_path = "data/landcover.tif"
with rasterio.open(raster_path) as src:
    print(src)
If the file opens successfully, src is a Rasterio dataset object.
Check raster metadata
Before reading cell values, inspect the core dataset properties.
import rasterio
raster_path = "data/landcover.tif"
with rasterio.open(raster_path) as src:
    print("Width:", src.width)
    print("Height:", src.height)
    print("Band count:", src.count)
    print("CRS:", src.crs)
    print("Bounds:", src.bounds)
    print("Transform:", src.transform)
    print("Dtypes:", src.dtypes)
    print("NoData:", src.nodata)
These properties answer practical questions:
- width, height: number of columns and rows
- count: number of raster bands
- crs: spatial reference system
- bounds: raster extent as left, bottom, right, top coordinates
- transform: maps pixel positions to real-world coordinates
- dtypes: data type for each band
- nodata: missing-data value
Read one raster band
For a single-band DEM or one specific spectral band, read only that band.
import rasterio
raster_path = "data/dem.tif"
with rasterio.open(raster_path) as src:
    elevation = src.read(1)
    print("Array shape:", elevation.shape)
    print("Top-left value:", elevation[0, 0])
This is the usual way to read a GeoTIFF in Python when only one band is needed.
Read all raster bands
For multiband imagery, read the full stack.
import rasterio
raster_path = "data/multiband_satellite.tif"
with rasterio.open(raster_path) as src:
    all_bands = src.read()
    print("Array shape:", all_bands.shape)
The returned shape is:
(bands, rows, columns)
For example, a 4-band raster with 1000 rows and 2000 columns gives:
(4, 1000, 2000)
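Some tools, such as many image-display functions, expect (rows, columns, bands) instead. The following is a minimal sketch of moving the band axis to the end. It uses a synthetic NumPy array as a stand-in for the result of src.read(), so the array contents are an assumption for illustration only.

```python
import numpy as np

# Synthetic stand-in for src.read() on a 4-band raster: (bands, rows, columns)
all_bands = np.zeros((4, 1000, 2000), dtype="uint16")

# Move the band axis to the end to get (rows, columns, bands)
rows_cols_bands = np.moveaxis(all_bands, 0, -1)

print(all_bands.shape)        # (4, 1000, 2000)
print(rows_cols_bands.shape)  # (1000, 2000, 4)
```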
Understand Rasterio band indexing
Rasterio uses 1-based band indexing when reading from the dataset:
src.read(1) # first band
src.read(2) # second band
But once data is loaded into a NumPy array, indexing is 0-based:
first_value = elevation[0, 0]
first_band = all_bands[0]
This difference causes mistakes in many beginner workflows.
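A small sketch of the off-by-one relationship, assuming a synthetic three-band stack in place of a real src.read() result. The variable names band_number and array_index are illustrative only.

```python
import numpy as np

# Synthetic 3-band stack standing in for src.read(): shape (bands, rows, columns)
all_bands = np.stack([np.full((2, 2), value) for value in (10, 20, 30)])

band_number = 2                # 1-based, as you would pass to src.read(2)
array_index = band_number - 1  # 0-based position in the NumPy stack
second_band = all_bands[array_index]

print(second_band[0, 0])  # 20
```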
Print basic raster statistics or sample values
Basic validation helps confirm that the raster loaded correctly.
import rasterio
import numpy as np
raster_path = "data/dem.tif"
with rasterio.open(raster_path) as src:
    band1 = src.read(1)
    print("Shape:", band1.shape)
    print("Minimum:", np.min(band1))
    print("Maximum:", np.max(band1))
    print("Sample values:", band1[0:3, 0:3])
If the raster has nodata values, they can distort the minimum and maximum. Read the band as a masked array to exclude them:
import rasterio
raster_path = "data/dem.tif"
with rasterio.open(raster_path) as src:
    band1 = src.read(1, masked=True)
    print("Minimum:", band1.min())
    print("Maximum:", band1.max())
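To see why masking matters, here is a sketch that builds a masked array by hand with NumPy. It assumes a hypothetical nodata value of -9999 and a tiny synthetic array. For a raster whose only mask source is a single nodata value, this is roughly what a masked read gives you, though Rasterio derives the mask from the dataset itself.

```python
import numpy as np

nodata = -9999.0  # hypothetical value; read the real one from src.nodata
band1 = np.array([[-9999.0, 120.5], [98.0, 101.25]], dtype="float32")

# Without masking, the nodata value is reported as the minimum
print(band1.min())  # -9999.0

# Mask every cell equal to nodata, then recompute
masked = np.ma.masked_equal(band1, nodata)
print(masked.min())  # 98.0
print(masked.max())  # 120.5
```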
Code examples
Example 1: Open a GeoTIFF and print core dataset properties
import rasterio
raster_path = "data/elevation_utm.tif"
with rasterio.open(raster_path) as src:
    print("CRS:", src.crs)
    print("Bounds:", src.bounds)
    print("Width:", src.width)
    print("Height:", src.height)
    print("Band count:", src.count)
    print("Dtype:", src.dtypes[0])
Example 2: Read the first raster band into a NumPy array
import rasterio
raster_path = "data/elevation_utm.tif"
with rasterio.open(raster_path) as src:
    band1 = src.read(1)
    print("Band 1 shape:", band1.shape)
    print("Cell [0, 0]:", band1[0, 0])
    print("First 5 values in first row:", band1[0, :5])
Example 3: Read all bands from a multiband raster
import rasterio
raster_path = "data/sentinel_subset.tif"
with rasterio.open(raster_path) as src:
    data = src.read()
    print("Full array shape:", data.shape)
    print("First band shape:", data[0].shape)
Example 4: Check nodata and metadata before processing
import rasterio
raster_path = "data/landcover.tif"
with rasterio.open(raster_path) as src:
    print("NoData:", src.nodata)
    print("Transform:", src.transform)
    print("Profile:")
    print(src.profile)
The profile dictionary is useful when you later write output rasters with matching metadata.
Explanation
Rasterio is a Python library for reading and writing raster datasets such as GeoTIFFs. In practical GIS work, it gives you direct access to both metadata and raster values.
Typical examples include:
- a single-band DEM with elevation values
- multiband satellite imagery with multiple spectral bands
- a classified raster where each cell stores a class code
Metadata tells you how the raster fits into space. That includes the CRS, bounds, transform, dimensions, band count, and nodata value. Raster values are the actual cell values stored in each band.
CRS and transform are especially important. A raster may open without errors, but if the CRS is missing or wrong, any later overlay with vectors or other rasters may fail or line up in the wrong place. The transform tells you where each pixel sits in real-world coordinates and what the pixel size is.
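The transform is an affine mapping with six coefficients. The sketch below spells out the arithmetic in plain Python, assuming hypothetical coefficients for a north-up raster with 30 m pixels. The function name pixel_to_world is illustrative and not part of Rasterio.

```python
# Hypothetical coefficients for a north-up raster with 30 m pixels
a, b, c = 30.0, 0.0, 500000.0    # pixel width, rotation term, x of top-left corner
d, e, f = 0.0, -30.0, 4100000.0  # rotation term, negative pixel height, y of top-left corner

def pixel_to_world(col, row):
    """Return the (x, y) coordinate for a column/row position."""
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y

print(pixel_to_world(0, 0))      # (500000.0, 4100000.0)  top-left corner
print(pixel_to_world(10, 5))     # (500300.0, 4099850.0)
print(pixel_to_world(0.5, 0.5))  # (500015.0, 4099985.0)  centre of the first pixel
```

With an open dataset, src.transform * (col, row) should perform the same calculation, and src.xy(row, col) returns a pixel centre directly.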
If your goal is automation, reading metadata first is a reliable habit. It helps catch problems before you batch-process many files.
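One way to turn that habit into code is a small check that runs before any processing. This is a sketch only: find_metadata_issues is a hypothetical helper, the checks are not exhaustive, and a stand-in object replaces a real dataset so the example runs without a file.

```python
from types import SimpleNamespace

def find_metadata_issues(src, expected_bands=1):
    """Return a list of human-readable problems found in a dataset's metadata.

    `src` is anything exposing .crs, .count and .nodata, such as an open
    Rasterio dataset.
    """
    issues = []
    if src.crs is None:
        issues.append("missing CRS")
    if src.count != expected_bands:
        issues.append(f"expected {expected_bands} band(s), found {src.count}")
    if src.nodata is None:
        issues.append("no nodata value defined")
    return issues

# Stand-in object so the sketch runs without a file on disk
fake_src = SimpleNamespace(crs=None, count=3, nodata=-9999.0)
print(find_metadata_issues(fake_src))
# ['missing CRS', 'expected 1 band(s), found 3']
```

In a batch job, you would call the helper inside each `with rasterio.open(path) as src:` block and skip or log any file that returns a non-empty list.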
Edge cases / notes
Raster file opens but data looks unexpected
Common causes:
- you read the wrong band
- nodata values are affecting min and max statistics
- you expected (rows, columns, bands) but Rasterio returned (bands, rows, columns)
CRS issues
Some rasters have missing, incomplete, or incorrect CRS information. Always inspect src.crs before spatial analysis. If the CRS is unclear, confirm it from the source data before continuing.
Large rasters and memory use
Reading an entire raster into memory may not be practical for large files. If memory or performance becomes a problem, read the data in windows (rectangular blocks of rows and columns) instead of calling src.read() on the full dataset.
Not all rasters are GeoTIFFs
GeoTIFF is the most common starting format, but Rasterio supports other raster formats through GDAL drivers.
Internal links
If you need the conceptual background, see raster data in Python: core concepts for GIS workflows.
For related follow-up tasks, see how to get raster metadata with Rasterio in Python and how to read a single band from a GeoTIFF with Rasterio.
If Rasterio will not install or import correctly, check Rasterio installation problems on Windows and Conda.
FAQ
How do I read a GeoTIFF in Python with Rasterio?
Open it with rasterio.open() and read a band with src.read(1).
import rasterio
with rasterio.open("data/dem.tif") as src:
    band1 = src.read(1)
Why does Rasterio use band numbers starting at 1?
Rasterio follows the GDAL convention, in which bands are numbered starting from 1 at the dataset level. NumPy arrays still use 0-based indexing after the data is loaded.
What is the difference between raster metadata and raster values?
Metadata describes the raster structure and spatial reference, such as CRS, bounds, transform, and nodata. Raster values are the actual cell values stored in each band.
Should I read the whole raster at once or one band at a time?
Read one band if that is all you need. Read the full raster only when you actually need all bands and the dataset fits comfortably in memory.
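To judge whether a full read fits in memory, you can estimate the array size from the metadata before reading anything. This is a rough sketch: estimated_size_mb is a hypothetical helper, and it assumes every band shares the same data type.

```python
import numpy as np

def estimated_size_mb(width, height, band_count, dtype):
    """Approximate in-memory size of an uncompressed full read, in megabytes."""
    bytes_per_value = np.dtype(dtype).itemsize
    return width * height * band_count * bytes_per_value / 1024**2

# Hypothetical 4-band uint16 image of 10,000 x 10,000 pixels
print(round(estimated_size_mb(10000, 10000, 4, "uint16")))  # 763
```

With an open dataset, the inputs would come from src.width, src.height, src.count, and src.dtypes[0].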