A Precise Guide to Investigate Sentinel-2 data in Python (Downloading & Visualizing)

Aman Sharma
4 min readJul 16, 2020

The Sentinels are a constellation of satellites commissioned by Europe’s Copernicus Programme. The Copernicus Sentinel-2 mission comprises of two satellites that provide high-resolution optical imagery. Interestingly, the data is released under an Open Data policy allowing free access to images, and this opens a range of possibilities for academic works. It is always good to know some details about our data before working on, so here we go:

The Sentinel-2 acquires 13 spectral bands ranging from Visible and Near-Infrared to Shortwave Infrared wavelengths along a 290-km orbital swath. While it’s spatial resolution varies from 10m to 60m, the four bands wiz. red, blue, green, and NIR have a resolution of 10m per pixel. The revisit time of the satellites is 5 days but we get cloud-free products typically every 15 to 30 days over Europe and Africa.

Photo by NASA on Unsplash

Now let’s talk about automating the download process using Python.

The outline is that we’ll define our area of interest (AOI) in a Shapefile or GeoJSON file (a format for encoding geographic data structures) and generate a query to download a scene overlapping our AOI. All this to eventually crop out our AOI from the downloaded scene. There are several ways to create a Shapefile or a GeoJSON file and geojson.io is one such place.

Before proceeding, install a few essential libraries; geopandas, sentinelsat, rasterio, gdal, pyproj, fiona. If you face any trouble installing these libraries follow this link. Create an account on Copernicus Open Access Hub, and note down username and password. Everything’s ready, now let’s get our hands dirty with some codes.

import folium
import geopandas as gpd
from sentinelsat.sentinel import SentinelAPI
import rasterio
import matplotlib.pyplot as plt
from rasterio import plot
from rasterio.plot import show
from rasterio.mask import mask
from osgeo import gdal

Visualize and verify our region of interest on an interactive folium map.

m = folium.Map([latitude, longitude], zoom_start=11)
boundary = gpd.read_file(r'#path to the geojson file')

Create a footprint of the region and generate a query.

footprint = None
for i in boundary['geometry']:
footprint = i

user = #Username
password = #Password
api = SentinelAPI(user, password, 'https://scihub.copernicus.eu/dhus')products = api.query(footprint,
date = ('20200109', '20200510'),
platformname = 'Sentinel-2',
processinglevel = 'Level-2A',
cloudcoverpercentage = (0, 20))

Create a GeoDataframe for the products and sort the queries on cloud cover percentage to fetch the scene with the lowest cloud cover or as per requirement. Call the required ‘uuid’ in api.download, and initiate the download process. Extract the data manually and locate its directory.

gdf = api.to_geodataframe(products)
gdf_sorted = gdf.sort_values(['cloudcoverpercentage'], ascending=[True])


Our dataset is ready with all 13 bands of information, though we just need red, blue, and green bands to create an RGB image. We will write the data of these three bands in an empty tiff file and mask our region of interest on it. Locate the folder named ‘R10’ and change the names of files in the code below accordingly.

bands = r'...\GRANULE\L2A_T18TWL_A025934_20200609T155403\IMG_DATA\R10m'
blue = rasterio.open(bands+'\T18TWL_20200609T154911_B02_10m.jp2')
green = rasterio.open(bands+'\T18TWL_20200609T154911_B03_10m.jp2')
red = rasterio.open(bands+'\T18TWL_20200609T154911_B04_10m.jp2')
with rasterio.open('image_name.tiff','w',driver='Gtiff', width=blue.width, height=blue.height, count=3, crs=blue.crs,transform=blue.transform, dtype=blue.dtypes[0]) as rgb:

Instead of visualizing this image right away, it is recommended to mask our region of interest as it will save computational power. Make sure the coordinate reference system (CRS) of the GeoJSON file is converted to the CRS of the RGB image created above.

bound_crs = boundary.to_crs({'init': 'epsg:32618'})with rasterio.open("image_name.tiff") as src:
out_image, out_transform = mask(src,
out_meta = src.meta.copy()
out_meta.update({"driver": "GTiff",
"height": out_image.shape[1],
"width": out_image.shape[2],
"transform": out_transform})

with rasterio.open("masked_image.tif", "w", **out_meta) as final:

src = rasterio.open(r'...\masked_image.tif')
plt.title('Final Image')
plot.show(src, adjust='linear')