In the Copernicus Data Space Ecosystem (CDSE), STAC stands for SpatioTemporal Asset Catalog. It is a standardized, open-source metadata specification used to structure, search, and discover Earth Observation (EO) data.
Instead of requiring users to download massive, raw satellite images, the CDSE STAC RESTful API allows developers to query precise metadata (e.g., location, time, cloud cover, and specific bands) to locate exact data assets.
Key Functions of STAC:
- Data Access: It provides a direct path (such as S3 storage links) to cloud-hosted imagery, enabling tools to process a subset of data without full downloads.
- System Interoperability: It describes data from different missions with one common data model instead of satellite-specific metadata formats, so the same code can handle diverse datasets such as Sentinel-1 and Sentinel-2.
The package also supports the complementary primary catalogue through
OData. For more details, see vignette("OData").
To find out which STAC client the server offers, call
dse_stac_client().
Data Exploration
A good starting point for exploring data with STAC is knowing which collections of data are available. You can list them as follows:
library(CopernicusDataspace)
dse_stac_collections()
#> # A tibble: 10 × 20
#> id type links title assets extent license keywords providers
#> <chr> <chr> <list> <chr> <list> <list> <chr> <list> <list>
#> 1 ccm-optical Coll… <list> Cope… <tibble> <tibble> other <list> <list>
#> 2 ccm-sar Coll… <list> Cope… <tibble> <tibble> other <list> <list>
#> 3 clms_ba_glob… Coll… <list> CLMS… <tibble> <tibble> other <NULL> <list>
#> 4 clms_ba_glob… Coll… <list> CLMS… <tibble> <tibble> other <NULL> <list>
#> 5 clms_ba_glob… Coll… <list> CLMS… <tibble> <tibble> other <NULL> <list>
#> 6 clms_ba_glob… Coll… <list> CLMS… <tibble> <tibble> other <NULL> <list>
#> 7 clms_ba_glob… Coll… <list> CLMS… <tibble> <tibble> other <NULL> <list>
#> 8 clms_ba_glob… Coll… <list> CLMS… <tibble> <tibble> other <NULL> <list>
#> 9 clms_ba_glob… Coll… <list> CLMS… <tibble> <tibble> other <NULL> <list>
#> 10 clms_ba_glob… Coll… <list> CLMS… <tibble> <tibble> other <NULL> <list>
#> # ℹ 11 more variables: summaries <list>, description <chr>, item_assets <list>,
#> # `auth:schemes` <list>, `ceosard:type` <list>, stac_version <chr>,
#> # stac_extensions <list>, `storage:schemes` <list>, bands <list>,
#> # `sci:doi` <list>, contacts <list>
The returned data.frame contains descriptive information about each of
the collections, which can help you focus your search. Once you have
identified a collection, you can check which filters (queryables) are
available to narrow your search further:
dse_stac_queryables("sentinel-1-grd") |> summary()
#> Length Class Mode
#> $id 1 -none- character
#> type 1 -none- character
#> title 1 -none- character
#> $schema 1 -none- character
#> properties 11 -none- list
#> additionalProperties 1 -none- logical
The example above shows 11 properties that can be used to focus the
search. You can start an actual search by creating a STAC search request
with dse_stac_search_request(). It creates a special class of httr2
request object. In essence, it is a request to the API server that you
can modify with tidyverse verbs. This sounds more complicated than it
is.
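The two exploration steps above can be wrapped in small helper functions. The sketch below is illustrative only: find_collections() and queryable_names() are hypothetical helpers, not part of CopernicusDataspace. They assume the structures suggested by the printed output above, namely a table with id and title columns and a list whose properties element is a named list. The live calls are commented out because they need network access.

```r
library(dplyr)

# Hypothetical helper (not part of CopernicusDataspace): keep the collections
# whose id matches a regular expression and show only id and title.
find_collections <- function(collections, pattern) {
  collections |>
    filter(grepl(pattern, .data$id)) |>
    select(id, title)
}

# Hypothetical helper: names of the properties that can be used to filter.
# Assumes `properties` is a named list, as the summary above suggests.
queryable_names <- function(queryables) {
  names(queryables$properties)
}

# Live use (needs network access and library(CopernicusDataspace)):
# dse_stac_collections() |> find_collections("^sentinel-1")
# dse_stac_queryables("sentinel-1-grd") |> queryable_names()
```

Keeping the helpers separate from the server calls means they can be tried on any small stand-in table or list before querying the real catalogue.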
Once you have created the request, you can modify it with tidyverse
verbs such as filter(), arrange() and slice_head(). You can chain those
modifications with the pipe operator (|> or %>%). You can also restrict
the results to products that intersect with specific spatial features
(sf) using st_intersects(). Finally, collect() retrieves the results.
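To make the spatial predicate concrete, the sketch below uses sf on its own, without sending anything to the server. It turns the bounding box used in this vignette into a polygon and tests two points against it. The two points are made up for illustration and do not come from the catalogue.

```r
library(sf)

# Area of interest: the bounding box from this vignette, as a polygon
aoi <- st_as_sfc(st_bbox(
  c(xmin = 5.261, ymin = 52.680, xmax = 5.319, ymax = 52.715),
  crs = 4326
))

# Two illustrative test points (longitude, latitude)
pts <- st_sfc(
  st_point(c(5.29, 52.70)),  # inside the box
  st_point(c(4.90, 52.37)),  # well outside the box
  crs = 4326
)

# Logical matrix: one row per area, one column per point
hits <- st_intersects(aoi, pts, sparse = FALSE)
hits
```

Only the first point intersects the area. A search request narrowed with st_intersects() applies the same idea to product footprints.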
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(sf)
#> Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.4.0; sf_use_s2() is TRUE
bbox <-
  sf::st_bbox(
    c(xmin = 5.261, ymin = 52.680, xmax = 5.319, ymax = 52.715),
    crs = 4326)
dse_stac_search_request("sentinel-1-grd") |>
  filter(`sat:orbit_state` == "ascending") |>
  arrange("id") |>
  st_intersects(bbox) |>
  collect()
#> # A tibble: 10 × 56
#> id bbox type links assets geometry collection properties.created
#> * <chr> <list> <chr> <list> <list> <list> <chr> <chr>
#> 1 S1A_EW_G… <list> Feat… <list> <tibble> <tibble> sentinel-… 2023-06-26T05:37:…
#> 2 S1A_EW_G… <list> Feat… <list> <tibble> <tibble> sentinel-… 2023-04-27T16:33:…
#> 3 S1A_EW_G… <list> Feat… <list> <tibble> <tibble> sentinel-… 2023-04-27T18:49:…
#> 4 S1A_IW_G… <list> Feat… <list> <tibble> <tibble> sentinel-… 2023-05-08T16:00:…
#> 5 S1A_IW_G… <list> Feat… <list> <tibble> <tibble> sentinel-… 2023-05-08T16:50:…
#> 6 S1A_IW_G… <list> Feat… <list> <tibble> <tibble> sentinel-… 2023-04-25T15:40:…
#> 7 S1A_IW_G… <list> Feat… <list> <tibble> <tibble> sentinel-… 2023-04-27T11:18:…
#> 8 S1A_IW_G… <list> Feat… <list> <tibble> <tibble> sentinel-… 2023-04-27T13:01:…
#> 9 S1A_IW_G… <list> Feat… <list> <tibble> <tibble> sentinel-… 2023-04-27T15:24:…
#> 10 S1A_IW_G… <list> Feat… <list> <tibble> <tibble> sentinel-… 2023-05-04T14:05:…
#> # ℹ 48 more variables: properties.updated <chr>, properties.datetime <chr>,
#> # properties.platform <chr>, properties.published <chr>,
#> # properties.instruments <list>, `properties.auth:schemes.s3.type` <chr>,
#> # `properties.auth:schemes.oidc.type` <chr>,
#> # `properties.auth:schemes.oidc.openIdConnectUrl` <chr>,
#> # properties.end_datetime <chr>, `properties.product:type` <chr>,
#> # `properties.view:azimuth` <dbl>, properties.constellation <chr>, …
Downloading Data
To download data, you can first retrieve the Uniform Resource
Identifier (URI) with dse_stac_get_uri(). To get a URI, you need at
least the asset identifier and the specific asset (for example, band
B01). Both can be obtained with a search as shown above. The example
below shows you how:
dse_stac_get_uri(
  asset_id = "S2A_MSIL1C_20260109T132741_N0511_R024_T39XVL_20260109T142148",
  asset = "B01",
  collection = "sentinel-2-l1c"
)
#> [1] "s3://eodata/Sentinel-2/MSI/L1C/2026/01/09/S2A_MSIL1C_20260109T132741_N0511_R024_T39XVL_20260109T142148.SAFE/GRANULE/L1C_T39XVL_A055105_20260109T132737/IMG_DATA/T39XVL_20260109T132741_B01.jp2"
#> attr(,"local_path")
#> [1] "S2A_MSIL1C_20260109T132741_N0511_R024_T39XVL_20260109T142148.SAFE/GRANULE/L1C_T39XVL_A055105_20260109T132737/IMG_DATA/T39XVL_20260109T132741_B01.jp2"
This approach also needs the collection that provides the asset. If you
don’t provide it, it will be guessed with dse_stac_guess_collection().
This function is not 100% reliable, so it’s best practice to provide the
collection explicitly.
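If you do want to work with the URI yourself, it helps to split it into its parts. The sketch below uses base R only and takes the URI from the example above as a plain string. The variable names are illustrative, and the split assumes the usual s3://bucket/key layout.

```r
# URI copied from the example output above
uri <- paste0(
  "s3://eodata/Sentinel-2/MSI/L1C/2026/01/09/",
  "S2A_MSIL1C_20260109T132741_N0511_R024_T39XVL_20260109T142148.SAFE/",
  "GRANULE/L1C_T39XVL_A055105_20260109T132737/IMG_DATA/",
  "T39XVL_20260109T132741_B01.jp2"
)

# Drop the scheme, then split at the first "/" into bucket and object key
without_scheme <- sub("^s3://", "", uri)
bucket <- sub("/.*$", "", without_scheme)
key <- sub("^[^/]+/", "", without_scheme)

bucket
#> [1] "eodata"
basename(key)
#> [1] "T39XVL_20260109T132741_B01.jp2"
```

The bucket and key are the two values an S3 client typically asks for, and basename() gives a file name for a local copy.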
Instead of working with the URI yourself, it is easier to call
dse_stac_download(). It automatically takes care of the authentication
required for downloading the file (if credentials are properly
provided). Check vignette("Authentication") for more information about
the authentication process.
dse_stac_download(
  asset_id = "S2A_MSIL1C_20260109T132741_N0511_R024_T39XVL_20260109T142148",
  asset = "B01",
  collection = "sentinel-2-l1c",
  destination = tempdir()
)