This repository provides examples of how to create stac-geoparquet files and multiple ways to query them.
The data referenced in this repository has been published to the AWS pgc-opendata-dems bucket under the prefix stac-api-data. The content of that prefix is shown below and mirrors the state of the PGC Public DEMs Dynamic STAC API as of April 2025.
The .ndjson files are those that were ingested into the stac-fastapi-pgstac application that serves the Dynamic STAC API.
The .parquet files are stac-geoparquet files generated from the .ndjson files using rustac. We are exploring this file format and the emerging tooling around it as a durable fallback should the Dynamic STAC API become unavailable.
$ aws --no-sign-request s3 ls --recursive --human-readable s3://pgc-opendata-dems/stac-api-data/
2025-06-13 10:12:55 1.9 MiB stac-api-data/geoparquet/arcticdem-mosaics-v3.0-10m.parquet
2025-06-13 10:12:55 4.2 MiB stac-api-data/geoparquet/arcticdem-mosaics-v3.0-2m.parquet
2025-06-13 10:12:55 2.0 MiB stac-api-data/geoparquet/arcticdem-mosaics-v3.0-32m.parquet
2025-06-13 10:12:54 8.9 MiB stac-api-data/geoparquet/arcticdem-mosaics-v4.1-10m.parquet
2025-06-13 10:12:54 14.3 MiB stac-api-data/geoparquet/arcticdem-mosaics-v4.1-2m.parquet
2025-06-13 10:12:54 9.1 MiB stac-api-data/geoparquet/arcticdem-mosaics-v4.1-32m.parquet
2025-06-13 10:12:54 796.3 MiB stac-api-data/geoparquet/arcticdem-strips-s2s041-2m.parquet
2025-06-13 10:12:56 7.5 MiB stac-api-data/geoparquet/earthdem-strips-s2s041-2m.parquet
2025-06-13 10:13:01 5.8 MiB stac-api-data/geoparquet/rema-mosaics-v2.0-10m.parquet
2025-06-13 10:12:55 9.3 MiB stac-api-data/geoparquet/rema-mosaics-v2.0-2m.parquet
2025-06-13 10:12:56 5.9 MiB stac-api-data/geoparquet/rema-mosaics-v2.0-32m.parquet
2025-06-13 10:12:56 576.7 MiB stac-api-data/geoparquet/rema-strips-s2s041-2m.parquet
2025-06-13 10:13:21 12.0 MiB stac-api-data/ndjson/arcticdem-mosaics-v3.0-10m.ndjson
2025-06-13 10:13:22 52.4 MiB stac-api-data/ndjson/arcticdem-mosaics-v3.0-2m.ndjson
2025-06-13 10:13:23 12.0 MiB stac-api-data/ndjson/arcticdem-mosaics-v3.0-32m.ndjson
2025-06-13 10:13:23 36.8 MiB stac-api-data/ndjson/arcticdem-mosaics-v4.1-10m.ndjson
2025-06-13 10:13:24 92.3 MiB stac-api-data/ndjson/arcticdem-mosaics-v4.1-2m.ndjson
2025-06-13 10:13:26 36.8 MiB stac-api-data/ndjson/arcticdem-mosaics-v4.1-32m.ndjson
2025-06-13 10:13:26 4.8 GiB stac-api-data/ndjson/arcticdem-strips-s2s041-2m.ndjson
2025-06-13 10:14:43 17.9 KiB stac-api-data/ndjson/collections.ndjson
2025-06-13 10:14:57 38.5 MiB stac-api-data/ndjson/earthdem-strips-s2s041-2m.ndjson
2025-06-13 10:14:55 23.5 MiB stac-api-data/ndjson/rema-mosaics-v2.0-10m.ndjson
2025-06-13 10:14:56 58.2 MiB stac-api-data/ndjson/rema-mosaics-v2.0-2m.ndjson
2025-06-13 10:14:58 23.5 MiB stac-api-data/ndjson/rema-mosaics-v2.0-32m.ndjson
2025-06-13 10:14:58 3.4 GiB stac-api-data/ndjson/rema-strips-s2s041-2m.ndjson
Using uv (recommended)
The example tool and script calls below assume that uv is available. uv creates an isolated virtual environment and uses it to run the scripts and tools.
Using conda
An equivalent conda environment can be created and activated with the following commands.
conda create -n stac-geoparquet-demo -c conda-forge 'click=8.2.1' 'duckdb=1.3.0' 'rustac=0.7.0'
conda activate stac-geoparquet-demo
# rustac can be called directly
rustac --help
# query_with_duckdb.py can be called via python
python query_with_duckdb.py --help
# deactivate the environment
conda deactivate
Using pip
An equivalent Python environment can be created and activated with the following commands.
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install 'click==8.2.1' 'duckdb==1.3.0' 'rustac==0.7.0'
# rustac can be called directly
rustac --help
# query_with_duckdb.py can be called via python
python query_with_duckdb.py --help
# deactivate the environment
deactivate
The stac-geoparquet files were created from newline-delimited JSON STAC Item files using the rustac translate command.
uvx rustac@0.7.0 translate arcticdem-mosaics-v3.0-10m.ndjson arcticdem-mosaics-v3.0-10m.parquet
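For readers unfamiliar with the input format, a newline-delimited JSON file stores one complete STAC Item per line. The sketch below uses only the Python standard library to walk such a file and count Items per collection, which can be a quick sanity check before translating. The function name count_items_by_collection is illustrative and is not part of this repository.

```python
import json
from collections import Counter
from pathlib import Path


def count_items_by_collection(ndjson_path):
    """Count STAC Items per 'collection' value in a newline-delimited JSON file."""
    counts = Counter()
    with Path(ndjson_path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:  # tolerate blank lines between records
                continue
            item = json.loads(line)  # each non-blank line is one JSON document
            counts[item.get("collection")] += 1
    return counts
```

For example, count_items_by_collection("arcticdem-mosaics-v3.0-10m.ndjson") would be run against a locally downloaded copy of that file.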
The rustac search command provides STAC API-like search functionality that can read directly from stac-geoparquet files.
NOTE: Currently, rustac does not properly deserialize geometry objects that end up nested in the STAC Item 'properties' field (e.g. 'proj:geometry'). This can be circumvented by passing the flag --fields='-proj:geometry', which means "return all fields except 'proj:geometry'".
uvx rustac@0.7.0 search \
--intersects='{ "coordinates": [ -113.03279621237077, 61.405347059880484 ], "type": "Point" }' \
--fields='-proj:geometry' \
https://pgc-opendata-dems.s3.us-west-2.amazonaws.com/stac-api-data/geoparquet/arcticdem-mosaics-v4.1-2m.parquet
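When the search point comes from another program, hand-writing the GeoJSON inside shell quotes is error-prone. The sketch below assembles the same command as an argument list in Python, using only the flags shown above. The helper name build_search_command is hypothetical, and the sketch assumes uvx is on the PATH if the command is actually executed.

```python
import json
import shlex

PARQUET_URL = (
    "https://pgc-opendata-dems.s3.us-west-2.amazonaws.com/"
    "stac-api-data/geoparquet/arcticdem-mosaics-v4.1-2m.parquet"
)


def build_search_command(lon, lat, href=PARQUET_URL):
    """Return the argument list for a rustac search at a lon/lat point."""
    # GeoJSON uses longitude, latitude ordering.
    point = {"type": "Point", "coordinates": [lon, lat]}
    return [
        "uvx",
        "rustac@0.7.0",
        "search",
        f"--intersects={json.dumps(point)}",
        "--fields=-proj:geometry",  # workaround described in the NOTE above
        href,
    ]


if __name__ == "__main__":
    # Print a shell-quoted command line; pass the list to subprocess.run to execute it.
    print(shlex.join(build_search_command(-113.0328, 61.4053)))
```

Passing a list to subprocess.run avoids a second layer of shell quoting around the JSON.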
DuckDB is the engine that rustac uses for reading, querying, and writing stac-geoparquet files. It can be used to extract content in any shape directly from the stac-geoparquet files. The file query_with_duckdb.py provides an example of how to construct FeatureCollections, like those that exist in the pgc-opendata-dems bucket, from a stac-geoparquet file.
# Show CLI help
uv run query_with_duckdb.py --help
# Example returning features with a given 'pgc:geocell' value
uv run query_with_duckdb.py by-geocell --collection arcticdem-strips-s2s041-2m --geocell n51w176 | jq
# Example returning features intersecting a given WKT geometry
uv run query_with_duckdb.py by-wkt --collection arcticdem-strips-s2s041-2m --wkt 'Point(-113.0328 61.4053)' | jq
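To give a sense of what such a query can look like, here is a minimal sketch of a WKT intersection query with the DuckDB Python client. It is not the repository's query_with_duckdb.py. It assumes that the httpfs and spatial extensions can be installed, that the geometry column is read as a DuckDB GEOMETRY once spatial is loaded, and that each file is named after its collection as in the listing above. The function names are illustrative, and the result is a reduced FeatureCollection carrying only id and geometry.

```python
import json

BASE_URL = (
    "https://pgc-opendata-dems.s3.us-west-2.amazonaws.com/"
    "stac-api-data/geoparquet"
)


def parquet_url(collection):
    """Map a collection id to the public URL of its stac-geoparquet file."""
    return f"{BASE_URL}/{collection}.parquet"


def rows_to_feature_collection(rows):
    """Wrap (id, geojson_geometry_string) rows in a GeoJSON FeatureCollection."""
    features = [
        {
            "type": "Feature",
            "id": item_id,
            "geometry": json.loads(geom_json),
            "properties": {},  # left empty in this reduced sketch
        }
        for item_id, geom_json in rows
    ]
    return {"type": "FeatureCollection", "features": features}


def query_by_wkt(collection, wkt):
    """Return Items of a collection whose geometry intersects a WKT geometry."""
    import duckdb  # third-party; imported lazily so the helpers above work without it

    con = duckdb.connect()
    for ext in ("httpfs", "spatial"):
        con.install_extension(ext)
        con.load_extension(ext)
    sql = (
        "SELECT id, CAST(ST_AsGeoJSON(geometry) AS VARCHAR) "
        "FROM read_parquet(?) "
        "WHERE ST_Intersects(geometry, ST_GeomFromText(?))"
    )
    rows = con.execute(sql, [parquet_url(collection), wkt]).fetchall()
    return rows_to_feature_collection(rows)
```

A call such as query_by_wkt("arcticdem-mosaics-v4.1-2m", "POINT(-113.0328 61.4053)") reads the file over HTTPS, so larger strip collections may take noticeably longer than the mosaics.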