Quickstart
point-collocation matches each point to the pixel whose center is closest to the point's lat/lon (equivalent to method="nearest"). For time, a buffer of 0 means the time of the point must fall within the time range of the file; a buffer such as buffer="1D" also finds files within 1 day of the point. A buffer can help with L2 files that cover short windows (minutes) or with collections that have infrequent files.
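As a rough illustration of what "closest pixel center" means, the sketch below picks the nearest center along each axis of a small, made-up regular grid. The grid values and the helper name nearest_index are hypothetical and are not part of point-collocation; the package presumably does the equivalent on the real file coordinates, much like an xarray selection with method="nearest".

```python
def nearest_index(centers, value):
    """Return the index of the pixel center closest to value."""
    return min(range(len(centers)), key=lambda i: abs(centers[i] - value))

# Hypothetical pixel centers on a coarse 0.5-degree grid (not from a real file).
lat_centers = [28.25, 27.75, 27.25, 26.75, 26.25]  # descending, as in the L3 file shown later
lon_centers = [-83.75, -83.25, -82.75, -82.25]

# First point from the example CSV.
lat, lon = 27.3835, -82.7375

i = nearest_index(lat_centers, lat)
j = nearest_index(lon_centers, lon)
print(lat_centers[i], lon_centers[j])  # 27.25 -82.75
```

On a regular grid the two axes can be searched independently, which is why the sketch uses two 1-D lookups rather than a 2-D distance search.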
- Create a plan for files to use: pc.plan()
- Print the plan to check it: plan.summary()
- Do the plan and get matchups for variables: pc.matchup(plan, geometry='grid', variables=['var'])
Prerequisite -- Log in to EarthData
The examples here use NASA EarthData, so you need an EarthData account. Make sure you can log in.
<earthaccess.auth.Auth at 0x11212d7f0>
Get some points to match up
from pathlib import Path
import pandas as pd
HERE = Path.cwd()
POINTS_CSV = HERE / "fixtures" / "points.csv"
df_points = pd.read_csv(POINTS_CSV) # lat, lon, date columns
print(len(df_points))
df_points.head()
595
| | lat | lon | date |
|---|---|---|---|
| 0 | 27.3835 | -82.7375 | 2024-06-13 |
| 1 | 27.1190 | -82.7125 | 2024-06-14 |
| 2 | 26.9435 | -82.8170 | 2024-06-14 |
| 3 | 26.6875 | -82.8065 | 2024-06-14 |
| 4 | 26.6675 | -82.6455 | 2024-06-14 |
Start plan -- Take a look at the files in a collection
Now we use the point_collocation package. First we will look at the files available and figure out which ones we want.
%%time
import point_collocation as pc
plan = pc.plan(
    df_points,
    data_source="earthaccess",
    source_kwargs={
        "short_name": "PACE_OCI_L3M_RRS",
    }
)
CPU times: user 1.82 s, sys: 72.5 ms, total: 1.89 s
Wall time: 8.25 s
Plan: 595 points → 210 unique granule(s)
Points with 0 matches : 0
Points with >1 matches: 595
Time buffer: 0 days 00:00:00
First 1 point(s):
[0] lat=27.3835, lon=-82.7375, time=2024-06-13 12:00:00: 16 match(es)
→ https://obdaac-tea.earthdatacloud.nasa.gov/ob-cumulus-prod-public/PACE_OCI.20240321_20240620.L3m.SNSP.RRS.V3_1.Rrs.0p1deg.nc
→ https://obdaac-tea.earthdatacloud.nasa.gov/ob-cumulus-prod-public/PACE_OCI.20240321_20240620.L3m.SNSP.RRS.V3_1.Rrs.4km.nc
→ https://obdaac-tea.earthdatacloud.nasa.gov/ob-cumulus-prod-public/PACE_OCI.20240516_20240616.L3m.R32.RRS.V3_1.Rrs.0p1deg.nc
→ https://obdaac-tea.earthdatacloud.nasa.gov/ob-cumulus-prod-public/PACE_OCI.20240516_20240616.L3m.R32.RRS.V3_1.Rrs.4km.nc
→ https://obdaac-tea.earthdatacloud.nasa.gov/ob-cumulus-prod-public/PACE_OCI.20240524_20240624.L3m.R32.RRS.V3_1.Rrs.0p1deg.nc
→ https://obdaac-tea.earthdatacloud.nasa.gov/ob-cumulus-prod-public/PACE_OCI.20240524_20240624.L3m.R32.RRS.V3_1.Rrs.4km.nc
→ https://obdaac-tea.earthdatacloud.nasa.gov/ob-cumulus-prod-public/PACE_OCI.20240601_20240630.L3m.MO.RRS.V3_1.Rrs.0p1deg.nc
→ https://obdaac-tea.earthdatacloud.nasa.gov/ob-cumulus-prod-public/PACE_OCI.20240601_20240630.L3m.MO.RRS.V3_1.Rrs.4km.nc
→ https://obdaac-tea.earthdatacloud.nasa.gov/ob-cumulus-prod-public/PACE_OCI.20240601_20240702.L3m.R32.RRS.V3_1.Rrs.4km.nc
→ https://obdaac-tea.earthdatacloud.nasa.gov/ob-cumulus-prod-public/PACE_OCI.20240601_20240702.L3m.R32.RRS.V3_1.Rrs.0p1deg.nc
→ https://obdaac-tea.earthdatacloud.nasa.gov/ob-cumulus-prod-public/PACE_OCI.20240609_20240616.L3m.8D.RRS.V3_1.Rrs.0p1deg.nc
→ https://obdaac-tea.earthdatacloud.nasa.gov/ob-cumulus-prod-public/PACE_OCI.20240609_20240616.L3m.8D.RRS.V3_1.Rrs.4km.nc
→ https://obdaac-tea.earthdatacloud.nasa.gov/ob-cumulus-prod-public/PACE_OCI.20240609_20240710.L3m.R32.RRS.V3_1.Rrs.0p1deg.nc
→ https://obdaac-tea.earthdatacloud.nasa.gov/ob-cumulus-prod-public/PACE_OCI.20240609_20240710.L3m.R32.RRS.V3_1.Rrs.4km.nc
→ https://obdaac-tea.earthdatacloud.nasa.gov/ob-cumulus-prod-public/PACE_OCI.20240613.L3m.DAY.RRS.V3_1.Rrs.0p1deg.nc
→ https://obdaac-tea.earthdatacloud.nasa.gov/ob-cumulus-prod-public/PACE_OCI.20240613.L3m.DAY.RRS.V3_1.Rrs.4km.nc
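The first point has 16 matches because every listed file (seasonal, rolling 32-day, monthly, 8-day, and daily, each at two resolutions) covers 2024-06-13. The sketch below shows the time test described at the top of this page, using date ranges read off the file names. The helper name overlaps is hypothetical, treating the end date as inclusive through the end of that day is an assumption, and the package may take its time ranges from granule metadata rather than file names.

```python
from datetime import datetime, timedelta

def overlaps(point_time, start, end, buffer=timedelta(0)):
    """True if point_time falls within [start - buffer, end + buffer]."""
    return start - buffer <= point_time <= end + buffer

point_time = datetime(2024, 6, 13, 12, 0)  # first point in the plan

# Ranges read off file names above (end of day is an assumption).
monthly = (datetime(2024, 6, 1), datetime(2024, 6, 30, 23, 59, 59))
daily = (datetime(2024, 6, 13), datetime(2024, 6, 13, 23, 59, 59))
# Hypothetical next-day file, used only to show the effect of a buffer.
next_day = (datetime(2024, 6, 14), datetime(2024, 6, 14, 23, 59, 59))

print(overlaps(point_time, *monthly))                             # True
print(overlaps(point_time, *daily))                               # True
print(overlaps(point_time, *next_day))                            # False
print(overlaps(point_time, *next_day, buffer=timedelta(days=1)))  # True
```

With a buffer of 0 only files whose range contains the point match; widening the range by one day on each side picks up the neighboring file as well.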
Create new plan with filter on file names
We will use the monthly 4km files.
%%time
import point_collocation as pc
plan = pc.plan(
    df_points,
    data_source="earthaccess",
    source_kwargs={
        "short_name": "PACE_OCI_L3M_RRS",
        "granule_name": "*.MO.*.4km.*",
    }
)
CPU times: user 93.2 ms, sys: 32.5 ms, total: 126 ms
Wall time: 3.25 s
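To see which of the earlier file names the pattern "*.MO.*.4km.*" keeps, the sketch below applies it locally with Python's fnmatch module. This is only a stand-in: the real filtering happens in the EarthData search, and it is an assumption that its wildcard rules behave like fnmatch for a pattern this simple.

```python
from fnmatch import fnmatchcase

pattern = "*.MO.*.4km.*"

# A few file names taken from the first plan's output.
names = [
    "PACE_OCI.20240601_20240630.L3m.MO.RRS.V3_1.Rrs.0p1deg.nc",
    "PACE_OCI.20240601_20240630.L3m.MO.RRS.V3_1.Rrs.4km.nc",
    "PACE_OCI.20240609_20240616.L3m.8D.RRS.V3_1.Rrs.4km.nc",
    "PACE_OCI.20240613.L3m.DAY.RRS.V3_1.Rrs.4km.nc",
]

# Keep only names that contain both ".MO." and a later ".4km." segment.
kept = [name for name in names if fnmatchcase(name, pattern)]
print(kept)
# ['PACE_OCI.20240601_20240630.L3m.MO.RRS.V3_1.Rrs.4km.nc']
```

The dots around MO and 4km matter: they stop the pattern from matching those strings inside a longer token.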
# Check the plan and see how many files there are per point.
# We want 1 file per point in this case.
# Looks like 4 monthly files.
plan.summary()
Plan: 595 points → 4 unique granule(s)
Points with 0 matches : 0
Points with >1 matches: 0
Time buffer: 0 days 00:00:00
First 5 point(s):
[0] lat=27.3835, lon=-82.7375, time=2024-06-13 12:00:00: 1 match(es)
→ https://obdaac-tea.earthdatacloud.nasa.gov/ob-cumulus-prod-public/PACE_OCI.20240601_20240630.L3m.MO.RRS.V3_1.Rrs.4km.nc
[1] lat=27.1190, lon=-82.7125, time=2024-06-14 12:00:00: 1 match(es)
→ https://obdaac-tea.earthdatacloud.nasa.gov/ob-cumulus-prod-public/PACE_OCI.20240601_20240630.L3m.MO.RRS.V3_1.Rrs.4km.nc
[2] lat=26.9435, lon=-82.8170, time=2024-06-14 12:00:00: 1 match(es)
→ https://obdaac-tea.earthdatacloud.nasa.gov/ob-cumulus-prod-public/PACE_OCI.20240601_20240630.L3m.MO.RRS.V3_1.Rrs.4km.nc
[3] lat=26.6875, lon=-82.8065, time=2024-06-14 12:00:00: 1 match(es)
→ https://obdaac-tea.earthdatacloud.nasa.gov/ob-cumulus-prod-public/PACE_OCI.20240601_20240630.L3m.MO.RRS.V3_1.Rrs.4km.nc
[4] lat=26.6675, lon=-82.6455, time=2024-06-14 12:00:00: 1 match(es)
→ https://obdaac-tea.earthdatacloud.nasa.gov/ob-cumulus-prod-public/PACE_OCI.20240601_20240630.L3m.MO.RRS.V3_1.Rrs.4km.nc
Check the variables in the files
This will open one file and show us the variables. We want 'Rrs' in this case.
geometry : 'grid'
open_method : 'dataset'
Dimensions : {'lat': 4320, 'lon': 8640, 'wavelength': 172, 'rgb': 3, 'eightbitcolor': 256}
Variables : ['Rrs', 'palette']
Geolocation: ('lon', 'lat') — lon dims=('lon',), lat dims=('lat',)
Get the matchups using our plan
Let's start with 100 points since 595 might take a while.
CPU times: user 7.94 s, sys: 1.58 s, total: 9.52 s
Wall time: 46.7 s
| | lat | lon | time | granule_id | Rrs_346 | Rrs_348 | Rrs_351 | Rrs_353 | Rrs_356 | Rrs_358 | ... | Rrs_706 | Rrs_707 | Rrs_708 | Rrs_709 | Rrs_711 | Rrs_712 | Rrs_713 | Rrs_714 | Rrs_717 | Rrs_719 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 27.3835 | -82.7375 | 2024-06-13 12:00:00 | https://obdaac-tea.earthdatacloud.nasa.gov/ob-... | 0.004034 | 0.004070 | 0.004170 | 0.004278 | 0.004462 | 0.004604 | ... | 0.000224 | 0.000202 | 0.000190 | 0.000176 | 0.000168 | 0.000156 | 0.000144 | 0.000134 | 0.000158 | 0.000202 |
| 1 | 27.1190 | -82.7125 | 2024-06-14 12:00:00 | https://obdaac-tea.earthdatacloud.nasa.gov/ob-... | 0.004562 | 0.004616 | 0.004700 | 0.004692 | 0.004806 | 0.005070 | ... | 0.000108 | 0.000094 | 0.000084 | 0.000078 | 0.000072 | 0.000066 | 0.000060 | 0.000048 | 0.000062 | 0.000098 |
| 2 | 26.9435 | -82.8170 | 2024-06-14 12:00:00 | https://obdaac-tea.earthdatacloud.nasa.gov/ob-... | 0.005112 | 0.005282 | 0.005458 | 0.005582 | 0.005868 | 0.006226 | ... | 0.000118 | 0.000108 | 0.000102 | 0.000098 | 0.000098 | 0.000092 | 0.000086 | 0.000068 | 0.000052 | 0.000066 |
| 3 | 26.6875 | -82.8065 | 2024-06-14 12:00:00 | https://obdaac-tea.earthdatacloud.nasa.gov/ob-... | 0.004648 | 0.004904 | 0.005108 | 0.005242 | 0.005548 | 0.005944 | ... | 0.000178 | 0.000158 | 0.000148 | 0.000138 | 0.000130 | 0.000126 | 0.000126 | 0.000120 | 0.000158 | 0.000230 |
| 4 | 26.6675 | -82.6455 | 2024-06-14 12:00:00 | https://obdaac-tea.earthdatacloud.nasa.gov/ob-... | 0.004944 | 0.005064 | 0.005190 | 0.005288 | 0.005504 | 0.005838 | ... | 0.000094 | 0.000078 | 0.000068 | 0.000062 | 0.000058 | 0.000054 | 0.000052 | 0.000050 | 0.000106 | 0.000166 |
5 rows × 176 columns
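The 176 columns are the four point columns (lat, lon, time, granule_id) plus one column per wavelength (172 in this file). The sketch below rebuilds the first few column names for row 0 from the values shown above. The naming rule (variable name, underscore, integer wavelength) is inferred from the table header, not taken from the package source.

```python
# First six wavelengths and the matching row-0 values from the table above.
wavelengths = [346.0, 348.0, 351.0, 353.0, 356.0, 358.0]
spectrum = [0.004034, 0.004070, 0.004170, 0.004278, 0.004462, 0.004604]

# One wide column per wavelength: Rrs_346, Rrs_348, ...
row = {f"Rrs_{int(w)}": value for w, value in zip(wavelengths, spectrum)}
print(list(row))
# ['Rrs_346', 'Rrs_348', 'Rrs_351', 'Rrs_353', 'Rrs_356', 'Rrs_358']

# Four point columns plus 172 wavelength columns.
n_columns = 4 + 172
print(n_columns)  # 176
```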
Open files in plan
Sometimes it is helpful to look at the granules themselves. There are helper functions for that. You need to specify the format of the data: "grid" for Level 3 gridded data or "swath" for Level 2 swath data.
<xarray.Dataset> Size: 26GB
Dimensions: (lat: 4320, lon: 8640, wavelength: 172, rgb: 3,
eightbitcolor: 256)
Coordinates:
* lat (lat) float32 17kB 89.98 89.94 89.9 ... -89.9 -89.94 -89.98
* lon (lon) float32 35kB -180.0 -179.9 -179.9 ... 179.9 179.9 180.0
* wavelength (wavelength) float64 1kB 346.0 348.0 351.0 ... 714.0 717.0 719.0
Dimensions without coordinates: rgb, eightbitcolor
Data variables:
Rrs (lat, lon, wavelength) float32 26GB dask.array<chunksize=(16, 1024, 8), meta=np.ndarray>
palette (rgb, eightbitcolor) uint8 768B dask.array<chunksize=(3, 256), meta=np.ndarray>
Attributes: (12/64)
product_name: PACE_OCI.20240601_20240630.L3m.MO.RRS....
instrument: OCI
title: OCI Level-3 Standard Mapped Image
project: Ocean Biology Processing Group (NASA/G...
platform: PACE
source: satellite observations from OCI-PACE
... ...
identifier_product_doi: 10.5067/PACE/OCI/L3M/RRS/3.1
keywords: Earth Science > Oceans > Ocean Optics ...
keywords_vocabulary: NASA Global Change Master Directory (G...
data_bins: 16464585
data_minimum: -0.009998
data_maximum: 0.09856601