Base run
A simple run of xagg, aggregating gridded temperature data over US counties. For a deeper dive into xagg’s functionality, see the Detailed Code Run.
[1]:
import xagg as xa
import xarray as xr
import numpy as np
import geopandas as gpd
Import
The sample data in this example are:
gridded: month-of-year average temperature projections for the end-of-century from a climate model (CCSM4)
shapefiles: US counties
[2]:
# Load some climate data as an xarray dataset
ds = xr.open_dataset('../../data/climate_data/tas_Amon_CCSM4_rcp85_monthavg_20700101-20991231.nc')
ds
[2]:
<xarray.Dataset>
Dimensions:   (lon: 288, lat: 192, month: 12, bnds: 2)
Coordinates:
    height    float64 ...
  * lon       (lon) float64 0.0 1.25 2.5 3.75 5.0 ... 355.0 356.2 357.5 358.8
  * lat       (lat) float64 -90.0 -89.06 -88.12 -87.17 ... 88.12 89.06 90.0
  * month     (month) int64 1 2 3 4 5 6 7 8 9 10 11 12
Dimensions without coordinates: bnds
Data variables:
    lat_bnds  (month, lat, bnds) float64 ...
    lon_bnds  (month, lon, bnds) float64 ...
    tas       (month, lat, lon) float32 ...
[3]:
# Load US counties shapefile as a geopandas GeoDataFrame
gdf = gpd.read_file('../../data/geo_data/UScounties.shp')
gdf.head()
[3]:
| | NAME | STATE_NAME | STATE_FIPS | CNTY_FIPS | FIPS | geometry |
|---|---|---|---|---|---|---|
| 0 | Lake of the Woods | Minnesota | 27 | 077 | 27077 | POLYGON ((-95.34283 48.54668, -95.34105 48.715... |
| 1 | Ferry | Washington | 53 | 019 | 53019 | POLYGON ((-118.85163 47.94956, -118.84846 48.4... |
| 2 | Stevens | Washington | 53 | 065 | 53065 | POLYGON ((-117.43883 48.04412, -117.54219 48.0... |
| 3 | Okanogan | Washington | 53 | 047 | 53047 | POLYGON ((-118.97209 47.93915, -118.97406 47.9... |
| 4 | Pend Oreille | Washington | 53 | 051 | 53051 | POLYGON ((-117.43858 48.99992, -117.03205 48.9... |
Aggregate
Now, aggregate the gridded variable in ds onto the polygons in gdf. Use the option silent=True if you’d like to suppress the status updates.
[4]:
# Calculate the overlaps between each grid pixel and each county polygon
weightmap = xa.pixel_overlaps(ds, gdf)
creating polygons for each pixel...
calculating overlaps between pixels and output polygons...
success!
[5]:
# Aggregate the gridded data onto the polygons using the overlap weights
aggregated = xa.aggregate(ds, weightmap)
adjusting grid... (this may happen because only a subset of pixels were used for aggregation for efficiency - i.e. [subset_bbox=True] in xa.pixel_overlaps())
grid adjustment successful
aggregating tas...
all variables aggregated to polygons!
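Conceptually, the aggregation step averages the pixels that touch each polygon, weighting each pixel by how much of it overlaps the polygon. The snippet below is only a minimal, pure-Python sketch of that idea, not xagg’s implementation: the function name area_weighted_mean, the pixel values, and the overlap areas are all hypothetical and chosen for illustration.

```python
def area_weighted_mean(values, overlap_areas):
    """Average pixel values, weighting each by its overlap area with one polygon."""
    if len(values) != len(overlap_areas):
        raise ValueError("values and overlap_areas must have the same length")
    total_area = sum(overlap_areas)
    if total_area == 0:
        raise ValueError("the polygon does not overlap any pixel")
    weighted_sum = sum(v * a for v, a in zip(values, overlap_areas))
    return weighted_sum / total_area


# Hypothetical county covered by three pixels (temperatures in kelvin).
pixel_tas = [270.0, 272.0, 275.0]
# Hypothetical overlap areas; only their relative sizes matter.
overlap_areas = [0.5, 0.3, 0.2]

county_tas = area_weighted_mean(pixel_tas, overlap_areas)
print(round(county_tas, 3))
```

Because the weights are normalized by their sum, the overlap areas can be in any consistent unit.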
Convert
Finally, convert the aggregated data back into the format you would like.
[6]:
# Example as an xarray dataset
ds_out = aggregated.to_dataset()
ds_out
[6]:
<xarray.Dataset>
Dimensions:     (poly_idx: 3141, month: 12)
Coordinates:
  * poly_idx    (poly_idx) int64 0 1 2 3 4 5 6 ... 3135 3136 3137 3138 3139 3140
  * month       (month) int64 1 2 3 4 5 6 7 8 9 10 11 12
Data variables:
    NAME        (poly_idx) object 'Lake of the Woods' 'Ferry' ... 'Broomfield'
    STATE_NAME  (poly_idx) object 'Minnesota' 'Washington' ... 'Colorado'
    STATE_FIPS  (poly_idx) object '27' '53' '53' '53' ... '02' '02' '02' '08'
    CNTY_FIPS   (poly_idx) object '077' '019' '065' '047' ... '240' '068' '014'
    FIPS        (poly_idx) object '27077' '53019' '53065' ... '02068' '08014'
    tas         (poly_idx, month) float64 263.9 268.8 274.0 ... 276.4 270.4
[7]:
# Example as a pandas dataframe
df_out = aggregated.to_dataframe()
df_out
[7]:
| poly_idx | month | NAME | STATE_NAME | STATE_FIPS | CNTY_FIPS | FIPS | tas |
|---|---|---|---|---|---|---|---|
| 0 | 1 | Lake of the Woods | Minnesota | 27 | 077 | 27077 | 263.918943 |
| | 2 | Lake of the Woods | Minnesota | 27 | 077 | 27077 | 268.834073 |
| | 3 | Lake of the Woods | Minnesota | 27 | 077 | 27077 | 273.977533 |
| | 4 | Lake of the Woods | Minnesota | 27 | 077 | 27077 | 283.141960 |
| | 5 | Lake of the Woods | Minnesota | 27 | 077 | 27077 | 290.623952 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 3140 | 8 | Broomfield | Colorado | 08 | 014 | 08014 | 297.646820 |
| | 9 | Broomfield | Colorado | 08 | 014 | 08014 | 292.368988 |
| | 10 | Broomfield | Colorado | 08 | 014 | 08014 | 283.544708 |
| | 11 | Broomfield | Colorado | 08 | 014 | 08014 | 276.383606 |
| | 12 | Broomfield | Colorado | 08 | 014 | 08014 | 270.444855 |
37692 rows × 6 columns
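Once the data are in a dataframe, ordinary pandas operations apply. As a hedged illustration, the sketch below converts tas to degrees Celsius. It assumes tas is in kelvin (the usual convention for climate model output, and consistent with the values shown above). To stay self-contained it builds a small stand-in frame, df_demo, from the first five rows displayed above rather than using df_out; the names df_demo and tas_degC are hypothetical.

```python
import pandas as pd

# Stand-in for the first rows of df_out, indexed by (poly_idx, month).
index = pd.MultiIndex.from_product(
    [[0], [1, 2, 3, 4, 5]], names=["poly_idx", "month"]
)
df_demo = pd.DataFrame(
    {
        "NAME": "Lake of the Woods",
        "tas": [263.918943, 268.834073, 273.977533, 283.141960, 290.623952],
    },
    index=index,
)

# Assumption: tas is in kelvin, so subtract 273.15 to get degrees Celsius.
df_demo["tas_degC"] = df_demo["tas"] - 273.15

# Month (second index level) with the highest temperature among these rows.
warmest_month = df_demo["tas_degC"].idxmax()[1]

print(df_demo)
print("warmest of the months shown:", warmest_month)
```

The same column arithmetic should carry over to the full df_out, since it has the same (poly_idx, month) index and a tas column.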