Base run

A simple run of xagg, aggregating gridded temperature data over US counties. For a deeper dive into xagg’s functionality, see the Detailed Code Run.

[1]:

import xagg as xa
import xarray as xr
import numpy as np
import geopandas as gpd

Import

The sample data in this example are:

gridded: month-of-year average temperature projections for the end of the century (2070-2099 under RCP8.5, per the file name) from a climate model (CCSM4)
shapefiles: US counties

[2]:

# Load some climate data as an xarray dataset
ds = xr.open_dataset('../../data/climate_data/tas_Amon_CCSM4_rcp85_monthavg_20700101-20991231.nc')
ds

[3]:

# Load US counties shapefile as a geopandas GeoDataFrame
gdf = gpd.read_file('../../data/geo_data/UScounties.shp')
gdf.head()

[3]:

	NAME	STATE_NAME	STATE_FIPS	CNTY_FIPS	FIPS	geometry
0	Lake of the Woods	Minnesota	27	077	27077	POLYGON ((-95.34283 48.54668, -95.34105 48.715...
1	Ferry	Washington	53	019	53019	POLYGON ((-118.85163 47.94956, -118.84846 48.4...
2	Stevens	Washington	53	065	53065	POLYGON ((-117.43883 48.04412, -117.54219 48.0...
3	Okanogan	Washington	53	047	53047	POLYGON ((-118.97209 47.93915, -118.97406 47.9...
4	Pend Oreille	Washington	53	051	53051	POLYGON ((-117.43858 48.99992, -117.03205 48.9...

Aggregate

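Now aggregate the gridded variable in ds onto the polygons in gdf. This takes two steps: first calculate how much each pixel overlaps each polygon, then aggregate the data using those overlaps. Use the option silent=True if you’d like to suppress the status updates.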

[4]:

# Calculate overlaps
weightmap = xa.pixel_overlaps(ds, gdf)

creating polygons for each pixel...
calculating overlaps between pixels and output polygons...
success!

[5]:

# Aggregate
aggregated = xa.aggregate(ds, weightmap)

adjusting grid... (this may happen because only a subset of pixels were used for aggregation for efficiency - i.e. [subset_bbox=True] in xa.pixel_overlaps())
grid adjustment successful
aggregating tas...
all variables aggregated to polygons!
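
To illustrate what the two steps above compute, here is a minimal sketch of an area-weighted mean for a single polygon. It assumes each pixel is weighted by the share of the polygon it overlaps. The function name area_weighted_mean and the pixel values are hypothetical and are not part of xagg; the library's own implementation may differ in detail.

```python
def area_weighted_mean(values, overlap_areas):
    """Average pixel values, weighting each by its overlap area with one polygon."""
    total_area = sum(overlap_areas)
    if total_area == 0:
        raise ValueError("the polygon does not overlap any pixel")
    weighted_sum = sum(v * a for v, a in zip(values, overlap_areas))
    return weighted_sum / total_area


# Hypothetical example: three pixels touch one county
pixel_tas = [270.0, 272.0, 280.0]   # pixel temperatures (made-up values)
overlap = [0.5, 0.3, 0.2]           # relative overlap area of each pixel with the county

county_tas = area_weighted_mean(pixel_tas, overlap)
print(county_tas)  # about 272.6
```

The pixel covering half of the county contributes half of the result, so a county mostly inside one grid cell takes a value close to that cell's.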

Convert

Finally, convert the aggregated data into whichever output format you need.

[6]:

# Example as an xarray dataset
ds_out = aggregated.to_dataset()
ds_out

[7]:

# Example as a pandas dataframe
df_out = aggregated.to_dataframe()
df_out

[7]:

		NAME	STATE_NAME	STATE_FIPS	CNTY_FIPS	FIPS	tas
poly_idx	month
0	1	Lake of the Woods	Minnesota	27	077	27077	263.918943
	2	Lake of the Woods	Minnesota	27	077	27077	268.834073
	3	Lake of the Woods	Minnesota	27	077	27077	273.977533
	4	Lake of the Woods	Minnesota	27	077	27077	283.141960
	5	Lake of the Woods	Minnesota	27	077	27077	290.623952
...	...	...	...	...	...	...	...
3140	8	Broomfield	Colorado	08	014	08014	297.646820
	9	Broomfield	Colorado	08	014	08014	292.368988
	10	Broomfield	Colorado	08	014	08014	283.544708
	11	Broomfield	Colorado	08	014	08014	276.383606
	12	Broomfield	Colorado	08	014	08014	270.444855

37692 rows × 6 columns
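
Once the data are in a dataframe, ordinary pandas operations apply. The sketch below assumes tas is in Kelvin (the values above are consistent with that) and converts it to degrees Celsius before averaging by county. It uses a small stand-in dataframe, df_demo, built from a few rows of the output above so that it runs on its own; the names df_demo, tas_degC and mean_by_county are illustrative only.

```python
import pandas as pd

# Stand-in for df_out: a few rows copied from the output above,
# with the same (poly_idx, month) MultiIndex layout
index = pd.MultiIndex.from_tuples(
    [(0, 1), (0, 2), (3140, 11), (3140, 12)],
    names=["poly_idx", "month"],
)
df_demo = pd.DataFrame(
    {
        "NAME": ["Lake of the Woods", "Lake of the Woods", "Broomfield", "Broomfield"],
        "tas": [263.918943, 268.834073, 276.383606, 270.444855],
    },
    index=index,
)

# Assumption: tas is in Kelvin, so subtract 273.15 to get degrees Celsius
df_demo["tas_degC"] = df_demo["tas"] - 273.15

# Average over the months available for each county
mean_by_county = df_demo.groupby("NAME")["tas_degC"].mean()
print(mean_by_county)
```

With the full df_out, grouping by NAME alone would merge counties that share a name across states, so grouping by FIPS or poly_idx is likely the safer choice.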
