A part of the HyRiver software stack for accessing hydrology data through web services

Taher Chegini

Last update: Dec 10, 2022

Related tags

Third-party APIs Wrappers python data-visualization climate-data usgs webservices hydrology watershed hydrologic-database

Overview

Package	Description
PyNHD	Navigate and subset NHDPlus (MR and HR) using web services
Py3DEP	Access topographic data through National Map's 3DEP web service
PyGeoHydro	Access NWIS, NID, WQP, HCDN 2009, NLCD, and SSEBop databases
PyDaymet	Access Daymet for daily climate data both single pixel and gridded
AsyncRetriever	High-level API for asynchronous requests with persistent caching
PyGeoOGC	Send queries to any ArcGIS RESTful-, WMS-, and WFS-based services
PyGeoUtils	Convert responses from PyGeoOGC's supported web services to datasets

PyGeoHydro: Retrieve Geospatial Hydrology Data

Features

PyGeoHydro (formerly named hydrodata) is part of the HyRiver software stack, which is designed to aid in watershed analysis through web services. This package provides access to some public web services that offer geospatial hydrology data. It has three main modules: pygeohydro, plot, and helpers.

The pygeohydro module can pull data from the following web services:

NWIS for daily mean streamflow observations (returned as a pandas.DataFrame or xarray.Dataset with station attributes),
Water Quality Portal for accessing current and historical water quality data from more than 1.5 million sites across the US,
NID for accessing both versions of the National Inventory of Dams web services,
HCDN 2009 for identifying sites where human activity affects the natural flow of the watercourse,
NLCD 2019 for land cover/land use, imperviousness, imperviousness descriptor, and canopy data,
SSEBop for daily actual evapotranspiration, for both single pixel and gridded data.

Also, it has two other functions:

interactive_map: Interactive map for exploring NWIS stations within a bounding box.
cover_statistics: Categorical statistics of land use/land cover data.

The plot module includes three main functions:

signatures: Hydrologic signature graphs.
cover_legends: Official NLCD land cover legends for plotting a land cover dataset.
descriptor_legends: Color map and legends for plotting an imperviousness descriptor dataset.

The helpers module includes:

nlcd_helper: A roughness coefficients lookup table for each land cover and imperviousness descriptor type which is useful for overland flow routing among other applications.
nwis_error: A dataframe for finding information about NWIS requests' errors.

Moreover, requests for additional databases and functionalities can be submitted via the issue tracker.

You can find some example notebooks here.

You can also try using PyGeoHydro without installing it on your system by clicking on the binder badge. A Jupyter Lab instance with the HyRiver stack pre-installed will be launched in your web browser, and you can start coding!

Please note that since this project is in early development stages, while the provided functionalities should be stable, changes in APIs are possible in new releases. But we appreciate it if you give this project a try and provide feedback. Contributions are most welcome.

Installation

You can install PyGeoHydro using pip after installing libgdal on your system (for example, on Ubuntu run sudo apt install libgdal-dev). PyGeoHydro also has an optional dependency for persistent caching, requests-cache. We highly recommend installing this package, as it can significantly speed up sending and receiving queries. You don't have to change anything in your code: under the hood, PyGeoHydro looks for requests-cache and, if it is available, automatically uses persistent caching:

$ pip install pygeohydro

Alternatively, PyGeoHydro can be installed from the conda-forge repository using Conda:

$ conda install -c conda-forge pygeohydro

Quick start

We can explore the available NWIS stations within a bounding box using the interactive_map function. It returns an interactive map, and clicking on a station shows some of its most important properties.

import pygeohydro as gh

bbox = (-69.5, 45, -69, 45.5)
gh.interactive_map(bbox)

We can select all the stations within this bounding box that have daily mean streamflow data from 2000-01-01 to 2010-12-31:

from pygeohydro import NWIS

nwis = NWIS()
query = {
    "bBox": ",".join(f"{b:.06f}" for b in bbox),
    "hasDataTypeCd": "dv",
    "outputDataTypeCd": "dv",
}
info_box = nwis.get_info(query)
dates = ("2000-01-01", "2010-12-31")
stations = info_box[
    (info_box.begin_date <= dates[0]) & (info_box.end_date >= dates[1])
].site_no.tolist()
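
The filter above keeps a station only when its record starts on or before the first requested date and ends on or after the last one. A minimal standalone sketch of that rule, using plain tuples in place of the info_box table (the covers_period helper and the sample records are hypothetical and not part of PyGeoHydro):

```python
from datetime import date


def covers_period(begin: str, end: str, start: str, stop: str) -> bool:
    """Return True if the record [begin, end] fully contains [start, stop]."""
    return (
        date.fromisoformat(begin) <= date.fromisoformat(start)
        and date.fromisoformat(end) >= date.fromisoformat(stop)
    )


# Hypothetical (site_no, begin_date, end_date) records, for illustration only.
records = [
    ("00000001", "1995-10-01", "2015-09-30"),
    ("00000002", "2003-05-01", "2020-12-31"),
    ("00000003", "1980-01-01", "2009-06-30"),
]
dates = ("2000-01-01", "2010-12-31")

# Keep only the stations whose record spans the whole requested period.
selected = [sid for sid, begin, end in records if covers_period(begin, end, *dates)]
print(selected)
```

Only the first record spans the full requested period; the second starts too late and the third ends too early.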

Then, we can get the daily streamflow data in mm/day (by default the values are in cms) and plot them:

from pygeohydro import plot

qobs = nwis.get_streamflow(stations, dates, mmd=True)
plot.signatures(qobs)
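
For intuition about the mm/day option, discharge in cms (cubic meters per second) can be turned into an equivalent water depth over the drainage area. The sketch below shows that unit conversion only; the cms_to_mmd helper is hypothetical, and it is an assumption that PyGeoHydro normalizes by drainage area in this way:

```python
def cms_to_mmd(q_cms: float, area_sqkm: float) -> float:
    """Convert discharge in m^3/s to a depth in mm/day over a drainage area in km^2."""
    if area_sqkm <= 0:
        raise ValueError("drainage area must be positive")
    seconds_per_day = 86_400
    volume_m3 = q_cms * seconds_per_day      # daily volume in cubic meters
    depth_m = volume_m3 / (area_sqkm * 1e6)  # spread over the area in square meters
    return depth_m * 1000.0                  # meters to millimeters


# A flow of 10 m^3/s from an 864 km^2 basin
print(cms_to_mmd(10.0, 864.0))
```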

By default, get_streamflow returns a pandas.DataFrame that has an attrs property containing metadata for all the stations. You can access it via qobs.attrs. Moreover, we can get the same data as an xarray.Dataset as follows:

qobs_ds = nwis.get_streamflow(stations, dates, to_xarray=True)

This xarray.Dataset has two dimensions: time and station_id. It has 10 variables: discharge, which uses both dimensions, and nine station attributes, which are one-dimensional.

We can also get instantaneous streamflow data using get_streamflow. This method assumes that the input dates are in the UTC time zone and returns the data in UTC as well.

date = ("2005-01-01 12:00", "2005-01-12 15:00")
qobs = nwis.get_streamflow("01646500", date, freq="iv")
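
Because the input dates are interpreted as UTC, local timestamps need to be converted before they are passed in. A minimal sketch using only the standard library; the local_to_utc helper and the fixed offset table are assumptions made for illustration (standard time only, no daylight saving handling) and are not part of PyGeoHydro:

```python
from datetime import datetime, timedelta, timezone

# Assumed standard-time UTC offsets (hours) for a few US abbreviations.
US_TZ_OFFSETS = {"EST": -5, "CST": -6, "MST": -7, "PST": -8}


def local_to_utc(timestamp: str, tz_abbr: str) -> str:
    """Convert a 'YYYY-MM-DD HH:MM' local timestamp to the same format in UTC."""
    local_tz = timezone(timedelta(hours=US_TZ_OFFSETS[tz_abbr]))
    local = datetime.strptime(timestamp, "%Y-%m-%d %H:%M").replace(tzinfo=local_tz)
    return local.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M")


# Local (EST) start and end times converted to the UTC range used above.
utc_range = (
    local_to_utc("2005-01-01 07:00", "EST"),
    local_to_utc("2005-01-12 10:00", "EST"),
)
print(utc_range)
```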

The WaterQuality class has a number of convenience methods to retrieve data from the web service. Since there are many parameter combinations that can be used to retrieve data, a general method is also provided to retrieve data from any of the valid endpoints. You can use get_json to retrieve station information as a geopandas.GeoDataFrame or get_csv to retrieve station data as a pandas.DataFrame. You can construct a dictionary of the parameters and pass it to one of these functions. For more information on the parameters, please consult the Water Quality Data documentation. For example, let's find all the stations within a bounding box that have Caffeine data:

from pygeohydro import WaterQuality

bbox = (-92.8, 44.2, -88.9, 46.0)
kwds = {"characteristicName": "Caffeine"}
wq = WaterQuality()
stations = wq.station_bybbox(bbox, kwds)

Or the same criterion but within a 30-mile radius of a point:

stations = wq.station_bydistance(-92.8, 44.2, 30, kwds)

Then we can get the data for all these stations like this:

sids = stations.MonitoringLocationIdentifier.tolist()
caff = wq.data_bystation(sids, kwds)

Moreover, we can get land use/land cover data using the nlcd function, percentages of land cover types using cover_statistics, and actual ET with ssebopeta_bygeom:

from pynhd import NLDI

geometry = NLDI().get_basins("01031500").geometry[0]
lulc = gh.nlcd(geometry, 100, years={"cover": [2016, 2019]})
stats = gh.cover_statistics(lulc.cover_2016)
eta = gh.ssebopeta_bygeom(geometry, dates=("2005-10-01", "2005-10-05"))

Additionally, we can pull all the US dams data using NID. Let's get dams that are within this bounding box and have a maximum storage larger than 200 acre-feet.

from pygeohydro import NID

nid = NID()
dams = nid.bygeom(bbox, "epsg:4326", sql_clause="MAX_STORAGE > 200")

We can get all the dams within CONUS using NID and plot them:

import geopandas as gpd

world = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
conus = world[world.name == "United States of America"].geometry.iloc[0][0]
conus_dams = nid.bygeom(conus, "epsg:4326")

Contributing

Contributions are very welcome. Please read the CONTRIBUTING.rst file for instructions.

Credits

This package was created based on the audreyr/cookiecutter-pypackage project template.

Comments

NLCD by location

For an aviation use case - "Where did the drone launch?" - I'd like to get the land use for a lot of points in the U.S.

A similar use case that I use is "What is the elevation at a particular point?" To answer this, I run an https://open-elevation.com/ docker instance and use the API to pass in thousands of lat/lon pairs. It returns the pairs with the elevation and I take the elevation and add it as a column to my dataframe.

A good solution would be a function that operated on a local copy of the NLCD database, took lat/lon pairs, and returned the text description of the land use. The pairs could be distinct lat / lon values or possibly a geodataframe with one or more points.

If this function was vectorized and could process large numbers of points quickly that would be a bonus but not necessary.

The best option I could come up with involved setting a bounding box around each point. (See Discussions for details.)
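
A minimal sketch of the bounding-box workaround described above, assuming a small square window is built around each lon/lat point before the land cover is requested. The point_to_bbox helper, its default half-width, and the meters-per-degree approximation are assumptions for illustration only; the release notes further down mention a later nlcd_bycoords function that accepts coordinates directly.

```python
import math

METERS_PER_DEGREE = 111_320.0  # approximate length of one degree of latitude


def point_to_bbox(lon: float, lat: float, half_width_m: float = 50.0):
    """Return (west, south, east, north) for a small window centered on a point."""
    if abs(lat) >= 90:
        raise ValueError("latitude must be strictly between -90 and 90")
    dlat = half_width_m / METERS_PER_DEGREE
    # Degrees of longitude shrink with the cosine of the latitude.
    dlon = half_width_m / (METERS_PER_DEGREE * math.cos(math.radians(lat)))
    return (lon - dlon, lat - dlat, lon + dlon, lat + dlat)


# Hypothetical launch points given as (lon, lat) pairs.
points = [(-105.0, 40.0), (-69.3, 45.2)]
bboxes = [point_to_bbox(lon, lat) for lon, lat in points]
print(bboxes[0])
```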

opened by kovar-ursa 17
Change to epsg:4269 in WaterData request
It looks like the USGS GeoServer layers were changed in the last 24 hours!! They no longer support epsg:900913 (web mercator), and now only accept epsg:4269. NWIS.get_info now fails because of this. I updated the epsg code on the WaterData call, and that should fix it. I've run WaterData calls with epsg:4269 on other layers (FYI, huc12) and it works.

opened by emiliom 13
NWIS data retrieval enhancement ideas

It would be great to be able to set some parameter to ensure that the retrieved NWIS data are in UTC.

Also it would be nice to have the ability to return the data (along with metadata such as units!) as an xarray dataset instead of a pandas dataframe.

There is an example NWIS code here by @dnowacki-usgs that optionally returns an xarray dataset.
enhancement

opened by rsignell-usgs 8
get_streamflow() to_xarray inconsistent dtypes
What happened: Repeated calls to get_streamflow() returning an xarray DataSet have different dtypes for some fields (notably, strings).

What you expected to happen: The returned encodings/schema would be consistent for all calls, and match the internal schema of the NWIS database from which the data is fetched.

Minimal Complete Verifiable Example:

from pygeohydro import NWIS

nwis = NWIS()
DATE_RANGE = ("2020-01-01", "2020-12-31")
site_A = nwis.get_streamflow('USGS-402114105350101', DATE_RANGE, to_xarray=True)
site_B = nwis.get_streamflow('USGS-02277600', DATE_RANGE, to_xarray=True)
assert site_A['station_nm'].dtype == site_B['station_nm'].dtype  # fails
assert site_A['alt_datum_cd'].dtype == site_B['alt_datum_cd'].dtype  # fails

Anything else we need to know?: This has come up for me as I try to fetch streamflow data one gage at a time as part of a parallelized workflow -- each worker fetches one streamgage, manipulates it, then appends to a common dataset (in my case, a zarr store). The common zarr store was templated using NWIS.get_streamflow() data, which established the 'standard' dtypes.

The dtypes for these particular fields (station_nm and alt_datum_cd) are unicode strings, with the length of the string (and the dtype) being that of the returned data for a given request. That is, the dtype for Site_A's alt_datum_cd (above) is '<U6' because the data happens to be 6 chars for that gage. For Site_B's alt_datum_cd, the dtype is '<U1'. It isn't just that the string is shorter, the dtype is different, which causes the zarr write to fail.

I can work around this by re-casting in the case of these two strings:

Site_B['alt_datum_cd'] = xr.DataArray(data=Site_B['alt_datum_cd'].values.astype('<U6'), dims='gage_id')

But in the case of the station name field, I don't know what the max length might be from the database. I can cast to '<U46' (the dtype for Site_A's station_nm), but other gages may have longer names, which will be truncated when cast to this dtype.

It would be useful to have get_streamflow() return the same string encoding/dtype in all cases, so that separate calls can be treated identically.

Environment:

Output of pygeohydro.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.9.13 | packaged by conda-forge | (main, May 27 2022, 16:56:21) [GCC 10.3.0] python-bits: 64 OS: Linux OS-release: 5.4.181-99.354.amzn2.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: C.UTF-8 LANG: C.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.12.2 libnetcdf: 4.8.1 aiodns: 3.0.0 aiohttp: 3.8.3 aiohttp-client-cache: 0.7.3 aiosqlite: 0.17.0 async-retriever: 0.3.6 bottleneck: 1.3.5 brotli: installed cchardet: 2.1.7 click: 8.0.2 cytoolz: 0.12.0 dask: 2022.04.2 defusedxml: 0.7.1 folium: 0.13.0 geopandas: 0.11.1 lxml: 4.9.1 matplotlib: 3.4.3 netCDF4: 1.6.0 networkx: 2.8.7 numpy: 1.23.3 owslib: 0.27.2 pandas: 1.4.2 py3dep: 0.13.6 pyarrow: 9.0.0 pydantic: 1.10.2 pydaymet: 0.13.6 pygeohydro: 0.13.6 pygeoogc: 0.13.6 pygeos: 0.13 pygeoutils: 0.13.6 pynhd: 0.13.6 pyproj: 3.4.0 pytest: None pytest-cov: None rasterio: 1.3.2 requests: 2.28.1 requests-cache: 0.9.6 richdem: None rioxarray: 0.12.2 scipy: 1.9.1 shapely: 1.8.4 tables: 3.7.0 ujson: 5.5.0 urllib3: 1.26.11 xarray: 2022.9.0 xdist: None yaml: 5.4.1
opened by gzt5142 6
Example code for NWIS query does not work
What happened: Running the example code to generate a list of NWIS sites throws the following error 'NWIS' object has no attribute 'query_bybox'

Minimal Complete Verifiable Example:

from pygeohydro import NWIS

nwis = NWIS()
query = {
    **nwis.query_bybox(bbox),
    "hasDataTypeCd": "dv",
    "outputDataTypeCd": "dv",
}
info_box = nwis.get_info(query)
dates = ("2000-01-01", "2010-12-31")
stations = info_box[
    (info_box.begin_date <= dates[0]) & (info_box.end_date >= dates[1])
].site_no.tolist()

Environment:

Output of pygeohydro.show_versions()
INSTALLED VERSIONS
commit: None python: 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:10) [GCC 10.3.0] python-bits: 64 OS: Linux OS-release: 4.15.0-167-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US LOCALE: en_US.ISO8859-1 libhdf5: 1.12.1 libnetcdf: 4.8.1

aiodns: 3.0.0 aiohttp: 3.8.1 aiohttp-client-cache: 0.7.1 aiosqlite: 0.17.0 async-retriever: 0.3.3 bottleneck: 1.3.4 brotli: installed cchardet: 2.1.7 click: 6.7 cytoolz: 0.11.2 dask: 2022.6.1 defusedxml: 0.7.1 folium: 0.12.1.post1 geopandas: 0.11.0 lxml: 4.8.0 matplotlib: 3.4.3 netCDF4: 1.6.0 networkx: 2.8.4 numpy: 1.23.0 owslib: 0.25.0 pandas: 1.4.3 py3dep: 0.13.1 pyarrow: 6.0.1 pydantic: 1.9.1 pydaymet: 0.13.1 pygeohydro: 0.13.2 pygeoogc: 0.13.2 pygeos: 0.12.0 pygeoutils: 0.13.2 pynhd: 0.13.2 pyproj: 3.3.0 pytest: None pytest-cov: None rasterio: 1.2.10 requests: 2.28.1 requests-cache: 0.9.4 richdem: 0.3.4 rioxarray: 0.11.1 scipy: 1.8.1 shapely: 1.8.2 tables: 3.7.0 ujson: 5.3.0 urllib3: 1.26.9 xarray: 2022.3.0 xdist: None yaml: 6.0
opened by onnyyonn 6
Shapely import issue on Darwin
Hydrodata version: 0.4.4

Python version: 3.7.7

Operating System: macOS 10.14.4

[x] Using Conda

Description

I found an issue when importing anything from the shapely.geometry package on Shapely 1.7.0. I get the following:

File "/Users/austinraney/miniconda3/envs/hydrodata/lib/python3.7/site-packages/shapely/geos.py", line 62, in load_dll
    libname, fallbacks or []))
OSError: Could not find lib cxx or load any of its variants [].

I found the related issue and it seems that the PR fixed it (I tested the change on my system at least). Just something to know about. Hopefully they will update the package on PyPI soon.

What I Did

python -c "from shapely.geometry import Point"
opened by aaraney 6

Did something change with NWIS?

What happened: my NWIS example stopped working

What you expected to happen:

Minimal Complete Verifiable Example:

from pygeohydro import NWIS

nwis = NWIS()

start = '1979-02-01T01:00:00'
stop =  '2020-12-31T23:00:00'

sta = ['USGS-01030350', 'USGS-01030500']

ds_obs = nwis.get_streamflow(sta, (start,stop), to_xarray=True)

I tried pygeohydro versions 0.13.0 and 0.13.1

opened by rsignell-usgs 4

Monthly & annual SSEBop ET available via OPeNDAP from USGS THREDDS server
Re: https://github.com/cheginit/hydrodata/blob/master/hydrodata/datasets.py#L854

Since there's still no web service available for subsetting SSEBop, the data first needs to be downloaded for the requested period then it is masked by the region of interest locally. Therefore, it's not as fast as other functions and the bottleneck could be the download speed.

FYI, there is an OPeNDAP endpoint available from the USGS CIDA THREDDS server (managed by David Blodgett, I think) for monthly and annual SSEBop ET -- though not daily:

THREDDS catalog page for monthly data: https://cida.usgs.gov/thredds/catalog.html?dataset=cida.usgs.gov/ssebopeta/monthly

THREDDS OPeNDAP endpoint: https://cida.usgs.gov/thredds/dodsC/ssebopeta/monthly
opened by emiliom 4
Theoretical meta source class

To preface this, I don't expect this to be pulled in; it was more of a way to showcase an idea and get us thinking. Admittedly, this could be unnecessary for a project of this size where there are 5-10 different data sources, so just "no, that is too much boilerplate code" is also a fine response lol.

So, with that being said, I went through datasets.py and it seems like there is a little bit of code redundancy or maybe just thought redundancy? The platform you have built seems to have a ton of potential to have additional datasets added, so with that in mind I wrote a sort of meta class for sources.

Purpose

Set expectations for future collaborators and provide a framework for adding in other datasets. Basically, let's build a framework for what a source should have and the methods it must have.

This class would be inherited by all other source classes and can either use or override the existing methods in the meta source class to keep a consistent namespace.

/sources/__init__.py can be used to control the public API that a user will interface with.

Known problems/limitations

All datasources are not created equal, or at least they may have differing kinds of data (gridded, point, tabular, etc.) so creating a meta class might need to be divided into a structured hierarchy, where there is a root meta class that is inherited by two to three provider type classes which each inherit the root meta class.

For example, the root meta class would have things that all providers have: their name, url, citation, base_url, a __str__ method, plus any other general attributes all sources WILL have. Then a meta gridded class could hold class methods like bybox and bygeom, and this class would be inherited by gridded data sources. It should be noted that sources that support multiple data types could utilize multiple inheritance and inherit two or more meta classes (i.e., meta gridded and meta tabular).
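
The hierarchy described above can be sketched with Python's abc module. This is only an illustration of the idea, not the code from the pull request: DataSource is the name used in this proposal, while GriddedSource, TabularSource, ExampleSource, and the byid method are hypothetical names, and plain instance methods are used here instead of class methods for brevity.

```python
from abc import ABC, abstractmethod


class DataSource(ABC):
    """Root class with the attributes every provider is expected to have."""

    name: str = ""
    url: str = ""
    base_url: str = ""
    citation: str = ""

    def __str__(self) -> str:
        return f"{self.name} ({self.url})"


class GriddedSource(DataSource):
    """Interface for providers of gridded data."""

    @abstractmethod
    def bybox(self, bbox): ...

    @abstractmethod
    def bygeom(self, geometry): ...


class TabularSource(DataSource):
    """Interface for providers of tabular data (hypothetical)."""

    @abstractmethod
    def byid(self, ids): ...


class ExampleSource(GriddedSource, TabularSource):
    """A made-up provider that supports both gridded and tabular access."""

    name = "Example"
    url = "https://example.com"
    base_url = "https://example.com/api"
    citation = "Hypothetical citation"

    def bybox(self, bbox):
        return {"bbox": bbox}

    def bygeom(self, geometry):
        return {"geometry": geometry}

    def byid(self, ids):
        return {"ids": list(ids)}


print(ExampleSource())
```

Because the interface methods are abstract, a source that forgets to implement one of them cannot be instantiated, which is one way to enforce the consistent namespace mentioned earlier.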

What I did

The DataSource class is the meta class if you will. There are two examples of using DataSource, the ssebopeta will probably be a more interesting example. I didn't test the bygeom class method, but I think it should work or be very close. Just mainly wanted to showcase the idea more than anything else.
enhancement

opened by aaraney 4
NHDPlus Implementation

Before I offer my suggestion, I may be missing the utility of shipping the NHD with the repo. With that in mind, and if you don't mind elaborating later, what are your thoughts on moving away from shipping the NHDPlus dataset to users and instead relying on the USGS's API to verify and obtain gauge metadata? It should be a straightforward call that doesn't require a key.
enhancement

opened by aaraney 4
Missing examples/tutorial.ipynb

Description

It seems examples/tutorial.ipynb, which was added in commit 68fe37f, has been removed. What are the future plans for examples/? If you would like me to write something up, do you have features in mind you would like me to showcase?
documentation

opened by aaraney 4
Add support for SensorThings

Is your feature request related to a problem? Please describe. No. USGS water data has a new web service called SensorThings that provides access to many USGS datasets.

Describe the solution you'd like A demo repository of its initial implementation is here.

Describe alternatives you've considered N/A

Additional context N/A

opened by cheginit 0

Releases(v0.13.8)

v0.13.8(Dec 9, 2022)
Release Notes

New Features

Add a function called huc_wb_full that returns the full watershed boundary GeoDataFrame of a given HUC level. If only a subset of HUCs is needed, the pygeohydro.WBD class should be used. The full dataset is downloaded from the National Map's WBD staged products.

Add a new function called irrigation_withdrawals for retrieving estimated monthly water use for irrigation by 12-digit hydrologic unit in the CONUS for 2015 from ScienceBase.

Add a new property to NID, called data_units for indicating the units of NID dataset variables.

The get_us_states now accepts conus as a subset_key which is equivalent to contiguous.

Internal Changes

Add get_us_states to __init__ file, so it can be loaded directly, e.g., gh.get_us_states("TX").

Modify the codebase based on Refurb suggestions.

Significant performance improvements in NWIS.get_streamflow especially for large requests by refactoring the timezone handling.

Bug Fixes

Fix the dam types and purposes mapping dictionaries in NID class.

v0.13.7(Nov 5, 2022)
Release Notes

New Features

Add two new functions for retrieving soil properties across the US:

soil_properties: Porosity, available water capacity, and field capacity,

soil_gnatsgo: Soil properties from the gNATSGO database.

Add a new helper function called state_lookup_table for getting a lookup table of US states and their counties. This can be particularly useful for mapping the numeric state_cd and county_cd codes that NWIS returns to state names/codes.

Add support for getting individual state geometries using get_us_states function by passing their two letter state code. Also, use TIGER 2022 data for the US states and counties instead of TIGER 2021.

Internal Changes

Remove proplot as a dependency and use matplotlib instead.

v0.13.6(Aug 30, 2022)
Release Notes

Internal Changes

Add the missing PyPi classifiers for the supported Python versions.

v0.13.5(Aug 29, 2022)
Release Notes

Breaking Changes

Append "Error" to all exception classes for conforming to PEP-8 naming conventions.

Deprecate ssebopeta_byloc since it's been replaced with ssebopeta_bycoords since version 0.13.0.

Internal Changes

Bump the minimum versions of pygeoogc and pygeoutils to 0.13.5 and that of async-retriever to 0.3.5.

v0.13.3(Jul 31, 2022)
Release Notes

New Features

Add a new argument to the NID.inventory_byid method for staging the entire NID dataset prior to inventory queries. There is also a new public method called NID.stage_nid_inventory that can be used to download the entire NID dataset and save it as a feather file. This is useful for inventory queries with a large number of IDs and is much more efficient than querying the NID web service.

Bug Fixes

The background value in the cover_statistics function should have been 127, not 0. Also, drop the background value from the returned statistics.
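
To illustrate what dropping the background value means for the percentages, here is a minimal pure-Python sketch that computes categorical percentages while ignoring pixels equal to 127. The cover_percentages helper and the sample pixel list are hypothetical; this is not the cover_statistics implementation.

```python
from collections import Counter

BACKGROUND = 127  # background value named in the release note above


def cover_percentages(pixels):
    """Return {class_code: percent} over all non-background pixels."""
    counts = Counter(value for value in pixels if value != BACKGROUND)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {code: 100.0 * n / total for code, n in sorted(counts.items())}


# Hypothetical land cover codes; 127 marks pixels outside the area of interest.
pixels = [41, 41, 42, 127, 127, 11, 41, 127]
stats = cover_percentages(pixels)
print(stats)
```

With the background pixels excluded, the percentages of the remaining classes sum to 100.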

v0.13.2(Jun 14, 2022)
Release Notes

Breaking Changes

Set the minimum supported version of Python to 3.8 since many of the dependencies such as xarray, pandas, rioxarray have dropped support for Python 3.7.

Internal Changes

Remove USGS prefixes from the input station IDs in the NWIS.get_streamflow function. Also, check if the remaining parts of the IDs are all digits and throw an exception otherwise. Additionally, make sure that IDs have at least 8 chars by adding leading zeros (issue #99).
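
The three steps in this note (strip the prefix, require digits, pad to eight characters) can be sketched as follows. The normalize_station_id helper is hypothetical and is not the library's code; it assumes the prefix appears as "USGS" optionally followed by a hyphen.

```python
def normalize_station_id(station_id: str) -> str:
    """Strip a USGS prefix, validate the digits, and left-pad to 8 characters."""
    sid = station_id.strip()
    if sid.upper().startswith("USGS"):
        # Assumed prefix form: "USGS" optionally followed by "-".
        sid = sid[4:].lstrip("-")
    if not (sid.isascii() and sid.isdigit()):
        raise ValueError(f"station ID must contain only digits: {station_id!r}")
    return sid.zfill(8)


for raw in ["USGS-01030350", "1030350", "USGS-402114105350101"]:
    print(raw, "->", normalize_station_id(raw))
```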

Use micromamba for running tests and use nox for linting in CI.

v0.13.1(Jun 12, 2022)
Release Notes

New Features

Add a new function called get_us_states to the helpers module for obtaining a GeoDataFrame of the US states. It has an optional argument for returning the contiguous states, continental states, commonwealths states, or US territories. The data are retrieved from the Census' Tiger 2021 database.

In the NID class keep the valid_fields property as a pandas.Series instead of a list, so it can be searched easier via its str accessor.

Internal Changes

Refactor the plot.signatures function to use proplot instead of matplotlib.

Improve performance of NWIS.get_streamflow by not validating the layer name when instantiating the WaterData class. Also, make the function more robust by checking if streamflow data is available for each station and throw a warning if not.

Bug Fixes

Fix an issue in NWIS.get_streamflow where -9999 values were not being filtered out. According to NWIS, these values are reserved for ice-affected data. This fix sets these values to numpy.nan.
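
A minimal sketch of this kind of masking, using math.nan from the standard library in place of numpy.nan so that it runs without extra dependencies. The mask_nodata helper and the sample values are hypothetical.

```python
import math

NODATA = -9999.0  # value reported in place of a measurement


def mask_nodata(values):
    """Replace the no-data marker with NaN and return the values as floats."""
    return [math.nan if value == NODATA else float(value) for value in values]


# Hypothetical discharge series containing one flagged value.
cleaned = mask_nodata([1.5, -9999, 0.0])
print(cleaned)
```

Marking these values as NaN keeps them out of sums and means in NaN-aware tools instead of silently dragging the results down.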

v0.13.0(Apr 4, 2022)
Release Notes

New Features

Add a new flag to nlcd_* functions called ssl for disabling SSL verification.

Add a new function called get_camels for getting the CAMELS dataset. The function returns a geopandas.GeoDataFrame that includes basin-level attributes for all 671 stations in the dataset and a xarray.Dataset that contains streamflow data for all 671 stations and their basin-level attributes.

Add a new function named overland_roughness for getting the overland roughness values from land cover data.

Add a new class called WBD for getting watershed boundary (HUC) data.

from pygeohydro import WBD

wbd = WBD("huc4")
hudson = wbd.byids("huc4", ["0202", "0203"])

Breaking Changes

Remove caching-related arguments from all functions since now they can be set globally via three environmental variables:

HYRIVER_CACHE_NAME: Path to the caching SQLite database.

HYRIVER_CACHE_EXPIRE: Expiration time for cached requests in seconds.

HYRIVER_CACHE_DISABLE: Disable reading/writing from/to the cache file.

You can do this like so:

import os

os.environ["HYRIVER_CACHE_NAME"] = "path/to/file.sqlite"
os.environ["HYRIVER_CACHE_EXPIRE"] = "3600"
os.environ["HYRIVER_CACHE_DISABLE"] = "true"

Internal Changes

Write nodata attribute using rioxarray in nlcd_bygeom since the clipping operation of rioxarray uses this value as the fill value.

v0.12.4(Feb 4, 2022)
Release Notes

Internal Changes

Return a named tuple instead of a dict of percentages in the cover_statistics function. It makes accessing the values easier.

Add pycln as a new pre-commit hook for removing unused imports.

Remove time zone info from the inputs to plot.signatures to avoid issues with the matplotlib backend.

Bug Fixes

Fix an issue in plot.signatures where the new matplotlib version requires a numpy array instead of a pandas.DataFrame.

v0.12.3(Jan 15, 2022)
Release Notes

Bug Fixes

Replace no data values of data in ssebopeta_bygeom with np.nan before converting it to mm/day.

Fix an inconsistency issue with CRS projection when using UTM in nlcd_*. Use EPSG:3857 for all reprojections and get the data from NLCD in the same projection. (issue #85)

Improve performance of nlcd_* functions by reducing number of service calls.

Internal Changes

Add type checking with typeguard and fix type hinting issues raised by typeguard.

Refactor show_versions to ensure getting correct versions of all dependencies.

v0.12.2(Dec 31, 2021)
Release Notes

New Features

The NWIS.get_info now returns a geopandas.GeoDataFrame instead of a pandas.DataFrame.

Bug Fixes

Fix a bug in NWIS.get_streamflow where the drainage area might not be computed correctly if target stations are not located at the outlet of their watersheds.

v0.12.1(Dec 31, 2021)
Release Notes

Internal Changes

Use the three new ar.retrieve_* functions instead of the old ar.retrieve function to improve type hinting and to make the API more consistent.

Bug Fixes

Fix an issue with NWIS.get_streamflow where the time zone of the data was not being correctly determined when it was given as a US-specific abbreviation such as CST.

v0.12.0(Dec 28, 2021)
Release Notes

New Features

Add support for getting instantaneous streamflow from NWIS in addition to the daily streamflow by adding freq argument to NWIS.get_streamflow that can be either iv or dv. The default is dv to retain the previous behavior of the function.

Convert the time zone of the streamflow data to UTC.

Add attributes of the requested stations as attrs parameter to the returned pandas.DataFrame. (issue #75)

Add a new flag to NWIS.get_streamflow for returning the streamflow as an xarray.Dataset. This dataset has two dimensions: time and station_id. It has ten variables, which include discharge and nine other station attributes. (issue #75)

Add drain_sqkm from GagesII to NWIS.get_info.

Show drain_sqkm in the interactive map generated by interactive_map.

Add two new functions for getting NLCD data: nlcd_bygeom and nlcd_bycoords. The new nlcd_bycoords function returns a geopandas.GeoDataFrame with the NLCD layers as columns and input coordinates, which should be a list of (lon, lat) tuples, as the geometry column. Moreover, the new nlcd_bygeom function now accepts a geopandas.GeoDataFrame as the input. In this case, it returns a dict with keys as indices of the input geopandas.GeoDataFrame. (issue #80)

The previous nlcd function is being deprecated. For now, it calls nlcd_bygeom internally and retains the old behavior. This function will be removed in future versions.

Breaking Changes

Set the request caching's expiration time to never expire. Add two flags to all functions to control the caching: expire_after and disable_caching.

Replace NID class with the new RESTful-based web service of National Inventory of Dams. The new NID service is very different from the old one, so this is considered a breaking change.

Internal Changes

Improve exception handling in NWIS.get_info when NWIS returns an error message rather than 500s web service error.

The NWIS.get_streamflow function now checks if the site info dataset contains any duplicates. Therefore, all the remaining station numbers will be unique. This prevents an issue with setting attrs where duplicate indexes cause an exception when being converted to a dict. (issue #75)

Add all the missing types so mypy --strict passes.

v0.11.4(Nov 24, 2021)
Release Notes

New Features

Add support for the Water Quality Portal Web Services. (issue #72)

Add support for two versions of the NID web service. The original NID web service is considered version 2 and the new NID is considered version 3. You can pass the version number to the NID class like so: NID(2). The default version is 2.

Bug Fixes

Fix an issue with background percentage calculation in cover_statistics.

v0.11.3(Nov 12, 2021)
Release Notes

New Features

Use a new MapService for National Inventory of Dams (NID).

Internal Changes

Use importlib-metadata for getting the version instead of pkg_resources to decrease import time as discussed in this issue.

v0.11.2(Jul 31, 2021)
Release Notes

Bug Fixes

Refactor cover_statistics to address an issue with wrong category names and also improve performance for large datasets by using numpy's functions.

Fix an issue with detecting the wrong number of stations in NWIS.get_streamflow. Also, improve filtering of stations whose start/end dates don't match the user-requested interval.

v0.11.1(Jul 31, 2021)
Release Notes

The highlight of this release is adding support for NLCD 2019 and significant improvements in NWIS support.

New Features

Add support for the recently released version of NLCD (2019), including the impervious descriptor layer. Highlights of the new database are:

> NLCD 2019 now offers land cover for years 2001, 2004, 2006, 2008, 2011, 2013, 2016, 2019, and impervious surface and impervious descriptor products now updated to match each date of land cover. These products update all previously released versions of landcover and impervious products for CONUS (NLCD 2001, NLCD 2006, NLCD 2011, NLCD 2016) and are not directly comparable to previous products. NLCD 2019 land cover and impervious surface product versions of previous dates must be downloaded for proper comparison. NLCD 2019 also offers an impervious surface descriptor product that identifies the type of each impervious surface pixel. This product identifies types of roads, wind tower sites, building locations, and energy production sites to allow deeper analysis of developed features.
>
> -- MRLC

Add support for all the supported regions of NLCD database (CONUS, AK, HI, and PR).

Add support for passing multiple years to the NLCD function, like so {"cover": [2016, 2019]}.

Add plot.descriptor_legends function to plot the legend for the impervious descriptor layer.

New features in NWIS class are:

* Remove query_* methods since it's not convenient to pass them directly as a dictionary.

* Add a new function called get_parameter_codes to query parameters and get information about them.

* To decrease the complexity of the get_streamflow method, add a new private function to handle some of the tasks.

Make retrieve_rdb more general so that it can handle more of NWIS's services.

Add a new argument called nwis_kwds to interactive_map so any NWIS specific keywords can be passed for filtering stations.

Improve exception handling in the get_info method, simplify it, and improve its performance for getting HCDN.
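
The multi-year form of the years argument mentioned above, {"cover": [2016, 2019]}, can be pictured with the small helper below. It is a hypothetical sketch, not PyGeoHydro's code: normalize_years, its validation rule, and the acceptance of a single integer year are assumptions. The list of land cover years is taken from the MRLC quote above.

```python
from typing import Dict, List, Union

# Land cover years listed in the MRLC announcement for NLCD 2019.
COVER_YEARS = (2001, 2004, 2006, 2008, 2011, 2013, 2016, 2019)


def normalize_years(years: Dict[str, Union[int, List[int]]]) -> Dict[str, List[int]]:
    """Turn single years into lists and reject unavailable land cover years."""
    normalized = {}
    for layer, value in years.items():
        yrs = [value] if isinstance(value, int) else list(value)
        if layer == "cover":
            invalid = [y for y in yrs if y not in COVER_YEARS]
            if invalid:
                raise ValueError(f"Unavailable land cover years: {invalid}")
        normalized[layer] = yrs
    return normalized


single = normalize_years({"cover": 2019})
multiple = normalize_years({"cover": [2016, 2019]})
```

Normalizing every entry to a list lets downstream code loop over years without special-casing a single value.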

Internal Changes

Migrate to using AsyncRetriever for handling communications with web services.

v0.11.0(Jun 20, 2021)
Release Notes

Breaking Changes

Drop support for Python 3.6 since many of the dependencies such as xarray and pandas have done so.

Remove get_nid and get_nid_codes functions since NID now has an ArcGIS RESTful service.

New Features

Add a new class called NID for accessing the recently released National Inventory of Dams web service. This service is based on ArcGIS's RESTful service, so the user just needs to instantiate the class like so: NID(). The data can then be retrieved with three methods of the AGRBase class: bygeom, byids, and bysql. Moreover, the class has an attrs property that includes descriptions of the database fields with their units.

Refactor NWIS.get_info to be more generic by accepting any valid queries that are documented at USGS Site Web Service.

Allow for passing a list of queries to NWIS.get_info and use async_retriever that significantly improves the network response time.

Add two new flags to interactive_map for limiting the stations to those with daily values (dv=True) and/or instantaneous values (iv=True). This function now includes a link to each station's webpage on the USGS website.
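
The benefit of sending a list of queries concurrently, as NWIS.get_info now does, can be sketched with asyncio alone. The example is illustrative and does not use async_retriever or contact NWIS: fake_request simulates network latency with asyncio.sleep, and the query dictionaries are made-up placeholders.

```python
import asyncio
import time
from typing import Dict, List


async def fake_request(query: Dict[str, str], delay: float = 0.1) -> str:
    """Simulate a web request that takes ``delay`` seconds."""
    await asyncio.sleep(delay)
    return f"result for {query['site']}"


async def get_all(queries: List[Dict[str, str]]) -> List[str]:
    """Send all queries concurrently and return results in input order."""
    return await asyncio.gather(*(fake_request(q) for q in queries))


queries = [{"site": "A"}, {"site": "B"}, {"site": "C"}]

start = time.perf_counter()
results = asyncio.run(get_all(queries))
elapsed = time.perf_counter() - start  # close to one delay, not three
```

Because the simulated requests wait at the same time, the total elapsed time is roughly that of a single request rather than the sum of all three.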

Internal Changes

Use persistent caching for all send/receive requests that can significantly improve the network response time.

Explicitly include all the hard dependencies in setup.cfg.

Refactor interactive_map and NWIS.get_info to make them more efficient and reduce their code complexity.

v0.10.2(Mar 27, 2021)
Release Notes

Add an announcement regarding the new name for the software stack, HyRiver.

Improve pip installation and release workflow.

v0.10.1(Mar 6, 2021)

Please check HISTORY.rst file for a detailed list of changes.
v0.10.0(Mar 6, 2021)

Please check HISTORY.rst file for a detailed list of changes.
v0.9.2(Mar 3, 2021)

Please check HISTORY.rst file for a detailed list of changes.
v0.9.1(Feb 22, 2021)

Please check HISTORY.rst file for a detailed list of changes.
v0.9.0(Feb 17, 2021)

Please check HISTORY.rst file for a detailed list of changes.
v0.8.0(Dec 7, 2020)

Please check HISTORY.rst file for a detailed list of changes.
v0.2.0(Dec 7, 2020)

Please check HISTORY.rst file for a detailed list of changes.
v0.7.3-beta(Sep 2, 2020)

This release fixes the issue due to recent changes to the WaterData web service.
0.7.2(Aug 18, 2020)
Enhancements

Replaced simplejson with orjson to speed up JSON operations.

Explicitly sort the time dimension of the ssebopeta_bygeom function.

Bug fixes

Fix an issue with the nlcd function where high-resolution requests fail.

v0.7.1(Aug 14, 2020)

This is a bug fix that addresses the nlcd function issue where not all None layers are dropped. Also, a new flag called title_ypos was added to the plot.signatures function for adjusting the vertical position of the supertitle which is useful for multi-line titles.
v0.7.0(Aug 12, 2020)

This version re-writes almost the whole codebase and divides Hydrodata into six standalone Python libraries. This decision was made for reducing the code complexity and allowing the users to only install the packages that they need without having to install all the Hydrodata dependencies. Please check History for the full changelog.