PyNHD is part of the HyRiver software stack, which is designed to aid in watershed analysis through web services.

Taher Chegini

Last update: Dec 14, 2022

Related tags

Data Analysis python waterdata webservices hydrology nldi wfs-service nhdplus

Overview

Package	Description
PyNHD	Navigate and subset NHDPlus (MR and HR) using web services
Py3DEP	Access topographic data through National Map's 3DEP web service
PyGeoHydro	Access NWIS, NID, HCDN 2009, NLCD, and SSEBop databases
PyDaymet	Access Daymet for daily climate data both single pixel and gridded
AsyncRetriever	High-level API for asynchronous requests with persistent caching
PyGeoOGC	Send queries to any ArcGIS RESTful-, WMS-, and WFS-based services
PyGeoUtils	Convert responses from PyGeoOGC's supported web services to datasets

PyNHD: Navigate and subset NHDPlus database

Features

This package provides access to WaterData, the National Map's NHDPlus HR, NLDI, and PyGeoAPI web services. These web services can be used to navigate and extract vector data from the NHDPlus V2 database (both medium- and high-resolution), such as catchments, HUC8, HUC12, GagesII, flowlines, and water bodies. Moreover, PyNHD gives access to an item on ScienceBase called Select Attributes for NHDPlus Version 2.1 Reach Catchments and Modified Network Routed Upstream Watersheds for the Conterminous United States. This item provides over 30 attributes at catchment scale based on NHDPlus ComIDs. These attributes are available in three categories:

Local (local): For individual reach catchments,
Total (upstream_acc): For network-accumulated values using total cumulative drainage area,
Divergence (div_routing): For network-accumulated values using divergence routing.

Moreover, the PyGeoAPI service provides four functionalities:

flow_trace: Trace flow from a starting point to up/downstream direction.
split_catchment: Split the local catchment of a point of interest at the point's location.
elevation_profile: Extract elevation profile along a flow path between two points.
cross_section: Extract cross-section at a point of interest along a flow line.

A list of the catchment-scale attributes for each characteristic type (local, upstream_acc, and div_routing) can be accessed using the nhdplus_attrs function.

Similarly, PyNHD uses an item on HydroShare to get ComID-linked NHDPlus Value Added Attributes. This dataset includes slope and roughness, among other attributes, for all the flowlines. You can use the nhdplus_vaa function to get this dataset.

Additionally, PyNHD offers some extra utilities for processing the flowlines:

prepare_nhdplus: For cleaning up the dataframe by, for example, removing tiny networks, adding a to_comid column, and finding terminal flowlines if they don't exist.
topoogical_sort: For sorting the river network topologically which is useful for routing and flow accumulation.
vector_accumulation: For computing flow accumulation in a river network. This function is generic and any routing method can be plugged in.

These utilities are developed based on an R package called nhdplusTools.
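
To make the idea behind topological sorting concrete, here is a minimal sketch that uses only Python's standard library (graphlib, Python 3.9 or later). It is not PyNHD's implementation; the toy network and the helper name sort_upstream_to_downstream are made up for illustration, and the use of 0 to mark a terminal flowline is an assumption of this example.

```python
from graphlib import TopologicalSorter


def sort_upstream_to_downstream(to_comid):
    """Order flowline IDs so that each one appears before the one it drains into."""
    sorter = TopologicalSorter()
    for comid, downstream in to_comid.items():
        sorter.add(comid)
        if downstream != 0:  # 0 marks a terminal flowline in this toy example
            # The downstream flowline depends on ``comid``, so ``comid`` is emitted first.
            sorter.add(downstream, comid)
    return list(sorter.static_order())


# Toy network: flowlines 1 and 2 join at 3, which drains into the outlet 4.
network = {1: 3, 2: 3, 3: 4, 4: 0}
order = sort_upstream_to_downstream(network)
print(order)
```

Processing flowlines in this order guarantees that every upstream value is known before it is needed downstream, which is what makes routing and flow accumulation possible in a single pass.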

You can find some example notebooks here.

Please note that since this project is in early development stages, while the provided functionalities should be stable, changes in APIs are possible in new releases. But we appreciate it if you give this project a try and provide feedback. Contributions are most welcome.

Moreover, requests for additional functionalities can be submitted via the issue tracker.

Installation

You can install PyNHD using pip after installing libgdal on your system (for example, in Ubuntu run sudo apt install libgdal-dev). Moreover, PyNHD has an optional dependency for persistent caching, requests-cache. We highly recommend installing this package, as it can significantly speed up sending and receiving queries. You don't have to change anything in your code, since PyNHD looks for requests-cache under the hood and, if it is available, automatically uses persistent caching:

$ pip install pynhd

Alternatively, PyNHD can be installed from the conda-forge repository using Conda:

$ conda install -c conda-forge pynhd
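
The automatic detection of an optional dependency mentioned above can be pictured with a small standard-library sketch. This is only an illustration of the general pattern, not PyNHD's actual code; the helper names has_module and pick_backend are hypothetical.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module is installed, without importing it."""
    return importlib.util.find_spec(name) is not None


def pick_backend():
    """Hypothetical selection logic: prefer the caching backend when it is installed."""
    return "requests_cache" if has_module("requests_cache") else "requests"


print(pick_backend())
```

With this pattern the calling code stays the same whether or not the optional package is present, which matches the behavior described for requests-cache.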

Quick start

Let's explore the capabilities of NLDI. We start by importing the classes used throughout this quick start:

from pynhd import NLDI, NHDPlusHR, PyGeoAPI, WaterData
import pynhd as nhd

First, let’s get the watershed geometry of the contributing basin of a USGS station using NLDI:

nldi = NLDI()
station_id = "01031500"

basin = nldi.get_basins(station_id)

The navigate_byid method can be used to navigate NHDPlus both upstream and downstream of any point in the database. Let’s get the ComIDs and flowlines of the tributaries and the main river channel upstream of the station.

flw_main = nldi.navigate_byid(
    fsource="nwissite",
    fid=f"USGS-{station_id}",
    navigation="upstreamMain",
    source="flowlines",
    distance=1000,
)

flw_trib = nldi.navigate_byid(
    fsource="nwissite",
    fid=f"USGS-{station_id}",
    navigation="upstreamTributaries",
    source="flowlines",
    distance=1000,
)

We can get other USGS stations upstream (or downstream) of the station and even set a distance limit (in km):

st_all = nldi.navigate_byid(
    fsource="nwissite",
    fid=f"USGS-{station_id}",
    navigation="upstreamTributaries",
    source="nwissite",
    distance=1000,
)

st_d20 = nldi.navigate_byid(
    fsource="nwissite",
    fid=f"USGS-{station_id}",
    navigation="upstreamTributaries",
    source="nwissite",
    distance=20,
)

Now, let’s get the HUC12 pour points:

pp = nldi.navigate_byid(
    fsource="nwissite",
    fid=f"USGS-{station_id}",
    navigation="upstreamTributaries",
    source="huc12pp",
    distance=1000,
)

https://raw.githubusercontent.com/cheginit/HyRiver-examples/main/notebooks/_static/nhdplus_navigation.png
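
The navigate_byid calls above differ only in navigation, source, and distance. As a hedged convenience sketch, the shared arguments can be collected by a small helper; nav_kwargs and the queries dictionary are hypothetical names introduced here, not part of PyNHD.

```python
def nav_kwargs(station_id, source, distance=1000, navigation="upstreamTributaries"):
    """Build the keyword arguments shared by the navigate_byid calls shown above."""
    return {
        "fsource": "nwissite",
        "fid": f"USGS-{station_id}",
        "navigation": navigation,
        "source": source,
        "distance": distance,  # distance limit in km
    }


station_id = "01031500"
queries = {
    "flw_main": nav_kwargs(station_id, "flowlines", navigation="upstreamMain"),
    "flw_trib": nav_kwargs(station_id, "flowlines"),
    "st_all": nav_kwargs(station_id, "nwissite"),
    "st_d20": nav_kwargs(station_id, "nwissite", distance=20),
    "pp": nav_kwargs(station_id, "huc12pp"),
}
# With an NLDI instance, each entry could then be passed as nldi.navigate_byid(**kwargs).
```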

Also, we can get the slope data for each river segment from NHDPlus VAA database:

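import geopandas as gpd
import numpy as np
import pandas as pd

vaa = nhd.nhdplus_vaa("input_data/nhdplus_vaa.parquet")

# Join the slope attribute to the tributaries by ComID.
flw_trib["comid"] = pd.to_numeric(flw_trib.nhdplus_comid)
slope = gpd.GeoDataFrame(
    pd.merge(flw_trib, vaa[["comid", "slope"]], on="comid"),
    crs=flw_trib.crs,
)
# Mask negative slopes in the slope column only, leaving the other columns intact.
slope.loc[slope.slope < 0, "slope"] = np.nan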

Now, let's explore the PyGeoAPI capabilities:

pygeoapi = PyGeoAPI()

trace = pygeoapi.flow_trace(
    (1774209.63, 856381.68), crs="ESRI:102003", raindrop=False, direction="none"
)

split = pygeoapi.split_catchment((-73.82705, 43.29139), crs="epsg:4326", upstream=False)

profile = pygeoapi.elevation_profile(
    [(-103.801086, 40.26772), (-103.80097, 40.270568)], numpts=101, dem_res=1, crs="epsg:4326"
)

section = pygeoapi.cross_section((-103.80119, 40.2684), width=1000.0, numpts=101, crs="epsg:4326")

https://raw.githubusercontent.com/cheginit/HyRiver-examples/main/notebooks/_static/split_catchment.png

Next, we retrieve the medium- and high-resolution flowlines within the bounding box of our watershed and compare them. Moreover, since several web services offer access to the NHDPlus database, NHDPlusHR has an argument for selecting a service and another for automatically switching between services.

mr = WaterData("nhdflowline_network")
nhdp_mr = mr.bybox(basin.geometry[0].bounds)

hr = NHDPlusHR("networknhdflowline", service="hydro", auto_switch=True)
nhdp_hr = hr.bygeom(basin.geometry[0].bounds)

https://raw.githubusercontent.com/cheginit/HyRiver-examples/main/notebooks/_static/hr_mr.png
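
Automatic switching between services is essentially a fallback pattern. The sketch below shows the general idea with stand-in functions; it is an assumption-laden illustration, not how NHDPlusHR implements auto_switch, and the names first_available, broken_service, working_service, and the "backup" service are hypothetical.

```python
def first_available(services, query):
    """Try each service in order and return the name and result of the first that works."""
    errors = {}
    for name, fetch in services.items():
        try:
            return name, fetch(query)
        except ConnectionError as ex:
            errors[name] = str(ex)  # remember why this service was skipped
    raise ConnectionError(f"All services failed: {errors}")


def broken_service(query):
    """Stand-in for a service that is temporarily down."""
    raise ConnectionError("502 Bad Gateway")


def working_service(query):
    """Stand-in for a service that responds normally."""
    return f"features for {query}"


services = {"hydro": broken_service, "backup": working_service}
used, result = first_available(services, "networknhdflowline")
print(used, result)
```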

Moreover, WaterData can find features within a given radius (in meters) of a point:

eck4 = "+proj=eck4 +lon_0=0 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs"
coords = (-5727797.427596455, 5584066.49330473)
rad = 5e3
flw_rad = mr.bydistance(coords, rad, loc_crs=eck4)
flw_rad = flw_rad.to_crs(eck4)

Instead of getting all features within a radius of the coordinate, we can snap to the closest flowline using NLDI:

# Reuse the coordinates defined above and query the WaterData service object (mr).
comid_closest = nldi.comid_byloc(coords, eck4)
flw_closest = mr.byid("comid", comid_closest.comid.values[0])

https://raw.githubusercontent.com/cheginit/HyRiver-examples/main/notebooks/_static/nhdplus_radius.png
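
The difference between a radius query and snapping to the closest feature can be sketched with plain points in a projected coordinate system with meter units. This is a simplified illustration using the standard library (math.dist, Python 3.8 or later); the real services work on flowline geometries, and the point IDs and coordinates below are made up.

```python
import math


def within_radius(points, center, radius):
    """Keep the points whose straight-line distance to ``center`` is at most ``radius``."""
    return {pid: xy for pid, xy in points.items() if math.dist(xy, center) <= radius}


def closest(points, center):
    """Return the ID of the single point nearest to ``center``."""
    return min(points, key=lambda pid: math.dist(points[pid], center))


# Hypothetical projected coordinates in meters.
points = {"a": (0.0, 0.0), "b": (3000.0, 4000.0), "c": (6000.0, 8000.0)}
center = (0.0, 100.0)

near = within_radius(points, center, 5e3)
nearest = closest(points, center)
print(sorted(near), nearest)
```

A radius query can return many features or none, while snapping always returns exactly one, which is why the two approaches are offered separately.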

Since NHDPlus HR is still at the pre-release stage, let's use the MR flowlines to demonstrate the vector-based accumulation. Based on a topologically sorted river network, pynhd.vector_accumulation computes flow accumulation in the network. It returns a dataframe, sorted from upstream to downstream, that shows the accumulated flow in each node.

PyNHD has a utility called prepare_nhdplus that identifies the upstream-downstream relationships among the flowlines, among other things such as fixing some common issues with NHDPlus flowlines. But first we need to get all the NHDPlus attributes for each ComID, since NLDI only provides the flowlines’ geometries and ComIDs, which is useful for navigating the vector river network data. To get the NHDPlus database we use WaterData. Let’s use the nhdflowline_network layer to get the required info.

wd = WaterData("nhdflowline_network")

comids = flw_trib.nhdplus_comid.to_list()
nhdp_trib = wd.byid("comid", comids)
flw = nhd.prepare_nhdplus(nhdp_trib, 0, 0, purge_non_dendritic=False)

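To demonstrate the use of routing, let's accumulate one of the NHDPlus catchment attributes, CAT_RECHG, which we retrieve with NLDI. The list of available NHDPlus attributes can be obtained with the nhdplus_attrs function.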

char = "CAT_RECHG"
area = "areasqkm"

local = nldi.getcharacteristic_byid(comids, "local", char_ids=char)
flw = flw.merge(local[char], left_on="comid", right_index=True)

def runoff_acc(qin, q, a):
    return qin + q * a

flw_r = flw[["comid", "tocomid", char, area]]
runoff = nhd.vector_accumulation(flw_r, runoff_acc, char, [char, area])

def area_acc(ain, a):
    return ain + a

flw_a = flw[["comid", "tocomid", area]]
areasqkm = nhd.vector_accumulation(flw_a, area_acc, area, [area])

runoff /= areasqkm
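
The arithmetic behind the block above is an area-weighted average: runoff_acc accumulates value times area, area_acc accumulates area, and their ratio is the weighted mean of everything upstream. The following pure-Python sketch reproduces that logic on a made-up three-flowline network; the accumulate helper is a simplified stand-in for illustration, not pynhd.vector_accumulation itself.

```python
def accumulate(order, to_comid, local):
    """Sum local values down a network that is already sorted upstream to downstream."""
    acc = dict(local)
    for comid in order:
        downstream = to_comid[comid]
        if downstream in acc:  # terminal flowlines point to an ID outside the network
            acc[downstream] += acc[comid]
    return acc


# Toy network: flowlines 1 and 2 both drain into 3, which is the outlet.
to_comid = {1: 3, 2: 3, 3: 0}
order = [1, 2, 3]
area = {1: 2.0, 2: 6.0, 3: 2.0}  # catchment areas in sq. km
recharge = {1: 10.0, 2: 20.0, 3: 30.0}  # local characteristic values

weighted = accumulate(order, to_comid, {c: recharge[c] * area[c] for c in order})
total_area = accumulate(order, to_comid, area)
mean_recharge = {c: weighted[c] / total_area[c] for c in order}
print(mean_recharge)
```

At the outlet, the accumulated product is 20 + 120 + 60 = 200 and the accumulated area is 10, so the area-weighted value is 20.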

Since these are catchment-scale characteristics, let’s get the catchments, add the accumulated characteristic as a new column, and plot the results.

wd = WaterData("catchmentsp")
catchments = wd.byid("featureid", comids)

c_local = catchments.merge(local, left_on="featureid", right_index=True)
c_acc = catchments.merge(runoff, left_on="featureid", right_index=True)

https://raw.githubusercontent.com/cheginit/HyRiver-examples/main/notebooks/_static/flow_accumulation.png

More examples can be found here.

Contributing

Contributions are very welcome. Please read the CONTRIBUTING.rst file for instructions.

Comments

Pynhd: GeoConnex NameError

I'm using PythonWin with Python version 3.10 on a Windows PC. I was trying out the quick start code for the pynhd package. When I reach the line gcx = GeoConnex("gages"), I get a NameError: name GeoConnex is not defined.

I expected my code to assign the value GeoConnex("gages") to the variable gcx.

import pynhd
gcx = GeoConnex("gages")
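
A likely explanation, sketched with a standard-library module so that it runs without PyNHD: a plain import binds only the module name, so names inside the module must be qualified or imported explicitly. Assuming the installed PyNHD version provides GeoConnex, either pynhd.GeoConnex("gages") or from pynhd import GeoConnex would probably avoid the NameError.

```python
import math

try:
    sqrt(9.0)  # bare name: ``import math`` bound only the name ``math``
    outcome = "no error"
except NameError as ex:
    outcome = str(ex)

qualified = math.sqrt(9.0)  # option 1: qualify the name with the module

from math import sqrt  # option 2: import the name explicitly

direct = sqrt(9.0)
print(outcome, qualified, direct)
```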

opened by nluft2 9
Example notebook import errors

I am trying to use the example Jupyter Notebook for PyNHD.

After commenting out the import of a colormap that isn't present:

If I change the import of pynhd to pynhd.pynhd, I get the following message:

opened by jjstarn-usgs 9
WBD watershed access using pynhd.WaterData vs pygeoogc.ArcGISRESTful

Currently we can get WBD data using either pynhd.WaterData or pygeoogc.ArcGISRESTful().restful.wbd. The former accesses the USGS "waterlabs" GeoServer (which presumably is more experimental and less publicly supported) and the latter apparently accesses the official USGS TNM source. The former provides access to HUC08 and HUC12 layers, while the latter apparently only provides access to a HUC12 layer -- at least that's what I see in your pygeoogc example.

Which one would you recommend to a new user of the hydrodata ecosystem? I think that choice is somewhat confusing to users unfamiliar with web services. Personally, I like having the option to access a HUC08 layer directly, but at the same time the projection inconsistencies in the waterlabs Geoserver layers are not user friendly.

For WaterHackWeek I'm using pynhd.WaterData exclusively. FYI, here's the current state of my WHW tutorial notebook that uses the hydrodata suite to query for HUC layers. It's almost done, but it'll continue to change for the next couple of days.

opened by emiliom 9
Bad Gateway for WaterData() URL

Thanks for this library! I am on a Windows 10 and Python 3.8.10

I was running through the example contained in the pynhd readme file for USGS station ID 01482100. I get an error when I run:

wd_cat = WaterData("catchmentsp")

catchments = wd_cat.byid("featureid", comids)

It seems like the URL provided by WaterData() is dead. Do you know if this is a temporary problem, or a permanent change? I get the following error:

---------------------------------------------------------------------------
ClientResponseError                       Traceback (most recent call last)
~\Anaconda3\envs\pangeo\lib\site-packages\async_retriever\async_retriever.py in _retrieve(uid, url, session, read_type, s_kwds, r_kwds)
     59     try:
---> 60         response.raise_for_status()
     61         resp = await getattr(response, read_type)(**r_kwds)

~\Anaconda3\envs\pangeo\lib\site-packages\aiohttp\client_reqrep.py in raise_for_status(self)
    999             self.release()
-> 1000             raise ClientResponseError(
   1001                 self.request_info,

ClientResponseError: 502, message='Bad Gateway', url=URL('https://labs.waterdata.usgs.gov/geoserver/wmadata/ows')

The above exception was the direct cause of the following exception:

ServiceError                              Traceback (most recent call last)
----> 1 catchments = wd_cat.byid("featureid", comids)

~\Anaconda3\envs\pangeo\lib\site-packages\pynhd\pynhd.py in byid(self, featurename, featureids)
    233     def byid(self, featurename: str, featureids: Union[List[str], str]) -> gpd.GeoDataFrame:
    234         """Get features based on IDs."""
--> 235         resp = self.wfs.getfeature_byid(featurename, featureids)
    236         return self._to_geodf(resp)
    237

~\Anaconda3\envs\pangeo\lib\site-packages\pygeoogc\pygeoogc.py in getfeature_byid(self, featurename, featureids)
    502
    503         if len(featureids) > 200:
--> 504             return self.getfeature_byfilter(f"{featurename} IN ({fid_list})", method="POST")
    505
    506         return self.getfeature_byfilter(f"{featurename} IN ({fid_list})")

~\Anaconda3\envs\pangeo\lib\site-packages\pygeoogc\pygeoogc.py in getfeature_byfilter(self, cql_filter, method)
    550         elif method == "POST":
    551             headers = {"content-type": "application/x-www-form-urlencoded"}
--> 552             resp = ar.retrieve(
    553                 [self.url], self.read_method, [{"data": payload, "headers": headers}], "POST"
    554             )

~\Anaconda3\envs\pangeo\lib\site-packages\async_retriever\async_retriever.py in retrieve(urls, read, request_kwds, request_method, max_workers, cache_name, family)
    190     )
    191
--> 192     return [r for _, r in sorted(tlz.concat(results))]
    193
    194

~\Anaconda3\envs\pangeo\lib\site-packages\async_retriever\async_retriever.py in (.0)
    184     chunked_reqs = tlz.partition_all(max_workers, inp.url_kwds)
    185     results = (
--> 186         loop.run_until_complete(
    187             async_session(c, inp.read, inp.r_kwds, inp.request_method, inp.cache_name, inp.family),
    188         )

~\Anaconda3\envs\pangeo\lib\site-packages\nest_asyncio.py in run_until_complete(self, future)
     68                 raise RuntimeError(
     69                     'Event loop stopped before Future completed.')
---> 70             return f.result()
     71
     72     def _run_once(self):

~\Anaconda3\envs\pangeo\lib\asyncio\futures.py in result(self)
    176         self.__log_traceback = False
    177         if self._exception is not None:
--> 178             raise self._exception
    179         return self._result
    180

~\Anaconda3\envs\pangeo\lib\asyncio\tasks.py in __step(failed resolving arguments)
    278                 # We use the send method directly, because coroutines
    279                 # don't have __iter__ and __next__ methods.
--> 280                 result = coro.send(None)
    281             else:
    282                 result = coro.throw(exc)

~\Anaconda3\envs\pangeo\lib\site-packages\async_retriever\async_retriever.py in async_session(url_kwds, read, r_kwds, request_method, cache_name, family)
    115     request_func = getattr(session, request_method.lower())
    116     tasks = (_retrieve(uid, u, request_func, read, kwds, r_kwds) for uid, u, kwds in url_kwds)
--> 117     return await asyncio.gather(*tasks)
    118
    119

~\Anaconda3\envs\pangeo\lib\asyncio\tasks.py in __wakeup(self, future)
    347     def __wakeup(self, future):
    348         try:
--> 349             future.result()
    350         except BaseException as exc:
    351             # This may also be a cancellation.

~\Anaconda3\envs\pangeo\lib\asyncio\tasks.py in __step(failed resolving arguments)
    278                 # We use the send method directly, because coroutines
    279                 # don't have __iter__ and __next__ methods.
--> 280                 result = coro.send(None)
    281             else:
    282                 result = coro.throw(exc)

~\Anaconda3\envs\pangeo\lib\site-packages\async_retriever\async_retriever.py in _retrieve(uid, url, session, read_type, s_kwds, r_kwds)
     61         resp = await getattr(response, read_type)(**r_kwds)
     62     except (ClientResponseError, ContentTypeError) as ex:
---> 63         raise ServiceError(await response.text()) from ex
     64     else:
     65         return uid, resp

ServiceError:
502 Bad Gateway
502 Bad Gateway

bug

opened by sjordan29 7

InvalidInputValue Error Using NLDI navigate_byid()

What happened: Following an example notebook - dam_impact.ipynb fails at the cell (17) using navigate_byid(). All previous cells executed successfully and in sequence. Returns the following error -

---------------------------------------------------------------------------
InvalidInputValue                         Traceback (most recent call last)
c:\Users\keonm\Documents\GitHub\HyRiver-examples\Geospatial Hydrologic Data Using Web Services.ipynb Cell 20' in <cell line: 3>()
      3 for agency, fid in sites[["agency_cd", "site_no"]].itertuples(index=False, name=None):
      4     try:
----> 5         flw_up[fid] = nldi.navigate_byid(
      6             fsource="nwissite",
      7             fid=f"{agency}-{fid}",
      8             navigation="upstreamTributaries",
      9             source="flowlines",
     10             distance=10)
     11     except ZeroMatched:
     12         noflw.append(fid)

File ~\miniconda3\envs\pygeo-hyriver\lib\site-packages\pynhd\pynhd.py:945, in NLDI.navigate_byid(self, fsource, fid, navigation, source, distance, trim_start)
    942     raise ZeroMatched
    944 if navigation not in valid_navigations.keys():
--> 945     raise InvalidInputValue("navigation", list(valid_navigations.keys()))
    947 url = valid_navigations[navigation]
    949 r_json = self._get_url(url)

InvalidInputValue: Given navigation is invalid. Valid options are:
description
type

What you expected to happen: Cell executes + calls NLDI web service using function

Environment: Created conda environment using repo's .yml. Installed Jupyter Lab & running in VSCode.

opened by kdmonroe 5

Invalid projection issue while importing "get_basins()" method from "pynhd".
What happened:

What you expected to happen:

Minimal Complete Verifiable Example:

# Put your MCVE code here

Anything else we need to know?: I am using Python 3.8 for which I get errors of failing to load DLL files. Do I need to use earlier version of Python?

Environment:

Output of pygeohydro.show_versions()
opened by rezaulwre 5
Error while using pynhd.
What happened: I am trying to use pynhd but it gives the error: "ImportError: DLL load failed while importing lib: The specified module could not be found." How to solve this?

pynhd.show_versions()

ImportError                               Traceback (most recent call last)
----> 1 pynhd.show_versions()

~\anaconda3\lib\site-packages\pynhd\print_versions.py in show_versions(file)
    168     for (modname, ver_f) in deps:
    169         try:
--> 170             mod = _get_mod(modname)
    171         except ModuleNotFoundError:
    172             deps_blob.append((modname, None))

~\anaconda3\lib\site-packages\pynhd\print_versions.py in _get_mod(modname)
     94         return sys.modules[modname]
     95     try:
---> 96         return importlib.import_module(modname)
     97     except ModuleNotFoundError:
     98         return importlib.import_module(modname.replace("-", ""))

~\anaconda3\lib\importlib\__init__.py in import_module(name, package)
    125                 break
    126             level += 1
--> 127     return _bootstrap._gcd_import(name[level:], package, level)
    128
    129

~\anaconda3\lib\importlib\_bootstrap.py in _gcd_import(name, package, level)

~\anaconda3\lib\importlib\_bootstrap.py in _find_and_load(name, import_)

~\anaconda3\lib\importlib\_bootstrap.py in _find_and_load_unlocked(name, import_)

~\anaconda3\lib\importlib\_bootstrap.py in _load_unlocked(spec)

~\anaconda3\lib\importlib\_bootstrap_external.py in exec_module(self, module)

~\anaconda3\lib\importlib\_bootstrap.py in _call_with_frames_removed(f, *args, **kwds)

~\anaconda3\lib\site-packages\pygeos\__init__.py
     32 # end delvewheel patch
     33
---> 34 from .lib import GEOSException  # NOQA
     35 from .lib import Geometry  # NOQA
     36 from .lib import geos_version, geos_version_string  # NOQA

ImportError: DLL load failed while importing lib: The specified module could not be found.

What you expected to happen:

Minimal Complete Verifiable Example:

# Put your MCVE code here

Anything else we need to know?:

Environment: I am using Python 3.7.12 with anaconda.

Output of pynhd.show_versions()
opened by rezaulwre 4
flow_trace() only returns one upstream river reach
I expected the flow_trace() function to return all of the upstream river reaches, but it is only returning one. I think this is because the NLDI API changed, and now requires distance as a parameter. https://waterdata.usgs.gov/blog/nldi_update/#distance-is-now-a-required-query-parameter

import pynhd
pygeoapi = pynhd.PyGeoAPI()
lng, lat = -73.82705, 43.29139
trace = pygeoapi.flow_trace((lng, lat), crs="epsg:4326", direction="up")
print(len(trace))

returns 1 (expected this watershed to contain dozens or even hundreds of river reaches).
opened by mheberger 3
Add support for StreamStat

Is your feature request related to a problem? Please describe.
Add support for StreamStats following the suggestion in cheginit/pygeohydro#38

Describe the solution you'd like

NLDI and StreamStats are working together to revise the NLDI delineation tools so they will delineate from a click point not just from the catchment. The data processing steps and quality assurance work, as well as the underlying data in StreamStats, typically mean that delineations from StreamStats will be more accurate than from the NHDPlus datasets being queried in NLDI. For example, South Carolina data is based on lidar data, we're currently working on 3-meter lidar data in Nebraska. Thus, depending on the use, you may want to include the option of using StreamStats as well as NLDI.

Describe alternatives you've considered
We need to figure out a way to implement StreamStats so that it can complement NLDI and/or work through an API similar to NLDI's.

Additional context
@USGSPMM3, I went through the documentation and it seems that it's designed with a specific workflow in mind. I was wondering if you can provide a common example. Also, can you explain the importance of rcode? I don't understand the reason behind rcode being mandatory when you can provide lon and lat.

enhancement

opened by cheginit 3

Error when using nhd.byids("COMID", main.index.tolist()) (River Elevation and Cross-Section example)

What happened: I'm just trying to run the example notebook "River Elevation and Cross-Section"

I got an error at cell 13 when trying to query the flowlines by COMID

I get a JSONDecodeError

Minimal Complete Verifiable Example: This simple code produces the same error:

nhd = NHD("flowline_mr")
main_nhd = nhd.byids('COMID',['1722317'])

Environment:

Output of pynhd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.9.13 | packaged by conda-forge | (main, May 27 2022, 16:50:36) [MSC v.1929 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 141 Stepping 1, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: English_United States.1252
libhdf5: 1.12.1
libnetcdf: 4.8.1

aiodns: 3.0.0
aiohttp: 3.8.1
aiohttp-client-cache: 0.7.1
aiosqlite: 0.17.0
async-retriever: 0.3.3
bottleneck: 1.3.4
brotli: installed
cchardet: 2.1.7
click: 8.1.3
cytoolz: 0.11.2
dask: 2022.6.1
defusedxml: 0.7.1
folium: 0.12.1.post1
geopandas: 0.11.0
lxml: 4.9.0
matplotlib: 3.4.3
netCDF4: 1.5.8
networkx: 2.8.4
numpy: 1.23.0
owslib: 0.25.0
pandas: 1.4.3
py3dep: 0.0
pyarrow: 6.0.0
pydantic: 1.9.1
pydaymet: 0.13.2
pygeohydro: 0.13.2
pygeoogc: 0.13.2
pygeos: 0.12.0
pygeoutils: 0.13.2
pynhd: 0.13.2
pyproj: 3.3.1
pytest: None
pytest-cov: None
rasterio: 1.2.10
requests: 2.28.0
requests-cache: 0.9.4
richdem: None
rioxarray: 0.11.1
scipy: 1.8.1
shapely: 1.8.2
tables: None
ujson: 5.3.0
urllib3: 1.26.9
xarray: 2022.3.0
xdist: None
yaml: 6.0

opened by LucRSquared 2

BOT: [skip ci] Bump styfle/cancel-workflow-action from 0.9.1 to 0.10.0
Bumps styfle/cancel-workflow-action from 0.9.1 to 0.10.0.

dependencies
opened by dependabot[bot] 2

Releases (v0.13.8)

v0.13.8 (Dec 9, 2022)
Release Notes

New Features

Add a new function, called nhdplus_attrs_s3, for accessing the recently released NHDPlus derived attributes on a USGS S3 bucket. The attributes are provided in parquet files, so getting them is faster than with nhdplus_attrs. Also, you can request multiple attributes at once, whereas with nhdplus_attrs you had to request each attribute one at a time. This function will replace nhdplus_attrs in a future release, as soon as all data that are available on the ScienceBase version are also accessible from the S3 bucket.

Add two new functions called mainstem_huc12_nx and enhd_flowlines_nx. These functions generate a networkx directed graph object of NHD HUC12 water boundaries and flowlines, respectively. They also return a dictionary mapping of COMID and HUC12 to the corresponding networkx node. Additionally, a topologically sorted list of COMIDs/HUC12s is returned. The generated data are useful for doing US-scale network analysis and flow accumulation on the NHD network. The NHD graph has about 2.7 million edges and the mainstem HUC12 graph has about 80K edges.

Add a new function for getting the entire NHDPlus dataset for CONUS (Lower 48), called nhdplus_l48. The entire NHDPlus dataset is downloaded from here. This 7.3 GB file will take a while to download, depending on your internet connection. The first time you run this function, the file will be downloaded and stored in the ./cache directory. Subsequent calls will use the cached file. Moreover, there are two additional dependencies for using this function: pyogrio and py7zr. These dependencies can be installed using pip install pyogrio py7zr or conda install -c conda-forge pyogrio py7zr.
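
The download-once behavior described for nhdplus_l48 follows a common caching pattern, sketched below with the standard library. This is an illustration under stated assumptions rather than PyNHD's code: cached_fetch, fake_download, and the file name are hypothetical, and a stand-in function replaces the real network download.

```python
import tempfile
from pathlib import Path


def cached_fetch(path, fetch):
    """Return the bytes stored at ``path``, calling ``fetch`` only if the file is missing."""
    path = Path(path)
    if not path.exists():
        # Create the cache folder first so that writing the file cannot fail on a missing directory.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(fetch())
    return path.read_bytes()


calls = []


def fake_download():
    """Stand-in for a slow download; records how many times it was called."""
    calls.append(1)
    return b"nhdplus"


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "cache" / "nhdplus_l48.bin"  # hypothetical file name
    first = cached_fetch(target, fake_download)
    second = cached_fetch(target, fake_download)

print(len(calls))
```

Only the first call triggers the download; the second one reads the cached file.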

Internal Changes

Refactor vector_accumulation for significant performance improvements.

Modify the codebase based on Refurb suggestions.

v0.13.7(Nov 4, 2022)
Release Notes

New Features

Add a new function called epa_nhd_catchments to access one of the EPA's HMS endpoints called WSCatchment. You can use this function to access 414 catchment-scale characteristics for all the NHDPlus catchments including 16-day average curve number. More information on the curve number dataset can be found at its project page here.

Bug Fixes

Fix a bug in NHDTools where, due to recent changes in pandas exception handling, converting columns with NaN values to an integer type failed. pandas now throws IntCastingNaNError instead of TypeError when the astype method is used on such a column.
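One way to write a cast that tolerates both exception types is sketched below. It assumes pandas is installed; the to_int_column helper is hypothetical, and the fallback to the nullable Int64 dtype is an illustrative choice, not necessarily what NHDTools does.

```python
import pandas as pd


def to_int_column(col: pd.Series) -> pd.Series:
    """Cast to int64, falling back to nullable Int64 when NaNs block the cast."""
    try:
        return col.astype("int64")
    except (TypeError, ValueError):
        # IntCastingNaNError subclasses ValueError, so catching ValueError
        # covers it, while TypeError covers the older behavior noted above.
        return col.astype("Int64")


clean = to_int_column(pd.Series([1.0, 2.0, 3.0]))
with_gaps = to_int_column(pd.Series([1.0, float("nan"), 3.0]))
```

Catching the base classes keeps the code independent of the installed pandas version, since pandas.errors.IntCastingNaNError is not available in older releases.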

Internal Changes

Use pyupgrade package to update the type hinting annotations to Python 3.10 style.

v0.13.6(Aug 30, 2022)
Release Notes

Internal Changes

Add the missing PyPI classifiers for the supported Python versions.

v0.13.5(Aug 29, 2022)
Release Notes

Breaking Changes

Append "Error" to all exception classes for conforming to PEP-8 naming conventions.

Internal Changes

Bump the minimum versions of pygeoogc and pygeoutils to 0.13.5 and that of async-retriever to 0.3.5.

Bug Fixes

Fix an issue in the nhdplus_vaa and enhd_attrs functions where the cache folder was not created if it did not exist, resulting in an error.

v0.13.3(Jul 31, 2022)
Release Notes

Internal Changes

Use the new async_retriever.stream_write function to download files in nhdplus_vaa and enhd_attrs functions. This is more memory efficient.

Convert the list of not found items in NLDI.comid_byloc and NLDI.feature_byloc from a list of strings to a list of coordinate tuples, so that the type of the returned not found coordinates matches that of the inputs.

Fix an issue with NLDI that was caused by the recent changes in the NLDI web service's error handling. The NLDI web service now returns more descriptive error messages in a json format instead of returning the usual status errors.

Slice the ENHD dataframe in NHDTools.clean_flowlines before updating the flowline dataframe to reduce the required memory for the update operation.

v0.13.2(Jun 14, 2022)
Release Notes

Breaking Changes

Set the minimum supported version of Python to 3.8 since many of the dependencies such as xarray, pandas, rioxarray have dropped support for Python 3.7.

Internal Changes

Use micromamba for running tests and use nox for linting in CI.

v0.13.1(Jun 12, 2022)
Release Notes

New Features

Add support for all the GeoConnex web service endpoints. There are two ways to use it. For a single query, you can use the geoconnex function and for multiple queries, it's more efficient to use the GeoConnex class.

Add support for passing any of the supported NLDI feature sources to the get_basins method of the NLDI class. The default is nwissite to retain backward compatibility.

Bug Fixes

Set the type of "ReachCode" column to str instead of int in pygeoapi and nhdplus_vaa functions.

v0.13.0(Apr 4, 2022)
Release Notes

New Features

Add two new functions called flowline_resample and network_resample for resampling a flowline or network of flowlines based on a given spacing. This is useful for smoothing jagged flowlines similar to those in the NHDPlus database.
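The idea of resampling a line at a given spacing can be sketched in plain Python. This sketch assumes planar coordinates and simple linear interpolation between vertices; resample_line is a hypothetical helper and does not reproduce how flowline_resample is implemented.

```python
import math


def resample_line(coords, spacing):
    """Return points every ``spacing`` units along a polyline, keeping both ends."""
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    points = [coords[0]]
    carried = 0.0  # distance travelled since the last emitted point
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0:
            continue
        # Distance along this segment at which the next point falls.
        pos = spacing - carried
        while pos <= seg:
            t = pos / seg
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            pos += spacing
        # Leftover length after the last point placed on this segment.
        carried = seg - (pos - spacing)
    if points[-1] != coords[-1]:
        points.append(coords[-1])
    return points


# An L-shaped line of total length 7, sampled every 2 units.
pts = resample_line([(0, 0), (3, 0), (3, 4)], 2.0)
```

Carrying the leftover distance across vertices keeps the spacing measured along the line rather than restarting at each vertex.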

Add support for the new NLDI endpoint called "hydrolocation". The NLDI class now has two methods for getting features by coordinates: feature_byloc and comid_byloc. The feature_byloc method returns the flowline that is associated with the closest NHDPlus feature to the given coordinates. The comid_byloc method returns a point on the closest downstream flowline to the given coordinates.

Add a new function called pygeoapi for calling the API in batch mode. This function accepts the input coordinates as a geopandas.GeoDataFrame. It is more performant than calling its counterpart, PyGeoAPI, multiple times. It's recommended to switch to using this new batch function instead of the PyGeoAPI class. Users just need to prepare an input data frame that has all the required service parameters as columns.

Add a new step to prepare_nhdplus to convert MultiLineString to LineString.

Add support for the simplified flag of NLDI's get_basins function. The default value is True to retain the old behavior.

Breaking Changes

Remove caching-related arguments from all functions since they can now be set globally via three environment variables:

HYRIVER_CACHE_NAME: Path to the caching SQLite database.

HYRIVER_CACHE_EXPIRE: Expiration time for cached requests in seconds.

HYRIVER_CACHE_DISABLE: Disable reading/writing from/to the cache file.

You can do this like so:

import os

os.environ["HYRIVER_CACHE_NAME"] = "path/to/file.sqlite"
os.environ["HYRIVER_CACHE_EXPIRE"] = "3600"
os.environ["HYRIVER_CACHE_DISABLE"] = "true"
v0.12.2(Feb 4, 2022)
Release Notes

New Features

Add a new class called NHD for accessing the latest National Hydrography Dataset. More info regarding this data can be found here.

Add two new functions for getting cross-sections along a single flowline via flowline_xsection or throughout a network of flowlines via network_xsection. You can specify spacing and width parameters to control their location. For more information and examples, please consult the documentation.

Add a new property to AGRBase called service_info to include some useful info about the service including feature_types which can be handy for converting numeric values of types to their string equivalent.

Internal Changes

Use the new PyGeoAPI API.

Refactor prepare_nhdplus for improving the performance and robustness of determining tocomid within a network of NHD flowlines.

Add empty geometries that NLDI.get_basins returns to the list of not found IDs. This is because the NLDI service does not include non-network flowlines and instead returns an empty geometry for these flowlines (issue #48).

v0.12.1(Dec 31, 2021)
Release Notes

Internal Changes

Use the three new ar.retrieve_* functions instead of the old ar.retrieve function to improve type hinting and to make the API more consistent.

Revert to the original PyGeoAPI base URL.

v0.12.0(Dec 28, 2021)
Release Notes

Breaking Changes

Rewrite ScienceBase to make it generally usable for working with other ScienceBase items. A new function has been added for staging the Additional NHDPlus attributes items called stage_nhdplus_attrs.

Refactor AGRBase to remove unnecessary functions and make it more general.

Update the PyGeoAPI class to conform to the new pygeoapi API. This web service is undergoing changes at the time of this release, so its API is not stable and might not work as expected. As soon as the web service is stable, a new version will be released.

New Features

In WaterData.byid show a warning if there are any missing feature IDs that are requested but are not available in the dataset.

For all by* methods of WaterData throw a ZeroMatched exception if no features are found.

Add expire_after and disable_caching arguments to all functions that use async_retriever. Set the default request caching expiration time to never expire. You can use disable_caching if you don't want to use the cached responses. Please refer to the documentation of the functions for more details.

Internal Changes

Refactor prepare_nhdplus to reduce code complexity by grouping all the NHDPlus tools as a private class.

Modify AGRBase to reflect the latest API changes in the pygeoogc.ArcGISRESTful class.

Refactor prepare_nhdplus by creating a private class that includes all the previously used private functions. This will make the code more readable and easier to maintain.

Add all the missing types so mypy --strict passes.

v0.11.4(Nov 12, 2021)
Release Notes

New Features

Add a new argument to NLDI.get_basins called split_catchment which if set to True will split the basin geometry at the watershed outlet.

Internal Changes

Catch service errors in PyGeoAPI and show useful error messages.

Use importlib-metadata for getting the version instead of pkg_resources to decrease import time as discussed in this issue.
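A version lookup of this kind can be sketched as follows. The sketch uses the standard-library importlib.metadata module (Python 3.8+), which exposes the same version function as the importlib-metadata backport; the get_version helper and its fallback value are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def get_version(dist_name: str, fallback: str = "999") -> str:
    """Look up an installed distribution's version without importing pkg_resources."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed, e.g. when running from a source tree.
        return fallback


missing = get_version("a-distribution-that-is-not-installed")
```

Unlike pkg_resources, this lookup does not scan every installed distribution at import time, which is where the import-time saving comes from.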

v0.11.3(Sep 11, 2021)
Release Notes

Internal Changes

More robust handling of inputs and outputs of NLDI's methods.

Use an alternative download link for NHDPlus VAA file on Hydroshare.

Restructure the code base to reduce the complexity of the pynhd.py file by dividing it into three files: pynhd, which has all classes that provide access to the supported web services; core, which includes base classes; and nhdplus_derived, which has functions for getting databases that provide additional attributes for the NHDPlus database.

v0.11.2(Aug 27, 2021)
Release Notes

New Features

Add support for PyGeoAPI. It offers four functionalities: flow_trace, split_catchment, elevation_profile, and cross_section.

v0.11.1(Jul 31, 2021)
Release Notes

New Features

Add a function for getting all NHD Fcodes as a dataframe, called nhd_fcode.

Improve prepare_nhdplus function by removing all coastlines and better detection of the terminal point in a network.

Internal Changes

Migrate to using AsyncRetriever for handling communications with web services.

Catch ConnectionError separately in NLDI and raise a ServiceError instead, so the user knows that data cannot be returned because the server is out of service rather than because no features matched (ZeroMatched).
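The distinction between the two failure modes can be sketched in plain Python. The ServiceError and ZeroMatched classes below are local stand-ins defined only for this illustration, and query, offline, and outcome are hypothetical helpers; this is not PyNHD's actual code.

```python
class ServiceError(Exception):
    """Stand-in: the remote service could not be reached."""


class ZeroMatched(ValueError):
    """Stand-in: the service responded but found nothing."""


def query(fetch):
    """Call ``fetch`` and translate failures into distinct exception types."""
    try:
        features = fetch()
    except ConnectionError as ex:
        # A network failure is reported as a service problem, not an empty result.
        raise ServiceError("The service is unreachable.") from ex
    if not features:
        raise ZeroMatched("No feature matched the query.")
    return features


def offline():
    """Simulate a server that is out of service."""
    raise ConnectionError("connection refused")


def outcome(fetch):
    """Return the features, or the name of the exception that was raised."""
    try:
        return query(fetch)
    except (ServiceError, ZeroMatched) as ex:
        return type(ex).__name__
```

With this split, calling outcome(offline) reports a service problem, while a fetch that returns an empty list reports that nothing matched.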

v0.11.0(Jun 19, 2021)
Release Notes

New Features

Add nhdplus_vaa to access NHDPlus Value Added Attributes for all its flowlines.

To see a list of available layers in NHDPlus HR, you can instantiate its class without passing any argument like so NHDPlusHR().

Breaking Changes

Drop support for Python 3.6 since many of the dependencies such as xarray and pandas have done so.

Internal Changes

Use persistent caching for all requests which can help speed up network responses significantly.

Improve documentation and testing.

v0.10.1(Mar 27, 2021)
Release Notes

Add an announcement regarding the new name for the software stack, HyRiver.

Improve pip installation and release workflow.

v0.10.0(Mar 6, 2021)

Please check HISTORY.rst file for a detailed list of changes.
v0.9.0(Feb 17, 2021)

Please check HISTORY.rst file for a detailed list of changes.
v0.2.0(Dec 7, 2020)

Please check HISTORY.rst file for a detailed list of changes.
v0.1.3(Aug 18, 2020)

Replaced simplejson with orjson to speed-up JSON operations.
v0.1.2(Aug 12, 2020)

Added show_versions function for showing versions of all the installed deps and improved the documentation.
v0.1.1(Aug 4, 2020)
Improved documentation

Refactored WaterData to improve readability.

This release will be a part of Hydrodata 0.7.0.

v0.1.0(Jul 24, 2020)

Initial release to pypi.