Python for downloading model data (HRRR, RAP, GFS, NBM, etc.) from NOMADS, NOAA's Big Data Program partners (Amazon, Google, Microsoft), and the University of Utah Pando Archive System.

Overview

Herbie: Retrieve NWP Model Data

The NOAA Big Data Program has made weather data more accessible than ever before. Herbie is a Python package that downloads recent and archived numerical weather prediction (NWP) model output from different cloud archive sources. Herbie helps you discover and download High Resolution Rapid Refresh (HRRR), Rapid Refresh (RAP), Global Forecast System (GFS), National Blend of Models (NBM), and Rapid Refresh Forecast System - Prototype (RRFS) output. NWP data is usually in GRIB2 format and can be read with xarray/cfgrib.

πŸ“” Herbie Documentation

Install

Requires cURL and Python 3.8+ with requests, numpy, pandas, xarray, and cfgrib. Optional packages are matplotlib, cartopy, and Carpenter Workshop.

pip install herbie-data

or

pip install git+https://github.com/blaylockbk/Herbie.git

or, create the provided conda environment.

Capabilities

  • Search different data sources for model output.
  • Download full GRIB2 files
  • Download subset GRIB2 files (by grib field)
  • Read data with xarray
  • Plot data with Cartopy (very early development)

from herbie.archive import Herbie

# Herbie object for the HRRR model 6-hr surface forecast product
H = Herbie('2021-01-01 12:00',
           model='hrrr',
           product='sfc',
           fxx=6)

# Download the full GRIB2 file
H.download()

# Download a subset, like all fields at 500 mb
H.download(":500 mb")

# Read subset with xarray, like 2-m temperature.
H.xarray("TMP:2 m")
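
Under the hood, a subset download works by searching the remote GRIB2 index (.idx) file with a regular expression and then requesting only the matching byte ranges with cURL. A minimal sketch of the matching step (the sample index lines and the `search_idx` helper are illustrative, not Herbie's internals):

```python
import re

# A few sample lines in the style of a GRIB2 .idx file (message number,
# byte offset, run time, variable, level, forecast hour). Illustrative only.
idx_lines = [
    "71:41523967:d=2021010112:TMP:500 mb:6 hour fcst:",
    "72:42337094:d=2021010112:RH:500 mb:6 hour fcst:",
    "580:215378123:d=2021010112:TMP:2 m above ground:6 hour fcst:",
]

def search_idx(lines, search_string):
    """Return the index lines whose text matches the regex search string."""
    return [line for line in lines if re.search(search_string, line)]

# All fields at 500 mb
print(search_idx(idx_lines, ":500 mb"))
# 2-m temperature only
print(search_idx(idx_lines, "TMP:2 m"))
```

Because the search string is a regular expression, patterns like `":(TMP|RH):500 mb"` work too; the byte offsets of the matching lines are what get passed to cURL as range requests.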

Data Sources

Herbie downloads model data from the following sources, but can be extended to include others:

  • NOMADS
  • Big Data Program Partners (AWS, Google, Azure)
  • University of Utah CHPC Pando archive

History

During my PhD at the University of Utah, I created what was, at the time, the only publicly accessible archive of HRRR data. In the latter half of 2020, this data was made available through the NOAA Big Data Program. This package organizes and expands my original download scripts into a more coherent package that can download HRRR and RAP model data from different data sources. It will continue to evolve at my own leisure.

I originally released this package under the name "HRRR-B" because it only dealt with the HRRR data set, but I have since added the ability to download RAP data. Thus, it was rebranded "Herbie," a model download assistant. For now, it is still called "hrrrb" on PyPI because "herbie" is already taken. Maybe someday, with some time and an enticing reason, I'll add additional download capabilities.

Alternative Download Tools

As an alternative, you can use rclone to download files from AWS or GCP. I quite like rclone. Here is a short rclone tutorial.


Thanks for using Herbie, and Happy Racing 🏎 🏁

- Brian

πŸ‘¨πŸ»β€πŸ’» Contributing Guidelines
πŸ’¬ GitHub Discussions
πŸš‘ GitHub Issues
🌐 Personal Webpage
🌐 University of Utah HRRR archive

βœ’ Pando HRRR Archive citation:

Blaylock B., J. Horel and S. Liston, 2017: Cloud Archiving and Data Mining of High Resolution Rapid Refresh Model Output. Computers and Geosciences. 109, 43-50. https://doi.org/10.1016/j.cageo.2017.08.005

P.S. If you like Herbie, check out my GOES-2-go package to download GOES-East/West data and SynopticPy to download mesonet data from the Synoptic API.

Comments
  • HRRR as Zarr on AWS

    @blaylockbk , this is probably the wrong place to raise this, but I saw in your HRRR Archive FAQ, you said:

    One day, we hope this data will be archived elsewhere that is more accessible to everyone. Perhaps soon it will be hosted by Amazon through their Open Data initiative. I would advocate to keep it in the GRIB2 format (the original format it is output as), but it would also be nice to store the data in a "cloud-friendly" format such as Zarr.

    To have archived HRRR data in Zarr would be AMAZING. We were trying to figure out how to download 1 year of HRRR surface fields to drive a Delaware Bay hydrodynamics simulation, and thinking how useful it would be to have the data on AWS. We could store as Zarr but create GRIB-on-demand service for those who need it. I've been active on the Pangeo project, and we have some tools now that could make the conversion, chunking and upload to cloud much easier. And I'd be happy to help out.

    @zflamig, you guys would be up for a proposal on this, right?

    opened by rsignell-usgs 21
  • Convert PosixPath to str before passing to cfgrib

    I was just following the basic setup guide. Installed with conda using the example environment.yaml file, and ran into this issue:

    >>> H.xarray('TMP:2 m')
    indexpath value  is ignored
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/local/anaconda3/envs/herbie/lib/python3.9/site-packages/herbie/archive.py", line 900, in xarray
        Hxr = cfgrib.open_datasets(
      File "/usr/local/anaconda3/envs/herbie/lib/python3.9/site-packages/cfgrib/xarray_store.py", line 105, in open_datasets
        datasets = open_variable_datasets(path, backend_kwargs=backend_kwargs, **kwargs)
      File "/usr/local/anaconda3/envs/herbie/lib/python3.9/site-packages/cfgrib/xarray_store.py", line 93, in open_variable_datasets
        datasets.extend(raw_open_datasets(path, bk, **kwargs))
      File "/usr/local/anaconda3/envs/herbie/lib/python3.9/site-packages/cfgrib/xarray_store.py", line 66, in raw_open_datasets
        datasets.append(open_dataset(path, backend_kwargs=backend_kwargs, **kwargs))
      File "/usr/local/anaconda3/envs/herbie/lib/python3.9/site-packages/cfgrib/xarray_store.py", line 39, in open_dataset
        return xr.open_dataset(path, **kwargs)  # type: ignore
      File "/usr/local/anaconda3/envs/herbie/lib/python3.9/site-packages/xarray/backends/api.py", line 495, in open_dataset
        backend_ds = backend.open_dataset(
      File "/usr/local/anaconda3/envs/herbie/lib/python3.9/site-packages/cfgrib/xarray_plugin.py", line 99, in open_dataset
        store = CfGribDataStore(
      File "/usr/local/anaconda3/envs/herbie/lib/python3.9/site-packages/cfgrib/xarray_plugin.py", line 39, in __init__
        self.ds = opener(filename, **backend_kwargs)
      File "/usr/local/anaconda3/envs/herbie/lib/python3.9/site-packages/cfgrib/dataset.py", line 728, in open_fieldset
        index = messages.FieldsetIndex.from_fieldset(fieldset, index_keys, computed_keys)
      File "/usr/local/anaconda3/envs/herbie/lib/python3.9/site-packages/cfgrib/messages.py", line 365, in from_fieldset
        iteritems = enumerate(fieldset)
    TypeError: 'PosixPath' object is not iterable
    

    It seems cfgrib expects either a str filename or an already-opened file object: https://github.com/ecmwf/cfgrib/blob/a0d9763/cfgrib/xarray_plugin.py#L35-L38

    Converting the path to str here makes the example pass for me.
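
The fix the reporter describes is a one-line conversion before the cfgrib call; a sketch (the file path here is hypothetical):

```python
from pathlib import Path

# Hypothetical local GRIB2 file tracked as a pathlib.Path
local_file = Path("/tmp/hrrr.t12z.wrfsfcf06.grib2")

# cfgrib expects a str filename (or an already-open fieldset), not a
# pathlib.Path, so convert before calling cfgrib.open_datasets():
path_str = str(local_file)
print(path_str)
```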

    opened by WToma 6
  • GFS data downloaded with searchString is not complete. Radiation variables are omitted.

    Hi!

    I have tried to download some radiation variables from GFS with no success. Some months ago I was able to download this data with the same version (Herbie 0.0.6); however, now it only gives me the following set of allowed variables:

    array(['PRMSL', 'CLWMR', 'ICMR', 'RWMR', 'SNMR', 'GRLE', 'REFD', 'REFC',
           'VIS', 'UGRD', 'VGRD', 'VRATE', 'GUST', 'HGT', 'TMP', 'RH', 'SPFH',
           'VVEL', 'DZDT', 'ABSV', 'O3MR', 'TCDC', 'HINDEX', 'MSLET', 'PRES',
           'TSOIL', 'SOILW', 'SOILL', 'CNWAT', 'WEASD', 'SNOD', 'ICETK',
           'DPT', 'APTMP', 'ICEG', 'CPOFP', 'PRATE', 'CSNOW', 'CICEP',
           'CFRZR', 'CRAIN', 'SFCR', 'FRICV', 'VEG', 'SOTYP', 'WILT', 'FLDCP',
           'SUNSD', 'LFTX', 'CAPE', 'CIN', 'PWAT', 'CWAT', 'TOZNE', 'LCDC',
           'MCDC', 'HCDC', 'HLCY', 'USTM', 'VSTM', 'ICAHT', 'VWSH', '4LFTX',
           'HPBL', 'POT', 'PLPL', 'LAND', 'ICEC', 'ICETMP'], dtype=object)
    

    It is only a small subset of the full set of variables in NOAA GFS. Is it no longer possible to download radiation variables from GFS with Herbie?

    GFS 
    opened by sramirez 5
  • How to dump GRIB data into a text file.

    I am truly awful at Python. I can't seem to work with GRIB outside of Python, so please don't laugh too hard when you read my code:

    from herbie.archive import Herbie
    import numpy as np
    
    H = Herbie('2022-01-26', model='ecmwf', product='oper', fxx=12)
    
    ds = H.xarray(':2t:', remove_grib=False)
    
    dsw = H.xarray(':10(u|v):', remove_grib=False)
    ds['spd'] = np.sqrt(dsw['u10'] ** 2 + dsw['v10'] ** 2)
    
    dsp = H.xarray(':tp:', remove_grib=False)
    ds['tp'] = dsp['tp']
    
    file = open('test.txt', 'a')
    for lon in ds['longitude']:
        for lat in ds['latitude']:
            point = ds.sel(longitude=lon, latitude=lat, method='nearest')
            line = str(point['longitude'].values) + ',' + str(point['latitude'].values) + ',' + str(point['t2m'].values) + ',' + str(point['spd'].values) + ',' + str(point['tp'].values) + '\n'
            file.write(line)
    file.close()
    

    After 5 minutes it was like 5% done. I get why it's bad, but I honestly just don't want to spend a month learning Python.

    I would prefer to just make a (405900, 5) array and store a raw blob file of float32s like so:

    lon1,lat1,t2m1,spd1,tp1,.....,lonN,latN,t2mN,spdN,tpN

    Any advice would be amazing.
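
One way to speed this up: the nested loop calls `.sel()` once per grid point, and each call does a coordinate lookup over the whole grid. Pulling the variables out as NumPy arrays and writing them in one shot avoids that entirely. A sketch, with small arrays standing in for `ds['longitude']`, `ds['latitude']`, `ds['t2m']`, `ds['spd']`, and `ds['tp']`:

```python
import numpy as np

# Stand-ins for the dataset's coordinates and variables; in practice these
# would come from ds["longitude"].values, ds["t2m"].values, etc.
lon = np.array([0.0, 0.1, 0.2])
lat = np.array([50.0, 50.1])
lon2d, lat2d = np.meshgrid(lon, lat)      # 2-D grids, shape (2, 3)
t2m = np.full(lon2d.shape, 280.0)
spd = np.full(lon2d.shape, 5.0)
tp = np.zeros(lon2d.shape)

# Flatten everything into one (N, 5) table: lon, lat, t2m, spd, tp
table = np.column_stack([a.ravel() for a in (lon2d, lat2d, t2m, spd, tp)])

table.astype(np.float32).tofile("test.bin")               # raw float32 blob
np.savetxt("test.txt", table, fmt="%.4f", delimiter=",")  # or CSV text
```

The raw float32 blob is exactly the `lon1,lat1,t2m1,spd1,tp1,...` layout asked for above, and the whole export is a single pass over memory rather than one lookup per point.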

    opened by CraigglesO 5
  • Carpenter Workshop instructions missing from tutorials

    First, this tool is awesome. Thanks for publishing and maintaining it!

    I installed your package using conda install -c conda-forge herbie-data, but wasn't able to run this portion of your tutorial https://blaylockbk.github.io/Herbie/_build/html/user_guide/notebooks/data_hrrr.html without first installing Carpenter Workshop, e.g.,

    pip install git+https://github.com/blaylockbk/Carpenter_Workshop.git

    It might be helpful to add some additional instruction/details in the tutorials or here https://github.com/blaylockbk/Herbie#installation.

    documentation 
    opened by williamhobbs 4
  • conda environment.yml install

    I am trying to install Herbie using the conda environment.yml file but conda is finding conflicts that are causing the install to fail:

    (base) jmiller@ubuntu:~$ conda env create -f environment.yml
    Collecting package metadata (repodata.json): done
    Solving environment: \
    Found conflicts! Looking for incompatible packages.
    This can take several minutes.  Press CTRL-C to abort.
    failed
    
    UnsatisfiableError: The following specifications were found to be incompatible with each other:
    
    
    
    Package geos conflicts for:
    metpy -> cartopy[version='>=0.15.0'] -> shapely[version='>=1.6.4'] -> geos[version='>=3.4']
    cartopy[version='>=0.20.3'] -> geos[version='>=3.10.3,<3.10.4.0a0|>=3.11.0,<3.11.1.0a0']
    metpy -> cartopy[version='>=0.15.0'] -> geos[version='3.6.2|>=3.10.0,<3.10.1.0a0|>=3.10.1,<3.10.2.0a0|>=3.10.2,<3.10.3.0a0|>=3.10.3,<3.10.4.0a0|>=3.11.0,<3.11.1.0a0|>=3.6.2,<3.6.3.0a0|>=3.7.0,<3.7.1.0a0|>=3.7.1,<3.7.2.0a0|>=3.7.2,<3.7.3.0a0|>=3.8.0,<3.8.1.0a0|>=3.8.1,<3.8.2.0a0|>=3.9.0,<3.9.1.0a0|>=3.9.1,<3.9.2.0a0']
    geopandas -> shapely -> geos[version='3.6.2|>=3.10.0,<3.10.1.0a0|>=3.10.1,<3.10.2.0a0|>=3.10.2,<3.10.3.0a0|>=3.10.3,<3.10.4.0a0|>=3.11.0,<3.11.1.0a0|>=3.4|>=3.6.2,<3.6.3.0a0|>=3.7.0,<3.7.1.0a0|>=3.7.1,<3.7.2.0a0|>=3.7.2,<3.7.3.0a0|>=3.8.0,<3.8.1.0a0|>=3.8.1,<3.8.2.0a0|>=3.9.0,<3.9.1.0a0|>=3.9.1,<3.9.2.0a0']
    

    I was able to manually create a conda environment and then install using pip

    UPDATE: Turns out the conda environment I set up using pip actually doesn't want to work either. I'll keep trying and see if I can get it working.

    opened by jjm0022 4
  • Try this: Multithreading for downloading speedup

    Would Multithreading help download many files (and many parts of files) quickly?

    This would be a helper tool used in herbie.tools (in bulk_download)

    Check out this article for some inspiration.

    • https://superfastpython.com/threadpoolexecutor-download-books/
    • https://superfastpython.com/threadpoolexecutor-map-vs-submit/

    It seems like simply downloading files or parts of files is an I/O-bound task that could see some speedup from multithreading.

    Possibly could see a speedup when iterating on downloading chunks of the file.
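
A sketch of the idea with `concurrent.futures` (the `download` function and the URLs are placeholders, not Herbie code):

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the I/O-bound work; in Herbie this would be the per-file
# (or per-byte-range) download call. The URLs are hypothetical.
def download(url):
    # ... perform the HTTP request here ...
    return f"saved {url}"

urls = [f"https://example.com/hrrr/f{h:02d}.grib2" for h in range(6)]

# Threads overlap the network waits, so wall time approaches the slowest
# single download rather than the sum of all of them.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(download, urls))

print(results[0])
```

`pool.map` preserves input order, which matters if the caller expects results sorted by forecast hour.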

    πŸ’‘ Idea 
    opened by blaylockbk 4
  • Using Username/Password authorization when extending Herbie

    I'm trying to add IMERG to the list of models. To download the data one must register with NASA and then use a username/password to access the data.

    Typically I've gotten the data using Python through a URL with the format https://username:password@URL; however, when I add that style of URL to the model's self.SOURCES I get the following Traceback

    Traceback (most recent call last):
      File "herbie_methods.py", line 37, in <module>
        H.download(verbose=False)
      File "/Users/judson/opt/anaconda3/envs/main/lib/python3.8/site-packages/herbie/archive.py", line 642, in download
        urllib.request.urlretrieve(self.grib, outFile, _reporthook)
      File "/Users/judson/opt/anaconda3/envs/main/lib/python3.8/urllib/request.py", line 247, in urlretrieve
        with contextlib.closing(urlopen(url, data)) as fp:
      File "/Users/judson/opt/anaconda3/envs/main/lib/python3.8/urllib/request.py", line 222, in urlopen
        return opener.open(url, data, timeout)
      File "/Users/judson/opt/anaconda3/envs/main/lib/python3.8/urllib/request.py", line 525, in open
        response = self._open(req, data)
      File "/Users/judson/opt/anaconda3/envs/main/lib/python3.8/urllib/request.py", line 542, in _open
        result = self._call_chain(self.handle_open, protocol, protocol +
      File "/Users/judson/opt/anaconda3/envs/main/lib/python3.8/urllib/request.py", line 502, in _call_chain
        result = func(*args)
      File "/Users/judson/opt/anaconda3/envs/main/lib/python3.8/urllib/request.py", line 1397, in https_open
        return self.do_open(http.client.HTTPSConnection, req,
      File "/Users/judson/opt/anaconda3/envs/main/lib/python3.8/urllib/request.py", line 1323, in do_open
        h = http_class(host, timeout=req.timeout, **http_conn_args)
      File "/Users/judson/opt/anaconda3/envs/main/lib/python3.8/http/client.py", line 1383, in __init__
        super(HTTPSConnection, self).__init__(host, port, timeout,
      File "/Users/judson/opt/anaconda3/envs/main/lib/python3.8/http/client.py", line 834, in __init__
        (self.host, self.port) = self._get_hostport(host, port)
      File "/Users/judson/opt/anaconda3/envs/main/lib/python3.8/http/client.py", line 877, in _get_hostport
        raise InvalidURL("nonnumeric port: '%s'" % host[i+1:])
    http.client.InvalidURL: nonnumeric port: '[email protected]'
    

    Is there a way around this, or should I add the username/password differently?
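
One possible workaround: rather than embedding credentials in the URL (which urllib misreads as a host:port pair, producing the InvalidURL above), attach an HTTP Basic Auth header to the request. A standard-library sketch with placeholder credentials and URL; note that NASA Earthdata's redirect-based login may need more than this:

```python
import base64
import urllib.request

# Placeholders; real credentials would go here.
user, password = "user", "pass"
token = base64.b64encode(f"{user}:{password}".encode()).decode()

req = urllib.request.Request("https://data.example.com/imerg/file.h5")
req.add_header("Authorization", f"Basic {token}")
# urllib.request.urlopen(req)  # would perform the authenticated request

print(req.get_header("Authorization"))
```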

    opened by isodrosotherm 4
  • Create a conda-forge recipe so Herbie can be installed via conda

    It would be nice if Herbie could be installed with Conda directly, instead of using pip. Especially since Herbie depends on cfgrib and cartopy, which have dependencies that can't be installed with pip (PROJ, GEOS, ecCodes).

    I don't have experience with this, but want to learn. If anyone can help out with this, that would be awesome 😁

    • https://blog.gishub.org/how-to-publish-a-python-package-on-conda-forge
    • https://docs.conda.io/projects/conda-build/en/latest/user-guide/tutorials/build-pkgs-skeleton.html
    • https://github.com/conda-forge/staged-recipes/blob/main/recipes/example/meta.yaml
    • https://github.com/blaylockbk/staged-recipes#getting-started
    help wanted 
    opened by blaylockbk 3
  • fast_Herbie_xarray() does not work with hrrr's subh product

    In attempting to run

    dates = pd.date_range('2022-01-01 1:00', 
                          '2022-01-01 3:00',
                          freq='1H')
    fxx = 1
    h_list = fast_Herbie_xarray(DATES=dates, fxx=fxx, model='hrrr', product='subh', searchString=':PRES:surface')
    

    I get the following traceback

    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    Untitled-1.ipynb Cell 5' in <cell line: 1>()
    ----> [1](vscode-notebook-cell:Untitled-1.ipynb?jupyter-notebook#ch0000011untitled?line=0) hh = fast_Herbie_xarray(DATES=dates, fxx=fxx, model='hrrr', product='subh', searchString=':PRES:surface')
    
    File ~/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py:228, in fast_Herbie_xarray(DATES, searchString, fxx, max_threads, xarray_kw, **kwargs)
        [225](file:///Users/judson/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py?line=224)     ds_list = [future.result() for future in as_completed(futures)]
        [227](file:///Users/judson/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py?line=226) # Sort the DataSets, first by lead time (step), then by run time (time)
    --> [228](file:///Users/judson/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py?line=227) ds_list.sort(key=lambda ds: ds.step.item())
        [229](file:///Users/judson/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py?line=228) ds_list.sort(key=lambda ds: ds.time.item())
        [231](file:///Users/judson/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py?line=230) # Reshape list with dimensions (len(DATES), len(fxx))
    
    File ~/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py:228, in fast_Herbie_xarray.<locals>.<lambda>(ds)
        [225](file:///Users/judson/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py?line=224)     ds_list = [future.result() for future in as_completed(futures)]
        [227](file:///Users/judson/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py?line=226) # Sort the DataSets, first by lead time (step), then by run time (time)
    --> [228](file:///Users/judson/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py?line=227) ds_list.sort(key=lambda ds: ds.step.item())
        [229](file:///Users/judson/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py?line=228) ds_list.sort(key=lambda ds: ds.time.item())
        [231](file:///Users/judson/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py?line=230) # Reshape list with dimensions (len(DATES), len(fxx))
    
    AttributeError: 'list' object has no attribute 'step'
    

    When I just get one herbie obj and read data into xarray I get the following note

    Note: Returning a list of [2] xarray.Datasets because of multiple hypercubes.
    

    and H.xarray() ends up returning a list of two xarray.Datasets (one with the 15-min, 30-min, and 45-min forecasts and one with the 1-hr forecast). Pretty sure this is what's causing the issue of not being able to use fast_Herbie_xarray() with HRRR subh. Not sure if there's a way around it?
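
One possible workaround (a sketch, not Herbie's actual fix): flatten the collected results before sorting, since a subh read returns a list of Datasets rather than a single one, which is what makes `ds_list.sort(key=lambda ds: ds.step.item())` hit a plain list:

```python
def flatten(items):
    """Flatten one level: expand any sub-lists into the top-level list."""
    flat = []
    for item in items:
        if isinstance(item, list):
            flat.extend(item)
        else:
            flat.append(item)
    return flat

# Placeholder strings stand in for xarray.Dataset objects; an hourly read
# yields one Dataset, while a subh read yields a list of two.
ds_list = ["ds_f01", ["ds_f02_subh", "ds_f02_1hr"], "ds_f03"]
print(flatten(ds_list))
```

With the list flattened, every element is a Dataset and the subsequent sorts on `step` and `time` would at least not raise AttributeError.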

    opened by isodrosotherm 3
  • Multiple Levels Subsetting

    When using the subset function and attempting to subset multiple levels (t2m, surface, 10-m wind, etc.), xarray does not open all of the requested variables, only the surface variables. Wondering if this has to do with Herbie or xarray. Thought I would make you aware of this.

    opened by FireWx42 3
  • Improve config file usage

    I would like to make changes to the way the config file is used, such as the ability to change config values temporarily (adjust the defaults inside a single script).

    Here is what ChatGPT says I could do:


    To allow a user to temporarily update config attributes without modifying the config file, you can modify the Config class to store the temporary updates in a separate dictionary. Here's an example of how you can do this:

    class Config:
        def __init__(self, config_file):
            self.config_file = config_file
            self.config = {}
            self.temp_updates = {}
            with open(self.config_file, 'r') as f:
                for line in f:
                    key, value = line.strip().split('=', 1)  # split on the first '=' only
                    self.config[key] = value
    
        def get(self, key):
            if key in self.temp_updates:
                return self.temp_updates[key]
            return self.config[key]
    
        def set(self, key, value, temporary=False):
            if temporary:
                self.temp_updates[key] = value
            else:
                self.config[key] = value
                with open(self.config_file, 'w') as f:
                    for k, v in self.config.items():
                        f.write(f'{k}={v}\n')
    
        def reset(self):
            self.temp_updates = {}
    
    

    With this modified Config class, you can use the set method to temporarily update a config value by setting the temporary argument to True. The temporary update will be stored in the temp_updates dictionary, and the get method will return the temporary value if it exists.

    To reset all temporary updates and restore the original config values, you can use the reset method. This will clear the temp_updates dictionary.

    Here's an example of how you might use these new methods:

    # Load the config file
    config = Config('/path/to/config.txt')
    
    # Get the original value of a config key
    value = config.get('key')
    
    # Temporarily update the value of the key
    config.set('key', 'new_value', temporary=True)
    
    # Get the temporary updated value of the key
    temp_value = config.get('key')
    
    # Reset the temporary updates and restore the original value
    config.reset()
    
    # Get the original value of the key again
    value = config.get('key')
    
    

    There, that might give you a starting place.

    opened by blaylockbk 1
  • Test if Beltzer package can be used to create GRIB idx files without downloading the full file.

    Ok, so I saw this on LinkedIn and it would be super helpful for Herbie...

    This python package would enable Herbie to subset data even when an index file isn't available for the file.

    https://github.com/joxtoby/beltzer

    opened by blaylockbk 1
  • FastHerbie error when reading ECMWF ensemble in xarray.

    Discussed in https://github.com/blaylockbk/Herbie/discussions/116

    Originally posted by csteele2 November 1, 2022: I was trying to use Herbie to easily download and process the European ensemble data. I'm not sure if I don't understand FastHerbie or what, because it appears to download the entire dataset and seems to take way longer for one timestep than my loop takes for 6 days. I have a sample of the code I am using below. The fast_Herbie_xarray call that is commented out took an hour for maybe one time step? Not sure; when it started another loop I killed it, because my other loop takes 45 minutes. However, I have not been able to download a complete dataset in the three days I have been trying, for any cycle.

    variable = "tp" #tp for precipitation
    tp_all = []
    valid_times = []
    forecast_hours_qpf = range(3,147,3)
    model_search_string = ":"+variable+":sfc:"
    #forecast_hours_qpf = range(3,147,3)
    #ptotal = fast_Herbie_xarray(DATES=model_run.strftime('%Y-%m-%d %H:00'), model="ecmwf", product="enfo", fxx=forecast_hours_qpf, max_threads=5, search_string=model_search_string)
    
    for t in forecast_hours_qpf:
        H = Herbie(model_run.strftime('%Y-%m-%d %H:00'), model="ecmwf", product="enfo", fxx=t)
        tp = H.xarray(":"+variable+":sfc:")[0]
        #tp = tp.rename({"number":"pertubation"})
        tp_all.append(tp)
        valid_times.append(model_run + timedelta(hours=t))
    
    ptotal = xr.concat([tp_all[i] for i in range(0,len(forecast_hours_qpf))], dim='step')
    

    The most common problem is that one or more timesteps will have the number (member/perturbation) coordinate as 0 instead of an array of length 50. If I go back and assign those 50, it's clear something weird happened, as revealed by this spot check of a single point (look at the 05-03Z column):

    I have not yet attempted to just download those times separately, but I would think this has to be a problem with the processing vs data, right? This is probably way more data than with a typical use-case, but I like me my ensemble data.

    Other than these issues, kudos on this though, it makes dealing with this big data so so so so so so so much easier, and really helps elevate some serious science game.

    bug ECMWF 
    opened by blaylockbk 0
  • create a codespaces container environment

    Codespaces just rolled out. Let's see what I can use it for.

    60 free hours seems like plenty of free compute time to check bugs and do some development work...without needing to install the Python environment on my laptop.

    opened by blaylockbk 0
  • Add interface to wgrib2 to produce subset

    Requested more than once...

    Is it possible to set a region and some variable names to download?

    No, it's not possible, but wgrib2 can subset a grib file for you after it is downloaded.
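
wgrib2's -small_grib option can cut a lon/lat box out of a downloaded file; a sketch of driving it from Python (the filenames and the bounding box are placeholders):

```python
import shutil
import subprocess

def small_grib_cmd(infile, outfile, lon_w, lon_e, lat_s, lat_n):
    """Build a wgrib2 command that extracts a lon/lat box into a new GRIB2 file."""
    return [
        "wgrib2", infile,
        "-small_grib", f"{lon_w}:{lon_e}", f"{lat_s}:{lat_n}",
        outfile,
    ]

# Hypothetical filenames; the box roughly covers the mid-Atlantic coast.
cmd = small_grib_cmd("hrrr.grib2", "hrrr_subset.grib2", -80, -70, 35, 45)
if shutil.which("wgrib2"):          # only run if wgrib2 is on the PATH
    subprocess.run(cmd, check=True)
print(" ".join(cmd))
```

Pairing this with Herbie's by-variable subsetting (searchString) would give both field and region subsets, at the cost of requiring wgrib2 to be installed.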

    opened by blaylockbk 0
Releases (2022.09.0)
  • 2022.09.0(Sep 11, 2022)

    Huge shoutout to Ryan May for instructing me on how to publish Herbie on conda-forge. You can now install Herbie 2022.9.0.post1 with Conda

    conda install -c conda-forge herbie-data
    

    What's Changed

    • Changed to CalVer versioning. Release versions will follow the YYYY.mm.micro format as I felt this more clearly highlights when the package was last updated.
    • Changed default branch from master to main
    • Add missing pygrib dependency by @haim0n in https://github.com/blaylockbk/Herbie/pull/74
    • Remove unused import by @haim0n in https://github.com/blaylockbk/Herbie/pull/78
    • Changes subset hash label to include date and fxx by @blaylockbk in https://github.com/blaylockbk/Herbie/pull/96
    • Blaylockbk/issue98 change setup.py to setup.cfg and pyproject.toml by @blaylockbk in https://github.com/blaylockbk/Herbie/pull/99
    • Added pygrib and cartopy as a dependency.
    • Added NAM model #93
    • Documentation: added dark mode documentation via PyData's Sphinx theme
    • Let Herbie find the most recent GRIB2 file. by @blaylockbk in https://github.com/blaylockbk/Herbie/pull/103

    New Contributors

    • @haim0n made their first contribution in https://github.com/blaylockbk/Herbie/pull/74

    Full Changelog: https://github.com/blaylockbk/Herbie/compare/0.0.10...2022.09.0

    Source code(tar.gz)
    Source code(zip)
  • 0.0.10(May 7, 2022)

    ⭐ Over 110 GitHub stars!

    Wow, I had no idea so many people would find Herbie useful. Thanks everyone.

    What's Changed

    • Now you can import the Herbie class easier with from herbie import Herbie instead of from herbie.archive import Herbie
    • Add template for GEFS by @blaylockbk in https://github.com/blaylockbk/Herbie/pull/56
    • Changed how extending Herbie with custom templates works by @blaylockbk in https://github.com/blaylockbk/Herbie/pull/55
    • Herbie helper tool to make an index file for all grib2 files in a directory. by @blaylockbk in https://github.com/blaylockbk/Herbie/pull/59
    • Smarter cURL ranges for faster subset downloads by @blaylockbk in https://github.com/blaylockbk/Herbie/pull/62
    • Moved the grib and index search in the Herbie class to their own functions (find_grib and find_idx)
    • Cache the full index file DataFrame the first time it is read.
    • Added some ASCII art to __init__.py and some Easter eggs (because every project needs cool ASCII art and Easter eggs 😎)
    • Add some multithreading tools to Herbie by @blaylockbk in https://github.com/blaylockbk/Herbie/pull/69 -- have not done extensive tests; watch your memory
    • Fixed subset filename bug (https://github.com/blaylockbk/Herbie/commit/de70cb7b70f899431e5481e28ed4952aab3ef4b6) that affected GFS files.
    • Herbie can make an index file if one does not exist on the remote archive, provided the full GRIB2 file is downloaded locally and wgrib2 is installed. https://github.com/blaylockbk/Herbie/commit/86535735a7ee0cd9045f51fd79d85a03873b3c84
    • Concat HRRR subh fields, if possible https://github.com/blaylockbk/Herbie/commit/8a9a862392d51ee8666647473d7ae4d984f1182d

    Full Changelog: https://github.com/blaylockbk/Herbie/compare/0.0.9...0.0.10

  • 0.0.9(Mar 8, 2022)

    Changelog

    • Fixed #42: bug where GFS index files could not be found.
    • Speed up Herbie.xarray by Reusing local_file by @WToma in https://github.com/blaylockbk/Herbie/pull/46
    • Removed old hrrrb API and old documentation.

    Full Changelog: https://github.com/blaylockbk/Herbie/compare/0.0.8...0.0.9

  • 0.0.8(Jan 27, 2022)

    Add access to ECMWF open data forecast products

    The main feature of this release is the ability to retrieve ECMWF open data forecast products (see tweet)

    The big change was implementing a method to read the grib_ls-style index files to get the byte ranges for specific variables/parameters. This means that the searchString argument will need to be specified differently than that used for other models. Read more about the searchString argument for grib_ls-style index files: https://blaylockbk.github.io/Herbie/_build/html/user_guide/searchString.html#grib-ls-style-index-files

    For example:

    from herbie.archive import Herbie
    # Create Herbie object to discover ECMWF operational forecast product
    H = Herbie("2022-01-26", model="ecmwf", product="oper", fxx=12)
    
    # Download the full grib2 file
    H.download()
    
    # Download just the 10-m u and v winds
    H.download(searchString=":10(u|v):")
    
    # Retrieve the 500 hPa temperature as an xarray.Dataset
    ds = H.xarray(searchString=":t:500:")
    

    🏹 More examples for retrieving ECMWF open data


    Changelog

    • Added access to ECMWF's open data products by @blaylockbk in https://github.com/blaylockbk/Herbie/pull/37

    Full Changelog: https://github.com/blaylockbk/Herbie/compare/0.0.7...0.0.8


  • 0.0.7(Jan 21, 2022)

    Changelog

    • Default config value for priority is now None, which makes the download source priority follow the order of the SOURCE entries in the model template files.
    • isort all imports
    • Fixed Issue #24: Use hash labels to name subset files so filenames are unique and don't get too long (Pull Request #26).
    • When loading data into xarray, Herbie will parse CF grid_mapping from grib file with pygrib/pyproj/metpy. (see CF Appendix F: Grid Mapping for more info). 2789859120883e32592f5559a1e211696b58cf3e
    • Lay the groundwork for local GRIB2 files that are not downloaded from remote sources. Suppose you ran a model locally (like WRF) and have that data stored locally. Herbie can be configured to find and read those local files! This assumes an index file also exists in the same directory with .idx appended to the file name.
      • Documentation needed.
    • Added a model template for rap_ncei. This is poorly implemented because the NCEI data URLs don't match the other RAP model URLs.
    • #24, implemented hashed filename for subset by @blaylockbk in https://github.com/blaylockbk/Herbie/pull/26
    • Add a Gitter chat badge to README.md by @gitter-badger in https://github.com/blaylockbk/Herbie/pull/21
    • Removed the Gitter chat badge. I'm not committed to chatting on Gitter; let's just focus on GitHub Discussions.
    • Add support for NCEI historical RAP analyses by @djgagne in https://github.com/blaylockbk/Herbie/pull/30
    • Handle different idx styles by @blaylockbk in https://github.com/blaylockbk/Herbie/pull/35 (more changes in addition to those by djgagne)
    • Fixed #33: herbie.tools.bulk_download now returns a dict of the Herbie objects that succeeded and those that failed.
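
    The success/failure bookkeeping for bulk downloads can be sketched generically; `try_download` here is a hypothetical stand-in for creating a Herbie object and downloading its file, not Herbie's actual implementation:

```python
def bulk_download(dates, try_download):
    """Attempt a download for each date; sort results into success/failed buckets."""
    results = {"success": [], "failed": []}
    for date in dates:
        try:
            results["success"].append(try_download(date))
        except Exception:
            results["failed"].append(date)
    return results

# Usage with a fake downloader that fails on one date.
def fake(date):
    if date == "2021-01-02":
        raise FileNotFoundError(date)
    return date

out = bulk_download(["2021-01-01", "2021-01-02", "2021-01-03"], fake)
print(out)  # {'success': ['2021-01-01', '2021-01-03'], 'failed': ['2021-01-02']}
```

Returning both buckets lets a caller retry just the failed dates instead of re-running the whole batch.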

    Full Changelog: https://github.com/blaylockbk/Herbie/compare/0.0.6...0.0.7

  • 0.0.6(Aug 30, 2021)

    Changelog

    • #18 Use TOML as config file format instead of INI
    • Expanded the settings that can be set in the configuration file
    • Adopt Black formatting
    • Moved PyPI project from hrrrb to herbie-data
  • 0.0.5(Jul 28, 2021)

    New Name! HRRR-B πŸ – Herbie

    I updated the GitHub repository name to Herbie and am slowly removing the old hrrrb API (but it's still there).

    The most significant change is that the vision for Herbie has expanded: Herbie is being built to download many different model types, not just the HRRR model.

    • Renamed the package to herbie. "Herbie is your model output download assistant with a mind of its own." Yes, it is named after a favorite childhood movie series.
    • Implemented the new Herbie class.
    • Dropped support for hrrrx (the experimental HRRR is no longer archived on Pando, and ESRL is now developing RRFS).
    • Added the ability to download and read RAP model GRIB2 files.
    • Less reliance on Pando, more on AWS and Google.
    • New searchString method for index file searches. Uses the same regex search patterns as the old API.
    • The filename for a GRIB2 subset now includes all GRIB message numbers.
    • Moved the default download source to a config file setting.
    • Check for a local copy of the file on init (no need to look for the file remotely if we already have a local copy).
    • Option to remove the GRIB2 file after reading it into xarray if it didn't already exist locally (don't clutter the local disk).
    • Attach the index file DataFrame to the object if it exists.
    • If the full file exists locally, use the remote idx file to cURL the local file instead of the remote one. (We can't create the idx file locally because wgrib2 isn't available on Windows.)
    • Added GFS data, though it isn't implemented as cleanly as HRRR or RAP.
    • Renamed the 'field' argument to 'product'.
    • ✨ Moved the source URL templates to their own classes in the models folder.
    • Renamed the GitHub repository to Herbie (changed from HRRR_archive_download).
    • Added RRFS, NBM, GFS, and RAP as models Herbie can download.
    • Reworked read_idx() to support index files with additional info (specifically for the NBM).
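
    The searchString mechanism mentioned above is essentially a regular-expression filter over the lines of a GRIB2 index file; only the matching messages' byte ranges are downloaded. A minimal sketch, with made-up lines in the style of a wgrib2 .idx file:

```python
import re

# Illustrative index lines: message number, byte offset, run date,
# variable, level, and forecast hour, separated by colons.
idx_lines = [
    "70:41000000:d=2021010112:TMP:500 mb:6 hour fcst:",
    "71:41745646:d=2021010112:TMP:2 m above ground:6 hour fcst:",
    "72:42500000:d=2021010112:DPT:2 m above ground:6 hour fcst:",
]

def subset(lines, searchString):
    """Keep only the index lines whose text matches the regex."""
    return [line for line in lines if re.search(searchString, line)]

print(subset(idx_lines, "TMP:2 m"))         # matches only the 2-m temperature line
print(subset(idx_lines, ":(TMP|DPT):2 m"))  # matches both 2-m fields
```

Because the pattern is applied with `re.search`, any regex that matches part of an index line works, which is why patterns like ":500 mb" or ":(TMP|DPT):" select whole groups of fields.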
  • 0.0.4(Jun 4, 2021)

    New Herbie API

    There are a few things about the hrrrb API that make it difficult to update, so I started changing things under the new name "herbie." But don't worry, both the hrrrb and herbie APIs are included. The setup.py file is also fixed.

    To use the new Herbie API, refer to the documentation for some usage examples.

  • 0.0.3(Feb 26, 2021)

    Welcome to the world, HRRR-B πŸŽ‚

    This is my first GitHub release ever! I have published on PyPI before, but this is my first release here on GitHub. Certainly a happy birthday for the HRRR-B package.

    Be aware: this is v0.0.3, so it is subject to change at my leisure. This repository primarily serves as an example of how you can download HRRR data from archives, but I try to keep the package in a workable state that might be useful for you.

    πŸ“” Documentation

Owner: Brian Blaylock, atmospheric scientist and post-doc at the Naval Research Laboratory.