Python for downloading model data (HRRR, RAP, GFS, NBM, etc.) from NOMADS, NOAA's Big Data Program partners (Amazon, Google, Microsoft), and the University of Utah Pando Archive System.

Overview

Herbie: Retrieve NWP Model Data

The NOAA Big Data Program has made weather data more accessible than ever before. Herbie is a Python package that downloads recent and archived numerical weather prediction (NWP) model output from different cloud archive sources. Herbie helps you discover and download High Resolution Rapid Refresh (HRRR), Rapid Refresh (RAP), Global Forecast System (GFS), National Blend of Models (NBM), and Rapid Refresh Forecast System - Prototype (RRFS) output. NWP data is usually in GRIB2 format and can be read with xarray/cfgrib.

πŸ“” Herbie Documentation

Install

Requires cURL and Python 3.8+ with requests, numpy, pandas, xarray, and cfgrib. Optional packages are matplotlib, cartopy, and Carpenter Workshop.

pip install herbie-data

or

pip install git+https://github.com/blaylockbk/Herbie.git

or, create the provided conda environment.

Capabilities

  • Search different data sources for model output.
  • Download full GRIB2 files
  • Download subset GRIB2 files (by grib field)
  • Read data with xarray
  • Plot data with Cartopy (very early development)

from herbie.archive import Herbie

# Herbie object for the HRRR model 6-hr surface forecast product
H = Herbie('2021-01-01 12:00',
           model='hrrr',
           product='sfc',
           fxx=6)

# Download the full GRIB2 file
H.download()

# Download a subset, like all fields at 500 mb
H.download(":500 mb")

# Read subset with xarray, like 2-m temperature.
H.xarray("TMP:2 m")
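
Under the hood, a subset download works by searching the remote GRIB2 index (.idx) file with a regular expression and then requesting only the matching byte ranges with cURL. A minimal sketch of the matching step (the sample index lines and the `search_idx` helper are illustrative, not Herbie's internals):

```python
import re

# A few sample lines in the style of a GRIB2 .idx file (message number,
# byte offset, run time, variable, level, forecast hour). Illustrative only.
idx_lines = [
    "71:41523967:d=2021010112:TMP:500 mb:6 hour fcst:",
    "72:42337094:d=2021010112:RH:500 mb:6 hour fcst:",
    "580:215378123:d=2021010112:TMP:2 m above ground:6 hour fcst:",
]

def search_idx(lines, search_string):
    """Return the index lines whose text matches the regex search string."""
    return [line for line in lines if re.search(search_string, line)]

# All fields at 500 mb
print(search_idx(idx_lines, ":500 mb"))
# 2-m temperature only
print(search_idx(idx_lines, "TMP:2 m"))
```

Because the search string is a regular expression, patterns like `":(TMP|RH):500 mb"` work too; the byte offsets of the matching lines are what get passed to cURL as range requests.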

Data Sources

Herbie downloads model data from the following sources, but can be extended to include others:

  • NOMADS
  • Big Data Program Partners (AWS, Google, Azure)
  • University of Utah CHPC Pando archive

History

During my PhD at the University of Utah, I created what was, at the time, the only publicly accessible archive of HRRR data. In the latter half of 2020, this data was made available through the NOAA Big Data Program. This package organizes and expands my original download scripts into a more coherent package that can download HRRR and RAP model data from different data sources. It will continue to evolve at my own leisure.

I originally released this package under the name "HRRR-B" because it only dealt with the HRRR data set, but I have since added the ability to download RAP data. Thus, it was rebranded "Herbie," a model download assistant. For now, it is still called "hrrrb" on PyPI because "herbie" is already taken. Maybe someday, with some time and an enticing reason, I'll add additional download capabilities.

Alternative Download Tools

As an alternative, you can use rclone to download files from AWS or GCP. I quite like rclone. Here is a short rclone tutorial.


Thanks for using Herbie, and Happy Racing 🏎 🏁

- Brian

πŸ‘¨πŸ»β€πŸ’» Contributing Guidelines
πŸ’¬ GitHub Discussions
πŸš‘ GitHub Issues
🌐 Personal Webpage
🌐 University of Utah HRRR archive

βœ’ Pando HRRR Archive citation:

Blaylock B., J. Horel and S. Liston, 2017: Cloud Archiving and Data Mining of High Resolution Rapid Refresh Model Output. Computers and Geosciences. 109, 43-50. https://doi.org/10.1016/j.cageo.2017.08.005

P.S. If you like Herbie, check out my GOES-2-go package to download GOES-East/West data and SynopticPy to download mesonet data from the Synoptic API.

Comments
  • HRRR as Zarr on AWS

    @blaylockbk , this is probably the wrong place to raise this, but I saw in your HRRR Archive FAQ, you said:

    One day, we hope this data will be archived elsewhere that is more accessible to everyone. Perhaps soon it will be hosted by Amazon through their Open Data initiative. I would advocate to keep it in the GRIB2 format (the original format it is output as), but it would also be nice to store the data in a "cloud-friendly" format such as Zarr.

    To have archived HRRR data in Zarr would be AMAZING. We were trying to figure out how to download 1 year of HRRR surface fields to drive a Delaware Bay hydrodynamics simulation, and thinking how useful it would be to have the data on AWS. We could store as Zarr but create GRIB-on-demand service for those who need it. I've been active on the Pangeo project, and we have some tools now that could make the conversion, chunking and upload to cloud much easier. And I'd be happy to help out.

    @zflamig, you guys would be up for a proposal on this, right?

    opened by rsignell-usgs 21
  • Convert PosixPath to str before passing to cfgrib

    I was just following the basic setup guide. Installed with conda using the example environment.yaml file, and ran into this issue:

    >>> H.xarray('TMP:2 m')
    indexpath value  is ignored
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/local/anaconda3/envs/herbie/lib/python3.9/site-packages/herbie/archive.py", line 900, in xarray
        Hxr = cfgrib.open_datasets(
      File "/usr/local/anaconda3/envs/herbie/lib/python3.9/site-packages/cfgrib/xarray_store.py", line 105, in open_datasets
        datasets = open_variable_datasets(path, backend_kwargs=backend_kwargs, **kwargs)
      File "/usr/local/anaconda3/envs/herbie/lib/python3.9/site-packages/cfgrib/xarray_store.py", line 93, in open_variable_datasets
        datasets.extend(raw_open_datasets(path, bk, **kwargs))
      File "/usr/local/anaconda3/envs/herbie/lib/python3.9/site-packages/cfgrib/xarray_store.py", line 66, in raw_open_datasets
        datasets.append(open_dataset(path, backend_kwargs=backend_kwargs, **kwargs))
      File "/usr/local/anaconda3/envs/herbie/lib/python3.9/site-packages/cfgrib/xarray_store.py", line 39, in open_dataset
        return xr.open_dataset(path, **kwargs)  # type: ignore
      File "/usr/local/anaconda3/envs/herbie/lib/python3.9/site-packages/xarray/backends/api.py", line 495, in open_dataset
        backend_ds = backend.open_dataset(
      File "/usr/local/anaconda3/envs/herbie/lib/python3.9/site-packages/cfgrib/xarray_plugin.py", line 99, in open_dataset
        store = CfGribDataStore(
      File "/usr/local/anaconda3/envs/herbie/lib/python3.9/site-packages/cfgrib/xarray_plugin.py", line 39, in __init__
        self.ds = opener(filename, **backend_kwargs)
      File "/usr/local/anaconda3/envs/herbie/lib/python3.9/site-packages/cfgrib/dataset.py", line 728, in open_fieldset
        index = messages.FieldsetIndex.from_fieldset(fieldset, index_keys, computed_keys)
      File "/usr/local/anaconda3/envs/herbie/lib/python3.9/site-packages/cfgrib/messages.py", line 365, in from_fieldset
        iteritems = enumerate(fieldset)
    TypeError: 'PosixPath' object is not iterable
    

    It seems cfgrib expects either a str filename or an already-opened file object: https://github.com/ecmwf/cfgrib/blob/a0d9763/cfgrib/xarray_plugin.py#L35-L38

    Converting the path to str here makes the example pass for me.
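
The fix the reporter describes is a one-line conversion before the cfgrib call; a sketch (the file path here is hypothetical):

```python
from pathlib import Path

# Hypothetical local GRIB2 file tracked as a pathlib.Path
local_file = Path("/tmp/hrrr.t12z.wrfsfcf06.grib2")

# cfgrib expects a str filename (or an already-open fieldset), not a
# pathlib.Path, so convert before calling cfgrib.open_datasets():
path_str = str(local_file)
print(path_str)
```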

    opened by WToma 6
  • GFS data downloaded with searchString is not complete. Radiation variables are omitted.

    Hi!

    I have tried to download some radiation variables from GFS with no success. Some months ago I was able to download this data with the same version (Herbie 0.0.6); however, now it only gives me the following set of allowed variables:

    array(['PRMSL', 'CLWMR', 'ICMR', 'RWMR', 'SNMR', 'GRLE', 'REFD', 'REFC',
           'VIS', 'UGRD', 'VGRD', 'VRATE', 'GUST', 'HGT', 'TMP', 'RH', 'SPFH',
           'VVEL', 'DZDT', 'ABSV', 'O3MR', 'TCDC', 'HINDEX', 'MSLET', 'PRES',
           'TSOIL', 'SOILW', 'SOILL', 'CNWAT', 'WEASD', 'SNOD', 'ICETK',
           'DPT', 'APTMP', 'ICEG', 'CPOFP', 'PRATE', 'CSNOW', 'CICEP',
           'CFRZR', 'CRAIN', 'SFCR', 'FRICV', 'VEG', 'SOTYP', 'WILT', 'FLDCP',
           'SUNSD', 'LFTX', 'CAPE', 'CIN', 'PWAT', 'CWAT', 'TOZNE', 'LCDC',
           'MCDC', 'HCDC', 'HLCY', 'USTM', 'VSTM', 'ICAHT', 'VWSH', '4LFTX',
           'HPBL', 'POT', 'PLPL', 'LAND', 'ICEC', 'ICETMP'], dtype=object)
    

    It is only a small subset of the full set of variables in NOAA GFS. Is it no longer possible to download radiation variables from GFS with Herbie?

    GFS 
    opened by sramirez 5
  • How to dump GRIB data into a text file.

    I am truly awful at Python. I can't seem to work with GRIB outside of Python, so please don't laugh too hard when you read my code:

    from herbie.archive import Herbie
    import numpy as np
    
    H = Herbie('2022-01-26', model='ecmwf', product='oper', fxx=12)
    
    ds = H.xarray(':2t:', remove_grib=False)
    
    dsw = H.xarray(':10(u|v):', remove_grib=False)
    ds['spd'] = np.sqrt(dsw['u10'] ** 2 + dsw['v10'] ** 2)
    
    dsp = H.xarray(':tp:', remove_grib=False)
    ds['tp'] = dsp['tp']
    
    file = open('test.txt', 'a')
    for lon in ds['longitude']:
        for lat in ds['latitude']:
            point = ds.sel(longitude=lon, latitude=lat, method='nearest')
            line = str(point['longitude'].values) + ',' + str(point['latitude'].values) + ',' + str(point['t2m'].values) + ',' + str(point['spd'].values) + ',' + str(point['tp'].values) + '\n'
            file.write(line)
    file.close()
    

    After 5 minutes it was like 5% done. I get why it's bad, but I honestly just don't want to spend a month learning Python.

    I would prefer to just make a (405900, 5) array and store a raw blob file of float32s like so:

    lon1,lat1,t2m1,spd1,tp1,.....,lonN,latN,t2mN,spdN,tpN

    Any advice would be amazing.
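
One way to speed this up: the nested loop calls `.sel()` once per grid point, and each call does a coordinate lookup over the whole grid. Pulling the variables out as NumPy arrays and writing them in one shot avoids that entirely. A sketch, with small arrays standing in for `ds['longitude']`, `ds['latitude']`, `ds['t2m']`, `ds['spd']`, and `ds['tp']`:

```python
import numpy as np

# Stand-ins for the dataset's coordinates and variables; in practice these
# would come from ds["longitude"].values, ds["t2m"].values, etc.
lon = np.array([0.0, 0.1, 0.2])
lat = np.array([50.0, 50.1])
lon2d, lat2d = np.meshgrid(lon, lat)      # 2-D grids, shape (2, 3)
t2m = np.full(lon2d.shape, 280.0)
spd = np.full(lon2d.shape, 5.0)
tp = np.zeros(lon2d.shape)

# Flatten everything into one (N, 5) table: lon, lat, t2m, spd, tp
table = np.column_stack([a.ravel() for a in (lon2d, lat2d, t2m, spd, tp)])

table.astype(np.float32).tofile("test.bin")               # raw float32 blob
np.savetxt("test.txt", table, fmt="%.4f", delimiter=",")  # or CSV text
```

The raw float32 blob is exactly the `lon1,lat1,t2m1,spd1,tp1,...` layout asked for above, and the whole export is a single pass over memory rather than one lookup per point.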

    opened by CraigglesO 5
  • Carpenter Workshop instructions missing from tutorials

    First, this tool is awesome. Thanks for publishing and maintaining it!

    I installed your package using conda install -c conda-forge herbie-data, but wasn't able to run this portion of your tutorial https://blaylockbk.github.io/Herbie/_build/html/user_guide/notebooks/data_hrrr.html without first installing Carpenter Workshop, e.g.,

    pip install git+https://github.com/blaylockbk/Carpenter_Workshop.git

    It might be helpful to add some additional instruction/details in the tutorials or here https://github.com/blaylockbk/Herbie#installation.

    documentation 
    opened by williamhobbs 4
  • conda environment.yml install

    I am trying to install Herbie using the conda environment.yml file but conda is finding conflicts that are causing the install to fail:

    (base) jmiller@ubuntu:~$ conda env create -f environment.yml
    Collecting package metadata (repodata.json): done
    Solving environment: \
    Found conflicts! Looking for incompatible packages.
    This can take several minutes.  Press CTRL-C to abort.
    failed
    
    UnsatisfiableError: The following specifications were found to be incompatible with each other:
    
    
    
    Package geos conflicts for:
    metpy -> cartopy[version='>=0.15.0'] -> shapely[version='>=1.6.4'] -> geos[version='>=3.4']
    cartopy[version='>=0.20.3'] -> geos[version='>=3.10.3,<3.10.4.0a0|>=3.11.0,<3.11.1.0a0']
    metpy -> cartopy[version='>=0.15.0'] -> geos[version='3.6.2|>=3.10.0,<3.10.1.0a0|>=3.10.1,<3.10.2.0a0|>=3.10.2,<3.10.3.0a0|>=3.10.3,<3.10.4.0a0|>=3.11.0,<3.11.1.0a0|>=3.6.2,<3.6.3.0a0|>=3.7.0,<3.7.1.0a0|>=3.7.1,<3.7.2.0a0|>=3.7.2,<3.7.3.0a0|>=3.8.0,<3.8.1.0a0|>=3.8.1,<3.8.2.0a0|>=3.9.0,<3.9.1.0a0|>=3.9.1,<3.9.2.0a0']
    geopandas -> shapely -> geos[version='3.6.2|>=3.10.0,<3.10.1.0a0|>=3.10.1,<3.10.2.0a0|>=3.10.2,<3.10.3.0a0|>=3.10.3,<3.10.4.0a0|>=3.11.0,<3.11.1.0a0|>=3.4|>=3.6.2,<3.6.3.0a0|>=3.7.0,<3.7.1.0a0|>=3.7.1,<3.7.2.0a0|>=3.7.2,<3.7.3.0a0|>=3.8.0,<3.8.1.0a0|>=3.8.1,<3.8.2.0a0|>=3.9.0,<3.9.1.0a0|>=3.9.1,<3.9.2.0a0']
    

    I was able to manually create a conda environment and then install using pip

    UPDATE: Turns out the conda environment I set up using pip actually doesn't want to work either. I'll keep trying and see if I can get it working.

    opened by jjm0022 4
  • Try this: Multithreading for downloading speedup

    Would Multithreading help download many files (and many parts of files) quickly?

    This would be a helper tool used in herbie.tools (in bulk_download)

    Check out this article for some inspiration.

    • https://superfastpython.com/threadpoolexecutor-download-books/
    • https://superfastpython.com/threadpoolexecutor-map-vs-submit/

    It seems like simply downloading files or parts of files is an I/O-bound task that could see some speedup from multithreading.

    Possibly could see a speedup when iterating on downloading chunks of the file.
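
A sketch of the idea with `concurrent.futures` (the `download` function and the URLs are placeholders, not Herbie code):

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the I/O-bound work; in Herbie this would be the per-file
# (or per-byte-range) download call. The URLs are hypothetical.
def download(url):
    # ... perform the HTTP request here ...
    return f"saved {url}"

urls = [f"https://example.com/hrrr/f{h:02d}.grib2" for h in range(6)]

# Threads overlap the network waits, so wall time approaches the slowest
# single download rather than the sum of all of them.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(download, urls))

print(results[0])
```

`pool.map` preserves input order, which matters if the caller expects results sorted by forecast hour.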

    πŸ’‘ Idea 
    opened by blaylockbk 4
  • Using Username/Password authorization when extending Herbie

    I'm trying to add IMERG to the list of models. To download the data one must register with NASA and then use a username/password to access the data.

    Typically I've gotten the data using Python through a URL with the format https://username:password@URL; however, when I add that style of URL to the model's self.SOURCES I get the following Traceback

    Traceback (most recent call last):
      File "herbie_methods.py", line 37, in <module>
        H.download(verbose=False)
      File "/Users/judson/opt/anaconda3/envs/main/lib/python3.8/site-packages/herbie/archive.py", line 642, in download
        urllib.request.urlretrieve(self.grib, outFile, _reporthook)
      File "/Users/judson/opt/anaconda3/envs/main/lib/python3.8/urllib/request.py", line 247, in urlretrieve
        with contextlib.closing(urlopen(url, data)) as fp:
      File "/Users/judson/opt/anaconda3/envs/main/lib/python3.8/urllib/request.py", line 222, in urlopen
        return opener.open(url, data, timeout)
      File "/Users/judson/opt/anaconda3/envs/main/lib/python3.8/urllib/request.py", line 525, in open
        response = self._open(req, data)
      File "/Users/judson/opt/anaconda3/envs/main/lib/python3.8/urllib/request.py", line 542, in _open
        result = self._call_chain(self.handle_open, protocol, protocol +
      File "/Users/judson/opt/anaconda3/envs/main/lib/python3.8/urllib/request.py", line 502, in _call_chain
        result = func(*args)
      File "/Users/judson/opt/anaconda3/envs/main/lib/python3.8/urllib/request.py", line 1397, in https_open
        return self.do_open(http.client.HTTPSConnection, req,
      File "/Users/judson/opt/anaconda3/envs/main/lib/python3.8/urllib/request.py", line 1323, in do_open
        h = http_class(host, timeout=req.timeout, **http_conn_args)
      File "/Users/judson/opt/anaconda3/envs/main/lib/python3.8/http/client.py", line 1383, in __init__
        super(HTTPSConnection, self).__init__(host, port, timeout,
      File "/Users/judson/opt/anaconda3/envs/main/lib/python3.8/http/client.py", line 834, in __init__
        (self.host, self.port) = self._get_hostport(host, port)
      File "/Users/judson/opt/anaconda3/envs/main/lib/python3.8/http/client.py", line 877, in _get_hostport
        raise InvalidURL("nonnumeric port: '%s'" % host[i+1:])
    http.client.InvalidURL: nonnumeric port: '[email protected]'
    

    Is there a way around this, or should I add the username/password differently?
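
One possible workaround: rather than embedding credentials in the URL (which urllib misreads as a host:port pair, producing the InvalidURL above), attach an HTTP Basic Auth header to the request. A standard-library sketch with placeholder credentials and URL; note that NASA Earthdata's redirect-based login may need more than this:

```python
import base64
import urllib.request

# Placeholders; real credentials would go here.
user, password = "user", "pass"
token = base64.b64encode(f"{user}:{password}".encode()).decode()

req = urllib.request.Request("https://data.example.com/imerg/file.h5")
req.add_header("Authorization", f"Basic {token}")
# urllib.request.urlopen(req)  # would perform the authenticated request

print(req.get_header("Authorization"))
```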

    opened by isodrosotherm 4
  • Create a conda-forge recipe so Herbie can be installed via conda

    It would be nice if Herbie could be installed with Conda directly, instead of using pip. Especially since Herbie depends on cfgrib and cartopy, which have dependencies that can't be installed with pip (PROJ, GEOS, ecCodes).

    I don't have experience with this, but want to learn. If anyone can help out with this, that would be awesome 😁

    • https://blog.gishub.org/how-to-publish-a-python-package-on-conda-forge
    • https://docs.conda.io/projects/conda-build/en/latest/user-guide/tutorials/build-pkgs-skeleton.html
    • https://github.com/conda-forge/staged-recipes/blob/main/recipes/example/meta.yaml
    • https://github.com/blaylockbk/staged-recipes#getting-started
    help wanted 
    opened by blaylockbk 3
  • fast_Herbie_xarray() does not work with hrrr's subh product

    In attempting to run

    dates = pd.date_range('2022-01-01 1:00', 
                          '2022-01-01 3:00',
                          freq='1H')
    fxx = 1
    h_list = fast_Herbie_xarray(DATES=dates, fxx=fxx, model='hrrr', product='subh', searchString=':PRES:surface')
    

    I get the following traceback

    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    Untitled-1.ipynb Cell 5' in <cell line: 1>()
    ----> [1](vscode-notebook-cell:Untitled-1.ipynb?jupyter-notebook#ch0000011untitled?line=0) hh = fast_Herbie_xarray(DATES=dates, fxx=fxx, model='hrrr', product='subh', searchString=':PRES:surface')
    
    File ~/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py:228, in fast_Herbie_xarray(DATES, searchString, fxx, max_threads, xarray_kw, **kwargs)
        [225](file:///Users/judson/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py?line=224)     ds_list = [future.result() for future in as_completed(futures)]
        [227](file:///Users/judson/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py?line=226) # Sort the DataSets, first by lead time (step), then by run time (time)
    --> [228](file:///Users/judson/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py?line=227) ds_list.sort(key=lambda ds: ds.step.item())
        [229](file:///Users/judson/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py?line=228) ds_list.sort(key=lambda ds: ds.time.item())
        [231](file:///Users/judson/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py?line=230) # Reshape list with dimensions (len(DATES), len(fxx))
    
    File ~/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py:228, in fast_Herbie_xarray.<locals>.<lambda>(ds)
        [225](file:///Users/judson/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py?line=224)     ds_list = [future.result() for future in as_completed(futures)]
        [227](file:///Users/judson/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py?line=226) # Sort the DataSets, first by lead time (step), then by run time (time)
    --> [228](file:///Users/judson/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py?line=227) ds_list.sort(key=lambda ds: ds.step.item())
        [229](file:///Users/judson/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py?line=228) ds_list.sort(key=lambda ds: ds.time.item())
        [231](file:///Users/judson/miniforge3/envs/herbie/lib/python3.8/site-packages/herbie/tools.py?line=230) # Reshape list with dimensions (len(DATES), len(fxx))
    
    AttributeError: 'list' object has no attribute 'step'
    

    When I just get one herbie obj and read data into xarray I get the following note

    Note: Returning a list of [2] xarray.Datasets because of multiple hypercubes.
    

    and H.xarray() ends up returning a list of two xarray.Datasets (one with the 15-min, 30-min, and 45-min forecasts and one with the 1-hr forecast). Pretty sure this is what's causing the issue of not being able to use fast_Herbie_xarray() with HRRR subh. Not sure if there's a way around it?
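
One possible workaround (a sketch, not Herbie's actual fix): flatten the collected results before sorting, since a subh read returns a list of Datasets rather than a single one, which is what makes `ds_list.sort(key=lambda ds: ds.step.item())` hit a plain list:

```python
def flatten(items):
    """Flatten one level: expand any sub-lists into the top-level list."""
    flat = []
    for item in items:
        if isinstance(item, list):
            flat.extend(item)
        else:
            flat.append(item)
    return flat

# Placeholder strings stand in for xarray.Dataset objects; an hourly read
# yields one Dataset, while a subh read yields a list of two.
ds_list = ["ds_f01", ["ds_f02_subh", "ds_f02_1hr"], "ds_f03"]
print(flatten(ds_list))
```

With the list flattened, every element is a Dataset and the subsequent sorts on `step` and `time` would at least not raise AttributeError.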

    opened by isodrosotherm 3
  • Multiple Levels Subsetting

    When using the subset function and attempting to subset multiple levels (t2m, surface, 10-m wind, etc.), xarray does not open all of the requested variables, only the surface variables. Wondering if this has to do with Herbie or xarray. Thought I would make you aware of this.

    opened by FireWx42 3
  • Improve config file usage

    I would like to make changes to the way the config file is used, such as the ability to change config values temporarily (adjust the defaults inside a single script).

    Here is what ChatGPT says I could do:


    To allow a user to temporarily update config attributes without modifying the config file, you can modify the Config class to store the temporary updates in a separate dictionary. Here's an example of how you can do this:

    class Config:
        def __init__(self, config_file):
            self.config_file = config_file
            self.config = {}
            self.temp_updates = {}
            with open(self.config_file, 'r') as f:
                for line in f:
                    key, value = line.strip().split('=', 1)  # split on the first '=' only
                    self.config[key] = value
    
        def get(self, key):
            if key in self.temp_updates:
                return self.temp_updates[key]
            return self.config[key]
    
        def set(self, key, value, temporary=False):
            if temporary:
                self.temp_updates[key] = value
            else:
                self.config[key] = value
                with open(self.config_file, 'w') as f:
                    for k, v in self.config.items():
                        f.write(f'{k}={v}\n')
    
        def reset(self):
            self.temp_updates = {}
    
    

    With this modified Config class, you can use the set method to temporarily update a config value by setting the temporary argument to True. The temporary update will be stored in the temp_updates dictionary, and the get method will return the temporary value if it exists.

    To reset all temporary updates and restore the original config values, you can use the reset method. This will clear the temp_updates dictionary.

    Here's an example of how you might use these new methods:

    # Load the config file
    config = Config('/path/to/config.txt')
    
    # Get the original value of a config key
    value = config.get('key')
    
    # Temporarily update the value of the key
    config.set('key', 'new_value', temporary=True)
    
    # Get the temporary updated value of the key
    temp_value = config.get('key')
    
    # Reset the temporary updates and restore the original value
    config.reset()
    
    # Get the original value of the key again
    value = config.get('key')
    
    

    There, that might give you a starting place.

    opened by blaylockbk 1
  • Test if Beltzer package can be used to create GRIB idx files without downloading the full file.

    Ok, so I saw this on LinkedIn and it would be super helpful for Herbie...

    This python package would enable Herbie to subset data even when an index file isn't available for the file.

    https://github.com/joxtoby/beltzer

    opened by blaylockbk 1
  • FastHerbie error when reading ECMWF ensemble in xarray.

    Discussed in https://github.com/blaylockbk/Herbie/discussions/116

    Originally posted by csteele2 November 1, 2022: I was trying to use Herbie to easily download and process the European ensemble data. I'm not sure if I don't understand FastHerbie or what, because it appears to download the entire dataset and seems to take way longer for one timestep than my loop takes for 6 days. I have a sample of the code I am using below. The fast_Herbie_xarray call that is commented out took an hour for maybe one time step? Not sure; when it started another loop I killed it, because my other loop takes 45 minutes. However, I have not been able to download a complete dataset in the three days I have been trying, for any cycle.

    variable = "tp" #tp for precipitation
    tp_all = []
    valid_times = []
    forecast_hours_qpf = range(3,147,3)
    model_search_string = ":"+variable+":sfc:"
    #forecast_hours_qpf = range(3,147,3)
    #ptotal = fast_Herbie_xarray(DATES=model_run.strftime('%Y-%m-%d %H:00'), model="ecmwf", product="enfo", fxx=forecast_hours_qpf, max_threads=5, search_string=model_search_string)
    
    for t in forecast_hours_qpf:
        H = Herbie(model_run.strftime('%Y-%m-%d %H:00'), model="ecmwf", product="enfo", fxx=t)
        tp = H.xarray(":"+variable+":sfc:")[0]
        #tp = tp.rename({"number":"pertubation"})
        tp_all.append(tp)
        valid_times.append(model_run + timedelta(hours=t))
    
    ptotal = xr.concat([tp_all[i] for i in range(0,len(forecast_hours_qpf))], dim='step')
    

    The most common problem is that one or more timesteps will have the number (member/perturbation) coordinate as 0 instead of an array of length 50. If I go back and assign those 50, it's clear something weird happened, as revealed by this spot check of a single point (look at the 05-03Z column):

    I have not yet attempted to just download those times separately, but I would think this has to be a problem with the processing vs data, right? This is probably way more data than with a typical use-case, but I like me my ensemble data.

    Other than these issues, kudos on this though, it makes dealing with this big data so so so so so so so much easier, and really helps elevate some serious science game.

    bug ECMWF 
    opened by blaylockbk 0
  • create a codespaces container environment

    Codespaces just rolled out. Let's see what I can use it for.

    60 free hours seems like plenty of free compute time to check bugs and do some development work...without needing to install the Python environment on my laptop.

    opened by blaylockbk 0
  • Add interface to wgrib2 to produce subset

    Requested more than once...

    Is it possible to set a region and some variable names to download?

    No, it's not possible, but wgrib2 can subset a grib file for you after it is downloaded.
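
wgrib2's -small_grib option can cut a lon/lat box out of a downloaded file; a sketch of driving it from Python (the filenames and the bounding box are placeholders):

```python
import shutil
import subprocess

def small_grib_cmd(infile, outfile, lon_w, lon_e, lat_s, lat_n):
    """Build a wgrib2 command that extracts a lon/lat box into a new GRIB2 file."""
    return [
        "wgrib2", infile,
        "-small_grib", f"{lon_w}:{lon_e}", f"{lat_s}:{lat_n}",
        outfile,
    ]

# Hypothetical filenames; the box roughly covers the mid-Atlantic coast.
cmd = small_grib_cmd("hrrr.grib2", "hrrr_subset.grib2", -80, -70, 35, 45)
if shutil.which("wgrib2"):          # only run if wgrib2 is on the PATH
    subprocess.run(cmd, check=True)
print(" ".join(cmd))
```

Pairing this with Herbie's by-variable subsetting (searchString) would give both field and region subsets, at the cost of requiring wgrib2 to be installed.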

    opened by blaylockbk 0
Releases (2022.09.0)
  • 2022.09.0(Sep 11, 2022)

    Huge shoutout to Ryan May for instructing me on how to publish Herbie on conda-forge. You can now install Herbie 2022.9.0.post1 with Conda

    conda install -c conda-forge herbie-data
    

    What's Changed

    • Changed to CalVer versioning. Release versions will follow the YYYY.mm.micro format as I felt this more clearly highlights when the package was last updated.
    • Changed default branch from master to main
    • Add missing pygrib dependency by @haim0n in https://github.com/blaylockbk/Herbie/pull/74
    • Remove unused import by @haim0n in https://github.com/blaylockbk/Herbie/pull/78
    • Changes subset hash label to include date and fxx by @blaylockbk in https://github.com/blaylockbk/Herbie/pull/96
    • Blaylockbk/issue98 change setup.py to setup.cfg and pyproject.toml by @blaylockbk in https://github.com/blaylockbk/Herbie/pull/99
    • Added pygrib and cartopy as a dependency.
    • Added NAM model #93
    • Documentation: added dark mode documentation via PyData's Sphinx theme
    • Let Herbie find the most recent GRIB2 file. by @blaylockbk in https://github.com/blaylockbk/Herbie/pull/103

    New Contributors

    • @haim0n made their first contribution in https://github.com/blaylockbk/Herbie/pull/74

    Full Changelog: https://github.com/blaylockbk/Herbie/compare/0.0.10...2022.09.0

    Source code(tar.gz)
    Source code(zip)
  • 0.0.10(May 7, 2022)

    ⭐ Over 110 GitHub stars!

    Wow, I had no idea so many people would find Herbie useful. Thanks everyone.

    What's Changed

    • Now you can import the Herbie class easier with from herbie import Herbie instead of from herbie.archive import Herbie
    • Add template for GEFS by @blaylockbk in https://github.com/blaylockbk/Herbie/pull/56
    • Changed how extending Herbie with custom templates works by @blaylockbk in https://github.com/blaylockbk/Herbie/pull/55
    • Herbie helper tool to make an index file for all grib2 files in a directory. by @blaylockbk in https://github.com/blaylockbk/Herbie/pull/59
    • Smarter cURL ranges for faster subset downloads by @blaylockbk in https://github.com/blaylockbk/Herbie/pull/62
    • Moved the grib and index search in the Herbie class to their own functions (find_grib and find_idx)
    • Cache the full index file DataFrame the first time it is read.
    • Added some ASCII art to __init__.py and some Easter eggs (because every project needs cool ASCII art and Easter eggs 😎)
    • Add some multithreading tools to Herbie by @blaylockbk in https://github.com/blaylockbk/Herbie/pull/69 -- have not done extensive tests; watch your memory
    • Fixed subset filename bug (https://github.com/blaylockbk/Herbie/commit/de70cb7b70f899431e5481e28ed4952aab3ef4b6) that affected GFS files.
    • Herbie can make an index file if one does not exist on the remote archive, provided the full GRIB2 file is downloaded locally and wgrib2 is installed. https://github.com/blaylockbk/Herbie/commit/86535735a7ee0cd9045f51fd79d85a03873b3c84
    • Concat HRRR subh fields, if possible https://github.com/blaylockbk/Herbie/commit/8a9a862392d51ee8666647473d7ae4d984f1182d

    Full Changelog: https://github.com/blaylockbk/Herbie/compare/0.0.9...0.0.10

  • 0.0.9(Mar 8, 2022)

    Changelog

    • Fixed #42: bug where GFS index files could not be found.
    • Speed up Herbie.xarray by Reusing local_file by @WToma in https://github.com/blaylockbk/Herbie/pull/46
    • Removed old hrrrb API and old documentation.

    Full Changelog: https://github.com/blaylockbk/Herbie/compare/0.0.8...0.0.9

  • 0.0.8(Jan 27, 2022)

    Add access to ECMWF open data forecast products

    The main feature of this release is the ability to retrieve ECMWF open data forecast products (see tweet)

    The big change was implementing a method to read the grib_ls-style index files to get the byte ranges for specific variables/parameters. This means that the searchString argument will need to be specified differently than that used for other models. Read more about the searchString argument for grib_ls-style index files: https://blaylockbk.github.io/Herbie/_build/html/user_guide/searchString.html#grib-ls-style-index-files

    For example:

    from herbie.archive import Herbie
    # Create Herbie object to discover ECMWF operational forecast product
    H = Herbie("2022-01-26", model="ecmwf", product="oper", fxx=12)
    
    # Download the full grib2 file
    H.download()
    
    # Download just the 10-m u and v winds
    H.download(searchString=":10(u|v):")
    
    # Retrieve the 500 hPa temperature as an xarray.Dataset
    ds = H.xarray(searchString=":t:500:")
    

    🏹 More examples for retrieving ECMWF open data


    Changelog

    • Added access to ECMWF's open data products by @blaylockbk in https://github.com/blaylockbk/Herbie/pull/37

    Full Changelog: https://github.com/blaylockbk/Herbie/compare/0.0.7...0.0.8


  • 0.0.7(Jan 21, 2022)

    Changelog

    • Default config value for priority is now None, which makes the download source priority follow the order of the SOURCE entries in the model template files.
    • isort all imports
    • Fixed Issue #24: Use hash labels to name subset files so filenames are unique and don't get too long (Pull Request #26).
    • When loading data into xarray, Herbie will parse CF grid_mapping from grib file with pygrib/pyproj/metpy. (see CF Appendix F: Grid Mapping for more info). 2789859120883e32592f5559a1e211696b58cf3e
    • Lay the groundwork for local GRIB2 files that are not downloaded from remote sources. Suppose you ran a model locally (like WRF) and have that data stored locally. Herbie can be configured to find and read those local files! This assumes an index file also exists in the same directory with .idx appended to the file name.
      • Documentation needed.
    • Added a model template for rap_ncei. This is poorly implemented because the NCEI data URLs don't match the other RAP model URLs.
    • #24, implemented hashed filename for subset by @blaylockbk in https://github.com/blaylockbk/Herbie/pull/26
    • Add a Gitter chat badge to README.md by @gitter-badger in https://github.com/blaylockbk/Herbie/pull/21
    • Removed the Gitter chat badge. I'm not committed to chatting on Gitter; let's just focus on GitHub Discussions.
    • Add support for NCEI historical RAP analyses by @djgagne in https://github.com/blaylockbk/Herbie/pull/30
    • Handle different idx styles by @blaylockbk in https://github.com/blaylockbk/Herbie/pull/35 (more changes in addition to those by djgagne)
    • Fixed #33: herbie.tools.bulk_download now returns a dict of the Herbie objects that succeeded and those that failed.
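
    The success/failure bookkeeping for bulk downloads can be sketched generically; `try_download` here is a hypothetical stand-in for creating a Herbie object and downloading its file, not Herbie's actual implementation:

```python
def bulk_download(dates, try_download):
    """Attempt a download for each date; sort results into success/failed buckets."""
    results = {"success": [], "failed": []}
    for date in dates:
        try:
            results["success"].append(try_download(date))
        except Exception:
            results["failed"].append(date)
    return results

# Usage with a fake downloader that fails on one date.
def fake(date):
    if date == "2021-01-02":
        raise FileNotFoundError(date)
    return date

out = bulk_download(["2021-01-01", "2021-01-02", "2021-01-03"], fake)
print(out)  # {'success': ['2021-01-01', '2021-01-03'], 'failed': ['2021-01-02']}
```

Returning both buckets lets a caller retry just the failed dates instead of re-running the whole batch.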

    Full Changelog: https://github.com/blaylockbk/Herbie/compare/0.0.6...0.0.7

  • 0.0.6(Aug 30, 2021)

    Changelog

    • #18 Use TOML as config file format instead of INI
    • Expanded the settings that can be set in the configuration file
    • Adopt Black formatting
    • Moved PyPI project from hrrrb to herbie-data
  • 0.0.5(Jul 28, 2021)

    New Name! HRRR-B πŸ – Herbie

    I updated the GitHub repository name to Herbie and am slowly removing the old hrrrb API (but it's still there).

    The most significant change is that the vision for Herbie has expanded: Herbie is being built to download many different model types, not just the HRRR model.

    • Renamed the package to herbie. "Herbie is your model output download assistant with a mind of its own." Yes, it is named after a favorite childhood movie series.
    • Implemented the new Herbie class.
    • Dropped support for hrrrx (the experimental HRRR is no longer archived on Pando, and ESRL is now developing RRFS).
    • Added the ability to download and read RAP model GRIB2 files.
    • Less reliance on Pando, more on AWS and Google.
    • New searchString method for index file searches. Uses the same regex search patterns as the old API.
    • The filename for a GRIB2 subset now includes all GRIB message numbers.
    • Moved the default download source to a config file setting.
    • Check for a local copy of the file on init (no need to look for the file remotely if we already have a local copy).
    • Option to remove the GRIB2 file after reading it into xarray if it didn't already exist locally (don't clutter the local disk).
    • Attach the index file DataFrame to the object if it exists.
    • If the full file exists locally, use the remote idx file to cURL the local file instead of the remote one. (We can't create the idx file locally because wgrib2 isn't available on Windows.)
    • Added GFS data, though it isn't implemented as cleanly as HRRR or RAP.
    • Renamed the 'field' argument to 'product'.
    • ✨ Moved the source URL templates to their own classes in the models folder.
    • Renamed the GitHub repository to Herbie (changed from HRRR_archive_download).
    • Added RRFS, NBM, GFS, and RAP as models Herbie can download.
    • Reworked read_idx() to support index files with additional info (specifically for the NBM).
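
    The searchString mechanism mentioned above is essentially a regular-expression filter over the lines of a GRIB2 index file; only the matching messages' byte ranges are downloaded. A minimal sketch, with made-up lines in the style of a wgrib2 .idx file:

```python
import re

# Illustrative index lines: message number, byte offset, run date,
# variable, level, and forecast hour, separated by colons.
idx_lines = [
    "70:41000000:d=2021010112:TMP:500 mb:6 hour fcst:",
    "71:41745646:d=2021010112:TMP:2 m above ground:6 hour fcst:",
    "72:42500000:d=2021010112:DPT:2 m above ground:6 hour fcst:",
]

def subset(lines, searchString):
    """Keep only the index lines whose text matches the regex."""
    return [line for line in lines if re.search(searchString, line)]

print(subset(idx_lines, "TMP:2 m"))         # matches only the 2-m temperature line
print(subset(idx_lines, ":(TMP|DPT):2 m"))  # matches both 2-m fields
```

Because the pattern is applied with `re.search`, any regex that matches part of an index line works, which is why patterns like ":500 mb" or ":(TMP|DPT):" select whole groups of fields.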
  • 0.0.4(Jun 4, 2021)

    New Herbie API

    There are a few things about the hrrrb API that make it difficult to update, so I started changing things under the new name "herbie." But don't worry, both the hrrrb and herbie APIs are included. The setup.py file is also fixed.

    To use the new Herbie API, refer to the documentation for some usage examples.

  • 0.0.3(Feb 26, 2021)

    Welcome to the world, HRRR-B πŸŽ‚

    This is my first GitHub release ever! I have published on PyPI before, but this is my first release here on GitHub. Certainly a happy birthday for the HRRR-B package.

    Be aware: this is v0.0.3, so it is subject to change at my leisure. This repository primarily serves as an example of how you can download HRRR data from archives, but I try to keep the package in a workable state that might be useful for you.

    πŸ“” Documentation

Owner: Brian Blaylock, atmospheric scientist and post-doc at the Naval Research Laboratory.