PyStan, a Python interface to Stan, a platform for statistical modeling. Documentation: https://pystan.readthedocs.io

PyStan

PyStan is a Python interface to Stan, a package for Bayesian inference.

Stan® is a state-of-the-art platform for statistical modeling and high-performance statistical computation. Thousands of users rely on Stan for statistical modeling, data analysis, and prediction in the social, biological, and physical sciences, engineering, and business.

Notable features of PyStan include:

  • Automatic caching of compiled Stan models
  • Automatic caching of samples from Stan models
  • An interface similar to that of RStan
  • Open source software: ISC License

Getting started

Install PyStan with pip install pystan. PyStan requires Python ≥3.7 running on Linux or macOS. You will also need a C++ compiler such as gcc ≥9.0 or clang ≥10.0.

The following block of code shows how to use PyStan with a hierarchical model of coaching effects across eight schools (see Section 5.5 of Gelman et al. (2003)). This model is often called the "eight schools" model.

import stan

schools_code = """
data {
  int<lower=0> J;         // number of schools
  real y[J];              // estimated treatment effects
  real<lower=0> sigma[J]; // standard error of effect estimates
}
parameters {
  real mu;                // population treatment effect
  real<lower=0> tau;      // standard deviation in treatment effects
  vector[J] eta;          // unscaled deviation from mu by school
}
transformed parameters {
  vector[J] theta = mu + tau * eta;        // school treatment effects
}
model {
  target += normal_lpdf(eta | 0, 1);       // prior log-density
  target += normal_lpdf(y | theta, sigma); // log-likelihood
}
"""

schools_data = {"J": 8,
                "y": [28,  8, -3,  7, -1,  1, 18, 12],
                "sigma": [15, 10, 16, 11,  9, 11, 10, 18]}

posterior = stan.build(schools_code, data=schools_data)
fit = posterior.sample(num_chains=4, num_samples=1000)
eta = fit["eta"]  # array with shape (8, 4000)
df = fit.to_frame()  # pandas `DataFrame`

Citation

We appreciate citations as they let us discover what people have been doing with the software. Citations also provide evidence of use which can help in obtaining grant funding.

To cite PyStan in publications use:

Riddell, A., Hartikainen, A., & Carter, M. (2021). PyStan (3.0.0). https://pypi.org/project/pystan

Or use the following BibTeX entry:

@misc{pystan,
  title = {pystan (3.0.0)},
  author = {Riddell, Allen and Hartikainen, Ari and Carter, Matthew},
  year = {2021},
  month = mar,
  howpublished = {PyPI}
}

Please also cite Stan.

Comments
  • feat: Add log_prob method to Model

    Added log_prob method to Model instances, allowing users to calculate the log probability of a list of unconstrained parameters.

    This feature is accompanied by a test: the log_prob method is validated by comparing the output against the log probability (lp__) extracted from a model fit.

    Closes #40

    opened by mjcarter95 23
  • Compilation crashes with `undefined symbol: _ZSt28__throw_bad_array_new_lengthv` (Fedora 34, gcc 11)

    After installing pystan and running the 8schools example, the compilation crashes with error message

    ImportError: /home/solant/.cache/httpstan/4.4.2/models/sk7xw5y6/stan_services_model_sk7xw5y6.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZSt28__throw_bad_array_new_lengthv
    

    I'm running Fedora 34 and tried both the most recent Python version and 3.7.10. Any idea what's wrong? I couldn't find anything by googling the above error.

    bug 
    opened by solbes 21
  • Exception: variable does not exist;

    I ran the eight schools example with PyStan 3.0 and got this error:

    File "/opt/python3/lib/python3.6/site-packages/stan/model.py", line 189, in build
        raise RuntimeError(response_payload["error"]["message"])
    RuntimeError: Error calling param_names: `Exception: variable does not exist; processing stage=data initialization; variable name=J; base type=int (in 'unknown file name' at line 3)

    Could you please check it? Thank you.

    The source code is:

    import stan

    program_code = """
    data {
      int<lower=0> J;         // number of schools
      real y[J];              // estimated treatment effects
      real<lower=0> sigma[J]; // s.e. of effect estimates
    }
    parameters {
      real mu;
      real<lower=0> tau;
      real eta[J];
    }
    transformed parameters {
      real theta[J];
      for (j in 1:J)
        theta[j] = mu + tau * eta[j];
    }
    model {
      target += normal_lpdf(eta | 0, 1);
      target += normal_lpdf(y | theta, sigma);
    }
    """

    data = {'J': 8, 'y': [28, 8, -3, 7, -1, 1, 18, 12], 'sigma': [15, 10, 16, 11, 9, 11, 10, 18]}

    posterior = stan.build(program_code, data=data)
    fit = posterior.sample(num_chains=4, num_samples=1000)

    bug 
    opened by L619 20
  • Run from JupyterLab / Jupyter Notebook

    Running httpstan from JupyterLab or Jupyter Notebook fails because Jupyter is already running an asyncio event loop:

    ~\miniconda3\envs\stan3\lib\asyncio\base_events.py in run_forever(self)
        427             raise RuntimeError(
    --> 428                 'Cannot run the event loop while another loop is running')
        429         self._set_coroutine_wrapper(self._debug)
    
    RuntimeError: Cannot run the event loop while another loop is running
    

    Using IPython works. (I'm running this on Windows, and Python crashes when I exit the interpreter, so I save the results to netCDF with ArviZ and use them later.)
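The failure can be reproduced without PyStan. The sketch below uses only the standard library: `blocking_call` is a hypothetical stand-in for a library function that calls `asyncio.run()` internally, and `notebook_cell` plays the role of code executing inside Jupyter's already-running loop. Moving the call to a worker thread is one possible workaround, shown as an illustration and not as PyStan's own fix.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def work():
    # Stand-in for the coroutine a library runs internally.
    await asyncio.sleep(0)
    return "done"


def blocking_call():
    # Hypothetical library function that starts its own event loop.
    coro = work()
    try:
        return asyncio.run(coro)
    except RuntimeError:
        coro.close()  # avoid a "coroutine was never awaited" warning
        raise


async def notebook_cell():
    # A loop is already running here, as in Jupyter, so the call is refused.
    try:
        direct = blocking_call()
    except RuntimeError as exc:
        direct = f"RuntimeError: {exc}"
    # A worker thread has no running loop, so the same call succeeds there.
    with ThreadPoolExecutor(max_workers=1) as pool:
        threaded = pool.submit(blocking_call).result()
    return direct, threaded


direct, threaded = asyncio.run(notebook_cell())
print(direct)
print(threaded)
```

The running-loop check is per thread, which is why the second call succeeds. The trade-off is that the calling cell blocks until the worker finishes.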

    opened by ahartikainen 19
  • Document how to run tests

    Dear @ariddell,

    Newcomers would welcome a "Setup" section in the README. I have followed these steps to install the version I'm also hacking on:

    $ conda create -n pystan-next python=3.6 numpy scipy cython -c conda-forge
    $ source activate pystan-next
    $ pip install -r test-requirements.txt
    $ pip install -e .
    

    But I couldn't run the test suite successfully. Running

    $ python -m pytest
    

    gave OS errors related to servers and processes ([Errno 98] Address already in use). I haven't dug into the internals yet; I would like to work on https://github.com/stan-dev/pystan/issues/338#issuecomment-355820310 directly!

    Thank you, Marianne

    opened by mkcor 19
  • Decode Error when extracting protobuf message (macOS)

    While playing around with the package, I found that beyond a certain amount of data or number of parameters, I run into a DecodeError.

    Taking the sample model...

    
    schools_code = """
    data {
      int<lower=0> J;         // number of schools
      real y[J];              // estimated treatment effects
      real<lower=0> sigma[J]; // standard error of effect estimates
    }
    parameters {
      real mu;                // population treatment effect
      real<lower=0> tau;      // standard deviation in treatment effects
      vector[J] eta;          // unscaled deviation from mu by school
    }
    transformed parameters {
      vector[J] theta = mu + tau * eta;        // school treatment effects
    }
    model {
      target += normal_lpdf(eta | 0, 1);       // prior log-density
      target += normal_lpdf(y | theta, sigma); // log-likelihood
    }
    """
    

    This succeeds:

    schools_data = {"J": 8,
                    "y": [28,  8, -3,  7, -1,  1, 18, 12],
                    "sigma": [15, 10, 16, 11,  9, 11, 10, 18]}
    
    posterior = stan.build(schools_code, data=schools_data)
    fit = posterior.sample(num_chains=4, num_samples=1000)
    

    This succeeds:

    schools_data = {"J": 8*10,
                    "y": [28,  8, -3,  7, -1,  1, 18, 12]*10,
                    "sigma": [15, 10, 16, 11,  9, 11, 10, 18]*10}
    
    posterior = stan.build(schools_code, data=schools_data)
    fit = posterior.sample(num_chains=4, num_samples=1000)
    

    But this fails:

    schools_data = {"J": 8*11,
                    "y": [28,  8, -3,  7, -1,  1, 18, 12]*11,
                    "sigma": [15, 10, 16, 11,  9, 11, 10, 18]*11}
    
    posterior = stan.build(schools_code, data=schools_data)
    fit = posterior.sample(num_chains=4, num_samples=1000)
    

    Stack trace:

    DecodeError                               Traceback (most recent call last)
    <ipython-input-3-815b881aecc2> in <module>
          1 posterior = stan.build(schools_code, data=schools_data)
    ----> 2 fit = posterior.sample(num_chains=1, num_samples=1000)
    
    ~/.pyenv/versions/3.8.5/envs/stan/lib/python3.8/site-packages/stan/model.py in sample(self, **kwargs)
        239
        240         try:
    --> 241             return asyncio.run(go())
        242         except KeyboardInterrupt:
        243             pass
    
    ~/.pyenv/versions/3.8.5/lib/python3.8/asyncio/runners.py in run(main, debug)
         41         events.set_event_loop(loop)
         42         loop.set_debug(debug)
    ---> 43         return loop.run_until_complete(main)
         44     finally:
         45         try:
    
    ~/.pyenv/versions/3.8.5/lib/python3.8/asyncio/base_events.py in run_until_complete(self, future)
        614             raise RuntimeError('Event loop stopped before Future completed.')
        615
    --> 616         return future.result()
        617
        618     def stop(self):
    
    ~/.pyenv/versions/3.8.5/envs/stan/lib/python3.8/site-packages/stan/model.py in go()
        188                         if resp.status != 200:
        189                             raise RuntimeError((await resp.json())["message"])
    --> 190                         stan_outputs.append(tuple(extract_protobuf_messages(await resp.read())))
        191
        192                 def is_nonempty_logger_message(msg):
    
    ~/.pyenv/versions/3.8.5/envs/stan/lib/python3.8/site-packages/stan/model.py in extract_protobuf_messages(fit_bytes)
        131                 msg = callbacks_writer_pb2.WriterMessage()
        132                 next_pos, pos = varint_decoder(fit_bytes, pos)
    --> 133                 msg.ParseFromString(fit_bytes[pos : pos + next_pos])
        134                 yield msg
        135                 pos += next_pos
    
    DecodeError: Error parsing message
    

    Running on MacOS 10.15.6 with 32GB, Python 3.8.5, stan 3.0.0b4, httpstan 2.3.0

    opened by lwoloszy 17
  • ci: macos-11 image uses 10.9 wheels

    Describe the bug

    For some reason, the macos-11 images in the CI pipeline install 10.9 wheels with pip.

    Describe your system

    GitHub Actions

    Steps/Code to Reproduce

    bug 
    opened by ahartikainen 13
  • BrokenProcessPool issue (Debian 10, py3.9)

    I get the following error when I build and then sample. I was previously able to run other Stan files on the same machine.

    Exception in callback handle_create_fit.<locals>._services_call_done({'done': True, 'metadata': {'fit': {'name': 'models/xgkim...fits/wonoxjcj'}}, 'name': 'operations/wonoxjcj', 'result': {'code': 400, 'message': "Exception du...broken)\\n']`", 'status': 'Bad Request'}})(<Task finishe...ble anymore')>) at /home/dnachbar/.local/lib/python3.9/site-packages/httpstan/views.py:367
    handle: <Handle handle_create_fit.<locals>._services_call_done({'done': True, 'metadata': {'fit': {'name': 'models/xgkim...fits/wonoxjcj'}}, 'name': 'operations/wonoxjcj', 'result': {'code': 400, 'message': "Exception du...broken)\\n']`", 'status': 'Bad Request'}})(<Task finishe...ble anymore')>) at /home/dnachbar/.local/lib/python3.9/site-packages/httpstan/views.py:367>
    Traceback (most recent call last):
      File "/usr/lib/python3.9/asyncio/events.py", line 80, in _run
        self._context.run(self._callback, *self._args)
      File "/home/dnachbar/.local/lib/python3.9/site-packages/httpstan/views.py", line 392, in _services_call_done
        httpstan.cache.delete_fit(operation["metadata"]["fit"]["name"])
      File "/home/dnachbar/.local/lib/python3.9/site-packages/httpstan/cache.py", line 140, in delete_fit
        path.unlink()
      File "/usr/lib/python3.9/pathlib.py", line 1343, in unlink
        self._accessor.unlink(self)
    FileNotFoundError: [Errno 2] No such file or directory: '/home/dnachbar/.cache/httpstan/4.4.2/models/xgkim2uo/fits/wonoxjcj.jsonlines.lz4'
    Traceback (most recent call last):
      File "/home/dnachbar/python/attribution/sim.py", line 79, in <module>
        fit = model.sample(num_samples=200, num_chains=2)
      File "/home/dnachbar/.local/lib/python3.9/site-packages/stan/model.py", line 74, in sample
        return self.hmc_nuts_diag_e_adapt(**kwargs)
      File "/home/dnachbar/.local/lib/python3.9/site-packages/stan/model.py", line 94, in hmc_nuts_diag_e_adapt
        return self._create_fit(kwargs)
      File "/home/dnachbar/.local/lib/python3.9/site-packages/stan/model.py", line 279, in _create_fit
        return asyncio.run(go())
      File "/usr/lib/python3.9/asyncio/runners.py", line 44, in run
        return loop.run_until_complete(main)
      File "/usr/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
        return future.result()
      File "/home/dnachbar/.local/lib/python3.9/site-packages/stan/model.py", line 209, in go
        raise RuntimeError(operation["result"]["message"])
    RuntimeError: Exception during call to services function: `BrokenProcessPool('A process in the process pool was terminated abruptly while the future was running or pending.')`, traceback: `['  File "/home/dnachbar/.local/lib/python3.9/site-packages/httpstan/services_stub.py", line 153, in call\n    future.result()\n']`
    
    bug 
    opened by dirknbr 11
  • Cast np.int types to python int in `_make_json_serializable`

    def _make_json_serializable(data: dict) -> dict:
        """Convert `data` with numpy.ndarray-like values to JSON-serializable form.
        Returns a new dictionary.
        Arguments:
            data (dict): A Python dictionary or mapping providing the data for the
                model. Variable names are the keys and the values are their
                associated values. Default is an empty dictionary.
        Returns:
            dict: Copy of `data` dict with JSON-serializable values.
        """
        # no need for deep copy, we do not modify mutable items
        data = data.copy()
        for key, value in data.items():
            # first, see if the value is already JSON-serializable
            try:
                json.dumps(value)
            except TypeError:
                pass
            else:
                continue
            # numpy scalar
            if isinstance(value, np.ndarray) and value.ndim == 0:
                data[key] = np.asarray(value).tolist()
            # numpy.ndarray, pandas.Series, and anything similar
            elif isinstance(value, collections.abc.Collection):
                data[key] = np.asarray(value).tolist()
            else:
                raise TypeError(f"Value associated with variable `{key}` is not JSON serializable.")
        return data
    

    Currently, np.int64 values raise a TypeError (raise TypeError(f"Value associated with variable `{key}` is not JSON serializable.")).

    I think we can check against numbers.Integral and numbers.Real and cast such numbers automatically:

    from numbers import Integral, Real
    
    if isinstance(value, Integral):
        value = int(value)
    elif isinstance(value, Real):
        value = float(value)
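As a rough, self-contained sketch of that proposal (not the code that was merged into PyStan), the helper below casts numeric scalars to built-in types. The name `cast_numeric_scalars` is hypothetical. `Fraction` is used as a stand-in for a non-built-in scalar so that the example runs without NumPy; NumPy registers its integer and floating scalar types with `numbers.Integral` and `numbers.Real`, so they would take the same branches.

```python
import json
from fractions import Fraction
from numbers import Integral, Real


def cast_numeric_scalars(data: dict) -> dict:
    """Return a copy of `data` with numeric scalars cast to built-in int/float."""
    result = {}
    for key, value in data.items():
        # Integral must be tested first: every Integral is also a Real.
        if isinstance(value, Integral):
            result[key] = int(value)
        elif isinstance(value, Real):
            result[key] = float(value)
        else:
            result[key] = value  # lists and other containers are left untouched
    return result


# json.dumps(Fraction(3, 2)) would raise TypeError; after casting it works.
example = cast_numeric_scalars({"J": 8, "scale": Fraction(3, 2), "y": [1, 2]})
print(json.dumps(example))
```

Note that `bool` is also an `Integral`, so `True` would become `1` under this scheme.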
    
    opened by ahartikainen 11
  • [discussion] PyStan 3 Beta 2 and Proposed PyStan 2 Deprecation Plans

    This is a draft of a post intended for the Stan Forums. Feedback welcome. It would be nice to find some kind of common ground among the PyStan devs and among any other Stan devs who care to sign on.

    PyStan 3 Beta 2 and Proposed PyStan 2 Deprecation Plans

    PyStan 3 Beta 2 is here and it works for most people. It uses Stan 2.24.0 and has great features PyStan 2 lacks. A release candidate for PyStan 3 will wait until RStan 3 is ready since harmonizing the interfaces is a high-priority goal.

    PyStan 2 works fine but is stuck on 2.19. It is fair to say that PyStan 2 is no longer actively maintained.

    This leaves us in an uncomfortable situation. PyStan 2 is not maintained but there is no official, non-beta-version replacement for PyStan 2. (And there will be no official replacement until RStan 3 is ready.) This post describes one proposed way of dealing with this situation.


    People who can move to PyStan 3 Beta 2 should start doing so. For those using the default sampler, PyStan 3 Beta 2 works fine. If you can draw samples using code written for PyStan 2, you can very likely draw samples using PyStan 3 Beta 2.

    As for people who can't move to or who depend on features missing from PyStan 3 Beta 2, there are some alternatives available.

    People who can start using PyStan 3 Beta 2 right now:

    • macOS and Linux users who have x86_64 hardware and who are drawing samples from models using the default sampler.

    People who cannot start using PyStan 3 Beta 2 right now:

    • Users who use variational inference, maximization algorithms (e.g., LBFGS), or samplers other than the default sampler.
    • Users stuck on Python 2.
    • Windows users who cannot use WSL2 to emulate Linux.

    For these users it makes sense to switch over to a different Stan interface (e.g., CmdStanPy, CmdStan, RStan) or use a different NUTS implementation (e.g., PyMC3). PyStan 3's set of supported platforms is unlikely to change before 2021. People using Stan for VI or posterior maximization can use packages such as jax and pytorch. jax and pytorch support a broader range of optimization algorithms. They also have multinational corporations supporting their development.


    It seems to me that there's a path available for all Python users who want to use newer versions of the Stan C++ library. No Python user is left entirely without options.

    With this in mind, I'd like to propose doing the following over the coming months:

    1. Announce PyStan 3 Beta 2 in a variety of places in order to attract more testers.
    2. Indicate in a variety of places that PyStan 2 is no longer being developed and that folks should start exploring their options.

    Updated 2020-08-19: Minor edits for style, changed WSL to WSL2. Updated 2020-08-31: Minor edits in last section.

    opened by riddell-stan 11
  • Add information from model to fit

    Hi,

    Could we add model.program_code, and maybe other information too, to the fit, so that one can inspect and recreate the model without the model instance?

    This could go in a dict under fit.model_info.

    wontfix 
    opened by ahartikainen 11
  • No mention of how to generate logs (verbose=True) for pystan3

    Describe the problem with the documentation

    I'm not sure whether this functionality exists yet in PyStan 3 (the ability for the user to set verbosity to get more extensive logging), but if it does, it would be great to document how this argument can be passed to either the build or sample methods.

    Suggest a potential alternative/fix

    I have not looked at the code base in detail, so I'm not sure whether this is just a documentation ticket or also a code enhancement.

    opened by jelc53 0
  • `KeyError: 'message'` occurs when trying to build a model with wrong datatype

    Describe the bug

    When the data has the wrong datatype (e.g. the values in the dataset are of type str), PyStan raises KeyError: 'message'.

    The full Traceback is:

    Traceback (most recent call last):
      File "/workdir/main.py", line 25, in <module>
        posterior = stan.build(model, data=data)
      File "/usr/local/lib/python3.9/site-packages/stan/model.py", line 519, in build
        return asyncio.run(go())
      File "/usr/local/lib/python3.9/asyncio/runners.py", line 44, in run
        return loop.run_until_complete(main)
      File "/usr/local/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
        return future.result()
      File "/usr/local/lib/python3.9/site-packages/stan/model.py", line 511, in go
        raise RuntimeError(resp.json()["message"])
    KeyError: 'message'
    

    This is possibly a problem in httpstan, but I don't think this PyStan behavior is intended, so I'm reporting it here.

    Describe your system

    • CPU type: x86-64
    • Windows 10 19044.2251
    • WSL 2 Ubuntu 20.04
    • Docker Desktop 4.10.1
    • Python 3.9 (I also checked occurrence of the same error in 3.8 and 3.11)
    • pystan 3.6.0 and httpstan 4.9.1 (I also checked in pystan==3.2.0 and httpstan==4.5.0)

    Steps/Code to Reproduce

    My code sample consists of two files. To run them, please place them in the same directory.

    ├── Dockerfile
    └── main.py
    
    # Dockerfile
    FROM python:3.9
    COPY main.py .
    RUN pip3 install httpstan==4.9.1 pystan==3.6.0
    CMD ["python3", "main.py"]
    
    # main.py
    import stan
    
    
    model = """
    data {
      int<lower=0> N;
      vector[N] x;
      vector[N] y;
    }
    parameters {
      real alpha;
      real beta;
      real<lower=0> sigma;
    }
    model {
      y ~ normal(alpha + beta * x, sigma);
    }
    """
    
    # data has str values
    data = {
        'y': ['58', '59', '60', '61', '62', '63', '64', '65', '66', '67', '68', '69', '70', '71', '72'],
        'x': ['115', '117', '120', '123', '126', '129', '132', '135', '139', '142', '146', '150', '154', '159', '164'],
        'N': 15
    }
    posterior = stan.build(model, data=data)
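Independently of the error-reporting bug, the string values can be converted before the data reaches `stan.build`. The helper below is a minimal sketch; `coerce_numeric` is a hypothetical name, not part of PyStan, and it assumes that every string in the data is meant to be a real number.

```python
def coerce_numeric(data: dict) -> dict:
    """Return a copy of `data` with numeric strings converted to float."""

    def convert(value):
        if isinstance(value, str):
            return float(value)  # raises ValueError for non-numeric text
        if isinstance(value, (list, tuple)):
            return [convert(item) for item in value]
        return value  # ints, floats, and anything else pass through unchanged

    return {key: convert(value) for key, value in data.items()}


raw = {"N": 3, "x": ["115", "117", "120"], "y": ["58", "59", "60"]}
clean = coerce_numeric(raw)
print(clean)
```

Failing early with a `ValueError` on non-numeric text gives a clearer message than the `KeyError` above, which is the main reason to convert on the Python side.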
    
    bug 
    opened by nigimitama 1
  • I ran into an issue, while trying to get started with stan/pystan.

    I ran into the following errors and warnings while trying to get started with Stan/PyStan using the documentation at https://pystan.readthedocs.io/en/latest/. This is what is there now:

    import stan
    
    schools_code = """
    data {
      int<lower=0> J;         // number of schools
      real y[J];              // estimated treatment effects
      real<lower=0> sigma[J]; // standard error of effect estimates
    }
    parameters {
      real mu;                // population treatment effect
      real<lower=0> tau;      // standard deviation in treatment effects
      vector[J] eta;          // unscaled deviation from mu by school
    }
    transformed parameters {
      vector[J] theta = mu + tau * eta;        // school treatment effects
    }
    model {
      target += normal_lpdf(eta | 0, 1);       // prior log-density
      target += normal_lpdf(y | theta, sigma); // log-likelihood
    }
    """
    
    schools_data = {"J": 8,
                    "y": [28,  8, -3,  7, -1,  1, 18, 12],
                    "sigma": [15, 10, 16, 11,  9, 11, 10, 18]}
    
    posterior = stan.build(schools_code, data=schools_data)
    fit = posterior.sample(num_chains=4, num_samples=1000)
    eta = fit["eta"]  # array with shape (8, 4000)
    df = fit.to_frame()  # pandas `DataFrame`, requires pandas
    

    Running the above, we get the following messages from the stan.build function:

    Messages from stanc:
    Warning in '/tmp/httpstan_yl4pxs0i/model_zzhabz4t.stan', line 4, column 2: Declaration
        of arrays by placing brackets after a variable name is deprecated and
        will be removed in Stan 2.32.0. Instead use the array keyword before the
        type. This can be changed automatically using the auto-format flag to
        stanc
    Warning in '/tmp/httpstan_yl4pxs0i/model_zzhabz4t.stan', line 5, column 2: Declaration
        of arrays by placing brackets after a variable name is deprecated and
        will be removed in Stan 2.32.0. Instead use the array keyword before the
        type. This can be changed automatically using the auto-format flag to
        stanc
    Warning: The parameter tau has no priors. This means either no prior is
        provided, or the prior(s) depend on data variables. In the later case,
        this may be a false positive.
    Warning: The parameter mu has no priors. This means either no prior is
        provided, or the prior(s) depend on data variables. In the later case,
        this may be a false positive.
    

    Then posterior.sample fails with:

    ---------------------------------------------------------------------------
    RuntimeError                              Traceback (most recent call last)
    Cell In [3], line 1
    ----> 1 fit = posterior.sample(num_chains=4, num_samples=1000)
    
    File ~/.local/lib/python3.10/site-packages/stan/model.py:89, in Model.sample(self, num_chains, **kwargs)
         61 def sample(self, *, num_chains=4, **kwargs) -> stan.fit.Fit:
         62     """Draw samples from the model.
         63 
         64     Parameters in ``kwargs`` will be passed to the default sample function.
       (...)
         87 
         88     """
    ---> 89     return self.hmc_nuts_diag_e_adapt(num_chains=num_chains, **kwargs)
    
    File ~/.local/lib/python3.10/site-packages/stan/model.py:108, in Model.hmc_nuts_diag_e_adapt(self, num_chains, **kwargs)
         92 """Draw samples from the model using ``stan::services::sample::hmc_nuts_diag_e_adapt``.
         93 
         94 Parameters in ``kwargs`` will be passed to the (Python wrapper of)
       (...)
        105 
        106 """
        107 function = "stan::services::sample::hmc_nuts_diag_e_adapt"
    --> 108 return self._create_fit(function=function, num_chains=num_chains, **kwargs)
    
    File ~/.local/lib/python3.10/site-packages/stan/model.py:312, in Model._create_fit(self, function, num_chains, **kwargs)
        309     return fit
        311 try:
    --> 312     return asyncio.run(go())
        313 except KeyboardInterrupt:
        314     return
    
    File /usr/local/lib/python3.10/asyncio/runners.py:44, in run(main, debug)
         42     if debug is not None:
         43         loop.set_debug(debug)
    ---> 44     return loop.run_until_complete(main)
         45 finally:
         46     try:
    
    File /usr/local/lib/python3.10/asyncio/base_events.py:646, in BaseEventLoop.run_until_complete(self, future)
        643 if not future.done():
        644     raise RuntimeError('Event loop stopped before Future completed.')
    --> 646 return future.result()
    
    File ~/.local/lib/python3.10/site-packages/stan/model.py:236, in Model._create_fit.<locals>.go()
        234         sampling_output.write_line("<info>Sampling:</info> <error>Initialization failed.</error>")
        235         raise RuntimeError("Initialization failed.")
    --> 236     raise RuntimeError(message)
        238 resp = await client.get(f"/{fit_name}")
        239 if resp.status != 200:
    
    RuntimeError: Exception during call to services function: `BrokenProcessPool('A child process terminated abruptly, the process pool is not usable anymore')`, traceback: `['  File "/home/knappa/.local/lib/python3.10/site-packages/httpstan/services_stub.py", line 112, in call\n    future = asyncio.get_running_loop().run_in_executor(executor, lazy_function_wrapper_partial)  # type: ignore\n', '  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 818, in run_in_executor\n    executor.submit(func, *args), loop=self)\n', '  File "/usr/local/lib/python3.10/concurrent/futures/process.py", line 715, in submit\n    raise BrokenProcessPool(self._broken)\n']`
    

    This is on a Debian box (version=testing), running PyStan 3.5.0, installed through pip.

    Originally posted by @knappa in https://github.com/stan-dev/pystan/issues/354#issuecomment-1251231774

    opened by knappa 4
  • Deleting model fits from cache in a simulation setting (where each model is given a different seed)

    Describe the problem with the documentation

    I am working with Stan models in a simulation study, using PyStan, where I fit the same model multiple times with different values for random_seed. I noticed that after fitting, the fit is saved to my cache folder under httpstan/4.4.2/models/“model_name”/fits/“fit_name”.

    The problem I run into is that my memory gets cluttered by these files, while I don’t need them. I have tried clearing the folders containing these files manually, but since I am using parallelization, I cannot just delete entire folders on the go.

    Is there a way to delete fit files after I retrieve the posterior samples that I want, or to keep Stan from saving these files? I tried using the delete_fit function from httpstan.cache, which requires you to specify an identifier for the model (e.g. model_name), which is easy to obtain, and an identifier for the fit (e.g. fit_name), which I am not sure how to obtain (there is a calculate_fit_name function in httpstan.fits, but I cannot get it to work). The documentation on how to use these functions (calculate_fit_name and delete_fit) is not clear to me.

    Suggest a potential alternative/fix

    Could you provide an example of how to delete model fits from the cache (in a setting where a new model is fitted within each iteration of a for-loop)?
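One file-level approach, sketched below with the standard library only, is to remove the fit files for a single model directory once its samples have been retrieved. This is an assumption-laden illustration: `remove_cached_fits` is a hypothetical helper, the `models/<model_id>/fits` layout is inferred from the path quoted above, and the code does not call httpstan's own cache functions. It is demonstrated on a temporary directory, not on the real cache.

```python
import tempfile
from pathlib import Path


def remove_cached_fits(cache_root: Path, model_id: str) -> int:
    """Delete every file under <cache_root>/models/<model_id>/fits; return the count."""
    fits_dir = cache_root / "models" / model_id / "fits"
    removed = 0
    for path in fits_dir.glob("*"):
        if path.is_file():
            path.unlink()
            removed += 1
    return removed


# Demonstration on a throwaway directory that mimics the assumed layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    fits = root / "models" / "abc123" / "fits"
    fits.mkdir(parents=True)
    (fits / "fit1.jsonlines.lz4").write_bytes(b"")
    (fits / "fit2.jsonlines.lz4").write_bytes(b"")
    removed = remove_cached_fits(root, "abc123")
    remaining = list(fits.iterdir())

print(removed, remaining)
```

With parallel workers that share one model, deleting files another worker has not yet read would be unsafe, so a call like this would belong after all workers for that model have finished.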

    opened by coenvdm 6
  • Update docs and tests for Stan 2.32 syntax changes

    Address this kind of warning:

    Declaration of arrays by placing brackets after a variable name is deprecated and
        will be removed in Stan 2.32.0. Instead use the array keyword before the
        type. This can be changed automatically using the auto-format flag to
        stanc
    

    Documentation and tests must be updated.
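For reference, the eight schools program with its data block rewritten to use the array keyword might look like the sketch below. This assumes a PyStan release bundling a Stan version that accepts the new syntax; the regular expression is only a rough, hypothetical check for the bracket-after-name form on `int` and `real` declarations, and stanc's auto-formatter remains the reliable way to convert a program.

```python
import re

schools_code = """
data {
  int<lower=0> J;                // number of schools
  array[J] real y;               // estimated treatment effects
  array[J] real<lower=0> sigma;  // standard error of effect estimates
}
parameters {
  real mu;                // population treatment effect
  real<lower=0> tau;      // standard deviation in treatment effects
  vector[J] eta;          // unscaled deviation from mu by school
}
transformed parameters {
  vector[J] theta = mu + tau * eta;        // school treatment effects
}
model {
  target += normal_lpdf(eta | 0, 1);       // prior log-density
  target += normal_lpdf(y | theta, sigma); // log-likelihood
}
"""

# Deprecated declarations put the brackets after the name, e.g. `real y[J];`.
old_style = re.compile(r"^\s*(?:int|real)\b[^;\n]*\w\s*\[[^\]]*\]\s*;", re.MULTILINE)

has_old_style = old_style.search(schools_code) is not None
print(has_old_style)
```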

    bug help wanted 
    opened by riddell-stan 3
  • Setting a maximum number of hardware threads regardless of the number of chains

    Hi there,

    After skimming through the PyStan documentation, I just realized that there is no easy way to set a maximum number of cores within PyStan itself. I saw a closed issue from 2 years ago, #136, which claimed that this feature doesn't have a clean fix from the Python side of things.

    As such, I would like to ask:

    1. Has the implementation changed such that this feature can easily be added, e.g. as a keyword argument when running the sampling?
    2. What hacks are available to limit the number of hardware threads?

    I'm asking because running more chains than there are available hardware threads makes Stan run as many chains in parallel as it can and, once one chain is complete, another takes its place, as intended. However, setting an artificial upper bound might induce context switching between chains, because PyStan, or Stan, will probably think that more hardware threads are available to it. Is this something that can happen?

    Also, from #136, the name STAN_THREADS suggests it is an environment variable; however, changing it in the same terminal where I'm running the Python script doesn't seem to work, as 4 threads are still used when sampling the model.

    This is not something very critical, as I could just run the program with nice, but it is a feature I would like to see added nonetheless, for additional convenience.
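On Linux, another operating-system-level option besides nice is to restrict the set of CPUs the Python process may run on before sampling. The sketch below uses `os.sched_setaffinity`, a standard-library call that exists only on some platforms (Linux among them); `limit_cpus` is a hypothetical helper. Child processes inherit the affinity mask on Linux, but whether this gives the desired behavior with PyStan's sampling workers is an assumption that has not been verified here. It caps the CPUs used, not the number of chains or threads.

```python
import os


def limit_cpus(max_cpus: int) -> set:
    """Pin this process to at most `max_cpus` of its currently allowed CPUs."""
    if max_cpus < 1:
        raise ValueError("max_cpus must be at least 1")
    allowed = sorted(os.sched_getaffinity(0))  # 0 means the current process
    chosen = set(allowed[:max_cpus])
    os.sched_setaffinity(0, chosen)
    return chosen


if hasattr(os, "sched_setaffinity"):
    original = os.sched_getaffinity(0)
    chosen = limit_cpus(2)
    # ... build the model and draw samples here ...
    os.sched_setaffinity(0, original)  # restore the previous mask afterwards
else:
    # Platforms without CPU affinity support (e.g. macOS) are left untouched.
    original, chosen = set(), set()

print(chosen)
```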

    Also, when creating a new Github issue, three options show up:

    • Bug report
    • Documentation improvement
    • Question regarding PyStan and/or Stan
    • Blank issue

    In this case, this is a feature request, which I believe should be here rather than in the Stan forums. Please do correct me if I'm wrong.

    Thank you!

    wontfix 
    opened by jpmvferreira 2
Owner
Stan
Statsmodels: statistical modeling and econometrics in Python

About statsmodels statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics an

statsmodels 8k Dec 29, 2022
Python Library for learning (Structure and Parameter) and inference (Statistical and Causal) in Bayesian Networks.

pgmpy pgmpy is a python library for working with Probabilistic Graphical Models. Documentation and list of algorithms supported is at our official sit

pgmpy 2.2k Dec 25, 2022
Describing statistical models in Python using symbolic formulas

Patsy is a Python library for describing statistical models (especially linear models, or models that have a linear component) and building design mat

Python for Data 866 Dec 16, 2022
Statistical package in Python based on Pandas

Pingouin is an open-source statistical package written in Python 3 and based mostly on Pandas and NumPy. Some of its main features are listed below. F

Raphael Vallat 1.2k Dec 31, 2022
statDistros is a Python library for dealing with various statistical distributions

StatisticalDistributions statDistros statDistros is a Python library for dealing with various statistical distributions. Now it provides various stati

null 1 Oct 3, 2021
Probabilistic reasoning and statistical analysis in TensorFlow

TensorFlow Probability TensorFlow Probability is a library for probabilistic reasoning and statistical analysis in TensorFlow. As part of the TensorFl

null 3.8k Jan 5, 2023
Creating a statistical model to predict 10 year treasury yields

Predicting 10-Year Treasury Yields Initially, I wanted to see if the volatility in the stock market, represented by the VIX index (data source), had

null 10 Oct 27, 2021
Statistical Rethinking: A Bayesian Course Using CmdStanPy and Plotnine

Statistical Rethinking: A Bayesian Course Using CmdStanPy and Plotnine Intro This repo contains the python/stan version of the Statistical Rethinking

Andrés Suárez 3 Nov 8, 2022
First and foremost, we want dbt documentation to retain a DRY principle. Every time we repeat ourselves, we waste our time. Second, we want to understand column level lineage and automate impact analysis.

dbt-osmosis First and foremost, we want dbt documentation to retain a DRY principle. Every time we repeat ourselves, we waste our time. Second, we wan

Alexander Butler 150 Jan 6, 2023
Probabilistic Programming in Python: Bayesian Modeling and Probabilistic Machine Learning with Theano

PyMC3 is a Python package for Bayesian statistical modeling and Probabilistic Machine Learning focusing on advanced Markov chain Monte Carlo (MCMC) an

PyMC 7.2k Dec 30, 2022
A Python package for the mathematical modeling of infectious diseases via compartmental models

A Python package for the mathematical modeling of infectious diseases via compartmental models. Originally designed for epidemiologists, epispot can be adapted for almost any type of modeling scenario.

epispot 12 Dec 28, 2022
BioMASS - A Python Framework for Modeling and Analysis of Signaling Systems

Mathematical modeling is a powerful method for the analysis of complex biological systems. Although there are many researches devoted on produ

BioMASS 22 Dec 27, 2022
OpenDrift is a software for modeling the trajectories and fate of objects or substances drifting in the ocean, or even in the atmosphere.

opendrift OpenDrift is a software for modeling the trajectories and fate of objects or substances drifting in the ocean, or even in the atmosphere. Do

OpenDrift 167 Dec 13, 2022
We're Team Arson and we're using the power of predictive modeling to combat wildfires.

We're Team Arson and we're using the power of predictive modeling to combat wildfires. Arson Map Inspiration There’s been a lot of wildfires in Califo

Jerry Lee 3 Oct 17, 2021
Flood modeling by 2D shallow water equation

hydraulicmodel Flood modeling by 2D shallow water equation. Refer to Hunter et al (2005), Bates et al. (2010). Diffusive wave approximation Local iner

null 6 Nov 30, 2022
A real data analysis and modeling project - restaurant inspections

A real data analysis and modeling project - restaurant inspections Jafar Pourbemany 9/27/2021 This project represents data analysis and modeling of re

Jafar Pourbemany 2 Aug 21, 2022
BAyesian Model-Building Interface (Bambi) in Python.

Bambi BAyesian Model-Building Interface in Python Overview Bambi is a high-level Bayesian model-building interface written in Python. It's built on to

null 861 Dec 29, 2022
NumPy and Pandas interface to Big Data

Blaze translates a subset of modified NumPy and Pandas-like syntax to databases and other computing systems. Blaze allows Python users a familiar inte

Blaze 3.1k Jan 5, 2023
CaterApp is a cross platform, remotely data sharing tool created for sharing files in a quick and secured manner.

CaterApp is a cross platform, remotely data sharing tool created for sharing files in a quick and secured manner. It is aimed to integrate this tool with several more features including providing a User Interface.

Ravi Prakash 3 Jun 27, 2021