A Python module for parallel optimization of expensive black-box functions

What is this?

A minimalistic and easy-to-use Python module that efficiently searches for a global minimum of an expensive black-box function (e.g. optimal hyperparameters of a simulation, a neural network, or anything else that takes significant time to run). The user needs to provide a function, a search domain (ranges for each input parameter) and the total number of function calls available. The code scales well on multicore CPUs and clusters: all function calls are divided into batches and each batch is evaluated in parallel.

The mathematical method behind the code is described in this arXiv note (there have been a few recent updates to the method): https://arxiv.org/pdf/1605.00998.pdf

Don't forget to cite this note if you are using the method/code.

Demo

(a) demo function (unknown to the method).

(b) running the procedure using 15 evaluations.

(c) running the procedure using 30 evaluations.

Installation

pip3 install black-box

Objective function

The objective simply needs to be wrapped in a Python function.

def fun(par):
    ...
    return output

par is a vector of input parameters (a Python list); output is a scalar value to be minimized.
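
Expensive black boxes are often external programs rather than pure Python. As an illustration only (the ./simulate executable and its output format are hypothetical), such a program could be wrapped like this:

import subprocess


def fun(par):
    # Run a hypothetical external simulation with the parameters passed on its
    # command line, and read a single scalar value from its standard output.
    result = subprocess.run(['./simulate'] + [str(p) for p in par],
                            capture_output=True, text=True, check=True)
    return float(result.stdout.strip())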

Running the procedure

import black_box as bb


def fun(par):
    return par[0]**2 + par[1]**2  # dummy example


best_params = bb.search_min(f=fun,  # given function
                            domain=[  # ranges of each parameter
                                [-10., 10.],
                                [-10., 10.],
                            ],
                            budget=40,  # total number of function calls available
                            batch=4,  # number of calls that will be evaluated in parallel
                            resfile='output.csv')  # text file where results will be saved

Important:

  • All function calls are divided into batches and each batch is evaluated in parallel. The total number of batches is budget/batch. The value of batch should correspond to the number of available computational units.
  • An optional parameter executor = ... should be specified within bb.search_min() when a custom parallel engine is used (ipyparallel, dask.distributed, pathos, etc.). executor should be an object that has a map method; see the sketch below this list.
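
For instance, the standard-library ProcessPoolExecutor can serve as such an engine. This is a minimal sketch, assuming (as in the dask example quoted in the comments below) that executor is passed as a factory whose instances provide a map method:

from concurrent.futures import ProcessPoolExecutor

import black_box as bb


def fun(par):
    return par[0]**2 + par[1]**2  # same dummy example as above


best_params = bb.search_min(f=fun,
                            domain=[[-10., 10.], [-10., 10.]],
                            budget=40,
                            batch=4,
                            resfile='output.csv',
                            executor=ProcessPoolExecutor)  # factory for a map-capable executor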

Intermediate results

In addition to search_min() returning the list of optimal parameters, all trials are sorted by function value (best ones at the top) and saved to a text file with the following structure:

Parameter #1 Parameter #2 ... Parameter #n Function value
+1.6355e+01 -4.7364e+03 ... +6.4012e+00 +1.1937e-04
... ... ... ... ...
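
The sorted trials can also be loaded back for further analysis. A minimal sketch, assuming the result file has a single header row and a consistent delimiter:

import pandas as pd

# 'output.csv' is the resfile name used in the example above.
trials = pd.read_csv('output.csv', sep=None, engine='python')
best_trial = trials.iloc[0]  # rows are sorted by function value, best first
print(best_trial)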

Author

Paul Knysh ([email protected])

Feel free to email me if you have any questions or comments.

Comments
  • Add a new parameter 'executor' for custom parallelisation engine

    New feature: ability to provide a custom executor (an object that behaves like mp.Pool, with a context-manager behaviour and a map method).

    Example using the dask.distributed tool:

    In [1]: import blackbox as bb
       ...: import numpy as np
       ...: from distributed import Client
       ...: 
       ...: c = Client("localhost:8786")
       ...: 
       ...: def fun(par):
       ...:     a, b = par
       ...:     return np.sqrt((a - 5) ** 2 + (b - 3) ** 2)
       ...: 
       ...: 
       ...: def main():
       ...:     bb.search(f=fun,  # given function
       ...:               box=[[-10., 10.], [-10., 10.]],  # range of values for each parameter
       ...:               n=4,  # number of function calls on initial stage (global search)
       ...:               m=4,  # number of function calls on subsequent stage (local search)
       ...:               batch=8,  # number of calls that will be evaluated in parallel
       ...:               resfile='output.csv',  # text file where results will be saved
       ...:               executor=c.get_executor)
       ...: 
       ...: 
       ...: if __name__ == '__main__':
       ...:     main()
    

    That snippet launches the computation on the dask workers, which can run on different computers on the network.

    opened by celliern 20
  • TypeError: unsupported operand type(s) for /: 'NoneType' and 'float'

    Thank you for providing this very useful toolbox!

    When I try to use it for my optimization problem, it starts out quite well but it raises an error after having evaluated around half of the batches:

    File "/Users/philipp/miniconda3/lib/python3.6/site-packages/black_box/blackbox.py", line 106, in search_min
      points[n+batch*i:n+batch*(i+1), -1] = list(e.map(f, list(map(cubetobox, points[n+batch*i:n+batch*(i+1), 0:-1]))))/fmax
    TypeError: unsupported operand type(s) for /: 'NoneType' and 'float' 
    
    

    It seems that the map somehow returns a NoneType. I couldn't find the reason for it. Do you have any suggestions?

    Unfortunately, I cannot provide an MWE, since my cost function is a rather advanced black box. The function is called with:

    fit_par = bb.search_min(f=cost_function, domain=bounds_list, budget=50, batch=4, resfile = 'bb_output.csv')

    I tried this on Linux and macOS machines. I am working in an Anaconda environment with Python 3.7.9 and black-box 1.0.2.

    Upon further investigation I discovered that there is a NaN appearing at the last position of all the sublists in the points list. Could this be the problem, and how could it have happened?

    For completeness, my domain is

    [[1000000000.0, 30000000000.0],
     [1000000000.0, 30000000000.0],
     [2e+18, 5e+18],
     [-1e-07, 3e-07],
     [10000.0, 100000.0],
     [-5000.0, 5000.0],
     [10000.0, 100000.0],
     [-5000.0, 5000.0]]
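
    One likely cause, judging from the traceback, is the objective returning None (or NaN) for some inputs. A hedged sketch of a guard wrapper around the reporter's cost_function that fails loudly instead of letting the bad value propagate:

    def checked_cost_function(par):
        # Wrap the objective so that a missing or non-finite return value raises
        # immediately instead of propagating None/NaN into the batch results.
        value = cost_function(par)
        if value is None or value != value:  # value != value is True only for NaN
            raise ValueError("cost_function returned {!r} for parameters {}".format(value, par))
        return float(value)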
    
    opened by PhilippAumann 7
  • Packaged blackbox using a setup.py

    Hello,

    I found your module pretty useful, so I packaged it with a setup.py so that it can be installed using python setup.py install

    I named the package blackboxfunctions. If you don't like the name you can change it, or I can make another pull request with a name you prefer. I could not name it blackbox because a package with that name already exists, so there would be conflicts in case you want to upload this to PyPI.

    I also moved blackbox.py under the blackboxfunctions directory and created an __init__.py that imports everything, so it can be used as

    import blackboxfunctions as bb

    Also I created a .gitignore

    You can check the module information in setup.py. I specified Paul Knysh as the author and used the email I found on his GitHub page for the author_email field (not sure if you like this).

    Lastly, I changed the way blackbox prints output by using Python's logging module: logger.error for error messages, logger.warning for warnings and logger.info for info messages. This way the user can control the output levels of the blackbox module through logging.

    I assumed you would like to keep Python 2.7 support even though it is being deprecated, so I made sure it works for both versions. I tested the setup.py and the module itself against Python 2.7 and Python 3.8, and they work as expected.

    I also have a branch in my blackbox repository named logging (https://github.com/tchar/blackbox/tree/logging) whose only change is the use of Python's logging module, without rearranging the structure, no setup.py and no renaming. So if you don't want the setup.py changes, let me know and I will make a pull request just for the logging change.

    Please let me know what you think of the changes

    Cheers, tchar

    opened by tchar 6
  • Parameters constraints

    I'm looking for an adaptation of your method that allows joint constraints on parameters, not just individual ranges for each parameter. For instance:

    param1 + param2 + param3 < 1
    0 < param1 < 1
    0 < param2 < 1
    0 < param3 < 1

    I can constrain the ranges of param1, param2 and param3 to [0, 1], but how can I constrain the sum?

    Thank you
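
    One common workaround (not a feature of the module) is to fold the joint constraint into the objective as a penalty; a hedged sketch, where fun is the original unconstrained objective:

    def constrained_fun(par):
        # Penalize points violating param1 + param2 + param3 < 1 so the search is
        # steered back into the feasible region; the penalty should dominate
        # typical objective values.
        if par[0] + par[1] + par[2] >= 1:
            return 1e6
        return fun(par)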

    opened by ludovicdmt 6
  • Comparing to naive optimization

    For a high-dimensional problem, one naive approach is to calculate all the partial derivatives in parallel and then apply SGD or another algorithm. Have you compared this naive approach with your method, say in terms of the number of function calls and convergence rate?

    opened by Hugo-Leung 4
  • New latin(n,d)

    Hello! I thought you might be interested in a direct way of generating points. It uses a low-discrepancy quasirandom sequence (with a seed) to fill the hypercube faster than minimizing the spread function, and with a better spread (24k vs 27k with n=300, d=22). From http://extremelearning.com.au/unreasonable-effectiveness-of-quasirandom-sequences/ Regards, Andrii
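
    For reference, the R_d additive-recurrence sequence from that article can be generated in a few lines; a hedged sketch (not necessarily the exact code of this pull request):

    import numpy as np


    def quasirandom_points(n, d, seed=0.5):
        """Fill the d-dimensional unit hypercube with n points of the R_d sequence."""
        # phi is the unique positive root of x**(d + 1) = x + 1 (the generalized
        # golden ratio), found here by fixed-point iteration.
        phi = 2.0
        for _ in range(50):
            phi = (1.0 + phi) ** (1.0 / (d + 1))
        alpha = phi ** -np.arange(1, d + 1)
        return (seed + np.outer(np.arange(1, n + 1), alpha)) % 1.0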

    opened by lightash 4
  • Formatting and docstring changes

    Hello, thank you for publishing your project! I'm no optimization expert, but I was interested in learning about your parallel optimization procedure.

    I reformatted the code into a more conventional Python style (see the scipy.optimize module for example). This will make the code more readable to the average user. I followed the numpy documentation guide for docstrings (https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt).

    I also added a script (example.py) with an example from your arXiv note.

    Best regards, Drew

    opened by lancastdw 4
  • Comparison with scipy optimize

    Hello,

    Do you have a comparison with scipy's minimize function (https://docs.scipy.org/doc/scipy-0.18.1/reference/generated/scipy.optimize.minimize.html#scipy.optimize.minimize), maybe even using the different methods?

    I have created a simulation (https://github.com/pedvide/simetuc) that takes from 1s to maybe 1000s to solve. Optimization of the few (1-5) free parameters is painfully slow for the longer simulations, so I'm interested in any improvement!

    opened by pedvide 3
  • Input an initial guess

    Hello Paul,

    Thank you for this nice piece of code :+1: Is it possible to feed the algorithm an initial guess? Or can you think of some hints on how to implement this?

    opened by CharlyEmpereurmot 1
Licence

    I would like to use Blackbox as part of my reactive-transport modeling tool for calibration. How would you like me to cite you in my GitHub project?

    opened by pycckuu 1
  • let the user choose the pool object

    It could be a good idea to have a pool argument which takes an object with a map method. That way, it would be easy to use other multiprocessing tools like ipyparallel, dask.distributed or pathos.

    See abcpmc for a similar implementation.

    opened by celliern 1
  • Sample code fails with TypeError

    The sample code from the readme fails on my Python 3.10.5 install.

    result = bb.minimize(f=fun, # given function
        domain=[[-5, 5], [-5, 5]], # ranges of each parameter
        budget=20, # total number of function calls available
        batch=4 # number of calls that will be evaluated in parallel
    )
    print(result["best_x"])
    print(result["best_f"])
    

    Leads to:

    INFO     evaluating batch 1/5 (samples 1..4/20) 10-06 10:54:00
    INFO     evaluating batch 2/5 (samples 5..8/20) 10-06 10:54:04
    INFO     evaluating batch 3/5 (samples 9..12/20) 10-06 10:54:09
    Traceback (most recent call last):
      File "Optimierung_Logistikkosten_BlackBox.py", line 273, in <module>
        main()
      File "Optimierung_Logistikkosten_BlackBox.py", line 208, in main
        result = bb.minimize(f=fun, # given function
      File "\blackbox\blackbox.py", line 86, in minimize
        v1 = compute_volume_unit_ball(d)
      File "\blackbox\blackbox.py", line 165, in compute_volume_unit_ball
        v1 = np.pi ** (d / 2) / np.math.factorial(d / 2)
    TypeError: 'float' object cannot be interpreted as an integer
    

    Note that I have redacted the path information.

    opened by JSchoeck 1
  • insure int type for math.factorial

    https://bugs.python.org/issue37315: accepting floats in math.factorial was removed. Since d is either even or manipulated to be even, it should be safe to convert it to an integer. As noted at https://stackoverflow.com/questions/21753841/factorial-in-numpy-and-scipy, math.factorial and np.math.factorial are the same thing.
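
    The idea boils down to casting the (even) half-dimension to an integer before calling the factorial on the compute_volume_unit_ball line quoted in the previous comment's traceback; a sketch:

    import math

    import numpy as np


    def compute_volume_unit_ball(d):
        # d is even (or made even upstream), so d / 2 is whole; cast to int because
        # math.factorial no longer accepts floats (bpo-37315).
        return np.pi ** (d / 2) / math.factorial(int(d // 2))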

    opened by ll01 1
  • Cannot pass function arguments.

    Hello Paul.

    Is it possible to pass function arguments to the solver? For example, my function is defined as:

    def func(x,*parms):

    And then with the scipy minimize I can pass the parameters with the args= argument:

    minimize(func, Initial_Guesses, method = 'SLSQP', bounds = Bounds, args=Parms )

    I looked at the code but did not see a way to pass this argument. Thanks
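
    One workaround that needs no changes to the module is to bind the extra arguments before handing the function to the solver; a hedged sketch with a toy objective:

    from functools import partial

    import black_box as bb


    def func(x, a, b):
        # Toy objective that also depends on fixed settings a and b.
        return (x[0] - a) ** 2 + (x[1] - b) ** 2


    # Bind the fixed settings so the solver only ever calls fixed_func(x).
    fixed_func = partial(func, a=2.0, b=-1.0)

    best = bb.search_min(f=fixed_func,
                         domain=[[-10., 10.], [-10., 10.]],
                         budget=40, batch=4, resfile='output.csv')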

    opened by SSimonTemplar 1
  • Integer / categorical parameters

    Is there a way to limit parameters to integers only? I'm currently rounding the parameter in the objective function, but this leads to the same value being evaluated multiple times.
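
    One way to at least avoid re-running the expensive model for repeated integer points is to round inside a thin wrapper and memoize it; a hedged sketch (fun is the real objective; with multiprocessing the cache only helps within a single worker process):

    from functools import lru_cache


    @lru_cache(maxsize=None)
    def _evaluate_int(int_par):
        return fun(list(int_par))  # fun is the real, expensive objective


    def integer_fun(par):
        # Round to the nearest integers and memoize, so identical integer points
        # are not re-evaluated (within the same process).
        return _evaluate_int(tuple(int(round(p)) for p in par))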

    opened by beojan 1
  • added processes arg to executor call

    Using a distributed job-submission platform to submit the function evaluation calls to a cluster, it seemed that the number of evaluations run in parallel was lower than what I set with the "batch" parameter. Reading through the Python docs for multiprocessing.Pool, the default for the processes argument is the number of cores on the local machine. Setting processes=batch solved the problem, allowing the requested number of evaluations to run in parallel.
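
    In code the change is roughly the following (a sketch of the idea, not the exact diff):

    import multiprocessing as mp

    batch = 8  # number of parallel evaluations requested from the optimizer

    # mp.Pool() defaults to os.cpu_count() worker processes; passing
    # processes=batch makes the pool match the requested batch size instead.
    with mp.Pool(processes=batch) as pool:
        results = pool.map(abs, range(batch))  # stand-in for mapping f over one batch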

    opened by stevenmikes 2