OMLT: Optimization and Machine Learning Toolkit

Overview

OMLT is a Python package for representing machine learning models (neural networks and gradient-boosted trees) within the Pyomo optimization environment. The package provides various optimization formulations for machine learning models (such as full-space, reduced-space, and MILP) as well as an interface to import sequential Keras and general ONNX models.
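
The choice of formulation can matter for tractability. As a sketch (class availability may vary across OMLT versions), a smooth full-space formulation and a big-M MILP formulation for ReLU networks are created the same way from a NetworkDefinition object net:

from omlt.neuralnet import FullSpaceNNFormulation, ReluBigMFormulation

#smooth NLP encoding of the network (activations as nonlinear equations)
nlp_formulation = FullSpaceNNFormulation(net)

#mixed-integer big-M encoding, applicable to ReLU networks
milp_formulation = ReluBigMFormulation(net)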

Please cite the preprint of this software package as:

@misc{ceccon2022omlt,
     title={OMLT: Optimization & Machine Learning Toolkit},
     author={Ceccon, F. and Jalving, J. and Haddad, J. and Thebelt, A. and Tsay, C. and Laird, C. D. and Misener, R.},
     year={2022},
     eprint={2202.02414},
     archivePrefix={arXiv},
     primaryClass={stat.ML}
}

Examples

import tensorflow
import pyomo.environ as pyo
from omlt import OmltBlock, OffsetScaling
from omlt.neuralnet import FullSpaceNNFormulation, NetworkDefinition
from omlt.io import load_keras_sequential

#load a Keras model
nn = tensorflow.keras.models.load_model('tests/models/keras_linear_131_sigmoid', compile=False)

#create a Pyomo model with an OMLT block
model = pyo.ConcreteModel()
model.nn = OmltBlock()

#the neural net contains one input and one output
model.input = pyo.Var()
model.output = pyo.Var()

#apply simple offset scaling for the input and output
scale_x = (1, 0.5)       #(mean,stdev) of the input
scale_y = (-0.25, 0.125) #(mean,stdev) of the output
scaler = OffsetScaling(offset_inputs=[scale_x[0]],
                       factor_inputs=[scale_x[1]],
                       offset_outputs=[scale_y[0]],
                       factor_outputs=[scale_y[1]])
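#note: OffsetScaling computes scaled = (value - offset)/factor for inputs
#and unscales outputs via value = scaled*factor + offset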

#provide bounds on the input variable (e.g. from training)
scaled_input_bounds = {0:(0,5)}

#load the keras model into a network definition
net = load_keras_sequential(nn, scaler, scaled_input_bounds)

#multiple formulations of a neural network are possible
#this uses the default FullSpaceNNFormulation object
formulation = FullSpaceNNFormulation(net)

#build the formulation on the OMLT block
model.nn.build_formulation(formulation)

#query inputs and outputs, as well as scaled inputs and outputs
model.nn.inputs
model.nn.outputs
model.nn.scaled_inputs
model.nn.scaled_outputs

#connect pyomo model input and output to the neural network
@model.Constraint()
def connect_input(mdl):
    return mdl.input == mdl.nn.inputs[0]

@model.Constraint()
def connect_output(mdl):
    return mdl.output == mdl.nn.outputs[0]

#solve an inverse problem to find the input that most closely matches an output value of 0.5
model.obj = pyo.Objective(expr=(model.output - 0.5)**2)
status = pyo.SolverFactory('ipopt').solve(model, tee=False)
print(pyo.value(model.input))
print(pyo.value(model.output))
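
ONNX models follow the same pattern. Below is a hedged sketch using the omlt.io helpers (the file name and onnx_model object are placeholders); the bounds use the same {index: (lower, upper)} dictionary form as above:

from omlt.io import write_onnx_model_with_bounds, load_onnx_neural_network_with_bounds

#persist an ONNX model together with its scaled input bounds, then load it
#back as a NetworkDefinition ready for a formulation
write_onnx_model_with_bounds('network.onnx', onnx_model, scaled_input_bounds)
net = load_onnx_neural_network_with_bounds('network.onnx')
formulation = FullSpaceNNFormulation(net)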

Development

OMLT uses tox to manage development tasks:

  • tox -av to list available tasks
  • tox to run tests
  • tox -e lint to check formatting and code styles
  • tox -e format to automatically format files
  • tox -e docs to build the documentation
  • tox -e publish to publish the package to PyPI

Contributors

GitHub     | Name              | Acknowledgements
jalving    | Jordan Jalving    | This work was funded by Sandia National Laboratories, Laboratory Directed Research and Development program.
fracek     | Francesco Ceccon  | This work was funded by an Engineering & Physical Sciences Research Council Research Fellowship [GrantNumber EP/P016871/1].
carldlaird | Carl D. Laird     | Initial work was funded by Sandia National Laboratories, Laboratory Directed Research and Development program. Current work supported by Carnegie Mellon University.
tsaycal    | Calvin Tsay       | This work was funded by an Engineering & Physical Sciences Research Council Research Fellowship [GrantNumber EP/T001577/1], with additional support from an Imperial College Research Fellowship.
thebtron   | Alexander Thebelt | This work was supported by BASF SE, Ludwigshafen am Rhein.

Comments
  • Adding time index to "ML Surrogates for Chemical Processes with OMLT"

    Discussed in https://github.com/cog-imperial/OMLT/discussions/78

    Originally posted by SaM-92 on May 13, 2022:

    Hello,

    Thank you for this nice library. I'd like to ask a question about adding a time index to the model. Can we add time (t) to the model, modify the input variable for each t, obtain the NN forecast for each t, and then maximize an objective function summed over 24 hours?

    I put it in the equation as below. So basically, what I want to do is solve the same problem over a period of, say, 24 hours.

    Thanks so much in advance! :)

    [equation image omitted]
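
    One possible pattern (a hedged, untested sketch, reusing the net object and FullSpaceNNFormulation from the README example above) is to build one OmltBlock per time period inside an indexed Pyomo Block:

    import pyomo.environ as pyo
    from omlt import OmltBlock
    from omlt.neuralnet import FullSpaceNNFormulation

    model = pyo.ConcreteModel()
    model.T = pyo.RangeSet(1, 24)

    #one surrogate instance per time period; a fresh formulation object is
    #built for each block (net is assumed to be in scope)
    def surrogate_rule(b, t):
        b.nn = OmltBlock()
        b.nn.build_formulation(FullSpaceNNFormulation(net))

    model.surrogate = pyo.Block(model.T, rule=surrogate_rule)

    #maximize the sum of the NN outputs over the 24-hour horizon
    model.obj = pyo.Objective(
        expr=sum(model.surrogate[t].nn.outputs[0] for t in model.T),
        sense=pyo.maximize,
    )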

    opened by SaM-92 5
  • The upper bound on z should always be at least 0

    In some cases, it is possible to compute an upper bound on zhat that is strictly negative. In such cases, setting the upper bound on z to be equal to the upper bound on zhat is incorrect and produces an infeasible model. Even if zhat is strictly negative, z can still be 0.
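
    In other words, since z = max(0, zhat), both bounds on z should be clipped at zero (a sketch; the variable names are illustrative):

    #zhat_lb and zhat_ub are the computed bounds on the pre-activation value
    z_lb = max(0.0, zhat_lb)
    z_ub = max(0.0, zhat_ub)  #stays 0 even when zhat_ub < 0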

    opened by michaelbynum 5
  • Variable type for NN bounds

    I am confused about what format of input bounds should be passed to OMLT with an ONNX model. Trying a list or a dictionary:

    lb = np.maximum(0, image - epsilon_infty)
    ub = np.minimum(1, image + epsilon_infty)
    input_bounds = [(float(l), float(u)) for l, u in zip(lb[0], ub[0])]
    

    or

    input_bounds = {}
    for i in range(28*28):
        input_bounds[i] = (float(lb[0][i]), float(ub[0][i])) 
    

    Attempting to create the omlt model with either:

    write_onnx_model_with_bounds(f.name, None, input_bounds)
    network_definition = load_onnx_neural_network_with_bounds(f.name)
    formulation = NeuralNetworkFormulation(network_definition)
    m = pyo.ConcreteModel()
    m.nn = OmltBlock()
    m.nn.build_formulation(formulation) 
    

    Results in: ValueError: Variable 'bounds' keyword must be a tuple or function

    opened by tsaycal 4
  • [WIP] New NetworkDefinition with CNN support

    This is the code I mentioned during our last meeting that changes how networks are defined. This is still a WIP but now I think I'm ready for feedback.

    A couple of things to notice:

    • I store layers in a graph so that we can visit them in topological order.
    • Layer inputs and outputs are multi-indexed.
    • Sometimes we need to "reshape" outputs to fit another node's input; for this reason I added an input_index_transformer (now I realise it should be input_index_mapper). This is useful with CNNs because the output of a conv layer needs to be reshaped for the dense layers. The input index mapper makes sure we don't need to introduce extra variables.
    • Since z and zhat are now multi-indexed, I scope them with a block for each layer. This has the added benefit that models are now much easier to debug.

    I didn't have time to check the tests so the best way to run an example is to run the _test.py script I included (and will delete for the final PR).

    opened by fracek 4
  • Support for max pooling layers

    This code adds support for formulating max pooling layers into OMLT, using the formulation in Anderson et al. (2020). Changes are made in the io directory to parse ONNX max pool nodes; in neuralnet/layer.py to add a layer class for arbitrary pooling operations; in neuralnet/layers/full_space.py to add the Anderson formulation of max pooling; and in the relevant test programs.
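
    For reference, here is a standard big-M sketch of z = max(x_0, ..., x_{k-1}) over bounded inputs (illustrative only; not necessarily the exact Anderson et al. (2020) formulation used in this PR):

    import pyomo.environ as pyo

    k = 3
    L, U = [-1.0, -2.0, 0.0], [1.0, 2.0, 3.0]  #hypothetical input bounds

    m = pyo.ConcreteModel()
    m.I = pyo.RangeSet(0, k - 1)
    m.x = pyo.Var(m.I, bounds=lambda m, i: (L[i], U[i]))
    m.z = pyo.Var(bounds=(max(L), max(U)))
    m.q = pyo.Var(m.I, within=pyo.Binary)  #q[i]=1 if x[i] attains the max

    #z is at least every input
    m.lower = pyo.Constraint(m.I, rule=lambda m, i: m.z >= m.x[i])
    #if q[i]=1 then z <= x[i]; otherwise the constraint is relaxed by big-M
    m.upper = pyo.Constraint(
        m.I, rule=lambda m, i: m.z <= m.x[i] + (max(U) - L[i]) * (1 - m.q[i])
    )
    m.one_max = pyo.Constraint(expr=sum(m.q[i] for i in m.I) == 1)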

    Legal Acknowledgement
    By contributing to this software project, I agree my contributions are submitted under the BSD license. I represent I am authorized to make the contributions and grant the license. If my employer has rights to intellectual property that includes these contributions, I represent that I have received permission to make contributions and grant the required license on behalf of that employer.

    opened by adi4656 3
  • add comments for math formulations

    I added comments and math formulations for all functions used to generate the gradient-boosted trees optimization model. Is there an easy way to test how these comments will look in the docs before merging?

    opened by ThebTron 3
  • make relu and convolution networks smaller

    This commit makes the relu network in neural_network_formulations.ipynb smaller. It also uses a smaller convolutional network in mnist_example_convolutional.ipynb. Both of these should find solutions with cbc in a few seconds. If cbc still hangs, it may be a separate issue.

    opened by jalving 3
  • update notebooks to pass tests

    This PR should hopefully get the notebook tests to pass. I updated all of the neural network formulations to use the final names we picked. I also had to make some adjustments to build_network.ipynb.

    opened by jalving 3
  • update io __init__.py with attempt import

    This PR uses the pyomo attempt_import to defer the import of dependent packages until they are used (or to check if the package exists). See src/omlt/dependencies.py. This should address issue #96.

    The function attempt_import offers a lot of extra functionality - not sure whether we need to use all of it.

    E.g., to check if keras (or onnx) is available, use: from omlt.dependencies import keras_available

    E.g., to import keras (or onnx) use: from omlt.dependencies import keras

    If you then try to use keras and it is not installed, Pyomo's attempt import will throw an exception.
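
    A minimal sketch of the underlying mechanism (using Pyomo's public API; the module name here is just an example):

    from pyomo.common.dependencies import attempt_import

    #the real import is deferred until the module is first used
    onnx, onnx_available = attempt_import('onnx')

    if onnx_available:
        model_proto = onnx.load('network.onnx')  #hypothetical file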

    Legal Acknowledgement
    By contributing to this software project, I agree my contributions are submitted under the BSD license. I represent I am authorized to make the contributions and grant the license. If my employer has rights to intellectual property that includes these contributions, I represent that I have received permission to make contributions and grant the required license on behalf of that employer.

    opened by jalving 2
  • Update tox.ini and setup.cfg

    This PR contains minor changes to configuration files to fix linting and notebook test failures that appeared due to dependency updates.

    Legal Acknowledgement
    By contributing to this software project, I agree my contributions are submitted under the BSD license. I represent I am authorized to make the contributions and grant the license. If my employer has rights to intellectual property that includes these contributions, I represent that I have received permission to make contributions and grant the required license on behalf of that employer.

    opened by jalving 2
  • Linting and Formatting

    Legal Acknowledgement
    By contributing to this software project, I agree my contributions are submitted under the BSD license. I represent I am authorized to make the contributions and grant the license. If my employer has rights to intellectual property that includes these contributions, I represent that I have received permission to make contributions and grant the required license on behalf of that employer.

    This PR makes many changes because it updates all of the code formatting. The main changes are:

    • All src and test files updated using isort and black.
    • setup.cfg now requires Python >= 3.7.
    • A docstring for OMLT in the outer __init__.py file.
    • Changes to tox.ini so that GitHub Actions should execute linting checks with flake8 and black.

    Note that most linting warnings are currently ignored; a separate issue will be opened to address flake8 stylistic requirements.

    opened by jalving 2
  • Solver gives wrong value for sklearn-inputs

    Hi, I tried making a simple notebook that optimizes the output of a single-input single-output sklearn gradient boosting model. The output variable seems to be ok, but the input variable (decision variable) is not. I've tested it with IPOPT and CBC. I am running this on Linux in GitHub Codespaces.

    It's hard to boil down my code to an MRE, so I've attached the full notebook. It is not that big.

    OMLT with sklearn.zip

    opened by viggotw 2
  • OMLT 1.0 missing required dependency

    I believe that there is a missing required dependency in the OMLT 1.0 release: onnx is listed as a "testing" dependency, but is required in order to import anything from omlt.io.

    Steps to reproduce:

    % pip install --user omlt
    Collecting omlt
      Downloading omlt-1.0-py2.py3-none-any.whl (29 kB)
    Requirement already satisfied: importlib-metadata in [...] (from omlt) (4.8.1)
    Requirement already satisfied: numpy in [...] (from omlt) (1.21.2)
    Requirement already satisfied: pyomo in [...] (from omlt) (6.4.3.dev0)
    Requirement already satisfied: networkx in [...] (from omlt) (2.6.3)
    Requirement already satisfied: typing-extensions>=3.6.4 in [...] (from importlib-metadata->omlt) (3.10.0.2)
    Requirement already satisfied: zipp>=0.5 in [...] (from importlib-metadata->omlt) (3.4.0)
    Requirement already satisfied: ply in [...] (from pyomo->omlt) (3.11)
    Installing collected packages: omlt
    Successfully installed omlt-1.0
    
    % python
    Python 3.7.12 (default, Nov 10 2021, 15:38:43)
    [GCC 4.8.5 20150623 (Red Hat 4.8.5-44)] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from omlt.io.keras_reader import load_keras_sequential
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "[...]/site-packages/omlt/io/__init__.py", line 1, in <module>
        from omlt.io.onnx import load_onnx_neural_network, write_onnx_model_with_bounds, load_onnx_neural_network_with_bounds
      File "[...]/site-packages/omlt/io/onnx.py", line 4, in <module>
        import onnx
    ModuleNotFoundError: No module named 'onnx'
    

    Possible solutions:

    • Don't import anything from omlt/io/__init__.py
    • Leverage a delayed import mechanism in omlt/io/onnx.py (e.g., something like pyomo.common.dependencies.attempt_import())
    • Add onnx as a required dependency
    opened by jsiirola 2
  • Input bounds documentation

    @fracek, we need documentation on write_input_bounds and load_input_bounds (both in https://github.com/cog-imperial/OMLT/blob/main/src/omlt/io/input_bounds.py).

    I need to clarify: does the following make sense, and (if it doesn't) would you please suggest an alternative?

    For write_input_bounds:

    """
    Write the specified input bounds to the given file.

    This implicitly assumes that all inputs are defined (no indices missing)
    and all indices are bounded.

    Parameters
    ----------
    input_bounds_filename: file
    input_bounds: dict or list
    """

    For load_input_bounds:

    """
    Read the input bounds from the given file.

    Parameters
    ----------
    input_bounds_filename: file
        The file should be a list of tuples with a key (index of the input),
        lower bound (real number), and upper bound (real number).
    """
    opened by rmisener 1
  • Support for scikit neural networks and renaming omlt.onnx.py

    This code adds an interface for scikit-learn MLPRegressor() objects via the sklearn2onnx library, plus support for scikit-learn scaling objects. All scikit-learn scaling objects that do linear scaling are supported, including StandardScaler, MaxAbsScaler, MinMaxScaler, and RobustScaler.

    Some changes were required to the onnx_parser to handle the different conventions created by sklearn2onnx such as the biases being stored as (1, n) matrices instead of (n,) vectors.
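
    For instance (an illustrative one-liner, not the parser's actual code), flattening such a bias with numpy:

    import numpy as np

    b = np.zeros((1, 8))  #sklearn2onnx-style bias of shape (1, n)
    b = np.ravel(b)       #the (n,) vector convention the parser expects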

    Additionally, since sklearn_reader.py imports the OMLT ONNX reader, that reader was renamed from onnx.py to onnx_reader.py: the name omlt.onnx conflicts with the global onnx library and causes a circular import issue.

    opened by joshuahaddad 1
  • Cleanup in activations (constraints and functions)

    There is some design work to do with the activation functions and constraints to gain more code reuse and expand the list of supported activations.

    This issue originally came from a PR comment by @jalving.

    add_constraint=True is not used here

    net_block and net are also not used in activation methods. we should discuss whether we need them. they might be there for consistency on the calling side.

    Originally posted by @jalving in https://github.com/cog-imperial/OMLT/pull/24#discussion_r784112237

    opened by carldlaird 0
  • Add checks for unsupported activation operation types

    onnx_parser.py (and possibly keras_reader.py) does not contain checks for unsupported operation types:

    === from PR comment

    will this use a linear activation if maybe_node.op_type is not in _ACTIVATION_OP_TYPES? this could lead to an incorrect neural network

    Originally posted by @jalving in https://github.com/cog-imperial/OMLT/pull/24#r784097245
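
    A sketch of the kind of guard suggested (the names here are illustrative, not the parser's actual variables):

    _ACTIVATION_OP_TYPES = {"Relu", "Sigmoid", "Tanh"}  #illustrative subset

    if maybe_node.op_type not in _ACTIVATION_OP_TYPES:
        raise ValueError(
            f"Unsupported activation op_type: {maybe_node.op_type}"
        )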

    opened by carldlaird 0
Owner
C⚙G - Imperial College London
Computational Optimisation Group @ Imperial College London
library for nonlinear optimization, wrapping many algorithms for global and local, constrained or unconstrained, optimization

NLopt is a library for nonlinear local and global optimization, for functions with and without gradient information. It is designed as a simple, unifi

Steven G. Johnson 1.4k Dec 25, 2022
Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit

CNTK Chat Windows build status Linux build status The Microsoft Cognitive Toolkit (https://cntk.ai) is a unified deep learning toolkit that describes

Microsoft 17.3k Dec 29, 2022
Racing line optimization algorithm in python that uses Particle Swarm Optimization.

Racing Line Optimization with PSO This repository contains a racing line optimization algorithm in python that uses Particle Swarm Optimization. Requi

Parsa Dahesh 6 Dec 14, 2022
A research toolkit for particle swarm optimization in Python

PySwarms is an extensible research toolkit for particle swarm optimization (PSO) in Python. It is intended for swarm intelligence researchers, practit

Lj Miranda 1k Dec 30, 2022
A simple and lightweight genetic algorithm for optimization of any machine learning model

geneticml This package contains a simple and lightweight genetic algorithm for optimization of any machine learning model. Installation Use pip to ins

Allan Barcelos 8 Aug 10, 2022
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.

Machine Learning From Scratch About Python implementations of some of the fundamental Machine Learning models and algorithms from scratch. The purpose

Erik Linder-Norén 21.8k Jan 9, 2023
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

This is the Vowpal Wabbit fast online learning code. Why Vowpal Wabbit? Vowpal Wabbit is a machine learning system which pushes the frontier of machin

Vowpal Wabbit 8.1k Jan 6, 2023
A toolkit for making real world machine learning and data analysis applications in C++

dlib C++ library Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real worl

Davis E. King 11.6k Jan 1, 2023
Reference implementation of code generation projects from Facebook AI Research. General toolkit to apply machine learning to code, from dataset creation to model training and evaluation. Comes with pretrained models.

This repository is a toolkit to do machine learning for programming languages. It implements tokenization, dataset preprocessing, model training and m

Facebook Research 408 Jan 1, 2023
MILK: Machine Learning Toolkit

MILK: MACHINE LEARNING TOOLKIT Machine Learning in Python Milk is a machine learning toolkit in Python. Its focus is on supervised classification with

Luis Pedro Coelho 610 Dec 14, 2022
Multi-Modal Machine Learning toolkit based on PyTorch.

Simplified Chinese | English — TorchMM overview: a multi-modal learning toolkit providing a library of joint-modal and cross-modal learning algorithms and models, offering efficient solutions for multi-modal data such as images and text and helping multi-modal learning applications reach production. Recent updates: 2022.1.5 — initial version v1.0 of TorchMM released. Features: rich task scenarios: tool

njustkmg 1 Jan 5, 2022
Multi-Modal Machine Learning toolkit based on PaddlePaddle.

Simplified Chinese | English — PaddleMM overview: the PaddlePaddle multi-modal learning toolkit PaddleMM provides a library of joint-modal and cross-modal learning algorithms and models, offering efficient solutions for multi-modal data such as images and text and helping multi-modal learning applications reach production. Recent updates: 2022.1.5 — initial version v1.0 of PaddleMM released. Features: rich task

njustkmg 520 Dec 28, 2022
Scripts of Machine Learning Algorithms from Scratch. Implementations of machine learning models and algorithms using nothing but NumPy with a focus on accessibility. Aims to cover everything from basic to advance.

Algo-ScriptML Python implementations of some of the fundamental Machine Learning models and algorithms from scratch. The goal of this project is not t

Algo Phantoms 81 Nov 26, 2022
This is a Machine Learning Based Hand Detector Project, It Uses Machine Learning Models and Modules Like Mediapipe, Developed By Google!

Machine Learning Hand Detector This is a Machine Learning Based Hand Detector Project, It Uses Machine Learning Models and Modules Like Mediapipe, Dev

Popstar Idhant 3 Feb 25, 2022
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).

Ilya Kostrikov 3k Dec 31, 2022
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

Master status: Development status: Package information: TPOT stands for Tree-based Pipeline Optimization Tool. Consider TPOT your Data Science Assista

Epistasis Lab at UPenn 8.9k Dec 30, 2022
Tutorial on active learning with the Nvidia Transfer Learning Toolkit (TLT).

Active Learning with the Nvidia TLT Tutorial on active learning with the Nvidia Transfer Learning Toolkit (TLT). In this tutorial, we will show you ho

Lightly 25 Dec 3, 2022