Climin is a Python package for optimization, heavily biased to machine learning scenarios

Overview

climin

climin is a Python package for optimization, heavily biased towards machine learning scenarios, distributed under the BSD 3-clause license. It works on top of numpy and (partially) gnumpy.

The project was started in winter 2011 by Christian Osendorfer and Justin Bayer. Since then, Sarah Diot-Girard, Thomas Rueckstiess and Sebastian Urban have contributed. If you use climin in your (academic) work, please cite as (tech report is in preparation):

J. Bayer, C. Osendorfer, S. Diot-Girard, T. Rückstiess and S. Urban. climin - A pythonic framework for gradient-based function optimization. TUM Tech Report. 2016. http://github.com/BRML/climin

Important links

Dependencies

The software is tested under Python 2.7 with numpy 1.10.4 and scipy 0.17. The tests are run with nosetests.

Installation

Use git to clone the official repository; then run pip install --user -e . in the clone to install climin into your local user space.

Testing

From the download directory run nosetests tests/.

Comments
  • Improve minibatch iteration

    • implement MinibatchIterator class - a reusable iterator over minibatches
    • return an instance of MinibatchIterator from minibatches fun
    • the new iterator creates 1 minibatch at a time saving GPU memory
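    Such an iterator might be sketched as follows (a hypothetical sketch; the class name and constructor are assumptions, and the data can be anything indexable along its first axis):

```python
class MinibatchIterator:
    """Reusable iterator over minibatches (hypothetical sketch).

    Only one minibatch slice is produced at a time, so the full set of
    minibatches is never materialized at once; that is what saves GPU
    memory when the slices are views into a gnumpy array.
    """

    def __init__(self, data, batch_size):
        self.data = data
        self.batch_size = batch_size

    def __iter__(self):
        # yield consecutive slices; for numpy/gnumpy arrays these are views
        for start in range(0, len(self.data), self.batch_size):
            yield self.data[start:start + self.batch_size]
```

    Because __iter__ is a generator method, the same instance can be iterated over again for every pass over the data.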
    opened by akosiorek 6
  • Remove stop

    I just realized that if we never calculate more than needed in the optimizer's loop (because it can be done from the outside), we don't actually need the stop functionality. Yields are rather fast (compared to model evaluations). This would make the code a lot simpler.

    Any objections?

    opened by bayerj 6
  • Stopping criteria and a convenience function to optimize

    Several reasons might lead you to stop optimization:

    1. the gradient is 0,
    2. the change of the parameters is negligible,
    3. a finite amount of time has passed,
    4. a desired error has been reached,
    5. a finite amount of function/gradient evaluations has been done,
    6. a finite amount of iterations has been done.

    It would be nice to have convenience functions for this. Most are easy, but e.g. 2. needs to keep track of previous values; the stopping criterion is thus stateful. I have a feeling we will overshoot if we try to solve all of these.
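    For illustration, criterion 2 could be a small stateful callable (a hypothetical sketch; the name and interface are made up here):

```python
class SmallParameterChange:
    """Stateful stopping criterion (hypothetical sketch).

    Remembers the previous parameter vector and fires once the largest
    absolute per-parameter change between two calls drops below tol.
    """

    def __init__(self, tol=1e-8):
        self.tol = tol
        self.previous = None

    def __call__(self, parameters):
        if self.previous is None:
            # first call: nothing to compare against yet
            self.previous = list(parameters)
            return False
        change = max(abs(p - q) for p, q in zip(parameters, self.previous))
        self.previous = list(parameters)
        return change < self.tol
```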

    opened by bayerj 5
  • climin now runs on both Python 2.7 and Python 3.4

    The tests in test_rprop.py fail in the master version. I've verified that the failure is unrelated to the compatibility fixes.

    Also, it seems the builds on Travis CI fail with timeout. I can alter .travis.yml to use conda for NumPy/SciPy. This should significantly speed up the installation step. What do you think?

    opened by superbobry 3
  • Initial step rate for rprop

    Introducing an initial step rate, as recommended for iRprop-, fixes the slow start. In some cases it was barely usable, as getting from min_step to a decent step rate takes forever. I also removed the changes_max parameter, as it was never used.

    opened by makroiss 3
  • potential bug in adadelta

    @bayerj It seems there is a bug in adadelta.py when momentum is used. The momentum correction can be applied to adadelta, rmsprop and other stochastic updates. The potential bug is at line 110 of adadelta.py:

        def _iterate(self):
            for args, kwargs in self.args:
                step_m1 = self.step
                d = self.decay
                o = self.offset
                m = self.momentum
                step1 = step_m1 * m * self.step_rate
                self.wrt -= step1
    
                gradient = self.fprime(self.wrt, *args, **kwargs)
    
                self.gms = (d * self.gms) + (1 - d) * gradient ** 2
                step2 = sqrt(self.sms + o) / sqrt(self.gms + o) * gradient * self.step_rate
                self.wrt -= step2
    
                self.step = step1 + step2
                self.sms = (d * self.sms) + (1 - d) * self.step ** 2
    
                self.n_iter += 1
    
                yield {
                    'n_iter': self.n_iter,
                    'gradient': gradient,
                    'args': args,
                    'kwargs': kwargs,
                }
    

    I think it should be step1 = step_m1 * m instead of step1 = step_m1 * m * self.step_rate. Correct me if I am wrong.

    Note that line 160 of rmsprop.py is

     step1 = step_m1 * self.momentum
    

    which is correct.
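    One way to sanity-check the proposed fix is to run the update rule with step1 = step_m1 * m on a toy quadratic. The following is a self-contained sketch of that experiment, not climin's actual code (sqrt here is math.sqrt):

```python
import math

def adadelta_momentum(fprime, x, n_steps, step_rate=0.1, decay=0.9,
                      offset=1e-4, momentum=0.9):
    # adadelta with Nesterov-style momentum, using the proposed
    # correction step1 = step_m1 * momentum (no extra step_rate factor)
    gms = sms = 0.0
    step = 0.0
    for _ in range(n_steps):
        step1 = step * momentum
        x -= step1
        g = fprime(x)
        gms = decay * gms + (1 - decay) * g ** 2
        step2 = math.sqrt(sms + offset) / math.sqrt(gms + offset) * g * step_rate
        x -= step2
        step = step1 + step2
        sms = decay * sms + (1 - decay) * step ** 2
    return x

# minimizing f(x) = x**2 from x = 1.0: a few iterations move x toward 0
x_final = adadelta_momentum(lambda x: 2 * x, 1.0, 5)
```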

    opened by yorkerlin 2
  • Installing climin on Windows x64, Anaconda 2.5 and Python 2.7

    Let me point out that with Anaconda 2.5, Python 2.7 and Windows x64, after installing climin with pip, it is necessary to comment out this section in climin's __init__.py:

    if sys.platform == 'win32':
        basepath = imp.find_module('numpy')[1]
        ctypes.CDLL(os.path.join(basepath, 'core', 'libmmd.dll'))
        ctypes.CDLL(os.path.join(basepath, 'core', 'libifcoremd.dll'))

    Then climin runs superbly.

    Today 64-bit machines are probably the majority and 32-bit machines an exception.

    opened by finmod 1
  • Make OnSignal work on Windows

    There is an issue with Fortran libraries replacing signal handlers and not being able to recover them:

    http://stackoverflow.com/questions/15457786/ctrl-c-crashes-python-after-importing-scipy-stats

    The solution is to add the following to climin/__init__.py:

    if sys.platform == 'win32':
      basepath = imp.find_module('numpy')[1]
      ctypes.CDLL(os.path.join(basepath, 'core', 'libmmd.dll'))
      ctypes.CDLL(os.path.join(basepath, 'core', 'libifcoremd.dll'))
    

    And then extend OnSignal with

    import win32api
    win32api.SetConsoleCtrlHandler(self._console_ctrl_handler, 1)
    

    in the Windows case.

    After that, climin has to be imported before scipy by the user.

    opened by bayerj 1
  • adjusted TimeElapsed for interruptions

    Previously, TimeElapsed would simply calculate time from its instantiation, which is problematic if the experiment is interrupted thereafter. Now it can read the info dict, which can hold a more meaningful runtime value.
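    A criterion along those lines might look like this (a hypothetical sketch of the idea, not climin's actual class):

```python
import time

class TimeElapsed:
    """Stop once a time budget is used up (hypothetical sketch).

    Prefers a 'runtime' entry from the optimizer's info dict, which can
    survive interruptions, over wall-clock time since instantiation.
    """

    def __init__(self, max_seconds):
        self.max_seconds = max_seconds
        self.start = time.time()

    def __call__(self, info):
        runtime = info.get('runtime', time.time() - self.start)
        return runtime >= self.max_seconds
```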

    opened by msoelch 1
  • Update adam

    The decay parameter is no longer included in the original Adam paper (it was apparently only needed for a convergence proof), so it is set to None and a warning is issued if the user deliberately sets it to another value. This also fixes a bug in the (Nesterov) momentum step and includes an optimization mentioned in the Adam paper.

    opened by Wiebke 0
  • Update signal handler fix for newer numpy/Anaconda version

    A newer Anaconda version (tested with 4.1.5) comes with numpy 1.10.4, which uses newer Fortran compilers. The old fix fails, but the newly introduced flag 'FOR_DISABLE_CONSOLE_CTRL_HANDLER' (Intel Fortran >= 16.0) can be used instead to keep Ctrl+C/Break from crashing.

    opened by Wiebke 0
  • Support for complex numbers

    Optimization schemes over complex numbers are widely used in physics and, more recently, in machine learning.

    I strongly suggest adding support for complex numbers to optimization engines like RmsProp and others.

    We just need a few lines of change and several tests.

    E.g. climin/rmsprop.py, lines 165-167:

                self.moving_mean_squared = (
                    self.decay * self.moving_mean_squared
                    + (1 - self.decay) * gradient ** 2) 
                --> + (1 - self.decay) * np.abs(gradient) ** 2)
    

    A single line of change would make it applicable for complex numbers.
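    The difference matters because squaring a complex gradient is not the same as squaring its magnitude; only the latter yields the real, non-negative quantity a mean square needs:

```python
g = 3 + 4j            # one entry of a complex gradient
print(g ** 2)         # (-7+24j): complex, unusable as a mean square
print(abs(g) ** 2)    # 25.0: the real magnitude squared
```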

    The same is true for Adam and Adadelta.

    On the other hand, GradientDescent already works well without any change.

    A bit more effort may be needed for Rprop; I have no clue yet how to make it compatible with complex numbers, since the sign function is ill-defined for them.

    opened by GiggleLiu 1
  • GD does not accept sequence for `step_rate`

    Contrary to what the docstring says, gradient descent does not accept a sequence for the step_rate parameter.
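    One way to support both a scalar and a sequence is a small normalizing helper (hypothetical, not climin's actual code):

```python
import itertools

def make_step_rate_schedule(step_rate):
    """Turn a scalar into an endless constant schedule; pass sequences through."""
    try:
        return iter(step_rate)              # already iterable: use it as the schedule
    except TypeError:
        return itertools.repeat(step_rate)  # scalar: repeat it forever
```

    The optimizer would then simply call next(schedule) once per iteration, regardless of what the user passed.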

    I'm happy to submit a pull request for this, if there's still interest!

    opened by markvdw 0
  • Add climin to PyPI?

    Hi, I'd love to see this package in PyPI. Any plans to do so? It would help installing the package and declaring it as a dependency.

    Just to reserve the package name and to test if things work, I registered and uploaded the most recent version to PyPI. I will either remove the package from PyPI or move the PyPI package ownership to you, whatever you wish.

    In order to use PyPI, you'd need to fix version numbering to follow PEP 0440: https://www.python.org/dev/peps/pep-0440/. Thus, you would need to change the version number in setup.py to something like 0.1a1, 0.1b4, 0.1rc2 etc.

    Uploading to PyPI can be done as python setup.py sdist upload.

    If you are interested in this and need any help, I'd be happy to help if I can.

    opened by jluttine 0
  • initial value used in rmsprop

    @bayerj Line 152 of rmsprop.py is

            self.moving_mean_squared = 1
    

    I think the initial value should be 0 instead of 1. Any reason why 1 is better than 0?
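    The effect is easy to see in the very first update of the moving average (decay and gradient values here are chosen for illustration):

```python
d, g = 0.9, 0.1  # decay and a small first gradient

# first update of moving_mean_squared for each choice of initial value
from_one = d * 1 + (1 - d) * g ** 2   # about 0.901, dominated by the arbitrary 1
from_zero = d * 0 + (1 - d) * g ** 2  # about 0.001, reflecting only the observed gradient
```

    Since the step divides by the square root of this quantity, an initial value of 1 damps the first steps whenever gradients are small.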

    opened by yorkerlin 2
Owner
Biomimetic Robotics and Machine Learning at Technische Universität München