MILK: Machine Learning Toolkit


Machine Learning in Python

Milk is a machine learning toolkit in Python.

Its focus is on supervised classification with several classifiers available: SVMs (based on libsvm), k-NN, random forests, decision trees. It also performs feature selection. These classifiers can be combined in many ways to form different classification systems.

For unsupervised learning, milk supports k-means clustering and affinity propagation.
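
To make the clustering step concrete, here is a minimal pure-Python sketch of k-means (Lloyd's algorithm). It is not milk's implementation, which is written in C++; the function name, its signature, and the optional `centroids` argument for user-supplied starting points are assumptions made for this illustration only.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iter=100, centroids=None, seed=0):
    """Illustrative Lloyd's algorithm; returns (assignments, centroids)."""
    if centroids is None:
        # Pick k distinct input points as the starting centroids.
        centroids = random.Random(seed).sample(points, k)
    centroids = [list(c) for c in centroids]
    assignments = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the result is stable
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / float(len(members)) for col in zip(*members)]
    return assignments, centroids

points = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]]
assignments, centroids = kmeans(points, 2, centroids=[[0.0, 0.0], [5.0, 5.0]])
print(assignments)  # [0, 0, 0, 1, 1, 1]
```

Passing explicit starting centroids makes the run deterministic; leaving them out falls back to a seeded random choice of input points.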

Milk is flexible about its inputs. It is optimised for numpy arrays, but can often handle anything (for example, for SVMs, you can use any datatype and any kernel and it does the right thing).
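
As a rough illustration of what "any datatype and any kernel" can mean, a kernel is just a function that maps two examples to a similarity score, so the examples need not be numeric arrays. The sketch below uses plain Python only; `shared_char_kernel` and `gram_matrix` are hypothetical helpers, and how a kernel is actually handed to milk's SVM code is not shown here.

```python
def shared_char_kernel(a, b):
    """Toy kernel on strings: number of distinct characters they share.

    This equals the dot product of the two strings' character-indicator
    vectors, so it is a valid (positive semi-definite) kernel.
    """
    return float(len(set(a) & set(b)))

def gram_matrix(examples, kernel):
    """Evaluate the kernel on every pair of examples."""
    return [[kernel(a, b) for b in examples] for a in examples]

words = ["milk", "silk", "numpy"]
gram = gram_matrix(words, shared_char_kernel)
for row in gram:
    print(row)
# [4.0, 3.0, 1.0]
# [3.0, 4.0, 0.0]
# [1.0, 0.0, 5.0]
```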

There is a strong emphasis on speed and low memory usage. Therefore, most of the performance-sensitive code is written in C++ and exposed through Python interfaces for convenience.

To learn more, check the docs at http://packages.python.org/milk/ or the code demos included with the source at milk/demos/.

Examples

Here is how to test how well you can classify some (features, labels) data, with accuracy measured by cross-validation:

import numpy as np
import milk

# 2D array of features: 100 examples of 10 features each
features = np.random.rand(100, 10)
labels = np.zeros(100)
# Shift the second half of the examples so the two classes differ
features[50:] += .5
labels[50:] = 1
# Cross-validate the default classifier and report the overall accuracy
confusion_matrix, names = milk.nfoldcrossvalidation(features, labels)
print('Accuracy:', confusion_matrix.trace() / float(confusion_matrix.sum()))
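
The accuracy expression above is the sum of the confusion matrix's diagonal divided by the sum of all its entries. The following pure-Python sketch shows that arithmetic on a tiny hand-made example. The helper names are hypothetical, and putting the true class on rows is an assumption of this sketch (the trace is the same either way).

```python
def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Count how often each true class (row) received each prediction (column)."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix

def accuracy(matrix):
    """Fraction of examples on the diagonal, i.e. trace / total."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / float(total)

true_labels = [0, 0, 0, 1, 1, 1]
predicted = [0, 0, 1, 1, 1, 0]
cm = confusion_matrix(true_labels, predicted, 2)
print(cm)  # [[2, 1], [1, 2]]
print(accuracy(cm))  # 4 correct out of 6
```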

If you want to use a classifier, you instantiate a learner object and call its train() method:

import numpy as np
import milk
features = np.random.rand(100,10)
labels = np.zeros(100)
features[50:] += .5
labels[50:] = 1
learner = milk.defaultclassifier()
model = learner.train(features, labels)

# Now you can use the model on new examples:
example = np.random.rand(10)
print(model.apply(example))
# This example is shifted in the same way as the second class
example2 = np.random.rand(10)
example2 += .5
print(model.apply(example2))

There are several classification methods in the package, but they all use the same interface: train() returns a model object, which has an apply() method to execute on new instances.
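
To show the shape of that interface without depending on milk itself, here is a small self-contained sketch of a learner whose train() returns a model with an apply() method. `NearestMeanLearner` and `NearestMeanModel` are hypothetical classes written for this illustration; they are not part of milk.

```python
class NearestMeanModel(object):
    """Trained model: stores one mean vector per label."""

    def __init__(self, means):
        self.means = means  # dict mapping label -> mean feature vector

    def apply(self, example):
        """Return the label whose mean is closest to the example."""
        def sq_dist(mean):
            return sum((x - m) ** 2 for x, m in zip(example, mean))
        return min(self.means, key=lambda label: sq_dist(self.means[label]))


class NearestMeanLearner(object):
    """Learner: train() consumes data and returns a model object."""

    def train(self, features, labels):
        sums, counts = {}, {}
        for row, label in zip(features, labels):
            if label not in sums:
                sums[label] = [0.0] * len(row)
                counts[label] = 0
            sums[label] = [s + x for s, x in zip(sums[label], row)]
            counts[label] += 1
        means = dict(
            (label, [s / counts[label] for s in sums[label]]) for label in sums
        )
        return NearestMeanModel(means)


features = [[0.1, 0.2], [0.2, 0.1], [0.9, 1.0], [1.0, 0.8]]
labels = [0, 0, 1, 1]
model = NearestMeanLearner().train(features, labels)
print(model.apply([0.0, 0.3]))  # 0
print(model.apply([1.1, 0.9]))  # 1
```

Keeping the learner (configuration) separate from the model (trained state) is what lets different classification methods be swapped or wrapped without changing the calling code.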

Details

License: MIT

Author: Luis Pedro Coelho (with code from LibSVM and scikits.learn)

API Documentation: http://packages.python.org/milk/

Mailing List: http://groups.google.com/group/milk-users

Features

  • SVMs. Using the libsvm solver with a pythonesque wrapper around it.
  • LASSO
  • K-means using as little memory as possible. It can cluster millions of instances efficiently.
  • Random forests
  • Self organising maps
  • Stepwise Discriminant Analysis for feature selection.
  • Non-negative matrix factorisation
  • Affinity propagation

Recent History

The ChangeLog file contains a more complete history.

New in 0.6.1 (11 May 2015)

  • Fixed source distribution

New in 0.6 (27 Apr 2015)

  • Update for Python 3

New in 0.5.3 (19 Jun 2013)

  • Fix MDS for non-array inputs
  • Fix MDS bug
  • Add return_* arguments to kmeans
  • Extend zscore() to work on non-ndarrays
  • Add frac_precluster_learner
  • Work with older C++ compilers

New in 0.5.2 (7 Mar 2013)

  • Fix distribution of Eigen with source

New in 0.5.1 (11 Jan 2013)

  • Add subspace projection kNN
  • Export pdist in milk namespace
  • Add Eigen to source distribution
  • Add measures.curves.roc
  • Add mds_dists function
  • Add verbose argument to milk.tests.run

New in 0.5 (05 Nov 2012)

  • Add coordinate-descent based LASSO
  • Add unsupervised.center function
  • Make zscore work with NaNs (by ignoring them)
  • Propagate apply_many calls through transformers
  • Much faster SVM classification, which means a much faster defaultlearner() [measured 2.5x speedup on yeast dataset!]

For older versions, see ChangeLog file

Comments
  • Installation errors using Python 2.7.3

    Hey @luispedro,

    I have Python 2.7.3 installed through homebrew, and I'm seeing the following error when trying to install through pip:

    Downloading/unpacking milk
      Running setup.py egg_info for package milk
        build_src
        building extension "milk.supervised._perceptron" sources
        building extension "milk.supervised._lasso" sources
        building extension "milk.unsupervised._kmeans" sources
        building extension "milk.supervised._tree" sources
        building extension "milk.unsupervised._som" sources
        building extension "milk.supervised._svm" sources
        build_src: building npy-pkg config files
    
    Installing collected packages: milk
      Running setup.py install for milk
        unifing config_cc, config, build_clib, build_ext, build commands --compiler options
        unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options
        build_src
        building extension "milk.supervised._perceptron" sources
        building extension "milk.supervised._lasso" sources
        building extension "milk.unsupervised._kmeans" sources
        building extension "milk.supervised._tree" sources
        building extension "milk.unsupervised._som" sources
        building extension "milk.supervised._svm" sources
        build_src: building npy-pkg config files
        customize UnixCCompiler
        customize UnixCCompiler using build_ext
        customize UnixCCompiler
        customize UnixCCompiler using build_ext
        building 'milk.supervised._lasso' extension
        compiling C++ sources
        C compiler: c++ -fno-strict-aliasing -fno-common -dynamic -I/usr/local/include -DNDEBUG -g -O3 -Wall
    
        compile options: '-I/usr/local/lib/python2.7/site-packages/numpy/core/include -I/usr/local/Cellar/python/2.7.3/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c'
        extra options: '--std=c++0x'
        c++: milk/supervised/_lasso.cpp
        milk/supervised/_lasso.cpp:7:10: fatal error: 'random' file not found
        #include <random>
                 ^
        1 error generated.
        milk/supervised/_lasso.cpp:7:10: fatal error: 'random' file not found
        #include <random>
                 ^
        1 error generated.
        error: Command "c++ -fno-strict-aliasing -fno-common -dynamic -I/usr/local/include -DNDEBUG -g -O3 -Wall -I/usr/local/lib/python2.7/site-packages/numpy/core/include -I/usr/local/Cellar/python/2.7.3/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c milk/supervised/_lasso.cpp -o build/temp.macosx-10.8-x86_64-2.7/milk/supervised/_lasso.o --std=c++0x" failed with exit status 1
        Complete output from command /usr/local/Cellar/python/2.7.3/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python -c "import setuptools;__file__='/var/folders/cj/5794yqw14v3381qs0br04hf80000gn/T/pip-build/milk/setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /var/folders/cj/5794yqw14v3381qs0br04hf80000gn/T/pip-RpL1TH-record/install-record.txt --single-version-externally-managed:
        running install
    
    running build
    
    running config_cc
    
    unifing config_cc, config, build_clib, build_ext, build commands --compiler options
    
    running config_fc
    
    unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options
    
    running build_src
    
    build_src
    
    building extension "milk.supervised._perceptron" sources
    
    building extension "milk.supervised._lasso" sources
    
    building extension "milk.unsupervised._kmeans" sources
    
    building extension "milk.supervised._tree" sources
    
    building extension "milk.unsupervised._som" sources
    
    building extension "milk.supervised._svm" sources
    
    build_src: building npy-pkg config files
    
    running build_py
    
    running build_ext
    
    customize UnixCCompiler
    
    customize UnixCCompiler using build_ext
    
    customize UnixCCompiler
    
    customize UnixCCompiler using build_ext
    
    building 'milk.supervised._lasso' extension
    
    compiling C++ sources
    
    C compiler: c++ -fno-strict-aliasing -fno-common -dynamic -I/usr/local/include -DNDEBUG -g -O3 -Wall
    
    
    
    compile options: '-I/usr/local/lib/python2.7/site-packages/numpy/core/include -I/usr/local/Cellar/python/2.7.3/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c'
    
    extra options: '--std=c++0x'
    
    c++: milk/supervised/_lasso.cpp
    
    milk/supervised/_lasso.cpp:7:10: fatal error: 'random' file not found
    
    #include <random>
    
             ^
    
    1 error generated.
    
    milk/supervised/_lasso.cpp:7:10: fatal error: 'random' file not found
    
    #include <random>
    
             ^
    
    1 error generated.
    
    error: Command "c++ -fno-strict-aliasing -fno-common -dynamic -I/usr/local/include -DNDEBUG -g -O3 -Wall -I/usr/local/lib/python2.7/site-packages/numpy/core/include -I/usr/local/Cellar/python/2.7.3/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c milk/supervised/_lasso.cpp -o build/temp.macosx-10.8-x86_64-2.7/milk/supervised/_lasso.o --std=c++0x" failed with exit status 1
    

    When I clone the repo and run python setup.py install, I get this error:

    compile options: '-I/usr/local/lib/python2.7/site-packages/numpy/core/include -I/usr/local/Cellar/python/2.7.3/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c'
    extra options: '-std=c++0x -stdlib=libc++'
    c++: milk/supervised/_lasso.cpp
    milk/supervised/_lasso.cpp:10:10: fatal error: 'eigen3/Eigen/Dense' file not found
    #include <eigen3/Eigen/Dense>
             ^
    1 error generated.
    milk/supervised/_lasso.cpp:10:10: fatal error: 'eigen3/Eigen/Dense' file not found
    #include <eigen3/Eigen/Dense>
             ^
    1 error generated.
    

    Thanks for any help!

    opened by zachwill 8
  • Added initial centroid parameter for kmeans

    There are many applications where specifying the initial centroids for a kmeans run is useful, for instance when new initialization methods are being considered or when an iterative kmeans variant is desired (e.g. xmeans).

    This pull request adds the base functionality for this in milk.unsupervised.kmeans which should extend to all other related kmeans functions.

    opened by mynameisfiber 3
  • milk build fails saying 'skipping incompatible.....' on windows 7 64 bit anaconda(spyder 2.2.5) Python 2.7.5 64bits, Qt 4.8.4, PySide 1.2.1

    I've been trying to install milk on my Windows machine [Windows 7 64 bit, Anaconda (Spyder 2.2.5), Python 2.7.5 64 bits, Qt 4.8.4, PySide 1.2.1]. I spent a whole day trying to figure out what is causing this build failure, without success. Below is the tail of the pip install milk log.

    At the end it prints many lines saying 'skipping incompatible...'. Can someone please help me find a way around this?

    copying milk\unsupervised\gaussianmixture.py -> build\lib.win-amd64-2.7\milk\unsupervised
    copying milk\unsupervised\kmeans.py -> build\lib.win-amd64-2.7\milk\unsupervised
    copying milk\unsupervised\normalise.py -> build\lib.win-amd64-2.7\milk\unsupervised
    copying milk\unsupervised\parzen.py -> build\lib.win-amd64-2.7\milk\unsupervised
    copying milk\unsupervised\pca.py -> build\lib.win-amd64-2.7\milk\unsupervised
    copying milk\unsupervised\pdist.py -> build\lib.win-amd64-2.7\milk\unsupervised
    copying milk\unsupervised\som.py -> build\lib.win-amd64-2.7\milk\unsupervised
    copying milk\unsupervised\__init__.py -> build\lib.win-amd64-2.7\milk\unsupervised
    creating build\lib.win-amd64-2.7\milk\utils
    copying milk\utils\parallel.py -> build\lib.win-amd64-2.7\milk\utils
    copying milk\utils\utils.py -> build\lib.win-amd64-2.7\milk\utils
    copying milk\utils\__init__.py -> build\lib.win-amd64-2.7\milk\utils
    creating build\lib.win-amd64-2.7\milk\wrapper
    copying milk\wrapper\wraplibsvm.py -> build\lib.win-amd64-2.7\milk\wrapper
    copying milk\wrapper\__init__.py -> build\lib.win-amd64-2.7\milk\wrapper
    
    creating build\lib.win-amd64-2.7\milk\tests\data
    
    copying milk\tests\data\jugparallel_jugfile.py -> build\lib.win-amd64-2.7\milk\tests\data
    copying milk\tests\data\jugparallel_kmeans_jugfile.py -> build\lib.win-amd64-2.7\milk\tests\data
    copying milk\tests\data\__init__.py -> build\lib.win-amd64-2.7\milk\tests\data
    
    creating build\lib.win-amd64-2.7\milk\unsupervised\nnmf
    copying milk\unsupervised\nnmf\hoyer.py -> build\lib.win-amd64-2.7\milk\unsupervised\nnmf
    copying milk\unsupervised\nnmf\lee_seung.py -> build\lib.win-amd64-2.7\milk\unsupervised\nnmf
    copying milk\unsupervised\nnmf\__init__.py -> build\lib.win-amd64-2.7\milk\unsupervised\nnmf
    copying milk\tests\data\regression-2-Dec-2009.pp.gz -> build\lib.win-amd64-2.7\milk\tests\data
    copying milk\tests\data\__init__.pyc -> build\lib.win-amd64-2.7\milk\tests\data
    
    running build_ext
    
    Looking for python27.dll
    
    customize Mingw32CCompiler
    
    customize Mingw32CCompiler using build_ext
    
    Looking for python27.dll
    
    customize Mingw32CCompiler
    
    customize Mingw32CCompiler using build_ext
    
    building 'milk.supervised._perceptron' extension
    
    compiling C++ sources
    
    C compiler: g++ -g -DDEBUG -DMS_WIN64 -O0 -Wall
    
    
    
    creating build\temp.win-amd64-2.7
    
    creating build\temp.win-amd64-2.7\Release
    
    creating build\temp.win-amd64-2.7\Release\milk
    
    creating build\temp.win-amd64-2.7\Release\milk\supervised
    
    compile options: '-DNPY_MINGW_USE_CUSTOM_MSVCR -D__MSVCRT_VERSION__=0x0900 -Ie:\
    programs\Anaconda\lib\site-packages\numpy\core\include -Ie:\programs\Anaconda\in
    clude -Ie:\programs\Anaconda\PC -c'
    
    extra options: '-std=c++0x'
    
    g++ -g -DDEBUG -DMS_WIN64 -O0 -Wall -DNPY_MINGW_USE_CUSTOM_MSVCR -D__MSVCRT_VERS
    ION__=0x0900 -Ie:\programs\Anaconda\lib\site-packages\numpy\core\include -Ie:\pr
    ograms\Anaconda\include -Ie:\programs\Anaconda\PC -c milk/supervised/_perceptron
    .cpp -o build\temp.win-amd64-2.7\Release\milk\supervised\_perceptron.o -std=c++0
    x
    
    Found executable e:\programs\Rtools\gcc-4.6.3\bin\g++.exe
    
    g++ -g -shared build\temp.win-amd64-2.7\Release\milk\supervised\_perceptron.o -L
    e:\programs\Anaconda\libs -Le:\programs\Anaconda\PCbuild\amd64 -lpython27 -lmsvc
    r90 -o build\lib.win-amd64-2.7\milk\supervised\_perceptron.pyd
    
    e:/programs/rtools/gcc-4.6.3/bin/../lib/gcc/i686-w64-mingw32/4.6.3/../../../../i
    686-w64-mingw32/bin/ld.exe: skipping incompatible e:\programs\Anaconda\libs/libp
    ython27.a when searching for -lpython27
    
    e:/programs/rtools/gcc-4.6.3/bin/../lib/gcc/i686-w64-mingw32/4.6.3/../../../../i
    686-w64-mingw32/bin/ld.exe: skipping incompatible e:\programs\Anaconda\libs/pyth
    on27.lib when searching for -lpython27
    
    e:/programs/rtools/gcc-4.6.3/bin/../lib/gcc/i686-w64-mingw32/4.6.3/../../../../i
    686-w64-mingw32/bin/ld.exe: skipping incompatible e:\programs\Anaconda\libs\libp
    ython27.a when searching for -lpython27
    
    e:/programs/rtools/gcc-4.6.3/bin/../lib/gcc/i686-w64-mingw32/4.6.3/../../../../i
    686-w64-mingw32/bin/ld.exe: skipping incompatible e:\programs\Anaconda\libs/libp
    ython27.a when searching for -lpython27
    
    e:/programs/rtools/gcc-4.6.3/bin/../lib/gcc/i686-w64-mingw32/4.6.3/../../../../i
    686-w64-mingw32/bin/ld.exe: skipping incompatible e:\programs\Anaconda\libs/pyth
    on27.lib when searching for -lpython27
    
    e:/programs/rtools/gcc-4.6.3/bin/../lib/gcc/i686-w64-mingw32/4.6.3/../../../../i
    686-w64-mingw32/bin/ld.exe: skipping incompatible e:\programs\Anaconda\libs\pyth
    on27.lib when searching for -lpython27
    
    e:/programs/rtools/gcc-4.6.3/bin/../lib/gcc/i686-w64-mingw32/4.6.3/../../../../i
    686-w64-mingw32/bin/ld.exe: cannot find -lpython27
    
    e:/programs/rtools/gcc-4.6.3/bin/../lib/gcc/i686-w64-mingw32/4.6.3/../../../../i
    686-w64-mingw32/bin/ld.exe: skipping incompatible e:\programs\Anaconda\libs/libm
    svcr90.a when searching for -lmsvcr90
    
    e:/programs/rtools/gcc-4.6.3/bin/../lib/gcc/i686-w64-mingw32/4.6.3/../../../../i
    686-w64-mingw32/bin/ld.exe: skipping incompatible e:\programs\Anaconda\libs\libm
    svcr90.a when searching for -lmsvcr90
    
    collect2: ld returned 1 exit status
    
    error: Command "g++ -g -shared build\temp.win-amd64-2.7\Release\milk\supervised\
    _perceptron.o -Le:\programs\Anaconda\libs -Le:\programs\Anaconda\PCbuild\amd64 -
    lpython27 -lmsvcr90 -o build\lib.win-amd64-2.7\milk\supervised\_perceptron.pyd"
    failed with exit status 1
    
    ----------------------------------------
    Cleaning up...
    Command e:\programs\Anaconda\python.exe -c "import setuptools, tokenize;__file__
    ='c:\\users\\ramesh\\appdata\\local\\temp\\pip_build_Ramesh\\milk\\setup.py';exe
    c(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n')
    , __file__, 'exec'))" install --record c:\users\ramesh\appdata\local\temp\pip-6g
    mee8-record\install-record.txt --single-version-externally-managed --compile fai
    led with error code 1 in c:\users\ramesh\appdata\local\temp\pip_build_Ramesh\mil
    k
    Storing debug log for failure in C:\Users\Ramesh\pip\pip.log
    
    E:\Programs\Anaconda>
    
    opened by rkkarpuram 2
  • ValueError: too many boolean indices

    I'm getting this error:

    try_milk/1.9.3 $ python bug.py 
    Traceback (most recent call last):
      File "bug.py", line 14, in <module>
        learner.train(features, labels)
      File "/Users/b/.pythonbrew/pythons/Python-2.7/lib/python2.7/site-packages/milk/supervised/multi.py", line 120, in train
        model = self.base.train(features[idxs], (labels[idxs]==i).astype(int), **child_kwargs)
      File "/Users/b/.pythonbrew/pythons/Python-2.7/lib/python2.7/site-packages/milk/supervised/adaboost.py", line 86, in train
        H,A = _adaboost(features, labels, self.base, self.max_iters)
      File "/Users/b/.pythonbrew/pythons/Python-2.7/lib/python2.7/site-packages/milk/supervised/adaboost.py", line 39, in _adaboost
        train_out = names[train_out]
    ValueError: too many boolean indices
    

    When I run this code from the examples:

    #!/usr/bin/env python
    
    import milk.supervised.randomforest
    import milk.supervised.adaboost
    import milksets.wine
    from milk.supervised.multi import one_against_one
    
    weak = milk.supervised.randomforest.rf_learner()
    learner = milk.supervised.adaboost.boost_learner(weak)
    learner = one_against_one(learner)
    
    features, labels = milksets.wine.load()
    
    learner.train(features, labels)
    

    system information

    try_milk/1.9.3 $ python --version
    Python 2.7
    
    try_milk/1.9.3 $ uname -a
    Darwin username.local 12.0.0 Darwin Kernel Version 12.0.0: Sun Jun 24 23:00:16 PDT 2012; root:xnu-2050.7.9~1/RELEASE_X86_64 x86_64
    

    Virtualenv:

    try_milk/1.9.3 $ yolk -l
    Pygments        - 1.5          - active 
    Python          - 2.7          - active development (/Users/b/.pythonbrew/pythons/Python-2.7/lib/python2.7/lib-dynload)
    bpython         - 0.11         - active 
    distribute      - 0.6.26       - active 
    ipython         - 0.12.1       - active 
    irckit          - 0.1.1        - active 
    milk            - 0.4.2        - active 
    milksets        - 0.1.3        - active 
    numpy           - 1.6.2        - active 
    pandas          - 0.8.1        - active 
    pip             - 1.1          - active 
    python-dateutil - 1.5          - active 
    pytz            - 2012c        - active 
    setuptools      - 0.6c11       - active 
    virtualenv-clone - 0.2.4        - active 
    virtualenv      - 1.7.2        - active 
    virtualenvwrapper - 3.5          - active 
    wsgiref         - 0.1.2        - active development (/Users/b/.pythonbrew/pythons/Python-2.7/lib/python2.7)
    yolk            - 0.4.3        - active
    
    opened by audy 2
  • Errors new in numpy 1.9.0r1

    ======================================================================
    FAIL: milk.tests.test_pdist.test_pdist
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "X:\Python27-x64\lib\site-packages\nose\case.py", line 197, in runTest
        self.test(*self.arg)
      File "X:\Python27-x64\lib\site-packages\milk\tests\test_pdist.py", line 11, in test_pdist
        assert np.allclose(Dxx[i,j], np.sum((X[i]-X[j])**2))
    AssertionError
    
    ======================================================================
    FAIL: milk.tests.test_pdist.test_plike
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "X:\Python27-x64\lib\site-packages\nose\case.py", line 197, in runTest
        self.test(*self.arg)
      File "X:\Python27-x64\lib\site-packages\milk\tests\test_pdist.py", line 27, in test_plike
        assert Lxx[0,0] == Lxx2[0,0]
    AssertionError
    

    Not sure what is going on here.

    opened by charris 1
  • Python 3 version?

    I'm interested in doing some work with self organising maps in Python 3. I figure it'd make more sense to use existing code rather than write my own bugs. Is there a plan to convert milk to Python 3? If not, and I attempt it, would you be willing to pull a 3.x branch, or would it be better to focus on getting some SOM code into a bigger project, like scikit-learn?

    opened by naught101 1
  • Installing milk using virtualenv

    I am having problems installing milk using virtualenv. I am writing a Jenkins script that regularly builds milk, since some of the older projects we use depend on it. I am seeing this error on every build:

    (virtual_environment)jenkins@developers:/usr0/home/jenkins/workspace/milk$ python setup.py install
    running install
    running bdist_egg
    running egg_info
    running build_src
    build_src
    building extension "milk.supervised._perceptron" sources
    building extension "milk.supervised._lasso" sources
    building extension "milk.unsupervised._kmeans" sources
    building extension "milk.supervised._tree" sources
    building extension "milk.unsupervised._som" sources
    building extension "milk.supervised._svm" sources
    build_src: building npy-pkg config files
    writing milk.egg-info/PKG-INFO
    writing top-level names to milk.egg-info/top_level.txt
    writing dependency_links to milk.egg-info/dependency_links.txt
    reading manifest file 'milk.egg-info/SOURCES.txt'
    reading manifest template 'MANIFEST.in'
    warning: no files found matching '*' under directory 'milk/supervised/eigen3'
    writing manifest file 'milk.egg-info/SOURCES.txt'
    installing library code to build/bdist.linux-x86_64/egg
    running install_lib
    running build_py
    running build_ext
    customize UnixCCompiler
    customize UnixCCompiler using build_ext
    customize UnixCCompiler
    customize UnixCCompiler using build_ext
    building 'milk.supervised._lasso' extension
    compiling C++ sources
    C compiler: g++ -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -fPIC

    compile options: '-I/usr0/home/jenkins/workspace/milk/virtual_environment/local/lib/python2.7/site-packages/numpy/core/include -I/usr/include/python2.7 -c'
    extra options: '-std=c++0x'
    g++: milk/supervised/_lasso.cpp
    milk/supervised/_lasso.cpp:10: fatal error: eigen3/Eigen/Dense: No such file or directory
    compilation terminated.
    milk/supervised/_lasso.cpp:10: fatal error: eigen3/Eigen/Dense: No such file or directory
    compilation terminated.
    error: Command "g++ -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -fPIC -I/usr0/home/jenkins/workspace/milk/virtual_environment/local/lib/python2.7/site-packages/numpy/core/include -I/usr/include/python2.7 -c milk/supervised/_lasso.cpp -o build/temp.linux-x86_64-2.7/milk/supervised/_lasso.o -std=c++0x" failed with exit status 1

    opened by icaoberg 1
  • Removed unnecessary allocations and casts

    A lot of data was being allocated when reusing old segments was possible... I took advantage of this as much as possible. I also optimized the distance normalization.

    opened by mynameisfiber 1
  • New release on PyPI?

    Version 0.5.2 from PyPI fails to install on Wakari due to the C++11 requirement, whereas installation from the Git source succeeds presumably due to PR #9. I'm not sure what your release schedule/policy is usually, but can you push a new release to PyPI? Wakari users will thank you :smiley:

    opened by clayadavis 1
  • Weighted voting AdaBoost

    Hello Luis!

    milk/supervised/weighted_voting_adaboost.py is the code I was supposed to send. I see it as a generalization of AdaBoost for working with multiple class labels.

    The base learner must already be constructed when this classifier is initialized, e.g.: learner = weighted_voting_ada_learner(100, one_against_one(tree_learner(criterion=neg_z1_loss)))

    Maybe I am wrong, but I've discovered that the multiclass strategies can't work with weights. For example, see one_against_one.train (milk/supervised/multi.py:114). I think this train() method ignores the input weights, which is probably why my AdaBoost learns incorrectly.

    Best regards, Igor

    opened by ishalyminov 1
  • Documentation does not explain affinity propagation

    Hi. I notice that your documentation claims that milk supports affinity propagation, but it does not explain nor provide an explicit example. There are also no affinity propagation benchmarks listed.

    Does milk in fact support affinity propagation? If so, how would one use it, and do you have any performance benchmarks against scikit.learn? Further, would this implementation support a precomputed matrix of similarities a la scikit?

    Please let me know. Thanks.

    opened by AkshatM 0
  • Import milk shows : SystemError: initialization of _kmeans failed without raising an exception

    I installed milk and imported it in IPython with import milk. It raises the system error below.

    In [1]: import milk
    ---------------------------------------------------------------------------
    SystemError                               Traceback (most recent call last)
    <ipython-input-1-2c1431d8c8f7> in <module>()
    ----> 1 import milk
    
    /Users/elancheliyan/anaconda/lib/python3.5/site-packages/milk/__init__.py in <module>()
         55
         56 try:
    ---> 57     from .nfoldcrossvalidation import nfoldcrossvalidation
         58     from .supervised.defaultclassifier import defaultclassifier
         59     from .supervised.defaultlearner import defaultlearner
    
    /Users/elancheliyan/anaconda/lib/python3.5/site-packages/milk/nfoldcrossvalidation.py in <module>()
    ----> 1 from .measures.nfoldcrossvalidation import foldgenerator, getfold, nfoldcrossvalidation
    
    /Users/elancheliyan/anaconda/lib/python3.5/site-packages/milk/measures/nfoldcrossvalidation.py in <module>()
          5
          6 from __future__ import division
    ----> 7 from ..supervised.classifier import normaliselabels
          8 import numpy as np
          9 from functools import reduce
    
    /Users/elancheliyan/anaconda/lib/python3.5/site-packages/milk/supervised/__init__.py in <module>()
         51
         52 from .defaultclassifier import defaultclassifier, svm_simple
    ---> 53 from .classifier import normaliselabels
         54 from .gridsearch import gridsearch
         55 from .tree import tree_learner
    
    /Users/elancheliyan/anaconda/lib/python3.5/site-packages/milk/supervised/classifier.py in <module>()
         23 from __future__ import division
         24 import numpy as np
    ---> 25 from .normalise import normaliselabels
         26 from .base import supervised_model
         27
    
    /Users/elancheliyan/anaconda/lib/python3.5/site-packages/milk/supervised/normalise.py in <module>()
          8 import numpy as np
          9 from .base import supervised_model
    ---> 10 from ..unsupervised.normalise import zscore
         11
         12 __all__ = [
    
    /Users/elancheliyan/anaconda/lib/python3.5/site-packages/milk/unsupervised/__init__.py in <module>()
         16 '''
         17
    ---> 18 from .kmeans import kmeans,repeated_kmeans, select_best_kmeans
         19 from .gaussianmixture import *
         20 from .pca import pca, mds, mds_dists
    
    /Users/elancheliyan/anaconda/lib/python3.5/site-packages/milk/unsupervised/kmeans.py in <module>()
         25 from numpy import linalg
         26
    ---> 27 from . import _kmeans
         28 from ..utils import get_pyrandom
         29 from .normalise import zscore
    
    SystemError: initialization of _kmeans failed without raising an exception
    

    I think milk would be helpful for doing k-means clustering with Mahalanobis distance instead of Euclidean distance. Can you please tell me how to fix this error?

    opened by epratheeban 1
  • RuntimeError when running SOM

    Hi,

    Thanks for the great work!

    I found an issue when running SOM. My python shell is under Win7 64bit using Anaconda. Here is my code.

        from sklearn import datasets
        import milk
        n_points = 1000
        X, color = datasets.samples_generator.make_s_curve(n_points, random_state=0)
        g=(1000,)
        grid = milk.unsupervised.som(data=X, shape=g, iterations=1000, L=.2, radius=4, R=None)
    

    Here is the error message.

        RuntimeError                              Traceback (most recent call last)
        <ipython-input-7-dbad0be19950> in <module>()
        ----> 1 grid = milk.unsupervised.som(data=X, shape=g, iterations=1000, L=.2, radius=4, R=None)
    
        C:\Anaconda\lib\site-packages\milk\unsupervised\som.pyc in som(data, shape, iterations, L, radius, R)
        109         data = data.astype(np.float32)
        110     grid = np.array(R.sample(list(data), np.product(shape))).reshape(shape + (d,))
        --> 111     putpoints(grid, data, L=L, radius=radius, iterations=iterations, shuffle=True, R=R)
        112     return grid
    
        C:\Anaconda\lib\site-packages\milk\unsupervised\som.pyc in putpoints(grid, points, L, radius, iterations, shuffle, R)
         49         if shuffle:
         50             random.shuffle(points)
        ---> 51         _som.putpoints(grid, points, L, radius)
         52
         53 def closest(grid, f):
    
        RuntimeError: Arguments to putpoints don't conform to expectation. Are you calling this directly? This is an internal function!
    

    Did I make a mistake? I have checked that the _som.pyd file is in the same folder as the som.py file, so the problem does not seem to be in som.py. However, since I don't know how to open the _som.pyd file, I am not sure whether I am calling the code incorrectly.

    Thanks!

    opened by vickyliau 0
Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit

CNTK Chat Windows build status Linux build status The Microsoft Cognitive Toolkit (https://cntk.ai) is a unified deep learning toolkit that describes

Microsoft 17.3k Dec 29, 2022
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.

Machine Learning From Scratch About Python implementations of some of the fundamental Machine Learning models and algorithms from scratch. The purpose

Erik Linder-Norén 21.8k Jan 9, 2023
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

This is the Vowpal Wabbit fast online learning code. Why Vowpal Wabbit? Vowpal Wabbit is a machine learning system which pushes the frontier of machin

Vowpal Wabbit 8.1k Jan 6, 2023
A toolkit for making real world machine learning and data analysis applications in C++

dlib C++ library Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real worl

Davis E. King 11.6k Jan 1, 2023
Reference implementation of code generation projects from Facebook AI Research. General toolkit to apply machine learning to code, from dataset creation to model training and evaluation. Comes with pretrained models.

This repository is a toolkit to do machine learning for programming languages. It implements tokenization, dataset preprocessing, model training and m

Facebook Research 408 Jan 1, 2023
Multi-Modal Machine Learning toolkit based on PyTorch.

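Simplified Chinese | English TorchMM Introduction The multi-modal learning toolkit TorchMM aims to provide a model library of joint-modality and cross-modal learning algorithms, offering efficient solutions for processing multi-modal data such as images and text, and helping multi-modal learning applications reach practical use. Recent updates 2022.1.5 Released the initial TorchMM version v1.0 Features Rich task scenarios: the tool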

njustkmg 1 Jan 5, 2022
Multi-Modal Machine Learning toolkit based on PaddlePaddle.

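Simplified Chinese | English PaddleMM Introduction The PaddlePaddle multi-modal learning toolkit PaddleMM aims to provide a model library of joint-modality and cross-modal learning algorithms, offering efficient solutions for processing multi-modal data such as images and text, and helping multi-modal learning applications reach practical use. Recent updates 2022.1.5 Released the initial PaddleMM version v1.0 Features Rich task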

njustkmg 520 Dec 28, 2022
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

TPOT stands for Tree-based Pipeline Optimization Tool. Consider TPOT your Data Science Assista

Epistasis Lab at UPenn 8.9k Dec 30, 2022
Scripts of Machine Learning Algorithms from Scratch. Implementations of machine learning models and algorithms using nothing but NumPy with a focus on accessibility. Aims to cover everything from basic to advance.

Algo-ScriptML Python implementations of some of the fundamental Machine Learning models and algorithms from scratch. The goal of this project is not t

Algo Phantoms 81 Nov 26, 2022
This is a Machine Learning Based Hand Detector Project, It Uses Machine Learning Models and Modules Like Mediapipe, Developed By Google!

Machine Learning Hand Detector This is a Machine Learning Based Hand Detector Project, It Uses Machine Learning Models and Modules Like Mediapipe, Dev

Popstar Idhant 3 Feb 25, 2022
Tutorial on active learning with the Nvidia Transfer Learning Toolkit (TLT).

Active Learning with the Nvidia TLT Tutorial on active learning with the Nvidia Transfer Learning Toolkit (TLT). In this tutorial, we will show you ho

Lightly 25 Dec 3, 2022
CRLT: A Unified Contrastive Learning Toolkit for Unsupervised Text Representation Learning

CRLT: A Unified Contrastive Learning Toolkit for Unsupervised Text Representation Learning This repository contains the code and relevant instructions

XiaoMing 5 Aug 19, 2022
A toolkit for developing and comparing reinforcement learning algorithms.

Status: Maintenance (expect bug fixes and minor updates) OpenAI Gym OpenAI Gym is a toolkit for developing and comparing reinforcement learning algori

OpenAI 29.6k Jan 8, 2023
D2Go is a toolkit for efficient deep learning

D2Go D2Go is a production ready software system from FacebookResearch, which supports end-to-end model training and deployment for mobile platforms. W

Facebook Research 744 Jan 4, 2023
TorchIO is a Medical image preprocessing and augmentation toolkit for deep learning. Part of the PyTorch Ecosystem.

Medical image preprocessing and augmentation toolkit for deep learning. Part of the PyTorch Ecosystem.

Fernando Pérez-García 1.6k Jan 6, 2023
TorchOk - The toolkit for fast Deep Learning experiments in Computer Vision

TorchOk - The toolkit for fast Deep Learning experiments in Computer Vision

null 52 Dec 23, 2022
FAMIE is a comprehensive and efficient active learning (AL) toolkit for multilingual information extraction (IE)

FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction

null 18 Sep 1, 2022
🔥 Cogitare - A Modern, Fast, and Modular Deep Learning and Machine Learning framework for Python

Cogitare is a Modern, Fast, and Modular Deep Learning and Machine Learning framework for Python. A friendly interface for beginners and a powerful too

Cogitare - Modern and Easy Deep Learning with Python 76 Sep 30, 2022