Machine Learning toolbox for Humans


Reproducible Experiment Platform (REP)


REP is an IPython-based environment for conducting data-driven research in a consistent and reproducible way.

Main features:

  • unified python wrapper for different ML libraries (wrappers follow extended scikit-learn interface)
    • Sklearn
    • TMVA
    • XGBoost
    • uBoost
    • Theanets
    • Pybrain
    • Neurolab
    • MatrixNet service (available to CERN)
  • parallel training of classifiers on cluster
  • classification/regression reports with plots
  • interactive plots supported
  • smart grid-search algorithms with parallel execution
  • research versioning using git
  • pluggable quality metrics for classification
  • meta-algorithm design (aka 'rep-lego')

REP does not try to replace scikit-learn; it extends it and provides a better user experience.

Howto examples

To get started, look at the notebooks in /howto/

Notebooks can be viewed (not executed) online at nbviewer.
There are basic introductory notebooks (about Python and IPython) and more advanced ones (about REP itself).

The example code is written in Python 2, but the library is compatible with both Python 2 and Python 3.

Installation with Docker

We provide a Docker image with REP and all its dependencies. This is the recommended way, especially if you are not experienced in Python.

Installation with bare hands

However, if you want to install REP and all of its dependencies on your machine yourself, follow this manual: installing manually and running manually.

Links

License

Apache 2.0, library is open-source.

Minimal examples

REP wrappers are sklearn compatible:

from rep.estimators import XGBoostClassifier, SklearnClassifier, TheanetsClassifier
clf = XGBoostClassifier(n_estimators=300, eta=0.1).fit(trainX, trainY)
probabilities = clf.predict_proba(testX)

A favorite trick of Kagglers is to run bagging over complex algorithms. This is how it is done in REP:

from sklearn.ensemble import BaggingClassifier
clf = BaggingClassifier(base_estimator=XGBoostClassifier(), n_estimators=10)
# wrapping sklearn to REP wrapper
clf = SklearnClassifier(clf)
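
For readers new to bagging, here is a minimal standard-library sketch of the idea, not REP's or scikit-learn's implementation. The names bagging_fit, bagging_predict and train_threshold are hypothetical, and the base learner is a toy one-dimensional threshold rule chosen only so that the example runs without dependencies:

```python
import random
from collections import Counter


def bagging_fit(train_fn, X, y, n_estimators=10, seed=0):
    """Train n_estimators models, each on a bootstrap resample of (X, y)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        # Bootstrap: draw n indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(train_fn([X[i] for i in idx], [y[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Majority vote over the individual model predictions."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


def train_threshold(X, y):
    """Hypothetical toy learner: cut halfway between the two class means."""
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == 0]
    if not pos or not neg:
        # Degenerate resample with a single class: always predict that class.
        only = 1 if pos else 0
        return lambda x: only
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0
    return lambda x: 1 if x > cut else 0


X = [0.1, 0.3, 0.4, 0.6, 1.5, 1.7, 1.9, 2.2]
y = [0, 0, 0, 0, 1, 1, 1, 1]
models = bagging_fit(train_threshold, X, y, n_estimators=10)
print(bagging_predict(models, 0.2), bagging_predict(models, 2.0))
```

Each model sees a slightly different resample, so the vote averages out the variance of the individual learners; this is the effect the BaggingClassifier wrapper above relies on.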

Another useful trick is to use folding instead of splitting data into train/test. This is especially useful when you're using some kind of complex stacking:

from rep.metaml import FoldingClassifier
clf = FoldingClassifier(TheanetsClassifier(), n_folds=3)
probabilities = clf.fit(X, y).predict_proba(X)

In the example above, all data are split into 3 folds, and each fold is predicted by a classifier that was trained on the other 2 folds.
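
The folding idea can be sketched with the standard library alone. This is an illustration under simplifying assumptions, not REP's FoldingClassifier: the helper names are hypothetical, folds are assigned deterministically by index, and the "model" simply predicts the mean training target:

```python
def fold_indices(n_samples, n_folds):
    """Assign sample i to fold i % n_folds (a simple deterministic split)."""
    return [i % n_folds for i in range(n_samples)]


def out_of_fold_predictions(train_fn, X, y, n_folds=3):
    """Predict every sample with a model that never saw it during training."""
    folds = fold_indices(len(X), n_folds)
    predictions = [None] * len(X)
    for k in range(n_folds):
        # Train on every fold except fold k.
        train_X = [x for x, f in zip(X, folds) if f != k]
        train_y = [t for t, f in zip(y, folds) if f != k]
        model = train_fn(train_X, train_y)
        # Predict only the held-out samples of fold k.
        for i, f in enumerate(folds):
            if f == k:
                predictions[i] = model(X[i])
    return predictions


def train_mean(train_X, train_y):
    """Hypothetical toy learner: always predicts the mean training target."""
    mean = sum(train_y) / float(len(train_y))
    return lambda x: mean


X = list(range(6))
y = [0, 1, 2, 3, 4, 5]
print(out_of_fold_predictions(train_mean, X, y, n_folds=3))
# -> [3.0, 2.5, 2.0, 3.0, 2.5, 2.0]
```

Because no prediction comes from a model that was trained on the same sample, the resulting column can be fed to a second-level model without leaking the target.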

REP classifiers also provide a report:

report = clf.test_on(testX, testY)
report.roc().plot() # plot ROC curve
from rep.report.metrics import RocAuc
# learning curves are useful when training GBDT!
report.learning_curve(RocAuc(), steps=10)  

You can read about other REP tools (like smart distributed grid search, folding and factory) in the documentation and howto examples.

Comments
  • Problem with TMVAClassifier


    After installing REP from here, I ran into the following problem when fitting TMVAClassifier. An IOError is raised after the following lines:

    baseline = TMVAClassifier(method='kBDT', features=variables, BoostType='Grad', NTrees=40, Shrinkage=0.01, MaxDepth=7, UseNvars=6, nCuts=-1)
    baseline.fit(train, train['signal'])

    The stack trace is:

    IOError Traceback (most recent call last) in ()
          3 UseNvars=6, nCuts=-1)
          4 # baseline = TMVAClassifier(method='kBDT', NTrees=50, Shrinkage=0.05, features=variables)
    ----> 5 baseline.fit(train, train['signal'])

    /usr/local/lib/python2.7/dist-packages/rep-0.6.3-py2.7.egg/rep/estimators/tmva.pyc in fit(self, X, y, sample_weight)
        288 self.factory_options = '{}:AnalysisType=Multiclass'.format(self.factory_options)
        289
    --> 290 return self._fit(X, y, sample_weight=sample_weight)
        291
        292 def predict_proba(self, X):

    /usr/local/lib/python2.7/dist-packages/rep-0.6.3-py2.7.egg/rep/estimators/tmva.pyc in _fit(self, X, y, sample_weight, model_type)
        104 add_info = _AdditionalInformation(directory, model_type=model_type)
        105 try:
    --> 106 self._run_tmva_training(add_info, X, y, sample_weight)
        107 finally:
        108 self._remove_tmp_directory(directory)

    /usr/local/lib/python2.7/dist-packages/rep-0.6.3-py2.7.egg/rep/estimators/tmva.pyc in _run_tmva_training(self, info, X, y, sample_weight)
        134 xml_filename = os.path.join(info.directory, 'weights',
        135 '{job}{name}.weights.xml'.format(job=info.tmva_job, name=self._method_name))
    --> 136 with open(xml_filename, 'r') as xml_file:
        137 self.formula_xml = xml_file.read()
        138

    IOError: [Errno 2] No such file or directory: '/home/artem/Documents/IPython Notebooks/CERN + Yandex/Original Baseline/flavours-of-physics-start/tmp0Fhtqe/weights/TMVAEstimation_REP_Estimator.weights.xml'

    As far as I can tell, the weights/ folder was created outside the temporary folder instead of inside it, which causes the error above.

    ROOT 5.34, Python 2.7, GCC 4.8, Ubuntu 14.04 LTS (x64). All requirements for REP were installed successfully (from requirements.txt).

    bug 
    opened by HolyBayes 9
  • FoldingClassifier: KFold vs StratifiedKFold


    Hey,

    first of all a compliment: I really like your repo and I have built a lot of code on it, it's so useful! About the FoldingClassifier: there was already a request to implement StratifiedKFold in addition to the "normal" KFold. I would be very glad to see this, but I'd even go a step further: why don't you completely replace KFold with StratifiedKFold?

    I think, from an ML point of view, it is always better (or, at worst, equally good) to use a stratified one. Using a normal KFold only introduces different class balances, which (usually) result in "shifted" probabilities among the different classifiers, whereas a stratified one does not and therefore makes each trained classifier's predictions "comparable".

    Or in other words: I cannot think of any case where you want to have a non-stratified KFolding instead of a stratified one.
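
The class-balance argument can be illustrated with a small standard-library sketch. It is only an illustration: plain_folds, stratified_folds and class_counts are hypothetical helpers, not scikit-learn's KFold or StratifiedKFold, and the labels are a made-up imbalanced toy set sorted by class:

```python
from collections import Counter, defaultdict


def plain_folds(labels, n_folds):
    """Contiguous blocks that ignore the labels (like an unshuffled k-fold)."""
    n = len(labels)
    return [i * n_folds // n for i in range(n)]


def stratified_folds(labels, n_folds):
    """Deal the samples of each class round-robin across the folds."""
    folds = [None] * len(labels)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    for indices in by_class.values():
        for position, i in enumerate(indices):
            folds[i] = position % n_folds
    return folds


def class_counts(labels, folds, n_folds):
    """Per-fold label counts as plain dicts."""
    counts = [Counter() for _ in range(n_folds)]
    for label, fold in zip(labels, folds):
        counts[fold][label] += 1
    return [dict(c) for c in counts]


labels = [0] * 9 + [1] * 3  # imbalanced toy labels, sorted by class
print(class_counts(labels, plain_folds(labels, 3), 3))
# -> [{0: 4}, {0: 4}, {0: 1, 1: 3}]
print(class_counts(labels, stratified_folds(labels, 3), 3))
# -> [{0: 3, 1: 1}, {0: 3, 1: 1}, {0: 3, 1: 1}]
```

In the plain split two folds contain no positive samples at all, while the stratified split keeps the 3:1 ratio in every fold. Shuffling before a plain split reduces the effect but does not guarantee equal ratios.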

    What do you think?

    Best, Mayou

    enhancement 
    opened by jonas-eschle 5
  • Support for a build hosted on (ana)conda


    I see that some of the continuous integration scripts support conda builds, although not all the dependencies are installed this way. Is there any hope of seeing a build on conda soon for Linux x86_64 systems?

    The reason I ask is that I have accounts on numerous batch systems, on none of which I have root access or any way to use Docker. They're all Linux-based though, as is the norm. As far as I know, this is the case for many researchers.

    It'd be great to see a way to quickly install REP on these systems. This would:

    • Cut down on the time needed to introduce people to REP
    • Hook into the environment management and environment logging provided by conda
    • Easily and quickly deploy REP on supercomputing nodes while requiring little of their filesystem

    This is especially useful for ensuring the ROOT install is sane. I know there has already been a lot of work in the direction of making REP easy to access and install. Perhaps this could be a healthy addition?

    question 
    opened by ewengillies 5
  • Add ability to initialise FoldingBase objects with external parser


    If you would like to run REP with e.g. a StratifiedKFold instead of a normal KFold, this will be possible after this pull request. If no external folder object is passed, the default KFold algorithm is used.

    opened by mschlupp 5
  • test_xgboost file is not running on windows 10


    The test_xgboost file does not run on Windows 10:

    File "c:\Sander\my_code\rep-master\tests\test_xgboost.py", line 4, in
      from rep.estimators import XGBoostClassifier, XGBoostRegressor
    ImportError: cannot import name XGBoostClassifier

    This happens when the rep installation succeeds but the xgboost installation fails:

    Microsoft Windows Version 10.0.10586 2015 Microsoft Corporation. All rights reserved.

    c:\Sander>pip install rep --no-dependencies
    Collecting rep
      Downloading rep-0.6.5.tar.gz (72kB)
        100% |################################| 81kB 511kB/s
    Building wheels for collected packages: rep
      Running setup.py bdist_wheel for rep ... done
      Stored in directory: C:\Users\Sander\AppData\Local\pip\Cache\wheels\db\ee\06\ac6e3f3ec208edaee29654f0b55ffaf2719a51de799c396b91
    Successfully built rep
    Installing collected packages: rep
    Successfully installed rep-0.6.5
    You are using pip version 8.1.0, however version 8.1.2 is available.
    You should consider upgrading via the 'python -m pip install --upgrade pip' command.

    c:\Sander>pip install xgboost==0.4a30
    Collecting xgboost==0.4a30
      Downloading xgboost-0.4a30.tar.gz (753kB)
        100% |################################| 757kB 553kB/s
    No files/directories in c:\users\sander\appdata\local\temp\pip-build-exobfm\xgboost\pip-egg-info (from PKG-INFO)
    You are using pip version 8.1.0, however version 8.1.2 is available.
    You should consider upgrading via the 'python -m pip install --upgrade pip' command.

    c:\Sander>

    opened by Sandy4321 5
  • Manual Install on Windows


    Hi! Is there a way to install REP manually in a Windows environment? When installing the dependencies, I get an error while installing gnureadline:

    Error: this module is not meant to work on Windows (try pyreadline instead)

    Is there a way to use pyreadline for Windows users?

    wontfix 
    opened by funkindy 4
  • Mac OS installation with Docker


    It seems the latest Docker release deprecates boot2docker (http://docs.docker.com/installation/mac/): "This release of Docker deprecates the Boot2Docker command line in favor of Docker Machine"

    How do I install REP with the latest Docker release?

    opened by pupadupa 4
  • test failed


    After python setup.py install I run cd tests ; nosetests . It runs for a long time and ends with errors:

    ..Info in <TCanvas::Print>: png file /tmp/tmpBg1dar.png has been created
    Error in <TFile::TFile>: file toy_datasets/toyMC_bck_mass.root does not exist
    E..E.
    ======================================================================
    ERROR: tests.z_test_notebook.test_notebooks_in_folder('/root/rep/howto/00-intro-ROOT.ipynb',)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
        self.test(*self.arg)
      File "/root/rep/rep/test/test_notebooks.py", line 43, in check_single_notebook
        raise RuntimeError(description)
    RuntimeError: Cell failed: 'T.Draw("min_DOCA")
    c1'
    
     Traceback:
    ---------------------------------------------------------------------------
    ReferenceError                            Traceback (most recent call last)
    <ipython-input-5-aa6c7320180d> in <module>()
    ----> 1 T.Draw("min_DOCA")
          2 c1
    
    ReferenceError: attempt to access a null-pointer
    

    What am I missing?

    opened by anaderi 3
  • Updating numpy in 0.6.6 docker breaks matplotlib


    % docker run -ti yandex/rep:0.6.6 bash -lc 'pip install -U numpy; python -c "from matplotlib import pyplot as plt; plt.figure()"'
    Activate: ROOT has been sourced. Environment settings are ready.
    ROOTSYS=/root/miniconda/envs/rep_py2
    Deactivate:Unsetting ROOT environment variables..
    Activate: ROOT has been sourced. Environment settings are ready.
    ROOTSYS=/root/miniconda/envs/rep_py2
    Collecting numpy
      Downloading numpy-1.11.2-cp27-cp27mu-manylinux1_x86_64.whl (15.3MB)
        100% |################################| 15.3MB 46kB/s
    Installing collected packages: numpy
      Found existing installation: numpy 1.10.4
        DEPRECATION: Uninstalling a distutils installed project (numpy) has been deprecated and will be removed in a future version. This is due to the fact that uninstalling a distutils project will only partially uninstall the project.
        Uninstalling numpy-1.10.4:
          Successfully uninstalled numpy-1.10.4
    Successfully installed numpy-1.11.2
    /root/miniconda/envs/rep_py2/lib/python2.7/site-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
      warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')
    bash: line 1:   222 Illegal instruction     python -c "from matplotlib import pyplot as plt; plt.figure()"
    
    opened by sashabaranov 2
  • do we need to measure fit/predict time without %time?


    This is useful if the Jupyter frontend disconnects during fit/predict execution.

    The following snippet might be handy for such cases:

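    import datetime
    import time


    class Stopwatch(object):
        """Context manager that records wall-clock time spent in its block."""

        def __enter__(self):
            self.t0 = datetime.datetime.now()
            return self

        def __exit__(self, type, value, traceback):
            self.t1 = datetime.datetime.now()

        def __repr__(self):
            return "delta: (%s)" % (self.t1 - self.t0)


    with Stopwatch() as sfit:
        time.sleep(1)  # stand-in for the fit call
    with Stopwatch() as spredict:
        time.sleep(1)  # stand-in for the predict call

    # a single formatted string prints the same under Python 2 and Python 3
    print("fit: %r, predict: %r" % (sfit, spredict))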
    
    opened by anaderi 2
  • New REP docker version running in /var/lib/docker/volumes/ instead of ~/rep_container


    Hi.

    I had an old REP Docker version in ~/rep_container, which started with the run.sh script on port 8080. I updated REP and it broke: sudo $REPDIR/run.sh worked, but I couldn't connect to localhost:8080 (connection refused). I decided to update Docker and REP according to the new instructions: https://github.com/yandex/rep/wiki/Install-REP-with-Docker-(Linux).

    1. I installed Docker, according to instructions.
    2. netstat -anl | grep 8888 gave empty result
    3. git checkout https://github.com/yandex/rep.git didn't work (pathspec did not match any file(s) known to git), so I used git clone instead.
    4. First run of sudo make run was successful and installed container.
    5. I rebooted and second sudo make run gave the following

    docker run -ti --rm -p 8888:8888 --name rep yandex/rep:0.6.4
    Error response from daemon: Conflict. The name "rep" is already in use by container 3af0884aeedb. You have to remove (or rename) that container to be able to reuse that name.
    make: *** [run] Error 1

    6. I ran sudo docker images

    REPOSITORY    TAG      IMAGE ID       CREATED        VIRTUAL SIZE
    yandex/rep    0.6.4    18a48bc5a3b6   8 hours ago    2.635 GB
    anaderi/rep   latest   63c3db2850b6   4 months ago   1.649 GB
                           91c95931e552   7 months ago   910 B

    7. I tried sudo docker start rep. It worked and I opened REP on localhost:8888. But its working folder changed. Now it is /var/lib/docker/volumes/dbcc7ff99538007d9c6b244fb6b8f03bdcfd564f6076b36d79fa3330d2041107/_data/. This is quite inconvenient, because it requires superuser rights to access and is not conveniently located at all.

    Question: Is this the new setup, or did I do something wrong? If the latter, how do I fix it and run the REP container in a convenient folder?

    opened by lodurality 2
  • Bump notebook from 4.2.1 to 6.4.12


    Bumps notebook from 4.2.1 to 6.4.12.


    dependencies 
    opened by dependabot[bot] 0
  • Update lib


    Issue:

    ModuleNotFoundError Traceback (most recent call last) in
          5 from sklearn.ensemble import HistGradientBoostingClassifier
          6 from rep.report.metrics import RocAuc
    ----> 7 from rep.metaml import GridOptimalSearchCV, FoldingScorer, RandomParameterOptimizer
          8 from rep.estimators import SklearnClassifier

    ~/.local/lib/python3.8/site-packages/rep/metaml/__init__.py in
          2
          3 from .factory import ClassifiersFactory, RegressorsFactory
    ----> 4 from .folding import FoldingClassifier, FoldingRegressor
          5 from .gridsearch import GridOptimalSearchCV
          6 from .stacking import FeatureSplitter

    ~/.local/lib/python3.8/site-packages/rep/metaml/folding.py in
         11
         12 from sklearn import clone
    ---> 13 from sklearn.cross_validation import KFold
         14 from sklearn.utils import check_random_state
         15 from . import utils

    ModuleNotFoundError: No module named 'sklearn.cross_validation'

    Correction suggested based on https://stackoverflow.com/questions/30667525/importerror-no-module-named-sklearn-cross-validation
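
For context, scikit-learn deprecated sklearn.cross_validation in version 0.18 and removed it in 0.20 in favor of sklearn.model_selection. One common way to support both is a fallback import; the sketch below shows that pattern with a hypothetical helper, import_first, which is not part of REP or scikit-learn. The scikit-learn call is left as a comment so the block runs without it installed:

```python
import importlib


def import_first(module_names):
    """Return the first module in module_names that can be imported."""
    errors = []
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            errors.append("%s: %s" % (name, exc))
    raise ImportError("none of the candidates could be imported: " + "; ".join(errors))


# Intended use (needs scikit-learn installed, so it is left commented out):
# cv = import_first(["sklearn.model_selection", "sklearn.cross_validation"])
# KFold = cv.KFold

# Self-contained demonstration with standard-library names:
module = import_first(["module_that_does_not_exist_xyz", "json"])
print(module.__name__)
# -> json
```

Note that the import alone may not be enough: the two KFold classes take different constructor arguments (n_splits in model_selection versus a sample count plus n_folds in cross_validation), so calling code would need adapting as well.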

    opened by RobsonRocha 1
  • Bump requests from 2.9.1 to 2.20.0


    Bumps requests from 2.9.1 to 2.20.0.

    Changelog

    Sourced from requests's changelog.

    2.20.0 (2018-10-18)

    Bugfixes

    • Content-Type header parsing is now case-insensitive (e.g. charset=utf8 v Charset=utf8).
    • Fixed exception leak where certain redirect urls would raise uncaught urllib3 exceptions.
    • Requests removes Authorization header from requests redirected from https to http on the same hostname. (CVE-2018-18074)
    • should_bypass_proxies now handles URIs without hostnames (e.g. files).

    Dependencies

    • Requests now supports urllib3 v1.24.

    Deprecations

    • Requests has officially stopped support for Python 2.6.

    2.19.1 (2018-06-14)

    Bugfixes

    • Fixed issue where status_codes.py's init function failed trying to append to a __doc__ value of None.

    2.19.0 (2018-06-12)

    Improvements

    • Warn user about possible slowdown when using cryptography version < 1.3.4
    • Check for invalid host in proxy URL, before forwarding request to adapter.
    • Fragments are now properly maintained across redirects. (RFC7231 7.1.2)
    • Removed use of cgi module to expedite library load time.
    • Added support for SHA-256 and SHA-512 digest auth algorithms.
    • Minor performance improvement to Request.content.
    • Migrate to using collections.abc for 3.7 compatibility.

    Bugfixes

    • Parsing empty Link headers with parse_header_links() no longer return one bogus entry.
    ... (truncated)
    Commits
    • bd84045 v2.20.0
    • 7fd9267 remove final remnants from 2.6
    • 6ae8a21 Add myself to AUTHORS
    • 89ab030 Use comprehensions whenever possible
    • 2c6a842 Merge pull request #4827 from webmaven/patch-1
    • 30be889 CVE URLs update: www sub-subdomain no longer valid
    • a6cd380 Merge pull request #4765 from requests/encapsulate_urllib3_exc
    • bbdbcc8 wrap url parsing exceptions from urllib3's PoolManager
    • ff0c325 Merge pull request #4805 from jdufresne/https
    • b0ad249 Prefer https:// for URLs throughout project
    • Additional commits viewable in compare view


    dependencies 
    opened by dependabot[bot] 0
  • Changes to TMVA API in new ROOT versions break TMVAClassifier


    Hi all,

    first of all, I wanted to thank and compliment the developers for this brilliant library. I finally had the chance to start playing with it today, but I was stopped in my tracks when trying to use a TMVAClassifier:

    AssertionError: ERROR: TMVA process is incorrect finished 
     LOG: None 
     Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/home/ludo/miniconda3/envs/pyroot/lib/python2.7/site-packages/rep/estimators/_tmvaFactory.py", line 86, in main
        tmva_process(classifier, info, data, labels, sample_weight)
      File "/home/ludo/miniconda3/envs/pyroot/lib/python2.7/site-packages/rep/estimators/_tmvaFactory.py", line 40, in tmva_process
        factory.AddVariable(var)
    AttributeError: 'Factory' object has no attribute 'AddVariable'
    

    My ROOT/TMVA versions are:

    You are running ROOT Version: 6.08/00, Nov 4, 2016
    TMVA Version 4.2.1, Feb 5, 2015
    

    Searching the web for this error message led me to this post on the ROOT forum: https://root-forum.cern.ch/t/25090, where the cause of problem is indicated as being due to a breaking change in the TMVA API:

    In recent ROOT versions (6.06 or 6.08, don't remember exactly), the TMVA interface has changed. You need to create a TMVA::DataLoader and call AddVariable on the dataloader object.

    As I understand, this is related to what was mentioned by @gandreassi in a comment to #104. Any idea on how complicated it would be to adapt tmva_process to the new interface?
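
One possible shape for such an adaptation is to pick the object that receives AddVariable by feature detection. The sketch below is an assumption-laden illustration, not REP's tmva_process: variable_target and add_variables are hypothetical helpers, and the three small classes are stand-ins so the block runs without ROOT. With PyROOT one would presumably pass the real factory and a callable such as lambda: ROOT.TMVA.DataLoader("dataset"):

```python
def variable_target(factory, make_dataloader):
    """Return the object on which AddVariable should be called.

    Older TMVA exposes AddVariable on the factory itself; newer TMVA
    expects a separate data loader object (per the forum post above).
    """
    if hasattr(factory, "AddVariable"):
        return factory
    return make_dataloader()


def add_variables(factory, make_dataloader, variables):
    """Register every variable name on whichever object accepts it."""
    target = variable_target(factory, make_dataloader)
    for var in variables:
        target.AddVariable(var)
    return target


# Stand-ins for the two API generations (hypothetical, for demonstration).
class OldStyleFactory(object):
    def __init__(self):
        self.variables = []

    def AddVariable(self, name):
        self.variables.append(name)


class NewStyleFactory(object):
    pass  # no AddVariable, as in the AttributeError above


class FakeDataLoader(OldStyleFactory):
    pass


loader = add_variables(NewStyleFactory(), FakeDataLoader, ["min_DOCA"])
print(type(loader).__name__, loader.variables)
```

Other factory calls may have changed in the same ROOT releases, so this only covers the AddVariable failure shown in the traceback.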

    opened by fndari 1
Releases(0.6.6)
  • 0.6.6(Aug 9, 2016)

    • python2 and python3 dockers
    • updated libraries
    • added CacheClassifier
    • minimized size of docker image, simplified building process
    • some fixes for ML libraries
    • some documentation updates
    • deleted plot.ly
    • solved theanets reproducibility
  • 0.6.5(Feb 3, 2016)

    Fixes:

    • TMVA process correct termination
    • TMVA fix for Mac OS El Capitan (problems with dynamic library paths)
    • fix travis (show not passed tests, create docker on dockerhub)
    • fix wget in notebooks
    • fix errors calculation in efficiencies (for flatness property)
    • added Makefile
    • fix normalization in the multidimensional metric
  • 0.6.4(Nov 21, 2015)

    • Add continuous integration
    • Python 3 support
    • Conda installation in docker and travis
    • Kitematic-friendly docker
    • Update all libraries versions
    • added Folding Regressor, added feature importances for folding
    • added minimization to gridsearch, added random gridsearch from distributions
    • added folding scorer for regressor to gridsearch
    • faster tests
    • updated notebooks
    • Fixes:
      • tmva termination
      • documentation for grid search
      • Gridsearch bugs with metrics (metric fit)
      • learning curve with mask for folding
  • 0.6.3(Jul 30, 2015)

  • 0.6.2(Jul 6, 2015)

    • Support of neural networks in common interface:

      • theanets
      • neurolab
      • pybrain

      Now all the REP stuff is available for classifiers and regressors from these libraries:

      • usage inside sklearn pipeline
      • grid_search for hyper parameter optimization
      • reports, parallel training on cluster
    • New lovely documentation, check it out!

    • Fixes in metaclassifiers connected with usage of expressions-as-features

    • Rewritten FeatureSplitter

    • Switched to sklearn 0.16

    • New method train_test_split_group - splitting into train and test by the value of special column. Samples with same values are either both in train or both in test.

    • Update howto/notebooks with new open physical datasets

  • 0.6.1(May 22, 2015)

    • Tmva implementation enhancement with root_numpy https://github.com/yandex/rep/issues/2.
    • Add FPRatTPR (return fpr value at fixed tpr) and TPRatFPR (return tpr value at fixed fpr) metrics, which are required, e.g. for tuning online triggering system. Moreover learning curves are available for these metrics now.
    • Many improvements in documentation.
  • 0.6.0(May 12, 2015)

    • unified classifiers wrapper for variety of implementations: TMVA, Sklearn, XGBoost, uBoost
    • parallel training of classifiers on cluster
    • classification/regression reports with plots
    • support of interactive plots (bokeh, plotly)
    • grid-search with parallelized execution on a cluster
    • git, versioning of research
    • computation of different classification metrics
    • partial support of python 3.
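
The 0.6.2 notes above describe train_test_split_group as splitting by the value of a special column so that samples sharing a value stay together. A minimal standard-library sketch of that idea follows; split_by_group and its parameters are hypothetical and do not reproduce REP's actual function or signature:

```python
import random


def split_by_group(rows, group_of, test_fraction=0.5, seed=0):
    """Split rows so that all rows sharing a group key land on the same side."""
    groups = sorted(set(group_of(row) for row in rows))
    rng = random.Random(seed)
    rng.shuffle(groups)
    # Choose whole groups for the test side, never individual rows.
    n_test = int(round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])
    train = [row for row in rows if group_of(row) not in test_groups]
    test = [row for row in rows if group_of(row) in test_groups]
    return train, test


# Toy rows: (event id, feature value); several rows share an event id.
rows = [("evt1", 0.1), ("evt1", 0.2), ("evt2", 0.3),
        ("evt3", 0.4), ("evt3", 0.5), ("evt4", 0.6)]
train, test = split_by_group(rows, lambda row: row[0], test_fraction=0.5)
print(sorted(set(r[0] for r in train)), sorted(set(r[0] for r in test)))
```

Splitting on groups rather than rows avoids the situation where near-duplicate samples from the same event appear on both sides and inflate the test score.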
Owner
Yandex
Yandex open source projects and technologies