This repository contains FEDOT - an open-source framework for automated modeling and machine learning (AutoML)

National Center for Cognitive Research of ITMO University

Last update: Dec 26, 2022

Related tags

Deep Learning machine-learning automation genetic-programming hyperparameter-optimization evolutionary-algorithms multimodality automl automated-machine-learning parameter-tuning structural-learning fedot

Overview

This repository contains FEDOT - an open-source framework for automated modeling and machine learning (AutoML). It can build custom modeling pipelines for different real-world processes in an automated way using an evolutionary approach. FEDOT supports classification (binary and multiclass), regression, clustering, and time series prediction tasks.

The main feature of the framework is the complex management of interactions between various blocks of pipelines. First of all, this includes the stage of machine learning model design. FEDOT allows you to not just choose the best type of the model, but to create a complex (composite) model. It allows you to combine several models of different complexity, which helps you to achieve better modeling quality than when using any of these models separately. Within the framework, we describe composite models in the form of a graph defining the connections between data preprocessing blocks and model blocks.

The framework is not limited to specific AutoML tasks (such as pre-processing of input data, feature selection, or optimization of model hyperparameters). It solves a more general structural learning problem: for a given data set, it builds a solution in the form of a directed acyclic graph (DAG) whose nodes are ML models, pre-processing procedures, and data transformations.
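
To make the graph idea concrete, here is a minimal sketch in plain Python of a composite model stored as a DAG and evaluated in dependency order. The `Node` class, the `evaluate` function, and the toy operations are hypothetical and are not FEDOT's own classes; they only illustrate how preprocessing and model blocks can feed one another.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """One block of a composite model (hypothetical, not a FEDOT class)."""
    name: str
    operation: Callable[[List[List[float]]], List[float]]
    parents: List["Node"] = field(default_factory=list)


def evaluate(node: Node, data: List[float], cache: Dict[str, List[float]]) -> List[float]:
    """Evaluate a node after its parents; shared parents are computed once."""
    if node.name in cache:
        return cache[node.name]
    # Primary nodes receive the raw data; secondary nodes receive parent outputs.
    inputs = [evaluate(p, data, cache) for p in node.parents] or [data]
    cache[node.name] = node.operation(inputs)
    return cache[node.name]


# Toy operations standing in for preprocessing and model blocks.
def scale(inputs):
    peak = max(abs(v) for v in inputs[0])
    return [v / peak for v in inputs[0]]


def double(inputs):
    return [2 * v for v in inputs[0]]


def shift(inputs):
    return [v + 1 for v in inputs[0]]


def average(inputs):
    return [sum(vals) / len(vals) for vals in zip(*inputs)]


scaling = Node("scaling", scale)
model_a = Node("model_a", double, parents=[scaling])
model_b = Node("model_b", shift, parents=[scaling])
root = Node("ensemble", average, parents=[model_a, model_b])

result = evaluate(root, [1.0, 2.0, 4.0], cache={})
print(result)
```

The `scaling` node has two children here, which is what makes the structure a graph rather than a simple chain.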

The project is maintained by the research team of the Natural Systems Simulation Lab, which is a part of the National Center for Cognitive Research of ITMO University.

The intro video about Fedot is available here:

FEDOT features

The main features of the framework are as follows:

The FEDOT architecture is highly flexible and therefore the framework can be used to automate the creation of mathematical models for various problems, types of data, and models;
FEDOT already supports popular ML libraries (scikit-learn, keras, statsmodels, etc.), but you can also integrate custom tools into the framework if necessary;
Pipeline optimization algorithms are not tied to specific data types or tasks, but you can use special templates for a specific task class or data type (time series forecasting, NLP, tabular data, etc.) to increase the efficiency;
The framework is not limited to machine learning: models from specific domains (for example, ODE- or PDE-based models) can be embedded into pipelines;
Additional methods for hyperparameter tuning can be seamlessly integrated into FEDOT (in addition to those already supported);
The resulting pipelines can be exported in a human-readable JSON format, which allows you to achieve reproducibility of the experiments.

Thus, compared to other frameworks, FEDOT:

Is not limited to specific modeling tasks and claims versatility and expandability;
Allows managing the complexity of models and thereby achieving better results;
Allows building models using input data of various nature (texts, images, tables, etc.) and consisting of different types of models.
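
The JSON export mentioned in the feature list can be pictured with the standard `json` module. This is only a sketch: the field names below (`nodes`, `operation`, `params`, `parents`) are illustrative assumptions and do not reproduce FEDOT's actual export schema.

```python
import json

# A hypothetical, simplified description of a three-node pipeline.
pipeline = {
    "nodes": [
        {"id": 0, "operation": "scaling", "params": {}, "parents": []},
        {"id": 1, "operation": "ridge", "params": {"alpha": 1.0}, "parents": [0]},
        {"id": 2, "operation": "linear", "params": {}, "parents": [0, 1]},
    ]
}

# sort_keys and indent make the text stable and easy to diff between runs.
exported = json.dumps(pipeline, indent=2, sort_keys=True)

# Loading the text back yields the same structure, so an experiment can be
# re-created from the saved description alone.
restored = json.loads(exported)
print(restored == pipeline)
```

Because the structure and the hyperparameters are stored as plain text, two runs can be compared with an ordinary diff tool.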

Installation

Common installation:

$ pip install fedot

In order to work with FEDOT source code:

$ git clone https://github.com/nccr-itmo/FEDOT.git
$ cd FEDOT
$ pip install -r requirements.txt
$ pytest -s test

How to use

FEDOT provides a high-level API that gives simple access to its capabilities. At the moment, the API supports classification and regression tasks only; support for time series forecasting and clustering is planned (you can still solve these tasks via advanced initialization, see below). Input data can be passed as NumPy arrays, pandas data frames, or paths to CSV files.

To use the API, follow these steps:

Import the Fedot class:

from fedot.api.main import Fedot

Initialize the Fedot object and define the type of modeling problem. It provides a fit/predict interface:

Fedot.fit runs the optimization and returns the resulting composite model;
Fedot.predict returns the prediction for the given input data;
Fedot.get_metrics estimates the quality of predictions using selected metrics.

Numpy arrays, pandas data frames, and file paths can be used as sources of input data.

model = Fedot(problem='classification')

model.fit(features=train_data.features, target=train_data.target)
prediction = model.predict(features=test_data.features)

metrics = model.get_metrics()
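
The snippet above leaves `train_data` and `test_data` undefined. As a hedged sketch of one way to supply data, the code below writes two small CSV files with the standard library and passes their paths to the API. The column names, the toy data, and the helper names `write_csv` and `run_fedot` are made up for illustration, and passing the target as a column name is an assumption about the installed FEDOT version.

```python
import csv
import os
import tempfile


def write_csv(path, rows):
    """Write rows (lists of values) under a header that names the target column."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["x1", "x2", "target"])
        writer.writerows(rows)


def run_fedot(train_path, test_path):
    """Fit and evaluate FEDOT on two CSV files (requires `pip install fedot`)."""
    from fedot.api.main import Fedot  # imported lazily so the helpers work without it

    model = Fedot(problem="classification")
    # Assumption: the target is given as the name of a column in the file.
    model.fit(features=train_path, target="target")
    prediction = model.predict(features=test_path)
    return prediction, model.get_metrics()


# Toy data: the class is 1 when x1 + x2 > 10 (made up for this example).
rows = [[a, b, int(a + b > 10)] for a in range(10) for b in range(10)]
workdir = tempfile.mkdtemp()
train_path = os.path.join(workdir, "train.csv")
test_path = os.path.join(workdir, "test.csv")
write_csv(train_path, rows[::2])   # even-indexed rows for training
write_csv(test_path, rows[1::2])   # odd-indexed rows for testing
```

With FEDOT installed, `prediction, metrics = run_fedot(train_path, test_path)` would run the optimization on these files; the run time depends on the default timeout of the installed version.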

For more advanced approaches, see the Examples & Tutorials section.

Examples & Tutorials

Jupyter notebooks with tutorials are located in the examples repository.

Notebooks are issued with the corresponding release versions (the default version is 'latest').

Also, external examples are available:

Kaggle: baseline for Microsoft Stock - Time Series Analysis task

Extended examples:

Credit scoring problem, i.e. binary classification task
Time series forecasting, i.e. random process regression
Spam detection, i.e. natural language preprocessing
Movie rating prediction with multi-modal data

Also, several video tutorials are available (in Russian).

Publications about FEDOT

We have also published several posts and news items on different aspects of the framework:

In English:

How AutoML helps to create composite AI? - towardsdatascience.com
AutoML for time series: definitely a good idea - towardsdatascience.com
AutoML for time series: advanced approaches with FEDOT framework - towardsdatascience.com
Experience of hackathon winning with FEDOT - itmo.news
FEDOT as a factory of human-competitive results - video

In Russian:

General concepts of evolutionary design for composite pipelines - habr.com
Automated time series forecasting with FEDOT - habr.com
Details of FEDOT-based solution for Emergency DataHack - habr.com

Project structure

The latest stable release of FEDOT is on the master branch.

The repository includes the following directories:

Package core contains the main classes and scripts. It is the core of the FEDOT framework
Package examples includes several how-to use cases where you can start to discover how FEDOT works
All unit and integration tests are located in the test directory
The sources of the documentation are in the docs directory

Also, you can check the benchmarking repository, which was developed to compare FEDOT against some well-known AutoML frameworks.

Current R&D and future plans

Currently, we are working on new features and trying to improve the performance and the user experience of FEDOT. The major ongoing tasks and plans:

Effective and ready-to-use pipeline templates for certain tasks and data types;
Integration with GPU via Rapids framework;
Alternative optimization methods of fixed-shaped pipelines;
Integration with MLFlow for import and export of the pipelines;
Improvement of high-level API.

Also, we are doing several research tasks related to AutoML time-series benchmarking and multi-modal modeling.

Any contribution is welcome. Our R&D team is open for cooperation with other scientific teams as well as with industrial partners.

Documentation

The general description is available in the FEDOT.Docs repository.

Also, a detailed FEDOT API description is available on Read the Docs.

Contribution Guide

The contribution guide is available in the repository.

Acknowledgments

We acknowledge the contributors for their important impact and the participants of the numerous scientific conferences and workshops for their valuable advice and suggestions.

Side projects

A prototype of the web GUI for FEDOT is available in the FEDOT.WEB repository.

Contacts

Supported by

National Center for Cognitive Research of ITMO University

Citation

@article{nikitin2021automated,
  title = {Automated evolutionary approach for the design of composite machine learning pipelines},
  author = {Nikolay O. Nikitin and Pavel Vychuzhanin and Mikhail Sarafanov and Iana S. Polonskaia and Ilia Revin and Irina V. Barabanova and Gleb Maximov and Anna V. Kalyuzhnaya and Alexander Boukhanovsky},
  journal = {Future Generation Computer Systems},
  year = {2021},
  issn = {0167-739X},
  doi = {https://doi.org/10.1016/j.future.2021.08.022}
}

@inproceedings{polonskaia2021multi,
  title = {Multi-Objective Evolutionary Design of Composite Data-Driven Models},
  author = {Polonskaia, Iana S. and Nikitin, Nikolay O. and Revin, Ilia and Vychuzhanin, Pavel and Kalyuzhnaya, Anna V.},
  booktitle = {2021 IEEE Congress on Evolutionary Computation (CEC)},
  year = {2021},
  pages = {926-933},
  doi = {10.1109/CEC45853.2021.9504773}
}

Other papers - in ResearchGate.

Comments

Visualization of the operations used in pipelines
Adds visualization of the operations used in the evolutionary process. Changes:

Added operation_kde plot to show operations by generations.

Added operation_animated_bar to show operations by generations with (or without) changing fitness.

Modified fitness_box plot. Now it visually fits the mentioned visuals and supports pct_best parameter.

Examples of visuals: KDE plot; animated bar plot (with and without fitness shown); modified fitness box plot.
opened by MorrisNein 14
Preprocessing refactor
Changed the core architecture. Preprocessing operations now can be placed in separate nodes. Important changes:

The Model abstraction is now replaced with an Operation that has two descendant classes: DataOperation class and Model class;

Preprocessing operations, both simple, such as scaling, and advanced, such as lagged transformation for time series, can be used in nodes as previously models can be;

New tuning (optimization of hyperparameters in nodes) is implemented: there are two classes, ChainTuner and SequentialTuner, both of which use the hyperopt library;

New operator for mutation was implemented. Now it is possible to change hyperparameters in the nodes during chain composing;

Now it is possible to feed different data sources to primary nodes, in particular, this functionality is used with exogenous time series, an example for which is available in examples;

A new low-level abstraction, "Implementation", appeared in the core. This abstraction includes custom models that are implemented in FEDOT;

New operations have been implemented and added to the data_operations repository, such as feature selection, exclusion of anomalous values, smoothing of time series, and much more.
opened by Dreamlone 13
+ pipeline node operations cache support

Example showing the meaning of using an operations cache (haven't tested with multiprocessing yet): examples/advanced/pipelines_caching.py

But...if you've got no time to wait for the results (up to 8 minutes by default), here is the result image with example_number=1, timeout=1., n_partitions=10:

Usages of the cache were added to: fedot/api/main.py fedot/core/composer/gp_composer/gp_composer.py fedot/core/validation/compose/metric_estimation.py fedot/api/api_utils/api_composer.py

Class responsible for the caching: fedot/core/composer/cache.py

Other changes contain mostly readability/style/performance fixes + full logger support.

opened by IIaKyJIuH 12
Feature/pipeline explanations
What's new:

An inner repo fedot/explainability for the corresponding experiments.

Explainer abstract class, implementing an interface.

SurrogateExplainer class for building surrogate explanation models. The only supported surrogate at the moment is the decision tree (for both classification and regression tasks).

explain method of the Fedot class for creating explanations -- instances of Explainer successors.

enhancement cases test
opened by MorrisNein 12
NLP init
New method InputData.from_text(), where you can pass meta_file.csv with text in it or path to directories with text files

New TextData class, where all the nlp utils are located. Not expected to use directly.

Current idea is: text files -> feature extraction (make table data, not text) -> pass to model/chain

[x] Finish BatchLoader for creation of meta_file.csv for collections of data (images, text)

[x] Finish the text files -> meta_file.csv

[x] Add tests

[x] add packed data && unpacking script
opened by BarabanovaIrina 11
enabled `logging_level` option
Fixed Log initialization uniqueness (in collaboration with @maypink)

Got rid of the useless logging_level_opt option (@maypink)

Enabled usage whilst multiprocessing: separate file writings, log level preserving

Added show_progress option to tuner

Made a pytest fixture that "resets" singletons before each test to preserve their pattern

Minor fixes: caching, sortings, docstrings, logical fixes
opened by IIaKyJIuH 10
735-improving-fedot-documentation (structure)
Partial improvement.

Changed structure of the documentation according to this document

Added docstrings to classes/functions/props and improved existing ones

Created copy_doc decorator to copy docstrings for logically the same functions

Improved code blocks, typings, variables names
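
A decorator of this kind can be very small. The following is a hypothetical sketch of the idea and is not the implementation from the PR; the functions `fit` and `fit_alias` are invented for the example.

```python
def copy_doc(source):
    """Return a decorator that copies the docstring of `source` onto its target."""
    def decorator(target):
        target.__doc__ = source.__doc__
        return target
    return decorator


def fit(data):
    """Train on `data` and return the fitted object."""
    return data


@copy_doc(fit)
def fit_alias(data):
    return fit(data)


print(fit_alias.__doc__)
```

Only the docstring is copied; the decorated function keeps its own name and signature, which is the difference from `functools.wraps`.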
opened by IIaKyJIuH 10
Encoding bug
add import/export for "one_hot_encoding" operation

check whether categories in test data are contained in train data

fix issues with categorical expansion

fix testing time

Right now the preprocessing pipeline is:

convert values to one type in columns

fill missing values

one hot encoding for categorical
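
The three steps can be sketched on plain Python lists as follows. This is an illustration under stated assumptions, not the code from the PR: the function names are hypothetical, and filling gaps with the mean or with the most frequent value is an assumed strategy.

```python
def unify_type(column):
    """Cast a column to float if every present value is numeric, else to str."""
    present = [v for v in column if v is not None]
    try:
        [float(v) for v in present]
        cast = float
    except (TypeError, ValueError):
        cast = str
    return [None if v is None else cast(v) for v in column]


def fill_missing(column):
    """Replace None with the mean (numeric) or the most frequent value (text)."""
    present = [v for v in column if v is not None]
    if isinstance(present[0], float):
        filler = sum(present) / len(present)
    else:
        filler = max(set(present), key=present.count)
    return [filler if v is None else v for v in column]


def one_hot(column):
    """Expand a text column into one 0/1 column per category (sorted order)."""
    categories = sorted(set(column))
    return {c: [int(v == c) for v in column] for c in categories}


age = fill_missing(unify_type([30, "40", None, 50.0]))
city = fill_missing(unify_type(["Oslo", None, "Rome", "Oslo"]))
encoded = one_hot(city)
print(age, encoded)
```

The order matters: gaps can only be filled sensibly once a column has a single type, and encoding a column that still contains gaps would create a spurious category.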

Closed issues: #400 #399 #412
opened by MAGLeb 10
Sensitivity
Approaches to the structural analysis of a composite model have been implemented. At the moment, a Node can be:

Simply deleted while keeping its subtree,

Tuned,

Replaced with nodes:

custom ones (pass a list of models directly)

random ones (pass the number of nodes to generate)

otherwise, all models available for the task will be applied.

Assessed for the sensitivity of model hyperparameters using Sobol indices

This can be done:

via class NodeAnalysis, which can analyze one node with several approaches.

via class ChainStructureAnalysis, which can analyze several nodes with several approaches

fedot.utilities.define_metric_by_task contains MetricByTask. If no metric is specified, the default metric for the Task defined inside InputData is used.

Class diagram:

Mini-tutorial: https://fedot.readthedocs.io/en/latest/fedot/features/sensitivity_analysis.html

UPD: as part of this PR, the collection of code coverage information was disabled in the manual_build action.
opened by BarabanovaIrina 10
Remote evaluation feature implemented for pipelines in composer
What's new:

'Remote' module:

run_pipeline.py script is added. It receives the pipeline and data description as input and returns the fitted pipeline. This script is meant to be called on the remote node

ComputationalSetup class added to implement the remote fit of pipelines

Batch evaluation of the pipelines added to the composer if ComputationalSetup is initialized.

Logic for REST requests to a remote server with computational resources.

Other:

Indices corrected for Data. Now, the indices are preserved during transformations and are not created from scratch.

New notebook added for industrial case

Minor fixes in forecasting

Known disadvantages:

Now, the ComputationalSetup is DataMall-specific. The refactoring to support different strategies is expected.

Client should be moved to an external repository and imported from PyPI.

enhancement in progress
opened by nicl-nno 9

Very strict dependency requirements

I'm very excited to try out this package, but its strict dependency requirements are giving me headaches when setting up my conda environment.

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
fedot 0.3.1 requires scikit-learn==0.24.1, but you have scikit-learn 0.24.2 which is incompatible.
fedot 0.3.1 requires scikit-optimize==0.7.4, but you have scikit-optimize 0.9.dev0 which is incompatible.
fedot 0.3.1 requires xgboost==1.0.1, but you have xgboost 1.4.2 which is incompatible.

Is it possible to relax the requirements?

dependencies

opened by bacalfa 9

Metric evaluation error: y_true and y_pred contain different number of classes

The error "Metric evaluation error: y_true and y_pred contain different number of classes 45, 85" is raised for the click_prediction_small dataset

Looks like class stratification is failing.
bug

opened by nicl-nno 2
Caching performance is worse than the one from the earlier versions

As of FEDOT v0.6.0, the performance of the cache (both for operations and for preprocessors) has regressed significantly compared with previous versions. This can be seen from quality metrics on classification datasets: they are better when the cache is turned off...

It is important to speed up the cacher so that it is reasonable to use it (it is enabled by default).

opened by IIaKyJIuH 1
Bug with filter operation

https://colab.research.google.com/drive/1MRgwB9sCrFeXRme0eGSZ0KeBRAMlnfOk?usp=sharing data: https://drive.google.com/file/d/1Msx8sUVm6jh6L-QH20aJrw47DvvKk-HM/view?usp=sharing

Exception in run_regression_example(visualise)
     22 **composer_params)
     23
---> 24 auto_model.fit(features=train, target='target')
     25 prediction = auto_model.predict(features=test)
     26 if visualise:

/usr/local/lib/python3.8/dist-packages/fedot/api/main.py in fit(self, features, target, predefined_model)
    181 # Final fit for obtained pipeline on full dataset
    182 if self.history and not self.history.is_empty() or not self.current_pipeline.is_fitted:
--> 183 self._train_pipeline_on_full_dataset(recommendations, full_train_not_preprocessed)
    184 self.params.api_params['logger'].message('Final pipeline was fitted')
    185 else:

/usr/local/lib/python3.8/dist-packages/fedot/api/main.py in _train_pipeline_on_full_dataset(self, recommendations, full_train_not_preprocessed)
    456 {k: v for k, v in recommendations.items()
    457 if k != 'cut'})
--> 458 self.current_pipeline.fit(
    459 full_train_not_preprocessed,
    460 n_jobs=self.params.api_params['n_jobs'],

/usr/local/lib/python3.8/dist-packages/fedot/core/pipelines/pipeline.py in fit(self, input_data, time_constraint, n_jobs)
    139
    140 if time_constraint is None:
--> 141 train_predicted = self._fit(input_data=copied_input_data)
    142 else:
    143 train_predicted = self._fit_with_time_limit(input_data=copied_input_data, time=time_constraint)

/usr/local/lib/python3.8/dist-packages/fedot/core/pipelines/pipeline.py in _fit(self, input_data, process_state_dict, fitted_operations)
    102 with Timer() as t:
    103 computation_time_update = not self.root_node.fitted_operation or self.computation_time is None
--> 104 train_predicted = self.root_node.fit(input_data=input_data)
    105 if computation_time_update:
    106 self.computation_time = round(t.minutes_from_start, 3)

/usr/local/lib/python3.8/dist-packages/fedot/core/pipelines/node.py in fit(self, input_data, **kwargs)
    389 self.log.debug(f'Trying to fit secondary node with operation: {self.operation}')
    390
--> 391 secondary_input = self._input_from_parents(input_data=input_data, parent_operation='fit')
    392
    393 return super().fit(input_data=secondary_input)

/usr/local/lib/python3.8/dist-packages/fedot/core/pipelines/node.py in _input_from_parents(self, input_data, parent_operation)
    429 parent_nodes = self._nodes_from_with_fixed_order()
    430
--> 431 parent_results, _ = _combine_parents(parent_nodes, input_data,
    432 parent_operation)
    433

/usr/local/lib/python3.8/dist-packages/fedot/core/pipelines/node.py in _combine_parents(parent_nodes, input_data, parent_operation)
    471 parent_results.append(prediction)
    472 elif parent_operation == 'fit':
--> 473 prediction = parent.fit(input_data=input_data)
    474 parent_results.append(prediction)
    475 else:

/usr/local/lib/python3.8/dist-packages/fedot/core/pipelines/node.py in fit(self, input_data, **kwargs)
    301 else:
    302 self.node_data = input_data
--> 303 return super().fit(input_data)
    304
    305 def unfit(self):

/usr/local/lib/python3.8/dist-packages/fedot/core/pipelines/node.py in fit(self, input_data)
    179 self.fit_time_in_seconds = round(t.seconds_from_start, 3)
    180 else:
--> 181 operation_predict = self.operation.predict_for_fit(fitted_operation=self.fitted_operation,
    182 data=input_data,
    183 params=self._parameters)

/usr/local/lib/python3.8/dist-packages/fedot/core/operations/operation.py in predict_for_fit(self, fitted_operation, data, params, output_mode)
    112 for example, is the operation predict probabilities or class labels
    113 """
--> 114 return self._predict(fitted_operation, data, params, output_mode, is_fit_stage=True)
    115
    116 def _predict(self, fitted_operation, data: InputData, params: Optional[OperationParameters] = None,

/usr/local/lib/python3.8/dist-packages/fedot/core/operations/operation.py in _predict(self, fitted_operation, data, params, output_mode, is_fit_stage)
    122
    123 if is_fit_stage:
--> 124 prediction = self._eval_strategy.predict_for_fit(
    125 trained_operation=fitted_operation,
    126 predict_data=data)

/usr/local/lib/python3.8/dist-packages/fedot/core/operations/evaluation/regression.py in predict_for_fit(self, trained_operation, predict_data)
     84 :return:
     85 """
---> 86 prediction = trained_operation.transform_for_fit(predict_data)
     87 converted = self._convert_to_output(prediction, predict_data)
     88 return converted

/usr/local/lib/python3.8/dist-packages/fedot/core/operations/evaluation/operation_implementations/data_operations/sklearn_filters.py in transform_for_fit(self, input_data)
     59 mask = self.operation.inlier_mask_
     60 if mask is not None:
---> 61 input_data = update_data(input_data, mask)
     62 else:
     63 self.log.info("Filtering Algorithm: didn't fit correctly. Return all objects")

/usr/local/lib/python3.8/dist-packages/fedot/core/operations/evaluation/operation_implementations/data_operations/sklearn_filters.py in update_data(input_data, mask)
    231 old_idx = modified_input_data.idx
    232
--> 233 modified_input_data.features = old_features[mask]
    234 modified_input_data.target = old_target[mask]
    235 modified_input_data.idx = np.array(old_idx)[mask]

IndexError: boolean index did not match indexed array along dimension 0; dimension is 68 but corresponding boolean dimension is 55
bug

opened by valer1435 0
Proposed method of installation on MAC M1 and GPU utilisation

What is the proposed method of installation on Apple silicon (M1) Macs?

After a lot of trials, my installation worked with the following set of commands in an x86-architecture conda env

in a new python env:

$ conda config --env --set subdir osx-64
$ conda install python=3.8.13
$ pip install fedot
$ brew install libomp
$ conda install -c conda-forge lightgbm

Also, can't seem to get any GPU activity, is there any config script I can include in code for M1?

opened by gdamianakos 1

Releases(v0.6.1)

v0.6.1(Dec 12, 2022)
Hi, folks! We're making a new minor release with a number of improvements. This is an important release in the sense that it is the last release of self-contained FEDOT. The next major release will mark the separation of the optimizer core into a separate project.

New features, better quality & changes in API

More intuitive predict interface for time series forecasting (#930)

Pipeline save/load now has more intuitive behavior (#971)

Early stopping criteria can now take a timeout into consideration, not only the number of iterations (early_stopping_timeout API parameter)

Graph nodes now can be accessed by name or uid (#982)

Tuner speed is better due to better initial params in the search space (#985)

Enhancements and fixes:

Fix inplace modification of data during data definition (resolves #943)

Fix regression preprocessing (#955)

Fewer evaluation errors during population selection in corner cases (#956)

Fix getting suitable operations for multi ts (#981)

Integration tests are fixed & passing now

More minor fixes & minor class interface refactorings

Important fix for multi-objective optimization (#996)

Documentation is extended

Architectural refactorings are continued:

Better PipelineAdapter (#941)

Abstracting the optimiser core (most tasks in issue #713 are done). Notably, the Serializer subsystem is now extendable (#969)

v0.6.0(Oct 18, 2022)
Hi everyone! We released a new major version of FEDOT - 0.6.0

It includes a lot of major changes:

Improvement of API for multi-modal datasets and models;

New PipelineBuilder (#597) – that simplifies manual construction of ML Pipelines;

Joblib was embedded as a multiprocessing backend (#843). Data exchange between processes minimized (#926);

A stratified k-fold strategy is embedded for cases with imbalanced data;

New visualization of graphs, pipelines and optimisation history;

Also, this release contains a lot of architectural refactorings of the framework:

New Graph Adapter subsystem (#876);

Merging two different implementations of the evolutionary optimizer (parameter-free & usual) into one EvoGraphOptimizer (#687)

Architectural refactorings of the Graph hierarchy (#750)

Introduce notions of Objective & Fitness (#654) – classes that replace simple float metric values & abstract over single- vs. multi-objective metrics

Refactored parameter classes – for more intuitive segregation of different parameters controlling optimization process (#852)

Refactored DataMerger facility

Refactoring of selection operator implementation (#918)

Also, there are various bug-fixes related to ML operations, evolutionary operators & internal Graph operations.
v0.5.1(Feb 22, 2022)
The most important changes:

Cache support for the cross-validation implemented;

AutoML can be run without a time limit;

Graph operators improved;

Multi-task pipelines processing improved;

Custom parameters support for external optimizer;

Time series processing improved;

Multimodal table processing improved;

Lightweight docker prepared;

Isolation Forest added as new operation

Major and minor bugs are fixed.

v0.5.0(Dec 31, 2021)
Hi everyone!

We released a new major version of FEDOT - 0.5.0. It includes several major changes.

The new version is available and can be imported via pip: pip install fedot==0.5.0

The most important changes:

Preprocessing for tabular features and target variables improved dramatically.

The API is refactored and improved (presets, parameters, etc). The postfix "_tun" has been removed from the presets, so now you have to specify composer_params={'with_tuning': True} to enable tuning. Important changes in preset names: light is now best_quality, and ultra_light is now fast_train.
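
For code written against earlier versions, the renames can be applied mechanically. The helper below is a hypothetical sketch and is not part of FEDOT; it assumes that old preset names with tuning ended in "_tun", as the note above implies, and it leaves any name without a listed rename unchanged.

```python
# Renames stated in the release notes above.
RENAMED = {"light": "best_quality", "ultra_light": "fast_train"}


def migrate_preset(old_name):
    """Translate a pre-0.5.0 preset name into (new_name, composer_params)."""
    with_tuning = old_name.endswith("_tun")
    base = old_name[: -len("_tun")] if with_tuning else old_name
    new_name = RENAMED.get(base, base)  # names without a listed rename are kept
    composer_params = {"with_tuning": True} if with_tuning else {}
    return new_name, composer_params


print(migrate_preset("light_tun"))
print(migrate_preset("ultra_light"))
```

The returned dictionary is meant to be passed as the `composer_params` argument named in the release note.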

Support for external optimisers is implemented.

Zero-code console interface is implemented.

Surrogate decision trees for pipeline interpretation are added.

Custom model support is implemented.

Better presets and models for time series forecasting (derivatives, polynomial models, cuts, etc)

Better integration with FEDOT.Web

Prototype for remote infrastructure support

Evolutionary optimiser improved (stopping criterion, progress bar, better mutations, etc)

Major and minor bugs are fixed.

v0.4.1(Oct 8, 2021)
We released a new major version of FEDOT - 0.4.1. It includes several large changes, features, and fixes.

The new version is available and can be imported via pip: pip install fedot==0.4.1

The most important changes:

Major bugs fixed for evolutionary composing: we got rid of many annoying problems related to fitness evaluation and mutations

Multi-variate time series forecasting improved

Torch-based LSTM model added

Encoding stage for categorical features implemented

Docker containers updated, GPU example improved

Export of pipelines improved

Processing of hyperparameters improved

API refactored

v0.4.0(Aug 18, 2021)
We released a new major version of FEDOT - 0.4.0. It includes several large changes, features, and fixes.

The new version is available and can be imported via pip: pip install fedot==0.4.0

The most important changes:

Infrastructure:

Docker version added;

GPU support added;

Requirements became more flexible

Optimizer:

Evolutionary optimizer generalized to allow the application to the custom non-ML tasks

Mutation schemes improved for a more explainable evolution process;

History saving extended

Time series:

Cross-validation for the time series implemented;

Sparse lagged transformation for time series implemented to improve performance;

Common:

API updated and simplified;

Processing of categorical features improved;

Fixes and improvements for the hyperparameters tuning;

The ‘chain’ term is replaced with ‘pipeline’ for better understandability.

Utilities:

Sensitivity analysis improved

v0.3.1(May 30, 2021)
During the last month, we have merged several major features and fixed a bunch of bugs. Some of the features are experimental and should be tested extensively in real-world cases, but we have tried our best and covered them with unit tests.

The new version (fedot == 0.3.1) is available and can be imported via pip.

The most important features:

ML pipelines for multi-modal datasets

Decompose operation in ML pipelines

Cross-validation in Composer

Memory and Time profilers added

Memory consumption improvements

For details, see the post in repository: https://github.com/nccr-itmo/FEDOT/discussions/317
0.3.0(May 10, 2021)
Hello everyone!

Our team has finally finished preparing a new major release, fedot == 0.3.0. Thanks to the whole dev team who worked on it! It is available and can be installed via pip: https://pypi.org/project/fedot/0.3.0/. The most important changes:

Extended data operations and their automatic optimization

Previously, Fedot (Chain objects) allowed one to automatically build ML pipelines including models, but data operations (like scaling or gap-filling) were embedded in the nodes and could be changed only manually. In the latest release we significantly refactored the core logic of the framework, so data operations are now fully supported as separate nodes. This extends the overall search space of suitable ML pipelines.

New AutoML for time-series forecasting

Fedot now supports not only manual building of ML pipelines for time-series forecasting but also automated building via Composer! Fedot allows one to build pipelines and forecast time series for a given window size and forecasting length. It is also possible to use exogenous variables for forecasting. To check all features, see the examples in the repository.
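
The roles of the window size and the forecasting length can be shown with a minimal sketch in plain Python. The function name `lagged_table` is hypothetical, and FEDOT's own lagged transformation may differ in its details.

```python
def lagged_table(series, window_size, forecast_length):
    """Turn a series into (features, targets) rows for supervised learning.

    Each feature row holds `window_size` consecutive values; the matching
    target row holds the `forecast_length` values that follow them.
    """
    features, targets = [], []
    last_start = len(series) - window_size - forecast_length
    for start in range(last_start + 1):
        split = start + window_size
        features.append(series[start:split])
        targets.append(series[split:split + forecast_length])
    return features, targets


series = [1, 2, 3, 4, 5, 6, 7]
X, y = lagged_table(series, window_size=3, forecast_length=2)
for row, target in zip(X, y):
    print(row, "->", target)
```

After this step the forecasting task looks like an ordinary multi-output regression on a table, which is why regular models can be used in the nodes of a forecasting pipeline.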

Our early studies showed that this is a promising approach that can advance the AutoML field for time series. We are actively working on benchmarking well-known SOTA frameworks for time-series forecasting, and novel results will be published in the near future. Also, you can check our fresh preprint about gap filling in time series using the Fedot framework.

Black-box optimization of ML pipeline hyperparameters

During the experiments, we found out that our previous version of hyperparameter tuning was ineffective (it also did not work for preprocessing nodes). Therefore, we significantly refactored the tuning module, and it now provides several schemes for black-box optimization of ML pipeline hyperparameters. For details, check the tuning module sources and the examples.

Multi-Objective AutoML for pipelines

Several months ago, during a team discussion, we formulated a hypothesis: "Most AutoML frameworks try to maximize only one metric - prediction quality. But can we optimize several metrics (like pipeline complexity, for instance) simultaneously?" So we carried out a study in which evolutionary multi-objective optimization algorithms (like NSGA-II and SPEA-2) were adapted to the AutoML task. We concluded that this is a promising feature and integrated it into Fedot. The preprint is available, and you can also check the example of how to use multi-objective optimization via the Fedot API.
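
Optimizing several metrics at once means comparing candidates by dominance rather than by a single number. The sketch below shows only that dominance relation and the resulting Pareto front; it is not an implementation of NSGA-II or SPEA-2, and the pipeline names and scores are made up.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` everywhere and better somewhere.

    Both objectives are minimized: (prediction error, pipeline complexity).
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return {
        name: score
        for name, score in candidates.items()
        if not any(dominates(other, score) for other in candidates.values())
    }


# Made-up (error, number of nodes) pairs for four hypothetical pipelines.
candidates = {
    "single_model": (0.30, 1),
    "small_ensemble": (0.22, 3),
    "large_ensemble": (0.21, 7),
    "bloated": (0.25, 8),
}
front = pareto_front(candidates)
print(sorted(front))
```

The result is a set of trade-offs rather than one winner: the user can pick the simplest pipeline whose error is acceptable.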

New input data support for image classification

Earlier, we announced that images would be supported in Fedot. We made several changes in InputData, and now pipelines for image classification can be built manually. We also added several CNN architectures and an example of their usage. Composer should also work for image classification, but we have not tested this functionality extensively yet.

Also, we have fixed a bunch of bugs and improved the Fedot API.

Thanks to everyone who is following our progress! Any issues and user reports are welcome. Cya!
v0.2.1(Mar 12, 2021)
Greetings to everyone who follows our team and FEDOT development progress!

Today, we released a new version of fedot == 0.2.1.

Here is the list of the main changes:

The main API is updated. A basic how-to is available at https://github.com/nccr-itmo/FEDOT/blob/master/notebooks/intro_to_automl.ipynb

Support for pandas DataFrames is added

Logging is improved

The sensitivity analysis of the composite model (chain) structure is added. The description is available in https://fedot.readthedocs.io/en/latest/fedot/features/sensitivity_analysis.html

The new version can be obtained using pip install fedot==0.2.1

Our team is very interested in any user feedback, and new issues are very welcome! Thank you!
v0.2.0(Jan 21, 2021)
Greetings to everyone who follows our team and FEDOT development progress!

Last week, we released a new version of fedot == 0.2.0. A bunch of bugs in the framework were fixed and the fixes were merged to the master (main) and release branches. Here is the list of the main changes:

NLP tasks are now supported, and a simple example of text classification was added (see here)

The first version of the fedot high-level API was implemented, see the readme for instructions

Fixed several bugs with chain import/export

Composer should now work correctly for time-series tasks

Embedded visualization of the composing process and the resulting chains was improved, see the example here


Owner

National Center for Cognitive Research of ITMO University

GitHub https://fedot.readthedocs.io

FEDn is an open-source, modular and ML-framework agnostic framework for Federated Machine Learning

FEDn is an open-source, modular and ML-framework agnostic framework for Federated Machine Learning (FedML) developed and maintained by Scaleout Systems. FEDn enables highly scalable cross-silo and cross-device use-cases over FEDn networks.

75 Nov 9, 2022

Model search is a framework that implements AutoML algorithms for model architecture search at scale

Model search (MS) is a framework that implements AutoML algorithms for model architecture search at scale. It aims to help researchers speed up their exploration process for finding the right model architecture for their classification problems (i.e., DNNs with different types of layers).

3.2k Dec 31, 2022

A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

Master status: Development status: Package information: TPOT stands for Tree-based Pipeline Optimization Tool. Consider TPOT your Data Science Assista

8.9k Dec 30, 2022

An image base contains 490 images for learning (400 cars and 90 boats), and another 21 images for testing

SVM Data An image base contains 490 images for training (400 cars and 90 boats), plus another 21 images for testing. Prétrait

3 Nov 30, 2021

An AutoML Library made with Optuna and PyTorch Lightning

An AutoML Library made with Optuna and PyTorch Lightning Installation Recommended pip install -U gradsflow From source pip install git+https://github.

294 Dec 17, 2022

Neural networks applied in recognizing guitar chords using python, AutoML.NET with C# and .NET Core

Chord Recognition Demo application The demo application is written in C# with .NETCore. As of July 9, 2020, the only version available is for windows

24 Oct 22, 2022

MMRazor: a model compression toolkit for model slimming and AutoML

Documentation: https://mmrazor.readthedocs.io/ English | 简体中文 Introduction MMRazor is a model compression toolkit for model slimming and AutoML, which

899 Jan 2, 2023

An Open Source Machine Learning Framework for Everyone

Documentation TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, a

170.1k Jan 4, 2023

An Open Source Machine Learning Framework for Everyone

Documentation TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, a

170.1k Jan 5, 2023

An Open Source Machine Learning Framework for Everyone

Documentation TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, a

153.2k Feb 13, 2021

Clairvoyance: a Unified, End-to-End AutoML Pipeline for Medical Time Series

Clairvoyance: A Pipeline Toolkit for Medical Time Series Authors: van der Schaar Lab This repository contains implementations of Clairvoyance: A Pipel

89 Dec 7, 2022

AutoDeeplab / auto-deeplab / AutoML for semantic segmentation, implemented in Pytorch

AutoML for Image Semantic Segmentation Currently this repo contains the only working open-source implementation of Auto-Deeplab which, by the way out-

299 Dec 17, 2022

PaddleRobotics is an open-source algorithm library for robots based on Paddle, including open-source parts such as human-robot interaction, complex motion control, environment perception, SLAM positioning, and navigation.

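简体中文 | English PaddleRobotics PaddleRobotics is an open-source algorithm library for robots based on Paddle, including open-source parts such as human-robot interaction, complex motion control, environment perception, and SLAM positioning and navigation. Human-robot interaction: active multimodal interaction technology TFVT-HRI. Active multimodal interaction technology works through robot inputs such as vision, speech, and touch sensors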

185 Dec 26, 2022

LogDeep is an open source deeplearning-based log analysis toolkit for automated anomaly detection.

279 Dec 13, 2022

This repository contains the source code and data for reproducing results of Deep Continuous Clustering paper

Deep Continuous Clustering Introduction This is a Pytorch implementation of the DCC algorithms presented in the following paper (paper): Sohil Atul Sh

197 Nov 29, 2022

The Malware Open-source Threat Intelligence Family dataset contains 3,095 disarmed PE malware samples from 454 families

MOTIF Dataset The Malware Open-source Threat Intelligence Family (MOTIF) dataset contains 3,095 disarmed PE malware samples from 454 families, labeled

112 Dec 13, 2022

This repository contains the source code for the paper "DONeRF: Towards Real-Time Rendering of Compact Neural Radiance Fields using Depth Oracle Networks",

DONeRF: Towards Real-Time Rendering of Compact Neural Radiance Fields using Depth Oracle Networks Project Page | Video | Presentation | Paper | Data L

281 Dec 22, 2022

This repository contains the source code of our work on designing efficient CNNs for computer vision

Efficient networks for Computer Vision This repo contains source code of our work on designing efficient networks for different computer vision tasks:

386 Nov 26, 2022

⚡ Fast • 🪶 Lightweight • 0️⃣ Dependency • 🔌 Pluggable • 😈 TLS interception • 🔒 DNS-over-HTTPS • 🔥 Poor Man's VPN • ⏪ Reverse & ⏩ Forward • 👮🏿 "Proxy Server" framework • 🌐 "Web Server" framework • ➵ ➶ ➷ ➠ "PubSub" framework • 👷 "Work" acceptor & executor framework

Table of Contents Features Install Using PIP Stable version Development version Using Docker Stable version Development version Using HomeBrew Stable

2.2k Jan 8, 2023
