A unified framework for machine learning with time series

Overview

Welcome to sktime

We provide specialized time series algorithms and scikit-learn compatible tools to build, tune and validate time series models for multiple learning problems, including:

  • Forecasting,
  • Time series classification,
  • Time series regression.

For deep learning, see our companion package: sktime-dl.

Installation

The package is available via PyPI using:

pip install sktime

Alternatively, you can install it via conda:

conda install -c conda-forge sktime

The package is actively being developed and some features may not be stable yet.

Development version

To install the development version, please see our advanced installation instructions.

Quickstart

Forecasting

from sktime.forecasting.all import *

y = load_airline()
y_train, y_test = temporal_train_test_split(y)
fh = ForecastingHorizon(y_test.index, is_relative=False)
forecaster = ThetaForecaster(sp=12)  # monthly seasonal periodicity
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
smape_loss(y_test, y_pred)
>>> 0.08661468139978168

For more, check out the forecasting tutorial.

Time Series Classification

from sktime.classification.all import *
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_arrow_head(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y)
classifier = TimeSeriesForest()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
accuracy_score(y_test, y_pred)
>>> 0.8679245283018868

For more, check out the time series classification tutorial.

Documentation

How to contribute

We follow the all-contributors specification - and all kinds of contributions are welcome!

If you have a question, chat with us or raise an issue. Your help and feedback are extremely welcome!

Development roadmap

  1. Multivariate/panel forecasting,
  2. Time series clustering,
  3. Time series annotation (segmentation and anomaly detection),
  4. Probabilistic time series modelling, including survival and point processes.

Read our detailed roadmap here.

How to cite sktime

If you use sktime in a scientific publication, we would appreciate citations to the following paper:

Markus Löning, Anthony Bagnall, Sajaysurya Ganesh, Viktor Kazakov, Jason Lines, Franz Király (2019): “sktime: A Unified Interface for Machine Learning with Time Series”

Bibtex entry:

@inproceedings{sktime,
    author = {L{\"{o}}ning, Markus and Bagnall, Anthony and Ganesh, Sajaysurya and Kazakov, Viktor and Lines, Jason and Kir{\'{a}}ly, Franz J},
    booktitle = {Workshop on Systems for ML at NeurIPS 2019},
    title = {{sktime: A Unified Interface for Machine Learning with Time Series}},
    date = {2019},
}
Issues
  • Pairwise transformers, kernels/distances on tabular data and panel data - base class, examples, extension templates

    Related to #52 (oldest open issue!) and #1062.

    This PR contains:

    • base classes for a new scitype: pairwise transformers, i.e., distance matrix makers and kernel matrix makers; there are two base classes for two different inputs, tabular (classical distances/kernels) and time series panels (e.g., time warping distances)
    • an example tabular distance ScipyDist that interfaces the scipy distances
    • an example panel distance AggrDist that uses any tabular distance to create an aggregate sample distance between time series, e.g., the mean Euclidean distance
    • extension templates for both scitypes
    • tests for the concrete scitypes added
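As a rough illustration of the idea behind the two base classes, the sketch below builds an aggregate distance between time series from a tabular distance, here the mean Euclidean distance over all pairs of time points. It is pure Python; the names `aggregate_distance` and `distance_matrix` are hypothetical and are not the `ScipyDist` or `AggrDist` API from the PR. Taking the mean over all pairs of time points is an assumption about what "aggregate" means.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)
from statistics import mean


def aggregate_distance(series_a, series_b, tabular_dist=dist, aggfunc=mean):
    """Aggregate a tabular distance over all pairs of time points.

    Each series is a sequence of time points; each time point is a
    sequence of variable values (one "row" of a table).
    """
    return aggfunc(tabular_dist(a, b) for a in series_a for b in series_b)


def distance_matrix(panel, panel2=None, **kwargs):
    """Pairwise distance matrix between the series of two panels."""
    panel2 = panel if panel2 is None else panel2
    return [[aggregate_distance(s, t, **kwargs) for t in panel2] for s in panel]


panel = [
    [(0.0, 0.0), (0.0, 0.0)],  # series 0: two time points, two variables
    [(3.0, 4.0), (3.0, 4.0)],  # series 1
]
D = distance_matrix(panel)
print(D)  # [[0.0, 5.0], [5.0, 0.0]]
```

Passing a different `tabular_dist` or `aggfunc` changes the panel distance without touching the matrix logic, which is the kind of composition the two scitypes are meant to allow.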

    This may be useful for future extension of the clustering module, or refactorings and extensions of the TSC module, @chrisholder, @TonyBagnall.

    Also pinging @jasonlines since this is a proposal for how the existing distances (due to @jasonlines) could be refactored.

    opened by fkiraly 67
  • Forecasting support for multivariate y and multiple input/output types - working prototype

    This is a merge/review ready sketch for how multivariate y could be supported for multivariate forecasters, alongside the possibility to pass np.array and pd.Series. This is based on the design in STEP no.5, adapted to the new forecasting base class.

    The key ingredients are:

    • converters parameterized by from, to, and as - in the convertIO module. Besides the obvious conversion functionality, the converters can be given access to a dictionary via reference in the store argument, where information for inverting lossy conversions (like from pd.DataFrame to np.array) can be stored
    • a new tag y_type which encodes the type of y that the private _fit, _predict, and _update assume internally - for now, it's just one type and not a list of compatible types
    • some logic in the public "plumbing" area of fit, predict, update, which converts inputs to the public layer to the desired input of the logic layer and back
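The converter idea can be sketched in a few lines. The example below is a toy with hypothetical names, not the PR's `convertIO` module. It converts a dict-based series to a plain list, which is lossy because the index is dropped, and uses the `store` dictionary to remember what is needed to invert the conversion. Since `from` and `as` are Python keywords, the parameters are named `from_type`, `to_type` and `as_scitype` here.

```python
def convert(obj, from_type, to_type, as_scitype="Series", store=None):
    """Convert obj between two toy container types.

    store : optional dict, updated in place with the information
        needed to invert a lossy conversion later.
    """
    if as_scitype != "Series":
        raise NotImplementedError(f"no converters for scitype {as_scitype!r}")
    if from_type == to_type:
        return obj
    if (from_type, to_type) == ("dict", "list"):
        if store is not None:
            store["index"] = list(obj)  # remember the index that the list drops
        return list(obj.values())
    if (from_type, to_type) == ("list", "dict"):
        # restore the remembered index, or fall back to 0, 1, 2, ...
        index = store["index"] if store and "index" in store else range(len(obj))
        return dict(zip(index, obj))
    raise TypeError(f"no converter from {from_type!r} to {to_type!r}")


y = {"2021-01": 1.0, "2021-02": 2.0}
store = {}
values = convert(y, "dict", "list", "Series", store)          # public -> inner type
doubled = [2 * v for v in values]                             # stand-in for the inner logic
y_back = convert(doubled, "list", "dict", "Series", store)    # inner -> public type
```

In this sketch the public method would own `store`, so the inner logic never needs to know which type the user originally passed.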

    This is for discussion, and I expect major changes before we merge.

    It might be especially interesting to discuss the following interaction with typing: the third argument of the converter, the "as" argument, is a scitype, and designed to have the same name as the respective typing composite type. Also see discussion in #965, #973 and #974.

    Thinking of the logical continuation of this:

    • extend this to the other arguments, e.g., X
    • the individual argument checks like check_X etc should be replaced by checks of the kind check(what, as : scitype), e.g., check(y, "Series") should replace the less generic check_y.
    • extend this to time series classification, to support 3D numpy arrays, awkward arrays, nested data frames

    A "visionary" pathway may see this working with type annotations that are scitypes, not typing machine types, e.g.,

    fit(X : Series, Y : Series)
    etc
    _fit(X: Series, Y : UnivariateSeries)
    

    and checks/conversions being done automatically.

    module:forecasting 
    opened by fkiraly 56
  • Added trend_ and resid_ attributes to STLTransformer. Bugfix.

    What does this implement/fix? Explain your changes.

    Added trend_ and resid_ attributes to STLTransformer. Refactor.

    bug module:forecasting 
    opened by aiwalter 40
  • Automated learning algorithm overview

    Contributors: @fkiraly, @mloning, @afzal442, @abdulelahsm, @sparkingdark

    idea:

    web database for learning algorithms with fields for:

    • class name,
    • scitype (e.g. forecaster, classifier, etc.),
    • original contributor,
    • maintainer/current code owner,
    • link to API reference,

    optional additional fields:

    • links to literature references,
    • package dependency,
    • code health/status (under development, mature, etc),
    • link to example notebooks

    use cases:

    • search for suitable algorithms given a problem
    • search for information about a given algorithm
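To make the proposed fields and use cases concrete, here is a small sketch of what one record and a search over records might look like. Everything in it is illustrative: the `AlgorithmRecord` class, the `search` helper, the contributor names and the example.org links are placeholders, not actual sktime metadata.

```python
from dataclasses import dataclass, field


@dataclass
class AlgorithmRecord:
    class_name: str
    scitype: str            # e.g. "forecaster", "classifier"
    contributor: str
    maintainer: str
    api_link: str
    references: list = field(default_factory=list)  # optional field


def search(records, scitype=None, text=None):
    """Filter records by scitype and/or a case-insensitive name match."""
    hits = records
    if scitype is not None:
        hits = [r for r in hits if r.scitype == scitype]
    if text is not None:
        hits = [r for r in hits if text.lower() in r.class_name.lower()]
    return hits


# placeholder entries; names of people and URLs are made up
records = [
    AlgorithmRecord("ThetaForecaster", "forecaster", "contributor-a",
                    "maintainer-a", "https://example.org/api/ThetaForecaster"),
    AlgorithmRecord("TimeSeriesForest", "classifier", "contributor-b",
                    "maintainer-b", "https://example.org/api/TimeSeriesForest"),
]

classifiers = [r.class_name for r in search(records, scitype="classifier")]
theta = [r.class_name for r in search(records, text="theta")]
```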

    reference example from mlr3 library:

    • https://mlr3extralearners.mlr-org.com/articles/learners/learner_status.html
    • https://mlr3extralearners.mlr-org.com/articles/learners/list_learners.html
    • GitHub action to automatically generate table: https://github.com/mlr-org/mlr3extralearners/blob/main/.github/workflows/update_learner_table.yml

    related code in sktime:

    from sktime.utils import all_estimators
    all_estimators(estimator_types="classifier")
    

    https://github.com/alan-turing-institute/sktime/blob/ee7a06843a44f4aaec7582d847e36073a9ab0566/sktime/utils/__init__.py#L16

    questions:

    • integrate it with API reference: https://www.sktime.org/en/latest/api_reference.html (adding columns to that table)?

    notes:

    • make clear that not all algorithms are reference implementations, even though we link them to the original research paper; some may be re-implemented, adapted or interfaced from other packages
    • encourage people to ping algorithm maintainers directly, using their GitHub user name, when they have questions about an algorithm
    • add a column for the maintainer, using their GitHub user name
    documentation 
    opened by mloning 38
  • Add Clasp for time series segmentation (CIKM'21 publication)

    Reference Issues/PRs

    #1346

    What does this implement/fix? Explain your changes.

    This request implements ClaSP, a novel time series segmentation method published at CIKM '21.

    The segmentation is implemented as an annotator, and the profile in ClaSP is implemented as a SeriesToSeriesTransformer.

    implementing algorithms 
    opened by patrickzib 37
  • Refactor tags

    Is your feature request related to a problem? Please describe.

    • use of "capabilities" rather than "tags" in classification module
    • unclear naming of _all_tags method
    • inconsistent tags for composition algorithms (see #977)
    • integration in all_estimators function to make them usable in finding/filtering models

    Describe the solution you'd like

    • refactor capabilities into tags
    • rename _all_tags to get_tags and make public
    • add tag kwarg to all_estimators
    • add tests to catch inconsistent tags
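One way the proposed solution could fit together is sketched below: a public `get_tags` that merges class-level tags along the inheritance chain, and a filter that selects estimators by tag values. The class names, tag names and the `filter_estimators` helper are hypothetical; this is not the sktime implementation.

```python
class BaseEstimator:
    _tags = {"handles_missing_data": False}

    @classmethod
    def get_tags(cls):
        """Collect tags from all parent classes; child classes override parents."""
        tags = {}
        for klass in reversed(cls.__mro__):
            # vars() only sees tags defined on klass itself, not inherited ones
            tags.update(vars(klass).get("_tags", {}))
        return tags


class Forecaster(BaseEstimator):
    _tags = {"univariate_only": True}


class MyForecaster(Forecaster):
    _tags = {"handles_missing_data": True}


def filter_estimators(estimators, filter_tags=None):
    """Keep the estimator classes whose tags match all requested values."""
    filter_tags = filter_tags or {}
    return [
        est for est in estimators
        if all(est.get_tags().get(k) == v for k, v in filter_tags.items())
    ]


selected = filter_estimators(
    [Forecaster, MyForecaster], filter_tags={"handles_missing_data": True}
)
```

A test for inconsistent tags could then iterate over all estimators and compare `get_tags()` with the observed behaviour.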
    API design 
    opened by mloning 37
  • Forecaster interface: getting in-sample forecasts, getting residuals

    Is your feature request related to a problem? Please describe.

    I think there should be a method or property to obtain in-sample fitted values or residuals, which are important in many forecasting methods, such as forecast combination.

    Describe the solution you'd like

    Add a method or property to BaseForecaster and it should be implemented for every forecaster.

    Additional context

    A traditional forecast combination method based on the variance of in-sample residuals.
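To show why in-sample residuals matter here, the sketch below combines two forecasts with weights inversely proportional to the variance of each model's in-sample residuals. This is the textbook inverse-variance scheme, assumed here as an example of the kind of method meant; the function names and the numbers are illustrative.

```python
from statistics import pvariance


def inverse_variance_weights(residuals_per_model):
    """Weight each model by 1 / variance of its in-sample residuals."""
    inverse = [1.0 / pvariance(res) for res in residuals_per_model]
    total = sum(inverse)
    return [w / total for w in inverse]


def combine(forecasts, weights):
    """Weighted average of the individual point forecasts."""
    return sum(w * f for w, f in zip(weights, forecasts))


# in-sample residuals (observed minus fitted) of two hypothetical forecasters
residuals_a = [1.0, -1.0, 1.0, -1.0]   # variance 1
residuals_b = [2.0, -2.0, 2.0, -2.0]   # variance 4
weights = inverse_variance_weights([residuals_a, residuals_b])

# the model with the smaller residual variance gets the larger weight
combined = combine([100.0, 110.0], weights)
print(weights, combined)
```

With these numbers the weights come out at roughly 0.8 and 0.2, so the combined forecast is about 102. Without access to fitted values or residuals, such weights cannot be computed from a fitted forecaster.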

    feature request good first issue API design module:forecasting 
    opened by AngelPone 36
  • Plotting the prediction interval

    Reference Issues/PRs

    fixes the issue #541

    What does this implement/fix? Explain your changes.

    Adds the pred_int option to the plot_series function

    Does your contribution introduce a new dependency? If yes, which one?

    No

    What should a reviewer concentrate their feedback on?

    Focus on the plot_series function in the plotting.py file

    Any other comments?

    PR checklist

    For all contributions
    • [ ] I've added myself to the list of contributors.
    • [ ] Optionally, I've updated sktime's CODEOWNERS to receive notifications about future changes to these files.
    • [ ] I've added unit tests and made sure they pass locally.
    For new estimators
    • [ ] I've added the estimator to the online documentation.
    • [ ] I've updated the existing example notebooks or provided a new one to showcase how my estimator works.
    opened by Ifeanyi30 35
  • New Ensemble Forecasting Methods

    Reference Issues/PRs

    Refers to #271 for implementing New Online Ensemble Forecasting Methods

    What does this implement/fix? Explain your changes.

    This implements Hedge-based DTOL (decision-theoretic online learning) for choosing how to weight different forecasters in an online setting. I included some implementations of these algorithms, with more to come:

    • NormalHedge
    • Hedge (Incremental/Doubling)

    I also included one that is very similar to stacking that performs a non-negative least squares fit from the forecasters to the output.
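For readers unfamiliar with Hedge, the following sketch shows the basic exponential-weights update applied to forecasters. It uses a fixed learning rate `eta` and squared-error loss, both of which are assumptions. It is not the PR's `OnlineEnsembleForecaster`, and it does not cover the NormalHedge or Incremental/Doubling variants.

```python
from math import exp


def hedge_update(weights, losses, eta=0.5):
    """One Hedge step: down-weight each expert exponentially in its loss."""
    scaled = [w * exp(-eta * loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    return [w / total for w in scaled]


def online_ensemble(expert_forecasts, targets, eta=0.5):
    """Combine expert forecasts step by step, updating weights after each target."""
    n_experts = len(expert_forecasts[0])
    weights = [1.0 / n_experts] * n_experts  # start with uniform weights
    combined = []
    for forecasts, y in zip(expert_forecasts, targets):
        # predict with the current weights, before the target is revealed
        combined.append(sum(w * f for w, f in zip(weights, forecasts)))
        losses = [(f - y) ** 2 for f in forecasts]
        weights = hedge_update(weights, losses, eta)
    return combined, weights


targets = [1.0, 2.0, 3.0]
# one row per time step: expert 0 is exact, expert 1 is always off by one
expert_forecasts = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
combined, weights = online_ensemble(expert_forecasts, targets)
```

After a few steps the weight of the exact expert dominates, so the combined forecast moves toward it.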

    Does your contribution introduce a new dependency? If yes, which one?

    Currently, the online experts algorithms are included as a script, but as more are added, it would be good to include more in a dedicated repository for online experts.

    PR checklist for new estimators

    For the 'OnlineEnsembleForecaster' I have:

    • [x] I have added unit tests and made sure they pass locally.
    • [x] I have updated the existing example notebooks or provided a new one to showcase how my algorithm works.
    • [x] I have updated sktime's estimator overview and I'm committed to maintain my contribution and provide bug fixes if necessary.

    Any other comments?

    I also added a cv method called 'OnlineSplitter' which helps me control the behavior of update_predict for forecasting in an online fashion.

    implementing algorithms 
    opened by magittan 33
  • TSC base template refactor

    I've started refactoring the time series classification base template according to #993.

    This is just a start, but I would like to:

    • move what is now "capabilities" to a joint "tags" system across all estimator scitypes
    • add functionality to handle multiple input types along the lines of #980, especially since some estimators assume nested df and some assume numpy
    • refactor TSC similar to #955

    @TonyBagnall, we will be discussing tags today in the core dev sprint hours, with @mloning, @aiwalter and @thayeylolu - would be great if you could join.

    module:classification 
    opened by fkiraly 32
  • Adding soft dependencies instructions

    Reference Issues/PRs

    Fixes Issue #1759

    What does this implement/fix? Explain your changes.

    This PR adds instructions on soft dependencies to the extension templates and the documentation.

    opened by Aparna-Sakshi 0
  • [ENH] remove readthedocs ads

    Now that sktime has a regular donation income, we should use it to remove the readthedocs ads (which come from readthedocs, not sktime).

    The requirement for open source projects is a gold membership which costs 5 dollars per month. https://readthedocs.org/sustainability/#gold-membership

    For discussion & decision in the next CC meeting.

    governance 
    opened by fkiraly 0
  • Lloyds refinement

    Reference Issues/PRs

    What does this implement/fix? Explain your changes.

    This adds fit attempts to Lloyd's algorithm. This is needed because the result of k-means depends heavily on the initialisation of the centers. Rerunning the model multiple times (10 times by default) with different starting centers and keeping the best result improves the consistency and quality of the models produced.
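The restart strategy can be illustrated with a small, self-contained sketch of Lloyd's algorithm on scalar data. The PR applies the idea to time series clustering; the functions `lloyd` and `best_of_n`, the use of inertia (the sum of squared distances to the nearest center) as the quality score, and the data are simplifications assumed here.

```python
import random


def lloyd(points, k, rng, max_iter=100):
    """One run of Lloyd's algorithm from random starting centers."""
    centers = rng.sample(points, k)
    for _ in range(max_iter):
        # assignment step: attach each point to its nearest center
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: (p - centers[j]) ** 2)
            clusters[nearest].append(p)
        # update step: move each center to the mean of its cluster
        new_centers = [
            sum(c) / len(c) if c else centers[j] for j, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    inertia = sum(min((p - c) ** 2 for c in centers) for p in points)
    return centers, inertia


def best_of_n(points, k, n_init=10, seed=0):
    """Rerun Lloyd's algorithm n_init times and keep the lowest-inertia run."""
    rng = random.Random(seed)
    runs = [lloyd(points, k, rng) for _ in range(n_init)]
    return min(runs, key=lambda run: run[1])


points = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]
centers, inertia = best_of_n(points, k=2)
```

Each run may end in a different local optimum; keeping the run with the lowest inertia is what makes the final result less dependent on the starting centers.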

    Does your contribution introduce a new dependency? If yes, which one?

    No

    What should a reviewer concentrate their feedback on?

    Any other comments?

    PR checklist

    For all contributions
    • [x] I've added myself to the list of contributors.
    • [x] Optionally, I've updated sktime's CODEOWNERS to receive notifications about future changes to these files.
    • [x] I've added unit tests and made sure they pass locally.
    For new estimators
    • [x] I've added the estimator to the online documentation.
    • [x] I've updated the existing example notebooks or provided a new one to showcase how my estimator works.
    opened by chrisholder 0
  • [BUG] Potential issues with Theta Forecaster predict_quantiles

    Describe the bug

    There is a bug in the Theta Forecaster:

    • The score function called as z = _zscore(1 - a) is called with the default value for two_tailed (True), as seen in the function signature def _zscore(level: float, two_tailed: bool = True) -> float. Given that we are seeking quantiles and not coverage, the distribution should be one-tailed.
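The difference between the two settings is easy to see numerically. The helper below mirrors the quoted signature, but its body is an assumption about how such a function typically works; it is not copied from sktime.

```python
from statistics import NormalDist


def zscore(level, two_tailed=True):
    """Critical value of the standard normal for a given level.

    two_tailed=True splits the tail probability 1 - level over both tails
    (appropriate for a central interval); two_tailed=False puts it all in
    one tail (appropriate for a single quantile).
    """
    alpha = 1.0 - level
    if two_tailed:
        alpha /= 2.0
    return -NormalDist().inv_cdf(alpha)


z_interval = zscore(0.95, two_tailed=True)    # about 1.96
z_quantile = zscore(0.95, two_tailed=False)   # about 1.645
print(z_interval, z_quantile)
```

Under these assumptions, using the two-tailed value for the 0.95 quantile would actually return the 0.975 quantile, which places the predicted quantiles too far from the point forecast.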

    Following a discussion with @fkiraly, there might be an additional bug related to the predictive variance not increasing when calculating the error:

    z = _zscore(1 - a)
    error = z * sem
    

    I suspect that the following lines may account for the increase in variance, but need to verify:

    sem = self.sigma_ * np.sqrt(
        self.fh.to_relative(self.cutoff) * self.initial_level_ ** 2 + 1
    )

    I will be working on a PR shortly. Some code cleaning is also necessary.

    bug 
    opened by kejsitake 0
  • [BUG] Forecasting pipelines to infer correctly whether they support univariate data or not

    Forecasting pipelines need to infer correctly whether they support univariate data or not.

    For instance, the TransformedTargetForecaster will not support multivariate forecasts if one of the transformers is univariate.
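The inference rule described here amounts to "a pipeline supports multivariate data only if every step does". A minimal sketch, with a hypothetical `Step` class and attribute name standing in for real estimators and their tags:

```python
class Step:
    """Stand-in for a transformer or forecaster with a single tag."""

    def __init__(self, name, univariate_only):
        self.name = name
        self.univariate_only = univariate_only


def pipeline_supports_multivariate(steps):
    """A pipeline handles multivariate data only if no step is univariate-only."""
    return not any(step.univariate_only for step in steps)


steps = [Step("detrend", False), Step("deseasonalize", True), Step("forecast", False)]
supported = pipeline_supports_multivariate(steps)  # False: one step is univariate-only
```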

    Arising from #1846.

    bug module:forecasting 
    opened by fkiraly 0
  • [ENH] data type conventions - should time index be unique?

    Should all data type conventions adopt the strict requirement that time indices be unique?

    For instance, should the pd.DataFrame format specification ensure that the DataFrame.index has no duplicates? Currently, the checks allow it, but if there are duplicates, there are errors in various places, for instance in the conversions.

    The "safe" fix seems to be to add checks to require the assumption for which our codebase is robust; the "soft" fix is to add tests which check correctness on fixtures where there are repeated indices for compliance.
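The "safe" fix could be as small as the following check, sketched here on a plain list of index labels so that it is self-contained. The function name is hypothetical; for an actual pandas index, the `Index.is_unique` property provides the same information.

```python
from collections import Counter


def check_unique_index(index, var_name="index"):
    """Raise if any label occurs more than once; otherwise return the index."""
    duplicates = sorted(
        label for label, count in Counter(index).items() if count > 1
    )
    if duplicates:
        raise ValueError(
            f"{var_name} must be unique, but found repeated labels: {duplicates}"
        )
    return index


ok = check_unique_index([0, 1, 2, 3])

try:
    check_unique_index([0, 1, 1, 2], var_name="X.index")
    message = None
except ValueError as err:
    message = str(err)  # mentions the repeated label 1
```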

    An example "bug" (bug depending on this assumption) is in the AA_datatypes notebook as of current main. Since load_arrow_head has a (potential) bug that loads a data frame with repeated indices, the subsequent call to convert_to results in a data frame that is still nested and does not comply with the df-multiindex specification.

    API design module:datatypes 
    opened by fkiraly 0
  • ipython: Notebook does not appear to be JSON

    Describe the issue linked to the documentation

    When I opened the documentation tutorial in the GitHub repository and the Jupyter notebook file for this example, I got an error: NotJSONError("Notebook does not appear to be JSON: '../../../examples/01_forecasting.ipynb'.... This probably means that there is an extra comma in the JSON. How should I solve this problem? Could the maintainers re-upload the file?

    An error is also reported directly in sktime projects.

    Suggest a potential alternative/fix

    After the modification, upload a new project file

    documentation 
    opened by chenruhai 0
  • [ENH] weighted average of distances & distance composites

    There should be a wrapper class CompositeDistance (or similar) that allows specifying, as parameters:

    • an iterable of distances, or just a single distance
    • a way to aggregate, e.g., "mean" or "median" (string) or a numpy function
    • an iterable of weights (optional)
    • an iterable of variables, or lists of variables (optional)

    If iterables are passed, they must be of same length.

    The distance computed is the aggregate/average of: the i-th distance (the one distance, if not an iterable), computed on the i-th variable (set; all variables if none provided), multiplied by the i-th weight (1 if not provided).
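One possible reading of this specification is sketched below on plain vectors. The class only illustrates the parameter handling and the aggregation rule; it is not an sktime class and does not follow the pairwise transformer interface.

```python
from math import dist
from statistics import mean, median

_AGGREGATORS = {"mean": mean, "median": median}


class CompositeDistance:
    def __init__(self, distances, aggfunc="mean", weights=None, variables=None):
        given = [a for a in (distances, weights, variables) if isinstance(a, (list, tuple))]
        lengths = {len(a) for a in given}
        if len(lengths) > 1:
            raise ValueError("all iterables passed must have the same length")
        n = lengths.pop() if lengths else 1
        # a single distance is reused for every term
        self.distances = list(distances) if isinstance(distances, (list, tuple)) else [distances] * n
        self.weights = list(weights) if weights is not None else [1.0] * n
        self.variables = list(variables) if variables is not None else [None] * n
        self.aggfunc = _AGGREGATORS[aggfunc] if isinstance(aggfunc, str) else aggfunc

    def __call__(self, x, x2):
        terms = []
        for distance, weight, cols in zip(self.distances, self.weights, self.variables):
            if cols is None:  # no variables given: use all of them
                a, b = x, x2
            else:
                cols = [cols] if isinstance(cols, int) else cols
                a, b = [x[c] for c in cols], [x2[c] for c in cols]
            terms.append(weight * distance(a, b))
        return self.aggfunc(terms)


x, x2 = (0.0, 0.0, 0.0), (3.0, 4.0, 12.0)
# Euclidean distance on variables 0 and 1 (weight 1) and on variable 2 (weight 0.5)
composite = CompositeDistance(dist, weights=[1.0, 0.5], variables=[[0, 1], 2])
value = composite(x, x2)  # mean of 1.0 * 5.0 and 0.5 * 12.0
```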

    good first issue module:distances 
    opened by fkiraly 0
  • [ENH] weighted version of scipy based distance class

    Also related to #1884.

    I think it would be a good idea to extend ScipyDist with an optional parameter weights (1D iterable of float) which multiplies the i-th column of X, X2 with the i-th element in weights before applying the distance. If weights are passed, there should be a check in _transform for weights having length equal to the number of columns of X, X2.
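The proposed behaviour, scaling the columns first and applying the distance afterwards, can be sketched as follows on lists of rows. The function is hypothetical and uses `math.dist` in place of a scipy distance to stay self-contained.

```python
from math import dist


def weighted_distance_matrix(X, X2, weights=None, tabular_dist=dist):
    """Scale column i of X and X2 by weights[i], then compute pairwise distances."""
    n_columns = len(X[0])
    if weights is None:
        weights = [1.0] * n_columns
    # the length check suggested in the issue
    if len(weights) != n_columns or any(len(row) != n_columns for row in X2):
        raise ValueError("weights must have one entry per column of X and X2")

    def scale(row):
        return [w * v for w, v in zip(weights, row)]

    return [[tabular_dist(scale(a), scale(b)) for b in X2] for a in X]


# after scaling: (1, 2) versus (4, 6), so the Euclidean distance is 5
D = weighted_distance_matrix([[1.0, 1.0]], [[4.0, 3.0]], weights=[1.0, 2.0])
```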

    Note: this should be done manually in sktime, since weighting is deprecated in scipy.

    good first issue module:distances 
    opened by fkiraly 0
  • [ENH] Arithmetic dunder methods for distances/kernels

    From #1884, I think it would be useful to implement the following arithmetic dunder methods for distances/kernels, i.e., the two base classes BasePairwiseTransformer and BasePairwiseTransformerPanel:

    • [ ] __add__ and __radd__. If other is a float, then this should return a wrapped class whose transform(X, X2) returns other + self.transform(X, X2). If other is a transformer, then this should return a wrapped class whose transform(X, X2) returns self.transform(X, X2) + other.transform(X, X2).
    • [ ] __mul__ and __rmul__. Similar as above for float and transformer.
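To help judge the design, here is a toy version of what these dunder methods could look like. The class `PairwiseTransformer` below is a stand-in and not `BasePairwiseTransformer`; it wraps a function of two rows and applies the arithmetic elementwise to the resulting matrices.

```python
import operator
from numbers import Number


class PairwiseTransformer:
    """Toy pairwise transformer: wraps a function of two rows."""

    def __init__(self, pair_func):
        self.pair_func = pair_func

    def transform(self, X, X2):
        return [[self.pair_func(a, b) for b in X2] for a in X]

    def _combine(self, other, op):
        if isinstance(other, Number):
            return PairwiseTransformer(lambda a, b: op(self.pair_func(a, b), other))
        if isinstance(other, PairwiseTransformer):
            return PairwiseTransformer(
                lambda a, b: op(self.pair_func(a, b), other.pair_func(a, b))
            )
        return NotImplemented

    def __add__(self, other):
        return self._combine(other, operator.add)

    def __mul__(self, other):
        return self._combine(other, operator.mul)

    # addition and multiplication commute, so the reflected versions are the same
    __radd__ = __add__
    __rmul__ = __mul__


manhattan = PairwiseTransformer(lambda a, b: sum(abs(u - v) for u, v in zip(a, b)))
chebyshev = PairwiseTransformer(lambda a, b: max(abs(u - v) for u, v in zip(a, b)))

combined = 2 * manhattan + chebyshev + 1
result = combined.transform([[0, 0]], [[1, 3]])  # 2 * 4 + 3 + 1
```

Returning `NotImplemented` for unsupported operands lets Python fall back to the other operand's methods or raise a `TypeError`.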

    Is this a good idea design-wise?

    feature request module:distances 
    opened by fkiraly 0
Releases(v0.9.0)
  • v0.9.0(Dec 8, 2021)

    What's New

    Please see our changelog for a description of all changes.

    New Contributors

    • @OliverMatthews made their first contribution in https://github.com/alan-turing-institute/sktime/pull/1588
    • @Carlosbogo made their first contribution in https://github.com/alan-turing-institute/sktime/pull/1600
    • @chernika158 made their first contribution in https://github.com/alan-turing-institute/sktime/pull/1615
    • @fstinner made their first contribution in https://github.com/alan-turing-institute/sktime/pull/1567
    • @MrPr3ntice made their first contribution in https://github.com/alan-turing-institute/sktime/pull/1636
    • @lmmentel made their first contribution in https://github.com/alan-turing-institute/sktime/pull/1644
    • @AngelPone made their first contribution in https://github.com/alan-turing-institute/sktime/pull/1641
    • @marcio55afr made their first contribution in https://github.com/alan-turing-institute/sktime/pull/1671

    All Contributors

    All contributors: @Carlosbogo, @MatthewMiddlehurst, @OliverMatthews, @RNKuhns, @TonyBagnall, @fkiraly, @freddyaboulton and @mloning

  • v0.8.1(Oct 28, 2021)

    What's New

    Please see our changelog for a description of all changes.

    All contributors: @Aparna-Sakshi, @BINAYKUMAR943, @IlyasMoutawwakil, @MatthewMiddlehurst, @Piyush1729, @RNKuhns, @RavenRudi, @SveaMeyer13, @TonyBagnall, @afzal442, @aiwalter, @bobbys-dev, @boukepostma, @danbartl, @eyalshafran, @fkiraly, @freddyaboulton, @kejsitake, @mloning, @myprogrammerpersonality, @patrickzib, @ronnie-llamado, @xiaobenbenecho and @yairbeer

    New Contributors

    • @ronnie-llamado made their first contribution in https://github.com/alan-turing-institute/sktime/pull/1437
    • @bobbys-dev made their first contribution in https://github.com/alan-turing-institute/sktime/pull/1453
    • @myprogrammerpersonality made their first contribution in https://github.com/alan-turing-institute/sktime/pull/1464
    • @xiaobenbenecho made their first contribution in https://github.com/alan-turing-institute/sktime/pull/1409
    • @yairbeer made their first contribution in https://github.com/alan-turing-institute/sktime/pull/1475
    • @kejsitake made their first contribution in https://github.com/alan-turing-institute/sktime/pull/1473
    • @boukepostma made their first contribution in https://github.com/alan-turing-institute/sktime/pull/1493
    • @RavenRudi made their first contribution in https://github.com/alan-turing-institute/sktime/pull/1487
    • @eyalshafran made their first contribution in https://github.com/alan-turing-institute/sktime/pull/1489
    • @danbartl made their first contribution in https://github.com/alan-turing-institute/sktime/pull/1356
  • v0.8.0(Sep 17, 2021)

    What's New

    Please see our changelog for a description of all changes.

    All contributors: @Aparna-Sakshi, @AreloTanoh, @BINAYKUMAR943, @Flix6x, @GuzalBulatova, @IlyasMoutawwakil, @Lovkush-A, @MatthewMiddlehurst, @RNKuhns, @SveaMeyer13, @TonyBagnall, @afzal442, @aiwalter, @bilal-196, @corvusrabus, @fkiraly, @freddyaboulton, @juanitorduz, @justinshenk, @ltoniazzi, @mathco-wf, @mloning, @moradabaz, @pul95, @tensorflow-as-tf, @thayeylolu, @victordremov, @whackteachers and @xloem

  • v0.7.0(Jul 13, 2021)

    What's New

    Added

    • new module (experimental): Time Series Clustering (#1049) @TonyBagnall
    • new module (experimental): Pairwise transformers, kernels/distances on tabular data and panel data - base class, examples, extension templates (#1071) @fkiraly @chrisholder
    • new module (experimental): Series annotation and PyOD adapter (#1021) @fkiraly @satya-pattnaik
    • Clustering extension templates, docstrings & get_fitted_params (#1100) @fkiraly
    • New Classifier: Implementation of signature based methods. (#714) @jambo6
    • New Forecaster: Croston's method (#730) @Riyabelle25
    • New Forecaster: ForecastingPipeline for pipelining with exog data (#967) @aiwalter
    • New Transformer: Multivariate Detrending (#1042) @SveaMeyer13
    • New Transformer: ThetaLines transformer (#923) @GuzalBulatova
    • sktime registry (#1067) @fkiraly
    • Feature/information criteria get_fitted_params (#942) @ltsaprounis
    • Add plot_correlations() to plot series and acf/pacf (#850) @RNKuhns
    • Add doc-quality tests on changed files (#752) @mloning
    • Docs: Create add_dataset.rst (#970) @Riyabelle25
    • Added two new related software packages (#1019) @aiwalter
    • Added orbit as related software (#1128) @aiwalter
    • adding fkiraly as codeowner for forecasting base classes (#989) @fkiraly
    • added mloning and aiwalter as forecasting/base code owners (#1108) @fkiraly

    Changed

    • Update metric to handle y_train (#858) @RNKuhns
    • TSC base template refactor (#1026) @fkiraly
    • Forecasting refactor: base class refactor and extension template (#912) @fkiraly
    • Forecasting refactor: base/template docstring fixes, added fit_predict method (#1109) @fkiraly
    • Forecasters refactor: NaiveForecaster (#953) @fkiraly
    • Forecasters refactor: BaseGridSearch, ForecastingGridSearchCV, ForecastingRandomizedSearchCV (#1034) @GuzalBulatova
    • Forecasting refactor: polynomial trend forecaster (#1003) @thayeylolu
    • Forecasting refactor: Stacking, Multiplexer, Ensembler and TransformedTarget Forecasters (#977) @thayeylolu
    • Forecasting refactor: statsmodels and theta forecaster (#1029) @thayeylolu
    • Forecasting refactor: reducer (#1031) @Lovkush-A
    • Forecasting refactor: ensembler, online-ensembler-forecaster and descendants (#1015) @thayeylolu
    • Forecasting refactor: TbatAdapter (#1017) @thayeylolu
    • Forecasting refactor: PmdArimaAdapter (#1016) @thayeylolu
    • Forecasting refactor: Prophet (#1005) @thayeylolu
    • Forecasting refactor: CrystallBall Forecaster (#1004) @thayeylolu
    • Forecasting refactor: default tags in BaseForecaster; added some new tags (#1013) @fkiraly
    • Forecasting refactor: removing _SktimeForecaster and horizon mixins (#1088) @fkiraly
    • Forecasting tutorial rework (#972) @fkiraly
    • Added tuning tutorial to forecasting example notebook - fkiraly suggestions on top of #1047 (#1053) @fkiraly
    • Classification: Kernel based refactor (#875) @MatthewMiddlehurst
    • Classification: catch22 Remake (#864) @MatthewMiddlehurst
    • Forecasting: Remove step_length hyper-parameter from reduction classes (#900) @mloning
    • Transformers: Make OptionalPassthrough to support multivariate input (#1112) @aiwalter
    • Transformers: Improvement to Multivariate-Detrending (#1077) @SveaMeyer13
    • Update plot_series to handle pd.Int64 and pd.Range index uniformly (#892) @Dbhasin1
    • Including floating numbers as a window length (#827) @thayeylolu
    • update docs on loading data (#885) @SveaMeyer13
    • Update docs (#887) @mloning
    • [DOC] Updated docstrings to inform that methods accept ForecastingHorizon (#872) @julramos

    Fixed

    • Fix use of seasonal periodicity in naive model with mean strategy (from PR #917) (#1124) @mloning
    • Fix ForecastingPipeline import (#1118) @mloning
    • Bugfix - forecasters should use internal interface _all_tags for self-inspection, not _has_tag (#1068) @fkiraly
    • bugfix: Prophet adapter fails to clone after setting parameters (#911) @Yard1
    • Fix seeding issue in Minirocket Classifier (#1094) @Lovkush-A
    • fixing soft dependencies link (#1035) @fkiraly
    • Fix minor typos in docstrings (#889) @GuzalBulatova
    • Fix manylinux CI (#914) @mloning
    • Add limits.h to ensure pip install on certain OS's (#915) @tombh
    • Fix side effect on input for Imputer and HampelFilter (#1089) @aiwalter
    • BaseCluster class issues resolved (#1075) @chrisholder
    • Cleanup metric docstrings and fix bug in _RelativeLossMixin (#999) @RNKuhns
    • minor clarifications in forecasting extension template preamble (#1069) @fkiraly
    • Fix fh in imputer method based on in-sample forecasts (#861) @julramos
    • Arsenal fix, extended capabilities and HC1 unit tests (#902) @MatthewMiddlehurst
    • minor bugfix - setting _is_fitted to False before input checks in forecasters (#941) @fkiraly
    • Properly process random_state when fitting Time Series Forest ensemble in parallel (#819) @kachayev
    • bump nbqa (#998) @MarcoGorelli
    • datetime: Construct Timedelta from parsed pandas frequency (#873) @ckastner

    All contributors: @Dbhasin1, @GuzalBulatova, @Lovkush-A, @MarcoGorelli, @MatthewMiddlehurst, @RNKuhns, @Riyabelle25, @SveaMeyer13, @TonyBagnall, @Yard1, @aiwalter, @chrisholder, @ckastner, @fkiraly, @jambo6, @julramos, @kachayev, @ltsaprounis, @mloning, @thayeylolu and @tombh

  • v0.6.1(May 14, 2021)

    What's New

    Fixed

    • Exclude Python 3.10 from manylinux CI (#870) @mloning
    • Fix AutoETS handling of infinite information criteria (#848) @ltsaprounis
    • Fix smape import (#851) @mloning

    Changed

    • ThetaForecaster now works with initial_level (#769) @yashlamba
    • Use joblib to parallelize ensemble fitting for Rocket classifier (#796) @kachayev
    • Update maintenance tools (#829) @mloning
    • Undo pmdarima hotfix and avoid pmdarima 1.8.1 (#831) @aaronreidsmith
    • Hotfix pmdarima version (#828) @aiwalter

    Added

    • Added Guerrero method for lambda estimation to BoxCoxTransformer (#778) (#791) @GuzalBulatova
    • New forecasting metrics (#801) @RNKuhns
    • Implementation of DirRec reduction strategy (#779) @luiszugasti
    • Added cutoff to BaseGridSearch to use any grid search inside evaluate… (#825) @aiwalter
    • Added pd.DataFrame transformation for Imputer and HampelFilter (#830) @aiwalter
    • Added default params for some transformers (#834) @aiwalter
    • Added several docstring examples (#835) @aiwalter
    • Added skip-inverse-transform tag for Imputer and HampelFilter (#788) @aiwalter
    • Added a reference to alibi-detect (#815) @satya-pattnaik

    All contributors: @GuzalBulatova, @RNKuhns, @aaronreidsmith, @aiwalter, @kachayev, @ltsaprounis, @luiszugasti, @mloning, @satya-pattnaik and @yashlamba

  • v0.6.0(Apr 15, 2021)

    What's New

    Fixed

    • Fix counting for Github's automatic language discovery (#812) @xuyxu
    • Fix counting for Github's automatic language discovery (#811) @xuyxu
    • Fix examples CI checks (#793) @mloning
    • Fix TimeSeriesForestRegressor (#777) @mloning
    • Fix Deseasonalizer docstring (#737) @mloning
    • SettingWithCopyWarning in Prophet with exogenous data (#735) @jschemm
    • Correct docstrings for check_X and related functions (#701) @Lovkush-A
    • Fixed bugs mentioned in #694 (#697) @AidenRushbrooke
    • Fix typo in CONTRIBUTING.md (#688) @luiszugasti
    • Fix duplication in the contributors list (#685) @afzal442
    • HIVE-COTE 1.0 fix (#678) @MatthewMiddlehurst

    Changed

    • Update sklearn version (#810) @mloning
    • Remove soft dependency check for numba (#808) @mloning
    • Modify tests for forecasting reductions (#756) @Lovkush-A
    • Upgrade nbqa (#794) @MarcoGorelli
    • Enhanced exception message of splitters (#771) @aiwalter
    • Enhance forecasting model selection/evaluation (#739) @mloning
    • Pin PyStan version (#751) @mloning
    • Convert master to main in docs folder, closes #644 (#667) @ayan-biswas0412
    • Update governance (#686) @mloning
    • Remove MSM from unit tests for now (#698) @TonyBagnall
    • Make update_params=True by default (#660) @pabworks
    • Update dataset names (#676) @TonyBagnall

    Added

    • Add support for exogenous variables to forecasting reduction (#757) @mloning
    • Added forecasting docstring examples (#772) @aiwalter
    • Added the agg argument to EnsembleForecaster (#774) @Ifeanyi30
    • Added OptionalPassthrough transformer (#762) @aiwalter
    • Add doctests (#766) @mloning
    • Multiplexer forecaster (#715) @koralturkk
    • Upload source tarball to PyPI during releases (#749) @dsherry
    • Create developer guide (#734) @mloning
    • Refactor TSF classifier into TSF regressor (#693) @luiszugasti
    • Outlier detection with HampelFilter (#708) @aiwalter
    • Changes to CONTRIBUTING.md to include installation directions (#695) @kanand77
    • Evaluate (example and fix) (#690) @aiwalter
    • Knn unit tests (#705) @TonyBagnall
    • Knn transpose fix (#689) @TonyBagnall
    • Evaluate forecaster function (#657) @aiwalter
    • Multioutput reduction strategy for forecasting (#659) @Lovkush-A
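
The outlier detection entry above refers to a Hampel-style filter. As a rough illustration of the general technique only (not sktime's HampelFilter implementation), the sketch below flags a point when it deviates from the median of a centred window by more than a multiple of the scaled median absolute deviation. The function name and the parameters `window_length` and `n_sigma` are assumptions made for this example.

```python
from statistics import median


def hampel_outliers(values, window_length=5, n_sigma=3.0):
    """Return indices flagged as outliers by a centred Hampel rule.

    Illustrative sketch only; names and defaults are assumptions.
    """
    k = 1.4826  # scales the MAD to a standard deviation for Gaussian data
    half = window_length // 2
    flagged = []
    for i, x in enumerate(values):
        # Centred window, truncated at the series boundaries
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        window = values[lo:hi]
        med = median(window)
        mad = median(abs(v - med) for v in window)
        # Note: when mad == 0, any point that differs from the median is flagged
        if abs(x - med) > n_sigma * k * mad:
            flagged.append(i)
    return flagged


series = [1.0, 1.1, 0.9, 1.0, 9.0, 1.2, 1.0, 0.8, 1.1]
print(hampel_outliers(series))  # → [4]
```

A flagged index could then be replaced by a missing value and imputed, which is one common way such a filter is combined with an imputation step.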

    All contributors: @AidenRushbrooke, @Ifeanyi30, @Lovkush-A, @MarcoGorelli, @MatthewMiddlehurst, @TonyBagnall, @afzal442, @aiwalter, @ayan-biswas0412, @dsherry, @jschemm, @kanand77, @koralturkk, @luiszugasti, @mloning, @pabworks and @xuyxu

  • v0.5.3(Feb 6, 2021)

    What's New

    Fixed

    • Fix reduced regression forecaster reference (#658) @mloning
    • Address Bug #640 (#642) @patrickzib
    • Ed knn (#638) @TonyBagnall
    • Euclidean distance for KNNs (#636) @goastler

    Changed

    • Pin NumPy 1.19 (#643) @mloning
    • Update CoC committee (#614) @mloning
    • Benchmarking issue141 (#492) @ViktorKaz
    • Catch22 Refactor & Multithreading (#615) @MatthewMiddlehurst

    Added

    • Create new factory method for forecasting via reduction (#635) @Lovkush-A
    • Feature ForecastingRandomizedSearchCV (#634) @pabworks
    • Added Imputer for missing values (#637) @aiwalter
    • Add expanding window splitter (#627) @koralturkk
    • Forecasting User Guide (#595) @Lovkush-A
    • Add data processing functionality to convert between data formats (#553) @RNKuhns
    • Add basic parallel support for ElasticEnsemble (#546) @xuyxu
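
To make the idea behind an expanding window splitter concrete, here is a minimal pure-Python sketch of the general technique: the training window always starts at the first observation and grows by a fixed step, while the test point sits a fixed horizon ahead. This is not sktime's implementation; the function name and the parameters `initial_window`, `fh` and `step_length` are assumptions chosen for illustration.

```python
def expanding_window_split(n_timepoints, initial_window, fh=1, step_length=1):
    """Yield (train_indices, test_indices) pairs with a growing training window.

    Illustrative sketch only; names and defaults are assumptions.
    """
    end = initial_window  # exclusive end of the current training window
    while end + fh <= n_timepoints:
        train = list(range(end))
        test = [end + fh - 1]  # single test point fh steps after the window
        yield train, test
        end += step_length


for train, test in expanding_window_split(n_timepoints=6, initial_window=3):
    print(train, test)
# [0, 1, 2] [3]
# [0, 1, 2, 3] [4]
# [0, 1, 2, 3, 4] [5]
```

Unlike a sliding window, no early observations are dropped, so later folds train on strictly more data than earlier ones.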

    All contributors: @Lovkush-A, @MatthewMiddlehurst, @RNKuhns, @TonyBagnall, @ViktorKaz, @aiwalter, @goastler, @koralturkk, @mloning, @pabworks, @patrickzib and @xuyxu

  • v0.5.2(Jan 13, 2021)

    What's New

    Fixed

    • Fix ModuleNotFoundError issue (#613) @Hephaest
    • Fixes _fit(X) in KNN (#610) @TonyBagnall
    • UEA TSC module improvements 2 (#599) @TonyBagnall
    • Fix sktime.classification.frequency_based not found error (#606) @Hephaest
    • UEA TSC module improvements 1 (#579) @TonyBagnall
    • Relax numba pinning (#593) @dhirschfeld
    • Fix fh.to_relative() bug for DatetimeIndex (#582) @aiwalter

    All contributors: @Hephaest, @MatthewMiddlehurst, @TonyBagnall, @aiwalter and @dhirschfeld

  • v0.5.1(Dec 29, 2020)

    What's New

    Added

    • Add ARIMA (#559) @HYang1996
    • Add fbprophet wrapper (#515) @aiwalter
    • Add MiniRocket and MiniRocketMultivariate (#542) @angus924
    • Add Cosine, ACF and PACF transformers (#509) @afzal442
    • Add example notebook Window Splitters (#555) @juanitorduz
    • Add SlidingWindowSplitter visualization on docstrings (#554) @juanitorduz

    Fixed

    • Pin pandas version to fix pandas-related AutoETS error on Linux (#581) @mloning
    • Fixed default argument in docstring in SlidingWindowSplitter (#556) @ngupta23

    All contributors: @HYang1996, @TonyBagnall, @afzal442, @aiwalter, @angus924, @juanitorduz, @mloning and @ngupta23

  • v0.5.0(Dec 19, 2020)

    What's New

    Added

    • Add tests for forecasting with exogenous variables (#547) @mloning
    • Add HCrystalBall wrapper (#485) @MichalChromcak
    • Tbats (#527) @aiwalter
    • Added matrix profile using stumpy (#471) @utsavcoding
    • User guide (#377) @mloning
    • Add GitHub workflow for building and testing on macOS (#505) @mloning
    • [DOC] Add dtaidistance (#502) @mloning
    • Implement the feature_importances_ property for RISE (#497) @AaronX121
    • Add scikit-fda to the list of related software (#495) @vnmabus
    • [DOC] Add roadmap to docs (#467) @mloning
    • Add parallelization for RandomIntervalSpectralForest (#482) @AaronX121
    • New Ensemble Forecasting Methods (#333) @magittan
    • CI run black formatter on notebooks as well as Python scripts (#437) @MarcoGorelli
    • Implementation of catch22 transformer, CIF classifier and dictionary based clean-up (#453) @MatthewMiddlehurst
    • Added write dataset to ts file functionality (#438) @whackteachers
    • Added ability to load from csv containing long-formatted data (#442) @AidenRushbrooke
    • Transform typing (#420) @mloning

    Changed

    • Refactoring utils and transformer module (#538) @mloning
    • Update README (#454) @mloning
    • Clean up example notebooks (#548) @mloning
    • Update README.rst (#536) @aiwalter
    • [DOC] Updated load_data.py (#496) @Afzal-Ind
    • Update forecasting.py (#487) @raishubham1
    • Update basic motion description (#475) @vollmersj
    • [DOC] Update docs in benchmarking/data.py (#489) @Afzal-Ind
    • Edit Jupyter Notebook 01_forecasting (#486) @bmurdata
    • Feature & Performance improvements of SFA/WEASEL (#457) @patrickzib
    • Moved related software from wiki to docs (#439) @mloning

    Fixed

    • Fixed issue outlined in issue 522 (#537) @ngupta23
    • Fix plot-series (#533) @gracewgao
    • Added mape_loss and cosmetic fixes to notebooks (removed kernel) (#500) @tch
    • Fix azure pipelines (#506) @mloning
    • [DOC] Fix broken docstrings of RandomIntervalSpectralForest (#473) @AaronX121
    • Add back missing bibtex reference to classifiers (#468) @whackteachers
    • Avoid seaborn warning (#472) @davidbp
    • Bump pre-commit versions, run again on notebooks (#469) @MarcoGorelli
    • Fix series validation (#463) @mloning
    • Fix soft dependency imports (#446) @mloning
    • Fix bug in AutoETS (#445) @HYang1996
    • Add ForecastingHorizon class to docs (#444) @mloning

    Removed

    • Remove manylinux1 (#458) @mloning

    All contributors: @AaronX121, @Afzal-Ind, @AidenRushbrooke, @HYang1996, @MarcoGorelli, @MatthewMiddlehurst, @MichalChromcak, @TonyBagnall, @aiwalter, @bmurdata, @davidbp, @gracewgao, @magittan, @mloning, @ngupta23, @patrickzib, @raishubham1, @tch, @utsavcoding, @vnmabus, @vollmersj and @whackteachers

  • v0.4.3(Oct 20, 2020)

    What's New

    Added

    • Support for 3d numpy array (#405) @mloning
    • Support for downloading dataset from UCR UEA time series classification data set repository (#430) @Emiliathewolf
    • Univariate time series regression example to TSFresh notebook (#428) @evanmiller29
    • Parallelized TimeSeriesForest using joblib. (#408) @kkoziara
    • Unit test for multi-processing (#414) @kkoziara
    • Add date-time support for forecasting framework (#392) @mloning

    Changed

    • Performance improvements of dictionary classifiers (#398) @patrickzib

    Fixed

    • Fix links in Readthedocs and Binder launch button (#416) @mloning
    • Fixed small bug in performance metrics (#422) @krumeto
    • Resolved warnings in notebook examples (#418) @alwinw
    • Better error handling for missing soft dependencies (#410) @alwinw

    All contributors: @Emiliathewolf, @alwinw, @evanmiller29, @kkoziara, @krumeto, @mloning and @patrickzib

  • v0.4.2(Oct 1, 2020)

    What's new

    Added

    • ETSModel with auto-fitting capability (#393) @HYang1996
    • WEASEL classifier (#391) @patrickzib
    • Full support for exogenous data in forecasting framework (#382) @mloning, (#380) @mloning
    • Multivariate dataset for US consumption over time (#385) @SebasKoel
    • Governance document (#324) @mloning, @fkiraly

    Fixed

    • Documentation fixes (#400) @brettkoonce, (#399) @akanz1, (#404) @alwinw

    Changed

    • Move documentation to ReadTheDocs with support for versioned documentation (#395) @mloning
    • Refactored SFA implementation (additional features and speed improvements) (#389) @patrickzib
    • Move prediction interval API to base classes in forecasting framework (#387) @big-o
    • Documentation improvements (#364) @mloning
    • Update CI and maintenance tools (#394) @mloning

    All contributors: @HYang1996, @SebasKoel, @fkiraly, @akanz1, @alwinw, @big-o, @brettkoonce, @mloning, @patrickzib

  • v0.4.1(Jul 9, 2020)

    What's new:

    Added

    • TemporalDictionaryEnsemble (#292) @MatthewMiddlehurst
    • ShapeDTW (#287) @Multivin12
    • Updated sktime artwork (logo) @mloning
    • Truncation transformer (#315) @ABostrom
    • Padding transformer (#316) @ABostrom
    • Example notebook with feature importance graph for time series forest (#319) @HYang1996
    • ACSF1 data set (#314) @BandaSaiTejaReddy
    • Data conversion function from 3d numpy array to nested pandas dataframe (#304) @vedazeren

    Changed

    • Replaced gunpoint dataset in tutorials, added OSULeaf dataset (#295) @marielledado
    • Updated macOS advanced install instructions (#306) (#308) @sophijka
    • Updated contributing guidelines (#301) @Ayushmaanseth

    Fixed

    • Typos (#293) @Mo-Saif, (#285) @Pangoraw, (#305) @hiqbal2
    • Manylinux wheel building (#286) @mloning
    • KNN compatibility with sklearn (#310) @Cheukting
    • Docstrings for AutoARIMA (#307) @btrtts

    All contributors: @Ayushmaanseth, @Mo-Saif, @Pangoraw, @marielledado, @mloning, @sophijka, @Cheukting, @MatthewMiddlehurst, @Multivin12, @ABostrom, @HYang1996, @BandaSaiTejaReddy, @vedazeren, @hiqbal2, @btrtts

  • v0.4.0(Jun 5, 2020)

    What's New:

    Added

    • Forecasting framework, including: forecasting algorithms (forecasters), tools for composite model building (meta-forecasters), tuning and model evaluation
    • Consistent unit testing of all estimators
    • Consistent input checks
    • Enforced PEP8 linting via flake8
    • Changelog
    • Support for Python 3.8
    • Support for manylinux wheels

    Changed

    • Revised all estimators to comply with common interface and to ensure scikit-learn compatibility

    Removed

    • A few redundant classes for the series-as-features setting in favour of scikit-learn's implementations: Pipeline and GridSearchCV
    • HomogeneousColumnEnsembleClassifier in favour of more flexible ColumnEnsembleClassifier

    Fixed

    • Deprecation and future warnings from scikit-learn
    • User warnings from statsmodels
  • v0.3.1(Apr 12, 2020)

    What's new:

    • Availability of compiled wheels for Windows, OSX and Linux on PyPI,
    • Encapsulated shapelet transform classifier,
    • Matrix profile transformer,
    • Revised benchmarking framework,
    • Bug fixes for time series classification and forecasting,
    • Updated documentation and added binder support for example notebooks.
  • v0.3.0(Aug 6, 2019)

    What's new:

    • Tools for multivariate time series classification (time series column concatenation, column ensembling, bespoke methods),
    • Dictionary-based classifiers including BOSS,
    • Reorganisation of transformer and classifier submodule,
    • Updated algorithms including Proximity Forest and shapelet transforms,
    • New transformers including PlateauFinder and spectral-based transformations like power spectrum,
    • Composite forecasters such as TransformedTargetForecaster and ReducedRegressionForecaster,
    • Additional datasets.
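
Composite forecasters based on reduction recast forecasting as tabular regression. The sketch below shows the sliding-window step that underlies this idea in general terms: each row holds the previous `window_length` observations and the target is the next value. It is a conceptual illustration, not the code of ReducedRegressionForecaster; the function name and parameter are assumptions.

```python
def sliding_window_tabularize(y, window_length):
    """Turn a series into (feature rows, targets) for a tabular regressor.

    Illustrative sketch only; names are assumptions.
    """
    X, targets = [], []
    for i in range(len(y) - window_length):
        X.append(y[i:i + window_length])  # lagged values as features
        targets.append(y[i + window_length])  # next observation as target
    return X, targets


X, targets = sliding_window_tabularize([1, 2, 3, 4, 5], window_length=2)
print(X)        # → [[1, 2], [2, 3], [3, 4]]
print(targets)  # → [3, 4, 5]
```

Any tabular regressor can then be fitted on these rows, and a forecast is produced by feeding it the last `window_length` observations of the series.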
  • v0.2.0(May 16, 2019)

    What's new:

    • Proximity Forest,
    • Elastic Ensemble,
    • BOSS Ensemble,
    • Time series neighbours,
    • Forecasting framework,
    • Framework for orchestrating and evaluating prediction experiments.
Owner
The Alan Turing Institute
The UK's national institute for data science and artificial intelligence.