A scikit-learn compatible neural network library that wraps PyTorch

Last update: Jan 3, 2023

Overview

A scikit-learn compatible neural network library that wraps PyTorch.

Examples

To see more elaborate examples, look here.

import numpy as np
from sklearn.datasets import make_classification
from torch import nn

from skorch import NeuralNetClassifier


X, y = make_classification(1000, 20, n_informative=10, random_state=0)
X = X.astype(np.float32)
y = y.astype(np.int64)

class MyModule(nn.Module):
    def __init__(self, num_units=10, nonlin=nn.ReLU()):
        super(MyModule, self).__init__()

        self.dense0 = nn.Linear(20, num_units)
        self.nonlin = nonlin
        self.dropout = nn.Dropout(0.5)
        self.dense1 = nn.Linear(num_units, num_units)
        self.output = nn.Linear(num_units, 2)
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, X, **kwargs):
        X = self.nonlin(self.dense0(X))
        X = self.dropout(X)
        X = self.nonlin(self.dense1(X))
        X = self.softmax(self.output(X))
        return X


net = NeuralNetClassifier(
    MyModule,
    max_epochs=10,
    lr=0.1,
    # Shuffle training data on each epoch
    iterator_train__shuffle=True,
)

net.fit(X, y)
y_proba = net.predict_proba(X)

In an sklearn Pipeline:

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


pipe = Pipeline([
    ('scale', StandardScaler()),
    ('net', net),
])

pipe.fit(X, y)
y_proba = pipe.predict_proba(X)

With grid search:

from sklearn.model_selection import GridSearchCV


# deactivate skorch-internal train-valid split and verbose logging
net.set_params(train_split=False, verbose=0)
params = {
    'lr': [0.01, 0.02],
    'max_epochs': [10, 20],
    'module__num_units': [10, 20],
}
gs = GridSearchCV(net, params, refit=False, cv=3, scoring='accuracy', verbose=2)

gs.fit(X, y)
print("best score: {:.3f}, best params: {}".format(gs.best_score_, gs.best_params_))

skorch also provides many convenient features, among others:

Learning rate schedulers (Warm restarts, cyclic LR and many more)
Scoring using sklearn (and custom) scoring functions
Early stopping
Checkpointing
Parameter freezing/unfreezing
Progress bar (for CLI as well as jupyter)
Automatic inference of CLI parameters

Installation

skorch requires Python 3.5 or higher.

conda installation

You need a working conda installation. Get the correct miniconda for your system from here.

To install skorch, you need to use the conda-forge channel:

conda install -c conda-forge skorch

We recommend using a conda virtual environment.

Note: The conda channel is not managed by the skorch maintainers. More information is available here.

pip installation

To install with pip, run:

pip install -U skorch

Again, we recommend using a virtual environment for this.

From source

If you would like to use the most recent additions to skorch or help development, you should install skorch from source.

Using conda

To install skorch from source using conda, proceed as follows:

git clone https://github.com/skorch-dev/skorch.git
cd skorch
conda env create
source activate skorch
pip install .

If you want to help with development, run:

git clone https://github.com/skorch-dev/skorch.git
cd skorch
conda env create
source activate skorch
pip install -e .

py.test  # unit tests
pylint skorch  # static code checks

Using pip

For pip, follow these instructions instead:

git clone https://github.com/skorch-dev/skorch.git
cd skorch
# create and activate a virtual environment
pip install -r requirements.txt
# install pytorch version for your system (see below)
pip install .

If you want to help with development, run:

git clone https://github.com/skorch-dev/skorch.git
cd skorch
# create and activate a virtual environment
pip install -r requirements.txt
# install pytorch version for your system (see below)
pip install -r requirements-dev.txt
pip install -e .

py.test  # unit tests
pylint skorch  # static code checks

PyTorch

PyTorch is not covered by the dependencies, since the PyTorch version you need is dependent on your OS and device. For installation instructions for PyTorch, visit the PyTorch website. skorch officially supports the last four minor PyTorch versions, which currently are:

1.4.0
1.5.1
1.6.0
1.7.1

However, that doesn't mean that older versions don't work, just that they aren't tested. Since skorch mostly relies on the stable part of the PyTorch API, older PyTorch versions should work fine.

In general, running this to install PyTorch should work (assuming CUDA 10.2):

# using conda:
conda install pytorch cudatoolkit==10.2 -c pytorch
# using pip
pip install torch

External resources

@jakubczakon: blog post "8 Creators and Core Contributors Talk About Their Model Training Libraries From PyTorch Ecosystem" 2020
@BenjaminBossan: talk "skorch: A scikit-learn compatible neural network library" at PyCon/PyData 2019
@githubnemo: poster for the PyTorch developer conference 2019
@thomasjpfan: talk "Skorch: A Union of Scikit learn and PyTorch" at SciPy 2019
@thomasjpfan: talk "Skorch - A Union of Scikit-learn and PyTorch" at PyData 2018

Communication

GitHub issues: bug reports, feature requests, install issues, RFCs, thoughts, etc.
Slack: We run the #skorch channel on the PyTorch Slack server, for which you can request access here.

Comments

skorch.fit can't handle lists of lists with variable length
I'm having a hard time figuring out how to pass a list of lists (with variable length) to skorch's fit method.

Specifically, I have a feature that is a list of IDs (e.g. [[1, 12, 3], [6, 22]...]) which are converted to a dense representation using an embedding table in my PyTorch module's forward method:

def forward(self, X_float, X_id_list): ...

When I call net.fit() on my data set (e.g. {"X_float": ..., "X_id_list": ...}), I get the following error caused by the list of lists:

ValueError: Dataset does not have consistent lengths.

I've also tried converting the list of lists to a pandas dataframe and numpy array (of objects) and neither works. How do you handle variable length lists of lists in skorch.fit?
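One possible workaround for the ragged input, sketched here as an assumption and not as the maintainers' answer, is to pad every ID list to a common length so that all features share the same first dimension. The helper name `pad_id_lists` and the padding value 0 are hypothetical; the embedding table would have to reserve that ID for padding.

```python
import numpy as np


def pad_id_lists(id_lists, pad_value=0, max_len=None):
    """Pad variable-length lists of IDs into a rectangular int64 array.

    Hypothetical helper: `pad_value` is assumed to be an ID that the
    embedding table reserves for padding.
    """
    if max_len is None:
        max_len = max((len(ids) for ids in id_lists), default=0)
    padded = np.full((len(id_lists), max_len), pad_value, dtype=np.int64)
    for row, ids in enumerate(id_lists):
        trimmed = list(ids)[:max_len]  # truncate lists longer than max_len
        padded[row, :len(trimmed)] = trimmed
    return padded


X_id_list = pad_id_lists([[1, 12, 3], [6, 22]])
print(X_id_list.shape)  # (2, 3)
```

The padded array has one row per sample, so its length matches that of `X_float`; the module's forward method would then need to ignore the padded positions.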
question
opened by econti 39
ADABoosting with Skorch

I would like to use skorch to implement Sklearn's ADABoosting methods on torch based models. When implemented naively, I get an error saying "NeuralNetClassifier doesn't support sample_weight". I kind of expected this, but is there a way to extend the models to support sample weights?

I apologize if this is an inappropriate question or if it has been answered, but I could not find anyone else trying this.

opened by QuantumChamploo 37
Checkpoint callback verbosity

Currently, the Checkpoint callback prints a message to terminal every time the model is saved. These messages get mixed with the PrintLog output, breaking the neat progress log table. One can set net.verbosity=False, but this disables output from both callbacks.

What's your opinion on adding a constructor argument to Checkpoint to control its verbosity?

One other idea I had is to let Checkpoint write whether the model was saved or not to the network history (say, under 'checkpoint' key). PrintLog will then have a separate column indicating that. Even fancier, it can analyze these keys (the same way it automatically analyzes '_best' keys) and highlight epoch number (with bold or some color) if checkpoint occurred.

Edit: while we are at this, what about adding an option to save the history? Useful for resuming training later.

opened by taketwo 31
[MRG] Adds optimizer and history to save/load_params
Fixes #357, #361

Adds optimizer and history to save_params and load_params. Deprecates using f.

Adds the optimizer option to Checkpoint.

Deprecates save_history and load_history in favor of save_params and load_params to save history. This simplifies the skorch api to only needing save_params and load_params to save/load state.

Deprecates using the positional argument in favor of keywords.

Adds history serialization (in the History class)

Adds loading params from checkpoint. net.load_params(checkpoint=cp)

Adds LoadInitState to load checkpoint during training.

Adds fn_prefix and dirname to Checkpoint.

This PR can merge easily with #358 by replacing _get_state_dict with the implementation in #358 and returning the state dict.
opened by thomasjpfan 26
Skorch GridSearch CV list index out of range

I'm using SliceDataset but get a "list index out of range" error. This is my code:

data_dir = b_cancer_data2
train_transforms = transforms.Compose([
    transforms.RandomRotation(30),
    transforms.RandomResizedCrop(224),
    # transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
val_transforms = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

train_ds = datasets.ImageFolder(os.path.join(data_dir, 'train'), train_transforms)
val_ds = datasets.ImageFolder(os.path.join(data_dir, 'valid'), val_transforms)

class MyDataset(Dataset):
    def __init__(self, train_ds, val_ds):
        self.train_ds = train_ds
        self.val_ds = val_ds

    def __len__(self):
        return len(self.train_ds)

    def __getitem__(self, i):
        Xi = self.train_ds[i]
        yi = self.val_ds[i]
        return self.transform(Xi, yi)

ds = MyDataset(train_ds, val_ds)

class SliceDatasetX(Dataset):
    def __init__(self, dataset, collate_fn=default_collate):
        self.dataset = dataset
        self.collate_fn = collate_fn
        self._indices = list(range(len(self.dataset)))

    def __len__(self):
        return len(self.dataset)

    @property
    def shape(self):
        return len(self),

    def __getitem__(self, i):
        if isinstance(i, (int, np.integer)):
            Xb = self.transform(*self.dataset[i])[0]
            return Xb
        if isinstance(i, slice):
            i = self._indices[i]
        Xb = self.collate_fn([self.transform(*self.dataset[j])[0] for j in i])
        return Xb

class PretrainedModel(nn.Module):
    def __init__(self, output_features):
        super().__init__()
        model = models.resnet152(pretrained=True)
        num_ftrs = model.fc.in_features
        model.fc = nn.Linear(num_ftrs, output_features)
        self.model = model

    def forward(self, x):
        return self.model(x)

from skorch.callbacks import Checkpoint

net = NeuralNetClassifier(
    PretrainedModel,
    max_epochs=10,
    lr=0.1,
    iterator_train__shuffle=True,
    verbose=False,
    train_split=None,
    callbacks=[checkpoint, freezer],
    device='cuda'
)

from sklearn.model_selection import GridSearchCV

params = {
    'lr': [0.01, 0.02],
    'max_epochs': [10, 20]
}
gs = GridSearchCV(net, params, refit=False, cv=5, scoring='accuracy')
y_from_ds = np.asarray([ds[i][1] for i in range(len(ds))])
ds_sliceable = SliceDatasetX(ds)

But I get this error: IndexError: list index out of range. Please help!

opened by BayuSasongko 23
Document how to freeze layers
See #236.

We should add a minimal example of how to filter parameters with requires_grad=False from the optimizer. E.g. as proposed in https://github.com/dnouri/skorch/issues/236#issuecomment-392579800:

def filtered_optimizer(pgroups, **kwargs):
    params = filter(lambda t: t[1].requires_grad, pgroups[0]['params'])
    return torch.optim.SGD(params, **kwargs)

net = Net(module, optimizer=filtered_optimizer)

This depends on #260 which would help generalize this example by supporting lists and param group dictionaries alike.
enhancement
opened by ottonemo 23
(WIP) Notebook that shows how to use triplet loss

After some private discussions with @ottonemo and the discussion in #489, I attempted to implement a net with triplet loss using skorch. The result is shown in the notebook below. It is a first draft, intended for discussion.

At the moment, triplets are mined naively, i.e. for each sample, find a random sample of the same class and a random sample of another class. However, this could be used as a base to implement more sophisticated methods like online mining, though others who are more familiar with the topic probably need to comment.
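The naive mining strategy described above can be sketched with NumPy alone. This is an illustration under stated assumptions, not the code from the notebook: the function name `mine_triplets_naive` is hypothetical, and excluding the anchor itself from the positive candidates is a choice made here.

```python
import numpy as np


def mine_triplets_naive(y, seed=0):
    """Pick, for each sample, a random positive and a random negative.

    Returns three index arrays (anchors, positives, negatives). Samples
    without a valid positive or negative are skipped.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    anchors, positives, negatives = [], [], []
    for i, label in enumerate(y):
        same = np.flatnonzero(y == label)
        same = same[same != i]  # exclude the anchor itself
        other = np.flatnonzero(y != label)
        if len(same) == 0 or len(other) == 0:
            continue  # no valid triplet for this sample
        anchors.append(i)
        positives.append(rng.choice(same))
        negatives.append(rng.choice(other))
    return np.array(anchors), np.array(positives), np.array(negatives)


anchors, positives, negatives = mine_triplets_naive([0, 0, 1, 1, 2, 2])
print(len(anchors))  # 6
```

A dataset could use these indices to return the three inputs for each item; more elaborate schemes such as online mining would replace the random choice.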

As a first instinct, I wanted to implement the triplet mining on the Sampler level of pytorch. However, this quickly turned out not to work well. The main problem is that although we can pass samplers to the DataLoader in skorch, pytorch expects the sampler to be instantiated. This requires the sampler to have a reference to the data. However, at the time we initialize the net, we may not have a reference to the data yet (e.g. because the data is not split into train/valid yet). Using samplers thus didn't work.

The most practical approach then seemed to be within the Dataset, which is what is shown in the notebook. Currently, the dataset will return a triplet of input data and a triplet of dummy target data. The module then has to deal with the triplet.

Another thing that I wanted to achieve initially was to just stack each triplet before passing it to the module and then to unstack it in get_loss. The main advantage of that approach is that it wouldn't need any special treatment within the Module itself. However, this would result in the Dataset class returning three samples at once, which can lead to all kinds of problems. Also, it would effectively triple the batch size, resulting in unexpectedly high memory needs. Therefore, I didn't take this route in the end and had to modify the module to be able to work with triplets.

If anyone has a better suggestion, please comment. Overall, apart from the logic of finding the positive/negative sample itself, the code didn't seem very complicated.

opened by BenjaminBossan 22
Add support for 'event_' columns in history

(This implements the ideas discussed in #246.)

PrintLog will output such columns on the right, just before 'dur'. When the value stored in history is True, a plus sign will be printed to the table. When the value stored in history is False or None, nothing will be printed, i.e. the cell will be empty.

The 'event_' prefix is stripped off from the key so as to not make the table header unnecessarily long.
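The rendering rule described above can be sketched in plain Python. This is an illustration, not the PrintLog implementation: the function name `format_event_cells` and the example keys are hypothetical.

```python
def format_event_cells(row):
    """Render the 'event_*' entries of one history row as table cells.

    The 'event_' prefix is stripped for the header; a truthy value is
    shown as '+', while False or None leaves the cell empty.
    """
    prefix = "event_"
    cells = {}
    for key, value in row.items():
        if key.startswith(prefix):
            header = key[len(prefix):]
            cells[header] = "+" if value else ""
    return cells


row = {"epoch": 3, "valid_loss": 0.41, "event_cp": True, "event_lr": None}
print(format_event_cells(row))  # {'cp': '+', 'lr': ''}
```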

opened by taketwo 19

Error message when gridSearch with y=None: TypeError: fit() missing 1 required positional argument: 'y'

Hi, I am doing a grid search with skorch. Before adding the grid search, a regular fit with skorch ran without any error or warning message. However, when I add the grid search, I get the error message below:

gs = GridSearchCV(net, params, refit=False, cv=5, scoring='neg_mean_squared_error', verbose=2) # score: higher is better
gs.fit(train_and_val_ds,y=None)

Fitting 5 folds for each of 3600 candidates, totalling 18000 fits
[CV] lr=0.001, module__dropout=0.1, module__num_S=2, module__num_T=2, optimizer=<class 'torch.optim.adam.Adam'>, optimizer__weight_decay=0 
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
/usr/local/lib/python3.7/dist-packages/sklearn/model_selection/_validation.py:536: FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan. Details: 
TypeError: fit() missing 1 required positional argument: 'y'

  FitFailedWarning)
[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:   28.2s remaining:    0.0s
[CV]  lr=0.001, module__dropout=0.1, module__num_S=2, module__num_T=2, optimizer=<class 'torch.optim.adam.Adam'>, optimizer__weight_decay=0, total=  22.7s
[CV] lr=0.001, module__dropout=0.1, module__num_S=2, module__num_T=2, optimizer=<class 'torch.optim.adam.Adam'>, optimizer__weight_decay=0 
[CV]  lr=0.001, module__dropout=0.1, module__num_S=2, module__num_T=2, optimizer=<class 'torch.optim.adam.Adam'>, optimizer__weight_decay=0, total=  16.7s
[CV] lr=0.001, module__dropout=0.1, module__num_S=2, module__num_T=2, optimizer=<class 'torch.optim.adam.Adam'>, optimizer__weight_decay=0

The fitting keeps going despite the error message. Should I be concerned about it? (My training data and labels are in the same dataset, so I set y=None.)
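One way to avoid passing y=None, assuming the dataset yields (X, y) pairs, is to collect the targets into an array first, in the spirit of the `np.asarray([ds[i][1] ...])` line quoted in another issue on this page. The sketch below uses a hypothetical `ToyDataset` stand-in so that it runs without PyTorch; it is not taken from the issue thread.

```python
import numpy as np


class ToyDataset:
    """Stand-in for a dataset that yields (features, target) pairs."""

    def __init__(self, X, y):
        self.X, self.y = X, y

    def __len__(self):
        return len(self.X)

    def __getitem__(self, i):
        return self.X[i], self.y[i]


def targets_from_dataset(ds):
    """Collect the second element of every sample into one array."""
    return np.asarray([ds[i][1] for i in range(len(ds))])


ds = ToyDataset(np.random.rand(5, 3).astype(np.float32),
                np.arange(5, dtype=np.float32))
y = targets_from_dataset(ds)
print(y.shape)  # (5,)
```

The resulting array could then be passed as `y` to the grid search, with the features wrapped so that scikit-learn can index them; skorch's SliceDataset is likely the relevant helper for that.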

opened by xiaolongwu0713 18

Save model/ model parameters after cross-validation

Hi, I am using Skorch and Sklearn with PyTorch for cross-validation. This is my code right now:

train_data_skorch = SliceDict(**train_data)

logistic = NeuralNetClassifier(
    model,
    lr=opt.lr,
    batch_size=opt.batch_size,
    max_epochs=opt.num_epochs,
    train_split=None,
    criterion=CrossEntropyLoss,
    optimizer=optim.Adam,
    iterator_train__shuffle=False,
    device="cuda" if torch.cuda.is_available() else "cpu"
)

scores = cross_val_score(logistic, train_data_skorch, all_labels_numpy, cv=10, scoring="accuracy")

How can I save the model after cross-validation is done?

question

opened by rudra0713 18

CLEAN : rm duplicate code in fit_loop.

For the third time, I found that I had to modify fit_loop (once for skipping some validation steps when doing few-shot learning, another time for a workaround of #245). Every time I'm a bit hesitant because it's a large function, so my changes will likely break with new versions of skorch. I think this should be made a little cleaner; in addition, there's a large chunk of duplicate code, which is bad practice. I think it should be split into 2 simple functions.

Nothing very important but it's a bit better. I also give epoch as argument because this is something I usually need and makes sense to give to a function that computes a single epoch (e.g. for logging).

opened by YannDubs 18
Address #925

Changing the _get_param_names method to return a list instead of a generator to fix the exception error message when passing unknown parameters to set_params. Before the error message just included the generator repr-string as the list of possible parameters. Now the string contains the possible parameter names instead.

opened by githubnemo 0

Unhelpful error message when setting invalid parameter with `set_params`

Minimal example:

net.set_params(this_does_not_exist=True)

Expected result:

Invalid parameter 'this_does_not_exist' for estimator ...
Valid parameters are: ['module', 'criterion', ...]

Actual result:

Invalid parameter 'this_does_not_exist' for estimator ...
Valid parameters are: <generator object NeuralNet._get_param_names.<locals>.<genexpr> at 0x7f1cf9d1da10>.

Exception raising code of sklearn (version 1.2.0) looks like this:

    203             if key not in valid_params:
    204                 local_valid_params = self._get_param_names()
--> 205                 raise ValueError(
    206                     f"Invalid parameter {key!r} for estimator {self}. "
    207                     f"Valid parameters are: {local_valid_params!r}."

The fix is probably just to convert the generator expression to a list comprehension.
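The difference between the two results can be reproduced without skorch or sklearn. The sketch below formats a generator and a list with the same `!r` conversion used in the error message; the function name `describe_params` and the parameter names are illustrative only.

```python
def describe_params(valid_params):
    """Format the parameters the same way as the error message above."""
    return f"Valid parameters are: {valid_params!r}."


names = ["module", "criterion", "optimizer"]

as_generator = (name for name in names)  # repr shows only the generator object
as_list = [name for name in names]       # repr shows the actual contents

print(describe_params(as_generator))
print(describe_params(as_list))
# second line printed:
# Valid parameters are: ['module', 'criterion', 'optimizer'].
```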

opened by githubnemo 0

Add support for torch.compile

Description

PyTorch announced the torch.compile feature for torch 2.0:

https://pytorch.org/get-started/pytorch-2.0

It is necessary that we modify skorch if we want to support the feature, since torch.compile needs to be called on the initialized modules and initialization happens inside of skorch.

Now users can create a net like NeuralNet(..., compile=True) and all torch modules should be compiled. By default, compile=False.

Status

Torch 2.0 is not officially released yet, but the feature can already be tested.

For this, follow the install instructions here:

https://pytorch.org/get-started/pytorch-2.0/#requirements

(Note that in some places, like PyTorch nightly, the version is called 1.14.)

To actually see speed improvements, a GPU with a new architecture (Volta, Ampere) is required.

Implementation

I opted for implementing a single new argument, compile, which can be set to True. In that case, during initialization of the module and criterion, the method net.torch_compile will be called on each module separately and the compiled modules will be stored on the net.

So e.g. the net.module_ will be the compiled module. The original module would be accessible through net.module_._orig_module:

https://github.com/pytorch/pytorch/blob/b0bd5c4508a8923685965614c2c74e6a8c82f7ba/torch/_dynamo/eval_frame.py#L71

However, given that it's a private attribute, I assume that the compiled module should just be used, and work, like the original module and that shouldn't cause any trouble.

Furthermore, I wanted users to have the option to easily pass more arguments to torch.compile, which, as always, works through the use of dunder names, e.g. NeuralNet(compile=True, compile__mode=...).
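To illustrate the dunder convention mentioned above, the following sketch shows how arguments such as `compile__mode` could be separated from the rest by their prefix. It is not skorch's implementation; the helper name `kwargs_for_prefix` is hypothetical, and `"reduce-overhead"` is used only as an example value.

```python
def kwargs_for_prefix(prefix, kwargs):
    """Collect entries named '<prefix>__<name>' and strip the prefix."""
    marker = prefix + "__"
    return {
        key[len(marker):]: value
        for key, value in kwargs.items()
        if key.startswith(marker)
    }


net_kwargs = {"compile": True, "compile__mode": "reduce-overhead", "lr": 0.1}
print(kwargs_for_prefix("compile", net_kwargs))  # {'mode': 'reduce-overhead'}
```

The plain `compile` key is left out because it carries no dunder separator, so only the options meant for `torch.compile` are collected.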

If users set compile=True with a torch version that doesn't support this, an error is raised.

Furthermore, I decided to add a public torch_compile method on NeuralNet. This method gets the module and the module name, and returns the compiled module. The idea here is that if users, for instance, want to skip compiling specific modules, they can do so by checking the module name.

Testing

Since we introduce a new prefix with this PR, testing is a bit more comprehensive than normal. Please take a look at the TestTorchCompile test class, which collects all tests related to this new feature.

Since torch 2.0 is not officially released, I did not include it yet in the CI. Instead, I ran tests locally and it works. However, my GPU architecture is too old to see any speed improvements.

opened by BenjaminBossan 1
Test and update skorch notebooks on Colab
I was notified that the command we use in our notebooks to install packages on Colab no longer works:

! [ ! -z "$COLAB_GPU" ] && pip install ...

The problem is that apparently, the $COLAB_GPU env var was removed. I did a quick check and maybe another env var could be used to check, but who knows if that would be robust. Another approach would be something along the lines of:

try:
    import google.colab
    subprocess.run(['python', '-m', 'pip', 'install', ...])
except ImportError:
    pass

but I haven't tested that.
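A slightly more defensive variant of that idea is sketched below. It is equally untested on Colab itself and rests on the assumption that the `google.colab` package is importable only there. The function names `in_colab` and `pip_install_on_colab` are hypothetical, and `sys.executable` is used instead of `'python'` so that pip targets the running interpreter.

```python
import importlib.util
import subprocess
import sys


def in_colab():
    """Return True if the `google.colab` package can be found."""
    try:
        return importlib.util.find_spec("google.colab") is not None
    except ModuleNotFoundError:
        # Raised when the parent package `google` is not installed at all.
        return False


def pip_install_on_colab(*packages):
    """Install packages with pip, but only when running on Colab."""
    if not in_colab() or not packages:
        return False
    subprocess.run([sys.executable, "-m", "pip", "install", *packages], check=True)
    return True
```

Calling `pip_install_on_colab("skorch")` at the top of a notebook would then do nothing outside Colab.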

Other than these issues, it would be good to check if these notebooks still run successfully.
help wanted good first issue
opened by BenjaminBossan 18
Add an example of finetuning a HF VisionTransformer

Take a look at the notebook here. It sticks close to the example given in this blog post.

It uses very little custom code, as everything works almost out of the box.

Basically the same code is also added to a script, together with skorch's Fire-based CLI utility, to show how easy it is to transform the training code into a CLI.

Note: ~This is not quite ready to be merged, the notebook needs some more explanations and the colab link, and the docs have to be updated too.~ Done
enhancement

opened by BenjaminBossan 1

Releases(v0.12.1)

v0.12.1(Nov 18, 2022)
This is a small release which consists mostly of a couple of bug fixes. The standout feature here is the update of the NeptuneLogger, which makes it work with the latest Neptune client versions and adds many useful features, check it out. Big thanks to @twolodzko and colleagues for this update.

Here is the list of all changes:

Add Hugging Face integration tests #904

The entry for the HF badge was missing #905

Fix false warning if iterator_valid__shuffle=False #908

Update the Neptune integration by @twolodzko #906

DOC Update the documentation in several places #909

Don't fail when gpytorch import fails #913

v0.12.0(Oct 7, 2022)
We're pleased to announce a new skorch release, bringing new features that might interest you.

The main changes relate to better integration with the Hugging Face ecosystem:

Benefit from faster training and inference times thanks to easy integration with accelerate via skorch's AccelerateMixin.

Better integration of tokenizers via skorch's HuggingfaceTokenizer and HuggingfacePretrainedTokenizer; you can even put Hugging Face tokenizers into an sklearn Pipeline and perform a grid search to find the best tokenizer hyperparameters.

Automatically upload model checkpoints to Hugging Face Hub via skorch's HfHubStorage.

Check out this notebook to see how to use skorch and Hugging Face together.

But this is not all. We have added the possibility to load the best model parameters at the end of training when using the EarlyStopping callback. We also added the possibility to remove unneeded attributes from the net after training when it is intended to be only used for prediction by calling the trim_for_prediction method. Moreover, we now show how to use skorch with PyTorch Geometric in this notebook.

As always, this release was made possible by outside contributors. Many thanks to:

Alan deLevie (@adelevie)

Cédric Rommel (@cedricrommel)

Florian Pinault (@floriankrb)

@terminator-ger

Timo Kaufmann (@timokau)

@TrellixVulnTeam

Find below the list of all changes:

Added

Added load_best attribute to EarlyStopping callback to automatically load module weights of the best result at the end of training

Added a method, trim_for_prediction, on the net classes, which trims the net from everything not required for using it for prediction; call this after fitting to reduce the size of the net

Added experimental support for huggingface accelerate; use the provided mixin class to add advanced training capabilities provided by the accelerate library to skorch

Add integration for Huggingface tokenizers; use skorch.hf.HuggingfaceTokenizer to train a Huggingface tokenizer on your custom data; use skorch.hf.HuggingfacePretrainedTokenizer to load a pre-trained Huggingface tokenizer

Added support for creating model checkpoints on Hugging Face Hub using HfHubStorage

Added a notebook that shows how to use skorch with PyTorch Geometric (#863)

Changed

The minimum required scikit-learn version has been bumped to 0.22.0

Initialize data loaders for training and validation dataset once per fit call instead of once per epoch (migration guide)

It is now possible to call np.asarray with SliceDatasets (#858)

Fixed

Fix a bug in SliceDataset that prevented it from being used with to_numpy (#858)

Fix a bug that occurred when loading a net that has device set to None (#876)

Fix a bug that in some cases could prevent loading a net that was trained with CUDA without CUDA

Enable skorch to work on M1/M2 Apple MacBooks (#884)

v0.11.0(Oct 31, 2021)
We are happy to announce the new skorch 0.11 release:

Two basic but very useful features have been added to our collection of callbacks. First, by setting load_best=True on the Checkpoint callback, the snapshot of the network with the best score will be loaded automatically when training ends. Second, we added a callback InputShapeSetter that automatically adjusts your input layer to have the size of your input data (useful e.g. when that size is not known beforehand).

When it comes to integrations, the MlflowLogger now allows automatic logging to MLflow. Thanks to a contributor, some regressions in net.history have been fixed and it even runs faster now.

On top of that, skorch now offers a new module, skorch.probabilistic. It contains new classes to work with Gaussian Processes using the familiar skorch API. This is made possible by the fantastic GPyTorch library, which skorch uses for this. So if you want to get started with Gaussian Processes in skorch, check out the documentation and this notebook. Since we're still learning, it's possible that we will change the API in the future, so please be aware of that.

Moreover, we introduced some changes to make skorch more customizable. First of all, we changed the signature of some methods so that they no longer assume the dataset to always return exactly 2 values. This way, it's easier to work with custom datasets that return e.g. 3 values. Normal users should not notice any difference, but if you often create custom nets, take a look at the migration guide.

And finally, we made a change to how custom modules, criteria, and optimizers are handled. They are now "first class citizens" in skorch land, which means: If you add a second module to your custom net, it is treated exactly the same as the normal module. E.g., skorch takes care of moving it to CUDA if needed and of switching it to train or eval mode. This way, customizing your network architectures with skorch is easier than ever. Check the docs for more details.

Since these are some big changes, it's possible that you encounter issues. If that's the case, please check our issue page or create a new one.

As always, this release was made possible by outside contributors. Many thanks to:

Autumnii

Cebtenzzre

Charles Cabergs

Immanuel Bayer

Jake Gardner

Matthias Pfenninger

Prabhat Kumar Sahu

Find below the list of all changes:

Added

Added load_best attribute to Checkpoint callback to automatically load state of the best result at the end of training

Added a get_all_learnable_params method to retrieve the named parameters of all PyTorch modules defined on the net, including of criteria if applicable

Added MlflowLogger callback for logging to Mlflow (#769)

Added InputShapeSetter callback for automatically setting the input dimension of the PyTorch module

Added a new module to support Gaussian Processes through GPyTorch. To learn more about it, read the GP documentation or take a look at the GP notebook. This feature is experimental, i.e. the API could be changed in the future in a backwards incompatible way (#782)

Changed

Changed the signature of validation_step, train_step_single, train_step, evaluation_step, on_batch_begin, and on_batch_end such that instead of receiving X and y, they receive the whole batch; this makes it easier to deal with datasets that don't strictly return an (X, y) tuple, which is true for quite a few PyTorch datasets; please refer to the migration guide if you encounter problems (#699)

Checking of arguments to NeuralNet is now during .initialize(), not during __init__, to avoid raising false positives for yet unknown module or optimizer attributes

Modules, criteria, and optimizers that are added to a net by the user are now first class: skorch takes care of setting train/eval mode, moving to the indicated device, and updating all learnable parameters during training (check the docs for more details, #751)

CVSplit is renamed to ValidSplit to avoid confusion (#752)

Fixed

Fixed a few bugs in the net.history implementation (#776)

Fixed a bug in TrainEndCheckpoint that prevented it from being unpickled (#773)

v0.10.0(Mar 23, 2021)
This one is a smaller release, but we have some bigger additions waiting for the next one.

First we added support for Sacred to help you better organize your experiments. The CLI helper now also works with non-skorch estimators, as long as they are sklearn compatible. Some issues related to learning rate scheduling have been solved.

A big topic this time was also working on performance. First of all, we added a performance section to the docs. Furthermore, we facilitated switching off callbacks completely if performance is absolutely critical. Finally, we improved the speed of some internals (history logging). In sum, that means that skorch should be much faster for small network architectures.

We are grateful to the contributors, new and recurring:

Fariz Rahman

Han Bao

Scott Sievert

supetronix

Timo Kaufmann

v0.9.0(Aug 30, 2020)
This release of skorch contains a few minor improvements and some nice additions. As always, we fixed a few bugs and improved the documentation. Our learning rate scheduler now optionally logs learning rate changes to the history; moreover, it now allows the user to choose whether an update step should be made after each batch or each epoch.

If you always longed for a metric that would just use whatever is defined by your criterion, look no further than loss_scoring. Also, skorch now allows you to easily change the kind of nonlinearity to apply to the module's output when predict and predict_proba are called, by passing the predict_nonlinearity argument.

Besides these changes, we improved the customization potential of skorch. First of all, the criterion is now set to train or valid, depending on the phase -- this is useful if the criterion should act differently during training and validation. Next we made it easier to add custom modules, optimizers, and criteria to your neural net; this should facilitate implementing architectures like GANs. Consult the docs for more on this. Conveniently, net.save_params can now persist arbitrary attributes, including those custom modules. As always, these improvements wouldn't have been possible without the community. Please keep asking questions, raising issues, and proposing new features. We are especially grateful to those community members, old and new, who contributed via PRs:

Aaron Berk

guybuk

kqf

Michał Słapek

Scott Sievert

Yann Dubois

Zhao Meng

Here is the full list of all changes:

Added

Added the event_name argument for LRScheduler for optional recording of LR changes inside net.history. NOTE: Supported only in PyTorch >= 1.4

Make it easier to add custom modules or optimizers to a neural net class by automatically registering them where necessary and by making them available to set_params

Added the step_every argument for LRScheduler to set whether the scheduler step should be taken on every epoch or on every batch.

Added the scoring module with loss_scoring function, which computes the net's loss (using get_loss) on provided input data.

Added a parameter predict_nonlinearity to NeuralNet which allows users to control the nonlinearity to be applied to the module output when calling predict and predict_proba (#637, #661)

Added the possibility to save the criterion with save_params and with checkpoint callbacks

Added the possibility to save custom modules with save_params and with checkpoint callbacks

Changed

Removed support for schedulers with a batch_step() method in LRScheduler.

Raise FutureWarning in CVSplit when random_state is not used. Will raise an exception in a future release (#620)

The behavior of the method net.get_params changed to make it more consistent with sklearn: it no longer returns "learned" attributes like module_. Therefore, functions like sklearn.base.clone, when called with a fitted net, no longer return a fitted net but instead an uninitialized net. If you want a copy of a fitted net, use copy.deepcopy instead. Note that net.get_params is used under the hood by many sklearn functions and classes, such as GridSearchCV, whose behavior may thus be affected by the change. (#521, #527)

Raise FutureWarning when using CyclicLR scheduler, because the default behavior has changed from taking a step every batch to taking a step every epoch. (#626)

Set train/validation on criterion if it's a PyTorch module (#621)

Don't pass y=None to NeuralNet.train_split to enable the direct use of split functions without positional y in their signatures. This is useful when working with unsupervised data (#605).

to_numpy is now able to unpack dicts and lists/tuples (#657, #658)

When using CrossEntropyLoss, softmax is now automatically applied to the output when calling predict or predict_proba

Fixed

Fixed a bug where CyclicLR scheduler would update during both training and validation rather than just during training.

Fixed a bug introduced by moving the optimizer.zero_grad() call outside of the train step function, making it incompatible with LBFGS and other optimizers that call the train step several times per batch (#636)

Fixed pickling of the ProgressBar callback (#656)

v0.8.0(Apr 12, 2020)
This release contains improvements on the callback side of things. Thanks to new contributors, skorch now integrates with neptune through NeptuneLogger and Weights & Biases through WandbLogger. We also added PassthroughScoring, which automatically creates epoch level scores based on computed batch level scores.

If you want skorch not to meddle with moving modules and data to certain devices, you can now pass device=None and thus have full control. And if you would like to pass pandas DataFrames as input data but were unhappy with how skorch currently handles them, take a look at DataFrameTransformer. Moreover, we cleaned up duplicate code in the fit loop, which should make it easier for users to make their own changes to it. Finally, we improved skorch compatibility with sklearn 0.22 and added minor performance improvements.

As always, we're very thankful for everyone who opened issues and asked questions on diverse channels; all forms of feedback and questions are welcome. We're also very grateful for all contributors, some old but many new:

Alexander Kolb

Benjamin Ajayi-Obe

Boris Dayma

Jakub Czakon

Riccardo Di Maio

Thomas Fan

Yann Dubois

Here is a list of all the changes and their corresponding ticket numbers in detail:

Added

Added NeptuneLogger callback for logging experiment metadata to neptune.ai (#586)

Add DataFrameTransformer, an sklearn compatible transformer that helps working with pandas DataFrames by transforming the DataFrame into a representation that works well with neural networks (#507)

Added WandbLogger callback for logging to Weights & Biases (#607)

Added None option to device which leaves the device(s) unmodified (#600)

Add PassthroughScoring, a scoring callback that just calculates the average score of a metric determined at batch level and then writes it to the epoch level (#595)

Changed

When using caching in scoring callbacks, no longer uselessly iterate over the data; this can save time if iteration is slow (#552, #557)

Cleaned up duplicate code in the fit_loop (#564)

Fixed

Make skorch compatible with sklearn 0.22 (#571, #573, #575)

Fixed a bug that could occur when a new "settable" (via set_params) attribute was added to NeuralNet whose name starts the same as an existing attribute's name (#590)

v0.7.0(Nov 29, 2019)
Version 0.7.0

Notable additions are TensorBoard support through a callback and several improvements to the NeuralNetClassifier and NeuralNetBinaryClassifier to make them more compatible with sklearn metrics and packages by adding support for class inference among other things. We are actively pursuing some bigger topics which did not fit in this release such as scoring caching improvements (#557), a DataFrameTransformer (#507) and improvements to the training loop layout (#564) which we hope to bring to the next release.

WARNING: In a future release, the behavior of method net.get_params will change to make it more consistent with sklearn: it will no longer return "learned" attributes like module_. Therefore, functions like sklearn.base.clone, when called with a fitted net, will no longer return a fitted net but instead an uninitialized net. If you want a copy of a fitted net, use copy.deepcopy instead. Note that net.get_params is used under the hood by many sklearn functions and classes, such as GridSearchCV, whose behavior may thus be affected by the change. (#521, #527)

We had an influx of new contributors and users, whom we thank for their support in opening pull requests and filing issues! Most notably, thanks to the individual contributors who made this release possible:

Alexander Kolb

Janaki Sheth

Joshy Cyriac

Matthias Gazzari

Sergey Alexandrov

Thomas Fan

Zhao Meng

Here is a list of all the changes and their corresponding ticket numbers in detail:

Added

More careful check for wrong parameter names being passed to NeuralNet (#500)

More helpful error messages when trying to predict using an uninitialized model

Add TensorBoard callback for automatic logging to tensorboard

Make NeuralNetBinaryClassifier work with sklearn.calibration.CalibratedClassifierCV

Improve NeuralNetBinaryClassifier compatibility with certain sklearn metrics (#515)

NeuralNetBinaryClassifier automatically squeezes module output if necessary (#515)

NeuralNetClassifier now has a classes_ attribute after fit is called, which is inferred from y by default (#465, #486)

NeuralNet.load_params with a checkpoint now initializes when needed (#497)

Changed

Improve numerical stability when using NLLLoss in NeuralNetClassifier (#491)

Refactor code to make gradient accumulation easier to implement (#506)

NeuralNetBinaryClassifier.predict_proba now returns a 2-dim array; to access the "old" y_proba, take y_proba[:, 1] (#515)

net.history is now a property that accesses net.history_, which stores the History object (#527)

Remove deprecated skorch.callbacks.CyclicLR, use torch.optim.lr_scheduler.CyclicLR instead

Future Changes

WARNING: In a future release, the behavior of method net.get_params will change to make it more consistent with sklearn: it will no longer return "learned" attributes like module_. Therefore, functions like sklearn.base.clone, when called with a fitted net, will no longer return a fitted net but instead an uninitialized net. If you want a copy of a fitted net, use copy.deepcopy instead. Note that net.get_params is used under the hood by many sklearn functions and classes, such as GridSearchCV, whose behavior may thus be affected by the change. (#521, #527)

Fixed

Fixed a bug that caused LoadInitState not to work with TrainEndCheckpoint (#528)

Fixed NeuralNetBinaryClassifier wrongly squeezing the batch dimension when using batch_size = 1 (#558)

v0.6.0(Jun 19, 2019)
[0.6.0] - 2019-07-19

This release introduces convenience features such as SliceDataset which makes using torch datasets (e.g. from torchvision) easier in combination with sklearn features such as GridSearchCV. There was also some work to make the transition from CUDA trained models to CPU smoother and learning rate schedulers were upgraded to use torch builtin functionality.

Here's the full list of changes:

Added

Adds FAQ entry regarding the initialization behavior of NeuralNet when passed instantiated models. (#409)

Added CUDA pickle test including an artifact that supports testing on CUDA-less CI machines

Adds train_batch_count and valid_batch_count to history in training loop. (#445)

Adds score method for NeuralNetClassifier, NeuralNetBinaryClassifier, and NeuralNetRegressor (#469)

Wrapper class for torch Datasets to make them work with some sklearn features (e.g. grid search). (#443)

Changed

Repository moved to https://github.com/skorch-dev/skorch/, please change your git remotes

Treat cuda dependent attributes as prefix to cover values set using set_params since previously "criterion_" would not match net.criterion__weight as set by net.set_params(criterion__weight=w)

skorch pickle format changed in order to improve CUDA compatibility, if you have pickled models, please re-pickle them to be able to load them in the future

net.criterion_ and its parameters are now moved to target device when using criteria that inherit from torch.nn.Module. Previously the user had to make sure that parameters such as class weight are on the compute device

skorch now assumes PyTorch >= 1.1.0. This mainly affects learning rate schedulers, whose inner workings have been changed with version 1.1.0. This update will also invalidate pickled skorch models after a change introduced in PyTorch optimizers.

Fixed

Include requirements in MANIFEST.in

Add criterion_ to NeuralNet.cuda_dependent_attributes_ to avoid issues with criterion weight tensors from, e.g., NLLLoss (#426)

TrainEndCheckpoint can be cloned by sklearn.base.clone. (#459)

Thanks to all the contributors:

Bram Vanroy

Damien Lancry

Ethan Rosenthal

Sergey Alexandrov

Thomas Fan

Zayd Hammoudeh

v0.5.0(Dec 13, 2018)
Version 0.5.0

Added

Basic usage notebook now runs on Google Colab

Advanced usage notebook now runs on Google Colab

MNIST with scikit-learn and skorch now runs on Google Colab

Better user-facing messages when module or optimizer are re-initialized

Added an experimental API (net._register_virtual_param) to register "virtual" parameters on the network with custom setter functions. (#369)

Setting parameters lr, momentum, optimizer__lr, etc. no longer resets the optimizer. As of now you can do net.set_params(lr=0.03) or net.set_params(optimizer__param_group__0__momentum=0.86) without triggering a re-initialization of the optimizer (#369)

Support for scipy sparse CSR matrices as input (as, e.g., returned by sklearn's CountVectorizer); note that they are cast to dense matrices during batching

Helper functions to build command line interfaces with almost no boilerplate, example that shows usage

Changed

Reduce overhead of BatchScoring when using train_loss_score or valid_loss_score by skipping superfluous inference step (#381)

To reduce the run-time overhead of the call, the on_grad_computed callback function will yield an iterable for named_parameters only when it is used (#379)

Default fn_prefix in TrainEndCheckpoint is now train_end_ (#391)

Issues a warning when Checkpoint's monitor parameter is set to monitor and the history contains <monitor>_best. (#399)

Fixed

Re-initialize optimizer when set_params is called with lr argument (#372)

Copying a SliceDict now returns a SliceDict instead of a dict (#388)

Calling == on SliceDicts now works as expected when values are numpy arrays and torch tensors

v0.4.0(Oct 24, 2018)
Organisational

From now on we will organize a change log and document every change directly. If you are a contributor we encourage you to document your changes directly in the change log when submitting a PR to reduce friction when preparing new releases.

Added

Support for PyTorch 0.4.1

There is no need to explicitly name callbacks anymore (names are assigned automatically, name conflicts are resolved).

You can now access the training data in the on_grad_computed event

There is a new image segmentation example

Easily create toy network instances for quick experiments using skorch.toy.make_classifier and friends

New ParamMapper callback to modify/freeze/unfreeze parameters at certain point in time during training:

>>> from skorch.callbacks import Freezer, Unfreezer
>>> net = Net(module, callbacks=[Freezer('layer*.weight'), Unfreezer('layer*.weight', at=10)])

Refactored EpochScoring for easier sub-classing

Checkpoint callback now supports saving the optimizer, this avoids problems with stateful optimizers such as Adam or RMSprop (#360)

Added LoadInitState callback for easy continued training from checkpoints (#360)

NeuralNet.load_params now supports loading from Checkpoint instances

Added documentation for saving and loading highlighting the new features

Changed

The ProgressBar callback now determines the batches per epoch automatically by default (batches_per_epoch=auto)

The on_grad_computed event now has access to the current training data batch

Deprecated

Deprecated filtered_optimizer in favor of Freezer callback (#346)

NeuralNet.load_params and NeuralNet.save_params deprecate f parameter for the sake of f_optimizer, f_params and f_history (#360)

Removed

skorch.net.NeuralNetClassifier and skorch.net.NeuralNetRegressor are removed. Use from skorch import NeuralNetClassifier or skorch.NeuralNetClassifier instead.

Fixed

uses_placeholder_y should not require existence of y field (#311)

LR scheduler creates batch_idx on first run (#314)

Use OrderedDict for callbacks to fix python 3.5 compatibility issues (#331)

Make to_tensor work correctly with PackedSequence (#335)

Rewrite History to not use any recursion to avoid memory leaks during exceptions (#312)

Use flaky in some neural network tests to hide platform differences

Fixes ReduceLROnPlateau when mode == max (#363)

Fix disconnected weights between net and optimizer after copying the net with copy.deepcopy (#318)

Fix a bug that interfered with loading CUDA models when the model was a CUDA tensor but the net was configured to use the CPU (#354, #358)

Contributors

Again we'd like to thank all the contributors for their awesome work. Thank you

Andrew Spott

Dave Hirschfeld

Scott Sievert

Sergey Alexandrov

Thomas Fan

v0.3.0(Jul 26, 2018)
Features

significantly reduced overhead of skorch over pytorch for small/medium loads

predefined splits are easier to use (skorch.helper.predefined_split)

freezing layers is now easier with skorch.helper.filtered_optimizer

introduce NeuralNetBinaryClassifier

introduce early stopping callback

support parallel grid search using Dask

support for LBFGS

history can be saved/loaded independently

learning rate schedulers have a method to simulate their behavior

Checkpoint callback supports pickling and history saving

Checkpoint callback is less noisy

added transfer learning tutorial

added a tutorial on how to expose skorch via REST API

improved documentation

API changes

train_step is now split in train_step and train_step_single in order to support LBFGS, where train_step_single takes the role of your typical training inner-loop when writing PyTorch models

device parameter on skorch.dataset.Dataset is now deprecated

Checkpoint parameter target is deprecated in favor of f_params

Contributors

A big thanks to our contributors who helped make this release possible:

Andrew Spott

Scott Sievert

Sergey Alexandrov

Thomas Fan

Tomasz Pietruszka

v0.2.0(May 4, 2018)
Features

PyTorch 0.4 support

Add GradNormClipping callback

Add generic learning rate scheduler callback

Add CyclicLR learning rate scheduler

Add WarmRestartLR learning rate scheduler

Scoring callbacks now re-use predictions, accelerating training

fit() and inference methods (e.g., predict()) now support torch.utils.data.Dataset as input as long as (X, y) pairs are returned

forward and forward_iter now allow you to specify on which device to store intermediate predictions

Support for setting optimizer param groups using wildcards (e.g., {'layer*.bias': {'lr': 0}})

Computed gradients can now be processed by callbacks using on_grad_computed

Support for fit_params parameter which gets passed directly to the module

Add skorch.helper.SliceDict so that you can use dict as X with sklearn's GridSearchCV, etc.

Add Dockerfile

API changes

Deprecated use_cuda parameter in favor of device parameter

skorch.utils.to_var is gone in favor of skorch.utils.to_tensor

train_step and validation_step now return a dict with the loss and the module's prediction

predict and predict_proba now handle multiple outputs by assuming the first output to be the prediction

NeuralNetClassifier now only takes log of prediction if the criterion is set to NLLLoss

Examples

RNN sentiment classification

Communication

We now run the #skorch channel on the PyTorch slack workspace

Contributors

A big thanks to our contributors who helped make this release possible:

Felipe Ribeiro

Grzegorz Rygielski

Juri Paern

Thomas Fan
