Lightweight library to build and train neural networks in Theano

Lasagne

Last update: Dec 29, 2022

Overview

Lasagne is a lightweight library to build and train neural networks in Theano. Its main features are:

Supports feed-forward networks such as Convolutional Neural Networks (CNNs), recurrent networks including Long Short-Term Memory (LSTM), and any combination thereof
Allows architectures of multiple inputs and multiple outputs, including auxiliary classifiers
Many optimization methods including Nesterov momentum, RMSprop and ADAM
Freely definable cost function and no need to derive gradients due to Theano's symbolic differentiation
Transparent support of CPUs and GPUs due to Theano's expression compiler

Its design is governed by six principles:

Simplicity: Be easy to use, easy to understand and easy to extend, to facilitate use in research
Transparency: Do not hide Theano behind abstractions, directly process and return Theano expressions or Python / numpy data types
Modularity: Allow all parts (layers, regularizers, optimizers, ...) to be used independently of Lasagne
Pragmatism: Make common use cases easy, do not overrate uncommon cases
Restraint: Do not obstruct users with features they decide not to use
Focus: "Do one thing and do it well"

Installation

In short, you can install a known compatible version of Theano and the latest Lasagne development version via:

pip install -r https://raw.githubusercontent.com/Lasagne/Lasagne/master/requirements.txt
pip install https://github.com/Lasagne/Lasagne/archive/master.zip

For more details and alternatives, please see the Installation instructions.

Documentation

Documentation is available online: http://lasagne.readthedocs.org/

For support, please refer to the lasagne-users mailing list.

Example

import lasagne
import theano
import theano.tensor as T

# create Theano variables for input and target minibatch
input_var = T.tensor4('X')
target_var = T.ivector('y')

# create a small convolutional neural network
from lasagne.nonlinearities import leaky_rectify, softmax
network = lasagne.layers.InputLayer((None, 3, 32, 32), input_var)
network = lasagne.layers.Conv2DLayer(network, 64, (3, 3),
                                     nonlinearity=leaky_rectify)
network = lasagne.layers.Conv2DLayer(network, 32, (3, 3),
                                     nonlinearity=leaky_rectify)
network = lasagne.layers.Pool2DLayer(network, (3, 3), stride=2, mode='max')
network = lasagne.layers.DenseLayer(lasagne.layers.dropout(network, 0.5),
                                    128, nonlinearity=leaky_rectify,
                                    W=lasagne.init.Orthogonal())
network = lasagne.layers.DenseLayer(lasagne.layers.dropout(network, 0.5),
                                    10, nonlinearity=softmax)

# create loss function
prediction = lasagne.layers.get_output(network)
loss = lasagne.objectives.categorical_crossentropy(prediction, target_var)
loss = loss.mean() + 1e-4 * lasagne.regularization.regularize_network_params(
        network, lasagne.regularization.l2)

# create parameter update expressions
params = lasagne.layers.get_all_params(network, trainable=True)
updates = lasagne.updates.nesterov_momentum(loss, params, learning_rate=0.01,
                                            momentum=0.9)

# compile training function that updates parameters and returns training loss
train_fn = theano.function([input_var, target_var], loss, updates=updates)

# train network (assuming you've got some training data in numpy arrays)
for epoch in range(100):
    loss = 0
    for input_batch, target_batch in training_data:
        loss += train_fn(input_batch, target_batch)
    print("Epoch %d: Loss %g" % (epoch + 1, loss / len(training_data)))

# use trained network for predictions
test_prediction = lasagne.layers.get_output(network, deterministic=True)
predict_fn = theano.function([input_var], T.argmax(test_prediction, axis=1))
print("Predicted class for first test input: %r" % predict_fn(test_data[0]))

For a fully-functional example, see examples/mnist.py, and check the Tutorial for in-depth explanations of the same. More examples, code snippets and reproductions of recent research papers are maintained in the separate Lasagne Recipes repository.
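The training loop in the example assumes that `training_data` is a sized collection of `(input_batch, target_batch)` pairs. As a minimal sketch of one way to build such a collection: `make_minibatches` below is a hypothetical helper, not part of Lasagne. It relies only on `len()` and slicing, so it should behave the same for Python lists and NumPy arrays.

```python
def make_minibatches(inputs, targets, batch_size):
    """Split parallel sequences into a list of (input_batch, target_batch) pairs.

    Hypothetical helper for illustration; not part of Lasagne.
    """
    if len(inputs) != len(targets):
        raise ValueError("inputs and targets must have the same length")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [(inputs[start:start + batch_size], targets[start:start + batch_size])
            for start in range(0, len(inputs), batch_size)]


# Toy data standing in for real arrays: 5 examples, batches of 2.
training_data = make_minibatches(list(range(5)), list("abcde"), batch_size=2)
```

The last batch is smaller when the number of examples is not a multiple of `batch_size`. Because the example network is built with a batch dimension of `None`, a variable batch size should be acceptable there.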

Citation

If you find Lasagne useful for your scientific work, please consider citing it in resulting publications. We provide a ready-to-use BibTeX entry for citing Lasagne.

Development

Lasagne is a work in progress; input is welcome.

Please see the Contribution instructions for details on how you can contribute!

Comments

Recurrence

layers.py, on which most of the nntools code is based, has always been geared towards feed-forward neural networks. We should look into recurrent neural networks as well. Personally I don't have a lot of hands-on experience with this type of model. I'm not entirely sure how hard it would be to implement the necessary elements - would modifications to the library design be required? Does anyone have any insights about this?

opened by benanne 234
Allow Layers to be constructed from input shapes
As discussed in #17, it seemed to be a good idea to allow Layer instances to be constructed from input shapes rather than other Layer instances. While get_output() would not work for such layers, they could still be used standalone with get_output_for() and get_output_shape_for() -- after all, a layer only needs to know about its input and input shape to function.

We thought that implementing this would only touch the Layer base class (and the MultipleInputsLayer base class). However, while implementing it, I found that several other layer implementations rely on self.input_layer.get_output_shape() to work. There are about four ways of solving it:

Distinguishing between self.input_layer being a shape tuple or a layer in each of these cases

Moving this distinction into a new get_input_shape() method in the base class

Relying on this distinction being present in Layer.get_output_shape() and using super(XYZLayer, self).get_output_shape() instead of self.input_layer.get_output_shape() (this only works for classes directly derived from Layer, because in this case the super get_output_shape() will just return the input shape, so that's both ugly and fragile)

Wrapping a shape tuple in something that implements get_output_shape()

It turns out that we have a ready-made class for option 4, which is InputLayer. By wrapping shape tuples given in the constructor in InputLayer instances, only the base classes need to be changed. However, it feels like an unnecessary / misplaced shortcut now... what do we gain compared to requiring the user to provide an InputLayer instance herself or himself? Should we possibly choose option 2 instead? Should we create an InputShape class that really just wraps a shape tuple with a get_output_shape() method? Or is this proposal a good solution anyway?
opened by f0k 92
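Option 4 from the issue above (wrapping a bare shape tuple in something that implements get_output_shape()) can be sketched with toy classes. The classes below only borrow names from the discussion; they are hypothetical stand-ins and not Lasagne's actual implementation.

```python
class InputLayer:
    """Toy stand-in: a layer that only knows its own shape."""

    def __init__(self, shape):
        self.shape = tuple(shape)

    def get_output_shape(self):
        return self.shape


class Layer:
    """Toy base class accepting either a layer or a shape tuple."""

    def __init__(self, incoming):
        # Option 4: wrap a bare shape tuple so that subclasses can always
        # call self.input_layer.get_output_shape().
        if isinstance(incoming, tuple):
            incoming = InputLayer(incoming)
        self.input_layer = incoming

    def get_output_shape(self):
        return self.get_output_shape_for(self.input_layer.get_output_shape())

    def get_output_shape_for(self, input_shape):
        return input_shape


class DenseLayer(Layer):
    """Toy dense layer: maps (batch, features) to (batch, num_units)."""

    def __init__(self, incoming, num_units):
        super().__init__(incoming)
        self.num_units = num_units

    def get_output_shape_for(self, input_shape):
        return (input_shape[0], self.num_units)


from_shape = DenseLayer((None, 100), num_units=50)
from_layer = DenseLayer(InputLayer((None, 100)), num_units=50)
stacked = DenseLayer(from_shape, num_units=10)
```

Only the base class distinguishes between a tuple and a layer here; `DenseLayer` never has to, which is the property the issue attributes to this option.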
Add recurrent layers
Addresses #17. Includes layers for LSTM, GRU, "vanilla" (dense) RNN, and an RNN with arbitrary connection patterns. Thanks to @skaae for the tests and the GRU layer (+more), and @JackKelly for some comment fixes and additional functionality. Some misc comments that may warrant addressing -

We added examples to the examples directory. Do you want these? The usage of the recurrent layers is different enough that an example is probably warranted, but there is also an example in the top docstring of the recurrent submodule, which covers the absolute basics in terms of how to reshape things to make it work. Any examples should probably be supplemented with more fleshed-out stuff, like @skaae's penn treebank example and my speech rec example.

We didn't resolve these two issues: https://github.com/craffel/nntools/issues/30 https://github.com/craffel/nntools/issues/15 which essentially are the same thing - i.e., the ability to use the RNNs for sequence prediction. It's probably a feature we want to have, as it's a very hot topic at the moment.

The LSTM layer (and the GRU layer, to some extent) has a LOT of __init__ arguments because there are so many things to initialize. Most of the time people initialize all weight matrices in the same way, although I'm sure that's not always true. It's actually quite common to initialize the forget gate bias vector differently than the rest.

The peepholes kwarg of the LSTM layer decides whether to include connections from the cell to the gates; sometimes people use these, sometimes they don't, so it's important that we include this functionality. However, people also sometimes don't include other connections, and we don't have functionality for this. What I'd really like is something where if one of the arguments is passed as None, then the connection is not used, but this would make the code pretty ugly - we'd essentially have to define a new dot method which does nothing when one of the arguments is None or something. So, in short, I'd rather not address this now.

The grad_clipping argument in all layers clips the gradients, which is very common for RNNs. I think, however, this is pretty common for other network architectures too, so what I'd really like to see is it be moved out to a global function in updates, or something. But, this isn't something I'm familiar with enough to know how feasible that is, @skaae would be better to comment on that.

hid_init (and cell_init) are allowed to be TensorVariables, see our discussion here. https://github.com/craffel/nntools/pull/27#issuecomment-107989355 To summarize that, I'd rather have them be able to be variable-size shared variables which are added with add_param, but we can't because of the way we allow for variable batch size. Any ideas on that discussion are appreciated.

Ok, that's about it, let me know what you think! Would be great to get this merged before the first release.
opened by craffel 87
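The pull request above suggests moving gradient clipping out of the layers into a global function. As a rough, framework-free sketch of what such a function could look like, assuming that element-wise clipping to a symmetric interval is what is meant: `clip_gradients` is a hypothetical name, and it works on plain lists of floats rather than Theano expressions.

```python
def clip_gradients(grads, bound):
    """Clip every gradient entry to the interval [-bound, bound].

    `grads` is a list of flat lists of floats, one per parameter. This is a
    hypothetical, framework-free helper, not Lasagne's implementation.
    """
    if bound <= 0:
        raise ValueError("bound must be positive")
    return [[max(-bound, min(bound, g)) for g in grad] for grad in grads]


# Two toy "parameters" with one oversized gradient entry each.
grads = [[0.5, -3.0, 12.0], [-0.1, 100.0]]
clipped = clip_gradients(grads, bound=5.0)
```

Applied once to the full list of gradients, a function like this would not need a per-layer argument, which is the design the comment argues for.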
Documentation: API reference to do list
Here is a to do list for the API reference. Each of these could be a separate PR, or you can tackle multiple in a single PR. I guess it's a good idea to post here if you plan to work on one or more of these, so we don't end up doing any double work. Then I'll add your name on the list for an overview. Let's try to get this done within the next two weeks!

[x] convert lasagne.layers.helper docs to numpydoc format (@MartinThoma)

[x] convert lasagne.layers.base docs to numpydoc format (@MartinThoma)

[x] convert lasagne.layers.input docs to numpydoc format (@MartinThoma)

[x] complete lasagne.layers.dense docs and convert to numpydoc format (@skaae)

[x] document lasagne.layers.conv (@benanne)

[x] document lasagne.layers.pool (@ebenolson, @benanne)

[x] document lasagne.layers.noise (@skaae)

[x] complete lasagne.layers.shape docs and convert to numpydoc format (@skaae)

[x] document lasagne.layers.merge (@skaae)

[x] figure out how to make the docs for corrmm, cuda_convnet and dnn show up on readthedocs (these submodules cannot be imported without a gpu) (@benanne)

[x] document lasagne.layers.dnn (@benanne)

[x] document lasagne.layers.corrmm (@benanne)

[x] document lasagne.layers.cuda_convnet (@benanne)

[x] complete lasagne.updates docs and convert to numpydoc format (@skaae)

[x] document lasagne.init (@skaae)

[x] document lasagne.nonlinearities (@MartinThoma)

[x] complete lasagne.objectives docs and convert to numpydoc format (@f0k)

[x] complete lasagne.regularization docs and convert to numpydoc format (@skaae)

[x] complete lasagne.utils docs and convert to numpydoc format (@JeffreyDF)

[x] change the layout so it looks decent (@f0k)

[x] get rid of string representations of objects in docstrings everywhere (e.g. W=<lasagne.init.GlorotUniform object at 0x7f699026a690>) (@benanne)

[x] split lasagne.layers documentation into separate pages for each submodule (@f0k)

documentation
opened by benanne 72
Move this repository to an organization?

I created this repository on my personal GitHub account, but since this is a joint project, maybe it's better to create an organization and move this repository to it. Added benefits would be more granular access control (commit rights etc.).

Also I would just be able to submit pull requests like everyone else, instead of committing to the repo directly. Currently this is convenient because a lot of the 'base' library is still missing, but I probably don't want to make a habit out of this in the long term.

What do you guys think?

If we decide to do this, it would be a good time to change the name as well (if we're going to do that, see #3), so we only have to update configurations once.

opened by benanne 64
Layer outputs being computed multiple times
Lasagne Layers compute their outputs by calling get_output() recursively on their parents. When a network has a tree structure, which is usually the case, this works fine. But when a network has a directed acyclic graph (DAG) structure, i.e. some layers get their input from the same layer, this leads to get_output() being called multiple times on this layer. As a result the output expression of this shared input layer (and all layers further down) is constructed multiple times.

In terms of performance this is not an issue since all these computations are symbolic anyway. We count on Theano's smarts to notice that these sub-expressions are the same, so it only computes them once at execution time. This works great in most cases, but unfortunately there are exceptions:

it fails when scan is used (see https://github.com/benanne/Lasagne/issues/17#issuecomment-70317085 and onwards, also https://github.com/benanne/Lasagne/issues/97#issuecomment-70718707). ~~This is because Theano does not seem to merge two scan operations that do the same computations.~~ EDIT: this is wrong, see @craffel's comment below. Theano does merge the scan operations, but it doesn't merge their gradients, which is also a problem.

it fails when layer outputs are nondeterministic (because of e.g. dropout), see #97. Random variates are re-sampled for each call to get_output(), so Theano rightly does not merge the sub-expressions here. But of course this isn't what we want.

there may be other cases that have not come up yet or have not been discovered yet.

If we are able to ensure that get_output() is called only once on each layer in a DAG, we can solve this issue. There are a couple of ways to do this:

memoization: make get_output_for() remember the output associated with a given input after it's been computed for the first time. The second time it just returns the remembered output (which will be the same Theano expression). Doing this library-wide may result in some gotchas (people expecting code to be executed that will no longer be executed, etc.), so I don't know if this is a good idea. Making it a per-layer thing that should be enabled or disabled seems like it wouldn't solve the problem adequately (and is hard to implement as well), and also adds a cognitive burden for the user.

making get_output() iterative instead of recursive: currently, get_output() calls get_output() on the parents and then feeds the result to get_output_for(). Instead of recursively computing output expressions like this, we could make this function iterative. We would still have to recurse to grab all the 'ancestors' of the layer (we can use get_all_layers() for this), but after that we can ensure that get_output_for() is only called once for each of them, even when the network structure is a DAG.

there may be other ways to solve this, ideas are welcome!

I think I like the second option best so far because the interface would remain the same. I also feel like this might be a less intrusive change than introducing memoization. I have to admit I haven't really thought it through yet, though. We'll also have to think about how this will handle InputLayers and such (they don't define a get_output_for() method, only get_output()).
opened by benanne 61
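The second option from the issue above (collect all ancestors first, then compute each layer's output exactly once) can be sketched with toy layers that count how often they are asked for an output. Everything below is a hypothetical illustration: the `Layer` class and this `get_all_layers` are simplified stand-ins, and strings take the place of Theano expressions.

```python
class Layer:
    """Toy layer that counts how often its output is computed."""

    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = list(inputs)
        self.calls = 0

    def get_output_for(self, input_values):
        self.calls += 1
        # Stand-in for building a symbolic expression.
        return "%s(%s)" % (self.name, ", ".join(input_values))


def get_output_recursive(layer):
    """Recursive scheme: shared ancestors are visited once per path."""
    return layer.get_output_for([get_output_recursive(p) for p in layer.inputs])


def get_all_layers(layer):
    """Return all ancestors of `layer` plus itself, in topological order."""
    seen, ordered = set(), []

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent in node.inputs:
            visit(parent)
        ordered.append(node)

    visit(layer)
    return ordered


def get_output_iterative(layer):
    """Iterative scheme: each layer's output is computed exactly once."""
    outputs = {}
    for node in get_all_layers(layer):
        outputs[id(node)] = node.get_output_for(
            [outputs[id(p)] for p in node.inputs])
    return outputs[id(layer)]


def build_diamond():
    """Input feeds `shared`, which feeds two branches merged at the top."""
    inp = Layer("in")
    shared = Layer("shared", [inp])
    left = Layer("left", [shared])
    right = Layer("right", [shared])
    return shared, Layer("merge", [left, right])


shared_rec, top_rec = build_diamond()
get_output_recursive(top_rec)
shared_it, top_it = build_diamond()
get_output_iterative(top_it)
print(shared_rec.calls, shared_it.calls)  # prints "2 1"
```

In the diamond-shaped network the recursive scheme builds the shared layer's output twice, once per branch, while the iterative scheme builds it once and reuses it. The external interface (one function taking the top layer) stays the same in both cases.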
Modify examples to be simpler

I'm curious if other people would be interested in having examples in this fashion: https://github.com/enlitic/lasagne4newbs/blob/master/mnist_conv.py. I've found the existing examples to be a little confusing to people new to Lasagne, and made that version to help them out. I think the biggest downside would be teaching slightly less than optimal practices (specifically not transferring a large amount of input data to the GPU at once).

opened by diogo149 59
Add BatchNormLayer
Finally started integrating my batch normalization implementation in Lasagne. Changes from the implementation in my gist:

Adding epsilon before taking the square root of the variance

Change parameters to have reduced dimensionality rather than broadcastable singleton dimensions, to match with BiasLayer (which in turn was made to match DenseLayer, Conv2DLayer etc.)

TODO:

[x] add tests

[x] fix LocalResponseNormalization2DLayer docstring while we're at it

Resolves #141.
opened by f0k 58
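The first change listed above, adding epsilon to the variance before taking the square root, can be illustrated with a small scalar sketch. This is not the BatchNormLayer code: it leaves out learned scale and shift parameters and running averages, and both the function name and the default epsilon are assumptions made for the example.

```python
import math


def batch_normalize(values, epsilon=1e-4):
    """Normalize a batch of scalars to zero mean and (roughly) unit variance.

    Epsilon is added to the variance before the square root, so a constant
    batch (variance 0) does not cause a division by zero.
    """
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    inv_std = 1.0 / math.sqrt(variance + epsilon)
    return [(v - mean) * inv_std for v in values]


normalized = batch_normalize([1.0, 2.0, 3.0, 4.0])
constant = batch_normalize([5.0, 5.0, 5.0])  # variance is exactly 0
```

With epsilon inside the square root, the constant batch maps to zeros instead of raising a ZeroDivisionError.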
Weight Norm Constraints

Here's an attempt at weight norm constraints as we discussed in #84. I also did a little PEP8 cleanup in update.py (mostly line lengths stuff).

Unless I totally overlooked an easy assumption, it seems like we have to handle all the different types of parameter dimensionalities differently (not quite as clean as in the default Uniform initializer). So, it makes assumptions about the meaning of the dimensions in a 2D and 4D parameter array.

Let me know what you guys think. I'll add better docs later.

opened by ebattenberg 46
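To make the idea of a weight norm constraint concrete, here is a minimal sketch for the 2D case only. It assumes that each column of the matrix holds the incoming weights of one unit, which is exactly the kind of assumption about dimensions the pull request mentions. `constrain_column_norms` is a hypothetical helper on plain Python lists, not the code from the pull request.

```python
import math


def constrain_column_norms(weights, max_norm, epsilon=1e-7):
    """Rescale columns of a 2D weight matrix whose L2 norm exceeds max_norm.

    `weights` is a list of rows. Columns already within the limit are left
    (almost) unchanged. Hypothetical helper, not Lasagne's implementation.
    """
    num_cols = len(weights[0])
    scales = []
    for col in range(num_cols):
        norm = math.sqrt(sum(row[col] ** 2 for row in weights))
        target = min(norm, max_norm)
        # epsilon guards against division by zero for all-zero columns
        scales.append(target / (norm + epsilon))
    return [[row[col] * scales[col] for col in range(num_cols)]
            for row in weights]


W = [[3.0, 0.3],
     [4.0, 0.4]]  # column norms: 5.0 and 0.5
constrained = constrain_column_norms(W, max_norm=1.0)
```

The first column is scaled down to a norm of about 1.0, while the second column, already below the limit, keeps its norm of about 0.5. A 4D convolution kernel would need a different choice of axes to sum over.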
First release

People have already started using the library, so we should make an effort to put out a first release. I made a milestone to tag issues and pull requests that need to be sorted out before we can make a release (thanks to @craffel for the suggestion).

The most important things will be sorting out our test coverage, and writing some basic documentation. I've been adding some docstrings now and then, but progress is slow and we will probably need a concerted effort to get this done in a reasonable amount of time.

What else should we take care of for the first release? Are there any other issues that need to be tagged?

opened by benanne 46
API: get_params()
As observed in #141 (but also in #110 and https://github.com/benanne/Lasagne/issues/136#issuecomment-75169847), it's not entirely clear what Layer.get_params() means and what it should mean.

The documentation of get_params() says: "Returns a list of all the Theano variables that parameterize the layer.", and further below it says "it should be overridden in a subclass that has trainable parameters". In addition, we have get_bias_params(), which is documented as: "Returns a list of all the Theano variables that are bias parameters for the layer.", and a note further below says "This is useful when specifying regularization (it is often undesirable to regularize bias parameters)."

Now when we include the use case of Batch Normalization, we actually have three overlapping classes of variables:

variables involved in the forward pass (currently get_params() according to its docstring)

variables to be updated wrt. the loss function (currently get_params() according to its usage in Lasagne)

variables to be updated wrt. the loss function that are not to be included in weight regularization (currently get_bias_params())

Before the first release of Lasagne we should re-think the get_params and get_bias_params API. It obviously does not exactly match what comes up in the use cases. Should we possibly replace this with a single get_params() method with optional keyword arguments to filter out certain things? Most Layer classes would not need to bother with the keywords or just react on the bias-related keyword, so it wouldn't increase the complexity of adding a new Layer class. Shall we think that through?
opened by f0k 37
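The proposal above, a single get_params() with optional keyword arguments for filtering, can be sketched with a toy class in which every parameter carries a set of tags. The README example already passes `trainable=True` to get_all_params; the `regularizable` tag and the `add_param` signature below are assumptions made for this illustration, and parameters are represented by their names only.

```python
class Layer:
    """Toy layer whose parameters carry boolean tags."""

    def __init__(self):
        self.params = {}  # parameter name -> set of tags

    def add_param(self, name, **tags):
        self.params[name] = {tag for tag, enabled in tags.items() if enabled}

    def get_params(self, **tags):
        """Return names of parameters matching all requested tags.

        tag=True keeps only parameters that have the tag; tag=False keeps
        only parameters that lack it; unmentioned tags are ignored.
        """
        required = {tag for tag, wanted in tags.items() if wanted}
        excluded = {tag for tag, wanted in tags.items() if not wanted}
        return [name for name, have in self.params.items()
                if required <= have and not (excluded & have)]


layer = Layer()
layer.add_param("W", trainable=True, regularizable=True)
layer.add_param("b", trainable=True, regularizable=False)
layer.add_param("running_mean", trainable=False, regularizable=False)
```

The three classes of variables from the issue then map to three calls: `layer.get_params()` for everything involved in the forward pass, `layer.get_params(trainable=True)` for what is updated with respect to the loss, and `layer.get_params(trainable=True, regularizable=False)` for updated parameters that are excluded from weight regularization.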

`Notes on AdaGrad` is not found

$ curl -L -v http://www.ark.cs.cmu.edu/cdyer/adagrad.pdf
*   Trying 128.2.42.94:80...
* Connected to www.ark.cs.cmu.edu (128.2.42.94) port 80 (#0)
> GET /cdyer/adagrad.pdf HTTP/1.1
> Host: www.ark.cs.cmu.edu
> User-Agent: curl/7.77.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 302 Found
< Date: Sat, 26 Mar 2022 02:54:35 GMT
< Server: Apache
< Location: http://www.cs.cmu.edu/~ark/cdyer/adagrad.pdf
< Content-Length: 228
< Content-Type: text/html; charset=iso-8859-1
< Set-Cookie: BIGipServer~SCS~wehost-pool-80=181011072.20480.0000; path=/; Httponly
< 
* Ignoring the response-body
* Connection #0 to host www.ark.cs.cmu.edu left intact
* Issue another request to this URL: 'http://www.cs.cmu.edu/~ark/cdyer/adagrad.pdf'
*   Trying 128.2.42.95:80...
* Connected to www.cs.cmu.edu (128.2.42.95) port 80 (#1)
> GET /~ark/cdyer/adagrad.pdf HTTP/1.1
> Host: www.cs.cmu.edu
> User-Agent: curl/7.77.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 404 Not Found
< Date: Sat, 26 Mar 2022 02:54:36 GMT
< Server: Apache/2.4.18 (Ubuntu)
< Set-Cookie: SHIBLOCATION=tilde; path=/; domain=.cs.cmu.edu
< Content-Length: 1467
< Content-Type: text/html; charset=UTF-8
< Set-Cookie: BALANCEID=balancer.web39.srv.cs.cmu.edu; path=/;
< Set-Cookie: BIGipServer~SCS~cs-userdir-pool-80=533332608.20480.0000; path=/; Httponly
< 
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-type" content="text/html;charset=UTF-8">
<title>404 - Document not found</title>
...

opened by wafuwafu13 0

Error with mock in Python 3.8.3 and 3.9

Hello,

While testing on Debian unstable, the test_helper.py has failed with an error related to Mock when applied to MergeLayer. I have been trying to fix it but I do not know the mock library well and haven't yet succeeded. Any ideas? It seems the output_shape property is triggering an unimplemented function in MergeLayer.

Output of pytest:

$ pytest-3 -k helper -x
======================================== test session starts ========================================
platform linux -- Python 3.8.3, pytest-4.6.11, py-1.8.1, pluggy-0.13.0 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /home/sinclairs/projects/debian/lasagne, inifile: setup.cfg
collected 1404 items / 1348 deselected / 56 selected                                                
  
lasagne/layers/helper.py::lasagne.layers.helper.count_params PASSED                           [  1%]
lasagne/layers/helper.py::lasagne.layers.helper.get_all_layers PASSED                         [  3%]
lasagne/layers/helper.py::lasagne.layers.helper.get_all_param_values PASSED                   [  5%]
lasagne/layers/helper.py::lasagne.layers.helper.get_all_params PASSED                         [  7%]
lasagne/layers/helper.py::lasagne.layers.helper.set_all_param_values PASSED                   [  8%]
lasagne/tests/layers/test_helper.py::TestGetAllLayers::test_stack PASSED                      [ 10%]
lasagne/tests/layers/test_helper.py::TestGetAllLayers::test_merge PASSED                      [ 12%]
lasagne/tests/layers/test_helper.py::TestGetAllLayers::test_split PASSED                      [ 14%]
lasagne/tests/layers/test_helper.py::TestGetAllLayers::test_bridge PASSED                     [ 16%]
lasagne/tests/layers/test_helper.py::TestGetOutput_InputLayer::test_get_output_without_arguments PASSED [ 17%]
lasagne/tests/layers/test_helper.py::TestGetOutput_InputLayer::test_get_output_input_is_variable PASSED [ 19%]
lasagne/tests/layers/test_helper.py::TestGetOutput_InputLayer::test_get_output_input_is_array PASSED [ 21%]
lasagne/tests/layers/test_helper.py::TestGetOutput_InputLayer::test_get_output_input_is_a_mapping PASSED [ 23%]
lasagne/tests/layers/test_helper.py::TestGetOutput_Layer::test_get_output_without_arguments PASSED [ 25%]
lasagne/tests/layers/test_helper.py::TestGetOutput_Layer::test_get_output_with_single_argument PASSED [ 26%]
lasagne/tests/layers/test_helper.py::TestGetOutput_Layer::test_get_output_input_is_a_mapping PASSED [ 28%]
lasagne/tests/layers/test_helper.py::TestGetOutput_Layer::test_get_output_input_is_a_mapping_no_key PASSED [ 30%]
lasagne/tests/layers/test_helper.py::TestGetOutput_Layer::test_get_output_input_is_a_mapping_to_array PASSED [ 32%]
lasagne/tests/layers/test_helper.py::TestGetOutput_Layer::test_get_output_input_is_a_mapping_for_layer PASSED [ 33%]
lasagne/tests/layers/test_helper.py::TestGetOutput_Layer::test_get_output_input_is_a_mapping_for_input_layer PASSED [ 35%]
lasagne/tests/layers/test_helper.py::TestGetOutput_Layer::test_get_output_with_unused_kwarg PASSED [ 37%]
lasagne/tests/layers/test_helper.py::TestGetOutput_Layer::test_get_output_with_no_unused_kwarg PASSED [ 39%]
lasagne/tests/layers/test_helper.py::TestGetOutput_Layer::test_layer_from_shape_invalid_get_output PASSED [ 41%]
lasagne/tests/layers/test_helper.py::TestGetOutput_Layer::test_layer_from_shape_valid_get_output PASSED [ 42%]
lasagne/tests/layers/test_helper.py::TestGetOutput_MergeLayer::test_get_output_without_arguments ERROR [ 44%]

============================================== ERRORS ===============================================
___________ ERROR at setup of TestGetOutput_MergeLayer.test_get_output_without_arguments ____________

self = <test_helper.TestGetOutput_MergeLayer object at 0x7f27f7479a30>
  
    @pytest.fixture
    def layers(self):
        from lasagne.layers.base import Layer, MergeLayer
        from lasagne.layers.input import InputLayer
        # create two mocks of the same attributes as an InputLayer instance
        l1 = [Mock(InputLayer((None,)), output_shape=(None,),
                   get_output_kwargs=[]),
              Mock(InputLayer((None,)), output_shape=(None,),
                   get_output_kwargs=[])]
        # create two mocks of the same attributes as a Layer instance
        l2 = [Mock(Layer(l1[0]), output_shape=(None,),
                   get_output_kwargs=[]),
              Mock(Layer(l1[1]), output_shape=(None,),
                   get_output_kwargs=[])]
        # link them to the InputLayer mocks
        l2[0].input_layer = l1[0]
        l2[1].input_layer = l1[1]
        # create a mock that has the same attributes as a MergeLayer
>       l3 = Mock(MergeLayer(l2), get_output_kwargs=['kwarg'])
  
lasagne/tests/layers/test_helper.py:306:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/lib/python3/dist-packages/mock/mock.py:1082: in __init__
    _safe_super(CallableMixin, self).__init__(
/usr/lib/python3/dist-packages/mock/mock.py:439: in __init__
    self._mock_add_spec(spec, spec_set, _spec_as_instance, _eat_self)
/usr/lib/python3/dist-packages/mock/mock.py:494: in _mock_add_spec
    if iscoroutinefunction(getattr(spec, attr, None)):
lasagne/layers/base.py:269: in output_shape
    shape = self.get_output_shape_for(self.input_shapes)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
  
self = <lasagne.layers.base.MergeLayer object at 0x7f27f7479c70>, input_shapes = [(None,), (None,)]
  
    def get_output_shape_for(self, input_shapes):
        """
        Computes the output shape of this layer, given a list of input shapes.
    
        Parameters
        ----------
        input_shape : list of tuple
            A list of tuples, with each tuple representing the shape of one of
            the inputs (in the correct order). These tuples should have as many
            elements as there are input dimensions, and the elements should be
            integers or `None`.
    
        Returns
        -------
        tuple
            A tuple representing the shape of the output of this layer. The
            tuple has as many elements as there are output dimensions, and the
            elements are all either integers or `None`.
    
        Notes
        -----
        This method must be overridden when implementing a new
        :class:`Layer` class with multiple inputs. By default it raises
        `NotImplementedError`.
        """
>       raise NotImplementedError
E       NotImplementedError
  
lasagne/layers/base.py:303: NotImplementedError
...

opened by radarsat1 3
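The traceback above shows mock's `_mock_add_spec` calling `getattr(spec, attr, None)` on the spec instance, which evaluates the `output_shape` property and ends in `NotImplementedError`. The mechanism can be reproduced without mock or Lasagne. In this minimal sketch, `MergeLayerLike` is a hypothetical toy class standing in for an un-subclassed MergeLayer.

```python
class MergeLayerLike:
    """Toy class with a property that fails on an un-subclassed instance."""

    @property
    def output_shape(self):
        return self.get_output_shape_for(None)

    def get_output_shape_for(self, input_shapes):
        raise NotImplementedError


instance = MergeLayerLike()

# getattr's default only replaces AttributeError, so the property's
# NotImplementedError still propagates when reading from an instance.
try:
    getattr(instance, "output_shape", None)
    instance_lookup = "ok"
except NotImplementedError:
    instance_lookup = "NotImplementedError"

# Reading the same name from the class returns the property object itself
# without running its getter.
class_lookup = getattr(MergeLayerLike, "output_shape", None)
```

This suggests that the failure depends on the spec being an instance rather than a class. Whether using the class as the spec, or a subclass that implements `get_output_shape_for`, is an acceptable change for the test suite is left open here.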

Center Loss as an Objective Function?

Can center loss be added as an objective function in Lasagne? It would help the network in learning highly discriminating features.

Link to a relevant paper: https://ydwen.github.io/papers/WenECCV16.pdf

opened by Divya1612 0

LocallyConnected2DLayer params not initialized correctly

The following code shows that a LocallyConnected2DLayer with W initialized using HeNormal(1.0) produces an output standard deviation that is 1/(width*height)**0.5 times that of a Conv2DLayer with the same initialization.

import numpy as np
import theano
import theano.tensor as T

import lasagne
from lasagne.layers import *

input_var = T.tensor4('inputs')
def build_network(input_var, using_local):
    network = InputLayer(shape=(None,3,64,64), input_var=input_var)
    if using_local:
        network = LocallyConnected2DLayer(
                network, num_filters=256, filter_size=(3,3), untie_biases=True, pad='same',
                nonlinearity=None,
                W=lasagne.init.HeNormal(1.0)
                )
    else:
        network = Conv2DLayer(
                network, num_filters=256, filter_size=(3,3), pad='same',
                nonlinearity=None,
                W=lasagne.init.HeNormal(1.0)
                )
    return network

local_fn = theano.function([input_var],get_output(build_network(input_var,True)).std())
conv_fn = theano.function([input_var],get_output(build_network(input_var,False)).std())

data = np.random.normal(0,1,(64,3,64,64)).astype('float32')
print(local_fn(data))
print(conv_fn(data))

output is

0.015465997 0.9949956

opened by guoxuesong 1
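The factor claimed in the issue above can be checked against the two reported numbers with a few lines of arithmetic. The values are copied from the issue; the variable names are only for illustration.

```python
import math

# Standard deviations reported in the issue.
local_std = 0.015465997
conv_std = 0.9949956

# Spatial size of the input in the issue's script.
width, height = 64, 64

observed_ratio = local_std / conv_std
claimed_ratio = 1.0 / math.sqrt(width * height)  # 1/64 = 0.015625
print("observed %.6f, claimed %.6f" % (observed_ratio, claimed_ratio))
```

The observed ratio agrees with 1/64 to within about one percent, which is consistent with the 1/(width*height)**0.5 factor stated in the issue.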

Theano discontinuation

What are your plans now that Theano is going to be discontinued?

We, the pymc3 devs, have discussed all kinds of options. One of them is taking over Theano maintenance (not pushing new features, just making sure it doesn't go stale and critical bugs get fixed). Is that anything to talk about?

opened by twiecki 5
AttributeError: 'Conv2DLayer' object has no attribute 'flip_filters'

Theano==0.8.2 Lasagne==0.2.dev1

Code: val_prediction = lasagne.layers.get_output(network, input_var, deterministic=True)

Error:
  File "../algorithms/python/env/lib/python3.6/site-packages/lasagne/layers/conv.py", line 611, in convolve
    filter_flip=self.flip_filters)
AttributeError: 'Conv2DLayer' object has no attribute 'flip_filters'

Any insight into what I might be doing wrong, or if I'm missing something in my installation?

opened by ays0110 1

Releases (v0.1)

v0.1 (Aug 13, 2015)
core contributors, in alphabetical order:

Eric Battenberg (@ebattenberg)

Sander Dieleman (@benanne)

Daniel Nouri (@dnouri)

Eben Olson (@ebenolson)

Aäron van den Oord (@avdnoord)

Colin Raffel (@craffel)

Jan Schlüter (@f0k)

Søren Kaae Sønderby (@skaae)

extra contributors, in chronological order:

Daniel Maturana (@dimatura): documentation, cuDNN layers, LRN

Jonas Degrave (@317070): get_all_param_values() fix

Jack Kelly (@JackKelly): help with recurrent layers

Gábor Takács (@takacsg84): support broadcastable parameters in lasagne.updates

Diogo Moitinho de Almeida (@diogo149): MNIST example fixes

Brian McFee (@bmcfee): MaxPool2DLayer fix

Martin Thoma (@MartinThoma): documentation

Jeffrey De Fauw (@JeffreyDF): documentation, ADAM fix

Michael Heilman (@mheilman): NonlinearityLayer, lasagne.random

Gregory Sanders (@instagibbs): documentation fix

Jon Crall (@erotemic): check for non-positive input shapes

Hendrik Weideman (@hjweide): set_all_param_values() test, MaxPool2DCCLayer fix

Kashif Rasul (@kashif): ADAM simplification

Peter de Rivaz (@peterderivaz): documentation fix



Owner

Lasagne
