A PyTorch implementation of Learning to learn by gradient descent by gradient descent

Overview
Comments
  • Wrong log_softmax dimension in model

    The following line computes the log_softmax over the batch dimension (default: dim=0): https://github.com/ikostrikov/pytorch-meta-optimizer/blob/0154d4d4fc856163885a62bac06174311aa58caf/model.py#L19

    This should use dim=-1 so that, in combination with F.nll_loss, it produces the cross-entropy loss.
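
    For reference, a minimal sketch of the proposed fix (logits and target are assumed names, not the repository's exact code):

    # Sketch of the fix for model.py, line 19
    import torch.nn.functional as F

    log_probs = F.log_softmax(logits, dim=-1)   # normalise over the class dimension, not the batch dimension
    loss = F.nll_loss(log_probs, target)        # together this is the cross-entropy loss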

    opened by timmeinhardt 1
  • Why do you input grad & params into metaoptimizer?

    In lines 81 & 135 of meta_optimizer.py, why do you input grads & params (and the loss as well in line 135) into the meta-optimizer?

    Did you try with & without, and did one version work better? In 1606.04474, they seem to input only the grads into the meta-optimizer.
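
    For context, a rough sketch of the two alternatives being compared; the first line is the paper's choice, the second mirrors line 135 of meta_optimizer.py (tensor names and shapes are assumptions):

    # 1606.04474: the learned optimizer only sees the (preprocessed) gradients
    lstm_input = preprocess_gradients(flat_grads)

    # this repository: gradients, current parameter values and the loss are concatenated
    lstm_input = torch.cat((preprocess_gradients(flat_grads), flat_params.data, loss), 1)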

    opened by ethancaballero 1
  • FastMetaLearner equal to Optimization as a model for few-shot learning

    Hi, your implementation of the whole model is really cool! I learned a lot from reading your code.

    And I want to know whether the FastMetaLearner class you implemented is equivalent to the method in the paper Optimization as a model for few-shot learning?

    opened by yuhui-zh15 0
  • pytorch 0.3 correction

    The current version of the code produces an NxN matrix of weights instead of an N-vector of weights (N being the number of weights of the model to optimize). I modified the code a bit to produce the correct vector.
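
    A hypothetical illustration of how such a shape bug can arise through broadcasting (a generic PyTorch example, not the actual diff):

    import torch

    flat_params = torch.randn(5)           # shape (N,)
    update = torch.randn(5, 1)             # shape (N, 1), e.g. the meta-optimizer output
    bad = flat_params - update             # broadcasts to (N, N): an NxN matrix of weights
    good = flat_params - update.view(-1)   # shape (N,): the intended weight vector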

    opened by Forbu 0
  • Might work better with a different meta-optimizer - enhancement

    opened by AjayTalati 0
  • Which article does Fast-Meta-Optimizer come from?

    Thanks for your code, I learned a lot from it. But I can't find the source of the Fast-Meta-Optimizer. It looks like Meta-SGD, but it is not entirely the same. Please tell me the source of this method, thanks!

    opened by ghost 1
  • Visualize the best LSTM loss

    Hi,

    I am new to meta-learning and trying to understand your code. How do I get a figure of the LSTM loss versus steps, like the one the authors show in the paper, if I use matplotlib?

    Should I take the best final_loss as the y-axis and optimizer_steps as the x-axis?
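
    A minimal matplotlib sketch, assuming you record the final_loss value once per optimizer step (variable names are illustrative):

    import matplotlib.pyplot as plt

    losses = []                          # inside the training loop: losses.append(float(final_loss))
    plt.plot(range(len(losses)), losses)
    plt.xlabel('optimizer steps')
    plt.ylabel('final loss')
    plt.yscale('log')                    # a log scale usually makes the comparison easier to read
    plt.show()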

    opened by NaV1YChen 0
  • LSTM weights not optimised

    The LayerNormLSTMCell modules initialised in the MetaOptimizer class are not properly registered as parameters of the MetaOptimizer model. Appending them to the self.lstms list: https://github.com/ikostrikov/pytorch-meta-optimizer/blob/0154d4d4fc856163885a62bac06174311aa58caf/meta_optimizer.py#L27

    will not add their trainable parameters to the model parameter list in:

    https://github.com/ikostrikov/pytorch-meta-optimizer/blob/0154d4d4fc856163885a62bac06174311aa58caf/main.py#L63

    If I am not mistaken, the current version will not train the LSTM weights at all. In general, I would suggest restructuring the initialisation and the MetaOptimizer.forward method, but as a quick fix one could replace the entire self.lstms initialization block with this:

    self.lstms = nn.Sequential(*[LayerNormLSTMCell(hidden_size, hidden_size)
                                 for _ in range(num_layers)])
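
    Alternatively, nn.ModuleList is the container typically used when the cells are called manually in the forward pass; a sketch under the same assumptions:

    # nn.ModuleList also registers the cells' parameters with the parent module
    self.lstms = nn.ModuleList([LayerNormLSTMCell(hidden_size, hidden_size)
                                for _ in range(num_layers)])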
    
    opened by timmeinhardt 3
  • the meaning of preprocess_gradients function

    Hi, I've learned so much from your code, but I still have a small question. What does the function preprocess_gradients in utils.py do? Could you please explain it?
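
    For context, the paper (1606.04474, Appendix A) preprocesses each gradient coordinate into a log-magnitude channel and a sign channel so that gradients of very different scales become well-behaved LSTM inputs. A sketch of that scheme (the paper's formulation, not necessarily the exact code in utils.py):

    import torch

    def preprocess_gradients_sketch(x, p=10.0):
        # x: gradients of shape (N, 1); returns an (N, 2) tensor
        threshold = torch.exp(torch.tensor(-p))
        large = x.abs() >= threshold
        x1 = torch.where(large, x.abs().log() / p, torch.full_like(x, -1.0))  # log-magnitude channel
        x2 = torch.where(large, x.sign(), torch.exp(torch.tensor(p)) * x)     # sign / rescaled channel
        return torch.cat((x1, x2), 1)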

    opened by littleso-so 0
  • Error when running main.py

    Hi, I am using pytorch 0.2, and when I run the script it generates the following errors:

    Traceback (most recent call last):
      File "main.py", line 142, in <module>
        main()
      File "main.py", line 116, in main
        meta_model = meta_optimizer.meta_update(model, loss.data)
      File "/home/rvl224/pytorch-meta-optimizer/meta_optimizer.py", line 135, in meta_update
        inputs = Variable(torch.cat((preprocess_gradients(flat_grads), flat_params.data, loss), 1))
      File "/home/rvl224/pytorch-meta-optimizer/utils.py", line 11, in preprocess_gradients
        return torch.cat((x1, x2), 1)
    RuntimeError: dim out of range - got 1 but the tensor is only 1D
    
    

    Is it because I am using a different version of pytorch?
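
    The traceback suggests the tensors reaching preprocess_gradients are 1-D in that PyTorch version, so concatenating along dim=1 fails. A hypothetical workaround (an assumption, not a confirmed fix) is to make them 2-D before the torch.cat call:

    flat_grads = flat_grads.view(-1, 1)      # (N,)  -> (N, 1)
    flat_params = flat_params.view(-1, 1)    # likewise for the flattened parameters
    # the loss tensor may need a similar reshape/expand to (N, 1) as well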

    opened by ghost 5
  • Out of memory when the meta optimizer updates parameters

    Hello, I find your code very helpful, but too much memory is consumed when the meta optimizer updates the parameters of the model. On my computer, it always raises an 'out of memory' error when executing line 140 of meta_optimizer.py.

    I think it could consume less memory if the MetaModel class held a flat version of the parameters instead of wrapping a model. In this way, the MetaModel would reshape the parameters and compute the result through nn.functional.conv/linear, so that the meta optimizer could use this flat version of the parameters directly, without allocating extra memory for flattened parameters; a sketch of this idea follows below.
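
    A sketch of that idea, assuming a simple two-layer MLP (layer sizes and names are illustrative, not the repository's code):

    import torch.nn.functional as F
    from math import prod

    class FlatMetaModel:
        # Holds one flat parameter tensor and reshapes slices of it on the fly.
        def __init__(self, flat_params, shapes=((32, 784), (32,), (10, 32), (10,))):
            self.flat_params = flat_params   # single (N,) tensor, shared with the meta optimizer
            self.shapes = shapes

        def forward(self, x):
            views, offset = [], 0
            for shape in self.shapes:
                n = prod(shape)
                views.append(self.flat_params[offset:offset + n].view(shape))
                offset += n
            w1, b1, w2, b2 = views
            return F.linear(F.relu(F.linear(x, w1, b1)), w2, b2)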

    opened by zengxianyu 3
Owner
Ilya Kostrikov (Post doc)