First of all, thank you for this library!
Description of the bug
When training with Distributed Data Parallel (DDP), the gradients are not correctly synchronized across devices when using RiemannianSGD (or RiemannianAdam). Replacing it with a standard torch.optim.SGD works as expected. Note that with DDP the gradients are synchronized during .backward()
(see this link).
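A minimal sketch of a helper one could use to verify whether the gradients actually agree across ranks after .backward() (this is a hypothetical helper, not part of geoopt or of the script below; it assumes the process group is already initialized):

# Hypothetical check: compare each local gradient against the cross-rank
# average obtained with an explicit all_reduce.
import torch
import torch.distributed as dist

def grads_in_sync(model, atol=1e-7):
    for p in model.parameters():
        if p.grad is None:
            continue
        avg = p.grad.detach().clone()
        dist.all_reduce(avg, op=dist.ReduceOp.SUM)
        avg /= dist.get_world_size()
        if not torch.allclose(p.grad, avg, atol=atol):
            return False
    return True

With torch.optim.SGD this returns True on every rank after each backward pass; with RiemannianSGD it does not (see the outputs below).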
To Reproduce
A simple training script on ImageNet:
import os

import geoopt
import torch
import torch.distributed
import torch.multiprocessing as mp
import torchvision
import torchvision.models as models
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils import data
from torchvision import transforms


def process_ddp(master_port, local_rank, world_size):
    os.environ['MASTER_ADDR'] = 'localhost'
    os.environ['MASTER_PORT'] = str(master_port)
    torch.cuda.set_device(local_rank)
    device = torch.device("cuda", local_rank)
    torch.distributed.init_process_group("nccl", rank=local_rank, world_size=world_size, init_method='env://')
    assert world_size == torch.distributed.get_world_size()
    return device


def main(local_rank, world_size):
    path_dataset = '/path/to/ImageNet'  # Any other dataset should result in a similar behavior
    master_port = 9999
    device = process_ddp(master_port, local_rank, world_size)

    model = models.resnet18()
    model = model.to(device)
    # optimizer = geoopt.optim.RiemannianSGD(model.parameters(), lr=0.1, stabilize=10)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    # Data parallelization
    model = DDP(model, device_ids=[local_rank], output_device=local_rank)

    # Prepare dataset
    transform = transforms.Compose([
        transforms.CenterCrop(size=256),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    dataset = torchvision.datasets.ImageNet(split='train', root=path_dataset, transform=transform)
    sampler = torch.utils.data.distributed.DistributedSampler(dataset, shuffle=True)
    data_loader = torch.utils.data.DataLoader(dataset, batch_size=4, sampler=sampler, shuffle=False, num_workers=8)

    # Train part of the first epoch
    model.train()
    for idx, (images, labels) in enumerate(data_loader):
        if idx >= 10:
            break
        images = images.to(device)
        labels = labels.to(device)
        with torch.set_grad_enabled(True):
            features = model(images)
            loss = torch.nn.functional.cross_entropy(features, labels)
        loss.backward()
        print(f'grad iteration {idx} on gpu {device}: {model.module.conv1.weight.grad.mean()}', flush=True)
        optimizer.step()
        optimizer.zero_grad()
        print(f'weight iteration {idx} on gpu {device}: {model.module.conv1.weight.mean()}', flush=True)

    # cleanup
    torch.distributed.destroy_process_group()


if __name__ == '__main__':
    world_size_main = torch.cuda.device_count()
    mp.spawn(main,
             args=(world_size_main,),
             nprocs=world_size_main,
             join=True)
To run it on two GPUs, use:
CUDA_VISIBLE_DEVICES=0,1 python run.py
Expected behavior
The expected behavior is the one obtained when the line optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
is uncommented and the line optimizer = geoopt.optim.RiemannianSGD(model.parameters(), lr=0.1, stabilize=10)
is commented out. In that case, the output is:
grad iteration 0 on gpu cuda:1: 0.011769304051995277
grad iteration 0 on gpu cuda:0: 0.011769304051995277
weight iteration 0 on gpu cuda:1: -0.001249525579623878
weight iteration 0 on gpu cuda:0: -0.001249525579623878
grad iteration 1 on gpu cuda:0: -0.015764284878969193
grad iteration 1 on gpu cuda:1: -0.015764284878969193
weight iteration 1 on gpu cuda:1: 0.0003269027511123568
weight iteration 1 on gpu cuda:0: 0.0003269027511123568
grad iteration 2 on gpu cuda:1: -0.006310341879725456
grad iteration 2 on gpu cuda:0: -0.006310341879725456
weight iteration 2 on gpu cuda:1: 0.000957937038037926
weight iteration 2 on gpu cuda:0: 0.000957937038037926
grad iteration 3 on gpu cuda:0: 0.0021547293290495872
grad iteration 3 on gpu cuda:1: 0.0021547293290495872
weight iteration 3 on gpu cuda:1: 0.000742464151699096
weight iteration 3 on gpu cuda:0: 0.000742464151699096
grad iteration 4 on gpu cuda:1: -0.002606849418953061
grad iteration 4 on gpu cuda:0: -0.002606849418953061
weight iteration 4 on gpu cuda:1: 0.001003148965537548
weight iteration 4 on gpu cuda:0: 0.001003148965537548
grad iteration 5 on gpu cuda:1: 0.00043087091762572527
grad iteration 5 on gpu cuda:0: 0.00043087091762572527
weight iteration 5 on gpu cuda:1: 0.0009600619086995721
weight iteration 5 on gpu cuda:0: 0.0009600619086995721
grad iteration 6 on gpu cuda:0: 0.00014396056940313429
grad iteration 6 on gpu cuda:1: 0.00014396056940313429
weight iteration 6 on gpu cuda:1: 0.0009456658735871315
weight iteration 6 on gpu cuda:0: 0.0009456658735871315
grad iteration 7 on gpu cuda:1: -0.002603260101750493
grad iteration 7 on gpu cuda:0: -0.002603260101750493
weight iteration 7 on gpu cuda:1: 0.001205991953611374
weight iteration 7 on gpu cuda:0: 0.001205991953611374
grad iteration 8 on gpu cuda:0: 0.000458348571555689
grad iteration 8 on gpu cuda:1: 0.000458348571555689
weight iteration 8 on gpu cuda:0: 0.0011601571459323168
weight iteration 8 on gpu cuda:1: 0.0011601571459323168
grad iteration 9 on gpu cuda:1: -0.0004215179360471666
grad iteration 9 on gpu cuda:0: -0.0004215179360471666
weight iteration 9 on gpu cuda:1: 0.0012023089220747352
weight iteration 9 on gpu cuda:0: 0.0012023089220747352
The gradients on the two GPUs are correctly synchronized. However, when using RiemannianSGD, the output is:
grad iteration 0 on gpu cuda:1: 0.0035285688936710358
grad iteration 0 on gpu cuda:0: 0.0035285688936710358
weight iteration 0 on gpu cuda:0: -7.928906597953755e-06
weight iteration 0 on gpu cuda:1: -7.928906597953755e-06
grad iteration 1 on gpu cuda:0: -0.04444637894630432
grad iteration 1 on gpu cuda:1: 0.002550020581111312
weight iteration 1 on gpu cuda:0: 0.00018905977776739746
weight iteration 1 on gpu cuda:1: -2.1470928913913667e-05
grad iteration 2 on gpu cuda:0: 0.009863540530204773
grad iteration 2 on gpu cuda:1: 0.0026304360944777727
weight iteration 2 on gpu cuda:0: 0.00013691956701222807
weight iteration 2 on gpu cuda:1: -3.7374120438471437e-05
grad iteration 3 on gpu cuda:0: -0.0161017756909132
grad iteration 3 on gpu cuda:1: -0.0023103044368326664
weight iteration 3 on gpu cuda:0: 0.0002405263076070696
weight iteration 3 on gpu cuda:1: -2.4383840354857966e-05
grad iteration 4 on gpu cuda:0: 0.010763526894152164
grad iteration 4 on gpu cuda:1: -0.017034146934747696
weight iteration 4 on gpu cuda:0: 0.00017218326684087515
weight iteration 4 on gpu cuda:1: 7.665574958082289e-05
grad iteration 5 on gpu cuda:0: 0.008465449325740337
weight iteration 5 on gpu cuda:0: 0.00012061965389875695
grad iteration 5 on gpu cuda:1: 0.0011690922547131777
weight iteration 5 on gpu cuda:1: 6.942617619642988e-05
grad iteration 6 on gpu cuda:0: 0.0013559082290157676
weight iteration 6 on gpu cuda:0: 0.00011242596519878134
grad iteration 6 on gpu cuda:1: 0.0008932517375797033
weight iteration 6 on gpu cuda:1: 6.43157254671678e-05
grad iteration 7 on gpu cuda:0: 0.02651313878595829
weight iteration 7 on gpu cuda:0: -1.233588955074083e-05
grad iteration 7 on gpu cuda:1: -0.007853103801608086
weight iteration 7 on gpu cuda:1: 0.00010782096069306135
grad iteration 8 on gpu cuda:0: 0.009321866557002068
weight iteration 8 on gpu cuda:0: -7.130965968826786e-05
grad iteration 8 on gpu cuda:1: -0.0039948648773133755
grad iteration 9 on gpu cuda:0: -0.0119229881092906
weight iteration 8 on gpu cuda:1: 0.00013168319128453732
weight iteration 9 on gpu cuda:0: -1.202533780997328e-06
grad iteration 9 on gpu cuda:1: 0.002446404891088605
weight iteration 9 on gpu cuda:1: 0.00011651107342913747
There is some problem with the gradient synchronization: from iteration 1 onwards the gradients on cuda:0 and cuda:1 differ, which causes the weights on the two devices to diverge.
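As a possible temporary workaround (just a sketch under the assumption that nothing else touches the gradients, not geoopt's intended fix), the gradients can be re-averaged across ranks manually after loss.backward() and before optimizer.step():

# Hypothetical workaround sketch: force the gradients back into sync before the
# Riemannian update. Assumes torch.distributed is already initialized.
import torch.distributed as dist

def average_gradients(model):
    world_size = dist.get_world_size()
    for p in model.parameters():
        if p.grad is not None:
            dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
            p.grad.div_(world_size)

Calling average_gradients(model) right after loss.backward() in the loop above keeps the two replicas identical, but of course it duplicates the reduction that DDP is supposed to do during the backward pass.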
Library version information:
- python -c 'import torch;print("torch:", torch.version.__version__, end=" ");print("cuda:", torch.version.cuda)': torch: 1.8.1 cuda: 11.1
- the way you installed geoopt (github, pip): pip
- OS: Ubuntu 18.04.5 LTS
EDIT: I simplified the code a bit by removing mixed precision.