PASSL includes contrastive-learning-based self-supervised image algorithms such as SimCLR, MoCo, BYOL, and CLIP, as well as vision Transformer models such as Vision Transformer, Swin Transformer, BEiT, CvT, T2T-ViT, and MLP-Mixer.

Overview

PASSL

Introduction

PASSL is a vision library built on PaddlePaddle for state-of-the-art self-supervised learning research. PASSL aims to accelerate the self-supervised learning research cycle: from designing a new self-supervised task to evaluating the learned representations.

  • Reproducible implementations of SOTA in self-supervision: SimCLR, MoCo (v1), MoCo (v2), MoCo-BYOL, and CLIP are implemented, with BYOL coming soon. Supervised training is also supported.
  • Modular: easy to build new tasks and reuse existing components from other tasks (Trainer, models and heads, data transforms, etc.); see the sketch below.
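
As a rough illustration of this modular design, components can be registered by name and assembled from a config. The following is a hypothetical sketch of the registry pattern, not PASSL's verified API; `MODELS`, `register`, and `build_from_config` are illustrative names:

    # Hypothetical registry-based composition (illustrative, not PASSL's code)
    MODELS = {}

    def register(name):
        def deco(cls):
            MODELS[name] = cls
            return cls
        return deco

    @register("MoCo")
    class MoCo:
        def __init__(self, backbone, head):
            self.backbone, self.head = backbone, head

    def build_from_config(cfg):
        # e.g. cfg = {"name": "MoCo", "backbone": backbone, "head": head}
        cls = MODELS[cfg.pop("name")]
        return cls(**cfg)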

Installation

Implemented Models

Benchmark: Linear Image Classification on ImageNet-1K

| Model | Epochs | Official results | PASSL results | Backbone | Checkpoint |
| --- | --- | --- | --- | --- | --- |
| MoCo | 200 | 60.6 | 60.64 | ResNet-50 | download |
| SimCLR | 100 | 64.5 | 65.3 | ResNet-50 | download |
| MoCo v2 | 200 | 67.7 | 67.72 | ResNet-50 | download |
| MoCo-BYOL | 300 | 71.56 | 72.10 | ResNet-50 | download |
| BYOL | 300 | 72.50 | 71.62 | ResNet-50 | download |
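
In this benchmark the self-supervised backbone is frozen and only a linear classifier is trained on ImageNet-1K. Below is a minimal PaddlePaddle sketch of the protocol (illustrative only; PASSL drives this through its Trainer and configs, and loading the pretrained weights is elided):

    import paddle
    import paddle.nn as nn

    # Backbone pretrained with self-supervision (weight loading elided)
    model = paddle.vision.models.resnet50()
    model.fc = nn.Linear(2048, 1000)  # fresh linear probe: 2048-d features -> 1000 classes

    # Freeze everything except the probe (BatchNorm statistics handling omitted for brevity)
    for name, param in model.named_parameters():
        param.stop_gradient = not name.startswith("fc")

    optimizer = paddle.optimizer.Momentum(
        learning_rate=0.1, momentum=0.9, parameters=model.fc.parameters())

    def train_step(images, labels):
        logits = model(images)
        loss = nn.functional.cross_entropy(logits, labels)
        loss.backward()
        optimizer.step()
        optimizer.clear_grad()
        return loss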

Getting Started

Please see GETTING_STARTED.md for the basic usage of PASSL.

Tutorials

Comments
  • MLP-Mixer: An all-MLP Architecture for Vision

    Are the Top-1 accuracies of the two models in the readme swapped? The larger model's accuracy is lower than the smaller model's.

    | Arch | Weight | Top-1 Acc | Top-5 Acc | Crop ratio | # Params |
    | --- | --- | --- | --- | --- | --- |
    | mlp_mixer_b16_224 | pretrain 1k | 76.60 | 92.23 | 0.875 | 60.0M |
    | mlp_mixer_l16_224 | pretrain 1k | 72.06 | 87.67 | 0.875 | 208.2M |

    opened by gaorui999 3
  • I'm closely following self-supervised progress in image classification

    What is the current state of self-supervised image classification? For example, what accuracy can be reached on a typical binary task like cat-vs-dog, and on ImageNet-1K? Which self-supervised image-classification algorithms or models does PASSL include? Could you give an example so I know how to use them? So far I see only one issue on PASSL and no documentation covering this. Could we chat over WeChat or QQ? I care a lot about self-supervised image classification. One more question: with a self-supervised image-classification model, if I feed in a pile of images, will the model group them into classes after it runs? Do I need to supply the number of classes? In short, I want to understand the end-to-end workflow for self-supervised image classification. Now that the project is at 1.0, it should be usable. And if a model groups the images into N classes, how can I judge whether the grouping is correct? Is there an algorithm for that? I've asked a lot of questions; please answer each one. Thanks!

    opened by yuwoyizhan 2
  • Unintended behavior in clip_logit_scale

    https://github.com/PaddlePaddle/PASSL/blob/83c49e6a5ba3444cee7f054122559d7759152764/passl/modeling/backbones/clip.py#L317

    See this issue for reference: https://github.com/PaddlePaddle/Paddle/issues/43710

    Suggested approach (with non-public API)

    # ln(100) ≈ 4.6: bound the learned temperature so exp(logit_scale) stays within [1/100, 100]
    logit_scale_buffer = self.logit_scale.clip(-4.6, 4.6)
    # clip() returns a new tensor; share its storage back so the parameter itself is clamped
    logit_scale_buffer._share_buffer_to(self.logit_scale)
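
    For comparison, a rough public-API alternative is to clamp the parameter out-of-graph after each optimizer step. This is only a sketch under the assumption that `model.logit_scale` refers to the same parameter; `set_value` copies the data into the parameter in place:

    import paddle

    with paddle.no_grad():
        # ln(100) ≈ 4.6 keeps exp(logit_scale) in [1/100, 100]
        clipped = paddle.clip(model.logit_scale, min=-4.6, max=4.6)
        model.logit_scale.set_value(clipped)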
    
    opened by minogame 1
  • Suggestions

    1. Much of PASSL's text, including the quick-start and other docs, is in English; please provide Chinese documentation.
    2. I'd like to know how far research on self-supervised image classification has come, e.g. what accuracy is reachable on a cat-vs-dog binary task and on ImageNet. When using PASSL for image classification, do I need to supply the total number of classes?
    3. Could we chat over QQ or WeChat? I have a few questions. Thanks! QQ: 1226194560 WeChat: 18820785964

    opened by yuwoyizhan 1
  • fix bug of mixup for DeiT

    DeiT-B/16 trained on ImageNet-1K:

    [01/21 02:54:46] passl.engine.trainer INFO: Validate Epoch [290] acc1 (81.336), acc5 (95.544)
    [01/21 03:02:31] passl.engine.trainer INFO: Validate Epoch [291] acc1 (81.328), acc5 (95.580)
    [01/21 03:10:20] passl.engine.trainer INFO: Validate Epoch [292] acc1 (81.390), acc5 (95.608)
    [01/21 03:18:10] passl.engine.trainer INFO: Validate Epoch [293] acc1 (81.484), acc5 (95.636)
    [01/21 03:26:00] passl.engine.trainer INFO: Validate Epoch [294] acc1 (81.452), acc5 (95.600)
    [01/21 03:33:52] passl.engine.trainer INFO: Validate Epoch [295] acc1 (81.354), acc5 (95.528)
    [01/21 03:41:38] passl.engine.trainer INFO: Validate Epoch [296] acc1 (81.338), acc5 (95.562)
    [01/21 03:49:25] passl.engine.trainer INFO: Validate Epoch [297] acc1 (81.344), acc5 (95.542)
    [01/21 03:57:15] passl.engine.trainer INFO: Validate Epoch [298] acc1 (81.476), acc5 (95.550)
    [01/21 04:05:03] passl.engine.trainer INFO: Validate Epoch [299] acc1 (81.476), acc5 (95.572)
    [01/21 04:12:51] passl.engine.trainer INFO: Validate Epoch [300] acc1 (81.386), acc5 (95.536)
    
    opened by GuoxiaWang 1
  • Does BYOL pre-training use gt_label?

    • The BYOL config sets num_classes=1000: https://github.com/PaddlePaddle/PASSL/blob/9d7a9fd4af41772e29120553dddab1c162e4cb70/configs/byol/byol_r50_IM.yaml#L34
    • The model defines self.classifier = nn.Linear(embedding_dim, num_classes), and forward passes classif_out together with label to the head

    https://github.com/PaddlePaddle/PASSL/blob/9d7a9fd4af41772e29120553dddab1c162e4cb70/passl/modeling/architectures/BYOL.py#L263

    • The L2 head adds the contrastive loss and the supervised CE loss together and returns the sum

    https://github.com/PaddlePaddle/PASSL/blob/9d7a9fd4af41772e29120553dddab1c162e4cb70/passl/modeling/heads/l2_head.py#L43
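
    To make the concern concrete, here is a paraphrase of the reported logic (illustrative, not a verbatim copy from the repo; function and argument names are made up):

    import paddle.nn.functional as F

    def l2_head_loss(online_pred, target_proj, classif_out, gt_label):
        # BYOL regression term: negative cosine similarity between the online
        # prediction and the target projection, both shaped (batch, dim)
        byol_loss = 2 - 2 * F.cosine_similarity(online_pred, target_proj, axis=1).mean()
        # Supervised term flagged in this issue: cross-entropy on the extra
        # classifier output against the ground-truth label
        ce_loss = F.cross_entropy(classif_out, gt_label)
        return byol_loss + ce_loss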

    opened by youqingxiaozhua 0
  • [PaddlePaddle Paper Reproduction Challenge (6th edition)] (85) Emerging Properties in Self-Supervised Vision Transformers

    PR types

    New features

    PR changes

    APIs

    Describe

    • Task: https://github.com/PaddlePaddle/Paddle/issues/41482
    • Add passl.model.architectures.dino

    Performance

    | Model | Official | PASSL |
    | --- | --- | --- |
    | DINO | 74.0 | 73.6 |

    • [x] Pre-training and linear-probe code
    • [ ] Pre-training and linear-probe weights
    • [ ] Documentation
    • [ ] TIPC
    opened by fuqianya 0
Releases (v1.0.0)
  • v1.0.0 (Feb 24, 2022)

    • Added the XCiT vision Transformer; the training metrics of the distilled xcit_nano_12_p8_224 model are aligned with the reference. Thanks to @BrilliantYuKaimin for the high-quality contribution 🎉 🎉 🎉

    PASSL is PaddlePaddle's core library for self-supervised learning. It provides a large set of high-accuracy self-supervised vision models and vision Transformer models, supports distributed training of very large vision models, aims to improve the modeling efficiency of PaddlePaddle developers in the self-supervised domain, and offers best practices for very large vision models based on PaddlePaddle 2.2.

    Source code(tar.gz)
    Source code(zip)
Related Projects

This is an official implementation of CvT: Introducing Convolutions to Vision Transformers.
Microsoft 408 Dec 30, 2022

This is an official implementation of CvT: Introducing Convolutions to Vision Transformers.
Bin Xiao 175 Jan 8, 2023

A Simple Framework for CV Pre-training Models (SOCO, VirTex, BEiT)
Sense-GVT 14 Jul 7, 2022

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" on Object Detection and Instance Segmentation.
Swin Transformer 1.4k Dec 30, 2022

Unofficial TensorFlow implementation of "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" (https://arxiv.org/abs/2103.14030)
null 52 Dec 29, 2022

An essential implementation of BYOL in PyTorch + PyTorch Lightning
Enrico Fini 48 Sep 27, 2022

BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation
NTT Communication Science Laboratories 160 Jan 4, 2023

Custom PyTorch implementation of MoCo v3
null 39 Nov 14, 2022

SelfAugment extends MoCo to include automatic unsupervised augmentation selection, adds the ability to pretrain on several new datasets, and includes a wandb integration.
Colorado Reed 24 Oct 26, 2022

PyTorch implementation of MoCo v3 for self-supervised ResNet and ViT.
Facebook Research 887 Jan 8, 2023

PyTorch implementation of "Supervised Contrastive Learning" (and SimCLR incidentally)
Yonglong Tian 2.2k Jan 8, 2023

PyTorch implementation of SimCLR: A Simple Framework for Contrastive Learning of Visual Representations
Thalles Silva 1.7k Dec 28, 2022

Unofficial PyTorch implementation of SimCLR by Google Brain
Rishabh Anand 2 Oct 13, 2021

Implementation of the Swin Transformer in PyTorch.
null 597 Jan 3, 2023

TensorFlow implementation of the Swin Transformer model.
null 167 Jan 8, 2023

The code for the work "Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation" (https://arxiv.org/abs/2105.05537)
null 869 Jan 7, 2023

Code of PVTv2 is released! PVTv2 largely improves PVTv1 and works better than Swin Transformer with ImageNet-1K pre-training.
null 1.3k Jan 4, 2023

SwinIR: Image Restoration Using Swin Transformer
Jingyun Liang 2.4k Jan 8, 2023

SwinIR (Image Restoration Using Swin Transformer) function for VapourSynth, based on https://github.com/JingyunLiang/SwinIR
Holy Wu 11 Jun 19, 2022