Head2Toe: Utilizing Intermediate Representations for Better OOD Generalization

Overview

Code for reproducing our results in the Head2Toe paper.

Paper: https://arxiv.org/abs/2201.03529

Instructions to run experiments coming soon.

Disclaimer

This is not an officially supported Google product.

Comments
  • Could you please share your model conversion file?

    Thanks for your great work, and you're very kind to share the code. When I ran the code, I found that if I wanted to use a custom model or different pre-training parameters (e.g., an ImageNet-21k-pretrained ViT-B/16), I needed to convert the model file from JAX to TensorFlow, and that process is complicated and frustrating. Could you please share your model conversion file (especially for the ViT model), so that this process would not be so hard?

    By the way, I found there are four extra groups of features, i.e., {'cls_embedded', 'encoded_sequence', 'position_embedded_input', 'root_output_with_cls'}, while the appendix only says: "Additionally we use patched and embedded image (i.e. tokenized image input), pre-logits (final CLS-embedding) and logits." What do 'encoded_sequence', 'cls_embedded', and 'root_output_with_cls' represent? I notice all of them have 768 dimensions, but the logits should have 1000 dimensions. Thank you in advance!
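    For reference, a generic sketch of the JAX-to-TensorFlow conversion pattern being asked about (illustrative only, not the authors' conversion script; the checkpoint filename and the name-for-name copy are assumptions, and real ViT weights typically need per-layer name mapping and reshaping):

    ```python
    # Minimal sketch: load a ViT checkpoint published as a flat .npz file
    # (as in google-research/vision_transformer) and copy each array into
    # a TensorFlow variable. The checkpoint path below is hypothetical.
    import numpy as np
    import tensorflow as tf

    jax_params = np.load('ViT-B_16_imagenet21k.npz')

    tf_variables = {}
    for name in jax_params.files:
        value = jax_params[name]
        # A direct copy often works for dense kernels (both JAX and TF store
        # them as (in, out)), but attention and embedding weights usually
        # need transposing or reshaping to match the target TF model.
        tf_variables[name] = tf.Variable(value, name=name.replace('/', '_'))
    ```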

    opened by nuaajeff 4
  • error with imagenetvitB16 [Solved]

    Hello Utku, thanks for this great research. The evaluation shell script works well with imagenetr50, but when I run it with imagenetvitB16, like `python -m head2toe.evaluate --config=head2toe/configs_eval/finetune.py:imagenetvitB16 --config.dataset='data.caltech101' --config.max_num_gpus 1`, I get the following TypeError:

      File "/home/congyu/miniconda3/envs/tensorflow/lib/python3.7/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/home/congyu/miniconda3/envs/tensorflow/lib/python3.7/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/home/congyu/TextSum/head2toe/head2toe/evaluate.py", line 148, in <module>
        app.run(main)
      File "/home/congyu/miniconda3/envs/tensorflow/lib/python3.7/site-packages/absl/app.py", line 312, in run
        _run_main(main, args)
      File "/home/congyu/miniconda3/envs/tensorflow/lib/python3.7/site-packages/absl/app.py", line 258, in _run_main
        sys.exit(main(argv))
      File "/home/congyu/TextSum/head2toe/head2toe/evaluate.py", line 71, in main
        model = finetune_fs_models.FinetuneFS(config)
      File "/home/congyu/TextSum/head2toe/head2toe/models/finetune.py", line 118, in __init__
        res = self.load_backbones()
      File "/home/congyu/TextSum/head2toe/head2toe/models/finetune.py", line 145, in load_backbones
        outputs = backbone(resized_inputs)
      File "/home/congyu/miniconda3/envs/tensorflow/lib/python3.7/site-packages/keras/engine/base_layer.py", line 977, in __call__
        input_list)
      File "/home/congyu/miniconda3/envs/tensorflow/lib/python3.7/site-packages/keras/engine/base_layer.py", line 1115, in _functional_construction_call
        inputs, input_masks, args, kwargs)
      File "/home/congyu/miniconda3/envs/tensorflow/lib/python3.7/site-packages/keras/engine/base_layer.py", line 848, in _keras_tensor_symbolic_call
        return self._infer_output_signature(inputs, args, kwargs, input_masks)
      File "/home/congyu/miniconda3/envs/tensorflow/lib/python3.7/site-packages/keras/engine/base_layer.py", line 888, in _infer_output_signature
        outputs = call_fn(inputs, *args, **kwargs)
      File "/home/congyu/miniconda3/envs/tensorflow/lib/python3.7/site-packages/tensorflow/python/autograph/impl/api.py", line 695, in wrapper
        raise e.ag_error_metadata.to_exception(e)
    TypeError: in user code:
    
        /home/congyu/miniconda3/envs/tensorflow/lib/python3.7/site-packages/tensorflow_hub/keras_layer.py:237 call  *
            result = smart_cond.smart_cond(training,
        /home/congyu/miniconda3/envs/tensorflow/lib/python3.7/site-packages/tensorflow/python/eager/function.py:1707 __call__  **
            return self._call_impl(args, kwargs)
        /home/congyu/miniconda3/envs/tensorflow/lib/python3.7/site-packages/tensorflow/python/eager/function.py:1723 _call_impl
            raise structured_err
        /home/congyu/miniconda3/envs/tensorflow/lib/python3.7/site-packages/tensorflow/python/eager/function.py:1717 _call_impl
            cancellation_manager)
        /home/congyu/miniconda3/envs/tensorflow/lib/python3.7/site-packages/tensorflow/python/eager/function.py:1794 _call_with_structured_signature
            self._structured_signature_check_missing_args(args, kwargs)
        /home/congyu/miniconda3/envs/tensorflow/lib/python3.7/site-packages/tensorflow/python/eager/function.py:1815 _structured_signature_check_missing_args
            ", ".join(sorted(missing_arguments))))
    
        TypeError: signature_wrapper(*, input_1) missing required arguments: input_1```
    
    Is there something wrong in the ViT pipeline or in the shell command?
    bug 
    opened by wxf12345 3
  • Instructions to run experiments

    Hello Utku, thanks for this great research. Just wanted to know if you could publish the instructions to run the code, and possibly to train on custom datasets and models. Best regards, Aryan

    opened by arxyzan 2
  • Chosen Layers

    I'm a little bit confused about which layers are being selected in the ResNet50 Head2Toe implementation. The paper says that features are selected from both before and after the final pooling layer. Aren't the features from before the final pooling layer the same as the outputs of the final activation function in the previous Bottleneck block? Wouldn't this result in selecting the same features twice?
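    For context, here is a minimal sketch (using the stock tf.keras ResNet50 rather than the repository's backbone code) of the two feature groups in question: the pre-pool features keep the 7x7 spatial grid, while the post-pool vector averages over it.

    ```python
    # Illustration only: pre-pool vs. post-pool features in a stock ResNet50.
    import tensorflow as tf

    backbone = tf.keras.applications.ResNet50(include_top=False, pooling=None)
    pre_pool = backbone.output  # last Bottleneck-block activations
    post_pool = tf.keras.layers.GlobalAveragePooling2D()(pre_pool)
    extractor = tf.keras.Model(backbone.input, [pre_pool, post_pool])

    spatial, pooled = extractor(tf.random.uniform([1, 224, 224, 3]))
    print(spatial.shape, pooled.shape)  # (1, 7, 7, 2048) (1, 2048)
    ```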

    opened by ShivainVij 1
  • Feature Selection Method

    Hi! This is some really cool work.

    I was hoping you could clarify some parts of the research paper for me.

    How do you choose the target size? What I mean is: since the final average-pooling layer in ResNet50 has 2048 input features, does that mean that the maximum number of features you can select from each layer (after pooling to reduce dimensionality) is 2048?
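    As a point of reference, here is a minimal sketch of one way to pool an intermediate feature map down to a feature budget (an illustration under assumptions, not the repository's actual feature-selection code; the function name and the halving schedule are made up):

    ```python
    # Illustration only: shrink a (batch, H, W, C) feature map by repeated
    # 2x2 average pooling until H*W*C fits the target budget, then flatten.
    import tensorflow as tf

    def pool_to_target(feature_map, target_size=2048):
        batch, h, w, c = feature_map.shape
        while h * w * c > target_size and (h > 1 or w > 1):
            feature_map = tf.keras.layers.AveragePooling2D(
                pool_size=2, strides=2, padding='same')(feature_map)
            _, h, w, c = feature_map.shape
        return tf.reshape(feature_map, [batch, -1])

    x = tf.random.uniform([4, 28, 28, 512])
    print(pool_to_target(x).shape)  # (4, 2048)
    ```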

    opened by ShivainVij 1
  • vtab data preprocess with input range (-1, 1)

    Hi,

    Thanks for sharing the code. I wonder if there is a reason for setting the input range to (-1, 1) (https://github.com/google-research/head2toe/blob/main/head2toe/input_pipeline.py#L152) instead of (0, 1) as in VTAB (https://github.com/google-research/task_adaptation/blob/master/scripts/run_all_tasks.sh#L57)?
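    For concreteness, a small sketch of the two rescaling conventions being compared (illustrative only, not the repository's pipeline code):

    ```python
    # Illustration only: two common ways to rescale uint8 image pixels.
    import tensorflow as tf

    def to_unit_range(image):
        """Maps uint8 pixels to [0, 1] (the VTAB convention)."""
        return tf.cast(image, tf.float32) / 255.0

    def to_symmetric_range(image):
        """Maps uint8 pixels to [-1, 1] (the convention asked about here)."""
        return tf.cast(image, tf.float32) / 127.5 - 1.0
    ```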

    opened by KMnP 1
  • Fix paper link

    Without the protocol, the link is treated as a subdirectory instead. Edit: CLA signed, but I cannot re-trigger the check; for such a trivial fix there is no need to bother, please edit it directly.

    opened by m3at 1
  • Fixed an error when running run.sh

    ```
    ./run.sh
    + pip install -r head2toe/requirements.txt
    ERROR: Could not open requirements file: [Errno 2] No such file or directory: 'head2toe/requirements.txt'
    ```
    
    opened by GitHub30 1
Owner

Google Research