[SIGMETRICS 2022] One Proxy Device Is Enough for Hardware-Aware Neural Architecture Search

Last update: Dec 16, 2022

Related tags

Overview

One Proxy Device Is Enough for Hardware-Aware Neural Architecture Search

paper | website

Bingqian Lu, Jianyi Yang, Weiwen Jiang, Yiyu Shi, Shaolei Ren, Proceedings of the ACM on Measurement and Analysis of Computing Systems, vol. 5, no. 3, Dec, 2021. (SIGMETRICS 2022)

@article{
  luOneProxy2021,
  title={One Proxy Device Is Enough for Hardware-Aware Neural Architecture Search},
  author={Bingqian Lu and Jianyi Yang and Weiwen Jiang and Yiyu Shi and Shaolei Ren},
  journal = {Proceedings of the ACM on Measurement and Analysis of Computing Systems}, 
  month = Dec,
  year = 2021,
  volume = {5}, 
  number = {3},
  articleno = {34}, 
  numpages = {35},
}

In a Nutshell

Given N target devices, our OneProxy approach can keep the total neural architecture search cost at O(1).

Hardware-aware NAS Dilemma

CNNs are used in numerous real-world applications such as vision-based autonomous driving and video content analysis. To run CNN inference on various target devices, hardware-aware neural architecture search (NAS) is crucial. A key requirement of efficient hardware-aware NAS is the fast evaluation of inference latencies in order to rank different architectures. While building a latency predictor for each target device has been commonly used in state of the art, this is a very time-consuming process, lacking scalability in the presence of extremely diverse devices.

Overview of SOTA NAS algorithms

Left: NAS without a supernet. Right: One-shot NAS with a supernet.

Cost Comparison of Hardware-aware NAS Algorithms for 𝑛 Target Devices.

Our approach: exploiting latency monotonicity

We address the scalability challenge by exploiting latency monotonicity — the architecture latency rankings on different devices are often correlated. When strong latency monotonicity exists, we can re-use architectures searched for one proxy device on new target devices, without losing optimality.

Using SRCC to measure latency monotonicity

To quantify the degree of latency monotonicity, we use the metric of Spearman’s Rank Correlation Coefficient (SRCC), which lies between -1 and 1 and assesses statistical dependence between the rankings of two variables using a monotonic function.

SRCC of 10k sampled models latencies in MobileNet-V2 space on different pairs of mobile and non-mobile devices.

In the absence of strong latency monotonicity: adapting the proxy latency predictor

AdaProxy for boosting latency monotonicity

We exploit the correlation among devices and propose efficient transfer learning to boost the otherwise possibly weak latency monotonicity for a target device.

In the MobileNet-V2 space, with S5e as default proxy device

In the NAS-Bench-201 search space on CIFAR-10 (left), CIFAR-100 (middle) and ImageNet16-120 (right) datasets, with Pixel3 as our proxy device

In the FBNet search spaces on CIFAR-100 (left) and ImageNet16-120 (right) datasets, with Pixel3 as our proxy device

SRCC for various devices in the NAS-Bench-201 search space with latencies collected from [19, 29, 49, 50]

Using one proxy device for hardware-aware NAS

One proxy for hardware-aware NAS

Results for non-mobile target devices with the default S5e proxy and AdaProxy. The top row shows the evolutionary search results with real measured accuracies, and the bottom row shows the exhaustive search results based on 10k random architectures (in the MobileNet-V2 space) and predicted accuracies.

Exhaustive search results for different target devices on NAS-Bench-201 architectures (CIFAR-10 dataset). Pixel3 is the proxy.

Public latency datasets used in this work

HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark

Eagle: Efficient and Agile Performance Estimator and Dataset

nn-Meter: towards accurate latency prediction of deep-learning model inference on diverse edge devices

Once for All: Train One Network and Specialize it for Efficient Deployment

You might also like...

Deep Multimodal Neural Architecture Search

MMNas: Deep Multimodal Neural Architecture Search This repository corresponds to the PyTorch implementation of the MMnas for visual question answering

23 Dec 21, 2022

Official implementation of Rethinking Graph Neural Architecture Search from Message-passing (CVPR2021)

Rethinking Graph Neural Architecture Search from Message-passing Intro The GNAS can automatically learn better architecture with the optimal depth of

48 Sep 30, 2022

Block-wisely Supervised Neural Architecture Search with Knowledge Distillation (CVPR 2020)

DNA This repository provides the code of our paper: Blockwisely Supervised Neural Architecture Search with Knowledge Distillation. Illustration of DNA

215 Dec 19, 2022

"NAS-Bench-301 and the Case for Surrogate Benchmarks for Neural Architecture Search".

NAS-Bench-301 This repository containts code for the paper: "NAS-Bench-301 and the Case for Surrogate Benchmarks for Neural Architecture Search". The

57 Nov 30, 2022

Official PyTorch implementation of "Rapid Neural Architecture Search by Learning to Generate Graphs from Datasets" (ICLR 2021)

Rapid Neural Architecture Search by Learning to Generate Graphs from Datasets This is the official PyTorch implementation for the paper Rapid Neural A

48 Dec 26, 2022

[SIGMETRICS 2022] One Proxy Device Is Enough for Hardware-Aware Neural Architecture Search

Related tags

Overview

One Proxy Device Is Enough for Hardware-Aware Neural Architecture Search

paper | website

In a Nutshell

Hardware-aware NAS Dilemma

Overview of SOTA NAS algorithms

Our approach: exploiting latency monotonicity

Using SRCC to measure latency monotonicity

In the absence of strong latency monotonicity: adapting the proxy latency predictor

AdaProxy for boosting latency monotonicity

Using one proxy device for hardware-aware NAS

One proxy for hardware-aware NAS

Public latency datasets used in this work

You might also like...

Deep Multimodal Neural Architecture Search

Official implementation of Rethinking Graph Neural Architecture Search from Message-passing (CVPR2021)

Block-wisely Supervised Neural Architecture Search with Knowledge Distillation (CVPR 2020)

"NAS-Bench-301 and the Case for Surrogate Benchmarks for Neural Architecture Search".

Official PyTorch implementation of "Rapid Neural Architecture Search by Learning to Generate Graphs from Datasets" (ICLR 2021)

code for "AttentiveNAS Improving Neural Architecture Search via Attentive Sampling"

Few-shot Neural Architecture Search

PyTorch implementation of "Efficient Neural Architecture Search via Parameters Sharing"

An implementation for Neural Architecture Search with Random Labels (CVPR 2021 poster) on Pytorch.

Owner

Code release to accompany paper "Geometry-Aware Gradient Algorithms for Neural Architecture Search."

[CVPR21] LightTrack: Finding Lightweight Neural Network for Object Tracking via One-Shot Architecture Search

Densely Connected Search Space for More Flexible Neural Architecture Search (CVPR2020)

DeepHyper: Scalable Asynchronous Neural Architecture and Hyperparameter Search for Deep Neural Networks

Lipstick ain't enough: Beyond Color-Matching for In-the-Wild Makeup Transfer (CVPR 2021)

Model search is a framework that implements AutoML algorithms for model architecture search at scale

Imposter-detector-2022 - HackED 2022 Team 3IQ - 2022 Imposter Detector

Deep Image Search is an AI-based image search engine that includes deep transfor learning features Extraction and tree-based vectorized search.

[ICLR 2021] "Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective" by Wuyang Chen, Xinyu Gong, Zhangyang Wang

BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search