5 Steps to Speed Up Your Data-Analysis on a Single Core
Material for my talk at the PyConDE & PyData Berlin 2022
Description
Your data analysis pipeline works. Nice.
Could it be faster? Probably.
Do you need to parallelize? Not yet.
We'll go through optimization steps that boost the performance of your data-analysis pipeline on a single core, reducing time and cost. This walkthrough presents tools and strategies to identify and mitigate bottlenecks, and demonstrates them on an example. The 5 steps (brief sketches of each follow the list) cover:
- Identifying bottlenecks: Profiling
- Efficient IO
- Vectorization
- Memory & Precision Tradeoffs
- JIT-ting with numba
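
The sketches below are minimal illustrations of each step; the function and file names in them are hypothetical placeholders and not part of this repository. Profiling comes first: the standard library's cProfile shows where time is actually spent before any optimization is attempted.

```python
import cProfile
import pstats


def analysis_step(n: int = 100_000) -> float:
    """Hypothetical stand-in for one step of the pipeline."""
    return sum(i ** 0.5 for i in range(n))


profiler = cProfile.Profile()
profiler.enable()
analysis_step()
profiler.disable()

# Show the 10 functions with the largest cumulative time.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)
```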
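Efficient IO: a columnar, binary format such as Parquet is typically much faster to read than CSV and preserves dtypes. A sketch with pandas (the Parquet calls assume pyarrow or fastparquet is installed; file names are hypothetical):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": np.random.rand(1_000_000), "y": np.random.rand(1_000_000)})

df.to_csv("data.csv", index=False)  # text-based, slow, loses dtype information
df.to_parquet("data.parquet")       # binary, compressed, keeps dtypes

df_from_csv = pd.read_csv("data.csv")              # slow round-trip
df_from_parquet = pd.read_parquet("data.parquet")  # usually much faster
```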
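Vectorization: replacing an explicit Python loop with a numpy expression moves the per-element work into optimized, compiled code. A sketch:

```python
import numpy as np

values = np.random.rand(1_000_000)


def loop_norm(a: np.ndarray) -> float:
    """Interpreted per-element loop: slow."""
    total = 0.0
    for v in a:
        total += v * v
    return total ** 0.5


def vectorized_norm(a: np.ndarray) -> float:
    """Same computation as a single vectorized expression."""
    return float(np.sqrt(np.sum(a * a)))


assert np.isclose(loop_norm(values), vectorized_norm(values))
```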
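Memory & precision tradeoffs: if float32 precision is sufficient for the analysis, downcasting halves memory use and the data volume moved through caches. A sketch:

```python
import numpy as np

a64 = np.random.rand(1_000_000)  # float64 by default: 8 bytes per element
a32 = a64.astype(np.float32)     # half the memory and memory bandwidth

print(f"{a64.nbytes / 1e6:.1f} MB vs. {a32.nbytes / 1e6:.1f} MB")

# Check that the introduced error is acceptable before switching precision.
print("max abs error:", np.max(np.abs(a64 - a32.astype(np.float64))))
```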
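JIT-ting with numba: decorating a numerical hot loop with @njit compiles it to machine code on first call, so the loop no longer runs in the Python interpreter. A sketch:

```python
import numpy as np
from numba import njit


@njit(cache=True)
def sum_of_squares(a):
    total = 0.0
    for v in a:
        total += v * v
    return total


values = np.random.rand(1_000_000)
sum_of_squares(values)  # first call triggers compilation
sum_of_squares(values)  # subsequent calls run the compiled code
```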
This talk is aimed at data scientists at a beginner or intermediate level, typically working with a numpy/scipy/… stack or similar. It gives strategies and concrete suggestions for speeding up an existing analysis pipeline, demonstrated practically on an example that shows the speed-up gained at each step.
Installation & Usage
python3 -m pip install poetry
poetry install
poetry run python -m jupyterlab
Dev
./format.sh