Code for STFT Transformer used in BirdCLEF 2021 competition.

STFT_Transformer

The STFT Transformer is a new way to apply Transformers to audio data, similar to how Vision Transformers are applied to images. It was developed for the BirdCLEF 2021 competition hosted on Kaggle. The PDF document gives more context; it has been submitted to the BirdCLEF 2021 workshop.

The code is provided as is; it has not been rewritten. Because competitions are done in a hurry, the code may not meet the usual open source standards.

The code assumes this directory structure:

<base_dir>/code

<base_dir>/input

<base_dir>/input/freefield1010

<base_dir>/checkpoints

<base_dir>/data
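The layout above can be created or checked with a few lines of Python. This is a minimal sketch, not part of the repository: the helper names `create_layout` and `missing_dirs` are hypothetical, and only the directory names come from the list above.

```python
from pathlib import Path

# Sub-directories listed above, relative to <base_dir>.
EXPECTED_DIRS = ["code", "input", "input/freefield1010", "checkpoints", "data"]


def create_layout(base_dir):
    """Create every expected sub-directory under base_dir (hypothetical helper)."""
    base = Path(base_dir)
    for rel in EXPECTED_DIRS:
        # parents=True also creates base_dir and "input" when needed.
        (base / rel).mkdir(parents=True, exist_ok=True)


def missing_dirs(base_dir):
    """Return the expected sub-directories that do not exist under base_dir."""
    base = Path(base_dir)
    return [rel for rel in EXPECTED_DIRS if not (base / rel).is_dir()]
```

Calling `missing_dirs(base_dir)` before running any script gives a quick check that the data was downloaded into the right place.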

The code has to be run from the code directory. Download the competition data into the input directory, and the freefield1010 data into the freefield1010 directory. Run data_final.py first: it reads the audio files from input and stores the relevant parts in the data directory as NumPy files.

Then run stft_transformer_final.py to train the model for one fold. During the competition I ran 5 folds by editing the FOLD global variable in the script (I know, this is substandard).
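One way to avoid editing the script between runs is to read the fold from the command line. The sketch below is an assumption, not how the script currently works: the `--fold` flag, the `parse_fold` helper, and the 0 to 4 fold numbering are all hypothetical.

```python
import argparse

N_FOLDS = 5  # the README mentions 5 folds; numbering them 0..4 is an assumption


def parse_fold(argv=None):
    """Parse a --fold argument (hypothetical flag) and return it as an int."""
    parser = argparse.ArgumentParser(description="Train the model for one fold")
    parser.add_argument(
        "--fold",
        type=int,
        default=0,
        choices=range(N_FOLDS),
        help="index of the fold to train",
    )
    return parser.parse_args(argv).fold


# In the training script, the global could then be set with:
#     FOLD = parse_fold()
```

With such a change, the five runs could be launched as `python stft_transformer_final.py --fold 0` through `--fold 4`.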

Once all 5 models are trained, upload the weights to a Kaggle dataset and use the submission notebook I used. This should get a score worth the 15th rank in the competition. Achieving this rank with a single model is significant, as all top teams used an ensemble of models.


Comments
  • Pytorch dataloader bug

    I see that the dataset class uses the NumPy random number generator, and that the dataloader does not use worker_init_fn as recommended in the PyTorch docs. This makes me suspect that the random samples are not being generated as expected. More details about the bug are here

    It would be great to see the impact of this bug on model performance.

    opened by sidml 2
  • f1 score low!!!

    Hi, CPMP. I found this project through Kaggle. I want to learn from your work, so I ran your code. But I got a very low f1 score in the first 3 epochs. Is this normal?

    loss: 0.2049, smth: 0.1818: 100%|██████████| 13726/13726 [35:29<00:00, 6.44it/s]
    loss: 0.0635, smth: 0.1129: 100%|██████████| 858/858 [03:15<00:00, 4.39it/s]
    Orig 0 Ep 0, lr: 0.0001000, train loss: 0.28392, val loss: 0.21424, f1: 0.0784 0.0950 0.0830 0.0805 0.0779
    Thu May 5 23:40:27 2022 Epoch: 1
    loss: 0.1854, smth: 0.1726: 100%|██████████| 13726/13726 [35:25<00:00, 6.46it/s]
    loss: 0.0494, smth: 0.0906: 100%|██████████| 858/858 [03:10<00:00, 4.50it/s]
    Orig 0 Ep 1, lr: 0.0000970, train loss: 0.17728, val loss: 0.17907, f1: 0.0948 0.1055 0.0959 0.0811 0.0748
    Fri May 6 00:19:09 2022 Epoch: 2
    loss: 0.1628, smth: 0.1752: 100%|██████████| 13726/13726 [35:25<00:00, 6.46it/s]
    loss: 0.0728, smth: 0.0919: 100%|██████████| 858/858 [03:08<00:00, 4.55it/s]
    Orig 0 Ep 2, lr: 0.0000883, train loss: 0.17329, val loss: 0.15498, f1: 0.0941 0.1128 0.1118 0.0872 0.0765
    Fri May 6 00:57:49 2022 Epoch: 3
    loss: 0.1113, smth: 0.1657: 100%|██████████| 13726/13726 [35:30<00:00, 6.44it/s]
    loss: 0.0391, smth: 0.0707: 100%|██████████| 858/858 [03:34<00:00, 4.00it/s]
    Orig 0 Ep 3, lr: 0.0000750, train loss: 0.16898, val loss: 0.14133, f1: 0.1139 0.1333 0.1339 0.1080 0.0915

    opened by michaelzhouy 1
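The first comment above concerns NumPy random state in dataloader worker processes. As a minimal sketch of the kind of worker_init_fn it refers to, the function below reseeds NumPy and Python's `random` module differently in each worker. The helper name `make_worker_init_fn` and the `base_seed + worker_id` scheme are assumptions for illustration; they are not taken from the repository or from the PyTorch docs.

```python
import random

import numpy as np


def make_worker_init_fn(base_seed):
    """Build a function usable as DataLoader(worker_init_fn=...) (hypothetical helper)."""

    def worker_init_fn(worker_id):
        # np.random.seed only accepts values in [0, 2**32).
        seed = (base_seed + worker_id) % 2**32
        np.random.seed(seed)
        random.seed(seed)

    return worker_init_fn


# Simulate two workers in one process: each one gets its own random stream.
init_fn = make_worker_init_fn(base_seed=1234)
init_fn(0)
draw_worker0 = np.random.randint(0, 1_000_000, size=4)
init_fn(1)
draw_worker1 = np.random.randint(0, 1_000_000, size=4)
```

With PyTorch, the returned function would be passed as `DataLoader(dataset, num_workers=4, worker_init_fn=make_worker_init_fn(1234))`. Note that a fixed base seed replays the same random stream in every epoch; deriving the seed from `torch.initial_seed()` inside the worker is one way to avoid that.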
Owner
Jean-François Puget
NVIDIA Distinguished Engineer and two-time Kaggle Grandmaster (CPMP). Works on machine learning and competes on Kaggle.