Building Shazam from scratch
In this repository we implement a simplified version of the Shazam application, able to tell you the name of a song from a short audio sample.
Overview
- Converting the songs from mp3 to wav with Librosa and extracting the peaks
- MinHashing with permutations on the shingle matrix
- Locality-sensitive hashing (LSH) to divide the songs into buckets
- Shazam!
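
The MinHashing and LSH steps above can be sketched roughly as follows. This is a minimal illustration, not the code in function.py: the function names (`minhash_signatures`, `lsh_buckets`, `candidates`), the parameters `n_perm`, `n_bands` and `seed`, and the random toy matrix are all assumptions made for the example. It assumes the shingle matrix is a boolean array with one row per shingle and one column per song, and it uses only NumPy.

```python
from collections import defaultdict

import numpy as np


def minhash_signatures(shingle_matrix, n_perm, seed=0):
    """Return an (n_perm, n_songs) signature matrix.

    shingle_matrix is assumed to be boolean, shape (n_shingles, n_songs).
    """
    rng = np.random.default_rng(seed)
    n_shingles, n_songs = shingle_matrix.shape
    signatures = np.empty((n_perm, n_songs), dtype=np.int64)
    for i in range(n_perm):
        # perm[r] is the position of row r after a random row permutation
        perm = rng.permutation(n_shingles)
        # For each song, keep the smallest permuted position among its shingles;
        # rows where the song has no shingle are masked with n_shingles.
        positions = np.where(shingle_matrix, perm[:, None], n_shingles)
        signatures[i] = positions.min(axis=0)
    return signatures


def lsh_buckets(signatures, n_bands):
    """Split each signature into bands and group songs that share a band."""
    n_perm, n_songs = signatures.shape
    rows = n_perm // n_bands
    buckets = defaultdict(set)
    for b in range(n_bands):
        band = signatures[b * rows:(b + 1) * rows]
        for song in range(n_songs):
            key = (b, tuple(band[:, song].tolist()))
            buckets[key].add(song)
    return buckets


def candidates(query_signature, buckets, n_bands):
    """Return the songs that share at least one band with the query."""
    rows = len(query_signature) // n_bands
    found = set()
    for b in range(n_bands):
        key = (b, tuple(query_signature[b * rows:(b + 1) * rows].tolist()))
        found |= buckets.get(key, set())
    return found


# Toy data (made up): 50 shingles, 3 songs.
matrix = np.random.default_rng(42).random((50, 3)) < 0.3
signatures = minhash_signatures(matrix, n_perm=20, seed=0)
buckets = lsh_buckets(signatures, n_bands=5)

# The query is hashed with the same seed, so it uses the same permutations.
query = matrix[:, 1]
query_signature = minhash_signatures(query[:, None], n_perm=20, seed=0)[:, 0]
matches = candidates(query_signature, buckets, n_bands=5)
```

Here the query is an exact copy of the second song, so it lands in the same bucket as that song in every band. With a real, noisy sample only some bands would agree, which is why the choice of the number of bands and rows per band matters.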
Contents
- pickle is a folder that contains the song peaks, the shingles array and the shingle matrix in pickle format.
- ShazamLSH.ipynb is the main notebook; it contains only the explanation of the steps and some comments.
- function.py contains all the functions needed to execute the notebook.
Resources
This is the dataset we used and processed:
We also share some useful links that can help in understanding how MinHashing and LSH are used to recognise songs: