Animal Sound Classification (Cats vs Dogs Audio Sentiment Classification)
This is a simple audio classification API built to classify the sound in an audio file as either a cat or a dog sound.
Response
Given a .wav audio file, the model classifies the sound as either cat or dog and returns a response like the following:
{
"predictions": {
"class": "dog",
"label": 1,
"probability": 1.0
},
"success": true
}
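A client usually needs only a few fields from this body. The following is a minimal Python sketch of reading it, based purely on the sample response above; the helper name `summarize_prediction` is hypothetical, and how a failed request is reported is not documented here, so any payload without a truthy `success` is treated generically.

```python
from typing import Any, Dict


def summarize_prediction(payload: Dict[str, Any]) -> str:
    """Turn a /classify response body into a short human-readable string."""
    # Assumption: the error shape is undocumented, so every non-success
    # payload is reported the same way.
    if not payload.get("success"):
        return "classification failed"
    pred = payload["predictions"]
    # In the sample response "probability" is a float between 0 and 1.
    return f'{pred["class"]} (label {pred["label"]}, p={pred["probability"]:.2f})'


# The sample response from the text above, as a Python dict.
sample = {
    "predictions": {"class": "dog", "label": 1, "probability": 1.0},
    "success": True,
}
print(summarize_prediction(sample))  # dog (label 1, p=1.00)
```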
Starting the server
To start the server and begin classifying audio, first make sure you are in the server
folder, then run the following commands:
- creating a virtual environment (the activation command shown is for Windows)
virtualenv venv && .\venv\Scripts\activate.bat
- installing packages
pip install -r requirements.txt
- Starting the server
python api/app.py
The server will start on the default port 3001, and you will be able to make API requests to it for audio classification.
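Before sending audio, it can be handy to confirm that something is actually listening on that port. A minimal sketch using only the Python standard library follows; the helper name `is_server_up` is hypothetical, and it only checks that a TCP connection succeeds, not that the classifier itself is healthy.

```python
import socket


def is_server_up(host: str = "127.0.0.1", port: int = 3001, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `is_server_up()` with no arguments checks the default address used throughout this README.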
Model Metrics
The following table summarizes the metrics obtained after training the model for 15
epochs.
model name | model description | test accuracy | validation accuracy | train accuracy | test loss | validation loss | train loss |
---|---|---|---|---|---|---|---|
cats-dogs-sound-cnn.pt | audio sentiment classification for dogs and cats CNN. | 90.7% | 90.7% | 93.5% | 0.621 | 0.218 | 0.209 |
Classification report
The following is the classification report for the model on the test
dataset.
# | precision | recall | f1-score | support |
---|---|---|---|---|
accuracy | - | - | 90% | 2305 |
macro avg | 91% | 90% | 90% | 2305 |
weighted avg | 92% | 89% | 90% | 2305 |
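The macro avg and weighted avg rows differ only in how per-class scores are combined: the macro average is the plain mean over classes, while the weighted average weights each class by its support. The sketch below illustrates this, assuming the report follows the usual scikit-learn layout. The per-class numbers in it are hypothetical and are not taken from this model.

```python
from typing import List, Tuple


def macro_and_weighted(scores: List[float], supports: List[int]) -> Tuple[float, float]:
    """Return (macro average, support-weighted average) of per-class scores."""
    if len(scores) != len(supports) or not scores:
        raise ValueError("scores and supports must be non-empty and of equal length")
    # Macro: every class counts equally, regardless of size.
    macro = sum(scores) / len(scores)
    # Weighted: each class counts in proportion to its number of samples.
    weighted = sum(s * n for s, n in zip(scores, supports)) / sum(supports)
    return macro, weighted


# Hypothetical per-class precision and support, NOT results from this model.
macro, weighted = macro_and_weighted([0.80, 0.90], [100, 300])
print(round(macro, 3), round(weighted, 3))  # 0.85 0.875
```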
Confusion matrix
The following figure shows a confusion matrix for the classification model.
Audio Sentiment classification
If you send a POST request to http://localhost:3001/classify
and provide the audio file the server expects, you will get the expected response described in the next section.
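Since the server expects a .wav file, it can help to check the file locally before uploading it. The sketch below uses Python's standard `wave` module, which only handles uncompressed PCM WAV files. The helper name `describe_wav` is hypothetical, and the sample rate or duration the model actually requires is not stated in this README.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read basic header info from a WAV file.

    Raises an exception if the file cannot be parsed as a PCM WAV file.
    """
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            # Duration in seconds = number of frames / frames per second.
            "duration_s": frames / rate if rate else 0.0,
        }
```

For example, `describe_wav("cat.wav")` returns the channel count, sample rate and duration of the file, or raises if the file is not a readable WAV.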
Expected Response
A POST request to http://localhost:3001/classify
with a file under the key audio,
in the right format, yields the following JSON
response to the client.
{
"predictions": {
"class": "dog",
"label": 1,
"probability": 1.0
},
"success": true
}
Using curl
Make sure that you have the audio named cat.wav
in the folder where you are running your cmd;
otherwise you have to provide an absolute or relative path to the audio.
To make a curl POST request at http://localhost:3001/classify
with the file cat.wav
we run the following command.
# for cat
curl -X POST -F audio=@cat.wav http://127.0.0.1:3001/classify
# for dog
curl -X POST -F audio=@dog.wav http://127.0.0.1:3001/classify
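The same multipart request can be made from Python. The sketch below uses only the standard library and assumes, as in the curl command, that the server reads the file from the form key audio. The helper names `encode_multipart` and `classify` are hypothetical, and the `audio/wav` part content type is an assumption rather than something the server is known to require.

```python
import json
import os
import urllib.request
import uuid
from typing import Tuple


def encode_multipart(field: str, filename: str, data: bytes) -> Tuple[bytes, str]:
    """Build a multipart/form-data body holding a single file field.

    Returns (body, value for the Content-Type header).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: audio/wav\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def classify(path: str, url: str = "http://127.0.0.1:3001/classify") -> dict:
    """POST the file at `path` under the form key "audio"; return the JSON reply."""
    with open(path, "rb") as fh:
        body, content_type = encode_multipart("audio", os.path.basename(path), fh.read())
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `classify("cat.wav")` should return the parsed JSON response as a dict.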
Using Postman client
To make this request with Postman we do it as follows:
- Change the request method to POST at http://127.0.0.1:3001/classify
- Click on form-data
- Select type to be file on the KEY attribute
- For the KEY type audio and select the audio you want to predict under value
- Click send
If everything went well, you will get a response like the following, depending on the audio you have selected:
{
"predictions": { "class": "dog", "label": 1, "probability": 1.0 },
"success": true
}
Using JavaScript fetch api.
- First you need to get the input from html
- Create a formData object
- Make a POST request
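// Grab the selected file from the file input element with id "input".
const input = document.getElementById("input").files[0];
// The server reads the file from the "audio" form field.
const formData = new FormData();
formData.append("audio", input);
fetch("http://127.0.0.1:3001/classify", {
  method: "POST",
  body: formData,
})
  .then((res) => res.json())
  .then((data) => console.log(data))
  .catch((err) => console.error(err)); // network or JSON parsing errors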
If everything went well, you will get the expected response:
{
"predictions": { "class": "dog", "label": 1, "probability": 1.0 },
"success": true
}
Notebooks
- All notebooks for training and saving the models are found in the notebooks folder of this repository.