OneShot Learning-based hotword detection.

Overview

EfficientWord-Net

Versions : 3.6 ,3.7,3.8,3.9

Hotword detection based on one-shot learning

Home assistants require special phrases called hotwords to get activated (eg:"ok google")

EfficientWord-Net is an hotword detection engine based on one-shot learning inspired from FaceNet's Siamese Network Architecture. Works very similar to face recognition , just requires a few samples of your own custom hotword to get going. No extra training or huge datasets required!! This will allow developers to add custom hotwords to their programs without a sweat or any extra charges. Just like google assistant's hotword detector, the engine performs the best when 3-4 hotword samples are collected directly from the user This repository is an official implemenation of EfficientWord-Net as a python library from the authors.

The library is purely written with python and uses Google's Tflite implemenation for faster realtime inference.

Demo of EfficientWord-Net in Pi

EfficientWord-Net.mp4

Access preprint

The research paper is currently under review in IEEE, click here to access the preprint and the training code will be available for public access once the paper is published.

Python Version Requirements

This Library works between python versions: 3.6 to 3.9

Dependencies Installation

Before running the pip installation command for the library, few dependencies need to be installed manually.

tflite package cannot be listed in requirements.txt hence will be automatically installed when the package is initialized in the system.

librosa package is not required for inference only cases , however when generate_reference is called , will be automatically installed.


Package Installation

Run the following pip command

pip install EfficientWord-Net

and to import running

import eff_word_net

Demo

After installing the packages, you can run the Demo script inbuilt with library (ensure you have a working mic).

Accesss Documentation from : https://ant-brain.github.io/EfficientWord-Net/

Command to run demo

python -m eff_word_net.engine

Generating Custom Wakewords

For any new hotword, the library needs information about the hotword, this information is obtained from a file called {wakeword}_ref.json. Eg: For the wakeword 'alexa', the library would need the file called alexa_ref.json

These files can be generated with the following procedure:

One needs to collect few 4 to 10 uniquely sounding pronunciations of a given wakeword. Then put them into a seperate folder, which doesnt contain anything else.

Finally run this command, it will ask for the input folder's location (containing the audio files) and the output folder (where _ref.json file will be stored).

python -m eff_word_net.generate_reference

The pathname of the generated wakeword needs to passed to the HotwordDetector detector instance.

HotwordDetector(
        hotword="hello",
        reference_file = "/full/path/name/of/hello_ref.json"),
        activation_count = 3 #2 by default
)

Few wakewords such as Mycroft, Google, Firefox, Alexa, Mobile, Siri the library has predefined embeddings readily available in the library installation directory, its path is readily available in the following variable

from eff_word_net import samples_loc

Try your first single hotword detection script

import os
from eff_word_net.streams import SimpleMicStream
from eff_word_net.engine import HotwordDetector
from eff_word_net import samples_loc

mycroft_hw = HotwordDetector(
        hotword="Mycroft",
        reference_file = os.path.join(samples_loc,"mycroft_ref.json"),
        activation_count=3
    )

mic_stream = SimpleMicStream()
mic_stream.start_stream()

print("Say Mycroft ")
while True :
    frame = mic_stream.getFrame()
    result = mycroft_hw.checkFrame(frame)
    if(result):
        print("Wakeword uttered")

Detecting Mulitple Hotwords from audio streams

The library provides a computation friendly way to detect multiple hotwords from a given stream, installed of running checkFrame() of each wakeword individually

import os
from eff_word_net.streams import SimpleMicStream
from eff_word_net import samples_loc
print(samples_loc)

alexa_hw = HotwordDetector(
        hotword="Alexa",
        reference_file = os.path.join(samples_loc,"alexa_ref.json"),
    )

siri_hw = HotwordDetector(
        hotword="Siri",
        reference_file = os.path.join(samples_loc,"siri_ref.json"),
    )

mycroft_hw = HotwordDetector(
        hotword="mycroft",
        reference_file = os.path.join(samples_loc,"mycroft_ref.json"),
        activation_count=3
    )

multi_hw_engine = MultiHotwordDetector(
        detector_collection = [
            alexa_hw,
            siri_hw,
            mycroft_hw,
        ],
    )

mic_stream = SimpleMicStream()
mic_stream.start_stream()

print("Say Mycroft / Alexa / Siri")

while True :
    frame = mic_stream.getFrame()
    result = multi_hw_engine.findBestMatch(frame)
    if(None not in result):
        print(result[0],f",Confidence {result[1]:0.4f}")

Access documentation of the library from here : https://ant-brain.github.io/EfficientWord-Net/

About activation_count in HotwordDetector

Documenatation with detailed explanation on the usage of activation_count parameter in HotwordDetector is in the making , For now understand that for long hotwords 3 is advisable and 2 for smaller hotwords. If the detector gives out multiple triggers for a single utterance, try increasing activation_count. To experiment begin with smaller values. Default value for the same is 2

FAQ :

  • Hotword Perfomance is bad : if you are having some issue like this , feel to ask the same in discussions

CONTRIBUTION:

  • If you have an ideas to make the project better, feel free to ping us in discussions
  • The current logmelcalc.tflite graph can convert only 1 audio frame to Log Mel Spectrogram at a time. It will be of a great help if tensorflow guru's outthere help us out with this.

TODO :

  • Add audio file handler in streams. PR's are welcome.
  • Remove librosa requirement to encourage generating reference files directly in edge devices
  • Add more detailed documentation explaining slider window concept

SUPPORT US:

Our hotword detector's performance is notably low when compared to Porcupine. We have thought about better NN architectures for the engine and hope to outperform Porcupine. This has been our undergrad project. Hence your support and encouragement will motivate us to develop the engine. If you loved this project recommend this to your peers, give us a 🌟 in Github and a clap 👏 in medium.

LICENCSE : Apache License 2.0

Comments
  • Threshold value in engine.py not working?

    Threshold value in engine.py not working?

    hello,

    first of all, thank you for this great library!

    I managed to make it work on my M1 MacBook Air, and trying out my personal hotword detection, but the threshold value does not seem to be working on my environment.

    In engine.py:

        def __init__(
                self,
                hotword:str,
                reference_file:str,
                threshold:float=0.995,
                activation_count=2,
                continuous=True,
                verbose = False):
    

    And this is my script:

    import os
    from eff_word_net.streams import SimpleMicStream
    from eff_word_net.engine import HotwordDetector
    from eff_word_net import samples_loc
    
    hotword_hw = HotwordDetector(
            hotword="hotword",
            reference_file = "hotword_ref.json",
            activation_count=3
        )
    
    mic_stream = SimpleMicStream()
    mic_stream.start_stream()
    
    print("Say hotword ")
    while True :
        frame = mic_stream.getFrame()
        result = hotword_hw.checkFrame(frame)
            print("Wakeword uttered")
            print(hotword_hw.getMatchScoreFrame(frame))
    

    and when I run this script, the checkFrame returns true even when the getMatchScoreFrame returns under the threshold, like:

    Wakeword uttered
    0.9371609374494279
    Wakeword uttered
    0.9164050520717595
    Wakeword uttered
    0.9082509350226378
    ...
    

    Could you please take a look at this?

    Thank you!

    opened by dominickchen 10
  • Hotword detection triggers the moment any sound is being playd, even with the default models

    Hotword detection triggers the moment any sound is being playd, even with the default models

    So I've been trying to make a custom hotwork. But after seeing it trigger all the time, the moment any kind of sound is being recorded, I decided to use a default one, like "brightness", "mobile", "google", etc.

    They all trigger immediatley. Using the default values for the HotWordDetector, by the way. Any clue why? It seemed to have worked great in your video presentation.

    Not using a cheap ass microphone by the way.

    opened by TrackLab 9
  • circuit diagram

    circuit diagram

    Good evening!Can you send me the circuit diagram of raspberry pie connecting the bread board and lighting the LED light? Your experiment is so interesting that I want to repeat it.

    documentation 
    opened by preachwebsite 5
  • Invalid sample rate

    Invalid sample rate

    ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.front.0:CARD=0' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM front ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.surround51.0:CARD=0' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM surround21 ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.surround51.0:CARD=0' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM surround21 ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.surround40.0:CARD=0' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM surround40 ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.surround51.0:CARD=0' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM surround41 ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.surround51.0:CARD=0' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM surround50 ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.surround51.0:CARD=0' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM surround51 ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.surround71.0:CARD=0' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM surround71 ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.iec958.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM iec958 ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.iec958.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM spdif ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.iec958.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM spdif ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_rate_lavrate.so (libasound_module_rate_lavrate.so: libasound_module_rate_lavrate.so: cannot open shared object file: No such file or directory) ALSA lib pcm_rate.c:1468:(snd_pcm_rate_open) Cannot find rate converter ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_rate_samplerate.so (libasound_module_rate_samplerate.so: libasound_module_rate_samplerate.so: cannot open shared object file: No such file or directory) ALSA lib pcm_rate.c:1468:(snd_pcm_rate_open) Cannot find rate converter ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_rate_speexrate.so (libasound_module_rate_speexrate.so: libasound_module_rate_speexrate.so: cannot open shared object file: No such file or directory) ALSA lib pcm_rate.c:1468:(snd_pcm_rate_open) Cannot find rate converter ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_jack.so (libasound_module_pcm_jack.so: libasound_module_pcm_jack.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_jack.so (libasound_module_pcm_jack.so: libasound_module_pcm_jack.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_oss.so (libasound_module_pcm_oss.so: libasound_module_pcm_oss.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_oss.so (libasound_module_pcm_oss.so: libasound_module_pcm_oss.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_pulse.so (libasound_module_pcm_pulse.so: libasound_module_pcm_pulse.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_pulse.so (libasound_module_pcm_pulse.so: libasound_module_pcm_pulse.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_a52.so (libasound_module_pcm_a52.so: libasound_module_pcm_a52.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_a52.so (libasound_module_pcm_a52.so: libasound_module_pcm_a52.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_upmix.so (libasound_module_pcm_upmix.so: libasound_module_pcm_upmix.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_upmix.so (libasound_module_pcm_upmix.so: libasound_module_pcm_upmix.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_vdownmix.so (libasound_module_pcm_vdownmix.so: libasound_module_pcm_vdownmix.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_vdownmix.so (libasound_module_pcm_vdownmix.so: libasound_module_pcm_vdownmix.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_usb_stream.so (libasound_module_pcm_usb_stream.so: libasound_module_pcm_usb_stream.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_usb_stream.so (libasound_module_pcm_usb_stream.so: libasound_module_pcm_usb_stream.so: cannot open shared object file: No such file or directory) Expression 'paInvalidSampleRate' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 2043 Expression 'PaAlsaStreamComponent_InitialConfigure( &self->capture, inParams, self->primeBuffers, hwParamsCapture, &realSr )' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 2713 Expression 'PaAlsaStream_Configure( stream, inputParameters, outputParameters, sampleRate, framesPerBuffer, &inputLatency, &outputLatency, &hostBufferSizeMode )' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 2837 Traceback (most recent call last): File "/home/pi/Documents/test.py", line 11, in mic_stream = SimpleMicStream() File "/home/pi/Documents/eff_word_net/streams.py", line 71, in init mic_stream=p.open( File "/home/pi/Downloads/yes/envs/eff/lib/python3.8/site-packages/pyaudio.py", line 750, in open stream = Stream(self, *args, **kwargs) File "/home/pi/Downloads/yes/envs/eff/lib/python3.8/site-packages/pyaudio.py", line 441, in init self._stream = pa.open(**arguments) OSError: [Errno -9997] Invalid sample rate

    Good evening!I have encountered this problem. How can I solve it?Looking forward to your reply.

    raspberrypi 
    opened by preachwebsite 3
  • Could you help me?

    Could you help me?

    I won't deploy its running environment. Can you control it remotely? I have TeamViewer, a remote control software. The ID is 621 081 831. Or use other remote control. We look forward to your help.

    opened by preachwebsite 3
  • raising precision of custom wakeword

    raising precision of custom wakeword

    I'm curious whether the precision of custom wakeword improves if you provide more sound files, e.g. 50 files from different people? or is that meaningless?

    We want to use a custom wakeword for a public interaction system, and want it to recognize voice input from a wide range of people (young&old, male&female, etc).

    Thank you for letting me know.

    opened by dominickchen 3
  • Invalid input device (no default output device)

    Invalid input device (no default output device)

    ALSA lib conf.c:3723:(snd_config_hooks_call) Cannot open shared library libasound_module_conf_pulse.so (libasound_module_conf_pulse.so: libasound_module_conf_pulse.so: cannot open shared object file: No such file or directory) ALSA lib control.c:1379:(snd_ctl_open_noupdate) Invalid CTL hw:0 ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.front.0:CARD=0' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM front ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround21 ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround21 ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround40 ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround41 ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround50 ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround51 ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround71 ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.iec958.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM iec958 ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.iec958.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM spdif ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.iec958.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM spdif ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_rate_lavrate.so (libasound_module_rate_lavrate.so: libasound_module_rate_lavrate.so: cannot open shared object file: No such file or directory) ALSA lib pcm_rate.c:1468:(snd_pcm_rate_open) Cannot find rate converter ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_rate_samplerate.so (libasound_module_rate_samplerate.so: libasound_module_rate_samplerate.so: cannot open shared object file: No such file or directory) ALSA lib pcm_rate.c:1468:(snd_pcm_rate_open) Cannot find rate converter ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_rate_speexrate.so (libasound_module_rate_speexrate.so: libasound_module_rate_speexrate.so: cannot open shared object file: No such file or directory) ALSA lib pcm_rate.c:1468:(snd_pcm_rate_open) Cannot find rate converter ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_jack.so (libasound_module_pcm_jack.so: libasound_module_pcm_jack.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_jack.so (libasound_module_pcm_jack.so: libasound_module_pcm_jack.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_oss.so (libasound_module_pcm_oss.so: libasound_module_pcm_oss.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_oss.so (libasound_module_pcm_oss.so: libasound_module_pcm_oss.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_pulse.so (libasound_module_pcm_pulse.so: libasound_module_pcm_pulse.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_pulse.so (libasound_module_pcm_pulse.so: libasound_module_pcm_pulse.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_a52.so (libasound_module_pcm_a52.so: libasound_module_pcm_a52.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_a52.so (libasound_module_pcm_a52.so: libasound_module_pcm_a52.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_upmix.so (libasound_module_pcm_upmix.so: libasound_module_pcm_upmix.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_upmix.so (libasound_module_pcm_upmix.so: libasound_module_pcm_upmix.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_vdownmix.so (libasound_module_pcm_vdownmix.so: libasound_module_pcm_vdownmix.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_vdownmix.so (libasound_module_pcm_vdownmix.so: libasound_module_pcm_vdownmix.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_usb_stream.so (libasound_module_pcm_usb_stream.so: libasound_module_pcm_usb_stream.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_usb_stream.so (libasound_module_pcm_usb_stream.so: libasound_module_pcm_usb_stream.so: cannot open shared object file: No such file or directory) Traceback (most recent call last): File "/home/pi/Documents/test.py", line 11, in mic_stream = SimpleMicStream() File "/home/pi/Documents/eff_word_net/streams.py", line 71, in init mic_stream=p.open( File "/home/pi/Downloads/yes/envs/eff/lib/python3.8/site-packages/pyaudio.py", line 750, in open stream = Stream(self, *args, **kwargs) File "/home/pi/Downloads/yes/envs/eff/lib/python3.8/site-packages/pyaudio.py", line 441, in init self._stream = pa.open(**arguments) OSError: [Errno -9996] Invalid input device (no default output device)

    I have encountered this problem. How can I solve it?

    opened by preachwebsite 2
  • Here that working fine with ref file but not if a record custom file.

    Here that working fine with ref file but not if a record custom file.

    Hello, i working on google collab, so i don't have access to mic. The work around is to used mp3 or wav file. To do that i have add this class:

    from streams import CustomAudioStream
    from pydub import AudioSegment
    
    import numpy as np
    import wave
    
    RATE = 16000
    index = 0
    
    class SimpleFileStream(CustomAudioStream) :
    
        def open_stream(self, src, mp3):
            if mp3:
              dst = "Data/sample.wav"
              # convert mp3 to wav              
              sound = AudioSegment.from_mp3(src).set_frame_rate(16000)
              sound.export(dst, format="wav")
              self.wf = wave.open(dst, 'rb')
            else:
              print("Not an mp3")
              self.wf = wave.open(src, 'rb')
              self.wf.rewind()
            print("Get params of wav file " + str(self.wf.getparams()))
    
        def close_stream(self):
            self.wf.close()
    
        def get_next_frame(self):
            global index
            print("Index ", index)
            index = index + self.CHUNK
            return np.frombuffer(self.wf.readframes(self.CHUNK),dtype=np.int16)
    
        """
        Implements stream with sliding window, 
        implemented by inheriting CustomAudioStream
        """
        def __init__(self,sliding_window_secs:float=1/8):
            self.CHUNK = int(sliding_window_secs*RATE)
    
            CustomAudioStream.__init__(
                self,
                open_stream = self.open_stream,
                close_stream = self.close_stream,
                get_next_frame = self.get_next_frame,
            )
    

    It seems working if i used ref file of github. But if i record a custom file using audacity it is not detect the wakeword.

    If i change the threshold to 0.7 and the activation count to 2 it is work better, but il will increase the chance of getting false positive.

    Is it mandatory to have custom ref for each user ?

    Best regards Sebastien

    opened by warichet 2
  • Bump numpy from 1.20.0 to 1.22.0

    Bump numpy from 1.20.0 to 1.22.0

    Bumps numpy from 1.20.0 to 1.22.0.

    Release notes

    Sourced from numpy's releases.

    v1.22.0

    NumPy 1.22.0 Release Notes

    NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

    • Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
    • A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across application such as CuPy and JAX.
    • NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
    • New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.
    • A new configurable allocator for use by downstream projects.

    These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

    The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

    Expired deprecations

    Deprecated numeric style dtype strings have been removed

    Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

    (gh-19539)

    Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

    numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

    (gh-19615)

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • Discussion : Hotword's accuracy too low

    Discussion : Hotword's accuracy too low

    If you are playing around with using your own custom hotwords and some hotword happen to not work so good Feel free to use the thread in discussions https://github.com/Ant-Brain/EfficientWord-Net/discussions/4

    opened by TheSeriousProgrammer 0
  • Hotword matches without any utterance

    Hotword matches without any utterance

    Hi, first of all thanks for making this library, Its fantastic!! I understand well since its in very early phase so it will have some issue and eventually it will better. So this time I was trying to go with the given example of hotword detector, I tried to attach a speech recognition after hotword triggers, but the performance is quite messy , to demonstrate this I am including this gif.

    Code_nyP9cJMBBg

    Problem1: Basically what happening is I am trying to call speech recognition right after there is match, as the speech recognition ends it again shows hotword uttered and re listen, even though there no hotword uttered and with confidence.

    Problem2: Also in some situations it matches when there is little click or desk sound.

    any fix for at least for Problem 1 I see problem 2 could be the reason of weak training as depending upon the hotword.

    opened by OnlinePage 2
  • OSError: [Errno -9981] Input overflowed

    OSError: [Errno -9981] Input overflowed

    I've installed the python library with

    pip install EfficientWord-Net

    onto a raspberry pi 2 with recent raspbian lite.

    However if i run the demo with python -m eff_word_net.engine i'll get the following error and nothing works:

    Say Mycroft / Alexa / Siri
    Traceback (most recent call last):
      File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
        exec(code, run_globals)
      File "/home/max/.local/lib/python3.9/site-packages/eff_word_net/engine.py", line 333, in <module>
        frame = mic_stream.getFrame()
      File "/home/max/.local/lib/python3.9/site-packages/eff_word_net/streams.py", line 49, in getFrame
        new_frame = self._get_next_frame()
      File "/home/max/.local/lib/python3.9/site-packages/eff_word_net/streams.py", line 85, in <lambda>
        np.frombuffer(mic_stream.read(CHUNK),dtype=np.int16)
      File "/usr/lib/python3/dist-packages/pyaudio.py", line 608, in read
        return pa.read_stream(self._stream, num_frames, exception_on_overflow)
    OSError: [Errno -9981] Input overflowed
    

    any idea?

    opened by mKenfenheuer 1
  • complex hotwords support #Current Model Limitations Discussion

    complex hotwords support #Current Model Limitations Discussion

    Hi, Thanks for your helpful research. I wonder if the current model can handle complex hot words like "Hey Siri" or just handle one word, like "Siri"?

    My second question is about hot words that their pronunciation takes more than 1s, like"Hey XXXX." Does your model support changing the recording time?

    Did you try to use cosine_similarity instead of Euclidian distance in inference time?

    Thanks.

    enhancement 
    opened by amoazeni75 7
  • Problem with Dependencies #Docker Support

    Problem with Dependencies #Docker Support

    Hello I left a comment on Reddit saying I would give it a go, and you said if I had a problem to log it here, so here I am, with a problem 😊

    I seem to get stuck with pip3 install librosa I get this error Failed building wheel for llvmlite Running setup.py clean for llvmlite Failed to build llvmlite I can push on and get EfficientWord installed and working, if I say Alexa it says Yup I hear ya

    The problem is then when I try to create my own wake word I run this command … python3 -m eff_word_net.generate_reference pi@raspberrypi:~ $ python3 -m eff_word_net.generate_reference Paste Path of folder Containing audio files:/home/pi/wakewords Paste Path of location to save *_ref.json :/home/pi/wakewords Enter Wakeword Name :bender Traceback (most recent call last): File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main "main", mod_spec) File "/usr/lib/python3.7/runpy.py", line 85, in run_code exec(code, run_globals) File "/home/pi/.local/lib/python3.7/site-packages/eff_word_net/generate_reference.py", line 80, in input("Enter Wakeword Name :") File "/home/pi/.local/lib/python3.7/site-packages/eff_word_net/generate_reference.py", line 47, in generate_reference_file x, = librosa.load(audio_file,sr=16000) AttributeError: module 'librosa' has no attribute 'load'

    My Problem is with librosa, I am not able to install it. I tried everything I could google but it will never install

    How did you get around this problem ?

    enhancement good first issue raspberrypi wake_word_generation 
    opened by Balro76 3
Releases(stable)
Owner
ANT-BRaiN
Small is the new big.
ANT-BRaiN
LaneDet is an open source lane detection toolbox based on PyTorch that aims to pull together a wide variety of state-of-the-art lane detection models

LaneDet is an open source lane detection toolbox based on PyTorch that aims to pull together a wide variety of state-of-the-art lane detection models. Developers can reproduce these SOTA methods and build their own methods.

TuZheng 405 Jan 4, 2023
(JMLR'19) A Python Toolbox for Scalable Outlier Detection (Anomaly Detection)

Python Outlier Detection (PyOD) Deployment & Documentation & Stats Build Status & Coverage & Maintainability & License PyOD is a comprehensive and sca

Yue Zhao 6.6k Jan 3, 2023
Joint detection and tracking model named DEFT, or ``Detection Embeddings for Tracking.

DEFT: Detection Embeddings for Tracking DEFT: Detection Embeddings for Tracking, Mohamed Chaabane, Peter Zhang, J. Ross Beveridge, Stephen O'Hara

Mohamed Chaabane 253 Dec 18, 2022
Streaming Anomaly Detection Framework in Python (Outlier Detection for Streaming Data)

Python Streaming Anomaly Detection (PySAD) PySAD is an open-source python framework for anomaly detection on streaming multivariate data. Documentatio

Selim Firat Yilmaz 181 Dec 18, 2022
A Data Annotation Tool for Semantic Segmentation, Object Detection and Lane Line Detection.(In Development Stage)

Data-Annotation-Tool How to Run this Tool? To run this software, follow the steps: git clone https://github.com/Autonomous-Car-Project/Data-Annotation

TiVRA AI 13 Aug 18, 2022
object detection; robust detection; ACM MM21 grand challenge; Security AI Challenger Phase VII

赛题背景 在商品知识产权领域,知识产权体现为在线商品的设计和品牌。不幸的是,在每一天,存在着非法商户通过一些对抗手段干扰商标识别来逃避侵权,这带来了很高的知识产权风险和财务损失。为了促进先进的多媒体人工智能技术的发展,以保护企业来之不易的创作和想法免受恶意使用和剽窃,因此提出了鲁棒性标识检测挑战赛

null 65 Dec 22, 2022
Example scripts for the detection of lanes using the ultra fast lane detection model in ONNX.

Example scripts for the detection of lanes using the ultra fast lane detection model in ONNX.

Ibai Gorordo 35 Sep 7, 2022
Example scripts for the detection of lanes using the ultra fast lane detection model in Tensorflow Lite.

TFlite Ultra Fast Lane Detection Inference Example scripts for the detection of lanes using the ultra fast lane detection model in Tensorflow Lite. So

Ibai Gorordo 12 Aug 27, 2022
This project deals with the detection of skin lesions within the ISICs dataset using YOLOv3 Object Detection with Darknet.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. Skin Lesion detection using YOLO This project deal

Lalith Veerabhadrappa Badiger 1 Nov 22, 2021
Implementation of the state of the art beat-detection, downbeat-detection and tempo-estimation model

The ISMIR 2020 Beat Detection, Downbeat Detection and Tempo Estimation Model Implementation. This is an implementation in TensorFlow to implement the

Koen van den Brink 1 Nov 12, 2021
This is a GUI interface which can process forest fire detection, smoke detection and fire segmentation

This is a GUI interface which can process forest fire detection, smoke detection and fire segmentation. Yolov5 is used to detect fire and smoke and unet is used to segment fire.

null 7 Jan 8, 2023
Cancer-and-Tumor-Detection-Using-Inception-model - In this repo i am gonna show you how i did cancer/tumor detection in lungs using deep neural networks, specifically here the Inception model by google.

Cancer-and-Tumor-Detection-Using-Inception-model In this repo i am gonna show you how i did cancer/tumor detection in lungs using deep neural networks

Deepak Nandwani 1 Jan 1, 2022
Hybrid CenterNet - Hybrid-supervised object detection / Weakly semi-supervised object detection

Hybrid-Supervised Object Detection System Object detection system trained by hybrid-supervision/weakly semi-supervision (HSOD/WSSOD): This project is

null 5 Dec 10, 2022
Human Detection - Pedestrian Detection using OpenCV Python

Pedestrian Detection using OpenCV Python Follow us on Instagram for Machine Lear

Hrishikesh Dutta 1 Jan 23, 2022
Yolo object detection - Yolo object detection with python

How to run download required files make build_image make download Docker versio

null 3 Jan 26, 2022
End-to-end face detection, cropping, norm estimation, and landmark detection in a single onnx model

onnx-facial-lmk-detector End-to-end face detection, cropping, norm estimation, and landmark detection in a single onnx model, model.onnx. Demo You can

atksh 42 Dec 30, 2022
A Python Library for Graph Outlier Detection (Anomaly Detection)

PyGOD is a Python library for graph outlier detection (anomaly detection). This exciting yet challenging field has many key applications, e.g., detect

PyGOD Team 757 Jan 4, 2023
LightLog is an open source deep learning based lightweight log analysis tool for log anomaly detection.

LightLog Introduction LightLog is an open source deep learning based lightweight log analysis tool for log anomaly detection. Function description [BG

null 25 Dec 17, 2022
CVPR2022 paper "Dense Learning based Semi-Supervised Object Detection"

[CVPR2022] DSL: Dense Learning based Semi-Supervised Object Detection DSL is the first work on Anchor-Free detector for Semi-Supervised Object Detecti

Bhchen 69 Dec 8, 2022