Picovoice

Made in Vancouver, Canada by Picovoice

Picovoice is the end-to-end platform for building voice products on your terms. Unlike Alexa and Google services, Picovoice runs entirely on-device while being more accurate. Using Picovoice, one can infer a user’s intent from a naturally spoken utterance such as:

"Hey Edison, set the lights in the living room to blue"

Picovoice detects the occurrence of the custom wake word (Hey Edison), and then extracts the intent from the follow-on spoken command:

{
  "intent": "changeColor",
  "slots": {
    "location": "living room",
    "color": "blue"
  }
}
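As a rough illustration of how an application might consume a result shaped like the one above, the sketch below parses the JSON and routes it to a handler. The `change_color` handler, the `HANDLERS` table, and the `dispatch` function are hypothetical names introduced for this example; they are not part of the Picovoice SDK.

```python
import json

# The inference from the example above, as a JSON string.
INFERENCE_JSON = """
{
  "intent": "changeColor",
  "slots": {
    "location": "living room",
    "color": "blue"
  }
}
"""


def change_color(location, color):
    # Hypothetical application handler; a real product would drive hardware here.
    return f"Setting the {location} lights to {color}"


# Hypothetical mapping from intent names to application handlers.
HANDLERS = {"changeColor": change_color}


def dispatch(inference):
    """Route an inference dictionary to the handler registered for its intent."""
    handler = HANDLERS.get(inference["intent"])
    if handler is None:
        return "Unsupported intent: " + str(inference["intent"])
    # Slot names are passed to the handler as keyword arguments.
    return handler(**inference["slots"])


print(dispatch(json.loads(INFERENCE_JSON)))
# prints: Setting the living room lights to blue
```

Keeping the intent-to-handler mapping in a table means a new intent only needs a new entry, assuming the slot names match the handler's parameters.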

Why Picovoice

Private & Secure: Everything is processed offline. Intrinsically private; HIPAA and GDPR compliant.
Accurate: Resilient to noise and reverberation. Outperforms cloud-based alternatives by wide margins.
Cross-Platform: Design once, deploy anywhere. Build using familiar languages and frameworks. Raspberry Pi, BeagleBone, Android, iOS, Linux (x86_64), macOS (x86_64), Windows (x86_64), and modern web browsers are supported. Enterprise customers can access the ARM Cortex-M SDK.
Self-Service: Design, train, and test voice interfaces instantly in your browser, using Picovoice Console.
Reliable: Runs locally without needing continuous connectivity.
Zero Latency: Edge-first architecture eliminates unpredictable network delay.

Build with Picovoice

Evaluate: The Picovoice SDK is a cross-platform library for adding voice to anything. It includes some pre-trained speech models. The SDK is licensed under Apache 2.0 and available on GitHub to encourage independent benchmarking and integration testing. You are empowered to make a data-driven decision.
Design: Picovoice Console is a cloud-based platform for designing voice interfaces and training speech models, all within your web browser. No machine learning skills are required. Simply describe what you need with text and export trained models.
Develop: Exported models can run on Picovoice SDK without requiring constant connectivity. The SDK runs on a wide range of platforms and supports a large number of frameworks. The Picovoice Console and Picovoice SDK enable you to design, build and iterate fast.
Deploy: Deploy at scale without having to maintain complex cloud infrastructure. Avoid unbounded cloud fees, limitations, and control imposed by big tech.

Platform Features

Custom Wake Words

Picovoice makes use of the Porcupine wake word engine to detect utterances of given wake phrases. You can train custom wake words using Picovoice Console and then run the exported wake word model on the Picovoice SDK.

Intent Inference

Picovoice relies on the Rhino Speech-to-Intent engine to directly infer the user's intent from spoken commands within a given domain of interest (a "context"). You can design and train custom contexts for your product using Picovoice Console. The exported Rhino models can then run with the Picovoice SDK on any supported platform.
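The two features combine into a two-stage flow: wait for the wake word, then infer intent from the follow-on command, then go back to waiting. The toy state machine below illustrates that flow only. `TwoStagePipeline` and the stand-in detector functions are assumptions made for this sketch and do not reflect how Porcupine or Rhino are implemented.

```python
class TwoStagePipeline:
    """Toy model of the wake word -> intent flow; not the Picovoice implementation."""

    def __init__(self, detect_wake_word, infer_intent, on_wake_word, on_inference):
        self._detect_wake_word = detect_wake_word  # frame -> bool
        self._infer_intent = infer_intent  # frame -> result, or None while still listening
        self._on_wake_word = on_wake_word
        self._on_inference = on_inference
        self._awake = False

    def process(self, frame):
        if not self._awake:
            # Stage 1: only the wake word detector sees the audio.
            if self._detect_wake_word(frame):
                self._awake = True
                self._on_wake_word()
        else:
            # Stage 2: the intent engine sees the audio until it finalizes.
            result = self._infer_intent(frame)
            if result is not None:
                self._awake = False
                self._on_inference(result)


# Strings stand in for audio frames to keep the example self-contained.
events = []
pipeline = TwoStagePipeline(
    detect_wake_word=lambda frame: frame == "wake",
    infer_intent=lambda frame: {"intent": "changeColor"} if frame == "end" else None,
    on_wake_word=lambda: events.append("wake word"),
    on_inference=events.append,
)
for frame in ["noise", "wake", "speech", "end", "noise"]:
    pipeline.process(frame)

print(events)
# prints: ['wake word', {'intent': 'changeColor'}]
```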

License & Terms

The Picovoice SDK is free and licensed under Apache 2.0 including the models released within. Picovoice Console offers two types of subscriptions: Personal and Enterprise. Personal accounts can train custom speech models that run on the Picovoice SDK, subject to limitations and strictly for non-commercial purposes. Personal accounts empower researchers, hobbyists, and tinkerers to experiment. Enterprise accounts can unlock all capabilities of Picovoice Console, are permitted for use in commercial settings, and have a path to graduate to commercial distribution.

Picovoice
- Why Picovoice?
- Build with Picovoice
- Platform Features
- Table of Contents
- Language Support
- Performance
- Picovoice Console
- Demos
  - Python
  - NodeJS
  - .NET
  - Java
  - Go
  - Unity
  - Flutter
  - React Native
  - Android
  - iOS
  - Web
  - Rust
  - C
  - Microcontroller
- SDKs
  - Python
  - NodeJS
  - .NET
  - Java
  - Go
  - Unity
  - Flutter
  - React Native
  - Android
  - iOS
  - Web
  - Rust
  - Microcontroller
- Releases
- FAQ

Language Support

English, German, French, and Spanish.
Support for additional languages is available for commercial customers on a case-by-case basis.

Performance

Picovoice makes use of the Porcupine wake word engine to detect utterances of given wake phrases. An open-source benchmark of Porcupine is available here. In summary, compared to the best-performing alternative, Porcupine's standard model is 5.4 times more accurate.

Picovoice relies on the Rhino Speech-to-Intent engine to directly infer the user's intent from spoken commands within a given domain of interest (a "context"). An open-source benchmark of Rhino is available here. Rhino outperforms all major cloud-based alternatives by wide margins.

Picovoice Console

Picovoice Console is a web-based platform for designing, testing, and training voice user interfaces. Using Picovoice Console, you can train custom wake word models and domain-specific NLU (Speech-to-Intent) models.

Demos

If using SSH, clone the repository with:

git clone --recurse-submodules git@github.com:Picovoice/picovoice.git

If using HTTPS, clone the repository with:

git clone --recurse-submodules https://github.com/Picovoice/picovoice.git

Python Demos

sudo pip3 install picovoicedemo

From the root of the repository run the following in the terminal:

picovoice_demo_mic \
--keyword_path resources/porcupine/resources/keyword_files/${PLATFORM}/porcupine_${PLATFORM}.ppn \
--context_path resources/rhino/resources/contexts/${PLATFORM}/smart_lighting_${PLATFORM}.rhn

Replace ${PLATFORM} with the platform you are running the demo on (e.g. raspberry-pi, beaglebone, linux, mac, or windows). The microphone demo opens an audio stream from the microphone, detects utterances of a given wake phrase, and infers intent from the follow-on spoken command. Once the demo initializes, it prints [Listening ...] to the console. Then say:

Porcupine, set the lights in the kitchen to purple.

Upon success, the demo prints the following into the terminal:

[wake word]

{
  intent : 'changeColor'
  slots : {
    location : 'kitchen'
    color : 'purple'
  }
}

For more information regarding Python demos refer to their documentation.
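The `${PLATFORM}` substitution used in the command above can also be done programmatically. This is a minimal sketch assuming the repository layout shown in that command; `demo_model_paths` and `SUPPORTED_PLATFORMS` are hypothetical names introduced here, not part of the demo package.

```python
from pathlib import Path

# Platform names listed for the Python demo above.
SUPPORTED_PLATFORMS = ("raspberry-pi", "beaglebone", "linux", "mac", "windows")


def demo_model_paths(repo_root, platform):
    """Return (keyword_path, context_path) for the smart lighting demo models."""
    if platform not in SUPPORTED_PLATFORMS:
        raise ValueError(f"unsupported platform: {platform}")
    resources = Path(repo_root) / "resources"
    keyword_path = (resources / "porcupine" / "resources" / "keyword_files"
                    / platform / f"porcupine_{platform}.ppn")
    context_path = (resources / "rhino" / "resources" / "contexts"
                    / platform / f"smart_lighting_{platform}.rhn")
    return keyword_path, context_path


keyword_path, context_path = demo_model_paths(".", "linux")
print(keyword_path.as_posix())
print(context_path.as_posix())
```

Rejecting unknown platform names early gives a clearer error than a missing-file failure later on.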

NodeJS Demos

Install the demo package:

npm install -g @picovoice/picovoice-node-demo

From the root of the repository run:

pv-mic-demo \
-k resources/porcupine/resources/keyword_files/${PLATFORM}/porcupine_${PLATFORM}.ppn \
-c resources/rhino/resources/contexts/${PLATFORM}/smart_lighting_${PLATFORM}.rhn

Replace ${PLATFORM} with the platform you are running the demo on (e.g. raspberry-pi, linux, or mac). The microphone demo opens an audio stream from the microphone, detects utterances of a given wake phrase, and infers intent from the follow-on spoken command. Once the demo initializes, it prints Listening for wake word 'porcupine' ... to the console. Then say:

Porcupine, turn on the lights.

Upon success, the demo prints the following into the terminal:

Inference:
{
    "isUnderstood": true,
    "intent": "changeLightState",
    "slots": {
        "state": "on"
    }
}

Please see the demo instructions for details.

.NET Demos

From the root of the repository run the following in the terminal:

dotnet run -p demo/dotnet/PicovoiceDemo/PicovoiceDemo.csproj -c MicDemo.Release -- \
--keyword_path resources/porcupine/resources/keyword_files/${PLATFORM}/porcupine_${PLATFORM}.ppn \
--context_path resources/rhino/resources/contexts/${PLATFORM}/smart_lighting_${PLATFORM}.rhn

Replace ${PLATFORM} with the platform you are running the demo on (e.g. linux, mac, or windows). The microphone demo opens an audio stream from the microphone, detects utterances of a given wake phrase, and infers intent from the follow-on spoken command. Once the demo initializes, it prints Listening... to the console. Then say:

Porcupine, set the lights in the kitchen to orange.

Upon success, the following is printed into the terminal:

[wake word]
{
  intent : 'changeColor'
  slots : {
    location : 'kitchen'
    color : 'orange'
  }
}

For more information about .NET demos go to demo/dotnet.

Java Demos

Make sure there is a working microphone connected to your device. Then invoke the following commands from the terminal:

cd demo/java
./gradlew build
cd build/libs
java -jar picovoice-mic-demo.jar \
-k resources/porcupine/resources/keyword_files/${PLATFORM}/porcupine_${PLATFORM}.ppn \
-c resources/rhino/resources/contexts/${PLATFORM}/smart_lighting_${PLATFORM}.rhn

Porcupine, set the lights in the kitchen to orange.

Upon success, the following is printed into the terminal:

[wake word]
{
  intent : 'changeColor'
  slots : {
    location : 'kitchen'
    color : 'orange'
  }
}

For more information about the Java demos go to demo/java.

Go Demos

The demo requires cgo, which on Windows may mean that you need to install a gcc compiler like Mingw to build it properly.

From demo/go run the following command from the terminal to build and run the mic demo:

go run micdemo/picovoice_mic_demo.go \
-keyword_path "../../resources/porcupine/resources/keyword_files/${PLATFORM}/porcupine_${PLATFORM}.ppn" \
-context_path "../../resources/rhino/resources/contexts/${PLATFORM}/smart_lighting_${PLATFORM}.rhn"

Porcupine, set the lights in the kitchen to orange.

Upon success, the following is printed into the terminal:

[wake word]
{
  intent : 'changeColor'
  slots : {
    location : 'kitchen'
    color : 'orange'
  }
}

For more information about the Go demos go to demo/go.

Unity Demos

To run the Picovoice Unity demo, import the Picovoice Unity package into your project, open the PicovoiceDemo scene and hit play. To run on other platforms or in the player, go to File > Build Settings, choose your platform and hit the Build and Run button.

To browse the demo source go to demo/unity.

Flutter Demos

To run the Picovoice demo on Android or iOS with Flutter, you must have the Flutter SDK installed on your system. Once installed, you can run flutter doctor to determine any other missing requirements for your relevant platform. Once your environment has been set up, launch a simulator or connect an Android/iOS device.

Before launching the app, use the copy_assets.sh script to copy the Picovoice demo assets into the demo project. (NOTE: on Windows, Git Bash or another bash shell is required, or you will have to manually copy the context into the project.)

Run the following command from demo/flutter to build and deploy the demo to your device:

flutter run

Once the application has been deployed, press the start button and say:

Picovoice, turn off the lights in the kitchen.

For the full set of supported commands refer to the demo's readme.

React Native Demos

To run the React Native Picovoice demo app, you'll first need to install yarn and set up your React Native environment. For this, please refer to React Native's documentation. Once your environment has been set up, you can run the following commands:

Running On Android

cd demo/react-native
yarn android-install    # sets up environment
yarn android-run        # builds and deploys to Android

Running On iOS

cd demo/react-native
yarn ios-install        # sets up environment
yarn ios-run            # builds and deploys to iOS

Once the application has been deployed, press the start button and say:

Porcupine, turn off the lights in the kitchen.

For the full set of supported commands refer to the demo's readme.

Android Demos

Using Android Studio, open demo/android/Activity as an Android project and then run the application. Press the start button and say:

Porcupine, turn off the lights in the kitchen.

For the full set of supported commands refer to the demo's readme.

iOS Demos

The BackgroundService demo keeps recording audio in the background even when the application is not in focus. The ForegroundApp demo runs only when the application is in focus.

BackgroundService Demo

To run the demo, go to demo/ios/BackgroundService and run:

pod install

Then, using Xcode, open the generated PicovoiceBackgroundServiceDemo.xcworkspace and run the application.

ForegroundApp Demo

To run the demo, go to demo/ios/ForegroundApp and run:

pod install

Then, using Xcode, open the generated PicovoiceForegroundAppDemo.xcworkspace and run the application.

Wake Word Detection and Context Inference

After running the demo, press the start button and try saying the following:

Picovoice, shut off the lights in the living room.

For more details about the iOS demos and the full set of supported commands refer to the demo's readme.

Web Demos

Vanilla JavaScript and HTML

From demo/web run the following in the terminal:

yarn
yarn start

(or)

npm install
npm run start

Open http://localhost:5000 in your browser to try the demo.

Angular Demos

From demo/angular run the following in the terminal:

yarn
yarn start

(or)

npm install
npm run start

Open http://localhost:4200 in your browser to try the demo.

React Demos

From demo/react run the following in the terminal:

yarn
yarn start

(or)

npm install
npm run start

Open http://localhost:3000 in your browser to try the demo.

Vue Demos

From demo/vue run the following in the terminal:

yarn
yarn serve

(or)

npm install
npm run serve

Open http://localhost:8080 in your browser to try the demo.

Rust Demos

From demo/rust/micdemo run the following command from the terminal to build and run the mic demo:

cargo run --release -- \
--keyword_path "../../../resources/porcupine/resources/keyword_files/${PLATFORM}/porcupine_${PLATFORM}.ppn" \
--context_path "../../../resources/rhino/resources/contexts/${PLATFORM}/smart_lighting_${PLATFORM}.rhn"

Porcupine, set the lights in the kitchen to orange.

Upon success, the following is printed into the terminal:

[wake word]
{
  intent : 'changeColor'
  slots : {
    location : 'kitchen'
    color : 'orange'
  }
}

For more information about the Rust demos go to demo/rust.

C Demos

The C demo requires CMake version 3.4 or higher.

The Microphone demo requires miniaudio for accessing microphone audio data.

Windows requires MinGW to build the demo.

Microphone Demo

At the root of the repository, build with:

cmake -S demo/c/. -B demo/c/build && cmake --build demo/c/build --target picovoice_demo_mic

Linux (x86_64), macOS (x86_64), Raspberry Pi, and BeagleBone

List input audio devices with:

./demo/c/build/picovoice_demo_mic --show_audio_devices

Run the demo using:

./demo/c/build/picovoice_demo_mic \
${LIBRARY_PATH} \
resources/porcupine/lib/common/porcupine_params.pv \
resources/porcupine/resources/keyword_files/${PLATFORM}/picovoice_${PLATFORM}.ppn \
0.5 \
resources/rhino/lib/common/rhino_params.pv \
resources/rhino/resources/contexts/${PLATFORM}/smart_lighting_${PLATFORM}.rhn \
0.5 \
${AUDIO_DEVICE_INDEX}

Replace ${LIBRARY_PATH} with the path to the appropriate library available under sdk/c/lib, ${PLATFORM} with the name of the platform you are running on (linux, raspberry-pi, mac, or beaglebone), and ${AUDIO_DEVICE_INDEX} with the index of your audio device.

Windows

List input audio devices with:

.\\demo\\c\\build\\picovoice_demo_mic.exe --show_audio_devices

Run the demo using:

.\\demo\\c\\build\\picovoice_demo_mic.exe sdk/c/lib/windows/amd64/libpicovoice.dll resources/porcupine/lib/common/porcupine_params.pv resources/porcupine/resources/keyword_files/windows/picovoice_windows.ppn 0.5 resources/rhino/lib/common/rhino_params.pv resources/rhino/resources/contexts/windows/smart_lighting_windows.rhn 0.5 ${AUDIO_DEVICE_INDEX}

Replace ${AUDIO_DEVICE_INDEX} with the index of your audio device.

The demo opens an audio stream and waits for the wake word "Picovoice" to be detected. Once it is detected, it infers your intent from spoken commands in the context of a smart lighting system. For example, you can say:

"Turn on the lights in the bedroom."

File Demo

At the root of the repository, build with:

cmake -S demo/c/. -B demo/c/build && cmake --build demo/c/build --target picovoice_demo_file

Linux (x86_64), macOS (x86_64), Raspberry Pi, and BeagleBone

Run the demo using:

./demo/c/build/picovoice_demo_file \
${LIBRARY_PATH} \
resources/porcupine/lib/common/porcupine_params.pv \
resources/porcupine/resources/keyword_files/${PLATFORM}/picovoice_${PLATFORM}.ppn \
0.5 \
resources/rhino/lib/common/rhino_params.pv \
resources/rhino/resources/contexts/${PLATFORM}/coffee_maker_${PLATFORM}.rhn \
0.5 \
resources/audio_samples/picovoice-coffee.wav

Replace ${LIBRARY_PATH} with the path to the appropriate library available under sdk/c/lib, and ${PLATFORM} with the name of the platform you are running on (linux, raspberry-pi, mac, or beaglebone).

Windows

Run the demo using:

.\\demo\\c\\build\\picovoice_demo_file.exe sdk/c/lib/windows/amd64/libpicovoice.dll resources/porcupine/lib/common/porcupine_params.pv resources/porcupine/resources/keyword_files/windows/picovoice_windows.ppn 0.5 resources/rhino/lib/common/rhino_params.pv resources/rhino/resources/contexts/windows/coffee_maker_windows.rhn 0.5 resources/audio_samples/picovoice-coffee.wav

The demo opens up the WAV file. It detects the wake word and infers the intent in the context of a coffee maker system.

For more information about C demos go to demo/c.
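For readers who want to see what feeding a WAV file to the engine involves, here is a minimal sketch using only the Python standard library. It checks that the audio is single-channel 16-bit PCM and splits it into fixed-size frames. The frame length of 512 and the 16000 Hz sample rate are placeholder assumptions; real values come from the engine, and `read_frames` is a hypothetical helper, not part of the SDK.

```python
import io
import struct
import wave


def read_frames(wav_file, frame_length):
    """Yield tuples of frame_length 16-bit samples from a mono WAV file."""
    with wave.open(wav_file, "rb") as f:
        if f.getnchannels() != 1:
            raise ValueError("audio must be single-channel")
        if f.getsampwidth() != 2:
            raise ValueError("audio must be 16-bit PCM")
        while True:
            data = f.readframes(frame_length)
            if len(data) < frame_length * 2:
                break  # drop the trailing partial frame
            yield struct.unpack(f"<{frame_length}h", data)


# Build a short silent recording in memory so the example needs no files on disk.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(16000)  # assumed sample rate for this sketch
    out.writeframes(struct.pack("<1100h", *([0] * 1100)))
buffer.seek(0)

frames = list(read_frames(buffer, 512))  # 512 is an assumed frame length
print(len(frames))
# prints: 2
```

Each yielded frame could then be passed to an engine's process function, with the trailing partial frame discarded as shown.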

Microcontroller Demos

There are several projects for various development boards inside the mcu demo folder.

SDKs

Python

Install the package:

pip3 install picovoice

Create a new instance of Picovoice:

from picovoice import Picovoice

keyword_path = ...

def wake_word_callback():
    pass

context_path = ...

def inference_callback(inference):
    print(inference.is_understood)
    print(inference.intent)
    print(inference.slots)

handle = Picovoice(
        keyword_path=keyword_path,
        wake_word_callback=wake_word_callback,
        context_path=context_path,
        inference_callback=inference_callback)

handle is an instance of the Picovoice runtime engine. It detects utterances of the wake phrase defined in the file located at keyword_path. Upon detecting the wake word, it starts inferring the user's intent from the follow-on voice command within the context defined by the file located at context_path. keyword_path is the absolute path to the Porcupine wake word engine keyword file (with .ppn extension). context_path is the absolute path to the Rhino Speech-to-Intent engine context file (with .rhn extension). wake_word_callback is invoked upon detection of the wake phrase, and inference_callback is invoked upon completion of follow-on voice command inference.

When instantiated, the required sample rate can be obtained via handle.sample_rate. The expected number of audio samples per frame is handle.frame_length. The engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio. The set of supported commands can be retrieved (in YAML format) via handle.context_info.

def get_next_audio_frame():
    pass

while True:
    handle.process(get_next_audio_frame())

When done, resources have to be released explicitly by calling handle.delete().
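One way to make sure the release always happens, even if processing raises, is a try/finally block around the processing loop. The sketch below relies only on the process and delete methods described above. `run` is a hypothetical helper, and `FakeHandle` is a stand-in so the example runs without the SDK or a microphone.

```python
def run(handle, frames):
    """Feed frames to handle.process() and always call handle.delete() afterwards."""
    try:
        for frame in frames:
            handle.process(frame)
    finally:
        # Runs on normal completion and when process() raises.
        handle.delete()


class FakeHandle:
    """Stand-in for a Picovoice instance so the sketch runs without the SDK."""

    def __init__(self):
        self.processed = 0
        self.deleted = False

    def process(self, frame):
        self.processed += 1

    def delete(self):
        self.deleted = True


handle = FakeHandle()
run(handle, [[0] * 512, [0] * 512])  # 512 is an assumed frame length
print(handle.processed, handle.deleted)
# prints: 2 True
```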

NodeJS

The Picovoice SDK for NodeJS is available from NPM:

yarn add @picovoice/picovoice-node

(or)

npm install @picovoice/picovoice-node

The SDK provides the Picovoice class. Create an instance of this class using a Porcupine keyword (with .ppn extension) and Rhino context file (with .rhn extension), as well as callback functions that will be invoked on wake word detection and command inference completion events, respectively:

const Picovoice = require("@picovoice/picovoice-node");

let keywordCallback = function (keyword) {
  console.log(`Wake word detected`);
};

let inferenceCallback = function (inference) {
  console.log("Inference:");
  console.log(JSON.stringify(inference, null, 4));
};

let handle = new Picovoice(
  keywordArgument,
  keywordCallback,
  contextPath,
  inferenceCallback
);

The keywordArgument can either be a path to a Porcupine keyword file (.ppn), or one of the built-in keywords (integer enums). The contextPath is the path to the Rhino context file (.rhn).

Upon constructing the Picovoice class, send it frames of audio via its process method. Internally, Picovoice will switch between wake word detection and inference. The Picovoice class includes frameLength and sampleRate properties for the format of audio required.

// process audio frames that match the Picovoice requirements (16-bit linear pcm audio, single-channel)
while (true) {
  handle.process(frame);
}

As the audio is processed through the Picovoice engines, the callbacks will fire.

.NET

You can install the latest version of Picovoice by adding the latest Picovoice NuGet package in Visual Studio or using the .NET CLI.

dotnet add package Picovoice

To create an instance of Picovoice, do the following:

using Pv;

string keywordPath = "/absolute/path/to/keyword.ppn";

void wakeWordCallback()
{
    // .. invoked when the wake word is detected
}

string contextPath = "/absolute/path/to/context.rhn";

void inferenceCallback(Inference inference)
{
    // `inference` exposes three immutable properties:
    // (1) `IsUnderstood`
    // (2) `Intent`
    // (3) `Slots`
    // ..
}

Picovoice handle = new Picovoice(keywordPath,
                                 wakeWordCallback,
                                 contextPath,
                                 inferenceCallback);

handle is an instance of the Picovoice runtime engine that detects utterances of the wake phrase defined in the file located at keywordPath. Upon detecting the wake word, it starts inferring the user's intent from the follow-on voice command within the context defined by the file located at contextPath. keywordPath is the absolute path to the Porcupine wake word engine keyword file (with .ppn extension). contextPath is the absolute path to the Rhino Speech-to-Intent engine context file (with .rhn extension). wakeWordCallback is invoked upon detection of the wake phrase, and inferenceCallback is invoked upon completion of follow-on voice command inference.

When instantiated, the required sample rate can be obtained via handle.SampleRate. The expected number of audio samples per frame is handle.FrameLength. The Picovoice engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio.

short[] GetNextAudioFrame()
{
    // .. get audioFrame
    return audioFrame;
}

while(true)
{
    handle.Process(GetNextAudioFrame());
}

Picovoice will have its resources freed by the garbage collector, but to have resources freed immediately after use, wrap it in a using statement:

using(Picovoice handle = new Picovoice(keywordPath, wakeWordCallback, contextPath, inferenceCallback))
{
    // .. Picovoice usage here
}

Java

The Picovoice Java library is available from Maven Central at ai.picovoice:picovoice-java:${version}.

The easiest way to create an instance of the engine is with the Picovoice Builder:

import ai.picovoice.picovoice.*;

String keywordPath = "/absolute/path/to/keyword.ppn";

PicovoiceWakeWordCallback wakeWordCallback = () -> { /* .. */ };

String contextPath = "/absolute/path/to/context.rhn";

PicovoiceInferenceCallback inferenceCallback = inference -> {
    // `inference` exposes three getters:
    // (1) `getIsUnderstood()`
    // (2) `getIntent()`
    // (3) `getSlots()`
    // ..
};

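// Declare the handle outside the try block so it stays in scope afterwards.
Picovoice handle = null;
try {
    handle = new Picovoice.Builder()
                    .setKeywordPath(keywordPath)
                    .setWakeWordCallback(wakeWordCallback)
                    .setContextPath(contextPath)
                    .setInferenceCallback(inferenceCallback)
                    .build();
} catch (PicovoiceException e) {
    // handle initialization error
}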

handle is an instance of the Picovoice runtime engine that detects utterances of the wake phrase defined in the file located at keywordPath. Upon detecting the wake word, it starts inferring the user's intent from the follow-on voice command within the context defined by the file located at contextPath. keywordPath is the absolute path to the Porcupine wake word engine keyword file (with .ppn extension). contextPath is the absolute path to the Rhino Speech-to-Intent engine context file (with .rhn extension). wakeWordCallback is invoked upon detection of the wake phrase, and inferenceCallback is invoked upon completion of follow-on voice command inference.

When instantiated, the required sample rate can be obtained via handle.getSampleRate(). The expected number of audio samples per frame is handle.getFrameLength(). The Picovoice engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio.

short[] getNextAudioFrame()
{
    // .. get audioFrame
    return audioFrame;
}

while(true)
{
    handle.process(getNextAudioFrame());
}

Once you're done with Picovoice, ensure you release its resources explicitly:

handle.delete();

Go

To install the Picovoice Go module to your project, use the command:

go get github.com/Picovoice/picovoice/sdk/go

To create an instance of the engine with default parameters, use the NewPicovoice function. You must provide a Porcupine keyword file, a wake word detection callback function, a Rhino context file, and an inference callback function. You must then make a call to Init().

import (
    . "github.com/Picovoice/picovoice/sdk/go"
    rhn "github.com/Picovoice/rhino/binding/go"
)

keywordPath := "/path/to/keyword/file.ppn"
wakeWordCallback := func(){
    // let user know wake word detected
}

contextPath := "/path/to/context/file.rhn"
inferenceCallback := func(inference rhn.RhinoInference){
    if inference.IsUnderstood {
            intent := inference.Intent
            slots := inference.Slots
        // add code to take action based on inferred intent and slot values
    } else {
        // add code to handle unsupported commands
    }
}

picovoice := NewPicovoice(keywordPath, 
    wakeWordCallback, 
    contextPath, 
    inferenceCallback)

err := picovoice.Init()
if err != nil {
    // handle error
}

Upon detection of the wake word defined by keywordPath, the engine starts inferring the user's intent from the follow-on voice command within the context defined by the file located at contextPath. keywordPath is the absolute path to the Porcupine wake word engine keyword file (with .ppn suffix). contextPath is the absolute path to the Rhino Speech-to-Intent engine context file (with .rhn suffix). wakeWordCallback is invoked upon detection of the wake phrase, and inferenceCallback is invoked upon completion of follow-on voice command inference.

When instantiated, the required sample rate can be obtained via SampleRate. The expected number of audio samples per frame is FrameLength. The engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio.

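func getNextFrameAudio() []int16 {
    var audioFrame []int16
    // .. fill audioFrame with the next frame of audio
    return audioFrame
}

for {
    err := picovoice.Process(getNextFrameAudio())
    if err != nil {
        // handle error
    }
}

When done, resources have to be released explicitly: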

picovoice.Delete()

Unity

Import the Picovoice Unity Package into your Unity project.

The SDK provides two APIs:

High-Level API

PicovoiceManager provides a high-level API that takes care of audio recording. This is the quickest way to get started.

The constructor PicovoiceManager.Create will create an instance of the PicovoiceManager using the Porcupine keyword and Rhino context files that you pass to it.

using Pv.Unity;

PicovoiceManager _picovoiceManager = new PicovoiceManager(
                                "/path/to/keyword/file.ppn",
                                () => {},
                                "/path/to/context/file.rhn",
                                (inference) => {});

Once you have instantiated a PicovoiceManager, you can start/stop audio capture and processing by calling:

try 
{
    _picovoiceManager.Start();
}
catch(Exception ex)
{
    Debug.LogError(ex.ToString());
}

// .. use picovoice

_picovoiceManager.Stop();

PicovoiceManager uses our unity-voice-processor Unity package to capture frames of audio and automatically pass it to the Picovoice platform.

Low-Level API

Picovoice provides low-level access to the Picovoice platform for those who want to incorporate it into an already existing audio processing pipeline.

Picovoice is created by passing a Porcupine keyword file and Rhino context file to the Create static constructor.

using Pv.Unity;

try
{    
    Picovoice _picovoice = Picovoice.Create(
                                "path/to/keyword/file.ppn",
                                OnWakeWordDetected,
                                "path/to/context/file.rhn",
                                OnInferenceResult);
} 
catch (Exception ex) 
{
    // handle Picovoice init error
}

To use Picovoice, you must pass frames of audio to the Process function. The callbacks will automatically trigger when the wake word is detected and then when the follow-on command is detected.

short[] GetNextAudioFrame()
{
    // .. get audioFrame
    return audioFrame;
}

short[] buffer = GetNextAudioFrame();
try 
{
    _picovoice.Process(buffer);
}
catch (Exception ex)
{
    Debug.LogError(ex.ToString());
}

For Process to work correctly, the provided audio must be single-channel and 16-bit linearly-encoded.

Picovoice implements the IDisposable interface, so you can use Picovoice in a using block. If you don't use a using block, resources will be released by the garbage collector automatically or you can explicitly release the resources like so:

_picovoice.Dispose();

Flutter

Add the Picovoice Flutter package to your pubspec.yaml.

dependencies:  
  picovoice: ^<version>

The SDK provides two APIs:

High-Level API

PicovoiceManager provides a high-level API that takes care of audio recording. This class is the quickest way to get started.

The static constructor PicovoiceManager.create will create an instance of a PicovoiceManager using a Porcupine keyword file and Rhino context file that you pass to it.

import 'package:picovoice/picovoice_manager.dart';
import 'package:picovoice/picovoice_error.dart';

void createPicovoiceManager() {  
  _picovoiceManager = PicovoiceManager.create(
      "/path/to/keyword/file.ppn",
      _wakeWordCallback,
      "/path/to/context/file.rhn",
      _inferenceCallback);    
}

The wakeWordCallback and inferenceCallback parameters are functions that you want to execute when a wake word is detected and when an inference is made.

Once you have instantiated a PicovoiceManager, you can start/stop audio capture and processing by calling:

await _picovoiceManager.start();
// .. use for detecting wake words and commands
await _picovoiceManager.stop();

Our flutter_voice_processor Flutter plugin handles audio capture and passes frames to Picovoice for you.

Low-Level API

Picovoice provides low-level access to the Picovoice platform for those who want to incorporate it into an already existing audio processing pipeline.

Picovoice is created by passing a Porcupine keyword file and Rhino context file to the create static constructor. Sensitivity and model files are optional.

import 'package:picovoice/picovoice_manager.dart';
import 'package:picovoice/picovoice_error.dart';

void createPicovoice() async {
    double porcupineSensitivity = 0.7;
    double rhinoSensitivity = 0.6;
    try{
        _picovoice = await Picovoice.create(
            "/path/to/keyword/file.ppn",
            wakeWordCallback,
            "/path/to/context/file.rhn",
            inferenceCallback,
            porcupineSensitivity,
            rhinoSensitivity,
            "/path/to/porcupine/model.pv",
            "/path/to/rhino/model.pv");
    } on PvError catch (err) {
        // handle picovoice init error
    }
}

To use Picovoice, just pass frames of audio to the process function. The callbacks will automatically trigger when the wake word is detected and then when the follow-on command is detected.

List<int> buffer = getAudioFrame();

try {
    _picovoice.process(buffer);
} on PvError catch (error) {
    // handle error
}

// once you are done using Picovoice
_picovoice.delete();

React Native

First add our React Native modules to your project via yarn or npm:

yarn add @picovoice/react-native-voice-processor
yarn add @picovoice/porcupine-react-native
yarn add @picovoice/rhino-react-native
yarn add @picovoice/picovoice-react-native

The @picovoice/picovoice-react-native package exposes a high-level and a low-level API for integrating Picovoice into your application.

High-Level API

PicovoiceManager provides a high-level API that takes care of audio recording. This class is the quickest way to get started.

The static constructor PicovoiceManager.create will create an instance of a PicovoiceManager using a Porcupine keyword file and Rhino context file that you pass to it.

this._picovoiceManager = PicovoiceManager.create(
    '/path/to/keyword/file.ppn',
    wakeWordCallback,
    '/path/to/context/file.rhn',
    inferenceCallback);

The wakeWordCallback and inferenceCallback parameters are functions that you want to execute when a wake word is detected and when an inference is made.

Once you have instantiated a PicovoiceManager, you can start/stop audio capture and processing by calling:

try {
  let didStart = await this._picovoiceManager.start();
} catch(err) { }
// .. use for detecting wake words and commands
let didStop = await this._picovoiceManager.stop();

The @picovoice/react-native-voice-processor module handles audio capture and passes frames to Picovoice for you.

Low-Level API

Picovoice provides low-level access to the Picovoice platform for those who want to incorporate it into an existing audio processing pipeline.

Picovoice is created by passing a Porcupine keyword file and a Rhino context file to the create static constructor. Sensitivity and model files are optional.

async createPicovoice() {
    let porcupineSensitivity = 0.7;
    let rhinoSensitivity = 0.6;

    try {
        this._picovoice = await Picovoice.create(
            '/path/to/keyword/file.ppn',
            wakeWordCallback,
            '/path/to/context/file.rhn',
            inferenceCallback,
            porcupineSensitivity,
            rhinoSensitivity,
            '/path/to/porcupine/model.pv',
            '/path/to/rhino/model.pv');
    } catch (err) {
        // handle error
    }
}

To use Picovoice, just pass frames of audio to the process function. The callbacks will automatically trigger when the wake word is detected and then when the follow-on command is detected.

let buffer = getAudioFrame();

try {
    await this._picovoice.process(buffer);
} catch (e) {
    // handle error
}

// once you are done
this._picovoice.delete();

Android

Picovoice can be found on Maven Central. To include the package in your Android project, ensure you have included mavenCentral() in your top-level build.gradle file and then add the following to your app's build.gradle:

dependencies {
    // ...
    implementation 'ai.picovoice:picovoice-android:1.1.0'
}

There are two possibilities for integrating Picovoice into an Android application.

High-Level API

PicovoiceManager provides a high-level API for integrating Picovoice into Android applications. It manages all activities related to creating an input audio stream, feeding it into Picovoice engine, and invoking user-defined callbacks upon wake word detection and inference completion. The class can be initialized as follows:

import ai.picovoice.picovoice.*;

PicovoiceManager manager = new PicovoiceManager.Builder()
    .setKeywordPath("path/to/keyword/file.ppn")
    .setWakeWordCallback(new PicovoiceWakeWordCallback() {
        @Override
        public void invoke() {
            // logic to execute upon detection of wake word
        }
    })
    .setContextPath("path/to/context/file.rhn")
    .setInferenceCallback(new PicovoiceInferenceCallback() {
        @Override
        public void invoke(final RhinoInference inference) {
            // logic to execute upon completion of intent inference
        }
    })
    .build(appContext);

The appContext parameter is the Android application context - this is used to extract Picovoice resources from the APK.

When initialized, input audio can be processed using:

manager.start();

Stop the manager with:

manager.stop();

Low-Level API

Picovoice.java provides a low-level binding for Android. It can be initialized as follows:

import ai.picovoice.picovoice.*;

try {
    Picovoice picovoice = new Picovoice.Builder()
        .setPorcupineModelPath("/path/to/porcupine/model.pv")
        .setKeywordPath("/path/to/keyword.ppn")
        .setPorcupineSensitivity(0.7f)
        .setWakeWordCallback(new PicovoiceWakeWordCallback() {
            @Override
            public void invoke() {
                // logic to execute upon detection of wake word
            }
        })
        .setRhinoModelPath("/path/to/rhino/model.pv")
        .setContextPath("/path/to/context.rhn")
        .setRhinoSensitivity(0.55f)
        .setInferenceCallback(new PicovoiceInferenceCallback() {
            @Override
            public void invoke(final RhinoInference inference) {
                // logic to execute upon completion of intent inference
            }
        })
        .build(appContext);
} catch(PicovoiceException ex) { }

Once initialized, picovoice can be used to process incoming audio.

private short[] getNextAudioFrame();

while (true) {
    try {
        picovoice.process(getNextAudioFrame());
    } catch (PicovoiceException e) {
        // error handling logic
    }
}

Finally, be sure to explicitly release resources acquired as the binding class does not rely on the garbage collector for releasing native resources:

picovoice.delete();
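The explicit-release rule above is a general pattern for bindings that hold native resources. A minimal sketch in Python, assuming a hypothetical stand-in class `ToyEngine` (it is not part of any Picovoice SDK): the cleanup call sits in a `finally` block so it runs whether or not processing raises.

```python
from typing import Iterable, Sequence


class ToyEngine:
    """Hypothetical stand-in for a binding that holds native resources."""

    def __init__(self) -> None:
        self.released = False

    def process(self, frame: Sequence[int]) -> None:
        if self.released:
            raise RuntimeError("engine already released")

    def delete(self) -> None:
        self.released = True


def run(engine: ToyEngine, frames: Iterable[Sequence[int]]) -> None:
    """Feed every frame to the engine, then always release it."""
    try:
        for frame in frames:
            engine.process(frame)
    finally:
        # Runs on normal exit and also when process() raises.
        engine.delete()
```

The same shape applies on Android: call delete() from a finally block or from the lifecycle method that tears down the component owning the engine.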

iOS

The Picovoice iOS SDK is available via CocoaPods. To import it into your iOS project, install CocoaPods and add the following line to your Podfile:

pod 'Picovoice-iOS'

There are two possibilities for integrating Picovoice into an iOS application.

High-Level API

PicovoiceManager class manages all activities related to creating an audio input stream, feeding it into Picovoice engine, and invoking user-defined callbacks upon wake word detection and completion of intent inference. The class can be initialized as below:

import Picovoice

let manager = PicovoiceManager(
    keywordPath: "/path/to/keyword.ppn",
    onWakeWordDetection: {
        // logic to execute upon detection of wake word
    },
    contextPath: "/path/to/context.rhn",
    onInference: { inference in
        // logic to execute upon completion of intent inference
    })

When initialized, input audio can be processed using manager.start(). The processing can be interrupted using manager.stop().

Low-Level API

Picovoice.swift provides an API for passing audio from your own audio pipeline into the Picovoice Platform for wake word detection and intent inference.

To construct an instance, you'll need to provide a Porcupine keyword file (.ppn), a Rhino context file (.rhn), and callbacks for when the wake word is detected and an inference is made. Sensitivity and model parameters are optional.

import Picovoice

do {
    let picovoice = try Picovoice(
        keywordPath: "/path/to/keyword.ppn",
        porcupineSensitivity: 0.4,
        porcupineModelPath: "/path/to/porcupine/model.pv",
        onWakeWordDetection: {
            // logic to execute upon detection of wake word
        },
        contextPath: "/path/to/context.rhn",
        rhinoSensitivity: 0.7,
        rhinoModelPath: "/path/to/rhino/model.pv",
        onInference: { inference in
            // logic to execute upon completion of intent inference
        })
} catch { }

Once initialized, picovoice can be used to process incoming audio. The underlying logic of the class will handle switching between wake word detection and intent inference, as well as invoking the associated events.

func getNextAudioFrame() -> [Int16] {
    // .. get audioFrame
    return audioFrame;
}

while (true) {
    do {
        try picovoice.process(getNextAudioFrame());
    } catch { }
}
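The switching between wake word detection and intent inference can be pictured as a two-state loop. The sketch below is a toy model in Python, not the SDK's implementation: `ToyPicovoice`, `detect_wake_word`, and `infer` are hypothetical stand-ins for the Porcupine and Rhino engines.

```python
from typing import Callable, Optional, Sequence


class ToyPicovoice:
    """Toy two-state loop: wait for the wake word, then infer intent.

    Hypothetical illustration only; the real engines are replaced by
    plain callables supplied by the caller.
    """

    def __init__(
        self,
        detect_wake_word: Callable[[Sequence[int]], bool],
        infer: Callable[[Sequence[int]], Optional[dict]],
        on_wake_word: Callable[[], None],
        on_inference: Callable[[dict], None],
    ) -> None:
        self._detect_wake_word = detect_wake_word
        self._infer = infer
        self._on_wake_word = on_wake_word
        self._on_inference = on_inference
        self.wake_word_detected = False

    def process(self, frame: Sequence[int]) -> None:
        if not self.wake_word_detected:
            # State 1: only the wake word detector sees the audio.
            if self._detect_wake_word(frame):
                self.wake_word_detected = True
                self._on_wake_word()
        else:
            # State 2: frames go to intent inference until it finalizes.
            inference = self._infer(frame)
            if inference is not None:
                self.wake_word_detected = False
                self._on_inference(inference)
```

Each call to process() handles exactly one frame, so the caller's audio loop stays the same regardless of which state the engine is in.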

Once you're done with an instance of Picovoice, you can explicitly release its native resources rather than waiting for the instance to be deallocated:

picovoice.delete();

Web

The Picovoice SDK for Web is available on modern web browsers (i.e. not Internet Explorer) via WebAssembly. Microphone audio is handled via the Web Audio API and is abstracted by the WebVoiceProcessor, which also handles downsampling to the correct format. Picovoice is provided pre-packaged as a Web Worker.
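To give a feel for what downsampling to the engine's format involves, here is a rough Python sketch. It is not how WebVoiceProcessor is implemented: it assumes float samples in [-1, 1] at 48 kHz, reduces them to 16 kHz by averaging groups of three, and converts the result to 16-bit integers. `downsample_to_int16` is a hypothetical helper.

```python
from typing import List, Sequence


def downsample_to_int16(samples: Sequence[float], factor: int) -> List[int]:
    """Average each group of `factor` float samples and convert to 16-bit PCM.

    Naive box-filter decimation for illustration; production resamplers
    use proper low-pass filtering.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    out = []
    # Only complete groups are used; a trailing partial group is dropped.
    for start in range(0, len(samples) - factor + 1, factor):
        mean = sum(samples[start:start + factor]) / factor
        clipped = max(-1.0, min(1.0, mean))
        out.append(int(round(clipped * 32767)))
    return out
```

With factor 3, every three input samples at 48 kHz produce one output sample at 16 kHz.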

Each spoken language is available as a dedicated npm package (e.g. @picovoice/picovoice-web-en-worker). These packages can be used with the @picovoice/web-voice-processor. They can also be used with the Angular, React, and Vue bindings, which abstract and hide the web worker communication details.

Vanilla JavaScript and HTML (CDN Script Tag)

<!DOCTYPE html>
<html lang="en">
  <head>
    <script src="https://unpkg.com/@picovoice/picovoice-web-en-worker/dist/iife/index.js"></script>
    <script src="https://unpkg.com/@picovoice/web-voice-processor/dist/iife/index.js"></script>
    <script type="application/javascript">
      const RHINO_CONTEXT_BASE64 = /* Base64 representation of Rhino .rhn file */;

      async function startPicovoice() {
        console.log("Picovoice is loading. Please wait...");
        const picovoiceWorker = await PicovoiceWebEnWorker.PicovoiceWorkerFactory.create(
          {
            porcupineKeyword: { builtin: "Picovoice" },
            rhinoContext: { base64: RHINO_CONTEXT_BASE64 },
            start: true,
          }
        );

        console.log("Picovoice worker ready!");

        picovoiceWorker.onmessage = (msg) => {
          switch (msg.data.command) {
            case "ppn-keyword": {
              console.log(
                "Wake word detected: " + JSON.stringify(msg.data.keywordLabel)
              );
              break;
            }
            case "rhn-inference": {
              console.log(
                "Inference detected: " + JSON.stringify(msg.data.inference)
              );
              break;
            }
          }
        };

        console.log(
          "WebVoiceProcessor initializing. Microphone permissions requested ..."
        );

        try {
          let webVp = await WebVoiceProcessor.WebVoiceProcessor.init({
            engines: [picovoiceWorker],
            start: true,
          });
          console.log(
            "WebVoiceProcessor ready! Say 'Picovoice' to start the interaction."
          );
        } catch (e) {
          console.log("WebVoiceProcessor failed to initialize: " + e);
        }
      }

      document.addEventListener("DOMContentLoaded", function () {
        startPicovoice();
      });
    </script>
  </head>
  <body>
  </body>
</html>

Vanilla JavaScript and HTML (ES Modules)

yarn add @picovoice/picovoice-web-en-worker @picovoice/web-voice-processor

(or)

npm install @picovoice/picovoice-web-en-worker @picovoice/web-voice-processor

import { WebVoiceProcessor } from "@picovoice/web-voice-processor";
import { PicovoiceWorkerFactory } from "@picovoice/picovoice-web-en-worker";
 
async function startPicovoice() {
  // Create a Picovoice Worker (English language) to listen for
  // the built-in keyword "Picovoice" and follow-on commands in the given Rhino context.
  // Note: you receive a Web Worker object, _not_ an individual Picovoice instance
  const picovoiceWorker = await PicovoiceWorkerFactory.create(
    {
      porcupineKeyword: { builtin: "Picovoice" },
      rhinoContext: { base64: RHINO_CONTEXT_BASE64 },
      start: true,
    }
  );
 
  // The worker will send a message with data.command = "ppn-keyword" upon a detection event
  // And data.command = "rhn-inference" when the follow-on inference concludes.
  // Here, we tell it to log it to the console:
  picovoiceWorker.onmessage = (msg) => {
    switch (msg.data.command) {
      case 'ppn-keyword':
        // Wake word detection
        console.log("Wake word: " + msg.data.keywordLabel);
        break;
      case 'rhn-inference':
        // Follow-on command inference concluded
        console.log("Inference: " + JSON.stringify(msg.data.inference));
        break;
      default:
        break;
    }
  };
 
  // Start up the web voice processor. It will request microphone permission
  // and immediately (start: true) start listening.
  // It downsamples the audio to voice recognition standard format (16-bit 16kHz linear PCM, single-channel)
  // The incoming microphone audio frames will then be forwarded to the Picovoice Worker
  // n.b. This promise will reject if the user refuses permission! Make sure you handle that possibility.
  const webVp = await WebVoiceProcessor.init({
    engines: [picovoiceWorker],
    start: true,
  });
}
 
startPicovoice()
 
...
 
// Finished with Picovoice? Release the WebVoiceProcessor and the worker.
if (done) {
  webVp.release();
  picovoiceWorker.postMessage({ command: "release" });
}

Angular

yarn add @picovoice/picovoice-web-angular @picovoice/picovoice-web-en-worker

(or)

npm install @picovoice/picovoice-web-angular @picovoice/picovoice-web-en-worker

import { Subscription } from "rxjs"
import { PicovoiceService } from "@picovoice/picovoice-web-angular"
 
...
 
  constructor(private picovoiceService: PicovoiceService) {
    // Subscribe to Picovoice Keyword detections
    // Store each detection so we can display it in an HTML list
    this.keywordDetection = picovoiceService.keyword$.subscribe(
      keywordLabel => this.detections = [...this.detections, keywordLabel])
    // Subscribe to Rhino Inference events
    // Show the latest one in the widget
    this.inferenceDetection = picovoiceService.inference$.subscribe(
      inference => this.latestInference = inference)
  }

  async ngOnInit() {
    // Load Picovoice worker chunk with specific language model (large ~4-6MB chunk; dynamically imported)
    const pvFactoryEn = (await import('@picovoice/picovoice-web-en-worker')).PicovoiceWorkerFactory
    // Initialize Picovoice Service
    try {
      await this.picovoiceService.init(pvFactoryEn, {
        // Built-in wake word
        porcupineKeyword: { builtin: "Hey Google", sensitivity: 0.6 },
        // Rhino context (Base64 representation of a `.rhn` file)
        rhinoContext: { base64: RHINO_CONTEXT_BASE64 },
        start: true
      })
    } catch (error) {
      console.error(error)
    }
  }

  ngOnDestroy() {
    this.keywordDetection.unsubscribe()
    this.inferenceDetection.unsubscribe()
    this.picovoiceService.release()
  }

React

yarn add @picovoice/picovoice-web-react @picovoice/picovoice-web-en-worker

(or)

npm install @picovoice/picovoice-web-react @picovoice/picovoice-web-en-worker

import React, { useState } from 'react';
import { PicovoiceWorkerFactory } from '@picovoice/picovoice-web-en-worker';
import { usePicovoice } from '@picovoice/picovoice-web-react';
 
const RHINO_CONTEXT_BASE64 = /* Base64 representation of English-language `.rhn` file, omitted for brevity */
 
export default function VoiceWidget() {
  const [keywordDetections, setKeywordDetections] = useState([]);
  const [inference, setInference] = useState(null);
 
  const inferenceEventHandler = (rhinoInference) => {
    console.log(rhinoInference);
    setInference(rhinoInference);
  };
 
  const keywordEventHandler = (porcupineKeywordLabel) => {
    console.log(porcupineKeywordLabel);
    setKeywordDetections((x) => [...x, porcupineKeywordLabel]);
  };
 
  const {
    isLoaded,
    isListening,
    isError,
    errorMessage,
    start,
    resume,
    pause,
    engine,
  } = usePicovoice(
    PicovoiceWorkerFactory,
    {
      // "Picovoice" is one of the builtin wake words, so we merely need to ask for it by name.
      // To use a custom wake word, you supply the `.ppn` files in base64 and provide a label for it.
      porcupineKeyword: "Picovoice",
      rhinoContext: { base64: RHINO_CONTEXT_BASE64 },
      start: true,
    },
    keywordEventHandler,
    inferenceEventHandler
  );
 
  return (
    <div className="voice-widget">
      <h3>Engine: {engine}</h3>
      <h3>Keyword Detections:</h3>
      {keywordDetections.length > 0 && (
        <ul>
          {keywordDetections.map((label, index) => (
            <li key={index}>{label}</li>
          ))}
        </ul>
      )}
      <h3>Latest Inference:</h3>
      {JSON.stringify(inference)}
    </div>
  );
}

Vue

yarn add @picovoice/picovoice-web-vue @picovoice/picovoice-web-en-worker

(or)

npm install @picovoice/picovoice-web-vue @picovoice/picovoice-web-en-worker

<template>
  <div class="voice-widget">
    <Picovoice
      v-bind:picovoiceFactoryArgs="{ start: true, porcupineKeyword: 'Picovoice', rhinoContext: { base64: '... Base64 representation of a .rhn file ...' } }"
      v-bind:picovoiceFactory="factory"
      v-on:pv-init="pvInitFn"
      v-on:pv-ready="pvReadyFn"
      v-on:ppn-keyword="pvKeywordFn"
      v-on:rhn-inference="pvInferenceFn"
      v-on:pv-error="pvErrorFn"
    />
    <h3>Keyword Detections:</h3>
    <ul v-if="detections.length > 0">
      <li v-for="(item, index) in detections" :key="index">{{ item }}</li>
    </ul>
  </div>
</template>
<script>
import Picovoice from "@picovoice/picovoice-web-vue";
import { PicovoiceWorkerFactory as PicovoiceWorkerFactoryEn } from "@picovoice/picovoice-web-en-worker";
 
export default {
  name: "VoiceWidget",
  components: {
    Picovoice,
  },
  data: function() {
    return {
      detections: [],
      isError: null,
      isLoaded: false,
      factory: PicovoiceWorkerFactoryEn,
    };
  },
  methods: {
    pvInitFn: function () {
      this.isError = false;
    },
    pvReadyFn: function () {
      this.isLoaded = true;
      this.isListening = true;
      this.engine = 'ppn'
    },
    pvKeywordFn: function (keyword) {
      this.detections = [...this.detections, keyword];
      this.engine = 'rhn'
    },
    pvInferenceFn: function (inference) {
      this.inference = inference;
      this.engine = 'ppn'
    },
    pvErrorFn: function (error) {
      this.isError = true;
      this.errorMessage = error.toString();
    },
  },
};
</script>

Rust

To add the picovoice library into your app, add picovoice to your app's Cargo.toml manifest:

[dependencies]
picovoice = "*"

To create an instance of the engine with default parameters, use PicovoiceBuilder. You must provide a Porcupine keyword file, a wake word detection callback function, a Rhino context file, and an inference callback function. You must then make a call to init():

use picovoice::{rhino::RhinoInference, PicovoiceBuilder};

let wake_word_callback = || {
    // let user know wake word detected
};
let inference_callback = |inference: RhinoInference| {
    if inference.is_understood {
        let intent = inference.intent.unwrap();
        let slots = inference.slots;
        // add code to take action based on inferred intent and slot values
    } else {
        // add code to handle unsupported commands
    }
};

let mut picovoice = PicovoiceBuilder::new(
    keyword_path,
    wake_word_callback,
    context_path,
    inference_callback,
).init().expect("Failed to create picovoice");

Upon detection of the wake word defined by keyword_path, the engine starts inferring the user's intent from the follow-on voice command within the context defined by the file located at context_path. keyword_path is the absolute path to the Porcupine wake word engine keyword file (with .ppn suffix). context_path is the absolute path to the Rhino Speech-to-Intent engine context file (with .rhn suffix). wake_word_callback is invoked upon detection of the wake phrase, and inference_callback is invoked upon completion of the follow-on voice command inference.
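What an inference callback does with its result is up to the application. One common shape is a lookup table from intent name to handler. The sketch below shows that idea in Python using the changeColor example from the introduction; `HANDLERS`, `change_color`, and `handle_inference` are hypothetical names, not SDK functions.

```python
from typing import Callable, Dict


def change_color(slots: Dict[str, str]) -> str:
    """Example handler: describe the action for a changeColor intent."""
    return "set {} lights to {}".format(slots.get("location", "all"), slots["color"])


# Hypothetical mapping from intent names to handler functions.
HANDLERS: Dict[str, Callable[[Dict[str, str]], str]] = {
    "changeColor": change_color,
}


def handle_inference(is_understood: bool, intent: str, slots: Dict[str, str]) -> str:
    """Route an inference result to the handler registered for its intent."""
    if not is_understood:
        return "command not understood"
    handler = HANDLERS.get(intent)
    if handler is None:
        return "no handler for intent '{}'".format(intent)
    return handler(slots)


print(handle_inference(True, "changeColor", {"location": "living room", "color": "blue"}))
# prints: set living room lights to blue
```

Keeping the table separate from the callback makes it easy to add intents as the Rhino context grows.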

Once instantiated, the required sample rate can be obtained via sample_rate(). The expected number of audio samples per frame is given by frame_length(). The engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio:

fn next_audio_frame() -> Vec<i16> {
    // get audio frame (placeholder: return frame_length() samples from your audio source)
    unimplemented!()
}

loop {
    picovoice.process(&next_audio_frame()).expect("Picovoice failed to process audio");
}
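Because the engine consumes fixed-size frames, a longer recording has to be cut into chunks of frame_length() samples before it is passed to process(). A minimal Python sketch of that step, assuming the audio is already 16-bit mono at the engine's sample rate; `split_into_frames` is a hypothetical helper, not part of the SDK:

```python
from typing import Iterator, List, Sequence


def split_into_frames(samples: Sequence[int], frame_length: int) -> Iterator[List[int]]:
    """Yield consecutive frames of exactly `frame_length` samples.

    A trailing partial frame is dropped here; a real pipeline would
    usually keep it and prepend it to the next chunk of audio.
    """
    if frame_length <= 0:
        raise ValueError("frame_length must be positive")
    full_frames = len(samples) // frame_length
    for i in range(full_frames):
        start = i * frame_length
        yield list(samples[start:start + frame_length])
```

For example, ten samples with a frame length of four yield two full frames, and the last two samples are left over.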

Microcontroller

Picovoice is implemented in ANSI C and therefore can be directly linked to embedded C projects. Its public header file contains relevant information. An instance of the Picovoice object can be constructed as follows:

#define MEMORY_BUFFER_SIZE ...
static uint8_t memory_buffer[MEMORY_BUFFER_SIZE] __attribute__((aligned(16)));

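// declared as an array (not a pointer) so that sizeof(keyword_array) below yields the model size
static const uint8_t keyword_array[] = { ... };
const float porcupine_sensitivity = 0.5f;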

static void wake_word_callback(void) {
    // logic to execute upon detection of wake word
}

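// declared as an array (not a pointer) so that sizeof(context_array) below yields the context size
static const uint8_t context_array[] = { ... };
const float rhino_sensitivity = 0.75f;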

static void inference_callback(pv_inference_t *inference) {
    // `inference` exposes three immutable properties:
    // (1) `IsUnderstood`
    // (2) `Intent`
    // (3) `Slots`
    // ..
    pv_inference_delete(inference);
}

pv_picovoice_t *handle = NULL;

const pv_status_t status = pv_picovoice_init(
        MEMORY_BUFFER_SIZE,
        memory_buffer,
        sizeof(keyword_array),
        keyword_array,
        porcupine_sensitivity,
        wake_word_callback,
        sizeof(context_array),
        context_array,
        rhino_sensitivity,
        inference_callback,
        &handle);

if (status != PV_STATUS_SUCCESS) {
    // error handling logic
}

Sensitivity is the parameter that enables developers to trade miss rate for false alarm rate. It is a floating-point number within [0, 1]. A higher sensitivity reduces the miss rate (false reject rate) at the cost of an increased false alarm rate.
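The trade-off can be made concrete with a toy detector. The Python sketch below is an illustration only and says nothing about how Picovoice computes or uses sensitivity internally: it assumes a detector that fires when a score reaches 1 - sensitivity, and the scores are made up.

```python
from typing import List, Tuple


def rates(scores_wake: List[float], scores_other: List[float], sensitivity: float) -> Tuple[float, float]:
    """Return (miss_rate, false_alarm_rate) for a toy score-threshold detector.

    Assumption for illustration: the detector fires when score >= 1 - sensitivity.
    """
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be within [0, 1]")
    threshold = 1.0 - sensitivity
    misses = sum(1 for s in scores_wake if s < threshold)
    false_alarms = sum(1 for s in scores_other if s >= threshold)
    return misses / len(scores_wake), false_alarms / len(scores_other)


# Made-up scores: utterances of the wake word vs. other speech.
wake = [0.9, 0.8, 0.6, 0.4]
other = [0.7, 0.3, 0.2, 0.1]

low = rates(wake, other, 0.25)    # (0.5, 0.0): half the wake words missed, no false alarms
high = rates(wake, other, 0.75)   # (0.0, 0.5): nothing missed, half of the other speech triggers
```

Raising the sensitivity from 0.25 to 0.75 removes the misses in this toy data set and introduces false alarms, which is the direction of the trade-off described above.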

handle is an instance of Picovoice runtime engine that detects utterances of wake phrase defined in keyword_array. Upon detection of wake word it starts inferring user's intent from the follow-on voice command within the context defined in context_array. wake_word_callback is invoked upon the detection of wake phrase and inference_callback is invoked upon completion of follow-on voice command inference.

Picovoice accepts single-channel, 16-bit PCM audio. The sample rate can be retrieved using pv_sample_rate(). Picovoice accepts input audio in consecutive chunks (frames); the length of each frame can be retrieved using pv_porcupine_frame_length().

extern const int16_t *get_next_audio_frame(void);

while (true) {
    const int16_t *pcm = get_next_audio_frame();
    const pv_status_t status = pv_picovoice_process(handle, pcm);
    if (status != PV_STATUS_SUCCESS) {
        // error handling logic
    }
}

Finally, when done be sure to release the acquired resources.

pv_picovoice_delete(handle);

Releases

v1.1.0 - December 2, 2020

Improved accuracy.
Runtime optimizations.
.NET SDK.
Java SDK.
React Native SDK.
C SDK.

v1.0.0 - October 22, 2020

Initial release.

FAQ

You can find the FAQ here.

Make sure you have read the documentation, and have put forth a reasonable effort to find an existing answer.

Expected behaviour

When I list custom words, the accuracy should improve.

Actual behaviour

Accuracy declines:

Original text:

The lesion was circumscribed in an ellipse and the subcutaneous tissue was taken down with cautery to expose the anterior fascia. With the anterior fascia exposed, the fistulous tract was identified and mobilized free from the fascia. It was then stapled free and sent off the table. This was done with a GIA 55. The anterior fascia was then closed with interrupted 2-0 Vicryl sutures. Prior to doing this, the staple line was also oversewn with 2- 0 Vicryl sutures. The wound was irrigated with normal saline. The subcutaneous tissue was loosely closed with 3-0 Vicryl suture. The skin was loosely closed with 4-0 Monocryl suture. Sterile dressings were applied.

The child did appear to have drink something prior to being intubated and did have a significant episode of clear emesis. We will be watching her overnight to make sure there is no problem with aspiration. In addition, we will watch her overnight and keep her n.p.o. while she was doing well, we will start a diet on her tomorrow.

transcription without custom words :

THE ALEUTIAN WAS CIRCUMSCRIBED IN AN ELLIPSE AND THE SUBCUTANEOUS TISSUE WAS TAKING DOWN WITH CONTRARY TO EXPOSE THE INTERIOR FAR SHEER NEW LINE WITH THE ANTERIOR FASHION EXPOSED CALMER THE FACELESS TRACK WAS IDENTIFIED AND MOBILIZED FREE FROM THE FASHION PERIOD IT WAS THEN STABLED FREE AND SCENT OF THE TABLE PERIOD THIS WAS DONE WITH A GUY A FIFTY FIVE PERIOD THE ANTERIOR FACIAL WAS THEN CLOSED WITH INTERRUPTED TO VIRAL SUITORS PERIOD PROUD TO DO IN THIS THE STABLE LINE WAS ALSO OVERSEWN WITH TO OR VEHICLE SUITORS PERIOD THE WOUND WAS IRRIGATED WITH NORMAL SELLING THE SUBCUTANEOUS TISSUE WAS LOOSELY CLOSED WITH THREE OVER CROW SUITORS PERIOD THE SKIN WAS LOOSELY GLOWED WITH FOR ALL WHEN A GROSS SUITORS PERIOD STARE ARE DRESSES WERE APPLIED PERIOD NEW PARAGRAPH THE CHILD DID APPEAR TO HAVE DARUG SOMETHING PRIOR TO BEEN INTUBATED AND DID HAVE A SIGNIFICANT EPISODE OF CLAIM AS ITS PERIOD WE WILL BE WATCHING HER OVERNIGHT TO MAKE SURE THERE IS NO PROBLEM WITH ASPIRATION PERIOD IN ADDITION CALMER WE WILL WATCH HER OVERNIGHT AND KEEP HER NYPL WHILE SHE IS DOING WELL COMER BUT WILL START A DIET OF HER TOMORROW PERIOD

transcription with custom words: THE RELATION WAS THE CONSCRIBED HER LIPS AND THE SUCCESSIVITY SHE WAS TAKING DOWN THE CONTRARY TO EXPOSE THE INTERIOR BISHOP WITH THE AND SERIOUSLY SHAPED STONE THE KRISTELLER STRIKE WAS IDENTIFIED AND MOBILIZED FREE FROM THE SESSION IT WAS THEN STABLE FREE AND SENT ON THE TABLE IT THIS WAS DONE WITH A TO BUY A SEED THE SIDE THE UNTIL THE FISHER WAS THEN CLOSED TO IN INTERRUPT AND PEOPLE LIKE RESEARCHERS BRIAN TO DOING THIS THE STATE OF LIFE WAS ALSO A MUSCLE WITH TO HIM MICHAEL SUITORS THE WOUND WAS IRRITATED WITH MARBLE SAILING THE SUB PIGEON OF TISSUE WAS LOOSELY CLOSED WITH GREAT AND BAIKAL CERTAIN THE SKIN WAS MAINLY CLOSES FOR MORE VICRYL AN OPEN SECRET STERILE DRESSES WERE APPLIED THE CHILD SLID UP INTO A HOT DRINK SOMETHING FIRE TO BEEN INTEGRATED AND THEY HAVE A SIGNATURE AND EPISODE OF THEIR AMISES WE WILL BE WATCHING OUT OF THE NIGHT TO MAKE SURE THERE IS NO PROBLEM WITH ASPIRATION IN ADDITION WE WILL WORK OUT OF THE MIGHT HAVE MAKE HALF AND PEOPLE WHILE SHE IS DOING WELL WE WILL START TO THE GUISE OF ANTINOUS

copy of yml file of custom words:

new:
  npo:
    - n p oʊ
  emesis:
    - ɛ m ɪ s ɪ s
  sterile:
    - s t ɛ ɹ ɑ ɪ l
    - s t ɛ ɹ e l
  monocryl:
    - m oʊ n oʊ K ɹ i l
  four oh:
    - f oʊ ɹ oʊ
  three oh:
    - θ ɹ i oʊ
  saline:
    - s ɑ l i n
  wound:
    - w u n d
  sutures:
    - s u t ʃ ɝ
  vicryl:
    - v aɪ K ɹ i l
  two oh:
    - t ʊ oʊ
  staple:
    - s t ɔ p l
  fistulous:
    - f i s t ʊ l oʊ s
  fascia:
    - f æ ʃ aɪ
  cautery:
    - K ɔ t e ɹ i
  lesion:
    - l ɪ ʃ ɔ n
boost:
  - period
  - comma

Steps to reproduce the behaviour

All transcriptions were made in the web console on a MacBook Pro (2.6 GHz 6-core Intel Core i7) running macOS Big Sur.

The first was recorded using the built-in microphone at home with minimal ambient noise, and the second using AirPods in a Starbucks café with a moderate noise level.

QUESTIONS

I have read through the documentation and other questions on this github forum. The docs say we don't need voice data to train models and only typing the text trains the models. My basic test for using in a medical environment does not seem to work. I had also raised an issue with the abbreviated IPA not representing all the sounds to create custom words.

IS THERE ANY OTHER WAY TO TRAIN CUSTOM WORDS TO INCREASE ACCURACY.

HOW DOES PICOVOICE DEAL WITH ABBREVIATIONS LIKE GIA IN THE MODEL WITHOUT TRANSCRIBING THEM AS WORDS.

HOW DOES CHEETAH DEAL WITH NUMBERS

I don't mind joining the starter tier to resolve this, but as the payment is for the whole year (about $12,000, price for enterprise not listed), I want to be sure it is possible and ensure the accuracy is acceptable for production within the predicted noise environments it would be deployed before committing that amount.

I also have a similar issue with rhino but would open a new issue for that.

Thank you for your help.

(Include enough details so that the issue can be reproduced independently.)

help wanted

ai.picovoice.rhino.RhinoActivationException: Initialization failed. with Java SDK
I'm using the Java SDK on a Fedora 34 Linux system. With my own project as well as with the MicDemo, I only get ai.picovoice.rhino.RhinoActivationException: Initialization failed. on start, and with the demo also:

[INFO] Porcupine model path : '/tmp/porcupine-mic-demo/porcupine/lib/common/porcupine_params.pv'
[INFO] Porcupine keyword path [0] : '/tmp/porcupine-mic-demo/porcupine/resources/keyword_files/linux/porcupine_linux.ppn'
[INFO] Porcupine sensitivity [0] : 0,50
help wanted
opened by MineKing9534 36
Picovoice Issue: "Attempt to renew a license nearing expiry was unsuccessful - will continue to use current license"

Make sure you have read the documentation, and have put forth a reasonable effort to find an existing answer.

Expected behaviour

We are running Picovoice Rhino on a Jetson Xavier NX via the picovoice_ros node. Up until this morning it has worked very well for almost one month.

Actual behaviour

As of today we are receiving the following error:

[WARN] Attempt to renew a license nearing expiry was unsuccessful - will continue to use current license.

Picovoice appears to be continuously attempting to reconnect which is causing gaps in the "listening" function which leads to missed detections.
question

opened by masonearles 14
Struggling with 'OUT_OF_MEMORY' error when using FreeRTOS with STM32CubeIDE
Hello,

I'm using the STM32F769-Disco board and have been struggling for two days with an 'OUT_OF_MEMORY' error returned by pv_picovoice_init()! I'm using Picovoice in an RTOS context (FreeRTOS). I've increased all possible memory regions as follows:

1- Heap and stack size of the application in the linker file:
_Min_Heap_Size = 0xF000 ; /* required amount of heap: 60kB */
_Min_Stack_Size = 0xF000 ; /* required amount of stack: 60kB */

2- FreeRTOS heap size: #define configTOTAL_HEAP_SIZE ((size_t) (250 * 1024))

3- Stack size of the task managing picovoice: #define configPICOVOICE_TASK_STK_SIZE ( 0x8000 )

4- Picovoice memory buffer, relocated to the end of SDRAM and increased in size to 128kB:
#define SDRAM_DEVICE_ADDR ((uint32_t)0xC0000000)
#define SDRAM_DEVICE_SIZE ((uint32_t)0x1000000) /* SDRAM device size in MBytes */
#define MEMORY_BUFFER_SIZE (128 * 1024) // 128kB instead of 70kB
//static int8_t memory_buffer[MEMORY_BUFFER_SIZE] __attribute__((aligned(16))); // Commented out

In the task:
int8_t* memory_buffer = NULL;
uint32_t membuffer_start_address = 0;
/* memory_buffer will be located at the end of the SDRAM */
membuffer_start_address = (SDRAM_DEVICE_ADDR+SDRAM_DEVICE_SIZE) - MEMORY_BUFFER_SIZE;
membuffer_start_address = membuffer_start_address - (membuffer_start_address%16);
memory_buffer = (int8_t *)membuffer_start_address;
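As a quick check of the address arithmetic quoted in this report, the same calculation can be replayed in Python with the values given above. This only verifies the numbers; it says nothing about the cause of the OUT_OF_MEMORY error.

```python
# Values copied from the #define lines in the report.
SDRAM_DEVICE_ADDR = 0xC0000000
SDRAM_DEVICE_SIZE = 0x1000000
MEMORY_BUFFER_SIZE = 128 * 1024

# Start of a buffer placed at the very end of SDRAM.
start = (SDRAM_DEVICE_ADDR + SDRAM_DEVICE_SIZE) - MEMORY_BUFFER_SIZE
# Round down to a 16-byte boundary, as the task code does.
aligned = start - (start % 16)

print(hex(start), hex(aligned), start % 16)   # 0xc0fe0000 0xc0fe0000 0
```

With these values the start address is already 16-byte aligned, so the rounding step leaves it unchanged and the buffer ends exactly at the end of the SDRAM region.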

status = pv_picovoice_init(
    ACCESS_KEY,
    MEMORY_BUFFER_SIZE,
    memory_buffer, // In SDRAM
    sizeof(KEYWORD_ARRAY),
    KEYWORD_ARRAY,
    PORCUPINE_SENSITIVITY,
    wake_word_callback,
    sizeof(CONTEXT_ARRAY),
    CONTEXT_ARRAY,
    RHINO_SENSITIVITY,
    true,
    inference_callback,
    &handle);

Any help please? PS:

Issue faced with 1.1.0 and 2.1.0 library versions.

For point 4: if I relocate the memory buffer back into internal RAM using static int8_t memory_buffer[MEMORY_BUFFER_SIZE] __attribute__((aligned(16)));, I still have the same issue!

Thank you for your help.
help wanted
opened by safsoun 12
ESP-32 Support

name: ESP32 Support
about: Picovoice SDK support ESP32
title: 'ESP32 Support'
labels: enhancement
assignees: ''

Is your feature request related to a problem? I think espressif microcontrollers are great chips. size of memory available in these microcontrollers is much higher especially by considering the cost. Supporting these microcontrollers by this great platform will definitely result in powerful IOT voice recognizer devices. (especially when internet access isn't available for a short time)

Solution Supporting these great microcontrollers either its official esp-IDF platform or Arduino.

Alternatives I believe Supporting more general platforms like Arduino or Platform IO will ease the process of supporting different boards.
new-platform-request

opened by aalaei 11
Picovoice Documentation Issue (/docs/api/rhino-python/)
What is the URL of the doc?

Rhino - Python API

What's the nature of the issue? (e.g. steps do not work, typos/grammar/spelling, etc., out of date)

pvrhino.Rhino.frame_length
self.frame_length: int
The number of audio samples per frame that Rhino accepts.

As per the docs, Rhino accepts single-channel audio with a 16-bit sample width at 16k samples per second. On macOS (Intel), pvrhino.Rhino.frame_length == 512. However, if I pass in 512 frames as the docs suggest, I get the error:

ValueError: Invalid frame length. expected 512 but received 1024

It does work when using 256 frames, which is the frame_rate divided by the audio frame width in bytes. So it seems that pvrhino.Rhino.frame_length is the expected number of bytes, and not the expected number of frames. Maybe the docs should state this?
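
One possible reading of the error, not confirmed in the report, is that the audio was handed over as raw bytes: a frame of 512 16-bit samples occupies 1024 bytes, which would match "expected 512 but received 1024". The sketch below assumes little-endian 16-bit PCM and uses only the standard library; the helper name bytes_to_samples is hypothetical, and FRAME_LENGTH is hard-coded to the value quoted above rather than read from a Rhino instance.

```python
import struct

FRAME_LENGTH = 512     # value of pvrhino.Rhino.frame_length quoted in the report
BYTES_PER_SAMPLE = 2   # 16-bit PCM


def bytes_to_samples(raw):
    """Unpack raw little-endian 16-bit PCM bytes into a list of integer samples."""
    if len(raw) % BYTES_PER_SAMPLE != 0:
        raise ValueError("byte count is not a multiple of the sample width")
    count = len(raw) // BYTES_PER_SAMPLE
    return list(struct.unpack("<%dh" % count, raw))


# One frame of silence as it might arrive from an audio stream (assumption).
raw_frame = bytes(FRAME_LENGTH * BYTES_PER_SAMPLE)
samples = bytes_to_samples(raw_frame)

print(len(raw_frame))  # 1024 bytes
print(len(samples))    # 512 samples
```

If this reading applies, the unpacked list is the object whose length would be compared against frame_length, so the property would count samples as documented and only the byte buffer would be twice as long.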
help wanted
opened by dnknth 9
Support for Jetson NX USB sound card
I have followed the clone and pip instructions on the web for Picovoice. I needed to make some changes to util.py to get past the unsupported platform.

elif '0x004' == cpu_part:
    return 'cortex-a72' + arch_info

That works, and now I can run: python3 porcupine_demo_mic.py --keywords alexa

from the ~/picovoice/resources/porcupine/demo/python directory and it listens:

Listening { alexa (0.50)

However, I cannot get the mic and speakers to switch from device index 0 to device index 32, i.e.:

dan@Am1:~/picovoice/resources/porcupine/demo/python$ python3 porcupine_demo_mic.py --audio_device_index 32 --keywords alexa
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda-xnx.pcm.front.0:CARD=0'

'cards.tegra-hda-xnx.pcm.front.0:CARD=0' is the device at index 0.

In addition I have updated my asound.conf file to default to the USB device. asound sound.wav for example plays fine.

Any help would be appreciated...
opened by danielP1S 9
Picovoice Console Issue:

Hi, I am using a PSoC 6 with a Rhino Speech-to-Intent context (EN). My problem is that $pv.Percent and $pv.TwoDigitInteger return the number's literal rather than the number itself. My version of Picovoice is 2.1.0.
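
To make the distinction concrete: assuming "the number's literal" means the slot value arrives as spelled-out words such as "twenty three" (the report does not show the actual value), an application could convert it after inference. The sketch below is in Python for illustration only, since the reporter's target is a PSoC 6; the function name and the lookup tables are hypothetical and not part of the Picovoice SDK.

```python
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10, "eleven": 11,
    "twelve": 12, "thirteen": 13, "fourteen": 14, "fifteen": 15,
    "sixteen": 16, "seventeen": 17, "eighteen": 18, "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}


def two_digit_words_to_int(text):
    """Convert a spelled-out number from 0 to 99 into an int."""
    words = text.lower().replace("-", " ").split()
    if len(words) == 1:
        if words[0] in UNITS:
            return UNITS[words[0]]
        if words[0] in TENS:
            return TENS[words[0]]
    elif len(words) == 2 and words[0] in TENS:
        unit = UNITS.get(words[1], 0)
        if 1 <= unit <= 9:  # only "twenty one" .. "ninety nine" style pairs
            return TENS[words[0]] + unit
    raise ValueError("not a two-digit number in words: %r" % text)


print(two_digit_words_to_int("twenty three"))  # 23
print(two_digit_words_to_int("seven"))         # 7
```

This is a workaround sketch under the stated assumption, not a statement about how the built-in slots are meant to behave.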
bug

opened by Uwita082 8
Picovoice Issue:

Hi, unfortunately, despite many attempts at following the documentation on your official site and the Medium tutorial on using Picovoice with React Native, I can't build my app for iOS. I tried downloading the zip and also doing a git clone, then running yarn ios-install in the React Native demo and opening the Xcode workspace file, but it does not work: it always gives me errors and warnings. Could I have assistance or a more detailed guide? Thank you very much, it is important.
help wanted

opened by applabstudio 8
Picovoice Issue: Accuracy and how models are trained ... pv cheetah
Expected behaviour

When I list custom words, the accuracy should improve.

Actual behaviour

Accuracy declines:

Original text:

The lesion was circumscribed in an ellipse and the subcutaneous tissue was taken down with cautery to expose the anterior fascia. With the anterior fascia exposed, the fistulous tract was identified and mobilized free from the fascia. It was then stapled free and sent off the table. This was done with a GIA 55. The anterior fascia was then closed with interrupted 2-0 Vicryl sutures. Prior to doing this, the staple line was also oversewn with 2-0 Vicryl sutures. The wound was irrigated with normal saline. The subcutaneous tissue was loosely closed with 3-0 Vicryl suture. The skin was loosely closed with 4-0 Monocryl suture. Sterile dressings were applied.

The child did appear to have drink something prior to being intubated and did have a significant episode of clear emesis. We will be watching her overnight to make sure there is no problem with aspiration. In addition, we will watch her overnight and keep her n.p.o. while she was doing well, we will start a diet on her tomorrow.

transcription without custom words :

THE ALEUTIAN WAS CIRCUMSCRIBED IN AN ELLIPSE AND THE SUBCUTANEOUS TISSUE WAS TAKING DOWN WITH CONTRARY TO EXPOSE THE INTERIOR FAR SHEER NEW LINE WITH THE ANTERIOR FASHION EXPOSED CALMER THE FACELESS TRACK WAS IDENTIFIED AND MOBILIZED FREE FROM THE FASHION PERIOD IT WAS THEN STABLED FREE AND SCENT OF THE TABLE PERIOD THIS WAS DONE WITH A GUY A FIFTY FIVE PERIOD THE ANTERIOR FACIAL WAS THEN CLOSED WITH INTERRUPTED TO VIRAL SUITORS PERIOD PROUD TO DO IN THIS THE STABLE LINE WAS ALSO OVERSEWN WITH TO OR VEHICLE SUITORS PERIOD THE WOUND WAS IRRIGATED WITH NORMAL SELLING THE SUBCUTANEOUS TISSUE WAS LOOSELY CLOSED WITH THREE OVER CROW SUITORS PERIOD THE SKIN WAS LOOSELY GLOWED WITH FOR ALL WHEN A GROSS SUITORS PERIOD STARE ARE DRESSES WERE APPLIED PERIOD NEW PARAGRAPH THE CHILD DID APPEAR TO HAVE DARUG SOMETHING PRIOR TO BEEN INTUBATED AND DID HAVE A SIGNIFICANT EPISODE OF CLAIM AS ITS PERIOD WE WILL BE WATCHING HER OVERNIGHT TO MAKE SURE THERE IS NO PROBLEM WITH ASPIRATION PERIOD IN ADDITION CALMER WE WILL WATCH HER OVERNIGHT AND KEEP HER NYPL WHILE SHE IS DOING WELL COMER BUT WILL START A DIET OF HER TOMORROW PERIOD

transcription with custom words: THE RELATION WAS THE CONSCRIBED HER LIPS AND THE SUCCESSIVITY SHE WAS TAKING DOWN THE CONTRARY TO EXPOSE THE INTERIOR BISHOP WITH THE AND SERIOUSLY SHAPED STONE THE KRISTELLER STRIKE WAS IDENTIFIED AND MOBILIZED FREE FROM THE SESSION IT WAS THEN STABLE FREE AND SENT ON THE TABLE IT THIS WAS DONE WITH A TO BUY A SEED THE SIDE THE UNTIL THE FISHER WAS THEN CLOSED TO IN INTERRUPT AND PEOPLE LIKE RESEARCHERS BRIAN TO DOING THIS THE STATE OF LIFE WAS ALSO A MUSCLE WITH TO HIM MICHAEL SUITORS THE WOUND WAS IRRITATED WITH MARBLE SAILING THE SUB PIGEON OF TISSUE WAS LOOSELY CLOSED WITH GREAT AND BAIKAL CERTAIN THE SKIN WAS MAINLY CLOSES FOR MORE VICRYL AN OPEN SECRET STERILE DRESSES WERE APPLIED THE CHILD SLID UP INTO A HOT DRINK SOMETHING FIRE TO BEEN INTEGRATED AND THEY HAVE A SIGNATURE AND EPISODE OF THEIR AMISES WE WILL BE WATCHING OUT OF THE NIGHT TO MAKE SURE THERE IS NO PROBLEM WITH ASPIRATION IN ADDITION WE WILL WORK OUT OF THE MIGHT HAVE MAKE HALF AND PEOPLE WHILE SHE IS DOING WELL WE WILL START TO THE GUISE OF ANTINOUS

copy of yml file of custom words:

new:
  npo:
    - n p oʊ
  emesis:
    - ɛ m ɪ s ɪ s
  sterile:
    - s t ɛ ɹ ɑ ɪ l
    - s t ɛ ɹ e l
  monocryl:
    - m oʊ n oʊ K ɹ i l
  four oh:
    - f oʊ ɹ oʊ
  three oh:
    - θ ɹ i oʊ
  saline:
    - s ɑ l i n
  wound:
    - w u n d
  sutures:
    - s u t ʃ ɝ
  vicryl:
    - v aɪ K ɹ i l
  two oh:
    - t ʊ oʊ
  staple:
    - s t ɔ p l
  fistulous:
    - f i s t ʊ l oʊ s
  fascia:
    - f æ ʃ aɪ
  cautery:
    - K ɔ t e ɹ i
  lesion:
    - l ɪ ʃ ɔ n
boost:
  - period
  - comma

Steps to reproduce the behaviour

All transcriptions were made from the web console on a MacBook Pro (2.6 GHz 6-core Intel i7) running Big Sur.

The first was made using the built-in microphone at home with minimal ambient noise, and the second using AirPods in a Starbucks cafe with a moderate noise level.

QUESTIONS

I have read through the documentation and other questions on this GitHub forum. The docs say we don't need voice data to train models and that typing the text alone trains the models. My basic test for use in a medical environment does not seem to work. I had also raised an issue about the abbreviated IPA not representing all the sounds needed to create custom words.

IS THERE ANY OTHER WAY TO TRAIN CUSTOM WORDS TO INCREASE ACCURACY?

HOW DOES PICOVOICE DEAL WITH ABBREVIATIONS LIKE GIA IN THE MODEL WITHOUT TRANSCRIBING THEM AS WORDS?

HOW DOES CHEETAH DEAL WITH NUMBERS?

I don't mind joining the starter tier to resolve this, but as the payment is for the whole year (about $12,000, price for enterprise not listed), I want to be sure it is possible and that the accuracy is acceptable for production in the noise environments where it would be deployed before committing that amount.

I also have a similar issue with Rhino, but I will open a new issue for that.

Thank you for your help.

help wanted
opened by francis-idada-biz 7
Picovoice issue when running demo on python with Raspberry Pi 4

Hello, I am attempting to run the Picovoice demo to test my custom wake word and Rhino context, but I am running into an issue. I am using a ReSpeaker USB mic array connected to the Pi via USB. The command and console error are below:

picovoice_demo_mic --access_key == --keyword_path /home/pi/Desktop/hey-lazy-suzan.ppn --context_path /home/pi/Desktop/lazy-2_en_raspberry-pi_v2_1_0.rhn

Using device: ReSpeaker 4 Mic Array (UAC1.0) Multichannel
[Listening ...]
Traceback (most recent call last):
  File "/home/pi/.local/bin/picovoice_demo_mic", line 8, in <module>
    sys.exit(main())
  File "/home/pi/.local/lib/python3.9/site-packages/picovoicedemo/picovoice_demo_mic.py", line 212, in main
    PicovoiceDemo(
  File "/home/pi/.local/lib/python3.9/site-packages/picovoicedemo/picovoice_demo_mic.py", line 124, in run
    pcm = recorder.read()
  File "/home/pi/.local/lib/python3.9/site-packages/pvrecorder/pvrecorder.py", line 129, in read
    raise self._PVRECORDER_STATUS_TO_EXCEPTION[status]("Failed to read from device.")
OSError: Failed to read from device.

opened by opesam 7
TypeError: create() got an unexpected keyword argument 'access_key'

import pvporcupine

access_key = "my access key"

handle = pvporcupine.create(access_key=access_key, keywords=['picovoice'])

When I run this Python code, or any of the Python demos, I get this error:

TypeError: create() got an unexpected keyword argument 'access_key'

I have checked and double-checked that the access key is correct. Any help?
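
A TypeError of this shape means the installed function's signature has no access_key parameter, so the key's value is not involved. One possible cause, not confirmed in the thread, is an older installed pvporcupine package. The sketch below uses only the standard library to report an installed package version and to test whether a callable accepts a given keyword; old_create and new_create are hypothetical stand-ins, not the real pvporcupine functions.

```python
import inspect
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def accepts_keyword(func, name):
    """Return True if `func` can be called with `name` as a keyword argument."""
    params = inspect.signature(func).parameters
    rejected_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.VAR_POSITIONAL,
        inspect.Parameter.VAR_KEYWORD,
    )
    if name in params and params[name].kind not in rejected_kinds:
        return True
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for two generations of a create() function.
def old_create(keywords=None, sensitivities=None):
    return None


def new_create(access_key, keywords=None, sensitivities=None):
    return None


print(installed_version("pvporcupine"))           # None if the package is absent
print(accepts_keyword(old_create, "access_key"))  # False
print(accepts_keyword(new_create, "access_key"))  # True
```

In the reporter's environment, accepts_keyword(pvporcupine.create, "access_key") returning False would point at the installed package rather than the key.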
question

opened by printmanpro 7
Picovoice Console Issue: "Error retrieving Rhino model: Internal server error"

Hello,

I have a rather large Rhino YML file consisting of almost 2,000 lines: ~315 expression lines, ~1,500 lines of slots, and ~150 lines of macros. Recently I've noticed that when building the Rhino file from the console (so that I can test it from the browser), I'm seeing the error "Error retrieving Rhino model: Internal server error". There's no additional diagnostic information, but my very uneducated guess is that it's a timeout. When I remove a chunk of expressions, it builds fine. I've also tried removing different chunks of expressions over multiple builds to hopefully rule out the issue being with any specific expressions.

Does this seem likely/possible that it's a timeout? If so, what can be done to work around this (if anything)? I'm aware Rhino YML files probably aren't usually this large, but I read in the Rhino FAQ that "There is no technical limit on the number of commands (expressions) or slot values Rhino can understand" so I figured that it was fine.

When I tried to train the rhino file, I also saw this error.

Edit: I don't know if this helps at all, but the following error appears in the browser console when the internal server error occurs:

All the best,
bug

opened by ReeceKenney 3

v2.1 (Jan 21, 2022)
macOS arm64 (Apple Silicon) support added for Java and Unity SDKs

Various bug fixes and improvements

v2.0 (Nov 25, 2021)
Improved accuracy.

Added Rust SDK.

macOS arm64 support.

Added NodeJS support for Windows, NVIDIA Jetson Nano, and BeagleBone.

Added .NET support for NVIDIA Jetson Nano and BeagleBone.

Runtime optimization.

v1.1 (Dec 3, 2020)
Improved accuracy.

Runtime optimizations.

.NET SDK.

Java SDK.

React Native SDK.

C SDK.
