Octopus
Made in Vancouver, Canada by Picovoice
Octopus is Picovoice's Speech-to-Index engine. It indexes speech directly, without relying on a text representation. This acoustic-only approach boosts accuracy by removing the out-of-vocabulary limitation and eliminating the problem of competing hypotheses (e.g., homophones).
Demos
Python Demos
Install the demo package:
sudo pip3 install pvoctopusdemo
Run the following in the terminal:
octopus_demo --access_key ${ACCESS_KEY} --audio_paths ${AUDIO_PATHS}
Replace ${ACCESS_KEY} with your AccessKey obtained from Picovoice Console and ${AUDIO_PATHS} with a space-separated list of audio files. Octopus processes the audio files, then interactively prompts you for search phrases and shows the results.
For more information about the Python demos go to demo/python.
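The flow described above (index each file once, then search the indexes as often as needed) can be sketched roughly as follows. This is not the demo's actual source: `index_files` and `search_files` are hypothetical helper names, and the sketch assumes an engine object that exposes `index_file(path)` and a `search(metadata, phrases)` method returning a dict keyed by phrase, mirroring the Android and iOS SDKs.

```python
from typing import Any, Dict, Iterable, List, Tuple


def index_files(engine: Any, audio_paths: Iterable[str]) -> Dict[str, Any]:
    """Index every audio file once and keep the metadata keyed by path."""
    return {path: engine.index_file(path) for path in audio_paths}


def search_files(
    engine: Any, indexed: Dict[str, Any], phrase: str
) -> List[Tuple[str, float, float, float]]:
    """Search every indexed file for one phrase and flatten the hits."""
    hits = []
    for path, metadata in indexed.items():
        # Assumption: search() takes a list of phrases and returns a dict
        # mapping each phrase to a sequence of match objects.
        for match in engine.search(metadata, [phrase]).get(phrase, []):
            hits.append((path, match.start_sec, match.end_sec, match.probability))
    return hits
```

Keeping the metadata in a dict means a new search phrase costs only a search per file, not a re-index.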
C Demos
Build the demo:
cmake -S demo/c/ -B demo/c/build && cmake --build demo/c/build
Index a given audio file:
./demo/c/build/octopus_index_demo ${LIBRARY_PATH} ${ACCESS_KEY} ${AUDIO_PATH} ${INDEX_PATH}
Then search the index for a given phrase:
./demo/c/build/octopus_search_demo ${LIBRARY_PATH} ${MODEL_PATH} ${ACCESS_KEY} ${INDEX_PATH} ${SEARCH_PHRASE}
Replace ${LIBRARY_PATH} with the path to the appropriate library available under lib, ${MODEL_PATH} with the path to the model file available at lib/common/octopus_params.pv, ${ACCESS_KEY} with your AccessKey obtained from Picovoice Console, ${AUDIO_PATH} with the path to the audio file to index, ${INDEX_PATH} with the path to the cached index file, and ${SEARCH_PHRASE} with a search phrase.
For more information about C demos go to demo/c.
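When driving the two C demos from a script, it can help to build the argument lists programmatically so that multi-word search phrases do not need shell quoting. The sketch below only assembles the command lines in the argument order shown above; `index_command`, `search_command`, and the `demo_dir` default are hypothetical names chosen for illustration.

```python
from typing import List


def index_command(
    library_path: str,
    access_key: str,
    audio_path: str,
    index_path: str,
    demo_dir: str = "demo/c/build",
) -> List[str]:
    """Argument list for octopus_index_demo, in the order shown above."""
    return [f"{demo_dir}/octopus_index_demo", library_path, access_key, audio_path, index_path]


def search_command(
    library_path: str,
    model_path: str,
    access_key: str,
    index_path: str,
    search_phrase: str,
    demo_dir: str = "demo/c/build",
) -> List[str]:
    """Argument list for octopus_search_demo, in the order shown above."""
    return [
        f"{demo_dir}/octopus_search_demo",
        library_path,
        model_path,
        access_key,
        index_path,
        search_phrase,  # passed as a single argument, so spaces are preserved
    ]
```

Either list can be handed to `subprocess.run(..., check=True)` from the standard library, which runs the binary without going through a shell.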
Android Demos
Using Android Studio, open demo/android/OctopusDemo as an Android project.
Replace "${YOUR_ACCESS_KEY_HERE}" inside MainActivity.java with your AccessKey obtained from Picovoice Console. Then run the demo.
For more information about Android demos go to demo/android.
iOS Demos
From demo/ios/OctopusDemo, run the following to install the Octopus CocoaPod:
pod install
Replace "{YOUR_ACCESS_KEY_HERE}" inside ViewModel.swift with your AccessKey obtained from Picovoice Console. Then, using Xcode, open the generated OctopusDemo.xcworkspace and run the application.
For more information about iOS demos go to demo/ios.
Web Demos
From demo/web run the following in the terminal:
yarn
yarn start
(or)
npm install
npm run start
Open http://localhost:5000 in your browser to try the demo.
SDKs
Python
Create an instance of the engine:
import pvoctopus
access_key = "" # AccessKey provided by Picovoice Console (https://picovoice.ai/console/)
handle = pvoctopus.create(access_key=access_key)
Index your raw audio data or file:
audio_data = [..]
metadata = handle.index(audio_data)
# or
audio_file_path = "/path/to/my/audiofile.wav"
metadata = handle.index_file(audio_file_path)
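The `audio_data` placeholder above stands for raw audio samples. One way to obtain such samples from a WAV file using only the standard library is sketched below. It assumes the engine accepts a sequence of 16-bit mono PCM samples; the required sample rate is not stated here, so the function returns the file's rate for you to check against the engine's requirements. `read_pcm16_mono` is a hypothetical helper name.

```python
import struct
import wave
from typing import List, Tuple


def read_pcm16_mono(path: str) -> Tuple[List[int], int]:
    """Return (samples, sample_rate) from a 16-bit mono PCM WAV file."""
    with wave.open(path, "rb") as wav:
        # Assumption: the engine expects single-channel, 16-bit samples.
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    # WAV stores 16-bit samples as little-endian signed integers.
    samples = list(struct.unpack(f"<{len(raw) // 2}h", raw))
    return samples, sample_rate
```

The returned list could then be passed as `audio_data` to `handle.index(...)`, after verifying the sample rate.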
Then search the metadata for phrases:
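# Assumption: search() takes the metadata and a list of phrases, mirroring the Android and iOS SDKs
matches = handle.search(metadata, ['avocado'])
avocado_matches = matches['avocado']
for match in avocado_matches:
    print(f"Match for `avocado`: {match.start_sec} -> {match.end_sec} ({match.probability})")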
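Because each match carries a probability, a common follow-up step is to discard low-confidence hits and order the rest by time. The sketch below assumes the search result is a dict mapping each phrase to match objects with `start_sec` and `probability` attributes; `confident_matches` and the 0.5 default threshold are hypothetical choices, not part of the SDK.

```python
from typing import Any, Dict, List, Sequence


def confident_matches(
    matches: Dict[str, Sequence[Any]], min_probability: float = 0.5
) -> Dict[str, List[Any]]:
    """Keep matches at or above the threshold, sorted by start time."""
    filtered = {}
    for phrase, phrase_matches in matches.items():
        kept = [m for m in phrase_matches if m.probability >= min_probability]
        filtered[phrase] = sorted(kept, key=lambda m: m.start_sec)
    return filtered
```

A suitable threshold depends on the audio and the phrases, so it is worth tuning on your own recordings.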
When done, the handle resources have to be released explicitly:
handle.delete()
C
The pv_octopus.h header file contains relevant information. Build an instance of the object:
const char *model_path = "..."; // absolute path to the model file available at `lib/common/octopus_params.pv`
const char *access_key = "..."; // AccessKey provided by Picovoice Console (https://picovoice.ai/console/)
pv_octopus_t *handle = NULL;
pv_status_t status = pv_octopus_init(access_key, model_path, &handle);
if (status != PV_STATUS_SUCCESS) {
    // error handling logic
}
Index audio data using the constructed object:
const char *audio_path = "..."; // absolute path to the audio file to be indexed
void *indices = NULL;
int32_t num_indices_bytes = 0;
pv_status_t status = pv_octopus_index_file(handle, audio_path, &indices, &num_indices_bytes);
if (status != PV_STATUS_SUCCESS) {
    // error handling logic
}
Search the indexed data:
const char *phrase = "...";
pv_octopus_match_t *matches = NULL;
int32_t num_matches = 0;
pv_status_t status = pv_octopus_search(handle, indices, num_indices_bytes, phrase, &matches, &num_matches);
if (status != PV_STATUS_SUCCESS) {
    // error handling logic
}
When done, be sure to release the acquired resources:
pv_octopus_delete(handle);
Android
Create an instance of the engine:
import ai.picovoice.octopus.*;
final String accessKey = "..."; // AccessKey provided by Picovoice Console (https://picovoice.ai/console/)
try {
    Octopus handle = new Octopus.Builder(accessKey).build(appContext);
} catch (OctopusException ex) { }
Index audio data using the constructed object:
final String audioFilePath = "/path/to/my/audiofile.wav";
try {
    OctopusMetadata metadata = handle.indexAudioFile(audioFilePath);
} catch (OctopusException ex) { }
Search the indexed data:
HashMap<String, OctopusMatch[]> matches = handle.search(metadata, phrases);
for (Map.Entry<String, OctopusMatch[]> entry : matches.entrySet()) {
    final String phrase = entry.getKey();
    for (OctopusMatch phraseMatch : entry.getValue()) {
        final float startSec = phraseMatch.getStartSec();
        final float endSec = phraseMatch.getEndSec();
        final float probability = phraseMatch.getProbability();
    }
}
When done, be sure to release the acquired resources:
metadata.delete();
handle.delete();
iOS
Create an instance of the engine:
import Octopus
let accessKey : String = // .. AccessKey provided by Picovoice Console (https://picovoice.ai/console/)
do {
    let handle = try Octopus(accessKey: accessKey)
} catch { }
Index audio data using the constructed object:
let audioFilePath = "/path/to/my/audiofile.wav"
do {
    let metadata = try handle.indexAudioFile(path: audioFilePath)
} catch { }
Search the indexed data:
let matches: Dictionary<String, [OctopusMatch]> = try handle.search(metadata: metadata, phrases: phrases)
for (phrase, phraseMatches) in matches {
    for phraseMatch in phraseMatches {
        let startSec = phraseMatch.startSec
        let endSec = phraseMatch.endSec
        let probability = phraseMatch.probability
    }
}
When done, be sure to release the acquired resources:
handle.delete()
Web
Octopus is available on modern web browsers (i.e., not Internet Explorer) via WebAssembly. Octopus is provided pre-packaged as a Web Worker to allow it to perform processing off the main thread.
Vanilla JavaScript and HTML (CDN Script Tag)
<!DOCTYPE html>
<html lang="en">
  <head>
    <script src="https://unpkg.com/@picovoice/octopus-web-en-worker/dist/iife/index.js"></script>
    <script type="application/javascript">
      // The metadata object to save the result of indexing for later searches
      let octopusMetadata = undefined
      // Kept at script scope so every function below can reach the worker
      let OctopusWorker = undefined
      function octopusIndexCallback(metadata) {
        octopusMetadata = metadata
      }
      function octopusSearchCallback(matches) {
        console.log(`Search results (${matches.length}):`)
        for (const match of matches) {
          console.log(`Start: ${match.startSec}s -> End: ${match.endSec}s (Probability: ${match.probability})`)
        }
      }
      async function startOctopus() {
        // Create an Octopus Worker
        // Note: you receive a Worker object, _not_ an individual Octopus instance
        const accessKey = ... // AccessKey string provided by Picovoice Console (https://picovoice.ai/console/)
        OctopusWorker = await OctopusWorkerFactory.create(
          accessKey,
          octopusIndexCallback,
          octopusSearchCallback
        )
      }
      document.addEventListener("DOMContentLoaded", async function () {
        // Wait until the worker exists before posting messages to it
        await startOctopus()
        // Send Octopus the audio signal
        const audioSignal = new Int16Array(/* Provide data with correct format*/)
        OctopusWorker.postMessage({
          command: "index",
          input: audioSignal,
        })
      })
      // Hypothetical helper: call it once octopusIndexCallback has stored the metadata
      function searchOctopus(searchText) {
        OctopusWorker.postMessage({
          command: "search",
          metadata: octopusMetadata,
          searchPhrase: searchText,
        })
      }
    </script>
  </head>
  <body></body>
</html>
Vanilla JavaScript and HTML (ES Modules)
yarn add @picovoice/octopus-web-en-worker
(or)
npm install @picovoice/octopus-web-en-worker
import { OctopusWorkerFactory } from "@picovoice/octopus-web-en-worker";
// The metadata object to save the result of indexing for later searches
let octopusMetadata = undefined;
// Kept at module scope so the code below can reach the worker
let OctopusWorker = undefined;
function octopusIndexCallback(metadata) {
  octopusMetadata = metadata;
}
function octopusSearchCallback(matches) {
  console.log(`Search results (${matches.length}):`);
  for (const match of matches) {
    console.log(`Start: ${match.startSec}s -> End: ${match.endSec}s (Probability: ${match.probability})`);
  }
}
async function startOctopus() {
  // Create an Octopus Worker
  // Note: you receive a Worker object, _not_ an individual Octopus instance
  const accessKey = // .. AccessKey provided by Picovoice Console (https://picovoice.ai/console/)
  OctopusWorker = await OctopusWorkerFactory.create(
    accessKey,
    octopusIndexCallback,
    octopusSearchCallback
  );
}
// Wait until the worker exists before posting messages to it
await startOctopus();
...
// Send Octopus the audio signal
const audioSignal = new Int16Array(/* Provide data with correct format*/);
OctopusWorker.postMessage({
  command: "index",
  input: audioSignal,
});
...
// Search only after octopusIndexCallback has stored the metadata
const searchText = ...;
OctopusWorker.postMessage({
  command: "search",
  metadata: octopusMetadata,
  searchPhrase: searchText,
});
Releases
v1.0.0 Oct 8th, 2021
- Initial release.