A Python library for working with Praat, textgrids, time-aligned audio transcripts, and audio files.

praatIO

A library for working with Praat, time-aligned audio transcripts, and audio files that comes with batteries included.

Praat uses a file format called textgrids, which are time-aligned speech transcripts. This library isn't just a data structure for reading and writing textgrids; many utilities are provided to make it easy to work with transcripts and associated audio files. This library also provides some other tools for use with Praat.

Praat is an open source software program for doing phonetic analysis and annotation of speech. Praat can be downloaded from its official website.

Table of contents

  1. Documentation
  2. Tutorials
  3. Version History
  4. Requirements
  5. Installation
  6. Version 4 to 5 Migration
  7. Usage
  8. Common Use Cases
  9. Tests
  10. Citing praatIO
  11. Acknowledgements

Documentation

Automatically generated pdocs can be found here:

http://timmahrt.github.io/praatIO/

Tutorials

There are tutorials available for learning how to use PraatIO. These are in the form of IPython Notebooks which can be found in the /tutorials/ folder distributed with PraatIO.

You can view them online using the external website Jupyter:

Tutorial 1: An introduction and tutorial

Version History

Praatio uses semantic versioning (Major.Minor.Patch).

Please view CHANGELOG.md for version history.

Requirements

The Python module typing-extensions (https://pypi.org/project/typing-extensions/). It should be installed automatically with praatio, but you can install it manually if you have any problems.

Python 3.7 or above.

See the project's travis-ci page for the specific versions of Python that praatIO is currently tested under.

If you are using Python 2.x or Python < 3.7, you can use PraatIO 4.x.

Installation

PraatIO is on PyPI and can be installed or upgraded from the command-line shell with pip like so:

python -m pip install praatio --upgrade

Otherwise, to install manually, download the source from GitHub, open a command-line shell, navigate to the directory containing setup.py, and type:

python setup.py install

If python is not in your path, you'll need to enter the full path, e.g.:

C:\Python37\python.exe setup.py install

Version 4 to 5 Migration

Many things changed between versions 4 and 5. If you see an error like "WARNING: You've tried to import 'tgio' which was renamed 'textgrid' in praatio 5.x.", it means that you have installed version 5 but your code was written for praatio 4.x or earlier.

The immediate solution is to uninstall praatio 5 and install praatio 4. From the command line:

pip uninstall praatio
pip install "praatio<5"

If praatio is being installed as a project dependency, i.e., it is set as a dependency in setup.py like

    install_requires=["praatio"],

then changing it to the following should fix the problem:

    install_requires=["praatio ~= 4.1"],

Many files, classes, and functions were renamed in praatio 5 in the hope of making them clearer. There were too many changes to list here, but one example is that the tgio module was renamed textgrid.

Also, the interface for openTextgrid() and tg.save() has changed. Here are examples of the required arguments in the new interface:

textgrid.openTextgrid(
  fn=name,
  includeEmptyIntervals=False
)
tg.save(
  fn=name,
  format="short_textgrid",
  includeBlankSpaces=False
)

Please consult the documentation to help in upgrading to version 5.
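
If it is unclear which major version is installed in a given environment, a small check like the following sketch may help before deciding which import to use. It relies only on the standard library's importlib.metadata, which assumes Python 3.8 or later (newer than the 3.7 minimum listed under Requirements). The helper name installed_major_version is hypothetical and not part of praatio.

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_major_version(package="praatio"):
    """Return the installed major version of `package`, or None if it is absent."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    # "5.0.0" -> 5
    return int(version.split(".")[0])


major = installed_major_version()
if major is None:
    print("praatio is not installed")
elif major >= 5:
    print("praatio 5+: use 'from praatio import textgrid'")
else:
    print("praatio 4 or earlier: use 'from praatio import tgio'")
```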

Usage

99% of the time you're going to want to run:

from praatio import textgrid
tg = textgrid.openTextgrid(r"C:\Users\tim\Documents\transcript.TextGrid", includeEmptyIntervals=False)

Or, if you want to work with KlattGrid files:

from praatio import klattgrid
kg = klattgrid.openKlattGrid(r"C:\Users\tim\Documents\transcript.KlattGrid")

See /test for example usages.

Common Use Cases

What can you do with this library?

  • query a textgrid to get information about the tiers or intervals contained within

    tg = textgrid.openTextgrid("path_to_textgrid", False)
    entryList = tg.tierDict["speaker_1_tier"].entryList # Get all intervals
    entryList = tg.tierDict["phone_tier"].find("a") # Get the indices of all occurrences of 'a'
  • create or augment textgrids using data from other sources

  • discovered that you clipped your audio file five seconds early, restored the missing audio to your wave file, and now your textgrid is misaligned? Add five seconds to every interval in the textgrid

    tg = textgrid.openTextgrid("path_to_textgrid", False)
    moddedTG = tg.editTimestamps(5)
    moddedTG.save('output_path_to_textgrid', 'long_textgrid', True)
  • utilize the klattgrid interface to raise all speech formants by 20%

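    from os.path import join

    kg = klattgrid.openKlattGrid("path_to_klattgrid")
    incrTwenty = lambda x: x * 1.2  # scale each formant value up by 20%
    kg.tierDict["oral_formants"].modifySubtiers("formants", incrTwenty)
    # outputPath is the directory where the result should be written
    kg.save(join(outputPath, "bobby_twenty_percent_more.KlattGrid"))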
  • replace labeled segments in a recording with silence or delete them

    • see /examples/deleteVowels.py
  • use set operations (union, intersection, difference) on textgrid tiers

    • see /examples/textgrid_set_operations.py
  • see /praatio/praatio_scripts.py for various ready-to-use functions such as

    • splitAudioOnTier(): split an audio file into chunks specified by intervals in one tier
    • spellCheckEntries(): spellcheck a textgrid tier
    • tgBoundariesToZeroCrossings(): adjust all boundaries and points to fall at the nearest zero crossing in the corresponding audio file
    • alignBoundariesAcrossTiers(): in handmade textgrids, entries sometimes look as if they are aligned at the same time but are actually off by a small amount; this function corrects them
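
To make the "add five seconds to every interval" use case concrete without requiring praatio, here is a minimal sketch on plain (start, end, label) tuples. The function name shift_intervals is hypothetical; it only illustrates the idea and is not how praatio's editTimestamps() is implemented.

```python
def shift_intervals(entries, offset):
    """Return a new list with `offset` seconds added to every start and end time."""
    return [(start + offset, end + offset, label) for start, end, label in entries]


entries = [(0.0, 0.5, "hello"), (0.5, 1.25, "world")]
shifted = shift_intervals(entries, 5)
print(shifted)
```

The input list is left untouched, which mirrors the README example above where the shifted textgrid is assigned to a new variable.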

Tests

I run tests with the following command (this requires pytest and pytest-cov to be installed):

pytest --cov=praatio tests/

Citing praatIO

PraatIO is general-purpose code and doesn't need to be cited, but if you would like to cite it, you can do so as follows:

Tim Mahrt. PraatIO. https://github.com/timmahrt/praatIO, 2016.

Acknowledgements

Development of PraatIO was possible thanks to NSF grant BCS 12-51343 to Jennifer Cole, José I. Hualde, and Caroline Smith and to the A*MIDEX project (n° ANR-11-IDEX-0001-02) to James Sneed German funded by the Investissements d'Avenir French Government program, managed by the French National Research Agency (ANR).

Comments
  • Can I fill the blanks in the tier by extending the existing intervals?

    I noticed that when saving the textgrid file, praatio would try to fill up the tiers with new blank intervals, which seems not to be quite friendly to automatic aligning. I am not sure but what would happen if the tier is left there and not filled up? Or can I use praatio to fill the blanks by extending the existing intervals (which would not change the total number of intervals, making it much easier for machines to recognize)?

    opened by GalaxieT 21
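
The gap-filling idea raised in this issue can be sketched on plain (start, end, label) tuples. This is an illustration under stated assumptions, not praatio functionality: the function name fill_gaps_by_extending is hypothetical, and the intervals are assumed not to overlap.

```python
def fill_gaps_by_extending(entries, min_time, max_time):
    """Close gaps between intervals without adding any new intervals.

    Each interval's end is stretched to the start of the next interval.
    The first interval is stretched back to min_time and the last one
    forward to max_time. Assumes the intervals do not overlap.
    """
    entries = sorted(entries)
    filled = []
    for i, (start, end, label) in enumerate(entries):
        new_start = min_time if i == 0 else start
        new_end = entries[i + 1][0] if i + 1 < len(entries) else max_time
        filled.append((new_start, new_end, label))
    return filled


tier = [(0.2, 0.5, "a"), (0.7, 1.0, "b")]
print(fill_gaps_by_extending(tier, 0.0, 1.2))
```

The number of intervals stays the same, which is the property the question asks for.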
  • Forced alignment?

    Hi,

    I've studied forced alignment a little. Currently I have a wav file in which "hello" is spoken and a .txt file which contains the word "hello". Can I use some sort of forced alignment to find the start and end of the sentences along with their pronunciation? If so, is it possible to do it on Windows 10?

    Thank you.

    opened by Terrance82 8
  • use praat to segment a speech file

    Hi I am very new to Praat overall, not just the python version.

    Is Praat a good tool for segmenting a .wav file if I know the exact time marks where I want to chop up the file? I'd like to just use Praat, because afterwards I will be extracting pitch and tempo from the segments.

    opened by bhomass 8
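
When the cut points are already known, a segment can be copied out of a wav file with Python's standard wave module alone. The sketch below shows one way to do it; the function name extract_segment is hypothetical and this is not praatio's implementation. The README above lists splitAudioOnTier() in praatio_scripts.py for the case where the cut points live in a textgrid tier.

```python
import wave


def extract_segment(in_path, out_path, start_sec, end_sec):
    """Copy the audio between start_sec and end_sec into a new wav file."""
    with wave.open(in_path, "rb") as src:
        params = src.getparams()
        framerate = src.getframerate()
        total = src.getnframes()
        # Convert times to frame indices and clamp them to the file length
        start_frame = min(max(int(round(start_sec * framerate)), 0), total)
        end_frame = min(max(int(round(end_sec * framerate)), start_frame), total)
        src.setpos(start_frame)
        frames = src.readframes(end_frame - start_frame)

    with wave.open(out_path, "wb") as dst:
        dst.setparams(params)  # the frame count in the header is corrected on close
        dst.writeframes(frames)
```

For example, extract_segment("in.wav", "out.wav", 0.25, 0.75) would write the half second starting at 0.25 s to out.wav; both file names are placeholders.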
  • Issues parsing TextGrids from ELAN

    I've had a couple of users reporting issues with loading TextGrids exported from ELAN. The issue seems to be that the "item [1]" lines are formatted without a space ("item[1]"), so the parsing in https://github.com/timmahrt/praatIO/blob/master/praatio/tgio.py#L1896 fails. I think a reasonable fix would be something like re.split(r'item ?\[', data, flags=re.MULTILINE)[1:].

    Looks like you're working on a 5.0, so don't know if that would be the place to fix it or if it would be better for me to submit a PR for the main branch.

    opened by mmcauliffe 7
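
The regular expression proposed in this issue can be tried in isolation. The sketch below uses two made-up textgrid fragments (not real ELAN output) to show that the optional space lets one pattern split both header styles, while a plain string split on "item [" finds nothing in the ELAN-style text. The re.MULTILINE flag from the issue is omitted because the pattern contains no ^ or $ anchors, so the flag would have no effect.

```python
import re

# Matches both "item [" (Praat style) and "item[" (ELAN style, per the issue)
ITEM_HEADER = re.compile(r"item ?\[")

praat_style = (
    "item []:\n"
    "    item [1]:\n"
    '        class = "IntervalTier"\n'
    "    item [2]:\n"
    '        class = "TextTier"\n'
)
elan_style = praat_style.replace("item [", "item[")


def split_tiers(data):
    """Split on tier headers, dropping whatever precedes the first header."""
    return ITEM_HEADER.split(data)[1:]


print(len(split_tiers(praat_style)), len(split_tiers(elan_style)))
print(len(elan_style.split("item [")[1:]))  # the plain split misses every header
```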
  • Textgrid validation

    A user might make certain assumptions about their data and praatio will silently break those expectations. It can make calculation errors go unnoticed.

    Scenario A

    • Textgrid A has a maxTimestamp of 10.12345
    • Textgrid B has a maxTimestamp of 10.123456
    • both textgrids have an intervalTier with a final entry that runs to the end of the file
    • if those Textgrids are combined (using append) the interval in Textgrid A will no longer run to the new end of the file, although that may have been the original intention

    Scenario B?

    Solution?

    • validate on save will probably be too late for Scenario A but can be caught when adding the tier to the Textgrid
    • validation could potentially be desired anywhere (will this pollute function signatures more? Is there a cleaner way to do this?)
    • only offer validation in "risky" functions? Some functions already have the option to throw exceptions if unexpected data is encountered
    • to what extent is it the user's responsibility to validate and to what extent is it praatio's?
    opened by timmahrt 7
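
Scenario A can be turned into a small check that runs when a tier is added rather than on save. The sketch below works on plain (start, end, label) tuples; the function name stranded_final_entry is hypothetical and nothing here is praatio API.

```python
def stranded_final_entry(entries, old_max, new_max):
    """Return the final entry if it ran to old_max but stops short of new_max.

    Returns None when there is nothing to warn about.
    """
    if not entries:
        return None
    last = max(entries, key=lambda entry: entry[1])
    if last[1] == old_max and last[1] < new_max:
        return last
    return None


# Timestamps from Scenario A; the entries themselves are made up
entries_a = [(0.0, 5.0, "x"), (5.0, 10.12345, "y")]
print(stranded_final_entry(entries_a, old_max=10.12345, new_max=10.123456))
```

A caller could raise an exception, log a warning, or stretch the entry to new_max, depending on how strict the validation should be.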
  • Creating textgrid with data

    Hi, I'm new to praat and after following your tutorial I've created a Textgrid that looks like this:

    bobby.TextGrid:

    File type = "ooTextFile short"
    Object class = "TextGrid"

    0.0
    1.194625
    1
    "IntervalTier"
    "phoneme"
    0.0
    1.194625
    1
    0.0
    1.194625
    ""

    However, when I try to print out data from my textgrid, it doesn't work. When I use bobby_phone.TextGrid, for example, it works. My question is: do you need praat.exe to do it manually?

    bobby_phone.TextGrid:

    File type = "ooTextFile"
    Object class = "TextGrid"

    xmin = 0.0
    xmax = 1.194625
    tiers? <exists>
    size = 1
    item []:
    item [1]:
    class = "IntervalTier"
    name = "phone"
    xmin = 0.0
    xmax = 1.18979591837
    intervals: size = 15
    intervals [1]: xmin = 0.0124716553288 xmax = 0.06469123242311078 text = ""
    intervals [2]: xmin = 0.06469123242311078 xmax = 0.08438971390281873 text = "B"
    intervals [3]: xmin = 0.08438971390281873 xmax = 0.23285789838876556 text = "AA1"
    intervals [4]: xmin = 0.23285789838876556 xmax = 0.2788210218414174 text = "B"
    intervals [5]: xmin = 0.2788210218414174 xmax = 0.41156462585 text = "IY0"
    intervals [6]: xmin = 0.41156462585 xmax = 0.47094510353588265 text = "R"
    intervals [7]: xmin = 0.47094510353588265 xmax = 0.521315192744 text = "IH1"
    intervals [8]: xmin = 0.521315192744 xmax = 0.658052967538796 text = "PT"
    intervals [9]: xmin = 0.658052967538796 xmax = 0.680952380952 text = "DH"
    intervals [10]: xmin = 0.680952380952 xmax = 0.740816326531 text = "AH0"
    intervals [11]: xmin = 0.740816326531 xmax = 0.807647261005538 text = "L"
    intervals [12]: xmin = 0.807647261005538 xmax = 0.910430839002 text = "EH1"
    intervals [13]: xmin = 0.910430839002 xmax = 0.980272108844 text = "JH"
    intervals [14]: xmin = 0.980272108844 xmax = 1.1171482864527198 text = "ER0"
    intervals [15]: xmin = 1.1171482864527198 xmax = 1.18979591837 text = ""

    opened by Terrance82 7
  • Extracting out the phoneme / phone from audio

    Hi Tim,

    You've explained how to create a blank textgrid from audio file in your tutorial. I was wondering how do you extract out the (phoneme / phone) and words from audio and insert into textgrid.

    Thank you!

    opened by jiunn95 7
  • Tier entries that have blank labels are not read

    I have many textgrid files with some or all the labels in specific tiers being deliberately set to be blank. Praatio just skips over them, ignoring the time information (which is what I need to retrieve). I have tried both a prebuilt version from "pip install" and the latest version from github, installed using setup.py

    The attached file has three entries in its only tier, and two of them are empty. Only one is retrieved by praatio.

    EmptyLabelBug.Txt

    opened by stevebeet 6
  • xsampa.py lack of license

    praatio/utilities/xsampa.py is listed as "does not carry any license". IANAL, but my understanding is that without any license, no one has permission to use it. It may be included in the source of the project with permission, but without an explicit license for the file no one else can know exactly how they are allowed to use it, if at all. Is there any way to contact the author and get a specific license for the file, such as MIT like the main project?

    opened by toddrme2178 6
  • openTextgrid() cannot correctly parse the file if there are '\n's within the label text of interval tiers

    Files like the following:

    item []:
    	item [1]:
    		class = "IntervalTier"
    		name = "Tokens"
    		xmin = 0.0
    		xmax = 16.6671875
    		intervals: size = 22
    		intervals [1]:
    			xmin = 0.0
    			xmax = 0.32
    			text = "#"
    		intervals [2]:
    			xmin = 0.32
    			xmax = 1.165
    			text = "zao
    chen
    liu
    wan
    er
    ne"
    

    Only the "zao part is recognized. According to the Praat manual, string values are delimited by double quotes, not by newlines. (Double quotes in text are turned into two double quotes in the file: " → """")

    It is not hard to fix it, but I'm unfamiliar with git/github. So I paste the changed code in below (in place of original _fetchRow in tgio):

    def _fetchRow_for_text(dataStr, searchStr, index):
        startIndex = dataStr.index(searchStr, index) + len(searchStr)
        first_quote_index = dataStr.index("\"", startIndex)
        
        looking = True
        next_quote_index = dataStr.index("\"", first_quote_index+1)
        while looking:
            try:
                neighbor_letter = dataStr[next_quote_index+1]
                if neighbor_letter == "\"":
                    next_quote_index = dataStr.index("\"", next_quote_index+1)
                else:
                    looking = False
            except IndexError:
                looking = False
        final_quote_index = next_quote_index
        
        word = dataStr[first_quote_index+1:final_quote_index]
        word = word.replace("\"\"", "\"")
        
        return word, final_quote_index + 1
    

    I suppose it might be possible that in other places, like textgrid short version reading and writing, there are also problems due to this issue.

    opened by GalaxieT 5
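
The quoting rule described in this issue (a label runs from one double quote to the next unpaired one, with "" standing for a literal quote) can also be expressed as a single regular expression. The sketch below is an alternative to the loop above, not praatio's parser; the names QUOTED and read_quoted are hypothetical.

```python
import re

# A quoted string: any run of non-quote characters or doubled quotes.
# [^"] also matches newlines, so multi-line labels are captured whole.
QUOTED = re.compile(r'"((?:[^"]|"")*)"')


def read_quoted(data, search, index=0):
    """Return the quoted value that follows `search`, and the index just past it."""
    start = data.index(search, index) + len(search)
    match = QUOTED.search(data, start)
    if match is None:
        raise ValueError("no quoted string found after " + repr(search))
    # Undo the doubling of embedded quotes
    return match.group(1).replace('""', '"'), match.end()


sample = 'xmin = 0.32\nxmax = 1.165\ntext = "zao\nchen\nliu"\n'
label, next_index = read_quoted(sample, "text =")
print(repr(label))
```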
  • Why filter out empty labels from Intervals?

            if tierType == INTERVAL_TIER:
                while True:
                    try:
                        timeStart, timeStartI = _fetchRow(tierData,
                                                          "xmin = ", labelI)
                        timeEnd, timeEndI = _fetchRow(tierData,
                                                      "xmax = ", timeStartI)
                        label, labelI = _fetchRow(tierData, "text =", timeEndI)
                    except (ValueError, IndexError):
                        break
                    
                    label = label.strip()
                    if label == "":
                        continue
                    tierEntryList.append((timeStart, timeEnd, label))
                tier = IntervalTier(tierName, tierEntryList, tierStart, tierEnd)
    

    Why wouldn't I want the intervals exactly as they appear in the file?

    opened by macriluke 5
  • Overhaul audio.py

    audio.py suffers from a number of inconsistencies and bugs.

    This PR aims to remove redundant methods and add documentation, and test coverage to audio.py

    opened by timmahrt 4
  • More idiomatic json format

    The json format exported by praatio largely mirrors the textgrids, in terms of the data that is output.

    As requested on a different github project, (https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/issues/453) there isn't really any need to be bound by the textgrids. We should structure the json files to be their own thing.

    Details can be found in the above link.

    opened by timmahrt 0
  • TextGrid.tierDict can be modified, corrupting the TextGrid

    As far as I can tell from your examples, the canonical way to access a TextGridTier from a TextGrid is to access the internal tierDict (Example).

    However, because tierDict is mutable, one can end up adding a new TextGridTier to this tierDict, e.g.,

    new_textgrid_tier = new_textgrid.tierDict.setdefault(
        "new_tier",
        IntervalTier(
            "new_tier", [], textgrid_obj.minTimestamp, textgrid_obj.maxTimestamp
        )
    )
    

    If this happens, the TextGrid will essentially be in an Illegal State, because TextGrid.tierNameList will be missing the new tier. This obviously breaks functions like TextGridTier#replaceTier, among other things.

    I think there are at least two possible solutions:

    1. Add a getTier method and rename tierDict to _tierDict to indicate that it is a protected member that shouldn't be altered
    2. Change tierDict to an OrderedDict and remove tierNameList entirely. You can then just use tierDict.keys() in place of tierNameList

    I think it'd probably be best to implement both solutions, particularly because it's dangerous to have two parallel data structures that you need to keep in sync. However, simply implementing number 1 alone should solve the problem quickly and easily.

    I'm also happy to submit a PR for this if you'd like.

    Thanks again for all your work!

    opened by scottmk 2
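
The two proposals in this issue can be combined in a few lines: keep the tiers in a single ordered mapping behind a protected attribute and expose accessors instead of the dict itself. The class below is a hypothetical illustration of that design, not praatio's Textgrid class, and all of its names are assumptions.

```python
from collections import OrderedDict


class TierCollection:
    """Keeps tier order and tier lookup in one structure, so they cannot drift apart."""

    def __init__(self):
        self._tiers = OrderedDict()  # name -> tier; insertion order is the tier order

    def add_tier(self, name, tier):
        if name in self._tiers:
            raise ValueError("a tier named %r already exists" % name)
        self._tiers[name] = tier

    def get_tier(self, name):
        return self._tiers[name]

    @property
    def tier_names(self):
        # Derived from the mapping, so there is no second list to keep in sync
        return tuple(self._tiers)


collection = TierCollection()
collection.add_tier("words", [(0.0, 1.0, "hello")])
collection.add_tier("phones", [(0.0, 0.5, "h")])
print(collection.tier_names)
```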
  • Potential bug in `audio.extractSubwav` and/or `audio.openAudioFile`

    audio.extractSubwav calls audio.openAudioFile with a keepList containing startT and endT and no deleteList:

    def extractSubwav(fn: str, outputFN: str, startT: float, endT: float) -> None:
        audioObj = openAudioFile(
            fn,
            [
                (startT, endT, ""),
            ],
            doShrink=True,
        )
        audioObj.save(outputFN)
    

    In audio.openAudioFile L491, it calls utils.invertIntervalList with duration given as the min:

    elif deleteList is None and keepList is not None:
            computedKeepList = []
            computedDeleteList = utils.invertIntervalList(
                [(start, end) for start, end, _ in keepList], duration
            )
    

    In this case, the computedDeleteList will always be empty, because the min time provided is the end of the original wav file.

    This means when you call audio.extractSubwav, you will always get an empty result.

    To further complicate things, even if you changed L491 to specify min=0, max=duration, you'll still get an empty result for audio.extractSubwav.

    This is because on L489, it sets computedKeepList = [], and then on L499, the original keepList is overridden with the now empty computedKeepList:

    keepList = [(row[0], row[1], _KEEP) for row in computedKeepList]
    

    Finally, when we actually do the operations, because audio.extractSubwav calls audio.openAudioFile with doShrink=True, it won't actually delete anything on LL519-521:

            elif label == _DELETE and doShrink is False:
                zeroPadding = [0] * int(framerate * diff)
                audioSampleList.extend(zeroPadding)
    

    And it won't keep anything because keepList is empty.

    opened by scottmk 6
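
To see why the choice of min and max matters when inverting an interval list, here is a standalone sketch of interval inversion. It is a hypothetical reimplementation for illustration only and makes no claim about how utils.invertIntervalList is written; the name invert_intervals and its signature are assumptions.

```python
def invert_intervals(intervals, min_time, max_time):
    """Return the parts of [min_time, max_time] that are not covered by intervals."""
    gaps = []
    cursor = min_time
    for start, end in sorted(intervals):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < max_time:
        gaps.append((cursor, max_time))
    return gaps


keep = [(1.0, 2.0)]
duration = 5.0
print(invert_intervals(keep, 0.0, duration))       # the regions to delete
print(invert_intervals(keep, duration, duration))  # nothing left to invert
```

With the full range (0 to duration), the keep interval yields two regions to delete. When the range starts at duration, the result is empty, which matches the behavior described in this issue.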
  • Validate support for Klattgrids

    I had a project that needed support for Klattgrids, so I added it to Praatio. However that was a long time ago and I'm not sure the code still is functioning.

    It was refactored a bit in the move from Praatio 4 -> Praatio 5.

    We should add some more robust tests for it maybe?

    opened by timmahrt 0