1651 Python Text-To-Audio-Windows Libraries

PaSST: Efficient Training of Audio Transformers with Patchout

PaSST: Efficient Training of Audio Transformers with Patchout This is the implementation for Efficient Training of Audio Transformers with Patchout Pa

165 Dec 26, 2022

glow-speak is a fast, local, neural text to speech system that uses eSpeak-ng as a text/phoneme front-end.

Glow-Speak glow-speak is a fast, local, neural text to speech system that uses eSpeak-ng as a text/phoneme front-end. Installation git clone https://g

8 Dec 25, 2022

Use Youdao OCR API to covert your clipboard image to text.

Alfred Clipboard OCR 注：本仓库基于 oott123/alfred-clipboard-ocr 的逻辑用 Python 重写，换用了有道 AI 的 API，准确率更高，有效防止百度导致隐私泄露等问题，并且有道 AI 初始提供的 50 元体验金对于其资费而言个人用户基本可以永久使用

6 Sep 19, 2022

Identify the emotion of multiple speakers in an Audio Segment

MevonAI - Speech Emotion Recognition Identify the emotion of multiple speakers in a Audio Segment Report Bug · Request Feature Try the Demo Here Table

110 Dec 3, 2022

Image-Viewer is a Windows image viewer based on Python 3.

Image-Viewer Hi! Image-Viewer is a Windows image viewer based on Python 3. Using You must download Image-Viewer.exe from the root of the repository. T

2 Apr 18, 2022

Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge

Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge This is an implementation of the paper,

19 Oct 14, 2022

The description of FMFCC-A (audio track of FMFCC) dataset and Challenge resluts.

FMFCC-A This project is the description of FMFCC-A (audio track of FMFCC) dataset and Challenge resluts. The FMFCC-A dataset is shared through BaiduCl

2 Oct 20, 2021

Proposed n-stage Latent Dirichlet Allocation method - A Novel Approach for LDA

n-stage Latent Dirichlet Allocation (n-LDA) Proposed n-LDA & A Novel Approach for classical LDA Latent Dirichlet Allocation (LDA) is a generative prob

4 Mar 7, 2022

The description of FMFCC-A (audio track of FMFCC) dataset and Challenge resluts.

FMFCC-A This project is the description of FMFCC-A (audio track of FMFCC) dataset and Challenge resluts. The FMFCC-A dataset is shared through BaiduCl

18 Dec 24, 2022

Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)

Taming Visually Guided Sound Generation • [Project Page] • [ArXiv] • [Poster] • • Listen for the samples on our project page. Overview We propose to t

226 Jan 3, 2023

SPRING is a seq2seq model for Text-to-AMR and AMR-to-Text (AAAI2021).

SPRING This is the repo for SPRING (Symmetric ParsIng aNd Generation), a novel approach to semantic parsing and generation, presented at AAAI 2021. Wi

98 Dec 21, 2022

Photini - A free, easy to use, digital photograph metadata (Exif, IPTC, XMP) editing application for Linux, Windows and MacOS.

A free, easy to use, digital photograph metadata (Exif, IPTC, XMP) editing application for Linux, Windows and MacOS. "Metadata" is said to mea

120 Dec 20, 2022

PyPacker: a dumb little script for turning Python apps into standalone executable packages on Windows

PyPacker: a dumb little script for turning Python apps into standalone executable packages on Windows PyPacker is my attempt at creating a way to make

9 Mar 15, 2022

Official implementation of A cappella: Audio-visual Singing VoiceSeparation, from BMVC21

Y-Net Official implementation of A cappella: Audio-visual Singing VoiceSeparation, British Machine Vision Conference 2021 Project page: ipcv.github.io

12 Oct 22, 2022

Official Implementation for "StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery" (ICCV 2021 Oral)

StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery (ICCV 2021 Oral) Run this model on Replicate Optimization: Global directions: Mapper: Check ou

3.3k Jan 5, 2023

Styled Handwritten Text Generation with Transformers (ICCV 21)

⚡ Handwriting Transformers [PDF] Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan & Mubarak Shah Abstract: We

85 Dec 22, 2022

ScreenTeX is a tool that grabs all text when taking a screenshot rather than getting an image.

The ScreenTeX project By: Seanpm2001 / ScreenTeX, Et; Al. Top README.md Read this article in a different language 🌐 List of languages Sorted by: A-Z

3 Oct 25, 2022

Audio Steganography is a technique used to transmit hidden information by modifying an audio signal in an imperceptible manner.

Audio Steganography Audio Steganography is a technique used to transmit hidden information by modifying an audio signal in an imperceptible manner. Ab

1 Oct 17, 2021

Draw tree diagrams from indented text input

Draw tree diagrams This repository contains two very different scripts to produce hierarchical tree diagrams like this one: $ ./classtree.py collectio

8 Dec 14, 2022

Acoustic mosquito detection code with Bayesian Neural Networks

HumBugDB Acoustic mosquito detection with Bayesian Neural Networks. Extract audio or features from our large-scale dataset on Zenodo. This repository

31 Nov 28, 2022

🚧Useful shortcuts for simple task on windows

Windows Manager A tool containg useful utilities for performing simple shortcut tasks on Windows 10 OS. Features Lit Up - Turns up screen brightness t

0 Mar 24, 2022

LOT: A Benchmark for Evaluating Chinese Long Text Understanding and Generation

LOT: A Benchmark for Evaluating Chinese Long Text Understanding and Generation Tasks | Datasets | LongLM | Baselines | Paper Introduction LOT is a ben

46 Dec 28, 2022

SciBERT is a BERT model trained on scientific text.

1.2k Dec 24, 2022

utoken is a multilingual tokenizer that divides text into words, punctuation and special tokens such as numbers, URLs, XML tags, email-addresses and hashtags.

utoken utoken is a multilingual tokenizer that divides text into words, punctuation and special tokens such as numbers, URLs, XML tags, email-addresse

11 Jan 5, 2023

Weakly-supervised Text Classification Based on Keyword Graph

Weakly-supervised Text Classification Based on Keyword Graph How to run? Download data Our dataset follows previous works. For long texts, we follow C

20 Dec 29, 2022

Classes and functions for animated text and graphics on an LED display

LEDarcade A collection of classes and functions for animated text and graphics on an Adafruit LED Matrix.

31 Jan 4, 2023

=|= the MsgRoom bot for Windows 96

=|= bot A MsgRoom bot in Python 3 for Windows96.net. The bot joins as =|=, if parameter name_lasts is not true and default_name is =|=. The full

2 Jun 7, 2022

Use PaddlePaddle to reproduce the paper：mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer

MT5_paddle Use PaddlePaddle to reproduce the paper：mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer English | 简体中文 mT5: A Massively

2 Oct 17, 2021

Datashredder is a simple data corruption engine written in python. You can corrupt anything text, images and video.

Datashredder is a simple data corruption engine written in python. You can corrupt anything text, images and video. You can chose the cha

2 Jul 22, 2022

Text to QR-CODE

QR CODE GENERATO USING PYTHON Author : RAFIK BOUDALIA. Installation Use the package manager pip to install foobar. pip install pyqrcode Usage from tki

2 Oct 13, 2021

PyEditor - A Simple Text Editor for python

PyEditor work in progress Text Editor for python Installation git clone https://github.com/ArmenG888/PyEditor Install the libraries Linux or mac pip

3 Mar 15, 2022

Steal Files on a Windows Machine

File-Stealer Steal Files on a Windows Machine About This Script will steal certain Files on a Windows Machine and sends them to a FTP Server. Preview

5 Nov 17, 2022

use Notepad++ for real-time sync after python appending new log text

FTP远程log同步工具使用Notepad++配合来获取实时更新的log文档效果适用于FTP协议的log远程同步工具,配合MT管理器开启FTP服务器使用,通过Notepad++监听文本变化,更便捷的使用电脑查看方法注入打印后的信息功能过滤器对每行要打印的文本使用回调函数筛选，支持链式调用

1 Oct 17, 2021

A Jupyter notebook to play with NVIDIA's StyleGAN3 and OpenAI's CLIP for a text-based guided image generation.

175 Dec 29, 2022

translate using your voice

speech-to-text-translator Usage translate using your voice description this project makes translating a word easy, all you have to do is speak and...

1 Oct 18, 2021

Official implementation of A cappella: Audio-visual Singing VoiceSeparation, from BMVC21

Y-Net Official implementation of A cappella: Audio-visual Singing VoiceSeparation, British Machine Vision Conference 2021 Project page: ipcv.github.io

12 Oct 22, 2022

A simple python program to sign text using either the RSA or ISRSAC algorithm with GUI built using tkinter library.

Digital Signatures using ISRSAC Algorithm A simple python program to sign text using either the RSA or ISRSAC algorithm with GUI built using tkinter l

3 Nov 15, 2022

To lazy to read your homework ? Get it done with LOL

LOL To lazy to read your homework ? Get it done with LOL Needs python 3.x L:::::::::L OO:::::::::OO L:::::::::L L:::::::

4 Dec 8, 2022

A practical and feature-rich paraphrasing framework to augment human intents in text form to build robust NLU models for conversational engines. Created by Prithiviraj Damodaran. Open to pull requests and other forms of collaboration.

Parrot Parrot is a paraphrase based utterance augmentation framework purpose built to accelerate training NLU models. A paraphrase framework is more t

690 Jan 4, 2023

Download videos and audio with a graphical interface in python

Youtube-Downloader Download videos and audio with a graphical interface in python Windows To run windows using Command Prompt python main.py linux To

2 Jan 7, 2022

GUI for a Vocal Remover that uses Deep Neural Networks.

4.4k Jan 7, 2023

Code and data of the EMNLP 2021 paper "Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer"

StyleAttack Code and data of the EMNLP 2021 paper "Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer" Prepare Pois

19 Nov 20, 2022

A simple tool for searching images inside a local folder with text/image input using CLIP

clip-search (WIP) A simple tool for searching images inside a local folder with text/image input using CLIP 10 results for "a blonde woman" in a folde

5 Dec 25, 2022

AutoSub is a CLI application to generate subtitle files (.srt, .vtt, and .txt transcript) for any video file using Mozilla DeepSpeech.

AutoSub About Motivation Installation Docker How-to example How it works TO-DO Contributing References About AutoSub is a CLI application to generate

414 Jan 6, 2023

Telegram bot to download tiktok video/audio

TikTokDL (Bot) Telegram RoBot to Download Tiktok video/audio. Features: 👉 Download TikTok Video without Watermark 👉 Download TikTok Video with Water

23 Nov 21, 2022

This bot can stream audio or video files and urls in telegram voice chats :)

Voice Chat Streamer This bot can stream audio or video files and urls in telegram voice chats :) 🎯 Follow me and star this repo for more telegram bot

63 Dec 25, 2022

Youtube Downloader is a Graphic User Interface(GUI) that lets users download a Youtube Video or Audio through a URL

Youtube Downloader This Python and Tkinter based GUI allows users to directly download the Best Resolution Videos and Audios from Youtube. Pa-fy Insta

2 Jun 25, 2022

the best video downloader for terminals (currently only compatible with Linux and Windows)

2 Oct 14, 2021

FG-transformer-TTS Fine-grained style control in transformer-based text-to-speech synthesis

LST-TTS Official implementation for the paper Fine-grained style control in transformer-based text-to-speech synthesis. Submitted to ICASSP 2022. Audi

64 Dec 30, 2022

A fast implementation of bss_eval metrics for blind source separation

fast_bss_eval Do you have a zillion BSS audio files to process and it is taking days ? Is your simulation never ending ? Fear no more! fast_bss_eval i

99 Dec 13, 2022

CAMoE + Dual SoftMax Loss (DSL): Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss

CAMoE + Dual SoftMax Loss (DSL): Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss This is official implement of "

87 Dec 24, 2022

python写的一款免杀工具（shellcode加载器）BypassAV，国内杀软全过（windows denfend）

266 Jan 2, 2023

Audio media crawler for lbry.

Audio media crawler for lbry. Requirements Python 3.8 Poetry 1.1.7 Elasticsearch 7.14.0 Lbry-sdk 0.99.0 Development This project uses poetry as a depe

4 Dec 3, 2022

Leon is an open-source personal assistant who can live on your server.

Leon Your open-source personal assistant. Website :: Documentation :: Roadmap :: Contributing :: Story 👋 Introduction Leon is an open-source personal

11.7k Dec 30, 2022

Python SDK for working with Voicegain Speech-to-Text

Voicegain Speech-to-Text Python SDK Python SDK for the Voicegain Speech-to-Text API. This API allows for large vocabulary speech-to-text transcription

3 Dec 14, 2022

A PyTorch implementation of paper "Learning Shared Semantic Space for Speech-to-Text Translation", ACL (Findings) 2021

Chimera: Learning Shared Semantic Space for Speech-to-Text Translation This is a Pytorch implementation for the "Chimera" paper Learning Shared Semant

43 Dec 28, 2022

African language Speech Recognition - Speech-to-Text

Swahili-Speech-To-Text Table of Contents Swahili-Speech-To-Text Overview Scenario Approach Project Structure data: models: notebooks: scripts tests: l

2 Jan 5, 2023

Neural text generators like the GPT models promise a general-purpose means of manipulating texts.

Boolean Prompting for Neural Text Generators Neural text generators like the GPT models promise a general-purpose means of manipulating texts. These m

20 Jan 9, 2023

Phishing-Detective is a command line application for Windows 10 built to detect a phishing site from two url's

Phishing-Detective Phishing-Detective is a command line application for Windows 10 built to detect a phishing site from two url's How it works A simpl

2 Jun 23, 2022

Basic Clojure REPL for Sublime Text

Basic Clojure REPL for Sublime Text Goals: Decomplected: just REPL, nothing more Zero dependencies: works directly with pREPL Compact: Display code ev

23 Dec 24, 2021

Code Jam for creating a text-based adventure game engine and custom worlds

Text Based Adventure Jam Author: Devin McIntyre Our goal is two-fold: Create a text based adventure game engine that can parse a standard file format

4 Dec 26, 2021

Official repository of PanoAVQA: Grounded Audio-Visual Question Answering in 360° Videos (ICCV 2021)

Pano-AVQA Official repository of PanoAVQA: Grounded Audio-Visual Question Answering in 360° Videos (ICCV 2021) [Paper] [Poster] [Video] Getting Starte

9 Dec 23, 2022

TCube generates rich and fluent narratives that describes the characteristics, trends, and anomalies of any time-series data (domain-agnostic) using the transfer learning capabilities of PLMs.

TCube: Domain-Agnostic Neural Time series Narration This repository contains the code for the paper: "TCube: Domain-Agnostic Neural Time series Narrat

7 Oct 31, 2021

Repository sharing code and the model for the paper "Rescoring Sequence-to-Sequence Models for Text Line Recognition with CTC-Prefixes"

Rescoring Sequence-to-Sequence Models for Text Line Recognition with CTC-Prefixes Setup virtualenv -p python3 venv source venv/bin/activate pip instal

9 May 20, 2022

Source code for the paper "SEPP: Similarity Estimation of Predicted Probabilities for Defending and Detecting Adversarial Text" PACLIC 2021

Adversarial text generator Refer to "adversarial_text_generator"[https://github.com/quocnsh/SEPP_generator] project for generating adversarial texts A

0 Oct 5, 2021

University of Rochester 2021 Summer REU focusing on music sentiment transfer using CycleGAN

Music-Sentiment-Transfer University of Rochester 2021 Summer REU focusing on music sentiment transfer using CycleGAN Poster: Music Sentiment Transfer

2 Jan 24, 2022

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" on Semantic Segmentation.

Swin Transformer for Semantic Segmentation of satellite images This repo contains the supported code and configuration files to reproduce semantic seg

23 Oct 10, 2022

This is text based adventure game

CHOOSE-YOUR-OWN-ADVENTURE This is text based adventure game CONTRIBUTORS Aditya binukumar Srishti Sharma Shiva Tripathi Tanishq Tanwar ABOUT Theme: Ho

3 May 31, 2022

A Low Complexity Speech Enhancement Framework for Full-Band Audio (48kHz) based on Deep Filtering.

DeepFilterNet A Low Complexity Speech Enhancement Framework for Full-Band Audio (48kHz) based on Deep Filtering. libDF contains Rust code used for dat

292 Dec 25, 2022

WorldCloud Orçamento de Estado 2022

World Cloud Orçamento de Estado 2022 What it does This script creates a worldcloud, masked on a image, from a txt file How to run it? Install all libr

2 Oct 12, 2021

A tool for making simple-style text posters or wallpapers with high resolution.

PurePoster PurePoster is a fancy tool for making arbitrary-resolution, simple-style posters or wallpapers with text in center. Functionality PurePoste

4 Jul 9, 2022

This module extends twarc to allow you to print out tweets as text for easy testing on the command line

twarc-text This module extends twarc to allow you to print out tweets as text for easy testing on the command line. Maybe it's useful for spot checkin

2 Oct 12, 2021

Audio-Visual Generalized Few-Shot Learning with Prototype-Based Co-Adaptation

Audio-Visual Generalized Few-Shot Learning with Prototype-Based Co-Adaptation The code repository for "Audio-Visual Generalized Few-Shot Learning with

3 Jun 27, 2022

Delphi's FireMonkey framework as a Python module for Windows, MacOS, Linux, and Android GUI development.

DelphiFMX4Python Delphi's FireMonkey framework as a Python module for Windows, MacOS, Linux, and Android GUI development. About: The delphifmx library

191 Jan 9, 2023

Official PyTorch implementation of "AASIST: Audio Anti-Spoofing using Integrated Spectro-Temporal Graph Attention Networks"

AASIST This repository provides the overall framework for training and evaluating audio anti-spoofing systems proposed in 'AASIST: Audio Anti-Spoofing

56 Jan 2, 2023

A PyTorch implementation of "From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network" (ICCV2021)

From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network The official code of VisionLAN (ICCV2021). VisionLAN successfully a

81 Dec 12, 2022

Azure Neural Speech Service TTS

Written in Python using the Azure Speech SDK. App.py provides an easy way to create an Text-To-Speech request to Azure Speech and download the wav file.

1 Oct 11, 2021

An audio-solving python funcaptcha solving module

funcapsolver funcapsolver is a funcaptcha audio-solving module, which allows captchas to be interacted with and solved with the use of google's speech

8 Nov 21, 2022

Code for classifying international patents based on the text of their titles/abstracts

Patent Classification Goal: To train a machine learning classifier that can automatically classify international patents downloaded from the WIPO webs

1 Nov 8, 2022

Paper: De-rendering Stylized Texts

Paper: De-rendering Stylized Texts Wataru Shimoda1, Daichi Haraguchi2, Seiichi Uchida2, Kota Yamaguchi1 1CyberAgent.Inc, 2 Kyushu University Accepted

55 Dec 18, 2022

Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech

EdiTTS: Score-based Editing for Controllable Text-to-Speech Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech. Au

98 Dec 25, 2022

Time-stretch audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included.

22 Jul 7, 2022

PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech

PortaSpeech - PyTorch Implementation PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech. Model Size Module Nor

279 Jan 4, 2023

Every Google, Azure & IBM text to speech voice for free

TTS-Grabber Quick thing i made about a year ago to download any text with any tts voice, over 630 voices to choose from currently. It will split the i

16 Dec 7, 2022

Simple python code to fix your combo list by removing any text after a separator or removing duplicate combos

Combo List Fixer A simple python code to fix your combo list by removing any text after a separator or removing duplicate combos Removing any text aft

3 Dec 5, 2022

Given a string or a text file with plain text , returns his encryption using SHA256 method

Encryption using SHA256 Given a string or a .txt file with plain text. Returns his encryption using SHA256 method Requirements : pip install pyperclip

3 Jan 24, 2022

text to speech toolkit. 好用的中文语音合成工具箱，包含语音编码器、语音合成器、声码器和可视化模块。

ttskit Text To Speech Toolkit: 语音合成工具箱。安装 pip install -U ttskit 注意可能需另外安装的依赖包：torch，版本要求torch=1.6.0,=1.7.1，根据自己的实际环境安装合适cuda或cpu版本的torch。 ttskit的

483 Jan 4, 2023

An easy-to-use Python module that helps you to extract the BERT embeddings for a large text dataset (Bengali/English) efficiently.

37 Sep 5, 2022

Train 🤗-transformers model with Poutyne.

poutyne-transformers Train 🤗 -transformers models with Poutyne. Installation pip install poutyne-transformers Example import torch from transformers

2 Dec 18, 2022

Repository containing the code for An-Gocair text normaliser

Scottish Gaelic Text Normaliser The following project contains the code and resources for the Scottish Gaelic text normalisation project. The repo can

3 Jun 28, 2022

EdiTTS: Score-based Editing for Controllable Text-to-Speech

Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech

99 Jan 2, 2023

Group imports from Windows binaries

importsort This is a tool that I use to group imports from Windows binaries. Sometimes, you have a gigantic folder full of executables, and you want t

15 Aug 27, 2022

PortaSpeech - PyTorch Implementation

PortaSpeech - PyTorch Implementation PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech. Model Size Module Nor

276 Dec 26, 2022

A taskbar clock for secondary taskbars on Windows 11

ElevenClock A taskbar clock for secondary taskbars on Windows 11. When microsoft's engineers were creating Windows 11, they forgot to add a clock on t

1.7k Jan 7, 2023

PyTorch implementation of Microsoft's text-to-speech system FastSpeech 2: Fast and High-Quality End-to-End Text to Speech.

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

1k Dec 30, 2022

Signature remover is a NLP based solution which removes email signatures from the rest of the text.

Signature Remover Signature remover is a NLP based solution which removes email signatures from the rest of the text. It helps to enchance data conten

8 Jan 6, 2023

a python package that lets you add custom colors and text formatting to your scripts in a very easy way!

colormate Python script text formatting package What is colormate? colormate is a python library that lets you add text formatting to your scripts, it

2 Dec 14, 2022

Write maintainable, production-ready pipelines using Jupyter or your favorite text editor. Develop locally, deploy to the cloud. ☁️

2.9k Jan 6, 2023

Change your Windows background with this program safely & easily!

Background_Changer Table of Contents: About the Program Features Requirements Preview Credits Reach Me See Also About the Program: You can change your

0 Jul 14, 2022

This is the remake of the program PYOBD. It works on Python3 and all new libraries. It was tested on Linux, Windows, and it should work on MAC too.

This is the remake of the program PYOBD. It works on Python3 and all new libraries. It was tested on Linux, Windows, and it should work on MAC too. You just need an ELM327 USB or bluetooth device and a PC(laptop preferably).

127 Jan 6, 2023

Python Text-To-Audio-Windows Resources

Python Text-To-Audio-Windows Libraries

PaSST: Efficient Training of Audio Transformers with Patchout

glow-speak is a fast, local, neural text to speech system that uses eSpeak-ng as a text/phoneme front-end.

Use Youdao OCR API to covert your clipboard image to text.

Identify the emotion of multiple speakers in an Audio Segment

Image-Viewer is a Windows image viewer based on Python 3.

Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge

The description of FMFCC-A (audio track of FMFCC) dataset and Challenge resluts.

Proposed n-stage Latent Dirichlet Allocation method - A Novel Approach for LDA

The description of FMFCC-A (audio track of FMFCC) dataset and Challenge resluts.

Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)

SPRING is a seq2seq model for Text-to-AMR and AMR-to-Text (AAAI2021).

Photini - A free, easy to use, digital photograph metadata (Exif, IPTC, XMP) editing application for Linux, Windows and MacOS.

PyPacker: a dumb little script for turning Python apps into standalone executable packages on Windows

Official implementation of A cappella: Audio-visual Singing VoiceSeparation, from BMVC21

Official Implementation for "StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery" (ICCV 2021 Oral)

Styled Handwritten Text Generation with Transformers (ICCV 21)

ScreenTeX is a tool that grabs all text when taking a screenshot rather than getting an image.

Audio Steganography is a technique used to transmit hidden information by modifying an audio signal in an imperceptible manner.

Draw tree diagrams from indented text input

Acoustic mosquito detection code with Bayesian Neural Networks

🚧Useful shortcuts for simple task on windows

LOT: A Benchmark for Evaluating Chinese Long Text Understanding and Generation

SciBERT is a BERT model trained on scientific text.

utoken is a multilingual tokenizer that divides text into words, punctuation and special tokens such as numbers, URLs, XML tags, email-addresses and hashtags.

Weakly-supervised Text Classification Based on Keyword Graph

Classes and functions for animated text and graphics on an LED display

=|= the MsgRoom bot for Windows 96

Use PaddlePaddle to reproduce the paper：mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer

Datashredder is a simple data corruption engine written in python. You can corrupt anything text, images and video.

Text to QR-CODE

PyEditor - A Simple Text Editor for python

Steal Files on a Windows Machine

use Notepad++ for real-time sync after python appending new log text

A Jupyter notebook to play with NVIDIA's StyleGAN3 and OpenAI's CLIP for a text-based guided image generation.

translate using your voice

Official implementation of A cappella: Audio-visual Singing VoiceSeparation, from BMVC21

A simple python program to sign text using either the RSA or ISRSAC algorithm with GUI built using tkinter library.

To lazy to read your homework ? Get it done with LOL

A practical and feature-rich paraphrasing framework to augment human intents in text form to build robust NLU models for conversational engines. Created by Prithiviraj Damodaran. Open to pull requests and other forms of collaboration.

Download videos and audio with a graphical interface in python

GUI for a Vocal Remover that uses Deep Neural Networks.

Code and data of the EMNLP 2021 paper "Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer"

A simple tool for searching images inside a local folder with text/image input using CLIP

AutoSub is a CLI application to generate subtitle files (.srt, .vtt, and .txt transcript) for any video file using Mozilla DeepSpeech.

Telegram bot to download tiktok video/audio

This bot can stream audio or video files and urls in telegram voice chats :)

Youtube Downloader is a Graphic User Interface(GUI) that lets users download a Youtube Video or Audio through a URL

the best video downloader for terminals (currently only compatible with Linux and Windows)

FG-transformer-TTS Fine-grained style control in transformer-based text-to-speech synthesis

A fast implementation of bss_eval metrics for blind source separation

CAMoE + Dual SoftMax Loss (DSL): Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss

python写的一款免杀工具（shellcode加载器）BypassAV，国内杀软全过（windows denfend）

Audio media crawler for lbry.

Leon is an open-source personal assistant who can live on your server.

Python SDK for working with Voicegain Speech-to-Text

A PyTorch implementation of paper "Learning Shared Semantic Space for Speech-to-Text Translation", ACL (Findings) 2021

African language Speech Recognition - Speech-to-Text

Neural text generators like the GPT models promise a general-purpose means of manipulating texts.

Phishing-Detective is a command line application for Windows 10 built to detect a phishing site from two url's

Basic Clojure REPL for Sublime Text

Code Jam for creating a text-based adventure game engine and custom worlds

Official repository of PanoAVQA: Grounded Audio-Visual Question Answering in 360° Videos (ICCV 2021)

TCube generates rich and fluent narratives that describes the characteristics, trends, and anomalies of any time-series data (domain-agnostic) using the transfer learning capabilities of PLMs.

Repository sharing code and the model for the paper "Rescoring Sequence-to-Sequence Models for Text Line Recognition with CTC-Prefixes"

Source code for the paper "SEPP: Similarity Estimation of Predicted Probabilities for Defending and Detecting Adversarial Text" PACLIC 2021

University of Rochester 2021 Summer REU focusing on music sentiment transfer using CycleGAN

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" on Semantic Segmentation.

This is text based adventure game

A Low Complexity Speech Enhancement Framework for Full-Band Audio (48kHz) based on Deep Filtering.

WorldCloud Orçamento de Estado 2022

A tool for making simple-style text posters or wallpapers with high resolution.

This module extends twarc to allow you to print out tweets as text for easy testing on the command line

Audio-Visual Generalized Few-Shot Learning with Prototype-Based Co-Adaptation

Delphi's FireMonkey framework as a Python module for Windows, MacOS, Linux, and Android GUI development.

Official PyTorch implementation of "AASIST: Audio Anti-Spoofing using Integrated Spectro-Temporal Graph Attention Networks"

A PyTorch implementation of "From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network" (ICCV2021)

Azure Neural Speech Service TTS

An audio-solving python funcaptcha solving module