User-friendly Voice Cloning Application

Overview


Multi-Language-RTVC stands for Multi-Language Real Time Voice Cloning. It is a voice cloning tool capable of transferring speaker-specific audio features to synthesize speech in that speaker's voice, based on just a few seconds of unknown audio data.

License

This code is licensed under the MIT License. For more information on the license model and the associated rights and duties, see the linked license text.

Project History

This project was started in 2021 with the goal of continuing Corentin Jemine's Real-Time-Voice-Cloning. It originated from the wish for multi-language support in voice cloning models and is now maintained and enhanced by contributing volunteers.

Contributing

We welcome all those interested in the project, from beginners to experts. The MLRTVC community aims for a friendly, open-minded and efficient working climate. We encourage all those with ideas to take part in the project by sharing their thoughts.
There are multiple meaningful ways of contributing:

  • Developing code (new features, fixes, enhancements)
  • Writing documentation
  • Raising issues (bugs, feature requests, enhancement proposals, code refactoring, etc.)
  • Providing pre-trained models
  • Participating in community tasks (code reviews, discussions, maintenance, etc.)

For transparency, we ask you to engage with this project through the official channels (issues, pull requests) so that knowledge and questions are shared publicly. Other communication channels (email, chat, etc.) are accepted only where privacy or confidentiality is of great importance.

Further information can be found in the Contributing Guidelines.

Comments
  • Building the MLRTVC Application together

    @Dannypeja @Aatularyan @AsterTheWanderer @Fzr2k @CrazyPlaysHD @lavarith @Andredenise @wotulong @cryptoCommunitiesCONTROLfuture-RAOKindx @HirabayashiCallie @DimensionWarped @Archviz360 @francisco-renteria @AlexSteveChungAlvarez @mithisha @brcisna @PabloSandovalxtxt @alexpeattie @matheusfillipe @vanpelt @cforcomputer @cclauss @mathigatti @CorentinJ @akaitsuki-ii @LeonardsonCC @greatabel @Jinksi @sebiglesias @akozdev @musikalkemist @aydodo @SynapticSage @noahyoda @vidalmaxime @jason-h-35 @thecodeboss @iamgroot42 @leopard627 @mruettgers @berdakh @adilsoybali @EdwardELeininger @DanielLin94144 @PatentLobster

    Dear fans of the Voice Cloning Community,

    Since we are all interested in voice cloning or audio engineering in general, and have all either used or co-developed the Real-Time-Voice-Cloning repo, we are well aware that the repository is no longer maintained and lacks structure. As a result, many errors persist, issues cannot be handled and the documentation is unsatisfactory.

    For these and other reasons, I have decided to begin a new project that aims to provide the functionality of Real-Time-Voice-Cloning while enhancing it: adding support for other languages, taking multiple developers/maintainers aboard and writing good, clear documentation that explains both the theory behind the code and its application.

    Since I wanted to wait for an initial response, I have not yet begun structuring the repository (except for some basics) but I have a few things in mind that need to be taken care of if we want to improve the original repo:

    1. Writing documentation that can actually help beginners (tricky, needs to be done by someone who really understands the theory of pre-processing, spectrograms, etc.). First, a wiki structure needs to be discussed/drafted before actually writing it with all its sections (Theory, Prerequisites, Download, Learning, Training, Using GUI, etc.)
    2. Structuring the directory hierarchy (lots of loose scripts in the old repo, maybe structuring it differently/cleaner -> would mean code adaptation though)
    3. Adding pre-trained models for different languages (I am working on a German model myself at the moment, could take some time though)
    4. Developing a well-oiled work routine where issues get handled, new ideas discussed and things done.

    It would be really sad to see a good idea fall by the wayside just for lack of maintenance.

    I am waiting for responses...

    Greetings Sven

    help wanted question discussion/vote 
    opened by sveneschlbeck 11
  • Discord Server Community

    I have created an invite link for the next 100 members: https://discord.gg/asym7axdgd After that we can renew it; I wanted to avoid getting trolled. Feel free to add it to the readme @sveneschlbeck

    opened by Dannypeja 2
  • Add missing data_parallel_workaround import.

    Attempt to resolve error on multi-GPU system where vocoder preprocess fails because "data_parallel_workaround" is not recognized in mlrtvc/src/core/synthesizer/synthesize.py.

    opened by raccoonML 1
  • Repository structure

    I have refactored the code from the original RTVC into the new repo structure that we developed in https://github.com/sveneschlbeck/Multi-Language-RTVC/discussions/2 .

    Everything is functional, including the toolbox, CLI, preprocess and training. The training scripts save into the appropriate folder in the new saved_models folder structure, including the language code.

    Some documentation on preprocess and training is provided. It is not enough, but can be expanded upon.

    I did not implement a unified mel spectrogram definition for the encoder, synth, and vocoder. This is no longer necessary as the repo directory structure has been simplified without it.

    A downloadable release including pretrained models can be obtained here: https://github.com/raccoonML/Multi-Language-RTVC/releases/tag/v1.0

    opened by raccoonML 1
  • Implement accessibility and sapification support

    Hi devs. I'm not a programmer, but I suggest making the Windows app accessible to screen readers and assistive technologies. I also suggest letting the app export models as SAPI5 voices so that assistive technologies can use them. The most crucial requirement is responsiveness and real-time output, which means that the voice shouldn't lag before speech starts or in the middle of inference. Please implement these suggestions. What does this app use for its GUI? wxPython is the most accessible toolkit in my opinion, as it uses native controls that are accessible to assistive technologies. Thanks, and I hope you implement them.

    opened by king-dahmanus 7
  • hifi-gan vocoder

    As many of you know, hifi-gan is a recent neural vocoder that is very popular for its speed and quality. I will replace the current WaveRNN vocoder with hifi-gan.

    Making this announcement to avoid duplication of effort.

    advanced 
    opened by raccoonML 4
  • First step to multi-language support

    #11 @raccoonML @Dannypeja @Aatularyan @AsterTheWanderer @Fzr2k @CrazyPlaysHD

    I just pushed some changes to the new branch ml_support (https://github.com/sveneschlbeck/Multi-Language-RTVC/tree/ml_support).

    I updated the following files:

    • mlrtvc/src/core/synthesizer/utils/symbols.py
    • mlrtvc/src/core/synthesizer/utils/cleaners.py
    • mlrtvc/src/core/synthesizer/utils/numbers.py

    The changes are quite complicated as they reveal further problems:

    1. The language_code parameter now appears very often. I haven't yet thought about what needs to be done with it or where it can be called from; I just added it as an argument to several functions.
    2. While adding abbreviations for other languages was quite easy, number handling is much harder. The difference between decimal points (USA) and decimal commas (EU & more) makes numbers difficult to deal with.
    3. All methods imported from inflect are designed for the English language (plurals, declensions, etc.). I don't know how to solve this for other languages. This is still a big flaw in the code.
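
The decimal-separator problem from item 2 can be illustrated with a minimal sketch. The SEPARATORS table and the normalize_number helper are hypothetical names that do not exist in the repository, and the per-language conventions shown are simplified assumptions (French grouping, for instance, often uses a non-breaking space rather than a plain space).

```python
# Hypothetical table (not part of the repository): grouping and decimal
# marks per language code. The conventions are simplified assumptions.
SEPARATORS = {
    "en": {"thousands": ",", "decimal": "."},
    "de": {"thousands": ".", "decimal": ","},
    "es": {"thousands": ".", "decimal": ","},
    "fr": {"thousands": " ", "decimal": ","},
}


def normalize_number(token: str, language_code: str) -> str:
    """Rewrite a numeric token into a canonical form with '.' as decimal mark.

    Grouping separators are dropped first, then the language's decimal mark
    is replaced by a dot, so later steps can treat every language alike.
    """
    seps = SEPARATORS[language_code]
    without_grouping = token.replace(seps["thousands"], "")
    return without_grouping.replace(seps["decimal"], ".")
```

With this sketch, normalize_number("1.234,5", "de") and normalize_number("1,234.5", "en") produce the same canonical string, so only the later step that spells the number out as words needs to be language-specific.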

    The checklist for the next steps involves:

    • [ ] Adding abbreviations (from native speakers if possible) for English, German, Spanish & French in the module mlrtvc/src/core/synthesizer/utils/cleaners.py
    • [ ] Checking symbols/characters/letters (from native speakers if possible) for English, German, Spanish & French in the module mlrtvc/src/core/synthesizer/utils/symbols.py
    • [ ] Solutions/substitutes for inflect
    • [ ] Embedding the changed files into the context of all other files (especially thinking about language_code parameter that appears in different functions but isn't yet defined/called/imported/read out)
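
One possible shape for the per-language abbreviation tables from the checklist, sketched under assumptions: the entries are illustrative placeholders rather than vetted lists from native speakers, and expand_abbreviations with a language_code argument is a hypothetical signature, not necessarily the one used in cleaners.py.

```python
import re

# Hypothetical per-language abbreviation tables; entries are placeholders.
_ABBREVIATIONS = {
    "en": [("mr", "mister"), ("dr", "doctor")],
    "de": [("nr", "Nummer"), ("dr", "Doktor")],
}

# Pre-compile one case-insensitive pattern per abbreviation. Each pattern
# matches the abbreviation at a word boundary followed by a literal dot.
_COMPILED = {
    lang: [
        (re.compile(r"\b%s\." % re.escape(abbr), re.IGNORECASE), full)
        for abbr, full in pairs
    ]
    for lang, pairs in _ABBREVIATIONS.items()
}


def expand_abbreviations(text: str, language_code: str) -> str:
    """Replace known abbreviations using the table for the given language."""
    for pattern, replacement in _COMPILED[language_code]:
        text = pattern.sub(replacement, text)
    return text
```

Keeping the language code as a dictionary key means a new language only needs a new table entry, and an unsupported code fails loudly with a KeyError instead of silently falling back to English.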
    enhancement help wanted question advanced expert 
    opened by sveneschlbeck 14
  • ``environment.yml``

    The requirements.txt in the old repo was inaccurate and not very well written. We need to fill the environment.yml in this repository with the essential packages, pinning a package version only where that exact version is really required.

    help wanted advanced 
    opened by sveneschlbeck 4
  • Added GUI docker support

    I added a Dockerfile for setting up this repo. Some guides recommend venv, some virtualenv, some conda, and they mostly don't work the same way. I created this Dockerfile to run everything inside a Docker environment, so a few docker build and docker run commands should be enough to get the environment working.

    I'm not sure about the location for documentation, since the repository is completely empty as of now. I have added instructions for using the Dockerfile under the ## Using Docker section in the main README.md.

    To display the GUI outside the Docker container you need to forward X; on Arch Linux, for example, this is done with the xhost + command. This will change depending on the OS, so feel free to add to it if you know about other OSes.

    opened by rushic24 3
Owner
Sven Eschlbeck
"The more I C, the less I see."