100 Repositories
Python jvm-languages Libraries
GULAG: GUessing LAnGuages with neural networks
GULAG: GUessing LAnGuages with neural networks Classify languages in text via neural networks. Привет! My name is Egor. Was für ein herrliches Frühl
Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further languages
Coreferee Author: Richard Paul Hudson, Explosion AI 1. Introduction 1.1 The basic idea 1.2 Getting started 1.2.1 English 1.2.2 French 1.2.3 German 1.2
Optical Character Recognition + Instance Segmentation for russian and english languages
Распознавание рукописного текста в школьных тетрадях Соревнование, проводимое в рамках олимпиады НТО, разработанное Сбером. Платформа ODS. Результаты
Process text, including tokenizing and representing sentences as vectors and Applying some concepts like RNN, LSTM and GRU to create a classifier can detect the language in which a sentence is written from among 17 languages.
Language Identifier What is this ? The goal of this project is to create a model that is able to predict a given sentence language through text proces
A python-jvm wrapper for interacting with the Ergo blockchain.
ergpy A python-jvm wrapper for interacting with the Ergo blockchain. Usage Requires Java 8 or higher to be installed on system pip install JPype1 pip
Input english text, then translate it between languages n times using the Deep Translator Python Library.
mass-translator About Input english text, then translate it between languages n times using the Deep Translator Python Library. How to Use Install dep
Transcript-Extractor-Bot - Yet another Telegram Voice Recognition bot but using vosk and supports 20+ languages
transcript extractor Yet another Telegram Voice Recognition bot but using vosk a
Searches a document for hash tags. Support multiple natural languages. Works in various contexts.
ht-getter Searches a document for hash tags. Supports multiple natural languages. Works in various contexts. This package uses a non-regex approach an
Learn other languages using artificial intelligence with python.
The main idea of the project is to facilitate the learning of other languages. We created a simple AI that will interact with you. Just ask questions that if she knows, she will answer.
Convolutional neural network that analyzes self-generated images in a variety of languages to find etymological similarities
This project is a convolutional neural network (CNN) that analyzes self-generated images in a variety of languages to find etymological similarities. Specifically, the goal is to prove that computer vision can be used to identify cognates known to exist, and perhaps lead linguists to evidence of unknown cognates.
This repository collects together basic linguistic processing data for using dataset dumps from the Common Voice project
Common Voice Utils This repository collects together basic linguistic processing data for using dataset dumps from the Common Voice project. It aims t
Multilingual finetuning of Machine Translation model on low-resource languages. Project for Deep Natural Language Processing course.
Low-resource-Machine-Translation This repository contains the code for the project relative to the course Deep Natural Language Processing. The goal o
NaijaSenti is an open-source sentiment and emotion corpora for four major Nigerian languages
NaijaSenti is an open-source sentiment and emotion corpora for four major Nigerian languages. This project was supported by lacuna-fund initiatives. Jump straight to one of the sections below, or just scroll down to find out more.
Code for the paper titled "Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages"
Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages Code for the paper titled "Prabhupadavani: A Code-mixed Speech Translation Data
Calculatrix is a project where I'll create plenty of calculators in a lot of differents languages
Calculatrix What is Calculatrix ? Calculatrix is a project where I'll create plenty of calculators in a lot of differents languages. I know this sound
DaCe is a parallel programming framework that takes code in Python/NumPy and other programming languages
aCe - Data-Centric Parallel Programming Decoupling domain science from performance optimization. DaCe is a parallel programming framework that takes c
Parsing, analyzing, and comparing source code across many languages
Semantic semantic is a Haskell library and command line tool for parsing, analyzing, and comparing source code. In a hurry? Check out our documentatio
English loanwords in the world's languages
Wiktionary as CLDF Content cldf1 and cldf2 contain cldf-conform data sets with a total of 2 377 756 entries about the vocabulary of all 1403 languages
Text editor on python tkinter to convert english text to other languages with the help of ployglot.
Transliterator Text Editor This is a simple transliteration program which is used to convert english word to phonetically matching word in another lan
Implements a polyglot REPL which supports multiple languages and shared meta-object protocol scope between REPLs.
MetaCall Polyglot REPL Description This repository implements a Polyglot REPL which shares the state of the meta-object protocol between the REPLs. Us
A calibre plugin that generates Word Wise and X-Ray files then sends them to Kindle. Supports KFX, AZW3 and MOBI eBooks. X-Ray supports 18 languages.
WordDumb A calibre plugin that generates Word Wise and X-Ray files then sends them to Kindle. Supports KFX, AZW3 and MOBI eBooks. Languages X-Ray supp
SASM - simple crossplatform IDE for NASM, MASM, GAS and FASM assembly languages
SASM (SimpleASM) - простая кроссплатформенная среда разработки для языков ассемблера NASM, MASM, GAS, FASM с подсветкой синтаксиса и отладчиком. В SA
Algorithms written in different programming languages
Data Structures and Algorithms Clean example implementations of data structures and algorithms written in different languages. List of implementations
Diff Match Patch is a high-performance library in multiple languages that manipulates plain text.
The Diff Match and Patch libraries offer robust algorithms to perform the operations required for synchronizing plain text. Diff: Compare two blocks o
streamlit translator is used to detect and translate between languages created using gTTS, googletrans, pillow and streamlit python packages
Streamlit Translator Streamlit Translator is a simple translator app to detect and translate between languages. Streamlit Translator gets text and lan
A Python library for inspecting JVM class files (.class)
lawu Lawu is a human-friendly library for assembling, disassembling, and exploring JVM class files. It's highly suitable for automation tasks. Documen
OpenAI CLIP text encoders for multiple languages!
Multilingual-CLIP OpenAI CLIP text encoders for any language Colab Notebook · Pre-trained Models · Report Bug Overview OpenAI recently released the pa
Multilingual word vectors in 78 languages
Aligning the fastText vectors of 78 languages Facebook recently open-sourced word vectors in 89 languages. However these vectors are monolingual; mean
Similarity checking of sign languages
Similarity checking of sign languages This repository checks for similarity betw
Takes a string and puts it through different languages in Google Translate a requested amount of times, returning nonsense.
PythonTextObfuscator Takes a string and puts it through different languages in Google Translate a requested amount of times, returning nonsense. Requi
Contains descriptions and code of the mini-projects developed in various programming languages
TexttoSpeechAndLanguageTranslator-project introduction A pleasant application where the client will be given buttons like play,reset and exit. The cli
Tools and data for measuring the popularity & growth of various programming languages.
growth-data Tools and data for measuring the popularity & growth of various programming languages. Install the dependencies $ pip install -r requireme
A Telegram bot to extracting text from images. All languages supported.
OCR Bot A Telegram bot to extracting text from images. All languages supported. Deploy to Heroku Local Deploying Clone the repo git clone https://gith
A telegram bot to translate 100+ Languages
🔥 GOOGLE TRANSLATER 🔥 The owner would not be responsible for any kind of bans due to the bot. • ⚡ INSTALLING ⚡ • • 🔰 Deploy To Railway 🔰 • • ✅ OFF
🏆 • 5050 most frequent words in 109 languages
🏆 Most Common Words Multilingual 5000 most frequent words in 109 languages. Uses wordfrequency.info as a source. 🔗 License source code license data
Lex Rosetta: Transfer of Predictive Models Across Languages, Jurisdictions, and Legal Domains
Lex Rosetta: Transfer of Predictive Models Across Languages, Jurisdictions, and Legal Domains This is an accompanying repository to the ICAIL 2021 pap
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Hiring We are hiring at all levels (including FTE researchers and interns)! If you are interested in working with us on NLP and large-scale pre-traine
CodeBERT: A Pre-Trained Model for Programming and Natural Languages.
CodeBERT This repo provides the code for reproducing the experiments in CodeBERT: A Pre-Trained Model for Programming and Natural Languages. CodeBERT
A hackerank problems, solution repository
This is a repository for all hackerank challenges kindly note this is for learning purposes and if you wish to contribute, dont hesitate all submision
swagger-codegen contains a template-driven engine to generate documentation, API clients and server stubs in different languages by parsing your OpenAPI / Swagger definition.
Master (2.4.25-SNAPSHOT): 3.0.31-SNAPSHOT: Maven Central ⭐ ⭐ ⭐ If you would like to contribute, please refer to guidelines and a list of open tasks. ⭐
Python supercharged for the fastai library
Welcome to fastcore Python goodies to make your coding faster, easier, and more maintainable Python is a powerful, dynamic language. Rather than bake
Code to reproduce the results in "Visually Grounded Reasoning across Languages and Cultures", EMNLP 2021.
marvl-code [WIP] This is the implementation of the approaches described in the paper: Fangyu Liu*, Emanuele Bugliarello*, Edoardo M. Ponti, Siva Reddy
A single model that parses Universal Dependencies across 75 languages.
A single model that parses Universal Dependencies across 75 languages. Given a sentence, jointly predicts part-of-speech tags, morphology tags, lemmas, and dependency trees.
A simple language for new programmers and a toy language ;)
Yell An extremely simple, yet powerful language for new programmers, as well as a toy language ;) Explore the docs » Report Bug · Request Feature Yell
NER for Indian languages
CL-NERIL: A Cross-Lingual Model for NER in Indian Languages Code for the paper - https://arxiv.org/abs/2111.11815 Setup Setup a virtual environment Th
Stanza: A Python NLP Library for Many Human Languages
Official Stanford NLP Python Library for Many Human Languages
Lingtrain Alignment Studio is an ML based app for texts alignment on different languages.
Lingtrain Alignment Studio Intro Lingtrain Alignment Studio is the ML based app for accurate texts alignment on different languages. Extracts parallel
PESTO: Switching Point based Dynamic and Relative Positional Encoding for Code-Mixed Languages
PESTO: Switching Point based Dynamic and Relative Positional Encoding for Code-Mixed Languages Abstract NLP applications for code-mixed (CM) or mix-li
AfriBERTa: Exploring the Viability of Pretrained Multilingual Language Models for Low-resourced Languages
AfriBERTa: Exploring the Viability of Pretrained Multilingual Language Models for Low-resourced Languages This repository contains the code for the pa
Tools for analyzing Java JVM gc log files
gc_log This package consists of two separate utilities useful for : gc_log_visualizer.py regionsize.py GC Log Visualizer This was updated to run under
My collection of mini-projects in various languages
Mini-Projects My collection of mini-projects in various languages About: This repository consists of a number of small projects. Most of these "mini-p
Blazing fast language detection using fastText model
Luga A blazing fast language detection using fastText's language models Luga is a Swahili word for language. fastText provides a blazing fast language
A programming language with logic of Python, and syntax of all languages.
Pytov The idea was to take all well known syntaxes, and combine them into one programming language with many posabilities. Installation Install using
A Python 3.6+ package to run .many files, where many programs written in many languages may exist in one file.
RunMany Intro | Installation | VSCode Extension | Usage | Syntax | Settings | About A tool to run many programs written in many languages from one fil
Multtable is a collection of multiplication table generators in various languages.
Multtable Multtable is a collection of multiplication table generators in various languages. This project was created as a joke based on one of my bro
Implementation of Natural Language Code Search in the project CodeBERT: A Pre-Trained Model for Programming and Natural Languages.
CodeBERT-Implementation In this repo we have replicated the paper CodeBERT: A Pre-Trained Model for Programming and Natural Languages. We are interest
PyMultiDictionary is a Dictionary Module for Python 3+ to get meanings, translations, synonyms and antonyms of words in 20 different languages
PyMultiDictionary PyMultiDictionary is a Dictionary Module for Python 3+ to get meanings, translations, synonyms and antonyms of words in 20 different
Open source Board Games Like Tic Tac Toe, Connect 4, Ludo, Snakes and Ladder etc...
Board-Games What to do... Add Board games like Tic Tac Toe, Connect 4, Ludo, Snakes and Ladder etc... How to do... Fork the repo Clone the repo git cl
BPEmb is a collection of pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE) and trained on Wikipedia.
BPEmb is a collection of pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE) and trained on Wikipedia. Its intended use is as input for neural models in natural language processing.
🐳 Docker templates for various languages.
Docker Deployment Templates One Stop repository for Docker Compose and Docker Templates for Deployment. Features Python (FastAPI, Flask) Screenshots D
This repository is for adding codes of data structures and algorithms, leetCode, hackerrank etc solutions in different languages
DSA-Code-Snippet This repository is for adding codes of data structures and algorithms, leetCode, hackerrank etc solutions in different languages Cont
Auto translate Localizable.strings for multiple languages in Xcode
auto_localize Auto translate Localizable.strings for multiple languages in Xcode Usage put your origin Localizable.strings file in folder pip3 install
WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages.
WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages.
py-trans is a Free Python library for translate text into different languages.
Free Python library to translate text into different languages.
UniLM AI - Large-scale Self-supervised Pre-training across Tasks, Languages, and Modalities
Pre-trained (foundation) models across tasks (understanding, generation and translation), languages (100+ languages), and modalities (language, image, audio, vision + language, audio + language, etc.)
🎴 LearnQuick is a flashcard application that you can study with decks and cards.
🎴 LearnQuick is a flashcard application that you can study with decks and cards. The main function of the application is to show the front sides of the created cards to the user and ask them to guess the back of the card. As a result of self-assessment of user's ability to guess the back sides of the displayed cards, the cards that user weak against are shown more often, and the cards that user strong against are shown less frequently.
End-to-end text to speech system using gruut and onnx. There are 40 voices available across 8 languages.
End to end text to speech system using gruut and onnx
This tool ability to analyze software packages of different programming languages that are being or will be used in their codes, providing information that allows them to know in advance if this library complies with processes.
This tool gives developers, researchers and companies the ability to analyze software packages of different programming languages that are being or will be used in their codes, providing information that allows them to know in advance if this library complies with processes. secure development, if currently supported, possible backdoors (malicious embedded code), typosquatting analysis, the history of versions and reported vulnerabilities (CVEs) of the package.
Punctuation Restoration using Transformer Models for High-and Low-Resource Languages
Punctuation Restoration using Transformer Models This repository contins official implementation of the paper Punctuation Restoration using Transforme
This is the official implementation of "One Question Answering Model for Many Languages with Cross-lingual Dense Passage Retrieval".
CORA This is the official implementation of the following paper: Akari Asai, Xinyan Yu, Jungo Kasai and Hannaneh Hajishirzi. One Question Answering Mo
मराठी भाषा वाचविण्याचा एक प्रयास. इंग्रजी ते मराठीचा शब्दकोश. An attempt to preserve the Marathi language. A lightweight and ad free English to Marathi thesaurus.
For English, scroll down मराठी शब्द मराठी भाषा वाचवण्यासाठी मी हा ओपन सोर्स प्रोजेक्ट सुरू केला आहे. माझ्या मते, आपली भाषा हळूहळू आणि कोणाचाही लक्षात
This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages" published in Findings of the Association for Computational Linguistics: ACL 2021.
XL-Sum This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Lang
This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages" published in Findings of the Association for Computational Linguistics: ACL 2021.
XL-Sum This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Lang
A Neural Language Style Transfer framework to transfer natural language text smoothly between fine-grained language styles like formal/casual, active/passive, and many more. Created by Prithiviraj Damodaran. Open to pull requests and other forms of collaboration.
Styleformer A Neural Language Style Transfer framework to transfer natural language text smoothly between fine-grained language styles like formal/cas
Deep Text Search is an AI-powered multilingual text search and recommendation engine with state-of-the-art transformer-based multilingual text embedding (50+ languages).
Deep Text Search - AI Based Text Search & Recommendation System Deep Text Search is an AI-powered multilingual text search and recommendation engine w
Rick Astley Language is a rick roll oriented, dynamic, strong, esoteric programming language.
Rick Roll Language / Rick Astley Language A rick roll oriented, dynamic, strong, esoteric programming language. Prolegomenon The reasons that I made t
TANL: Structured Prediction as Translation between Augmented Natural Languages
TANL: Structured Prediction as Translation between Augmented Natural Languages Code for the paper "Structured Prediction as Translation between Augmen
Coreference resolution for English, German and Polish, optimised for limited training data and easily extensible for further languages
Coreferee Author: Richard Paul Hudson, msg systems ag 1. Introduction 1.1 The basic idea 1.2 Getting started 1.2.1 English 1.2.2 German 1.2.3 Polish 1
OCR engine for all the languages
Description kraken is a turn-key OCR system optimized for historical and non-Latin script material. kraken's main features are: Fully trainable layout
Pure Javascript OCR for more than 100 Languages 📖🎉🖥
Version 2 is now available and under development in the master branch, read a story about v2: Why I refactor tesseract.js v2? Check the support/1.x br
Lightning Fast Language Prediction 🚀
whatthelang Lightning Fast Language Prediction 🚀 Dependencies The dependencies can be installed using the requirements.txt file: $ pip install -r req
👄 The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike
Quick Info this library tries to solve language detection of very short words and phrases, even shorter than tweets makes use of both statistical and
TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, Korean, Chinese, German and Easy to adapt for other languages)
🤪 TensorFlowTTS provides real-time state-of-the-art speech synthesis architectures such as Tacotron-2, Melgan, Multiband-Melgan, FastSpeech, FastSpeech2 based-on TensorFlow 2. With Tensorflow 2, we can speed-up training/inference progress, optimizer further by using fake-quantize aware and pruning, make TTS models can be run faster than real-time and be able to deploy on mobile devices or embedded systems.
Share constant definitions between programming languages and make your constants constant again
Introduction Reconstant lets you share constant and enum definitions between programming languages. Constants are defined in a yaml file and converted
Simple text to phones converter for multiple languages
Phonemizer -- foʊnmaɪzɚ The phonemizer allows simple phonemization of words and texts in many languages. Provides both the phonemize command-line tool
Write complicated anonymous functions other than lambdas in Python.
lambdex allows you to write multi-line anonymous function expression (called a lambdex) in an idiomatic manner.
Py4J enables Python programs to dynamically access arbitrary Java objects
Py4J Py4J enables Python programs running in a Python interpreter to dynamically access Java objects in a Java Virtual Machine. Methods are called as
Get list of common stop words in various languages in Python
Python Stop Words Table of contents Overview Available languages Installation Basic usage Python compatibility Overview Get list of common stop words
Official Stanford NLP Python Library for Many Human Languages
Stanza: A Python NLP Library for Many Human Languages The Stanford NLP Group's official Python NLP library. It contains support for running various ac
A flexible free and unlimited python tool to translate between different languages in a simple way using multiple translators.
deep-translator Translation for humans A flexible FREE and UNLIMITED tool to translate between different languages in a simple way using multiple tran
Get list of common stop words in various languages in Python
Python Stop Words Table of contents Overview Available languages Installation Basic usage Python compatibility Overview Get list of common stop words
Official Stanford NLP Python Library for Many Human Languages
Stanza: A Python NLP Library for Many Human Languages The Stanford NLP Group's official Python NLP library. It contains support for running various ac
coala provides a unified command-line interface for linting and fixing all your code, regardless of the programming languages you use.
"Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live." ― John F. Woods coala provides a
Easy to use, state-of-the-art Neural Machine Translation for 100+ languages
EasyNMT - Easy to use, state-of-the-art Neural Machine Translation This package provides easy to use, state-of-the-art machine translation for more th
Hello World in different languages !
Hello World And some Examples in different Programming Languages This repository contains a big list of programming languages and some examples for th
Mimesis is a high-performance fake data generator for Python, which provides data for a variety of purposes in a variety of languages.
Mimesis - Fake Data Generator Description Mimesis is a high-performance fake data generator for Python, which provides data for a variety of purposes
Mimesis is a high-performance fake data generator for Python, which provides data for a variety of purposes in a variety of languages.
Mimesis - Fake Data Generator Description Mimesis is a high-performance fake data generator for Python, which provides data for a variety of purposes
Official Stanford NLP Python Library for Many Human Languages
Stanza: A Python NLP Library for Many Human Languages The Stanford NLP Group's official Python NLP library. It contains support for running various ac
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.
EasyOCR Ready-to-use OCR with 80+ languages supported including Chinese, Japanese, Korean and Thai. What's new 1 February 2021 - Version 1.2.3 Add set
coala provides a unified command-line interface for linting and fixing all your code, regardless of the programming languages you use.
"Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live." ― John F. Woods coala provides a