1653 Repositories
Python Natural-Language-Processing-Specialization Libraries
Hotpile: High Order Turing Machine Language Compiler
Hotpile: High Order Turing Machine Language Compiler Build and Run Requirements: Python 3.6+, bison, flex, and GCC installed. Needs to be run under UN
Turn any live video stream or locally stored video into a dataset of interesting samples for ML training, or any other type of analysis.
Sieve Video Data Collection Example Find samples that are interesting within hours of raw video, for free and completely automatically using Sieve API
Open source annotation tool for machine learning practitioners.
doccano doccano is an open source text annotation tool for humans. It provides annotation features for text classification, sequence labeling and sequ
Python Libraries with functions and constants related to electrical engineering.
ElectricPy Electrical-Engineering-for-Python Python Libraries with functions and constants related to electrical engineering. The functions and consta
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
This repository contains the code, data, and models of the paper titled "CrossSum: Beyond English-Centric Cross-Lingual Abstractive Text Summarization for 1500+ Language Pairs".
CrossSum This repository contains the code, data, and models of the paper titled "CrossSum: Beyond English-Centric Cross-Lingual Abstractive Text Summ
Non official, but friendly QvaPay library for the Python language.
Python SDK for the QvaPay API Non official, but friendly QvaPay library for the Python language. Setup You can install this package by using the pip t
fhempy is a FHEM binding to write modules in Python language
fhempy (BETA) fhempy allows the usage of Python 3 (NOT 2!) language to write FHEM modules. Python 3.7 or higher is required, therefore I recommend usi
Resources related to our paper "CLIN-X: pre-trained language models and a study on cross-task transfer for concept extraction in the clinical domain"
CLIN-X (CLIN-X-ES) & (CLIN-X-EN) This repository holds the companion code for the system reported in the paper: "CLIN-X: pre-trained language models a
The source code of "Language Models are Few-shot Multilingual Learners" (MRL @ EMNLP 2021)
Language Models are Few-shot Multilingual Learners Paper This is the source code of the paper [Arxiv] [ACL Anthology]: This code has been written usin
Lightweight utility tools for the detection of multiple spellings, meanings, and language-specific terminology in British and American English
Breame ( British English and American English) Breame is a lightweight Python package with a number of utility tools to aid in the detection of words
Scrapegoat is a python library that can be used to scrape the websites from internet based on the relevance of the given topic irrespective of language using Natural Language Processing
Scrapegoat is a python library that can be used to scrape the websites from internet based on the relevance of the given topic irrespective of language using Natural Language Processing. It can be mainly used for non-English language to get accurate and relevant scraped text.
Catch-all collection of generative art made using processing
Generative art with Processing.py Some art I have created for fun. Dependencies Processing for Python, see how to download/use here Packages contained
computer vision, image processing and machine learning on the web browser or node.
Image processing and Machine learning labs computer vision, image processing and machine learning on the web browser or node note Fast Fourier Trans
A gamey, snakey esoteric programming language
Snak Snak is an esolang based on the classic snake game. Installation You will need python3. To use the visualizer, you will need the curses module. T
Contains links to publicly available datasets for modeling health outcomes using speech and language.
speech-nlp-datasets Contains links to publicly available datasets for modeling various health outcomes using speech and language. Speech-based Corpora
🏆 • 5050 most frequent words in 109 languages
🏆 Most Common Words Multilingual 5000 most frequent words in 109 languages. Uses wordfrequency.info as a source. 🔗 License source code license data
Turning images into '9-pan' palettes using KMeans clustering from sklearn.
img2palette Turning images into '9-pan' palettes using KMeans clustering from sklearn. Requirements We require: Pillow, for opening and processing ima
LTGen provides classic algorithms used in Language Theory.
LTGen LTGen stands for Language Theory GENerator and provides tools to implement language theory. Command Line LTGen is a collection of tools to imple
A module grouping multiple translation APIs
translatepy (originally: translate) An aggregation of multiple translation API Translate, transliterate, get the language of texts in no time with the
Evaluation of file formats in the context of geo-referenced 3D geometries.
Geo-referenced Geometry File Formats Classic geometry file formats as .obj, .off, .ply, .stl or .dae do not support the utilization of coordinate syst
A PyTorch-based model pruning toolkit for pre-trained language models
English | 中文说明 TextPruner是一个为预训练语言模型设计的模型裁剪工具包,通过轻量、快速的裁剪方法对模型进行结构化剪枝,从而实现压缩模型体积、提升模型速度。 其他相关资源: 知识蒸馏工具TextBrewer:https://github.com/airaria/TextBrewe
AtlasNet: A Papier-Mâché Approach to Learning 3D Surface Generation
AtlasNet [Project Page] [Paper] [Talk] AtlasNet: A Papier-Mâché Approach to Learning 3D Surface Generation Thibault Groueix, Matthew Fisher, Vladimir
Just a Basic like Language for Zeno INC
zeno-basic-language Just a Basic like Language for Zeno INC This is written in 100% python. this is basic language like language. so its not for big p
Healthsea is a spaCy pipeline for analyzing user reviews of supplementary products for their effects on health.
Welcome to Healthsea ✨ Create better access to health with spaCy. Healthsea is a pipeline for analyzing user reviews to supplement products by extract
pyo is a Python module written in C to help digital signal processing script creation.
pyo is a Python module written in C to help digital signal processing script creation.
BERT, LDA, and TFIDF based keyword extraction in Python
BERT, LDA, and TFIDF based keyword extraction in Python kwx is a toolkit for multilingual keyword extraction based on Google's BERT and Latent Dirichl
BERT-based Financial Question Answering System
BERT-based Financial Question Answering System In this example, we use Jina, PyTorch, and Hugging Face transformers to build a production-ready BERT-b
Python version of the TerminusDB client - for TerminusDB API and WOQLpy
TerminusDB Client Python Development status ⚙️ Python Package status 📦 Python version of the TerminusDB client - for TerminusDB API and WOQLpy Requir
Code for TIP 2017 paper --- Illumination Decomposition for Photograph with Multiple Light Sources.
Illumination_Decomposition Code for TIP 2017 paper --- Illumination Decomposition for Photograph with Multiple Light Sources. This code implements the
A high-level yet extensible library for fast language model tuning via automatic prompt search
ruPrompts ruPrompts is a high-level yet extensible library for fast language model tuning via automatic prompt search, featuring integration with Hugg
A very terrible python-based programming language that uses folders instead of text files
PYFolders by Lewis L. Foster PYFolders is a very terrible python-based programming language that uses folders instead of regular text files. In this r
Code for paper Multitask-Finetuning of Zero-shot Vision-Language Models
Code for paper Multitask-Finetuning of Zero-shot Vision-Language Models
This is the Alpha of Nutte language, she is not complete yet / Essa é a Alpha da Nutte language, não está completa ainda
nutte-language This is the Alpha of Nutte language, it is not complete yet / Essa é a Alpha da Nutte language, não está completa ainda My language was
User-friendly Voice Cloning Application
Multi-Language-RTVC stands for Multi-Language Real Time Voice Cloning and is a Voice Cloning Tool capable of transfering speaker-specific audio featur
Label data using HuggingFace's transformers and automatically get a prediction service
Label Studio for Hugging Face's Transformers Website • Docs • Twitter • Join Slack Community Transfer learning for NLP models by annotating your textu
Synchrosqueezing, wavelet transforms, and time-frequency analysis in Python
Synchrosqueezing is a powerful reassignment method that focuses time-frequency representations, and allows extraction of instantaneous amplitudes and frequencies
Wonkey - an open source programming language for the creation of cross-platform video games
Wonkey Programming Language Wonkey is an open source programming language for the creation of cross-platform video games, highly inspired by the “Blit
Reaction SMILES-AA mapping via language modelling
rxn-aa-mapper Reactions SMILES-AA sequence mapping setup conda env create -f conda.yml conda activate rxn_aa_mapper In the following we consider on ex
UniSpeech - Large Scale Self-Supervised Learning for Speech
UniSpeech The family of UniSpeech: WavLM (arXiv): WavLM: Large-Scale Self-Supervised Pre-training for Full Stack Speech Processing UniSpeech (ICML 202
Rotary Transformer is an MLM pre-trained language model with rotary position embedding (RoPE)
[中文|English] Rotary Transformer Rotary Transformer is an MLM pre-trained language model with rotary position embedding (RoPE). The RoPE is a relative
Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms
FNet: Mixing Tokens with Fourier Transforms Pytorch implementation of Fnet : Mixing Tokens with Fourier Transforms. Citation: @misc{leethorp2021fnet,
Code for CodeT5: a new code-aware pre-trained encoder-decoder model.
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation This is the official PyTorch implementation
CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation
CPT This repository contains code and checkpoints for CPT. CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Gener
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
Pretrained Language Model This repository provides the latest pretrained language models and its related optimization techniques developed by Huawei N
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
English | 简体中文 | 繁體中文 | 한국어 State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow 🤗 Transformers provides thousands of pretrained models
LUKE -- Language Understanding with Knowledge-based Embeddings
LUKE (Language Understanding with Knowledge-based Embeddings) is a new pre-trained contextualized representation of words and entities based on transf
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Hiring We are hiring at all levels (including FTE researchers and interns)! If you are interested in working with us on NLP and large-scale pre-traine
Unsupervised Language Model Pre-training for French
FlauBERT and FLUE FlauBERT is a French BERT trained on a very large and heterogeneous French corpus. Models of different sizes are trained using the n
Large-scale pretraining for dialogue
A State-of-the-Art Large-scale Pretrained Response Generation Model (DialoGPT) This repository contains the source code and trained model for a large-
Few-shot Natural Language Generation for Task-Oriented Dialog
Few-shot Natural Language Generation for Task-Oriented Dialog This repository contains the dataset, source code and trained model for the following pa
Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing
Introduction Funnel-Transformer is a new self-attention model that gradually compresses the sequence of hidden states to a shorter one and hence reduc
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
DeBERTa: Decoding-enhanced BERT with Disentangled Attention This repository is the official implementation of DeBERTa: Decoding-enhanced BERT with Dis
CodeBERT: A Pre-Trained Model for Programming and Natural Languages.
CodeBERT This repo provides the code for reproducing the experiments in CodeBERT: A Pre-Trained Model for Programming and Natural Languages. CodeBERT
Code for the paper "VisualBERT: A Simple and Performant Baseline for Vision and Language"
This repository contains code for the following two papers: VisualBERT: A Simple and Performant Baseline for Vision and Language (arxiv) with a short
PG-19 Language Modelling Benchmark
PG-19 Language Modelling Benchmark This repository contains the PG-19 language modeling benchmark. It includes a set of books extracted from the Proje
Conditional Transformer Language Model for Controllable Generation
CTRL - A Conditional Transformer Language Model for Controllable Generation Authors: Nitish Shirish Keskar, Bryan McCann, Lav Varshney, Caiming Xiong,
Multi Task Vision and Language
12-in-1: Multi-Task Vision and Language Representation Learning Please cite the following if you use this code. Code and pre-trained models for 12-in-
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
ALBERT ***************New March 28, 2020 *************** Add a colab tutorial to run fine-tuning for GLUE datasets. ***************New January 7, 2020
Code for the paper "Language Models are Unsupervised Multitask Learners"
Status: Archive (code is provided as-is, no updates expected) gpt-2 Code and models from the paper "Language Models are Unsupervised Multitask Learner
An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.
GPT Neo 🎉 1T or bust my dudes 🎉 An implementation of model & data parallel GPT3-like models using the mesh-tensorflow library. If you're just here t
Awesome Treasure of Transformers Models Collection
💁 Awesome Treasure of Transformers Models for Natural Language processing contains papers, videos, blogs, official repo along with colab Notebooks. 🛫☑️
PyContinual (An Easy and Extendible Framework for Continual Learning)
PyContinual (An Easy and Extendible Framework for Continual Learning) Easy to Use You can sumply change the baseline, backbone and task, and then read
The King is Naked: on the Notion of Robustness for Natural Language Processing
the-king-is-naked: on the notion of robustness for natural language processing AAAI2022 DISCLAIMER:This repo will be updated soon with instructions on
💃 VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena
💃 VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena.
NORETURN is an esoteric programming language, based around the idea of not going back
NORETURN NORETURN is an esoteric programming language, based around the idea of not going back Concept Program coded in noreturn runs over one array,
A Python library for electronic structure pre/post-processing
PyProcar PyProcar is a robust, open-source Python library used for pre- and post-processing of the electronic structure data coming from DFT calculati
A foreign language learning aid using a neural network to predict probability of translating foreign words
Langy Langy is a reading-focused foreign language learning aid orientated towards young children. Reading is an activity that every child knows. It is
The projects lets you extract glossary words and their definitions from a given piece of text automatically using NLP techniques
Unsupervised technique to Glossary and Definition Extraction Code Files GPT2-DefinitionModel.ipynb - GPT-2 model for definition generation. Data_Gener
pygame is a Free and Open Source python programming language library for making multimedia applications like games built on top of the excellent SDL library. C, Python, Native, OpenGL.
pygame is a Free and Open Source python programming language library for making multimedia applications like games built on top of the excellent SDL library. C, Python, Native, OpenGL.
🔎 Like Chardet. 🚀 Package for encoding & language detection. Charset detection.
Charset Detection, for Everyone 👋 The Real First Universal Charset Detector A library that helps you read text from an unknown charset encoding. Moti
✔️ Visual, reactive testing library for Julia. Time machine included.
PlutoTest.jl (alpha release) Visual, reactive testing library for Julia A macro @test that you can use to verify your code's correctness. But instead
JurjenLang, an interpreted programming language
JurjenLang An interpreted programming language Getting started Follow these three steps on your computer to get started git clone https://github.com/J
Python language from the beginning.
Python For Beginners Python Programming Language ♦️ Python is a very powerful and user friendly programming language. ❄️ ♦️ There are some basic sytax
A toolkit for document-level event extraction, containing some SOTA model implementations
❤️ A Toolkit for Document-level Event Extraction with & without Triggers Hi, there 👋 . Thanks for your stay in this repo. This project aims at buildi
Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.
WECHSEL Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. arXiv: https://arx
Discovering Explanatory Sentences in Legal Case Decisions Using Pre-trained Language Models.
Statutory Interpretation Data Set This repository contains the data set created for the following research papers: Savelka, Jaromir, and Kevin D. Ashl
ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab
AliceMind AliceMind: ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab This repository provides pre-trained encode
SimpleITK is an image analysis toolkit with a large number of components supporting general filtering operations, image segmentation and registration
SimpleITK is an image analysis toolkit with a large number of components supporting general filtering operations, image segmentation and registration
Python programming language Test
Exercise You are tasked with creating a data-processing app that pre-processes and enriches the data coming from crawlers, with the following requirem
Official implementation of the paper "Lightweight Deep CNN for Natural Image Matting via Similarity Preserving Knowledge Distillation"
Lightweight-Deep-CNN-for-Natural-Image-Matting-via-Similarity-Preserving-Knowledge-Distillation Introduction Accepted at IEEE Signal Processing Letter
Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.
WECHSEL Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. arXiv: https://arx
This is a graphql api build using ariadne python that serves a graphql-endpoint at port 3002 to perform language translation and identification using deep learning in python pytorch.
Language Translation and Identification this machine/deep learning api that will be served as a graphql-api using ariadne, to perform the following ta
A small Python module for BMP image processing.
micropython-microbmp A small Python module for BMP image processing. It supports BMP image of 1/2/4/8/24-bit colour depth. Loading supports compressio
p5 is a Python package based on the core ideas of Processing.
p5 p5 is a Python library that provides high level drawing functionality to help you quickly create simulations and interactive art using Python. It c
These are the materials for the paper "Few-Shot Out-of-Domain Transfer Learning of Natural Language Explanations"
Few-shot-NLEs These are the materials for the paper "Few-Shot Out-of-Domain Transfer Learning of Natural Language Explanations". You can find the smal
gACSON software for visualization, processing and analysis of three-dimensional electron microscopy images
gACSON gACSON software is to visualize, segment, and analyze the morphology of neurons in three-dimensional electron microscopy images. If you use any
RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.
RDFLib RDFLib is a pure Python package for working with RDF. RDFLib contains most things you need to work with RDF, including: parsers and serializers
A complex language with high level programming and moderate syntax.
zsq a complex language with high level programming and moderate syntax.
Code for paper: "Spinning Language Models for Propaganda-As-A-Service"
Spinning Language Models for Propaganda-As-A-Service This is the source code for the Arxiv version of the paper. You can use this Google Colab to expl
Get 2D point positions (e.g., facial landmarks) projected on 3D mesh
points2d_projection_mesh Input 2D points (e.g. facial landmarks) on an image Camera parameters (extrinsic and intrinsic) of the image Aligned 3D mesh
A C-like hardware description language (HDL) adding high level synthesis(HLS)-like automatic pipelining as a language construct/compiler feature.
██████╗ ██╗██████╗ ███████╗██╗ ██╗███╗ ██╗███████╗ ██████╗ ██╔══██╗██║██╔══██╗██╔════╝██║ ██║████╗ ██║██╔════╝██╔════╝ ██████╔╝██║██████╔╝█
A Python Based Utility for Processing GST-Return JSON Files to Multiple Formats
GSTR 1/2A Utility by Shan.tk Open Source GSTR 1/GSTR 2A JSON to Excel utility based on Python. Useful for Auditors in Verifying GSTR 1 Return Invoices
Pulumi - Developer-First Infrastructure as Code. Your Cloud, Your Language, Your Way 🚀
Pulumi's Infrastructure as Code SDK is the easiest way to create and deploy cloud software that use containers, serverless functions, hosted services,
A Python Library to Make Quote Images
Quote2Image A Python Library to Make Quote Images How To Use? Download The Latest Package From Releases Extract The Zip File And Place Every File In I
Tools for teachers and students using nng (Natural Number Game)
nngtools Usage Place your nngsave.json to the directory in which you want to extract the level files. Place nngmap.json on the same directory. Run nng
Easy Language Model Pretraining leveraging Huggingface's Transformers and Datasets
Easy Language Model Pretraining leveraging Huggingface's Transformers and Datasets What is LASSL • How to Use What is LASSL LASSL은 LAnguage Semi-Super
📜Generate poetry with gcc diagnostics
gado (gcc awesome diagnostics orchestrator) is a wrapper of gcc that outputs its errors and warnings in a more poetic format.
The third home of the bare Programming Language (1st there's my heart, the forest came second and then there's Github :)
The third home of the bare Programming Language (1st there's my heart, the forest came second and then there's Github :)
Model of an AI powered sign language interpreter.
TEXT AND SPEECH TO SIGN LANGUAGE. A web application which takes in text or live audio speech recording as input, converts and displays the relevant Si