219 Repositories
Python word-vector Libraries
Sentiment Analysis Project using Count Vectorizer and TF-IDF Vectorizer
Sentiment Analysis Project This project contains two sentiment analysis programs for Hotel Reviews using a Hotel Reviews dataset from Datafiniti. The
Python script to generate Vale linting rules from word usage guidance in the Red Hat Supplementary Style Guide
ssg-vale-rules-gen Python script to generate Vale linting rules from word usage guidance in the Red Hat Supplementary Style Guide. These rules are use
MiniSom is a minimalistic implementation of the Self Organizing Maps
MiniSom Self Organizing Maps MiniSom is a minimalistic and Numpy based implementation of the Self Organizing Maps (SOM). SOM is a type of Artificial N
A socket script to obtain chinese phones-sequence for any english word
Foreign Pronunciation Generator (English-Chinese) We provide a simple socket script for acquiring Chinese pronunciation of English words (phones in ai
This Project is based on NLTK It generates a RANDOM WORD from a predefined list of words, From that random word it read out the word, its meaning with parts of speech , its antonyms, its synonyms
This Project is based on NLTK(Natural Language Toolkit) It generates a RANDOM WORD from a predefined list of words, From that random word it read out the word, its meaning with parts of speech , its antonyms, its synonyms
Interactive Web App with Streamlit and Scikit-learn that applies different Classification algorithms to popular datasets
Interactive Web App with Streamlit and Scikit-learn that applies different Classification algorithms to popular datasets Datasets Used: Iris dataset,
Codes for NeurIPS 2021 paper "On the Equivalence between Neural Network and Support Vector Machine".
On the Equivalence between Neural Network and Support Vector Machine Codes for NeurIPS 2021 paper "On the Equivalence between Neural Network and Suppo
Python framework for creating and scaling up production of vector graphics assets.
Board Game Factory Contributors are welcome here! See the end of readme. This is a vector-graphics framework intended for creating and scaling up prod
This project uses word frequency and Term Frequency-Inverse Document Frequency to summarize a text.
Text Summarizer This project uses word frequency and Term Frequency-Inverse Document Frequency to summarize a text. Team Members This mini-project was
This repo is the code release of EMNLP 2021 conference paper "Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories".
Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories This repo is the code release of EMNLP 2021 con
An open-source online reverse dictionary.
An open-source online reverse dictionary.
Rates how pog a word or user is. Not random and does have *some* kind of algorithm to it.
PogRater :D Rates how pogchamp a word is :D A fun project coded by JBYT27 using Python3 Have you ever wondered how pog a word is? Well, congrats, you
Creates 3D geometries from 2D vector graphics, for use in geodynamic models
geomIO - creating 3D geometries from 2D input This is the Julia and Python version of geomIO, a free open source software to generate 3D volumes and s
Simple Reddit bot that replies to comments containing a certain word.
reddit-replier-bot Small comment reply bot based on PRAW. This script will scan the comments of a subreddit as they come in and look for a trigger wor
Vector.ai assignment
fabio-tests-nisargatman Low Level Approach: ###Tables: continents: id*, name, population, area, createdAt, updatedAt countries: id*, name, population,
🌸 fastText + Bloom embeddings for compact, full-coverage vectors with spaCy
floret: fastText + Bloom embeddings for compact, full-coverage vectors with spaCy floret is an extended version of fastText that can produce word repr
WORD: Revisiting Organs Segmentation in the Whole Abdominal Region
WORD: Revisiting Organs Segmentation in the Whole Abdominal Region. This repository provides the codebase and dataset for our work WORD: Revisiting Or
Interpretable-contrastive-word-mover-s-embedding
Interpretable-contrastive-word-mover-s-embedding Paper Datasets Here is a Dropbox link to the datasets used in the paper: https://www.dropbox.com/sh/n
Keywords : Streamlit, BertTokenizer, BertForMaskedLM, Pytorch
Next Word Prediction Keywords : Streamlit, BertTokenizer, BertForMaskedLM, Pytorch 🎬 Project Demo ✔ Application is hosted on Streamlit. You can see t
This code is for a bot which will find a Twitter user's most tweeted word and tweet that word, tagging said user
max_tweeted_word This code is for a bot which will find a Twitter user's most tweeted word and tweet that word, tagging said user The program uses twe
Japanese Long-Unit-Word Tokenizer with RemBertTokenizerFast of Transformers
Japanese-LUW-Tokenizer Japanese Long-Unit-Word (国語研長単位) Tokenizer for Transformers based on 青空文庫 Basic Usage from transformers import RemBertToken
Spatial Interpolation Toolbox is a Python-based GUI that is able to interpolate spatial data in vector format.
Spatial Interpolation Toolbox This is the home to Spatial Interpolation Toolbox, a graphical user interface (GUI) for interpolating geographic vector
GyroSPD: Vector-valued Distance and Gyrocalculus on the Space of Symmetric Positive Definite Matrices
GyroSPD Code for the paper "Vector-valued Distance and Gyrocalculus on the Space of Symmetric Positive Definite Matrices" accepted at NeurIPS 2021. Re
Count the frequency of letters or words in a text file and show a graph.
Word Counter By EBUS Coding Club Count the frequency of letters or words in a text file and show a graph. Requirements Python 3.9 or higher matplotlib
[SIGGRAPH Asia 2021] DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning.
DeepVecFont This is the homepage for "DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning". Yizhi Wang and Zhouhui Lian. WI
[SIGGRAPH Asia 2021] DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning.
DeepVecFont This is the homepage for "DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning". Yizhi Wang and Zhouhui Lian. WI
Recognition of 38 speech commands in russian. Based on Yandex Cup 2021 ML Challenge: ASR
Speech_38_ru_commands Recognition of 38 speech commands in russian. Based on Yandex Cup 2021 ML Challenge: ASR Программа умеет распознавать 38 ключевы
A Word Level Transformer layer based on PyTorch and 🤗 Transformers.
Transformer Embedder A Word Level Transformer layer based on PyTorch and 🤗 Transformers. How to use Install the library from PyPI: pip install transf
On-device wake word detection powered by deep learning.
Porcupine Made in Vancouver, Canada by Picovoice Porcupine is a highly-accurate and lightweight wake word engine. It enables building always-listening
Vector AI — A platform for building vector based applications. Encode, query and analyse data using vectors.
Vector AI is a framework designed to make the process of building production grade vector based applications as quickly and easily as possible. Create
10x faster matrix and vector operations
Bolt is an algorithm for compressing vectors of real-valued data and running mathematical operations directly on the compressed representations. If yo
Vector Quantization, in Pytorch
Vector Quantization - Pytorch A vector quantization library originally transcribed from Deepmind's tensorflow implementation, made conveniently into a
Next-generation of the non-destructive, node-based 2D image graphics editor
Non-destructive, node-based 2D image graphics editor written in Python, focused on simplicity, speed, elegance, and usability
This repository contains the code for EMNLP-2021 paper "Word-Level Coreference Resolution"
Word-Level Coreference Resolution This is a repository with the code to reproduce the experiments described in the paper of the same name, which was a
Wikipedia WordCloud App generate Wikipedia word cloud art created using python's streamlit, matplotlib, wikipedia and wordcloud packages
Wikipedia WordCloud App Wikipedia WordCloud App generate Wikipedia word cloud art created using python's streamlit, matplotlib, wikipedia and wordclou
Efficient matrix representations for working with tabular data
Efficient matrix representations for working with tabular data
[SIGGRAPH 2021 Asia] DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning
DeepVecFont This is the official Pytorch implementation of the paper: Yizhi Wang and Zhouhui Lian. DeepVecFont: Synthesizing High-quality Vector Fonts
Telegram Group Management Bot based on phython !!!
How to setup/deploy. For easiest way to deploy this Bot click on the below button Mᴀᴅᴇ Bʏ Sᴜᴘᴘᴏʀᴛ Sᴏᴜʀᴄᴇ Find This Bot on Telegram A modular Telegram
split Word file by chapter
split Word file by chapter we use the mircosoft word api to code this tool api url:https://docs.microsoft.com/zh-cn/dotnet/api/ if this tool is good f
Code for ACL'2021 paper WARP 🌀 Word-level Adversarial ReProgramming
Code for ACL'2021 paper WARP 🌀 Word-level Adversarial ReProgramming. Outperforming `GPT-3` on SuperGLUE Few-Shot text classification.
Werkzeug has a debug console that requires a pin. It's possible to bypass this with an LFI vulnerability or use it as a local privilege escalation vector.
Werkzeug Debug Console Pin Bypass Werkzeug has a debug console that requires a pin by default. It's possible to bypass this with an LFI vulnerability
A Pytorch implementation of "Splitter: Learning Node Representations that Capture Multiple Social Contexts" (WWW 2019).
Splitter ⠀⠀ A PyTorch implementation of Splitter: Learning Node Representations that Capture Multiple Social Contexts (WWW 2019). Abstract Recent inte
PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset
PyTorch Large-Scale Language Model A Large-Scale PyTorch Language Model trained on the 1-Billion Word (LM1B) / (GBW) dataset Latest Results 39.98 Perp
Accurately generate all possible forms of an English word e.g "election" -- "elect", "electoral", "electorate" etc.
Accurately generate all possible forms of an English word Word forms can accurately generate all possible forms of an English word. It can conjugate v
The all new way to turn your boring vector meshes into the new fad in town; Voxels!
Voxelator The all new way to turn your boring vector meshes into the new fad in town; Voxels! Notes: I have not tested this on a rotated mesh. With fu
This repository contains a set of codes to run (i.e., train, perform inference with, evaluate) a diarization method called EEND-vector-clustering.
EEND-vector clustering The EEND-vector clustering (End-to-End-Neural-Diarization-vector clustering) is a speaker diarization framework that integrates
SinglepassTextCluster, an TextCluster tools based on Singlepass cluster algorithm that use tfidf vector and doc2vec,which can be used for individual real-time corpus cluster task。基于single-pass算法思想的自动文本聚类小组件,内置tfidf和doc2vec两种文本向量方法,可自动输出聚类数目、类簇文档集合和簇类大小,用于自有实时数据的聚类任务。
项目的背景 SinglepassTextCluster, an TextCluster tool based on Singlepass cluster algorithm that use tfidf vector and doc2vec,which can be used for individ
txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications.
txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications.
This repository contains the code for EMNLP-2021 paper "Word-Level Coreference Resolution"
Word-Level Coreference Resolution This is a repository with the code to reproduce the experiments described in the paper of the same name, which was a
MWPToolkit is a PyTorch-based toolkit for Math Word Problem (MWP) solving.
MWPToolkit is a PyTorch-based toolkit for Math Word Problem (MWP) solving. It is a comprehensive framework for research purpose that integrates popular MWP benchmark datasets and typical deep learning-based MWP algorithms.
Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech tagging and word segmentation.
Pytorch-NLU,一个中文文本分类、序列标注工具包,支持中文长文本、短文本的多类、多标签分类任务,支持中文命名实体识别、词性标注、分词等序列标注任务。 Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech tagging and word segmentation.
从flomo导出的笔记中生成词云
flomo-word-cloud 从flomo导出的笔记中生成词云 如何使用? 将本项目克隆到你的电脑上,使用如下的命令,安装所需python库 pip install -r requirements.txt 在项目里新建一个file文件夹,把所有从flomo导出的html文件放入其中 运行main
Geometric Vector Perceptron --- a rotation-equivariant GNN for learning from biomolecular structure
Geometric Vector Perceptron Code to accompany Learning from Protein Structure with Geometric Vector Perceptrons by B Jing, S Eismann, P Suriana, RJL T
Neural implicit reconstruction experiments for the Vector Neuron paper
Neural Implicit Reconstruction with Vector Neurons This repository contains code for the neural implicit reconstruction experiments in the paper Vecto
Probe and discover HTTP pathname using brute-force methodology and filtered by specific word or 2 words at once
pathprober Probe and discover HTTP pathname using brute-force methodology and filtered by specific word or 2 words at once. Purpose Brute-forcing webs
An implementation of DeepMind's Relational Recurrent Neural Networks in PyTorch.
relational-rnn-pytorch An implementation of DeepMind's Relational Recurrent Neural Networks (Santoro et al. 2018) in PyTorch. Relational Memory Core (
PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset
PyTorch Large-Scale Language Model A Large-Scale PyTorch Language Model trained on the 1-Billion Word (LM1B) / (GBW) dataset Latest Results 39.98 Perp
A Pytorch implementation of "Splitter: Learning Node Representations that Capture Multiple Social Contexts" (WWW 2019).
Splitter ⠀⠀ A PyTorch implementation of Splitter: Learning Node Representations that Capture Multiple Social Contexts (WWW 2019). Abstract Recent inte
edgedressing leverages a Windows "feature" in order to force a target's Edge browser to open. This browser is then directed to a URL of choice.
edgedressing One day while experimenting with airpwn-ng, I noticed unexpected GET requests on the target node. The node in question happened to be a W
This is my reading list for my PhD in AI, NLP, Deep Learning and more.
This is my reading list for my PhD in AI, NLP, Deep Learning and more.
Code for the ACL2021 paper "Combining Static Word Embedding and Contextual Representations for Bilingual Lexicon Induction"
CSCBLI Code for our ACL Findings 2021 paper, "Combining Static Word Embedding and Contextual Representations for Bilingual Lexicon Induction". Require
Code for Graph-to-Tree Learning for Solving Math Word Problems (ACL 2020)
Graph-to-Tree Learning for Solving Math Word Problems PyTorch implementation of Graph based Math Word Problem solver described in our ACL 2020 paper G
General Virtual Sketching Framework for Vector Line Art (SIGGRAPH 2021)
General Virtual Sketching Framework for Vector Line Art - SIGGRAPH 2021 Paper | Project Page Outline Dependencies Testing with Trained Weights Trainin
Paddle implementation for "Cross-Lingual Word Embedding Refinement by ℓ1 Norm Optimisation" (NAACL 2021)
L1-Refinement Paddle implementation for "Cross-Lingual Word Embedding Refinement by ℓ1 Norm Optimisation" (NAACL 2021) 🙈 A more detailed readme is co
Just some scripts to export vector tiles to geojson.
Vector tiles to GeoJSON Nowadays modern web maps are usually based on vector tiles. The great thing about vector tiles is, that they are not just imag
Code for our paper "Mask-Align: Self-Supervised Neural Word Alignment" in ACL 2021
Mask-Align: Self-Supervised Neural Word Alignment This is the implementation of our work Mask-Align: Self-Supervised Neural Word Alignment. @inproceed
Create single line SVG illustrations from your pictures
Create single line SVG illustrations from your pictures
Create single line SVG illustrations from your pictures
Create single line SVG illustrations from your pictures
MILES is a multilingual text simplifier inspired by LSBert - A BERT-based lexical simplification approach proposed in 2018. Unlike LSBert, MILES uses the bert-base-multilingual-uncased model, as well as simple language-agnostic approaches to complex word identification (CWI) and candidate ranking.
MILES Multilingual Lexical Simplifier Explore the docs » Read LSBert Paper · Report Bug · Request Feature About The Project MILES is a multilingual te
A library for Multilingual Unsupervised or Supervised word Embeddings
MUSE: Multilingual Unsupervised and Supervised Embeddings MUSE is a Python library for multilingual word embeddings, whose goal is to provide the comm
Official repository for Jia, Raghunathan, Göksel, and Liang, "Certified Robustness to Adversarial Word Substitutions" (EMNLP 2019)
Certified Robustness to Adversarial Word Substitutions This is the official GitHub repository for the following paper: Certified Robustness to Adversa
Vector Neurons: A General Framework for SO(3)-Equivariant Networks
Vector Neurons: A General Framework for SO(3)-Equivariant Networks Created by Congyue Deng, Or Litany, Yueqi Duan, Adrien Poulenard, Andrea Tagliasacc
Summary statistics of geospatial raster datasets based on vector geometries.
rasterstats rasterstats is a Python module for summarizing geospatial raster datasets based on vector geometries. It includes functions for zonal stat
Top2Vec is an algorithm for topic modeling and semantic search.
Top2Vec is an algorithm for topic modeling and semantic search. It automatically detects topics present in text and generates jointly embedded topic, document and word vectors.
Spatial Action Maps for Mobile Manipulation (RSS 2020)
spatial-action-maps Update: Please see our new spatial-intention-maps repository, which extends this work to multi-agent settings. It contains many ne
Code for paper "A Critical Assessment of State-of-the-Art in Entity Alignment" (https://arxiv.org/abs/2010.16314)
A Critical Assessment of State-of-the-Art in Entity Alignment This repository contains the source code for the paper A Critical Assessment of State-of
The goal of this library is to generate more helpful exception messages for numpy/pytorch matrix algebra expressions.
Tensor Sensor See article Clarifying exceptions and visualizing tensor operations in deep learning code. One of the biggest challenges when writing co
A package built to support working with spatial data using open source python
EarthPy EarthPy makes it easier to plot and manipulate spatial data in Python. Why EarthPy? Python is a generic programming language designed to suppo
Fiona reads and writes geographic data files
Fiona Fiona reads and writes geographic data files and thereby helps Python programmers integrate geographic information systems with other computer s
Spatial Intention Maps for Multi-Agent Mobile Manipulation (ICRA 2021)
spatial-intention-maps This code release accompanies the following paper: Spatial Intention Maps for Multi-Agent Mobile Manipulation Jimmy Wu, Xingyua
This is a c++ project deploying a deep scene text reading pipeline with tensorflow. It reads text from natural scene images. It uses frozen tensorflow graphs. The detector detect scene text locations. The recognizer reads word from each detected bounding box.
DeepSceneTextReader This is a c++ project deploying a deep scene text reading pipeline. It reads text from natural scene images. Prerequsites The proj
Deep Learning Chinese Word Segment
引用 本项目模型BiLSTM+CRF参考论文:http://www.aclweb.org/anthology/N16-1030 ,IDCNN+CRF参考论文:https://arxiv.org/abs/1702.02098 构建 安装好bazel代码构建工具,安装好tensorflow(目前本项目需
ARU-Net - Deep Learning Chinese Word Segment
ARU-Net: A Neural Pixel Labeler for Layout Analysis of Historical Documents Contents Introduction Installation Demo Training Introduction This is the
Generate vector graphics from a textual caption
VectorAscent: Generate vector graphics from a textual description Example "a painting of an evergreen tree" python text_to_painting.py --prompt "a pai
A very simple framework for state-of-the-art Natural Language Processing (NLP)
A very simple framework for state-of-the-art NLP. Developed by Humboldt University of Berlin and friends. Flair is: A powerful NLP library. Flair allo
📝 Wrapper library for text generation / language models at char and word level with RNN in TensorFlow
tensorlm Generate Shakespeare poems with 4 lines of code. Installation tensorlm is written in / for Python 3.4+ and TensorFlow 1.1+ pip3 install tenso
Relevance Vector Machine implementation using the scikit-learn API.
scikit-rvm scikit-rvm is a Python module implementing the Relevance Vector Machine (RVM) machine learning technique using the scikit-learn API. Quicks
一款高性能敏感词(非法词/脏字)检测过滤组件,附带繁体简体互换,支持全角半角互换,汉字转拼音,模糊搜索等功能。
一款高性能非法词(敏感词)检测组件,附带繁体简体互换,支持全角半角互换,获取拼音首字母,获取拼音字母,拼音模糊搜索等功能。
sK1 2.0 cross-platform vector graphics editor
sK1 2.0 sK1 2.0 is a cross-platform open source vector graphics editor similar to CorelDRAW, Adobe Illustrator, or Freehand. sK1 is oriented for prepr
Yet another Python binding for fastText
pyfasttext Warning! pyfasttext is no longer maintained: use the official Python binding from the fastText repository: https://github.com/facebookresea
Topic Modelling for Humans
gensim – Topic Modelling in Python Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Targ
Unsupervised text tokenizer focused on computational efficiency
YouTokenToMe YouTokenToMe is an unsupervised text tokenizer focused on computational efficiency. It currently implements fast Byte Pair Encoding (BPE)
Text preprocessing, representation and visualization from zero to hero.
Text preprocessing, representation and visualization from zero to hero. From zero to hero • Installation • Getting Started • Examples • API • FAQ • Co
Beautiful visualizations of how language differs among document types.
Scattertext 0.1.0.0 A tool for finding distinguishing terms in corpora and displaying them in an interactive HTML scatter plot. Points corresponding t
Basic Utilities for PyTorch Natural Language Processing (NLP)
Basic Utilities for PyTorch Natural Language Processing (NLP) PyTorch-NLP, or torchnlp for short, is a library of basic utilities for PyTorch NLP. tor
🦆 Contextually-keyed word vectors
sense2vec: Contextually-keyed word vectors sense2vec (Trask et. al, 2015) is a nice twist on word2vec that lets you learn more interesting and detaile
A very simple framework for state-of-the-art Natural Language Processing (NLP)
A very simple framework for state-of-the-art NLP. Developed by Humboldt University of Berlin and friends. IMPORTANT: (30.08.2020) We moved our models
Unsupervised text tokenizer for Neural Network-based text generation.
SentencePiece SentencePiece is an unsupervised text tokenizer and detokenizer mainly for Neural Network-based text generation systems where the vocabu
A little word cloud generator in Python
Linux macOS Windows PyPI word_cloud A little word cloud generator in Python. Read more about it on the blog post or the website. The code is tested ag