133 Repositories
Python topic-vector Libraries
Specification for storing geospatial vector data (point, line, polygon) in Parquet
GeoParquet About This repository defines how to store geospatial vector data (point, lines, polygons) in Apache Parquet, a popular columnar storage fo
KoBERTopic은 BERTopic을 한국어 데이터에 적용할 수 있도록 토크나이저와 BERT를 수정한 코드입니다.
KoBERTopic 모델 소개 KoBERTopic은 BERTopic을 한국어 데이터에 적용할 수 있도록 토크나이저와 BERT를 수정했습니다. 기존 BERTopic : https://github.com/MaartenGr/BERTopic/tree/05a6790b21009d
Get started with Machine Learning with Python - An introduction with Python programming examples
Machine Learning With Python Get started with Machine Learning with Python An engaging introduction to Machine Learning with Python TL;DR Download all
Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations
TopClus The source code used for Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations, published in WWW 2022. Requ
PyTorch Implementation of our paper Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation
PyTorch Implementation of our paper Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation
Text Analysis & Topic Extraction on Android App user reviews
AndroidApp_TextAnalysis Hi, there! This is code archive for Text Analysis and Topic Extraction from user_reviews of Android App. Dataset Source : http
CLNTM - Contrastive Learning for Neural Topic Model
Contrastive Learning for Neural Topic Model This repository contains the impleme
On the adaptation of recurrent neural networks for system identification
On the adaptation of recurrent neural networks for system identification This repository contains the Python code to reproduce the results of the pape
Weaviate demo with the text2vec-openai module
Weaviate demo with the text2vec-openai module This repository contains an example of how to use the Weaviate text2vec-openai module. When using this d
BERTopic is a topic modeling technique that leverages 🤗 transformers and c-TF-IDF to create dense clusters allowing for easily interpretable topics whilst keeping important words in the topic descriptions
BERTopic BERTopic is a topic modeling technique that leverages 🤗 transformers and c-TF-IDF to create dense clusters allowing for easily interpretable
Continual Learning of Long Topic Sequences in Neural Information Retrieval
ContinualPassageRanking Repository for the paper "Continual Learning of Long Topic Sequences in Neural Information Retrieval". In this repository you
Dimension Reduced Turbulent Flow Data From Deep Vector Quantizers
Dimension Reduced Turbulent Flow Data From Deep Vector Quantizers This is an implementation of A Physics-Informed Vector Quantized Autoencoder for Dat
This project intends to use SVM supervised learning to determine whether or not an individual is diabetic given certain attributes.
Diabetes Prediction Using SVM I explore a diabetes prediction algorithm using a Diabetes dataset. Using a Support Vector Machine for my prediction alg
Using the provided dataset which includes various book features, in order to predict the price of books, using various proposed methods and models.
Using the provided dataset which includes various book features, in order to predict the price of books, using various proposed methods and models.
Vector Quantized Diffusion Model for Text-to-Image Synthesis
Vector Quantized Diffusion Model for Text-to-Image Synthesis Due to company policy, I have to set microsoft/VQ-Diffusion to private for now, so I prov
Mapping a variable-length sentence to a fixed-length vector using BERT model
Are you looking for X-as-service? Try the Cloud-Native Neural Search Framework for Any Kind of Data bert-as-service Using BERT model as a sentence enc
Advanced raster and geometry manipulations
buzzard In a nutshell, the buzzard library provides powerful abstractions to manipulate together images and geometries that come from different kind o
An executor that wraps 3D mesh models and encodes 3D content documents to d-dimension vector.
3D Mesh Encoder An Executor that receives Documents containing point sets data in its blob attribute, with shape (N, 3) and encodes it to embeddings o
Anki vector Music ❤ is the best and only Telegram VC player with playlists, Multi Playback, Channel play and more
Anki Vector Music 🎵 A bot that can play music on Telegram Group and Channel Voice Chats Available on telegram as @Anki Vector Music Features 🔥 Thumb
NLP techniques such as named entity recognition, sentiment analysis, topic modeling, text classification with Python to predict sentiment and rating of drug from user reviews.
This file contains the following documents sumbited for Baruch CIS9665 group 9 fall 2021. 1. Dataset: drug_reviews.csv 2. python codes for text classi
Source Code of NeurIPS21 paper: Recognizing Vector Graphics without Rasterization
YOLaT-VectorGraphicsRecognition This repository is the official PyTorch implementation of our NeurIPS-2021 paper: Recognizing Vector Graphics without
News-Articles-and-Essays - NLP (Topic Modeling and Clustering)
NLP T5 Project proposal Topic Modeling and Clustering of News-Articles-and-Essays Students: Nasser Alshehri Abdullah Bushnag Abdulrhman Alqurashi OVER
VQMIVC - Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion
VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion (Interspeech
Vector space based Information Retrieval System for Text Processing - Information retrieval
Information Retrieval: Text Processing Group 13 Sequence of operations Install Requirements Add given wikipedia files to the corpus directory. Downloa
Descriptor Vector Exchange
Descriptor Vector Exchange This repo provides code for learning dense landmarks without supervision. Our approach is described in the ICCV 2019 paper
SEOVER: Sentence-level Emotion Orientation Vector based Conversation Emotion Recognition Model
SEOVER-Master This code is the implementation of paper: SEOVER: Sentence-level Emotion Orientation Vector based Conversation Emotion Recognition Model
Create SVG drawings from vector geodata files (SHP, geojson, etc).
SVGIS Create SVG drawings from vector geodata files (SHP, geojson, etc). SVGIS is great for: creating small multiples, combining lots of datasets in a
MQTT Explorer - MQTT Subscriber client to explore topic hierarchies
mqtt-explorer MQTT Explorer - MQTT Subscriber client to explore topic hierarchies Overview The MQTT Explorer subscriber client is designed to explore
Some bravo or inspiring research works on the topic of curriculum learning.
Awesome Curriculum Learning Some bravo or inspiring research works on the topic of curriculum learning. A curated list of awesome Curriculum Learning
Scrapegoat is a python library that can be used to scrape the websites from internet based on the relevance of the given topic irrespective of language using Natural Language Processing
Scrapegoat is a python library that can be used to scrape the websites from internet based on the relevance of the given topic irrespective of language using Natural Language Processing. It can be mainly used for non-English language to get accurate and relevant scraped text.
A collection and example code of every topic you need to know about in the basics of Python.
The Python Beginners Guide: Master The Python Basics Tonight This guide is a collection of every topic you need to know about in the basics of Python.
Powerful unsupervised domain adaptation method for dense retrieval.
Powerful unsupervised domain adaptation method for dense retrieval
This repository is an individual project made at BME with the topic of self-driving car simulator and control algorithm.
BME individual project - NEAT based self-driving car This repository is an individual project made at BME with the topic of self-driving car simulator
Semantic search through a vectorized Wikipedia (SentenceBERT) with the Weaviate vector search engine
Semantic search through Wikipedia with the Weaviate vector search engine Weaviate is an open source vector search engine with build-in vectorization a
apysc is the Python frontend library to create html and js file, that has ActionScript 3 (as3)-like interface.
apysc apysc is the Python frontend library to create HTML and js files, that has ActionScript 3 (as3)-like interface. Notes: Currently developing and
Deep Learning Datasets Maker is a QGIS plugin to make datasets creation easier for raster and vector data.
Deep Learning Dataset Maker Deep Learning Datasets Maker is a QGIS plugin to make datasets creation easier for raster and vector data. How to use Down
Python script to generate vector graphics of an oriented lattice unit cell
unitcell Python script to generate vector graphics of an oriented lattice unit cell Examples unitcell --type hexagonal --eulers 12 23 34 --axes --crys
PyTorch Implementation of Vector Quantized Variational AutoEncoders.
Pytorch implementation of VQVAE. This paper combines 2 tricks: Vector Quantization (check out this amazing blog for better understanding.) Straight-Th
Pipeline for training LSA models using Scikit-Learn.
Latent Semantic Analysis Pipeline for training LSA models using Scikit-Learn. Usage Instead of writing custom code for latent semantic analysis, you j
Phrase-BERT: Improved Phrase Embeddings from BERT with an Application to Corpus Exploration
Phrase-BERT: Improved Phrase Embeddings from BERT with an Application to Corpus Exploration This is the official repository for the EMNLP 2021 long pa
Minimal pure Python library for working with little-endian list representation of bit strings.
bitlist Minimal Python library for working with bit vectors natively. Purpose This library allows programmers to work with a native representation of
Linear programming solver for paper-reviewer matching and mind-matching
Paper-Reviewer Matcher A python package for paper-reviewer matching algorithm based on topic modeling and linear programming. The algorithm is impleme
A production-ready, scalable Indexer for the Jina neural search framework, based on HNSW and PSQL
🌟 HNSW + PostgreSQL Indexer HNSWPostgreSQLIndexer Jina is a production-ready, scalable Indexer for the Jina neural search framework. It combines the
Semi-automated vocabulary generation from semantic vector models
vec2word Semi-automated vocabulary generation from semantic vector models This script generates a list of potential conlang word forms along with asso
Pytorch implementation of the paper "Topic Modeling Revisited: A Document Graph-based Neural Network Perspective"
Graph Neural Topic Model (GNTM) This is the pytorch implementation of the paper "Topic Modeling Revisited: A Document Graph-based Neural Network Persp
Various Algorithms for Short Text Mining
Short Text Mining in Python Introduction This package shorttext is a Python package that facilitates supervised and unsupervised learning for short te
Official implementation of VQ-Diffusion: Vector Quantized Diffusion Model for Text-to-Image Synthesis
Official implementation of VQ-Diffusion: Vector Quantized Diffusion Model for Text-to-Image Synthesis
Re-implementation of the vector capsule with dynamic routing
VectorCapsule Re-implementation of the vector capsule with dynamic routing We implement the vector capsule and dynamic routing via graph neural networ
Conversational text Analysis using various NLP techniques
PyConverse Let me try first Installation pip install pyconverse Usage Please try this notebook that demos the core functionalities: basic usage noteb
SPTAG: A library for fast approximate nearest neighbor search
SPTAG: A library for fast approximate nearest neighbor search SPTAG SPTAG (Space Partition Tree And Graph) is a library for large scale vector approxi
Vector tile server for the Wildfire Predictive Services Unit
wps-tileserver Vector tile server for the Wildfire Predictive Services Unit Overview The intention of this project is to: provide tools to easily spin
Unleashing Transformers: Parallel Token Prediction with Discrete Absorbing Diffusion for Fast High-Resolution Image Generation from Vector-Quantized Codes
Unleashing Transformers: Parallel Token Prediction with Discrete Absorbing Diffusion for Fast High-Resolution Image Generation from Vector-Quantized C
A fast, efficient universal vector embedding utility package.
Magnitude: a fast, simple vector embedding utility library A feature-packed Python package and vector storage file format for utilizing vector embeddi
Python library for interactive topic model visualization. Port of the R LDAvis package.
pyLDAvis Python library for interactive topic model visualization. This is a port of the fabulous R package by Carson Sievert and Kenny Shirley. pyLDA
Source code for our Paper "Learning in High-Dimensional Feature Spaces Using ANOVA-Based Matrix-Vector Multiplication"
NFFT4ANOVA Source code for our Paper "Learning in High-Dimensional Feature Spaces Using ANOVA-Based Matrix-Vector Multiplication" This package uses th
Geometric Vector Perceptrons --- a rotation-equivariant GNN for learning from biomolecular structure
Geometric Vector Perceptron Implementation of equivariant GVP-GNNs as described in Learning from Protein Structure with Geometric Vector Perceptrons b
Code for Discovering Topics in Long-tailed Corpora with Causal Intervention.
Code for Discovering Topics in Long-tailed Corpora with Causal Intervention ACL2021 Findings Usage 0. Prepare environment Requirements: python==3.6 te
A machine learning project that predicts the price of used cars in the UK
Car Price Prediction Image Credit: AA Cars Project Overview Scraped 3000 used cars data from AA Cars website using Python and BeautifulSoup. Cleaned t
Topic Inference with Zeroshot models
zeroshot_topics Table of Contents Installation Usage License Installation zeroshot_topics is distributed on PyPI as a universal wheel and is available
Biterm Topic Model (BTM): modeling topics in short texts
Biterm Topic Model Bitermplus implements Biterm topic model for short texts introduced by Xiaohui Yan, Jiafeng Guo, Yanyan Lan, and Xueqi Cheng. Actua
topic modeling on unstructured data in Space news articles retrieved from the Guardian (UK) newspaper using API
NLP Space News Topic Modeling Photos by nasa.gov (1, 2, 3, 4, 5) and extremetech.com Table of Contents Project Idea Data acquisition Primary data sour
neo Tool is great one in binary exploitation topic
neo Tool is great one in binary exploitation topic. instead of doing several missions by many tools and windows, you can now automate this in one tool in one session.. Enjoy it
Clip Bing Maps backgound as RGB geotif image using center-point from vector data of a shapefile and Bing Maps zoom
Clip Bing Maps backgound as RGB geotif image using center-point from vector data of a shapefile and Bing Maps zoom. Also, rasterize shapefile vectors as corresponding label image.
Sentiment Analysis Project using Count Vectorizer and TF-IDF Vectorizer
Sentiment Analysis Project This project contains two sentiment analysis programs for Hotel Reviews using a Hotel Reviews dataset from Datafiniti. The
MiniSom is a minimalistic implementation of the Self Organizing Maps
MiniSom Self Organizing Maps MiniSom is a minimalistic and Numpy based implementation of the Self Organizing Maps (SOM). SOM is a type of Artificial N
Interactive Web App with Streamlit and Scikit-learn that applies different Classification algorithms to popular datasets
Interactive Web App with Streamlit and Scikit-learn that applies different Classification algorithms to popular datasets Datasets Used: Iris dataset,
Codes for NeurIPS 2021 paper "On the Equivalence between Neural Network and Support Vector Machine".
On the Equivalence between Neural Network and Support Vector Machine Codes for NeurIPS 2021 paper "On the Equivalence between Neural Network and Suppo
Python framework for creating and scaling up production of vector graphics assets.
Board Game Factory Contributors are welcome here! See the end of readme. This is a vector-graphics framework intended for creating and scaling up prod
Explainable Zero-Shot Topic Extraction
Zero-Shot Topic Extraction with Common-Sense Knowledge Graph This repository contains the code for reproducing the results reported in the paper "Expl
QuizGame is a quiz with different topics. You can choose a topic and take the quiz
QuizGame is a quiz with different topics. You can choose a topic and take the quiz. In the end you will get your result. The program is under active development, so there may be errors or flaws in it.
Creates 3D geometries from 2D vector graphics, for use in geodynamic models
geomIO - creating 3D geometries from 2D input This is the Julia and Python version of geomIO, a free open source software to generate 3D volumes and s
Vector.ai assignment
fabio-tests-nisargatman Low Level Approach: ###Tables: continents: id*, name, population, area, createdAt, updatedAt countries: id*, name, population,
A project that automatically sends you a Medium article on a topic of your choosing to your email address daily.
Daily Article from Medium ✏️ About A project that automatically sends you a Medium article on a topic of your choosing to your email address daily. No
This repo stores the codes for topic modeling on palliative care journals.
This repo stores the codes for topic modeling on palliative care journals. Data Preparation You first need to download the journal papers. bash 1_down
Spatial Interpolation Toolbox is a Python-based GUI that is able to interpolate spatial data in vector format.
Spatial Interpolation Toolbox This is the home to Spatial Interpolation Toolbox, a graphical user interface (GUI) for interpolating geographic vector
Concept Modeling: Topic Modeling on Images and Text
Concept is a technique that leverages CLIP and BERTopic-based techniques to perform Concept Modeling on images.
GyroSPD: Vector-valued Distance and Gyrocalculus on the Space of Symmetric Positive Definite Matrices
GyroSPD Code for the paper "Vector-valued Distance and Gyrocalculus on the Space of Symmetric Positive Definite Matrices" accepted at NeurIPS 2021. Re
Generate custom detailed survey paper with topic clustered sections and proper citations, from just a single query in just under 30 mins !!
Auto-Research A no-code utility to generate a detailed well-cited survey with topic clustered sections (draft paper format) and other interesting arti
[SIGGRAPH Asia 2021] DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning.
DeepVecFont This is the homepage for "DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning". Yizhi Wang and Zhouhui Lian. WI
[SIGGRAPH Asia 2021] DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning.
DeepVecFont This is the homepage for "DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning". Yizhi Wang and Zhouhui Lian. WI
Vector AI — A platform for building vector based applications. Encode, query and analyse data using vectors.
Vector AI is a framework designed to make the process of building production grade vector based applications as quickly and easily as possible. Create
10x faster matrix and vector operations
Bolt is an algorithm for compressing vectors of real-valued data and running mathematical operations directly on the compressed representations. If yo
Vector Quantization, in Pytorch
Vector Quantization - Pytorch A vector quantization library originally transcribed from Deepmind's tensorflow implementation, made conveniently into a
Proposed n-stage Latent Dirichlet Allocation method - A Novel Approach for LDA
n-stage Latent Dirichlet Allocation (n-LDA) Proposed n-LDA & A Novel Approach for classical LDA Latent Dirichlet Allocation (LDA) is a generative prob
Next-generation of the non-destructive, node-based 2D image graphics editor
Non-destructive, node-based 2D image graphics editor written in Python, focused on simplicity, speed, elegance, and usability
Efficient matrix representations for working with tabular data
Efficient matrix representations for working with tabular data
[SIGGRAPH 2021 Asia] DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning
DeepVecFont This is the official Pytorch implementation of the paper: Yizhi Wang and Zhouhui Lian. DeepVecFont: Synthesizing High-quality Vector Fonts
NLP topic mdel LDA - Gathered from New York Times website
NLP topic mdel LDA - Gathered from New York Times website
Werkzeug has a debug console that requires a pin. It's possible to bypass this with an LFI vulnerability or use it as a local privilege escalation vector.
Werkzeug Debug Console Pin Bypass Werkzeug has a debug console that requires a pin by default. It's possible to bypass this with an LFI vulnerability
A Pytorch implementation of "Splitter: Learning Node Representations that Capture Multiple Social Contexts" (WWW 2019).
Splitter ⠀⠀ A PyTorch implementation of Splitter: Learning Node Representations that Capture Multiple Social Contexts (WWW 2019). Abstract Recent inte
Hierarchical unsupervised and semi-supervised topic models for sparse count data with CorEx
Anchored CorEx: Hierarchical Topic Modeling with Minimal Domain Knowledge Correlation Explanation (CorEx) is a topic model that yields rich topics tha
The all new way to turn your boring vector meshes into the new fad in town; Voxels!
Voxelator The all new way to turn your boring vector meshes into the new fad in town; Voxels! Notes: I have not tested this on a rotated mesh. With fu
This repository contains a set of codes to run (i.e., train, perform inference with, evaluate) a diarization method called EEND-vector-clustering.
EEND-vector clustering The EEND-vector clustering (End-to-End-Neural-Diarization-vector clustering) is a speaker diarization framework that integrates
SinglepassTextCluster, an TextCluster tools based on Singlepass cluster algorithm that use tfidf vector and doc2vec,which can be used for individual real-time corpus cluster task。基于single-pass算法思想的自动文本聚类小组件,内置tfidf和doc2vec两种文本向量方法,可自动输出聚类数目、类簇文档集合和簇类大小,用于自有实时数据的聚类任务。
项目的背景 SinglepassTextCluster, an TextCluster tool based on Singlepass cluster algorithm that use tfidf vector and doc2vec,which can be used for individ
txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications.
txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications.
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.
About This repository provides data and code for the paper: Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development (subm
ETM - R package for Topic Modelling in Embedding Spaces
ETM - R package for Topic Modelling in Embedding Spaces This repository contains an R package called topicmodels.etm which is an implementation of ETM
Geometric Vector Perceptron --- a rotation-equivariant GNN for learning from biomolecular structure
Geometric Vector Perceptron Code to accompany Learning from Protein Structure with Geometric Vector Perceptrons by B Jing, S Eismann, P Suriana, RJL T
Neural implicit reconstruction experiments for the Vector Neuron paper
Neural Implicit Reconstruction with Vector Neurons This repository contains code for the neural implicit reconstruction experiments in the paper Vecto
A Pytorch implementation of "Splitter: Learning Node Representations that Capture Multiple Social Contexts" (WWW 2019).
Splitter ⠀⠀ A PyTorch implementation of Splitter: Learning Node Representations that Capture Multiple Social Contexts (WWW 2019). Abstract Recent inte