222 Repositories
Python EEND-vector-clustering Libraries
Specification for storing geospatial vector data (point, line, polygon) in Parquet
GeoParquet About This repository defines how to store geospatial vector data (point, lines, polygons) in Apache Parquet, a popular columnar storage fo
MarcoPolo is a clustering-free approach to the exploration of bimodally expressed genes along with group information in single-cell RNA-seq data
MarcoPolo is a method to discover differentially expressed genes in single-cell RNA-seq data without depending on prior clustering Overview MarcoPolo
Get started with Machine Learning with Python - An introduction with Python programming examples
Machine Learning With Python Get started with Machine Learning with Python An engaging introduction to Machine Learning with Python TL;DR Download all
Tensorflow 1.13.X implementation for our NN paper: Wei Xia, Sen Wang, Ming Yang, Quanxue Gao, Jungong Han, Xinbo Gao: Multi-view graph embedding clustering network: Joint self-supervision and block diagonal representation. Neural Networks 145: 1-9 (2022)
Multi-view graph embedding clustering network: Joint self-supervision and block diagonal representation Simple implementation of our paper MVGC. The d
Frbmclust - Clusterize FRB profiles using hierarchical clustering, plot corresponding parameters distributions
frbmclust Getting Started Clusterize FRB profiles using hierarchical clustering,
Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations
TopClus The source code used for Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations, published in WWW 2022. Requ
Company clustering with K-means/GMM and visualization with PCA, t-SNE, using SSAN relation extraction
RE results graph visualization and company clustering Installation pip install -r requirements.txt python -m nltk.downloader stopwords python3.7 main.
Implementation of SOMs (Self-Organizing Maps) with neighborhood-based map topologies.
py-self-organizing-maps Simple implementation of self-organizing maps (SOMs) A SOM is an unsupervised method for learning a mapping from a discrete ne
CLASSIX is a fast and explainable clustering algorithm based on sorting
CLASSIX Fast and explainable clustering based on sorting CLASSIX is a fast and explainable clustering algorithm based on sorting. Here are a few highl
Framework for Spectral Clustering on the Sparse Coefficients of Learned Dictionaries
Dictionary Learning for Clustering on Hyperspectral Images Overview Framework for Spectral Clustering on the Sparse Coefficients of Learned Dictionari
Python scripts to perform coin die clustering (performed on Riedones3D).
Crypto Portfolio Clustering with and without optimization techniques (elbow method, PCA).
Crypto Portfolio Clustering Crypto Portfolio Clustering with and without optimization techniques (elbow method, PCA). Analysis This is an analysis of c
DimReductionClustering - Dimensionality Reduction + Clustering + Unsupervised Score Metrics
Dimensionality Reduction + Clustering + Unsupervised Score Metrics Introduction
Nowadays we don't have time to listen to each and every song that we come across in a playlist.
Nowadays we don't have time to listen to each and every song that we come across in a playlist, so this project helps with that. We used the Spotify API to collect the dataset, performed EDA, applied K-means clustering, and then created new playlists in Spotify.
Clustering Moroccan stock time series data using k-means with DTW (dynamic time warping)
Moroccan Stocks Clustering Context Hey! We don't always have to forecast time series, am I right? We use k-means to cluster about 70 Moroccan stock pr
On the adaptation of recurrent neural networks for system identification
On the adaptation of recurrent neural networks for system identification This repository contains the Python code to reproduce the results of the pape
Weaviate demo with the text2vec-openai module
Weaviate demo with the text2vec-openai module This repository contains an example of how to use the Weaviate text2vec-openai module. When using this d
Clustering is a popular approach to detect patterns in unlabeled data
Visual Clustering Clustering is a popular approach to detect patterns in unlabeled data. Existing clustering methods typically treat samples in a data
A collection of the latest papers on multi-view clustering methods (papers, code)
Multi-View Clustering Papers A collection of the latest papers on multi-view clustering methods (papers, code). There also exist some repositories
SI_EXPLAINER_tg_bot: This bot is an assistant for medical professionals in interpreting the results of patient clustering.
SI_EXPLAINER_tg_bot This bot is an assistant for medical professionals in interpreting the results of patient clustering. ABOUT This chatbot was devel
This project performs classification and clustering via kNN and K-Means, respectively
This project performs classification and clustering via kNN and K-Means, respectively. It then evaluates their performance using F1/accuracy/recall/precision for kNN and the Davies-Bouldin index for clustering. The data is also visualized.
[NeurIPS 2020] Official Implementation: "SMYRF: Efficient Attention using Asymmetric Clustering".
SMYRF: Efficient attention using asymmetric clustering Get started: Abstract We propose a novel type of balanced clustering algorithm to approximate a
Dimension Reduced Turbulent Flow Data From Deep Vector Quantizers
Dimension Reduced Turbulent Flow Data From Deep Vector Quantizers This is an implementation of A Physics-Informed Vector Quantized Autoencoder for Dat
Instance Segmentation by Jointly Optimizing Spatial Embeddings and Clustering Bandwidth
Instance segmentation by jointly optimizing spatial embeddings and clustering bandwidth This codebase implements the loss function described in: Insta
This project intends to use SVM supervised learning to determine whether or not an individual is diabetic given certain attributes.
Diabetes Prediction Using SVM I explore a diabetes prediction algorithm using a Diabetes dataset. Using a Support Vector Machine for my prediction alg
Vector Quantized Diffusion Model for Text-to-Image Synthesis
Vector Quantized Diffusion Model for Text-to-Image Synthesis Due to company policy, I have to set microsoft/VQ-Diffusion to private for now, so I prov
Mapping a variable-length sentence to a fixed-length vector using BERT model
bert-as-service Using BERT model as a sentence enc
Python Machine Learning Jupyter Notebooks (ML website)
Python Machine Learning Jupyter Notebooks (ML website) Dr. Tirthajyoti Sarkar, Fremont, California (Please feel free to connect on LinkedIn here) Also
Practical Machine Learning with Python
Master the essential skills needed to recognize and solve complex real-world problems with Machine Learning and Deep Learning by leveraging the highly popular Python Machine Learning Eco-system.
Advanced raster and geometry manipulations
buzzard In a nutshell, the buzzard library provides powerful abstractions to manipulate together images and geometries that come from different kind o
A Java implementation of the experiments for the paper "k-Center Clustering with Outliers in Sliding Windows"
OutliersSlidingWindows A Java implementation of the experiments for the paper "k-Center Clustering with Outliers in Sliding Windows" Dataset generatio
Script and models for clustering LAION-400m CLIP embeddings.
clustering-laion400m Script and models for clustering LAION-400m CLIP embeddings. Models were fit on the first million or so image embeddings. A subje
An executor that wraps 3D mesh models and encodes 3D content documents into d-dimensional vectors.
3D Mesh Encoder An Executor that receives Documents containing point sets data in its blob attribute, with shape (N, 3) and encodes it to embeddings o
TICC is a python solver for efficiently segmenting and clustering a multivariate time series
TICC TICC is a python solver for efficiently segmenting and clustering a multivariate time series. It takes as input a T-by-n data matrix, a regulariz
Python port of R's Comprehensive Dynamic Time Warp algorithm package
Welcome to the dtw-python package Comprehensive implementation of Dynamic Time Warping algorithms. DTW is a family of algorithms which compute the loc
Anki vector Music ❤ is the best and only Telegram VC player with playlists, Multi Playback, Channel play and more
Anki Vector Music 🎵 A bot that can play music on Telegram Group and Channel Voice Chats Available on telegram as @Anki Vector Music Features 🔥 Thumb
Clustering with variational Bayes and population Monte Carlo
pypmc pypmc is a python package focusing on adaptive importance sampling. It can be used for integration and sampling from a user-defined target densi
General Assembly's 2015 Data Science course in Washington, DC
DAT8 Course Repository Course materials for General Assembly's Data Science course in Washington, DC (8/18/15 - 10/29/15). Instructor: Kevin Markham (
A Practitioner's Guide to Natural Language Processing
Learn how to process, classify, cluster, summarize, understand syntax, semantics and sentiment of text data with the power of Python! This repository contains code and datasets used in my book, Text Analytics with Python published by Apress/Springer.
K-Means Clustering and Hierarchical Clustering Unsupervised Learning Solution in Python3.
Unsupervised Learning - K-Means Clustering and Hierarchical Clustering - The Heritage Foundation's Economic Freedom Index Analysis 2019 - By David Sal
Machine learning library for fast and efficient Gaussian mixture models
This repository contains code which implements the Stochastic Gaussian Mixture Model (S-GMM) for event-based datasets Dependencies CMake Premake4 Blaz
Data and code from COVID-19 machine learning paper
Machine learning approaches for localized lockdown, subnotification analysis and cases forecasting in São Paulo state counties during COVID-19 pandemi
Source Code of NeurIPS21 paper: Recognizing Vector Graphics without Rasterization
YOLaT-VectorGraphicsRecognition This repository is the official PyTorch implementation of our NeurIPS-2021 paper: Recognizing Vector Graphics without
Mall-Customers-Segmentation - Customer Segmentation Using K-Means Clustering
Overview Customer Segmentation is one of the most important applications of unsupervised learning. Using clustering techniques, companies can identify th
Benchmark spaces - Benchmarks of how well different two dimensional spaces work for clustering algorithms
benchmark_spaces Benchmarks of how well different two dimensional spaces work fo
News-Articles-and-Essays - NLP (Topic Modeling and Clustering)
NLP T5 Project proposal Topic Modeling and Clustering of News-Articles-and-Essays Students: Nasser Alshehri Abdullah Bushnag Abdulrhman Alqurashi OVER
VQMIVC - Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion
VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion (Interspeech
Vector space based Information Retrieval System for Text Processing - Information retrieval
Information Retrieval: Text Processing Group 13 Sequence of operations Install Requirements Add given wikipedia files to the corpus directory. Downloa
Strongly local p-norm-cut algorithms for semi-supervised learning and local graph clustering
Self-labelling via simultaneous clustering and representation learning. (ICLR 2020)
Self-labelling via simultaneous clustering and representation learning 🆗 🆗 🎉 NEW models (20th August 2020): Added standard SeLa pretrained torchvis
SCAN: Learning to Classify Images without Labels, incl. SimCLR. [ECCV 2020]
Learning to Classify Images without Labels This repo contains the Pytorch implementation of our paper: SCAN: Learning to Classify Images without Label
Descriptor Vector Exchange
Descriptor Vector Exchange This repo provides code for learning dense landmarks without supervision. Our approach is described in the ICCV 2019 paper
SEOVER: Sentence-level Emotion Orientation Vector based Conversation Emotion Recognition Model
SEOVER-Master This code is the implementation of paper: SEOVER: Sentence-level Emotion Orientation Vector based Conversation Emotion Recognition Model
Create SVG drawings from vector geodata files (SHP, geojson, etc).
SVGIS Create SVG drawings from vector geodata files (SHP, geojson, etc). SVGIS is great for: creating small multiples, combining lots of datasets in a
A Parameter-free Deep Embedded Clustering Method for Single-cell RNA-seq Data
A Parameter-free Deep Embedded Clustering Method for Single-cell RNA-seq Data Overview Clustering analysis is widely utilized in single-cell RNA-seque
Turning images into '9-pan' palettes using KMeans clustering from sklearn.
img2palette Turning images into '9-pan' palettes using KMeans clustering from sklearn. Requirements We require: Pillow, for opening and processing ima
App customer segmentation cohort rfm clustering
CUSTOMER SEGMENTATION COHORT RFM CLUSTERING OVERVIEW OF THE DATA SYSTEM Switching to the dark theme is recommended, as it looks better https://customer-segmentat
Industrial Image Anomaly Localization Based on Gaussian Clustering of Pre-trained Feature
Industrial Image Anomaly Localization Based on Gaussian Clustering of Pre-trained Feature Q. Wan, L. Gao, X. Li and L. Wen, "Industrial Image Anomaly
Powerful unsupervised domain adaptation method for dense retrieval.
Hierarchical Clustering: O(1)-Approximation for Well-Clustered Graphs
Hierarchical Clustering: O(1)-Approximation for Well-Clustered Graphs This repository contains code to accompany the paper "Hierarchical Clustering: O
Official source code for the paper Deep Graph Clustering via Dual Correlation Reduction, accepted by AAAI 2022
Dual Correlation Reduction Network Official source code for the paper Deep Graph Clustering via Dual Correlation Reduction, accepted by AAAI 2022. Any
Predicting Baseball Metric Clusters: Clustering Application in Python Using scikit-learn
Clustering Clustering Application in Python Using scikit-learn This repository contains the prediction of baseball metric clusters using MLB Statcast
PyIOmica (pyiomica) is a Python package for omics analyses.
PyIOmica (pyiomica) This repository contains PyIOmica, a Python package that provides bioinformatics utilities for analyzing (dynamic) omics datasets.
Semantic search through a vectorized Wikipedia (SentenceBERT) with the Weaviate vector search engine
Semantic search through Wikipedia with the Weaviate vector search engine Weaviate is an open source vector search engine with built-in vectorization a
apysc is a Python frontend library for creating HTML and JS files, with an ActionScript 3 (as3)-like interface.
apysc apysc is a Python frontend library for creating HTML and JS files, with an ActionScript 3 (as3)-like interface. Notes: Currently developing and
Deep Learning Datasets Maker is a QGIS plugin that makes dataset creation easier for raster and vector data.
Deep Learning Dataset Maker Deep Learning Datasets Maker is a QGIS plugin that makes dataset creation easier for raster and vector data. How to use Down
Python script to generate vector graphics of an oriented lattice unit cell
unitcell Python script to generate vector graphics of an oriented lattice unit cell Examples unitcell --type hexagonal --eulers 12 23 34 --axes --crys
PyTorch Implementation of Vector Quantized Variational AutoEncoders.
Pytorch implementation of VQVAE. This paper combines 2 tricks: Vector Quantization (check out this amazing blog for better understanding.) Straight-Th
Simple API for UCI Machine Learning Dataset Repository (search, download, analyze)
A simple API for working with University of California, Irvine (UCI) Machine Learning (ML) repository Table of Contents Introduction About Page of the
Minimal pure Python library for working with little-endian list representation of bit strings.
bitlist Minimal Python library for working with bit vectors natively. Purpose This library allows programmers to work with a native representation of
Autoencoders pretraining using clustering
clustimage is a Python package for unsupervised clustering of images.
clustimage The aim of clustimage is to detect natural groups or clusters of images. Image recognition is a computer vision task for identifying and ve
A production-ready, scalable Indexer for the Jina neural search framework, based on HNSW and PSQL
🌟 HNSW + PostgreSQL Indexer HNSWPostgreSQLIndexer Jina is a production-ready, scalable Indexer for the Jina neural search framework. It combines the
Semi-automated vocabulary generation from semantic vector models
vec2word Semi-automated vocabulary generation from semantic vector models This script generates a list of potential conlang word forms along with asso
Fast and robust clustering of point clouds generated with a Velodyne sensor.
Depth Clustering This is a fast and robust algorithm to segment point clouds taken with Velodyne sensor into objects. It works with all available Velo
An Exact Solver for Semi-supervised Minimum Sum-of-Squares Clustering
PC-SOS-SDP: an Exact Solver for Semi-supervised Minimum Sum-of-Squares Clustering PC-SOS-SDP is an exact algorithm based on the branch-and-bound techn
GroundSeg Clustering Optimized Kdtree
Ground segmentation and clustering based on KITTI Velodyne data, plus an additional optimized k-d tree for kNN and radius NN search
A Python 3 library making time series data mining tasks, utilizing matrix profile algorithms
MatrixProfile MatrixProfile is a Python 3 library, brought to you by the Matrix Profile Foundation, for mining time series data. The Matrix Profile is
Official implementation of VQ-Diffusion: Vector Quantized Diffusion Model for Text-to-Image Synthesis
This is a code repository for the paper "Graph Auto-Encoders for Financial Clustering".
Repository for the paper "Graph Auto-Encoders for Financial Clustering" Requirements Python 3.6 torch torch_geometric Instructions This is a simple c
Mixing up the Invariant Information clustering architecture, with self supervised concepts from SimCLR and MoCo approaches
Self Supervised clusterer Combined IIC, and Moco architectures, with some SimCLR notions, to get state of the art unsupervised clustering while retain
Re-implementation of the vector capsule with dynamic routing
VectorCapsule Re-implementation of the vector capsule with dynamic routing We implement the vector capsule and dynamic routing via graph neural networ
BanditPAM: Almost Linear-Time k-Medoids Clustering
BanditPAM: Almost Linear-Time k-Medoids Clustering This repo contains a high-performance implementation of BanditPAM from BanditPAM: Almost Linear-Tim
Sequence clustering and database creation using mmseqs, from local fasta files
SPTAG: A library for fast approximate nearest neighbor search
SPTAG: A library for fast approximate nearest neighbor search SPTAG SPTAG (Space Partition Tree And Graph) is a library for large scale vector approxi
Vector tile server for the Wildfire Predictive Services Unit
wps-tileserver Vector tile server for the Wildfire Predictive Services Unit Overview The intention of this project is to: provide tools to easily spin
Awesome Deep Graph Clustering is a collection of SOTA, novel deep graph clustering methods
ADGC: Awesome Deep Graph Clustering ADGC is a collection of state-of-the-art (SOTA), novel deep graph clustering methods (papers, codes and datasets).
Unleashing Transformers: Parallel Token Prediction with Discrete Absorbing Diffusion for Fast High-Resolution Image Generation from Vector-Quantized Codes
Unleashing Transformers: Parallel Token Prediction with Discrete Absorbing Diffusion for Fast High-Resolution Image Generation from Vector-Quantized C
A hybrid SOTA solution of LiDAR panoptic segmentation with C++ implementations of point cloud clustering algorithms. ICCV21, Workshop on Traditional Computer Vision in the Age of Deep Learning
ICCVW21-TradiCV-Survey-of-LiDAR-Cluster Motivation In contrast to popular end-to-end deep learning LiDAR panoptic segmentation solutions, we propose a
Tutela: an Ethereum and Tornado Cash Anonymity Tool
Tutela: an Ethereum and Tornado Cash Anonymity Tool The repo contains open-source code for Tutela, an anonymity tool for Ethereum and Tornado Cash use
Complex heatmaps are efficient to visualize associations between different sources of data sets and reveal potential patterns.
Make Complex Heatmaps Complex heatmaps are efficient to visualize associations between different sources of data sets and reveal potential patterns. H
PyCaret is an open-source, low-code machine learning library in Python that automates machine learning workflows.
An open-source, low-code machine learning library in Python 🚀 Version 2.3.5 out now! Check out the release notes here. Official • Docs • Install • Tu
A fast, efficient universal vector embedding utility package.
Magnitude: a fast, simple vector embedding utility library A feature-packed Python package and vector storage file format for utilizing vector embeddi
A library for efficient similarity search and clustering of dense vectors.
Faiss Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any
Infomap is a network clustering algorithm based on the Map equation.
Infomap Infomap is a network clustering algorithm based on the Map equation. For detailed documentation, see mapequation.org/infomap. For a list of re
Source code for our Paper "Learning in High-Dimensional Feature Spaces Using ANOVA-Based Matrix-Vector Multiplication"
NFFT4ANOVA Source code for our Paper "Learning in High-Dimensional Feature Spaces Using ANOVA-Based Matrix-Vector Multiplication" This package uses th
Geometric Vector Perceptrons --- a rotation-equivariant GNN for learning from biomolecular structure
Geometric Vector Perceptron Implementation of equivariant GVP-GNNs as described in Learning from Protein Structure with Geometric Vector Perceptrons b
Unsupervised clustering of high content screen samples
Microscopium Unsupervised clustering and dataset exploration for high content screens. See microscopium in action Public dataset BBBC021 from the Broa
A machine learning project that predicts the price of used cars in the UK
Car Price Prediction Image Credit: AA Cars Project Overview Scraped 3000 used cars data from AA Cars website using Python and BeautifulSoup. Cleaned t